In this lab, you will extend the parser from the basic lab to support labelled parsing. This means that your parser should not only predict *that* there is a syntactic relation between two words but also *what type* of relation it is – for example, subject or object. You will validate your implementation by comparing the performance of your parser to the results reported by [Glavaš and Vulić (2021)](http://dx.doi.org/10.18653/v1/2021.eacl-main.270).
## Instructions
1. **Understand the task**
   - Read the page on [Universal Dependencies](https://universaldependencies.org/u/dep/all.html) to get a rough understanding of the different syntactic relations.
   - Read Section 3 of [Glavaš and Vulić (2021)](http://dx.doi.org/10.18653/v1/2021.eacl-main.270) to see how they compute relation scores (a sketch of a typical scorer follows this list).
   - You also need to understand how to compute the loss for the relation prediction task (see the loss sketch after this list).
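
   A common way to compute relation scores on top of a transformer encoder is a deep biaffine classifier in the style of Dozat and Manning (2017). The sketch below is a minimal, hypothetical PyTorch version, not the authors' exact code; the names `BiaffineRelationScorer`, `rel_dim`, and `num_rels` are illustrative assumptions, and you should check Section 3 of the paper for the precise formulation used there.

   ```python
   import torch
   import torch.nn as nn

   class BiaffineRelationScorer(nn.Module):
       """Scores relation labels for every (dependent, head) pair.

       Hypothetical dimensions: `hidden_dim` is the encoder's output size,
       `rel_dim` the relation-specific projection size, and `num_rels`
       the number of dependency relation labels.
       """

       def __init__(self, hidden_dim: int, rel_dim: int, num_rels: int):
           super().__init__()
           # Separate projections for a word acting as dependent vs. head.
           self.dep_mlp = nn.Sequential(nn.Linear(hidden_dim, rel_dim), nn.ReLU())
           self.head_mlp = nn.Sequential(nn.Linear(hidden_dim, rel_dim), nn.ReLU())
           # One (rel_dim+1) x (rel_dim+1) bilinear matrix per relation;
           # the extra row/column absorbs the linear (bias) terms.
           self.U = nn.Parameter(torch.empty(num_rels, rel_dim + 1, rel_dim + 1))
           nn.init.xavier_uniform_(self.U)

       def forward(self, h: torch.Tensor) -> torch.Tensor:
           # h: (batch, seq_len, hidden_dim) contextualised word vectors
           dep = self.dep_mlp(h)    # (batch, seq_len, rel_dim)
           head = self.head_mlp(h)  # (batch, seq_len, rel_dim)
           # Append a constant 1 so the bilinear form subsumes linear terms.
           ones = h.new_ones(*dep.shape[:-1], 1)
           dep = torch.cat([dep, ones], dim=-1)
           head = torch.cat([head, ones], dim=-1)
           # scores[b, r, i, j] = dep[b, i]^T  U[r]  head[b, j]
           scores = torch.einsum("bid,rde,bje->brij", dep, self.U, head)
           return scores  # (batch, num_rels, seq_len, seq_len)
   ```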
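
   For the loss, a standard choice is token-level cross-entropy over the relation labels, evaluated only at each word's gold head and ignoring padding. The helper below is a hedged sketch that assumes the score tensor shape produced by the scorer above; `relation_loss`, `gold_heads`, `gold_rels`, and `pad_mask` are illustrative names, not part of any provided starter code.

   ```python
   import torch
   import torch.nn.functional as F

   def relation_loss(scores, gold_heads, gold_rels, pad_mask):
       """Cross-entropy over relation labels at the gold head positions.

       scores:     (batch, num_rels, seq_len, seq_len) from the scorer above
       gold_heads: (batch, seq_len) index of each word's gold head
       gold_rels:  (batch, seq_len) gold relation label for each word
       pad_mask:   (batch, seq_len) True for real tokens, False for padding
       """
       batch, num_rels, seq_len, _ = scores.shape
       # For each dependent i, select the score column of its gold head.
       idx = gold_heads.unsqueeze(1).unsqueeze(-1)      # (batch, 1, seq_len, 1)
       idx = idx.expand(-1, num_rels, -1, -1)           # (batch, num_rels, seq_len, 1)
       rel_scores = scores.gather(3, idx).squeeze(-1)   # (batch, num_rels, seq_len)
       rel_scores = rel_scores.transpose(1, 2)          # (batch, seq_len, num_rels)
       # Token-level cross-entropy over the non-padded positions only.
       return F.cross_entropy(rel_scores[pad_mask], gold_rels[pad_mask])
   ```

   Conditioning the relation classifier on the gold heads during training (rather than on the predicted heads) is the usual setup for biaffine parsers; at prediction time you would instead use the heads your arc scorer selects.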