# Does BERT agree?

Pre-print on arXiv

Learning representations that accurately model semantics is an important goal of natural language processing research. Many semantic phenomena depend on syntactic structure. Recent work examines the extent to which state-of-the-art models for pre-training representations, such as BERT, capture such structure-dependent phenomena, but is largely restricted to one phenomenon in English: number agreement between subjects and verbs. We evaluate BERT's sensitivity to four types of structure-dependent agreement relations in a new automatically curated dataset across 26 languages. We show that both the single-language and multilingual BERT models capture syntax-sensitive agreement patterns well in general, but we also highlight the specific linguistic contexts in which their performance degrades.
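The evaluation protocol described above can be sketched as follows. This is a hedged illustration, not the repository's actual code: we assume each cloze item yields a masked-LM score (e.g. a log-probability) for the grammatical and the ungrammatical verb form, and the model counts as "agreeing" when the grammatical form scores higher. The scores below are made up for illustration.

```python
def agreement_accuracy(items):
    """Fraction of items where the grammatical form outscores the ungrammatical one.

    items: list of (score_grammatical, score_ungrammatical) pairs,
    e.g. masked-LM log-probabilities for the two candidate verb forms.
    """
    correct = sum(1 for good, bad in items if good > bad)
    return correct / len(items)

# Illustrative (made-up) masked-LM scores for three subject-verb items.
scores = [(-1.2, -4.5), (-0.8, -2.1), (-3.0, -2.4)]
print(agreement_accuracy(scores))  # 2 of 3 items scored correctly
```

In practice the two scores per item would come from a masked language model filling the verb slot; this sketch only shows the accuracy computation applied on top of such scores.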

## Contributing

Contributions are welcome! For any bugs, questions, or suggested improvements, please open a GitHub issue and we'll take it from there. Alternatively, you can email Geoff Bacon at [email protected].

## Authors

- Geoff Bacon
- Terry Regier

From the Language & Cognition Lab

## License

MIT