Multilingual Zero-Shot Transfer in Low-Resource Settings
10 October 2022
2:00 PM - MFF UK, Malostranské nám. 25, 4th floor, room S1
I will present two recent works centered on multilingual zero-shot transfer, which occurs when models can solve instances without direct supervision in their target language. First, I will present a model capable of filling in eroded parts of ancient cuneiform tablets written thousands of years ago in Akkadian. We find that zero-shot models outperform monolingual models given the limited training data available for this task, and demonstrate their effectiveness in both automatic and human evaluations. Motivated by these findings, I will present an experiment on zero-shot performance under balanced data conditions that mitigate corpus-size confounds. We show that the choice of pretraining languages vastly affects downstream cross-lingual transfer for BERT-based models, and develop a method with quadratic time complexity in the number of pretraining languages to estimate these inter-language relations. Our findings can inform pretraining configurations for future large-scale multilingual language models. This work was recently awarded an outstanding paper award at NAACL 2022.
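The quadratic cost mentioned above comes from estimating a relation for every pair of pretraining languages. The sketch below is only an illustration of that counting argument, not the authors' implementation; `pretrain_bilingual` and `evaluate_transfer` are hypothetical placeholders standing in for actual pretraining and downstream evaluation.

```python
from itertools import combinations

def pretrain_bilingual(lang_a, lang_b):
    """Hypothetical stand-in: would pretrain a BERT-style model on a
    size-balanced bilingual corpus for the two languages."""
    return {"langs": (lang_a, lang_b)}  # dummy model object

def evaluate_transfer(model, src, tgt):
    """Hypothetical stand-in: would fine-tune on `src` data and evaluate
    zero-shot on `tgt`, returning a downstream score."""
    return 0.0  # dummy score

def estimate_language_relations(languages):
    """Score every pair of pretraining languages.

    Enumerating all unordered pairs yields n * (n - 1) / 2 bilingual models,
    hence the quadratic growth in the number of pretraining languages.
    """
    scores = {}
    for lang_a, lang_b in combinations(languages, 2):
        model = pretrain_bilingual(lang_a, lang_b)
        scores[(lang_a, lang_b)] = evaluate_transfer(model, lang_a, lang_b)
        scores[(lang_b, lang_a)] = evaluate_transfer(model, lang_b, lang_a)
    return scores

# Example: 4 languages -> 6 unordered pairs, 12 directed transfer scores.
print(len(estimate_language_relations(["en", "cs", "ar", "he"])))
```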
--------------------------------------
***The talk will be delivered in person (MFF UK, Malostranské nám. 25, 4th floor, room S1) and will be streamed via Zoom. For details on how to join the Zoom meeting, please write to sevcikova@ufal.mff.cuni.cz***