From FusHa to Folk: Exploring Cross-Lingual Transfer in Arabic Language Models
arXiv · Notable
Summary
This paper explores cross-lingual transfer in Arabic language models, which are typically pretrained on Modern Standard Arabic (MSA) yet expected to generalize to a diverse set of dialects. The study combines probing on three NLP tasks with representational similarity analysis to assess how effectively transfer occurs. Results show that transfer is uneven across dialects and partly tracks geographic proximity, and that models trained jointly on all dialects exhibit negative interference. Why it matters: The findings highlight challenges in cross-lingual transfer for Arabic NLP and raise questions about how dialect similarity should inform model training.
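To make the methodology concrete, here is a minimal sketch of one widely used representational similarity measure, linear Centered Kernel Alignment (CKA), applied to synthetic stand-in data. The `linear_cka` function and the toy arrays are illustrative assumptions, not the paper's actual code or data.

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two representation matrices of shape
    (n_samples, dim); values lie in [0, 1], higher = more similar."""
    X = X - X.mean(axis=0)  # center each feature column
    Y = Y - Y.mean(axis=0)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return float(num / den)

# Hypothetical stand-ins for hidden states of the same sentences in
# MSA vs. two dialects (random data, purely for illustration).
rng = np.random.default_rng(0)
msa = rng.normal(size=(100, 64))
close_dialect = msa + 0.1 * rng.normal(size=(100, 64))  # near-copy
far_dialect = rng.normal(size=(100, 64))                # unrelated

print(linear_cka(msa, close_dialect))  # close to 1.0
print(linear_cka(msa, far_dialect))    # noticeably lower
```

Comparing such scores across dialects is one way a study can quantify how similarly a model represents MSA and each dialect, and whether that similarity tracks geography.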
Keywords
Arabic language models · cross-lingual transfer · dialects · NLP · MSA