From FusHa to Folk: Exploring Cross-Lingual Transfer in Arabic Language Models
arXiv · Significant research
Summary
Arabic Language Models (LMs) are primarily pretrained on Modern Standard Arabic (MSA), with the expectation that they will transfer to diverse Arabic dialects in real-world applications. This work examines cross-lingual transfer in Arabic LMs through probing on three Natural Language Processing (NLP) tasks and through representational similarity analysis. The findings indicate that transfer is possible but uneven across dialects, with some evidence of negative interference in models trained to support all Arabic dialects. Why it matters: this research highlights key challenges in building robust Arabic AI systems that can handle the significant linguistic diversity of the Arab world.
Keywords
Arabic Language Models · Cross-Lingual Transfer · Arabic Dialects · Modern Standard Arabic · Natural Language Processing