Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect
arXiv · Significant research
Summary
Researchers developed Atlas-Chat, a family of LLMs adapted to dialectal Arabic, with a focus on Moroccan Arabic (Darija). They built an instruction dataset by consolidating existing Darija language resources and translating English instructions into Darija. The Atlas-Chat models (2B, 9B, and 27B parameters) outperform both general-purpose state-of-the-art LLMs and Arabic-specialized LLMs, such as LLaMa, Jais, and AceGPT, on Darija NLP tasks. Why it matters: This work addresses the gap in LLM support for low-resource Arabic dialects and provides an instruction-tuning methodology and benchmarks for future research.
Keywords
Atlas-Chat · Darija · Moroccan Arabic · LLM · Instruction Tuning