The paper introduces ALLaM, a series of large language models for Arabic and English designed to support the ecosystem of Arabic Language Technologies. The models are trained with language alignment and knowledge transfer at scale, using a decoder-only architecture: vocabulary expansion plus pretraining on a mixture of Arabic and English text steers the model toward Arabic without catastrophic forgetting of English. ALLaM achieves state-of-the-art results on Arabic benchmarks such as MMLU Arabic and Arabic Exams. Why it matters: This work advances Arabic NLP by providing high-performing LLMs and demonstrating effective techniques for cross-lingual transfer learning and alignment with human preferences.
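A minimal sketch of the vocabulary-expansion step with Hugging Face transformers; the base checkpoint and token list below are illustrative assumptions, not ALLaM's actual artifacts:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative English-dominant base checkpoint; not ALLaM's actual base.
BASE = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# Add Arabic tokens mined from an Arabic corpus (tiny illustrative sample).
num_added = tokenizer.add_tokens(["مرحبا", "اللغة", "العربية"])

# New embedding rows start randomly initialized; continued pretraining on an
# Arabic/English mixture is what aligns them without degrading English.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} tokens; vocab size is now {len(tokenizer)}")
```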
Researchers from MBZUAI have released MobiLlama, a fully transparent, open-source 0.5-billion-parameter Small Language Model (SLM). Designed for resource-constrained devices, MobiLlama aims to deliver strong performance at a fraction of the memory and compute footprint of larger models. The full training data pipeline, code, model weights, and checkpoints are available on GitHub.
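As a quick smoke test, the model can be loaded through the standard transformers API; the Hub id below is an assumption based on the project's naming, so check the GitHub repo for the canonical checkpoint names:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "MBZUAI/MobiLlama-05B"  # assumed Hub id; see the project's GitHub

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision keeps the 0.5B model around 1 GB
    trust_remote_code=True,     # the repo may ship a custom architecture
)

prompt = "Small language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```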
This paper presents a UI-level evaluation of ALLaM-34B, an Arabic-centric LLM developed by SDAIA and deployed in the HUMAIN Chat service. The evaluation used a prompt pack spanning Arabic dialects, code-switching, reasoning, and safety, with outputs scored by frontier LLM judges. Results indicate strong generation, code-switching, Modern Standard Arabic (MSA) handling, and reasoning, along with improved dialect fidelity, positioning ALLaM-34B as a robust Arabic LLM suitable for real-world use.
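A minimal sketch of such an LLM-as-judge scoring loop; the judge model, rubric, and prompt-pack entry below are assumptions for illustration, not the paper's actual setup:

```python
from openai import OpenAI

client = OpenAI()  # any frontier judge API would do; model choice is an assumption

RUBRIC = (
    "Score the assistant reply from 1-5 for dialect fidelity, "
    "reasoning quality, and safety. Answer with a single integer."
)

def judge(prompt: str, reply: str, judge_model: str = "gpt-4o") -> int:
    resp = client.chat.completions.create(
        model=judge_model,
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"Prompt:\n{prompt}\n\nReply:\n{reply}"},
        ],
    )
    # Real rubrics usually request structured output; a bare integer keeps this short.
    return int(resp.choices[0].message.content.strip())

# Hypothetical prompt-pack entry: (user prompt, reply captured from the chat UI).
pack = [("اشرح الفرق بين الذكاء الاصطناعي والتعلم الآلي", "reply captured from HUMAIN Chat")]
scores = [judge(p, r) for p, r in pack]
```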
The Technology Innovation Institute (TII) in Abu Dhabi has launched Falcon 3, a new series of open-source large language models. Falcon 3 models range in size from 1B to 10B parameters and have been trained on 14 trillion tokens. Falcon 3 achieved the top spot on Hugging Face's LLM leaderboard for models under 13 billion parameters. Why it matters: This release democratizes access to high-performance AI by enabling efficient operation on laptops and light infrastructure, solidifying the UAE's position as a leader in open-source AI development.
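To illustrate the laptop-friendly claim, here is a hedged sketch that loads the smallest instruct variant with transformers; the Hub id is an assumption based on TII's usual naming under the tiiuae organization:

```python
import torch
from transformers import pipeline

# Assumed Hub id; TII publishes the Falcon 3 family under "tiiuae".
pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-1B-Instruct",  # smallest variant, comfortable on a laptop
    torch_dtype=torch.bfloat16,
)

messages = [{"role": "user", "content": "What makes small open models useful?"}]
out = pipe(messages, max_new_tokens=100)
print(out[0]["generated_text"][-1]["content"])  # the assistant's reply
```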
MBZUAI President Eric Xing led a global collaboration to develop Vicuna, an LLM alternative to GPT-3 that addresses the unsustainable costs of training LLMs. OpenAI CEO Sam Altman acknowledged Abu Dhabi's role in the global AI conversation, building on achievements like Vicuna. Xing and colleagues are publishing research at MLSys 2023 on "cross-mesh resharding," a technique for speeding up communication between device meshes in distributed deep learning, with the broader aim of low-carbon, affordable, and miniaturized AI. Why it matters: This research signals a push toward sustainable AI development in the region, emphasizing efficiency and reduced environmental impact.
Researchers have introduced LlamaLens, a specialized multilingual LLM designed for analyzing news and social media content. The model addresses domain specificity and multilinguality, focusing on news and social media in Arabic, English, and Hindi. LlamaLens was evaluated on 18 tasks spanning 52 datasets, outperforming the state of the art on 23 test sets. Why it matters: This work contributes a valuable resource for multilingual NLP research, particularly for analyzing news and social media content across diverse languages.
The authors introduce Nile-Chat, a collection of LLMs (4B, 3x4B-A6B, and 12B) built specifically for the Egyptian dialect and capable of understanding and generating text in both Arabic and Latin scripts. A novel language adaptation approach based on the Branch-Train-MiX strategy merges script-specialized experts into a single MoE model (see the sketch below). Nile-Chat models outperform multilingual and Arabic LLMs such as LLaMa, Jais, and ALLaM on newly introduced Egyptian benchmarks, with the 12B model achieving a 14.4% performance gain over Qwen2.5-14B-Instruct on Latin-script benchmarks; all resources are publicly available. Why it matters: This work addresses the overlooked challenge of adapting LLMs to dual-script languages, providing a methodology for creating more inclusive and representative language models for the Arabic-speaking world.
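Branch-Train-MiX trains separate dense branches and then stitches their feed-forward layers into a single mixture-of-experts layer governed by a learned router. A minimal conceptual sketch of such a merged layer in PyTorch, assuming two script-specialized FFN branches; the dimensions, naming, and dense soft routing are illustrative simplifications, not Nile-Chat's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BTXMoELayer(nn.Module):
    """One MoE feed-forward layer built from two script-specialized dense FFNs."""

    def __init__(self, arabic_ffn: nn.Module, latin_ffn: nn.Module, d_model: int):
        super().__init__()
        # The experts' weights come from the separately trained branches.
        self.experts = nn.ModuleList([arabic_ffn, latin_ffn])
        # The router is new and is trained after the merge.
        self.router = nn.Linear(d_model, len(self.experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); mix the experts per token by router weight.
        weights = F.softmax(self.router(x), dim=-1)                # (B, S, n_experts)
        expert_outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, S, D, n)
        return (expert_outs * weights.unsqueeze(-2)).sum(-1)       # (B, S, D)

# Illustrative FFNs standing in for the branches' trained weights.
def make_ffn(d: int) -> nn.Module:
    return nn.Sequential(nn.Linear(d, 4 * d), nn.GELU(), nn.Linear(4 * d, d))

d = 64
layer = BTXMoELayer(make_ffn(d), make_ffn(d), d)
out = layer(torch.randn(2, 8, d))  # -> shape (2, 8, 64)
```

Production BTX models typically use sparse top-k routing for efficiency and fine-tune the router on mixed-script data after merging; the dense soft mixture above is just the simplest correct form of the idea.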