Results for "small models"

Knowledge distillation and the greening of LLMs

MBZUAI

Researchers from MBZUAI, the University of British Columbia, and Monash University have created LaMini-LM, a collection of small language models distilled from ChatGPT. LaMini-LM is trained on a dataset of 2.58M instructions and can be deployed on consumer laptops and mobile devices. The smaller models perform almost as well as their larger counterparts, and running them locally addresses the security concerns of sending data to cloud-hosted models. Why it matters: This work enables the deployment of LLMs in resource-constrained environments and enhances data security by reducing reliance on cloud-based LLMs.

Sadeed: Advancing Arabic Diacritization Through Small Language Model

arXiv · Apr 30

The paper introduces Sadeed, a fine-tuned decoder-only language model adapted from Kuwain 1.5B (Hennara et al.), for improved Arabic text diacritization. Sadeed is fine-tuned on high-quality diacritized datasets and achieves competitive results compared to larger proprietary models. The authors also introduce SadeedDiac-25, a new benchmark for fairer evaluation of Arabic diacritization across diverse text genres. Why it matters: This work advances Arabic NLP by providing both a competitive diacritization model and a more robust evaluation benchmark, facilitating further research and development in the field.

Alumni Spotlight: Making AI accessible for all

MBZUAI

MBZUAI alumnus Ahmed Sharshar is developing smaller AI models to make the technology more accessible, especially in resource-constrained environments like Egypt. His master's thesis involved creating an app that assesses lung health using mobile phone video analysis, eliminating the need for traditional medical devices. Sharshar is pursuing his Ph.D. at MBZUAI, focusing on lightweight and energy-efficient models for various applications. Why it matters: Democratizing AI through smaller, efficient models can enable broader applications and innovation across diverse sectors in the Middle East and beyond.

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

arXiv · Feb 26

Researchers from MBZUAI have released MobiLlama, a fully transparent, open-source 0.5-billion-parameter Small Language Model (SLM). MobiLlama is designed for resource-constrained devices, emphasizing enhanced performance with reduced resource demands. The full training data pipeline, code, model weights, and checkpoints are available on GitHub.

RightNow-Arabic-0.5B-Turbo: An Open Sub-1B Arabic Language Model via Vocabulary Injection and Edge-First Deployment

arXiv · Apr 10

RightNow-Arabic-0.5B-Turbo is a new 518M-parameter Arabic-specialized decoder LLM, built on Qwen2.5-0.5B, designed to bridge the gap between small multilingual and large Arabic-specialized models. Its development pipeline included adding 27,032 Arabic tokens via vocabulary injection, continued pretraining on 504M Arabic tokens, and fine-tuning with supervised instruction tuning and direct preference optimization. The model achieved 35.9% mean accuracy on three Arabic benchmarks (COPA-ar, Arabic HellaSwag, ArabicMMLU), outperforming all same-class open models and recovering 67% of SILMA-9B's mean accuracy at 1/18 the parameters. All code and weights are publicly released. Why it matters: This model advances efficient Arabic NLP by providing a specialized sub-1B LLM suitable for edge deployment, making Arabic AI more accessible and performant on resource-constrained devices.
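The size and accuracy figures quoted in this summary can be cross-checked with a little arithmetic. The sketch below is only a reader's sanity check, not anything taken from the paper: it assumes "9B" means a nominal 9e9 parameters (the exact SILMA-9B count is not given here), and the variable names are made up for illustration.

```python
# Figures quoted in the summary above.
params_small = 518e6       # RightNow-Arabic-0.5B-Turbo parameter count
mean_acc_small = 35.9      # mean accuracy (%) over the three Arabic benchmarks
recovered_fraction = 0.67  # share of SILMA-9B's mean accuracy that is recovered

# Assumption: "9B" is treated as exactly 9e9 parameters (a nominal size).
params_large = 9e9

# How many times larger the 9B model is than the 518M model.
size_ratio = params_large / params_small

# Mean accuracy (%) that SILMA-9B would need for 35.9% to be 67% of it.
implied_large_acc = mean_acc_small / recovered_fraction

print(f"size ratio: about {size_ratio:.1f}x")
print(f"implied SILMA-9B mean accuracy: about {implied_large_acc:.1f}%")
```

Under the nominal 9e9 assumption the ratio comes out a little above 17, so the quoted "1/18" presumably reflects the larger model's exact parameter count. The implied SILMA-9B mean accuracy is in the mid-50s percent.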

Falcon 3: UAE’s Technology Innovation Institute Launches World’s Most Powerful Small AI Models That Can Also Be Run on Light Infrastructures, Including Laptops

TII · Mar 17

The Technology Innovation Institute (TII) in Abu Dhabi has launched Falcon 3, a new series of open-source large language models. Falcon 3 models range in size from 1B to 10B parameters and have been trained on 14 trillion tokens. Falcon 3 achieved the top spot on Hugging Face's LLM leaderboard for models under 13 billion parameters. Why it matters: This release democratizes access to high-performance AI by enabling efficient operation on laptops and light infrastructure, solidifying the UAE's position as a leader in open-source AI development.