GCC AI Research

Results for "Nile-Chat"

Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts

arXiv ·

The authors introduce Nile-Chat, a collection of LLMs (4B, 3x4B-A6B, and 12B) built specifically for the Egyptian dialect and capable of understanding and generating text in both Arabic and Latin scripts. They propose a novel language adaptation approach that uses the Branch-Train-MiX strategy to merge script-specialized experts into a single MoE model. Nile-Chat models outperform multilingual and Arabic LLMs such as LLaMa, Jais, and ALLaM on newly introduced Egyptian benchmarks, and the 12B model achieves a 14.4% performance gain over Qwen2.5-14B-Instruct on Latin-script benchmarks. All resources are publicly available. Why it matters: This work addresses the overlooked problem of adapting LLMs to dual-script languages, providing a methodology for creating more inclusive and representative language models in the Arabic-speaking world.

Egyptian AI startup Nanovate raises $1m pre-seed funding round - Disrupt Africa

GCC AI Startup ·

Nanovate, an Egyptian AI startup, has raised $1 million in pre-seed funding. The round was led by [investor name garbled in source], with participation from angel investors. The company plans to use the funds to expand its AI-powered solutions across various sectors. Why it matters: The funding will enable Nanovate to further develop its AI capabilities and expand its reach in the Egyptian market.

Fanar aims to advance Arabic presence in digital space | Gulf Times

QCRI ·

The article content is missing, so no factual summary is possible. Details of Fanar's initiative to advance Arabic presence in the digital space, including any specific actions, partnerships, or funding, are not available. Why it matters: Without the article content, the significance of Fanar's potential contributions to Arabic digital presence cannot be evaluated.

Egyptian Arabic to English Statistical Machine Translation System for NIST OpenMT'2015

arXiv ·

This paper describes the QCRI-Columbia-NYUAD group's Egyptian Arabic-to-English statistical machine translation system submitted to the NIST OpenMT'2015 competition. The system used tools such as 3arrib and MADAMIRA to process and standardize informal dialectal Arabic. It was trained as a phrase-based SMT system with features such as an operation sequence model, a class-based language model, and a neural network joint model. Why it matters: The work demonstrates advances in machine translation for dialectal Arabic, a challenging but important area for regional communication and NLP research.

A Panoramic Survey of Natural Language Processing in the Arab World

arXiv ·

This survey paper reviews the landscape of Natural Language Processing (NLP) research and applications in the Arab world. It discusses the unique challenges posed by the Arabic language, such as its morphological complexity and dialectal diversity. The paper also presents a historical overview of Arabic NLP and surveys various research areas, including machine translation, sentiment analysis, and speech recognition. Why it matters: The survey provides a comprehensive resource for researchers and practitioners interested in the current state and future directions of Arabic NLP, a field critical for enabling AI technologies to serve Arabic-speaking communities.