GCC AI Research

Fanar 2.0: Arabic Generative AI Stack

arXiv · Significant research

Summary

Hamad Bin Khalifa University (HBKU) has released Fanar 2.0, the second generation of Qatar's Arabic-centric Generative AI platform, built entirely at the Qatar Computing Research Institute (QCRI). Its core model, Fanar-27B, was continually pre-trained from a Gemma-3-27B backbone on 120 billion high-quality tokens using only 256 NVIDIA H100 GPUs. Fanar 2.0 also includes FanarGuard for moderation, Aura for speech recognition, Oryx for vision understanding, Fanar-Sadiq for Islamic content, Fanar-Diwan for poetry generation, and FanarShaheen for translation. Why it matters: Fanar 2.0 shows that sovereign, resource-constrained development of Arabic-language AI is feasible and can produce competitive systems in the region.

Keywords

Fanar 2.0 · HBKU · QCRI · Arabic NLP · Generative AI

Related

Fanar: An Arabic-Centric Multimodal Generative AI Platform

arXiv

Hamad Bin Khalifa University's Qatar Computing Research Institute (QCRI) introduced Fanar, an Arabic-centric multimodal generative AI platform featuring two Arabic LLMs, Fanar Star (7B) and Fanar Prime (9B). The models were trained on nearly 1 trillion tokens, and a custom orchestrator handles different types of prompts. Fanar also includes a customized Islamic RAG system, a Recency RAG, bilingual speech recognition, and an attribution service for content verification. The platform is sponsored by Qatar's Ministry of Communications and Information Technology. Why it matters: The platform is a major step towards sovereign AI development in Qatar, providing advanced Arabic language capabilities and addressing regional needs.

FanarGuard: A Culturally-Aware Moderation Filter for Arabic Language Models

arXiv

The paper introduces FanarGuard, a bilingual moderation filter for Arabic and English language models that considers both safety and cultural alignment. To train the filter, the authors created a dataset of 468K prompt-response pairs, each scored by LLM judges on harmlessness and cultural awareness. They also developed the first benchmark targeting Arabic cultural contexts to evaluate cultural alignment. Why it matters: FanarGuard advances context-sensitive AI safeguards by integrating cultural awareness into content moderation, addressing a critical gap in current alignment techniques.