GCC AI Research

Results for "Doha Historical Dictionary"

Grounding Arabic LLMs in the Doha Historical Dictionary: Retrieval-Augmented Understanding of Quran and Hadith

arXiv ·

Researchers developed a retrieval-augmented generation (RAG) framework to improve how Arabic large language models (LLMs) understand complex historical and religious texts such as the Quran and Hadith. The framework grounds the models in the Doha Historical Dictionary of Arabic (DHDA) through hybrid retrieval and intent-based routing. It boosted the accuracy of Arabic-native LLMs such as Fanar and ALLaM to over 85%, closing the performance gap with proprietary models like Gemini. Why it matters: The work demonstrates the value of integrating diachronic lexicographic resources into RAG systems, giving Arabic NLP a way to handle historically nuanced texts.

Fanar: An Arabic-Centric Multimodal Generative AI Platform

arXiv ·

Hamad Bin Khalifa University's Qatar Computing Research Institute (QCRI) introduced Fanar, an Arabic-centric multimodal generative AI platform featuring two Arabic LLMs, Fanar Star (7B) and Fanar Prime (9B). The models were trained on nearly 1 trillion tokens and handle different types of prompts, which are routed through a custom orchestrator. Fanar also includes a customized Islamic RAG system, a Recency RAG, bilingual speech recognition, and an attribution service for content verification. The platform is sponsored by Qatar's Ministry of Communications and Information Technology. Why it matters: The platform marks a major step toward sovereign AI development in Qatar, providing advanced Arabic language capabilities and addressing regional needs.

Fanar 2.0 a major leap in Arabic AI technology - The Peninsula Qatar

QCRI ·

Qatar Computing Research Institute (QCRI) has released Fanar 2.0, a new version of its open-source Arabic language processing toolkit. Fanar 2.0 includes improved models for named entity recognition, part-of-speech tagging, and dependency parsing. The toolkit is designed to support researchers and developers working on Arabic NLP applications. Why it matters: This release enhances the accessibility of advanced Arabic NLP tools, crucial for developing AI solutions tailored to the Arabic-speaking world.