GCC AI Research

Results for "Quran"

Quranic Conversations: Developing a Semantic Search tool for the Quran using Arabic NLP Techniques

arXiv ·

Researchers developed a semantic search tool for the Quran using Arabic NLP techniques. The tool was trained on a dataset of over 30 tafsirs (interpretations) of the Quran. Using the SNxLM model and cosine similarity, the tool identifies Quranic verses most relevant to a user's query, achieving a similarity score of up to 0.97. Why it matters: This tool could significantly improve access to the Quran's teachings for Arabic speakers and researchers, providing a valuable resource for religious study and understanding.
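As a rough illustration of the cosine-similarity ranking step described above, here is a minimal sketch. It does not reproduce the paper's pipeline: the three-dimensional vectors are made-up stand-ins for real sentence embeddings, and the names `cosine_similarity`, `rank_verses`, and `toy_embeddings` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_verses(query_vec, verse_vecs, top_k=3):
    """Return the top_k (verse_id, score) pairs, highest similarity first."""
    scored = [
        (verse_id, cosine_similarity(query_vec, vec))
        for verse_id, vec in verse_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumption): a real system would obtain these from a
# sentence-embedding model applied to verses or their tafsir text.
toy_embeddings = {
    "verse_a": [1.0, 0.0, 0.0],
    "verse_b": [0.6, 0.8, 0.0],
    "verse_c": [0.0, 0.0, 1.0],
}

results = rank_verses([1.0, 0.1, 0.0], toy_embeddings, top_k=2)
for verse_id, score in results:
    print(verse_id, round(score, 3))
```

In practice the query would be embedded by the same model as the verses, so that scores are comparable across the collection.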

QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus

arXiv ·

The Qatar Computing Research Institute (QCRI) has released QASR, a 2,000-hour transcribed Arabic speech corpus collected from Aljazeera news broadcasts. The dataset features multi-dialect speech sampled at 16 kHz, aligned with lightly supervised transcriptions and linguistically motivated segmentation. QCRI also released a 130M-word dataset to improve language model training. Why it matters: QASR enables new research in Arabic speech recognition, dialect identification, punctuation restoration, and other NLP tasks for spoken data.

Quinoa-quest to feed the world

KAUST ·

A research team led by King Abdullah University of Science and Technology (KAUST) sequenced the first high-quality quinoa genome, an achievement that may enhance our ability to feed the world's growing population. Why it matters: This breakthrough in genomics could lead to more resilient and nutritious crops, contributing to global food security efforts.

Assessing Large Language Models on Islamic Legal Reasoning: Evidence from Inheritance Law Evaluation

arXiv ·

The paper introduces a benchmark of 1,000 multiple-choice questions to evaluate LLMs on Islamic inheritance law ('ilm al-mawarith). Seven LLMs were tested, with o3 and Gemini 2.5 achieving over 90% accuracy, while ALLaM, Fanar, LLaMA, and Mistral scored below 50%. Error analysis revealed limitations in handling structured legal reasoning. Why it matters: This research highlights the challenges and opportunities for adapting LLMs to complex, culturally-specific legal domains like Islamic jurisprudence.

New research to boost global date fruit production

KAUST ·

KAUST researchers are undertaking a project to improve global date palm production and protection by studying the date palm genome, using samples collected from ancient palms near Madinah. They aim to develop new breeding strategies for faster-growing, healthier, and more pest-resistant palms. The research involves advanced genome sequencing and the creation of molecular tools to improve date palm agriculture, including rapid sex determination methods and gene editing. Why it matters: This research is critical for enhancing date production in arid regions like Saudi Arabia, which is a major global producer, and for ensuring food security amidst climate challenges.

KAUST and Umm Al-Qura University strengthen academic and technical collaboration

KAUST ·

KAUST and Umm Al-Qura University (UQU) have signed a memorandum of understanding (MoU) to collaborate in education, training, scientific research, and professional development. The MoU covers developing joint training programs, updating curricula, providing consultancy, and organizing workshops. The partnership aims to support academic and technological advancement, develop national talent, and align with Saudi Vision 2030. Why it matters: This collaboration strengthens Saudi Arabia's knowledge-based economy by linking KAUST's research environment with that of another major university.

Enhanced Arabic Text Retrieval with Attentive Relevance Scoring

arXiv ·

This paper introduces an enhanced Dense Passage Retrieval (DPR) framework tailored for Arabic text retrieval. The core innovation is an Attentive Relevance Scoring (ARS) mechanism that improves semantic relevance modeling between questions and passages, replacing standard interaction methods. The method integrates pre-trained Arabic language models and architectural refinements, achieving improved retrieval and ranking accuracy for Arabic question answering. Why it matters: This work addresses the underrepresentation of Arabic in NLP research by providing a novel approach and publicly available code to improve Arabic text retrieval, which can benefit various applications like Arabic search engines and question-answering systems.

SectEval: Evaluating the Latent Sectarian Preferences of Large Language Models

arXiv ·

The paper introduces SectEval, a new benchmark to evaluate sectarian biases in LLMs concerning Sunni and Shia Islam, available in English and Hindi. Results show significant inconsistencies in LLM responses based on language, with some models favoring Shia responses in English but Sunni in Hindi. Location-based experiments further reveal that advanced models adapt their responses based on the user's claimed country, while smaller models exhibit a consistent Sunni-leaning bias.