From Words to Proverbs: Evaluating LLMs Linguistic and Cultural Competence in Saudi Dialects with Absher

arXiv · July 14, 2025 · Significant research

Summary

This paper introduces Absher, a new benchmark for evaluating LLMs' linguistic and cultural competence in Saudi dialects. The benchmark comprises over 18,000 multiple-choice questions spanning six categories, using dialectal words, phrases, and proverbs from various regions of Saudi Arabia. Evaluation of state-of-the-art LLMs reveals performance gaps, especially in cultural inference and contextual understanding, highlighting the need for dialect-aware training.

Keywords

LLM · Arabic · Saudi dialect · benchmark · cultural competence

Read original article →

Get the weekly digest

Top AI stories from the GCC region, every week.

SaudiCulture: A Benchmark for Evaluating Large Language Models Cultural Competence within Saudi Arabia

arXiv · Mar 21

The paper introduces SaudiCulture, a new benchmark for evaluating the cultural competence of LLMs within Saudi Arabia, covering five major geographical regions and diverse cultural domains. The benchmark includes questions of varying complexity and distinguishes between common and specialized regional knowledge. Evaluations of five LLMs (GPT-4, Llama 3.3, FANAR, Jais, and AceGPT) revealed performance declines on region-specific questions, highlighting the need for region-specific knowledge in LLM training.

From Words to Proverbs: Evaluating LLMs Linguistic and Cultural Competence in Saudi Dialects with Absher

Summary

Keywords

Related

SaudiCulture: A Benchmark for Evaluating Large Language Models Cultural Competence within Saudi Arabia