GCC AI Research

Results for "cultural understanding"

Advancing cultural diversity through AI

MBZUAI ·

MBZUAI is researching how AI can improve cross-cultural understanding, including the limitations of LLMs in recognizing cultural references. Its researchers developed "Culturally Yours," a tool that helps users understand cultural references in text, and the "All Languages Matter Benchmark" (ALM Bench), which evaluates multimodal LLMs across 100 languages. MBZUAI has also developed LLMs for low-resource languages, including Jais (Arabic), Nanda (Hindi), and Sherkala (Kazakh). Why it matters: These initiatives promote inclusivity and help ensure that AI systems are culturally aware and can serve diverse populations effectively, particularly in the Middle East's multicultural context.

Why AI can describe an image but struggles to understand the culture inside it

MBZUAI ·

A new paper from MBZUAI introduces JEEM, a benchmark dataset for evaluating vision-language models on their understanding of images grounded in four Arabic-speaking societies (Jordan, UAE, Egypt, and Morocco) and their ability to use local dialects. The dataset comprises 2,178 images and 10,890 question-answer pairs reflecting everyday life and culturally specific scenes. Evaluation of several Arabic-capable models (Maya, PALO, Peacock, AIN, AyaV) and GPT-4o revealed that while models can generate fluent language, they struggle with genuine understanding, consistency, and relevance, especially when cultural context is important. Why it matters: This research highlights the challenges of building AI systems that can truly understand and interact with diverse cultures, emphasizing the need for culturally grounded datasets and evaluation metrics.

Commonsense Reasoning in Arab Culture

arXiv ·

A new dataset called ArabCulture is introduced to address the lack of culturally relevant commonsense reasoning resources in Arabic AI. The dataset covers 13 countries across the Gulf, Levant, North Africa, and the Nile Valley, spanning 12 daily life domains with 54 fine-grained subtopics. It was built from scratch by native speakers writing and validating culturally relevant questions. Why it matters: The dataset highlights the need for more culturally aware models and benchmarks tailored to the Arabic-speaking world, moving beyond machine-translated resources.

Why AI can describe an image but struggles to understand the culture inside it

MBZUAI ·

MBZUAI researchers have released JEEM, a new benchmark dataset for evaluating vision-language models on Arabic dialects. The dataset covers image captioning and visual question answering tasks using images from Jordan, UAE, Egypt, and Morocco. Results show that models struggle with cultural understanding and relevance despite generating fluent language.

What LLMs get wrong about culture — and how to fix them: Two studies from NAACL

MBZUAI ·

MBZUAI researchers presented two studies at NAACL 2025 on how LLMs understand cultural differences, one of which won the SAC award. One study, titled "Reading between the lines: Can LLMs identify cross-cultural communication gaps?," assesses GPT-4o's ability to identify cultural references in Goodreads book reviews. The researchers created a benchmark dataset using annotations from 50 evaluators across different cultures to measure the LLM's ability to identify culture-specific items (CSIs). Why it matters: Improving LLMs' cross-cultural understanding is crucial for ensuring these models can be used effectively and equitably across diverse global contexts.

Culturally Yours: A new tool for understanding cultural references in text

MBZUAI ·

MBZUAI researchers have developed "Culturally Yours," a reading assistant that highlights and explains culturally specific items on webpages to help users understand unfamiliar terms. The tool addresses the "cold-start problem" by asking users for demographic information, which it uses to personalize the identification of potentially unfamiliar cultural references. It was presented at the 31st International Conference on Computational Linguistics in Abu Dhabi. Why it matters: This tool can help bridge linguistic and cultural gaps, particularly for underrepresented languages and cultures, and aid businesses in reaching diverse audiences.

Measuring cultural commonsense in the Arabic-speaking world with a new benchmark

MBZUAI ·

MBZUAI researchers have created ArabCulture, a new benchmark dataset to measure cultural commonsense reasoning capabilities in Arabic language models. The dataset was built by native Arabic speakers from 13 countries and is the largest of its kind. Testing 31 language models, the researchers found that many systems struggle with understanding cultural concepts across the Arab world. Why it matters: The new benchmark addresses the lack of culturally relevant commonsense reasoning resources for Arabic, enabling development of culturally aware AI systems tailored to the nuances of the Arabic-speaking world.

Teaching language models about Arab culture through cross-cultural transfer

MBZUAI ·

MBZUAI researchers presented a method for cross-cultural transfer learning to improve language models' understanding of diverse Arab cultures. They used in-context learning and demonstration-based reinforcement (DITTO) to transfer cultural knowledge between countries. Experiments showed up to 34% improvement in performance on cultural understanding benchmarks using only a few demonstrations. Why it matters: This research addresses the gap in cultural understanding of Arabic language models, especially for smaller Arab countries, and provides a novel transfer learning approach.