Results for "diversity atlas"

Multimodal single-cell atlas for ancestry-based diversity of immune system

MBZUAI

The Russian Immune Diversity Atlas project aims to profile immune cells from people of different ancestries at the multiomics level. The goal is to reconstruct a reference atlas of the healthy immune system and investigate its perturbations in type 2 diabetes (T2D). As part of the international Human Cell Atlas, the project seeks to identify novel mechanisms and genetic/epigenetic markers for early T2D diagnostics, prognosis, and therapy. Why it matters: Addressing genetic diversity in biomedical research, particularly in the context of the Human Cell Atlas, is crucial for personalized medicine and for ensuring that treatments are effective across diverse populations in the Middle East and globally.

Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect

arXiv · Sep 26

Researchers developed Atlas-Chat, a collection of LLMs for dialectal Arabic, focusing on Moroccan Arabic (Darija). They constructed an instruction dataset by consolidating existing Darija language resources and translating English instructions. Atlas-Chat models (2B, 9B, 27B) outperform state-of-the-art and Arabic-specialized LLMs like LLaMa, Jais, and AceGPT on Darija NLP tasks. Why it matters: This work addresses the gap in LLM support for low-resource Arabic dialects, providing a methodology for instruction-tuning and benchmarks for future research.

Advancing cultural diversity through AI

MBZUAI

MBZUAI is conducting research to improve cross-cultural understanding using AI, including studying the limitations of LLMs in recognizing cultural references. Its researchers developed "Culturally Yours," a tool that helps users comprehend cultural references in text, and the "All Languages Matter Benchmark" (ALM Bench), which evaluates multimodal LLMs across 100 languages. MBZUAI has also developed LLMs tailored to low-resource languages: Jais (Arabic), Nanda (Hindi), and Sherkala (Kazakh). Why it matters: These initiatives promote inclusivity and help ensure that AI systems are culturally aware and can serve diverse populations effectively, particularly in the Middle East's multicultural context.

ArabJobs: A Multinational Corpus of Arabic Job Ads

arXiv · Sep 26

The ArabJobs dataset is a new corpus of over 8,500 Arabic job advertisements collected from Egypt, Jordan, Saudi Arabia, and the UAE. The dataset contains over 550,000 words and captures linguistic, regional, and socio-economic variation in the Arab labor market. It is available on GitHub and can be used for fairness-aware Arabic NLP and labor market research.