GCC AI Research


Results for "SawabAI"

MBZUAI team wins government communication competition

MBZUAI ·

A team of MBZUAI graduate students won first place at the UAE University's University Challenge for their SawabAI project, which addresses AI-generated misinformation about climate change. The winning team included Salem Bin Saqer AlMarri, Hanoona Bangalath, and Muhammad Maaz, all Computer Vision Ph.D. students. SawabAI is envisioned as a platform to evaluate the authenticity and source of information, including text, image, and video, to combat fake news. Why it matters: This win highlights the growing importance of AI in addressing misinformation and promoting sustainability in government communication within the region.

Vicuna, Altman, and the importance of green AI

MBZUAI ·

MBZUAI President Eric Xing led a global collaboration to develop Vicuna, an LLM alternative to GPT-3 that addresses the unsustainable costs of training large language models. OpenAI CEO Sam Altman acknowledged Abu Dhabi's role in the global AI conversation, building on achievements like Vicuna. Xing and colleagues are publishing research at MLSys 2023 on "cross-mesh resharding," a technique for speeding up communication between groups of devices in distributed deep learning, aiming for low-carbon, affordable, and miniaturized AI. Why it matters: This research signals a push toward sustainable AI development in the region, emphasizing efficiency and reduced environmental impact.

ADAB: Arabic Dataset for Automated Politeness Benchmarking -- A Large-Scale Resource for Computational Sociopragmatics

arXiv ·

The paper introduces ADAB, a new annotated Arabic dataset for politeness detection collected from online platforms. The dataset covers Modern Standard Arabic and multiple dialects (Gulf, Egyptian, Levantine, and Maghrebi). It contains 10,000 samples across 16 politeness categories and achieves substantial inter-annotator agreement (kappa = 0.703). Why it matters: This dataset addresses the under-explored area of Arabic-language resources for politeness detection, which is crucial for culturally-aware NLP systems.
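Inter-annotator agreement figures like the kappa = 0.703 reported above are conventionally Cohen's kappa, which discounts agreement expected by chance. A minimal sketch of the computation; the annotation lists are hypothetical examples, not ADAB data:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa between two annotators labeling the same items."""
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items both annotators label identically.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: probability both pick the same label independently.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[lbl] * cb[lbl] for lbl in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical politeness annotations over six samples.
ann1 = ["polite", "polite", "impolite", "polite", "impolite", "polite"]
ann2 = ["polite", "impolite", "impolite", "polite", "impolite", "polite"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.667
```

Values above roughly 0.6 are commonly read as "substantial" agreement, which is the interpretation the paper's 0.703 invokes.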

SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

arXiv ·

The paper introduces SalamahBench, a new benchmark for evaluating the safety of Arabic Language Models (ALMs). The benchmark comprises 8,170 prompts across 12 categories aligned with the MLCommons Safety Hazard Taxonomy. Five state-of-the-art ALMs, including Fanar 1 and 2, ALLaM 2, Falcon H1R, and Jais 2, were evaluated using the benchmark. Why it matters: The benchmark enables standardized, category-aware safety evaluation, highlighting the necessity of specialized safeguard mechanisms for robust harm mitigation in ALMs.
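The category-aware evaluation that SalamahBench enables amounts to tallying safe-response rates per hazard category rather than one overall score. A minimal sketch under that assumption; the category names and judgment pairs below are hypothetical stand-ins, not the benchmark's actual data or API:

```python
from collections import defaultdict

def safe_response_rate(results):
    """Aggregate per-category safe-response rates from (category, is_safe) pairs."""
    totals = defaultdict(int)
    safes = defaultdict(int)
    for category, is_safe in results:
        totals[category] += 1
        safes[category] += int(is_safe)
    return {c: safes[c] / totals[c] for c in totals}

# Hypothetical safety judgments for two hazard categories.
results = [
    ("hate", True), ("hate", True), ("hate", False),
    ("self-harm", True), ("self-harm", True),
]
rates = safe_response_rate(results)
print(rates)
```

Per-category breakdowns of this kind are what let a benchmark surface uneven safeguard coverage, e.g. a model that refuses one hazard class reliably but not another.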

UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat

arXiv ·

This paper presents a UI-level evaluation of ALLaM-34B, an Arabic-centric LLM developed by SDAIA and deployed in the HUMAIN Chat service. The evaluation used a prompt pack spanning various Arabic dialects, code-switching, reasoning, and safety, with outputs scored by frontier LLM judges. Results indicate strong performance in generation, code-switching, MSA handling, and reasoning, along with improved dialect fidelity, positioning ALLaM-34B as a robust Arabic LLM suitable for real-world use.
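The LLM-as-judge pipeline described above can be sketched as follows; `judge_score`, the rubric dimensions, and the prompt pack are hypothetical placeholders (a real run would call a frontier judge model and the deployed chat UI), not the paper's actual harness:

```python
from statistics import mean

def judge_score(prompt, response, dimension):
    """Hypothetical judge: in practice this would send the prompt/response
    pair plus a rubric to a frontier LLM and parse a numeric score."""
    return 4.0  # placeholder score on an assumed 1-5 rubric

def evaluate(prompt_pack, model_fn, dimensions=("fluency", "safety")):
    """Score a model's outputs over a prompt pack, averaging per dimension."""
    scores = {d: [] for d in dimensions}
    for item in prompt_pack:
        response = model_fn(item["prompt"])
        for d in dimensions:
            scores[d].append(judge_score(item["prompt"], response, d))
    return {d: mean(vals) for d, vals in scores.items()}

# Hypothetical prompt pack mixing Arabic and English items.
pack = [{"prompt": "اشرح الحوسبة السحابية"}, {"prompt": "What is MSA?"}]
result = evaluate(pack, lambda p: "...")
print(result)  # → {'fluency': 4.0, 'safety': 4.0}
```

Averaging judge scores per dimension is what allows the paper to report separate findings for generation quality, code-switching, and dialect fidelity rather than a single aggregate number.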