SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models
arXiv · Significant research
Summary
The paper introduces SalamahBench, a new benchmark for evaluating the safety of Arabic Language Models (ALMs). The benchmark comprises 8,170 prompts spanning 12 categories aligned with the MLCommons Safety Hazard Taxonomy. Five state-of-the-art ALMs (Fanar 1 and 2, ALLaM 2, Falcon H1R, and Jais 2) were evaluated on the benchmark. Why it matters: SalamahBench enables standardized, category-aware safety evaluation and highlights the need for specialized safeguard mechanisms to achieve robust harm mitigation in ALMs.
Keywords
Arabic Language Models · Safety · Benchmark · SalamahBench · Evaluation