Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs

arXiv · May 23, 2025 · Significant research

Summary

MBZUAI researchers release 'Fann or Flop', a new benchmark for evaluating how well LLMs understand Arabic poetry. The benchmark covers 12 historical eras and 14 poetic genres, and assesses semantic understanding, metaphor interpretation, and cultural context. An evaluation of state-of-the-art LLMs shows that they still struggle with poetic understanding despite strong performance on standard Arabic benchmarks.

Keywords

Arabic poetry · LLM · benchmark · cultural context · semantic understanding

