Skip to content
GCC AI Research

Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs

arXiv · · Significant research

Summary

MBZUAI researchers release 'Fann or Flop', a new benchmark for evaluating Arabic poetry understanding in LLMs. The benchmark covers 12 historical eras and 14 poetic genres, assessing semantic understanding, metaphor interpretation, and cultural context. Evaluation of state-of-the-art LLMs reveals challenges in poetic understanding despite strong performance on standard Arabic benchmarks.

Get the weekly digest

Top AI stories from the GCC region, every week.