Results for "Qiyas exam"

The Qiyas Benchmark: Measuring ChatGPT Mathematical and Language Understanding in Arabic

arXiv · Jun 28

Researchers introduce two new benchmarks, derived from the Qiyas exam, to evaluate mathematical reasoning and language understanding in Arabic. They tested ChatGPT-3.5-turbo and ChatGPT-4 on these benchmarks; the models achieved 49% and 64% accuracy, respectively. The benchmarks aim to address the lack of resources for evaluating Arabic language models.