Assessing Large Language Models on Islamic Legal Reasoning: Evidence from Inheritance Law Evaluation
arXiv
The paper introduces a benchmark of 1,000 multiple-choice questions for evaluating LLMs on Islamic inheritance law ('ilm al-mawarith). Of the seven LLMs tested, o3 and Gemini 2.5 exceeded 90% accuracy, while ALLaM, Fanar, LLaMA, and Mistral scored below 50%. Error analysis revealed limitations in how the models handle structured legal reasoning. Why it matters: The results highlight both the challenges and the opportunities in adapting LLMs to complex, culturally specific legal domains such as Islamic jurisprudence.