This paper introduces an AI framework for autonomous assessment of student work, addressing policy gaps in academic practices. A survey of 117 academics from the UK, UAE, and Iraq reveals positive attitudes toward AI in education, particularly for autonomous assessment. The study also highlights a lack of awareness of modern AI tools among experienced academics, emphasizing the need for updated policies and training.
The paper introduces ALPS (Arabic Linguistic & Pragmatic Suite), a diagnostic challenge set for evaluating deep semantics and pragmatics in Arabic NLP. The dataset contains 531 expert-curated questions across 15 tasks and 47 subtasks, designed to test morpho-syntactic dependencies and compositional semantics. Evaluation of 23 models, including commercial, open-source, and Arabic-native models, reveals that models struggle with fundamental morpho-syntactic dependencies, especially those reliant on diacritics. Why it matters: ALPS provides a valuable benchmark for evaluating the linguistic competence of Arabic NLP models, highlighting areas where current models fall short despite achieving high fluency.
This research explores the use of generative AI, specifically ChatGPT, to create student assessments that align with academic accreditation standards, such as those of the National Center for Academic Accreditation in Saudi Arabia and ABET. The study introduces a method for mapping verbs used in questions to educational outcomes, enabling AI to produce and validate accreditation-compliant questions. A survey of faculty members in Saudi universities showed high acceptance rates for AI-generated exam questions and AI assistance in editing existing questions.
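The verb-to-outcome mapping described above can be illustrated with a minimal sketch. The study's actual verb table and outcome categories are not given in this summary, so the `VERB_TO_OUTCOME` table, its Bloom-style labels, and the helper names `outcomes_for_question` and `is_compliant` are all assumptions made for illustration only.

```python
import re

# Hypothetical verb-to-outcome table (Bloom-style labels are an assumption;
# the study's real mapping is not reproduced in this summary).
VERB_TO_OUTCOME = {
    "define": "knowledge",
    "explain": "comprehension",
    "apply": "application",
    "analyze": "analysis",
    "design": "synthesis",
    "evaluate": "evaluation",
}


def outcomes_for_question(question: str) -> list[str]:
    """Return outcome labels for mapped verbs found in the question,
    in order of first appearance and without duplicates."""
    words = re.findall(r"[a-z]+", question.lower())
    found = []
    for word in words:
        outcome = VERB_TO_OUTCOME.get(word)
        if outcome is not None and outcome not in found:
            found.append(outcome)
    return found


def is_compliant(question: str, target_outcome: str) -> bool:
    """Check whether a question contains a verb mapped to the target outcome."""
    return target_outcome in outcomes_for_question(question)


if __name__ == "__main__":
    draft = "Explain and apply the sorting algorithm."
    print(outcomes_for_question(draft))
    print(is_compliant(draft, "application"))
```

A plain word lookup like this cannot tell a verb from a noun with the same spelling, so a real validator would likely need part-of-speech tagging or a model-based check on top of the table.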
This paper presents a comprehensive evaluation of ChatGPT's performance across 44 Arabic NLP tasks using over 60 datasets. The study compares ChatGPT's capabilities in Modern Standard Arabic (MSA) and Dialectal Arabic (DA) against smaller, fine-tuned models. Results show that these fine-tuned models outperform ChatGPT, and that ChatGPT handles Arabic dialects less well than MSA. Why it matters: The work highlights the need for further research and development of Arabic-specific NLP models to overcome the limitations of general-purpose models like ChatGPT.