MBZUAI researchers have developed an automated interview system that uses LLMs to elicit nuanced, role-specific information from job candidates, with the goal of improving early-stage hiring decisions. The system updates its belief about an applicant's rubric-oriented latent traits in a calibrated way based on their interview performance. In evaluations on simulated interviews, the system's belief converged toward the simulated applicants' constructed ability levels.
This paper introduces an AI framework for autonomous assessment of student work, addressing policy gaps in academic practices. A survey of 117 academics from the UK, UAE, and Iraq reveals positive attitudes toward AI in education, particularly for autonomous assessment. The study also highlights a lack of awareness of modern AI tools among experienced academics, emphasizing the need for updated policies and training.
FIRE, a novel agent-based framework, is introduced for fact-checking long-form text. FIRE iteratively integrates evidence retrieval and claim verification: at each step it decides whether to provide a final answer or to generate a subsequent search query. Experiments show that FIRE achieves performance comparable to existing methods while reducing LLM costs by 7.6x and search costs by 16.5x.
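The answer-or-search loop described above can be illustrated with a minimal sketch. This is not FIRE's actual implementation: the names `Step`, `check_claim`, `decide`, `search`, and `max_steps` are hypothetical, and the LLM and the search engine are replaced by plain callables supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    """One decision: either a final verdict or a follow-up search query (hypothetical type)."""
    verdict: Optional[bool] = None
    query: Optional[str] = None


def check_claim(
    claim: str,
    decide: Callable[[str, List[str]], Step],   # stands in for the LLM call (assumption)
    search: Callable[[str], List[str]],          # stands in for the search tool (assumption)
    max_steps: int = 5,                          # illustrative budget, not from the paper
) -> Tuple[Optional[bool], List[str]]:
    """Alternate between verification and retrieval until a verdict is reached."""
    evidence: List[str] = []
    for _ in range(max_steps):
        step = decide(claim, evidence)
        if step.verdict is not None:
            # The verifier is confident enough to answer: stop searching.
            return step.verdict, evidence
        # Otherwise run the generated query and accumulate the new evidence.
        evidence.extend(search(step.query or claim))
    # Budget exhausted without a verdict.
    return None, evidence


# Toy stand-ins so the sketch runs without any external service.
def toy_decide(claim: str, evidence: List[str]) -> Step:
    if not evidence:
        return Step(query=claim)
    return Step(verdict=any("supports" in e for e in evidence))


def toy_search(query: str) -> List[str]:
    return [f"source supports: {query}"]


verdict, used_evidence = check_claim("water boils at 100 C at sea level", toy_decide, toy_search)
```

Because the loop stops as soon as a verdict is available, easy claims consume few or no search calls, which is the kind of behavior that would plausibly drive the cost reductions reported above.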
A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. Built from human-validated samples, the benchmark addresses the limitations of existing datasets by combining temporal length with multimodal richness. LongShOTAgent, an agentic system for analyzing long videos, is also presented. Both the benchmark and the agent highlight the challenges that long videos pose for state-of-the-art MLLMs.
This research explores the use of generative AI, specifically ChatGPT, to create student assessments that align with academic accreditation standards, such as those of the National Center for Academic Accreditation in Saudi Arabia and ABET. The study introduces a method for mapping verbs used in questions to educational outcomes, enabling AI to produce and validate accreditation-compliant questions. A survey of faculty members in Saudi universities showed high acceptance rates for AI-generated exam questions and AI assistance in editing existing questions.
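The verb-to-outcome mapping mentioned above can be pictured with a small sketch. The table below is purely illustrative: the verbs, the outcome labels, and the helper names `outcomes_for` and `is_compliant` are assumptions for this example, not the mapping used in the study or the official taxonomy of any accreditation body.

```python
import re
from typing import Dict, Set

# Hypothetical verb -> learning-outcome table (placeholder labels, not an official taxonomy).
VERB_TO_OUTCOME: Dict[str, str] = {
    "define": "knowledge",
    "list": "knowledge",
    "explain": "understanding",
    "apply": "skills",
    "design": "skills",
    "evaluate": "values",
}


def outcomes_for(question: str) -> Set[str]:
    """Return the outcome labels triggered by verbs found in the question."""
    words = re.findall(r"[a-z]+", question.lower())
    return {VERB_TO_OUTCOME[w] for w in words if w in VERB_TO_OUTCOME}


def is_compliant(question: str, target_outcome: str) -> bool:
    """Check whether a question contains a verb mapped to the target outcome."""
    return target_outcome in outcomes_for(question)


example = "Explain and apply Ohm's law to the circuit shown."
example_outcomes = outcomes_for(example)
```

A check like `is_compliant` could serve as a simple validation step on generated questions, although a real system would need more than keyword matching to handle verb inflections and context.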