GCC AI Research

Results for "verification"

Making LLM accuracy a matter of fact

MBZUAI ·

MBZUAI NLP master's graduate Hasan Iqbal developed OpenFactCheck, a framework for fact-checking and evaluating the factual accuracy of large language models. The framework consists of three modules: ResponseEvaluator, LLMEvaluator, and CheckerEvaluator. OpenFactCheck was published at EMNLP 2024 and accepted at NAACL 2025 and COLING 2025, and Iqbal played an active role at COLING in Abu Dhabi. Why it matters: Automated fact-checking frameworks are crucial for ensuring that information generated by increasingly prevalent LLMs is reliable and trustworthy, especially in the Arabic-speaking world.

Fact-Checking Complex Claims with Program-Guided Reasoning

arXiv ·

This paper introduces ProgramFC, a fact-checking model that decomposes complex claims into simpler sub-tasks using a library of functions. The model uses LLMs to generate reasoning programs and executes them by delegating sub-tasks, enhancing explainability and data efficiency. Experiments on fact-checking datasets demonstrate ProgramFC's superior performance compared to baseline methods, with publicly available code and data.
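
The idea of decomposing a claim into a small program of sub-tasks can be illustrated with a minimal sketch. It is not ProgramFC's code: the function names `question`, `verify`, and `predict`, the `Step` structure, the toy `EVIDENCE` store, and the labels are all assumptions made for illustration, and the program here is written by hand rather than generated by an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Toy evidence store standing in for retrieval and QA models (assumption).
EVIDENCE: Dict[str, str] = {
    "capital of France": "Paris",
    "river through Paris": "Seine",
}


def question(query: str) -> str:
    """Hypothetical question-answering sub-task: look up the toy store."""
    return EVIDENCE.get(query, "unknown")


def verify(query: str, expected: str) -> bool:
    """Hypothetical verification sub-task: does the evidence match?"""
    return question(query) == expected


@dataclass
class Step:
    var: str                      # name the result is stored under
    func: Callable[..., object]   # sub-task handler to delegate to
    args: Tuple[str, ...]         # "{name}" is filled from earlier results


def run_program(steps: List[Step]) -> Dict[str, object]:
    """Execute the steps in order, passing earlier results to later steps."""
    env: Dict[str, object] = {}
    for step in steps:
        resolved = [arg.format(**env) for arg in step.args]
        env[step.var] = step.func(*resolved)
    return env


def predict(env: Dict[str, object], fact_vars: List[str]) -> str:
    """Combine the verified facts into a final label (hypothetical labels)."""
    return "SUPPORTED" if all(env[v] is True for v in fact_vars) else "REFUTED"


# Claim: "The river that flows through the capital of France is the Seine."
PROGRAM = [
    Step("city", question, ("capital of France",)),
    Step("fact_1", verify, ("river through {city}", "Seine")),
]

env = run_program(PROGRAM)
label = predict(env, ["fact_1"])
print(env, label)
```

Because each intermediate result is stored under a name, the executed program doubles as a step-by-step explanation of the verdict.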

Truth from uncertainty: using AI’s internal signals to spot hallucinations

MBZUAI ·

Researchers from MBZUAI developed "uncertainty quantification heads" (UQ heads) to detect hallucinations in language models by probing internal states and estimating the credibility of generated text. UQ heads leverage attention maps and logits to identify potential hallucinations without altering the model's generation process or relying on external knowledge. The team found that UQ heads achieved state-of-the-art performance in claim-level hallucination detection across different domains and languages. Why it matters: This approach offers a more efficient and accurate method for identifying hallucinations, improving the reliability and trustworthiness of language models in various applications.
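
As a rough illustration of scoring a claim from internal signals alone, the sketch below feeds two hand-crafted features into a tiny logistic head. It is a simplification under stated assumptions, not the researchers' method: the features (mean token log-probability and mean attention entropy), the weights, and the bias are hypothetical and untrained, whereas a real UQ head would be learned from data.

```python
import math
from typing import List, Sequence


def entropy(dist: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one attention distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def claim_features(token_probs: Sequence[float],
                   attention_rows: Sequence[Sequence[float]]) -> List[float]:
    """Summarize the tokens of one claim using signals the model already has."""
    mean_logprob = sum(math.log(p) for p in token_probs) / len(token_probs)
    mean_attn_entropy = sum(entropy(r) for r in attention_rows) / len(attention_rows)
    return [mean_logprob, mean_attn_entropy]


def uq_head(features: Sequence[float],
            weights: Sequence[float], bias: float) -> float:
    """Logistic probe: maps features to a hallucination score in (0, 1)."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked parameters (a real head would be trained).
WEIGHTS = [-2.0, 1.0]
BIAS = -1.0

# A claim generated with high token probabilities and peaked attention.
confident = uq_head(
    claim_features([0.9, 0.95, 0.9], [[0.9, 0.05, 0.05]] * 3), WEIGHTS, BIAS)

# A claim generated with low token probabilities and diffuse attention.
uncertain = uq_head(
    claim_features([0.3, 0.2, 0.4], [[1 / 3, 1 / 3, 1 / 3]] * 3), WEIGHTS, BIAS)

print(f"confident claim score: {confident:.2f}")
print(f"uncertain claim score: {uncertain:.2f}")
```

The head only reads probabilities and attention weights that generation already produced, so the generation process itself is left untouched.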

TOCKIFY TEST

KAUST ·

The provided content mentions KAUST (King Abdullah University of Science and Technology) and its association with King Abdullah bin Abdulaziz Al Saud. It also includes a copyright notice. Why it matters: This is a routine update reflecting KAUST's branding and legal information.

New synthetic-image detector focuses on what makes real images real

MBZUAI ·

MBZUAI researchers developed a new AI-generated image detection method called 'consistency verification' (ConV). Instead of training on labeled real and fake images, ConV identifies structural patterns unique to real photos using a data manifold concept. The system modifies images and uses DINOv2 to measure the difference between original and transformed representations, classifying images based on their proximity to the manifold. Why it matters: This approach offers a more robust way to detect AI-generated images without needing training data from every image generator, addressing a key limitation in the rapidly evolving landscape of AI image synthesis.
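
The transform-and-compare step can be sketched as follows. This is a toy under stated assumptions, not ConV itself: `toy_embed` stands in for DINOv2, images are flat lists of floats, and the transforms, the threshold, and the rule that a small representation shift means "real" are hypothetical choices for illustration.

```python
import math
from typing import Callable, List, Sequence

Image = List[float]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def consistency_score(image: Image,
                      embed: Callable[[Image], Sequence[float]],
                      transforms: Sequence[Callable[[Image], Image]]) -> float:
    """Mean distance between the original and transformed representations."""
    base = embed(image)
    dists = [cosine_distance(base, embed(t(image))) for t in transforms]
    return sum(dists) / len(dists)


def classify(image: Image,
             embed: Callable[[Image], Sequence[float]],
             transforms: Sequence[Callable[[Image], Image]],
             threshold: float) -> str:
    """Assumed decision rule: small shift -> 'real', large -> 'generated'."""
    score = consistency_score(image, embed, transforms)
    return "real" if score <= threshold else "generated"


def toy_embed(image: Image) -> Image:
    """Placeholder for a pretrained encoder (assumption): identity features."""
    return list(image)


def flip(image: Image) -> Image:
    return list(reversed(image))


def brighten(image: Image) -> Image:
    return [1.1 * x for x in image]


TRANSFORMS = [flip, brighten]
THRESHOLD = 0.05  # hypothetical; would be calibrated in practice

stable = [1.0, 2.0, 2.0, 1.0]    # representation barely moves
unstable = [1.0, 0.0, 0.0, 0.0]  # representation moves a lot under flip

print(classify(stable, toy_embed, TRANSFORMS, THRESHOLD))
print(classify(unstable, toy_embed, TRANSFORMS, THRESHOLD))
```

No labeled fake images appear anywhere in the sketch, which is the property the summary highlights: the decision depends only on how an image's representation responds to transformation.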

Fact checking with ChatGPT

MBZUAI ·

A new paper from MBZUAI researchers explores using ChatGPT to combat the spread of fake news. The researchers, including Preslav Nakov and Liangming Pan, demonstrate that ChatGPT can be used to fact-check published information. Their paper, "Fact-Checking Complex Claims with Program-Guided Reasoning," was accepted at ACL 2023. Why it matters: This research highlights the potential of large language models to address the growing challenge of misinformation, with implications for maintaining information integrity in the digital age.

Detect – Verify – Communicate: Combating Misinformation with More Realistic NLP

MBZUAI ·

Iryna Gurevych from TU Darmstadt discussed challenges in using NLP for misinformation detection, highlighting the gap between current fact-checking research and real-world scenarios. Her team is working on detecting emerging misinformation topics and has constructed two corpora for fact checking using larger evidence documents. They are also collaborating with cognitive scientists to detect and respond to vaccine hesitancy using effective communication strategies. Why it matters: Addressing misinformation is crucial in the Middle East, especially regarding public health and socio-political issues, making advancements in NLP-based fact-checking highly relevant.

Formal Methods for Modern Payment Protocols

MBZUAI ·

Researchers at ETH Zurich have formalized models of the EMV payment protocol using the Tamarin model checker. They discovered flaws allowing attackers to bypass PIN requirements for high-value purchases on EMV cards like Mastercard and Visa. The team also collaborated with an EMV consortium member to verify the improved EMV Kernel C-8 protocol. Why it matters: This research highlights the importance of formal methods in identifying critical vulnerabilities in widely used payment systems, potentially impacting financial security for consumers in the GCC region and worldwide.