MBZUAI researchers release OpenFactCheck, a unified framework to evaluate the factual accuracy of large language models. The framework includes modules for response evaluation, LLM evaluation, and fact-checker evaluation. OpenFactCheck is available as an open-source Python library and as a web service, with its code hosted on GitHub.
A new paper from MBZUAI researchers explores using ChatGPT to combat the spread of fake news. The researchers, including Preslav Nakov and Liangming Pan, demonstrate that ChatGPT can be used to fact-check published information. Their paper, "Fact-Checking Complex Claims with Program-Guided Reasoning," was accepted at ACL 2023. Why it matters: This research highlights the potential of large language models to address the growing challenge of misinformation, with implications for maintaining information integrity in the digital age.
MBZUAI NLP master's graduate Hasan Iqbal developed OpenFactCheck, a framework for fact-checking and evaluating the factual accuracy of large language models. The framework consists of three modules: ResponseEvaluator, LLMEvaluator, and CheckerEvaluator. OpenFactCheck was published at EMNLP 2024 and accepted at NAACL 2025 and COLING 2025, with Iqbal playing an active role at COLING in Abu Dhabi. Why it matters: The development of automated fact-checking frameworks is crucial for ensuring the reliability and trustworthiness of information generated by increasingly prevalent LLMs, especially in the Arabic-speaking world.
This paper introduces ProgramFC, a fact-checking model that decomposes complex claims into simpler sub-tasks using a library of functions. The model uses LLMs to generate reasoning programs and executes them by delegating each sub-task to a handler, which improves explainability and data efficiency. Experiments on fact-checking datasets show that ProgramFC outperforms baseline methods. The code and data are publicly available.
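The decompose-then-execute idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sub-task names `question`, `verify`, and `predict`, the hand-written program, and the made-up lookup table that stands in for the models a real system would call. It is not the released ProgramFC code.

```python
from typing import Callable, Dict, List, Tuple

# Made-up lookup table standing in for the sub-task solvers (assumption:
# a real system would delegate these calls to models, not to a dict).
TOY_ANSWERS: Dict[str, object] = {
    "Who wrote Silver Tide?": "Ana Ruiz",
    "Ana Ruiz was born in Lisbon.": True,
}


def question(text: str) -> object:
    """Hypothetical sub-task: answer a question."""
    return TOY_ANSWERS[text]


def verify(text: str) -> object:
    """Hypothetical sub-task: check a simple statement (unknown -> False)."""
    return TOY_ANSWERS.get(text, False)


HANDLERS: Dict[str, Callable[[str], object]] = {
    "question": question,
    "verify": verify,
}

# A reasoning program: (variable to assign, sub-task name, argument template).
Step = Tuple[str, str, str]
PROGRAM: List[Step] = [
    ("author", "question", "Who wrote Silver Tide?"),
    ("fact_1", "verify", "{author} was born in Lisbon."),
]


def run_program(program: List[Step]) -> Dict[str, object]:
    """Execute steps in order, filling templates with earlier results."""
    env: Dict[str, object] = {}
    for var, task, template in program:
        argument = template.format(**env)
        env[var] = HANDLERS[task](argument)
    return env


def predict(env: Dict[str, object]) -> bool:
    """The claim holds only if every verified fact came back True."""
    return all(
        value is True for name, value in env.items() if name.startswith("fact_")
    )


env = run_program(PROGRAM)
print(env, predict(env))
```

Because each intermediate result is stored under a named variable, the executed program doubles as a step-by-step explanation of the verdict.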
A new methodology that emulates the criteria used by professional fact-checkers assesses the factuality and bias of news outlets using LLMs. The approach uses prompts based on fact-checking criteria to elicit LLM responses, then aggregates them into predictions. Experiments demonstrate improvements over baselines, and an error analysis examines the effect of media popularity and region. The dataset and code are released at https://github.com/mbzuai-nlp/llm-media-profiling.
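A rough sketch of the elicit-and-aggregate pattern follows. The criteria wording, the yes/no answer format, the averaging rule, the threshold, and the `fake_llm` stub are all assumptions chosen for illustration. The summary does not specify how the released code phrases prompts or combines responses.

```python
from statistics import mean
from typing import Callable, List

# Hypothetical criteria; the wording is illustrative only.
CRITERIA: List[str] = [
    "Does {outlet} cite verifiable sources?",
    "Does {outlet} publish corrections?",
    "Does {outlet} avoid sensational headlines?",
]


def build_prompts(outlet: str) -> List[str]:
    """Turn each criterion into one prompt about the given outlet."""
    return [criterion.format(outlet=outlet) for criterion in CRITERIA]


def predict_factuality(
    outlet: str, ask: Callable[[str], str], threshold: float = 0.5
) -> str:
    """Ask one question per criterion and average the yes/no answers.

    Averaging with a fixed threshold is an assumed aggregation rule.
    """
    answers = [ask(prompt) for prompt in build_prompts(outlet)]
    score = mean(
        1.0 if answer.strip().lower().startswith("yes") else 0.0
        for answer in answers
    )
    return "high" if score >= threshold else "low"


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real model call (assumption)."""
    return "Yes" if ("sources" in prompt or "corrections" in prompt) else "No"


print(predict_factuality("Example Daily", fake_llm))
```

Keeping the model behind a plain callable makes it easy to swap the stub for a real client without touching the aggregation logic.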
EURECOM researchers developed data-driven verification methods that use structured datasets to assess statistical and property claims. For statistical claims, the approach translates the claim text into SQL queries over relational databases. For property claims, it uses knowledge graphs to verify the claim and generate explanations. Why it matters: The methods aim to support fact-checkers by efficiently labeling claims with interpretable explanations, potentially combating misinformation in the region and beyond.
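As an illustration of checking a statistical claim against a relational table, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes the claim has already been parsed into a region, a year, and a claimed average, which is the hard step and is not shown. The table name, columns, data, and tolerance are invented for the example and are not taken from the EURECOM work.

```python
import sqlite3

# In-memory table with made-up numbers (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (region TEXT, year INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [("North", 2020, 10.0), ("North", 2020, 20.0), ("North", 2021, 40.0)],
)


def verify_average_claim(
    conn: sqlite3.Connection,
    region: str,
    year: int,
    claimed_value: float,
    tolerance: float = 0.01,
) -> str:
    """Check a claim of the form 'the average for <region> in <year> was X'."""
    row = conn.execute(
        "SELECT AVG(value) FROM measurements WHERE region = ? AND year = ?",
        (region, year),
    ).fetchone()
    actual = row[0]
    if actual is None:  # AVG over zero rows yields NULL
        return "not enough evidence"
    # Relative tolerance so that small rounding in the claim is accepted.
    if abs(actual - claimed_value) <= tolerance * abs(claimed_value):
        return "supported"
    return "refuted"


print(verify_average_claim(conn, "North", 2020, 15.0))
```

The query itself can be shown to a fact-checker as the explanation for the label, which is the interpretability benefit the summary points to.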
A novel agent-based framework called FIRE is introduced for fact-checking long-form text. FIRE iteratively integrates evidence retrieval and claim verification: at each step it decides whether to provide a final answer or to generate a subsequent search query. Experiments show that FIRE achieves performance comparable to existing methods while reducing LLM costs by 7.6x and search costs by 16.5x.
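The answer-or-search loop can be sketched as follows. The function names, the step budget, the fallback label, and the toy `decide` and `search` stubs are assumptions for illustration, not the FIRE implementation. In a real system both callables would be backed by an LLM and a search engine.

```python
from typing import Callable, Dict, List, Tuple

Decision = Tuple[str, str]  # ("answer", label) or ("search", query)


def iterative_verify(
    claim: str,
    decide: Callable[[str, List[str]], Decision],
    search: Callable[[str], List[str]],
    max_steps: int = 5,
) -> Tuple[str, int]:
    """Alternate between deciding and searching until an answer is given.

    Returns the label and the number of searches that were issued.
    """
    evidence: List[str] = []
    searches = 0
    for _ in range(max_steps):
        action, payload = decide(claim, evidence)
        if action == "answer":
            return payload, searches
        evidence.extend(search(payload))
        searches += 1
    # Budget exhausted without a confident answer (assumed fallback label).
    return "not enough evidence", searches


# Toy stand-ins with made-up content.
CORPUS: Dict[str, List[str]] = {
    "Silver Tide publication year": ["Silver Tide was first published in 1999."],
}


def toy_search(query: str) -> List[str]:
    return CORPUS.get(query, [])


def toy_decide(claim: str, evidence: List[str]) -> Decision:
    if not evidence:
        return ("search", "Silver Tide publication year")
    verdict = "supported" if any("1999" in e for e in evidence) else "refuted"
    return ("answer", verdict)


print(iterative_verify("Silver Tide was published in 1999.", toy_decide, toy_search))
```

Stopping as soon as the decision step is confident is what saves calls: a claim that needs no search costs one decision and zero retrievals.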
Researchers from MBZUAI have introduced UrduFactCheck, a new framework for fact-checking in Urdu, along with two datasets: UrduFactBench and UrduFactQA. The framework uses monolingual and translation-based evidence retrieval to address the lack of Urdu resources. Evaluations using twelve LLMs showed that translation-augmented methods improve performance, highlighting challenges for open-source LLMs in Urdu.