New resources for fact-checking LLMs presented at EMNLP

MBZUAI · Notable

Summary

MBZUAI researchers presented new resources at EMNLP for improving the factuality of LLMs, including a web application for fact-checking LLM-generated text and benchmarks for evaluating automated fact-checkers. They found that current automated fact-checkers miss nearly 40% of false claims generated by LLMs. The study breaks down the fact-checking process into eight tasks, including decomposition and decontextualization, to identify where systems fail. Why it matters: This work addresses a critical challenge in the deployment of LLMs by providing tools and methods for improving their reliability and trustworthiness, which is essential for widespread adoption in sensitive applications.
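To make the "nearly 40%" figure concrete, here is a minimal sketch of how a miss rate over false claims could be computed. The function name, the label encoding, and the toy data are illustrative assumptions, not the benchmark or the metric definition used by the researchers.

```python
def miss_rate(gold_labels, predicted_labels):
    """Fraction of truly false claims that the checker failed to flag.

    Assumed encoding: True means the claim is factual, False means it is false.
    """
    # Keep only the claims that are actually false according to the gold labels.
    false_claims = [(g, p) for g, p in zip(gold_labels, predicted_labels) if g is False]
    if not false_claims:
        raise ValueError("no false claims in gold labels")
    # A false claim is "missed" when the checker predicts it is factual.
    missed = sum(1 for _, p in false_claims if p is True)
    return missed / len(false_claims)


# Hypothetical toy data, not results from the study.
gold = [False, False, False, False, False, True, True, True]
pred = [False, False, False, True, True, True, True, False]
rate = miss_rate(gold, pred)
print(f"miss rate on false claims: {rate:.0%}")
```

In this toy example the checker overlooks two of the five false claims. The factual claim it wrongly flags does not affect the miss rate, which only looks at claims that are false in the gold labels.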

Keywords

LLM · fact-checking · MBZUAI · EMNLP · benchmark

Towards Real-world Fact-Checking with Large Language Models

MBZUAI

Iryna Gurevych from TU Darmstadt presented research on using large language models for real-world fact-checking, focusing on dismantling misleading narratives from misinterpreted scientific publications and detecting misinformation via visual content. The research aims to explain why a false claim was believed, why it is false, and why the alternative is correct. Why it matters: Addressing misinformation, especially when supported by seemingly credible sources, is critical for public health, conflict resolution, and maintaining trust in institutions in the Middle East and globally.

Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts

arXiv · Jun 14

A new methodology uses LLMs to assess the factuality and bias of news outlets by emulating the criteria that professional fact-checkers apply. Prompts built from those criteria elicit LLM responses, which are then aggregated into predictions. Experiments show improvements over baselines, and an error analysis examines the effect of media popularity and region. The dataset and code are released at https://github.com/mbzuai-nlp/llm-media-profiling.
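The aggregation step can be pictured with a small sketch. The summary does not say how responses are combined, so the majority vote below is an assumption, and the function name, labels, and sample responses are hypothetical.

```python
from collections import Counter


def aggregate_by_majority(responses):
    """Return the most common label among per-prompt LLM responses.

    Majority voting is an assumed aggregation rule, chosen for illustration.
    Ties go to the label that appears first in the input.
    """
    if not responses:
        raise ValueError("need at least one response")
    # Normalize casing and whitespace so "High" and "high " count as one label.
    normalized = [r.strip().lower() for r in responses]
    label, _count = Counter(normalized).most_common(1)[0]
    return label


# Hypothetical answers from five criterion-based prompts about one outlet.
hypothetical_responses = ["High", "high ", "mixed", "high", "low"]
factuality = aggregate_by_majority(hypothetical_responses)
print(factuality)
```

A simple vote keeps each criterion's contribution equal. A weighted scheme or a learned classifier over the responses would be alternatives, again as assumptions rather than the released method.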

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

arXiv · Aug 6

MBZUAI researchers release OpenFactCheck, a unified framework to evaluate the factual accuracy of large language models. The framework includes modules for response evaluation, LLM evaluation, and fact-checker evaluation. OpenFactCheck is available as an open-source Python library, a web service, and via GitHub.

The cost of truth: An efficient fact-checking framework | NAACL