MBZUAI Professor Preslav Nakov is researching methods to identify and combat the harmful uses of large language models in generating disinformation. He notes that disinformation, unlike fake news, is weaponized with the intent to persuade, not just to lie. His research focuses on the linguistic differences between human-written and machine-generated disinformation, such as the use of rhetorical devices in human propaganda. Why it matters: As AI-generated content becomes more prevalent, understanding and mitigating its potential for spreading disinformation is critical for maintaining trust and integrity in information ecosystems, especially during major election cycles.
GenAI Content Detection Task 1 is a shared task on detecting machine-generated text, with a monolingual (English) subtask and a multilingual subtask. The task, part of the GenAI workshop at COLING 2025, attracted 36 teams for the English subtask and 26 for the multilingual one. The organizers provide a detailed overview of the data, results, system rankings, and analysis of the submitted systems.
A study by MBZUAI's Preslav Nakov and Cornell co-authors examines how to develop systems that detect fake news in a landscape where text is generated by both humans and machines. The research, presented at the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024), analyzes how well fake news detectors handle human-written and machine-written content. The study highlights biases in current detectors, which tend to classify machine-written news as fake and human-written news as true. Why it matters: Addressing these biases is crucial as machine-generated content becomes more prevalent in both real and fake news, requiring more nuanced detection methods.
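One way to make the bias described above visible is to break a detector's predictions down by who wrote the text and whether it is actually true. The following is a minimal sketch under stated assumptions, not the study's own evaluation code: the function name `predicted_fake_rate_by_group`, the record layout, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def predicted_fake_rate_by_group(records):
    """Return the share of items predicted "fake" per (author, true_label) group.

    `records` is an iterable of (author, true_label, predicted_label) tuples.
    This layout is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    fake_predictions = defaultdict(int)
    for author, true_label, predicted_label in records:
        key = (author, true_label)
        totals[key] += 1
        if predicted_label == "fake":
            fake_predictions[key] += 1
    return {key: fake_predictions[key] / totals[key] for key in totals}


# Hypothetical toy predictions, invented for illustration only.
toy_records = [
    ("human", "real", "real"),
    ("human", "real", "real"),
    ("human", "fake", "real"),
    ("human", "fake", "fake"),
    ("machine", "real", "fake"),
    ("machine", "real", "real"),
    ("machine", "fake", "fake"),
    ("machine", "fake", "fake"),
]

rates = predicted_fake_rate_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(group, rate)
```

If a detector were unbiased with respect to authorship, the rate for real news would be similar for human and machine authors. A large gap between the `("machine", "real")` and `("human", "real")` groups is the kind of pattern the summary describes.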
MBZUAI Professor Preslav Nakov is researching methods to combat fake news and online disinformation through NLP techniques. His work focuses on detecting harmful memes and identifying the stance of individuals regarding disinformation. Four of Nakov’s recent papers on these topics were presented at NAACL 2022. Why it matters: This research aims to mitigate the impact of weaponized news and online manipulation, contributing to a more trustworthy information environment in the region and globally.
MBZUAI researchers introduce M4, a multi-generator, multi-domain, and multi-lingual benchmark dataset for detecting machine-generated text. The study reveals challenges in generalizing detection across unseen domains or LLMs, with detectors often misclassifying machine-generated text as human-written. The dataset aims to foster research into more robust detection methods and is available on GitHub.
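The generalization problem described above is commonly probed by holding out one domain at a time and measuring how often machine-generated text in that domain is labeled as human-written. The sketch below illustrates that idea only; it is not the M4 evaluation code. The example layout, the domain names, the helper names, and the keyword-based `toy_predict` rule are all hypothetical, and a real experiment would train a detector on each `train` split rather than ignore it.

```python
def leave_one_domain_out(examples):
    """Yield (held_out_domain, train, test) splits, one per domain."""
    domains = sorted({ex["domain"] for ex in examples})
    for held_out in domains:
        train = [ex for ex in examples if ex["domain"] != held_out]
        test = [ex for ex in examples if ex["domain"] == held_out]
        yield held_out, train, test


def machine_missed_rate(test, predict):
    """Share of machine-generated test items that `predict` labels "human"."""
    machine = [ex for ex in test if ex["label"] == "machine"]
    if not machine:
        return None  # no machine-generated items in this split
    missed = sum(1 for ex in machine if predict(ex["text"]) == "human")
    return missed / len(machine)


def toy_predict(text):
    """Hypothetical stand-in detector: flags one telltale phrase only."""
    return "machine" if "as an ai" in text.lower() else "human"


# Hypothetical toy examples, invented for illustration only.
examples = [
    {"domain": "news", "label": "machine", "text": "As an AI model, I report the following."},
    {"domain": "news", "label": "human", "text": "Local council votes today."},
    {"domain": "reviews", "label": "machine", "text": "This product is excellent and durable."},
    {"domain": "reviews", "label": "human", "text": "Broke after a week."},
]

results = {
    held_out: machine_missed_rate(test, toy_predict)
    for held_out, _train, test in leave_one_domain_out(examples)
}
print(results)
```

A high missed rate on a held-out domain corresponds to the failure mode the summary reports, where machine-generated text is misclassified as human-written.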