GCC AI Research

Results for "AI detection"

A mystery fit for a DetectAIve: Classifying machine involvement in writing

MBZUAI ·

Researchers at MBZUAI have developed LLM-DetectAIve, a tool to classify the degree of machine involvement in text generation. The system sorts text into four categories: human-written, machine-generated, machine-written then machine-humanized, and human-written then machine-polished. A demo website lets users test the tool's ability to detect machine involvement. Why it matters: This research addresses the growing need to identify and classify AI-generated content in academic and professional settings, particularly in light of increasing LLM misuse.

FAID: Fine-Grained AI-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning

arXiv ·

MBZUAI researchers introduce FAID, a fine-grained AI-generated text detection framework capable of classifying text as human-written, LLM-generated, or collaboratively written. FAID utilizes multi-level contrastive learning and multi-task auxiliary classification to capture authorship and model-specific characteristics, and can identify the underlying LLM family. The framework outperforms existing baselines, especially in generalizing to unseen domains and new LLMs, and includes a multilingual, multi-domain dataset called FAIDSet.

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

arXiv ·

MBZUAI researchers release LLM-DetectAIve, a tool for fine-grained detection of machine-generated text across four categories: human-written, machine-generated, machine-written then humanized, and human-written then machine-polished. The tool aims to address concerns about misuse of LLMs, especially in education and academia, by identifying attempts to obfuscate or polish content. LLM-DetectAIve is publicly accessible with code and a demonstration video provided.

Can we tell when AI wrote that code? This project thinks so, even when the AI tries to hide it

MBZUAI ·

At EMNLP 2025, MBZUAI researchers introduced Droid, a resource suite and detector family designed to distinguish AI-generated code from human-written code. The project addresses the challenge of identifying AI-generated code in software development, given the prevalence of AI-suggested code and the risks of obfuscated backdoors and feedback loops. DroidCollection includes over one million code samples across seven programming languages, three coding domains, and outputs from 43 different code models, including human-AI co-authored code and adversarially humanized machine code. Why it matters: This research supports software security and integrity in the age of AI-assisted coding by providing a tool for detecting AI-generated code across diverse languages and domains.

GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human

arXiv ·

The GenAI Content Detection Task 1 is a shared task on detecting machine-generated text, featuring monolingual (English) and multilingual subtasks. The task, part of the GenAI workshop at COLING 2025, attracted 36 teams for the English subtask and 26 for the multilingual one. The organizers provide a detailed overview of the data, results, system rankings, and analysis of the submitted systems.

The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text

arXiv ·

This paper uses stylometric analysis to study Arabic text generated by LLMs such as ALLaM, Jais, Llama, and GPT-4 across academic and social media domains. The study found detectable linguistic patterns that differentiate human-written from machine-generated Arabic text. BERT-based detection models achieved up to 99.9% F1-score in formal contexts, though cross-domain generalization remains a challenge. Why it matters: The research lays groundwork for detecting AI-generated misinformation in Arabic, a crucial step for preserving information integrity in Arabic-language contexts.

Facts and fabrications: New insights to improve fake news detection

MBZUAI ·

A study by MBZUAI's Preslav Nakov and Cornell co-authors examines how to develop systems that detect fake news in a landscape where text is generated by both humans and machines. The research, presented at the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, analyzes how well fake news detectors handle human-written and machine-written content. The study highlights biases in current detectors, which tend to classify machine-written news as fake and human-written news as true. Why it matters: Addressing these biases is crucial as machine-generated content becomes more prevalent in both real and fake news, requiring more nuanced detection methods.