This paper introduces DetectLLM-LRR and DetectLLM-NPR, two novel zero-shot methods for detecting machine-generated text using log-rank information. Experiments across three datasets and seven language models demonstrate improvements of up to 3.9 AUROC points over state-of-the-art methods. The code and data for both methods are available on GitHub.
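As a rough illustration of how log-rank information can feed a zero-shot detector, the sketch below computes a log-likelihood to log-rank ratio for a single passage. It assumes LRR is defined as the absolute ratio of the summed token log-probabilities to the summed token log-ranks, which the summary above does not spell out. The function name and its inputs (per-token probabilities and 1-indexed ranks, already obtained from some scoring language model) are hypothetical and are not taken from the released code.

```python
import math


def log_likelihood_log_rank_ratio(token_probs, token_ranks):
    """Return |sum(log p_i) / sum(log r_i)| for one passage.

    token_probs: probability the scoring model assigned to each observed token.
    token_ranks: 1-indexed rank of each observed token in the model's
                 predicted distribution (1 = most likely token).
    Assumption: this is a sketch of an LRR-style score, not the authors' code.
    """
    if not token_probs or len(token_probs) != len(token_ranks):
        raise ValueError("token_probs and token_ranks must be non-empty and equal in length")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    if any(r < 1 for r in token_ranks):
        raise ValueError("ranks must be 1-indexed positive integers")

    total_log_prob = sum(math.log(p) for p in token_probs)
    total_log_rank = sum(math.log(r) for r in token_ranks)

    # Every token ranked first gives log-rank 0; treat the ratio as unbounded.
    if total_log_rank == 0.0:
        return math.inf
    return abs(total_log_prob / total_log_rank)


if __name__ == "__main__":
    # Toy values chosen by hand, not real model output.
    score = log_likelihood_log_rank_ratio([0.5, 0.25], [1, 2])
    print(round(score, 6))
```

In practice the score would be compared against a threshold tuned on held-out data. Whether a higher or lower value indicates machine-generated text should be checked against the paper rather than inferred from this sketch.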
MBZUAI researchers release LLM-DetectAIve, a tool for fine-grained detection of machine-generated text across four categories: human-written, machine-generated, machine-written then humanized, and human-written then machine-polished. The tool aims to address concerns about misuse of LLMs, especially in education and academia, by identifying attempts to obfuscate or polish content. LLM-DetectAIve is publicly accessible with code and a demonstration video provided.
Researchers at MBZUAI have developed LLM-DetectAIve, a tool to classify the degree of machine involvement in text generation. The system categorizes text into four types: human-written, machine-generated, machine-written and machine-humanized, and human-written and machine-polished. A demo website allows users to test the tool's ability to detect machine involvement. Why it matters: This research addresses the growing need to identify and classify AI-generated content in academic and professional settings, particularly in light of increasing LLM misuse.
This paper uses stylometric analysis to examine Arabic text generated by LLMs such as ALLaM, Jais, Llama, and GPT-4 across academic and social media domains. The study found detectable linguistic patterns that differentiate human-written from machine-generated Arabic text. BERT-based detection models achieved up to 99.9% F1-score in formal contexts, though cross-domain generalization remains a challenge. Why it matters: The research lays groundwork for detecting AI-generated misinformation in Arabic, a crucial step for preserving information integrity in Arabic-language contexts.
The GenAI Content Detection Task 1 is a shared task on detecting machine-generated text, featuring monolingual (English) and multilingual subtasks. The task, part of the GenAI workshop at COLING 2025, attracted 36 teams for the English subtask and 26 for the multilingual one. The organizers provide a detailed overview of the data, results, system rankings, and analysis of the submitted systems.
MBZUAI researchers introduce M4, a multi-generator, multi-domain, and multi-lingual benchmark dataset for detecting machine-generated text. The study reveals challenges in generalizing detection across unseen domains or LLMs, with detectors often misclassifying machine-generated text as human-written. The dataset aims to foster research into more robust detection methods and is available on GitHub.
MBZUAI researchers introduce FAID, a fine-grained AI-generated text detection framework capable of classifying text as human-written, LLM-generated, or collaboratively written. FAID utilizes multi-level contrastive learning and multi-task auxiliary classification to capture authorship and model-specific characteristics, and can identify the underlying LLM family. The framework outperforms existing baselines, especially in generalizing to unseen domains and new LLMs, and includes a multilingual, multi-domain dataset called FAIDSet.
This study investigates the ability of six large language models, including Jais, Mistral, and GPT-4o, to mimic human emotional expression in English and personality markers in Arabic. The researchers evaluated whether machine classifiers could distinguish between human-authored and AI-generated texts and assessed the emotional/personality traits exhibited by the LLMs. Results indicate that AI-generated texts are distinguishable from human-authored ones, with classification performance impacted by paraphrasing, and that LLMs encode affective signals differently than humans. Why it matters: The findings have implications for authorship attribution, affective computing, and the responsible deployment of AI, especially in under-resourced languages like Arabic.