Search

Results for "authorship attribution"

The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text

arXiv · May 29

This paper analyzes Arabic text generated by LLMs like ALLaM, Jais, Llama, and GPT-4 across academic and social media domains using stylometric analysis. The study found detectable linguistic patterns that differentiate human-written from machine-generated Arabic text. BERT-based detection models achieved up to 99.9% F1-score in formal contexts, though cross-domain generalization remains a challenge. Why it matters: The research lays groundwork for detecting AI-generated misinformation in Arabic, a crucial step for preserving information integrity in Arabic-language contexts.

A mystery fit for a DetectAIve: Classifying machine involvement in writing

MBZUAI · Invalid Date

Researchers at MBZUAI have developed LLM-DetectAIve, a tool to classify the degree of machine involvement in text generation. The system categorizes text into four types: human-written, machine-generated, machine-written and machine-humanized, and human-written and machine-polished. A demo website allows users to test the tool's ability to detect machine involvement. Why it matters: This research addresses the growing need to identify and classify AI-generated content in academic and professional settings, particularly in light of increasing LLM misuse.

FAID: Fine-Grained AI-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning

arXiv · May 20

MBZUAI researchers introduce FAID, a fine-grained AI-generated text detection framework capable of classifying text as human-written, LLM-generated, or collaboratively written. FAID utilizes multi-level contrastive learning and multi-task auxiliary classification to capture authorship and model-specific characteristics, and can identify the underlying LLM family. The framework outperforms existing baselines, especially in generalizing to unseen domains and new LLMs, and includes a multilingual, multi-domain dataset called FAIDSet.