The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text

arXiv · May 29, 2025 · Significant research

Summary

This paper uses stylometric analysis to examine Arabic text generated by LLMs including ALLaM, Jais, Llama, and GPT-4 across academic and social media domains. The study found detectable linguistic patterns that differentiate human-written from machine-generated Arabic text. BERT-based detection models achieved an F1-score of up to 99.9% in formal contexts, though cross-domain generalization remains a challenge. Why it matters: The research lays groundwork for detecting AI-generated misinformation in Arabic, a crucial step for preserving information integrity in Arabic-language contexts.

Keywords

Arabic NLP · LLM detection · Stylometric analysis · Misinformation · Jais

Related

Is AI Catching Up to Human Expression? Exploring Emotion, Personality, Authorship, and Linguistic Style in English and Arabic with Six Large Language Models

arXiv · Mar 24

This study investigates the ability of six large language models, including Jais, Mistral, and GPT-4o, to mimic human emotional expression in English and personality markers in Arabic. The researchers evaluated whether machine classifiers could distinguish between human-authored and AI-generated texts and assessed the emotional and personality traits exhibited by the LLMs. Results indicate that AI-generated texts are distinguishable from human-authored ones, that classification performance is affected by paraphrasing, and that LLMs encode affective signals differently from humans. Why it matters: The findings have implications for authorship attribution, affective computing, and the responsible deployment of AI, especially in under-resourced languages like Arabic.
