Researchers including Dr. Najwa Aaraj developed ML-FEED, a new exploit detection framework using pattern-based techniques. The model is 70x faster than LSTMs and 75,000x faster than Transformers in exploit detection tasks, while also being slightly more accurate. The "ML-FEED" paper won best paper at the 2022 IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications. Why it matters: This research enables more efficient real-time security applications and highlights growing AI expertise in the Arab world.
MBZUAI researchers introduced Droid, a resource suite and detector family, at EMNLP 2025 designed to distinguish between AI-generated and human-written code. The project addresses the challenge of identifying AI-generated code in software development, considering the prevalence of AI-suggested code and the risks of obfuscated backdoors and feedback loops. DroidCollection includes over one million code samples across seven programming languages, three coding domains, and outputs from 43 different code models, including human-AI co-authored code and adversarially humanized machine code. Why it matters: This research is crucial for maintaining software security and integrity in the age of AI-assisted coding, providing a robust tool for detecting AI-generated code across diverse languages and domains.
MBZUAI researchers release LLM-DetectAIve, a tool for fine-grained detection of machine-generated text across four categories: human-written, machine-generated, machine-written then humanized, and human-written then machine-polished. The tool aims to address concerns about misuse of LLMs, especially in education and academia, by identifying attempts to obfuscate or polish content. LLM-DetectAIve is publicly accessible with code and a demonstration video provided.
A study compared the vulnerability of C programs generated by nine state-of-the-art Large Language Models (LLMs) using a zero-shot prompt. The researchers introduced FormAI-v2, a dataset of 331,000 C programs generated by these LLMs, and found that at least 62.07% of the generated programs contained vulnerabilities, detected via formal verification. The research highlights the need for risk assessment and validation when deploying LLM-generated code in production environments.
This paper introduces a framework that combines machine learning for multi-class attack detection in IoT/IIoT networks with large language models (LLMs) for attack behavior analysis and mitigation suggestion. The framework uses role-play prompt engineering with RAG to guide LLMs like ChatGPT-o3 and DeepSeek-R1, and introduces new evaluation metrics for quantitative assessment. Experiments using Edge-IIoTset and CICIoT2023 datasets showed Random Forest as the best detection model and ChatGPT-o3 outperforming DeepSeek-R1 in attack analysis and mitigation.
A new methodology emulating fact-checker criteria assesses news outlet factuality and bias using LLMs. The approach uses prompts based on fact-checking criteria to elicit and aggregate LLM responses for predictions. Experiments demonstrate improvements over baselines, with error analysis on media popularity and region, and a released dataset/code at https://github.com/mbzuai-nlp/llm-media-profiling.