MBZUAI researchers Nils Lukas and Toluwani Samuel Aremu will present a paper at ICML 2025 demonstrating the vulnerability of current LLM watermarking techniques. Their research shows that adaptive paraphrasers can evade watermark detection with negligible impact on text quality, at a cost of less than $10 in GPU compute. The attack fine-tunes a small open-weight model to rewrite sentences until surrogate keys no longer trigger detection. Why it matters: This work highlights critical weaknesses in current AI provenance methods, suggesting the need for more robust watermarking techniques to maintain trust in the authenticity of AI-generated content.
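A minimal sketch of the attack loop described above, assuming hypothetical `paraphrase` and `surrogate_score` callables that stand in for the fine-tuned rewriter and a detector built from surrogate keys (illustration only, not the authors' implementation):

```python
def evade_watermark(text, paraphrase, surrogate_score,
                    threshold=0.5, max_rounds=10):
    """Rewrite `text` until a surrogate watermark detector stops flagging it."""
    candidate = text
    for _ in range(max_rounds):
        if surrogate_score(candidate) < threshold:
            return candidate  # surrogate keys no longer trigger detection
        candidate = paraphrase(candidate)  # sentence-level rewrite
    return candidate  # best effort after max_rounds

# Toy usage with stand-in callables:
result = evade_watermark(
    "Watermarked sample text.",
    paraphrase=lambda s: s + " (rewritten)",                       # placeholder rewriter
    surrogate_score=lambda s: 0.1 if "(rewritten)" in s else 0.9,  # placeholder detector
)
print(result)
```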
A PhD candidate from the University of Waterloo gave a talk at MBZUAI on threats posed by large machine learning systems. The talk covered data privacy during inference and the misuse of ML systems to generate deepfakes, and analyzed differential privacy and watermarking as potential defenses. Why it matters: Understanding and mitigating the risks of large ML systems is crucial for responsible AI development and deployment in the region.
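For context on one of the defenses discussed, here is a minimal sketch of the standard Laplace mechanism for differential privacy (a textbook construction, not the speaker's specific proposal):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release a numeric query answer with epsilon-differential privacy
    by adding Laplace noise with scale sensitivity / epsilon."""
    noise = np.random.default_rng().laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_value + noise

# Example: privately release a count query (counts have sensitivity 1).
print(laplace_mechanism(true_value=42.0, sensitivity=1.0, epsilon=0.5))
```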
Xiuying Chen from KAUST presented her work on improving the trustworthiness of AI-generated text, focusing on accuracy and robustness. Her research analyzes causes of hallucination in language models related to semantic understanding and neglect of input knowledge, and proposes solutions. She also demonstrated the vulnerability of language models to noisy inputs and showed how augmentation techniques can enhance robustness. Why it matters: Improving the reliability of AI-generated text is crucial for its deployment in sensitive domains like healthcare and scientific discovery, where accuracy is paramount.
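As a hedged illustration of this line of work, below is a simple character-level noise augmentation of the kind used to harden models against noisy inputs (the specific augmentations in Chen's research may differ):

```python
import random

def augment_with_noise(text, swap_prob=0.05, drop_prob=0.05, seed=None):
    """Create a noisy training variant of `text` by randomly dropping
    characters and swapping adjacent ones, simulating typos."""
    rng = random.Random(seed)
    chars, out, i = list(text), [], 0
    while i < len(chars):
        if rng.random() < drop_prob:
            i += 1  # drop this character
        elif i + 1 < len(chars) and rng.random() < swap_prob:
            out.extend([chars[i + 1], chars[i]])  # swap adjacent characters
            i += 2
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)

# Training on noisy variants alongside clean text encourages robustness.
print(augment_with_noise("The patient shows no signs of infection.", seed=0))
```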
Researchers at MBZUAI have developed LLM-DetectAIve, a tool to classify the degree of machine involvement in text generation. The system categorizes text into four types: human-written, machine-generated, machine-written then machine-humanized, and human-written then machine-polished. A demo website allows users to test the tool's ability to detect machine involvement. Why it matters: This research addresses the growing need to identify and classify AI-generated content in academic and professional settings, particularly in light of increasing LLM misuse.
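A minimal sketch of a four-way classifier of this kind, assuming a RoBERTa base model and this label order (both are assumptions; LLM-DetectAIve's actual backbone and training data are not reproduced here):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["human-written", "machine-generated",
          "machine-written, machine-humanized",
          "human-written, machine-polished"]

# Assumed base model; a real system would fine-tune this head on labeled data.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS))

def classify(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(classify("This essay was drafted by a model and lightly edited."))
```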
Laurent Najman presented the Power Watershed (PW) optimization framework for image and data processing. The PW framework improves graph-based data-processing algorithms such as the random walker and ratio-cut clustering, yielding faster solutions. It can also be adapted to other graph-based cost-minimization methods and integrated with deep learning networks. Why it matters: This framework could enable more efficient and scalable image and data processing algorithms relevant to computer vision and related fields in the Middle East.
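For a concrete sense of the class of algorithms PW addresses, here is the classical random walker segmentation via scikit-image's implementation (the standard algorithm, not Najman's PW code):

```python
import numpy as np
from skimage.segmentation import random_walker

# Synthetic noisy image with one bright square region.
rng = np.random.default_rng(0)
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0
image += 0.35 * rng.standard_normal(image.shape)

# Seed labels: 0 = unlabeled, 1 = background, 2 = object.
labels = np.zeros(image.shape, dtype=int)
labels[4, 4] = 1
labels[32, 32] = 2

# Diffuse labels through the image graph; beta controls edge weighting.
segmentation = random_walker(image, labels, beta=130)
print(segmentation.shape, np.unique(segmentation))
```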
Thamar Solorio from the University of Houston presented preliminary work at MBZUAI on multimodal representation learning for detecting objectionable content in videos. The research investigates two multimodal pretraining mechanisms, finding contrastive learning more effective than unimodal representation prediction. The study also assesses the value of common multimodal corpora for this task. Why it matters: This research contributes to the development of AI techniques for content moderation, an important issue for online platforms in the Middle East and globally.
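A minimal sketch of the symmetric InfoNCE objective commonly used for multimodal contrastive pretraining, e.g. aligning video and text embeddings (a standard formulation; the exact objective in this work may differ):

```python
import torch
import torch.nn.functional as F

def info_nce(video_emb, text_emb, temperature=0.07):
    """Pull matched (video, text) pairs together and push apart
    mismatched pairs within the batch."""
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = v @ t.T / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(len(v))       # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.T, targets)) / 2

# Toy usage with random 256-dim embeddings for a batch of 8 pairs.
loss = info_nce(torch.randn(8, 256), torch.randn(8, 256))
print(loss.item())
```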
Abhinav Dhall from Flinders University will present at MBZUAI on multimodal approaches to deepfake detection inspired by user behavior, drawing on user studies of multicultural deepfakes and the ACM Multimedia 2024 benchmark, along with future directions in deepfake analysis. The research leverages insights into how different audiences perceive manipulated media. Why it matters: Addressing deepfakes is crucial for maintaining trust in digital content, especially with the increasing sophistication and accessibility of AI-driven manipulation tools.