GCC AI Research

Mind meld: agentic communication through thoughts instead of words

MBZUAI · Significant research

Summary

Researchers from MBZUAI, Carnegie Mellon University, and Meta AI presented ThoughtComm at NeurIPS 2025, a new approach in which AI agents communicate through internal, latent representations instead of natural language. The framework extracts latent "thoughts" from agents' internal states, representing the underlying structure of their reasoning, and selectively shares them between agents. In experiments, agents using this method coordinated more effectively, reached consensus faster, and solved problems more accurately. Why it matters: Bypassing the limitations of natural language in AI communication could lead to more efficient and accurate multi-agent systems, with implications for robotics, collaborative AI, and distributed problem-solving.
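The summary does not spell out ThoughtComm's actual architecture, but the general idea, exchanging selected components of an internal state vector rather than text, can be illustrated with a toy sketch. Everything below (the top-k selection rule, the blending update, all names) is a hypothetical stand-in, not the paper's method:

```python
# Toy illustration of latent "thought" sharing between agents.
# NOT the ThoughtComm implementation: the extraction and blending
# rules are hypothetical stand-ins for the general idea of exchanging
# internal-state vectors instead of natural-language messages.

def extract_thought(hidden_state, keep=2):
    """Selectively share only the strongest components of an agent's
    internal state, zeroing out the rest (hypothetical selection rule)."""
    ranked = sorted(range(len(hidden_state)),
                    key=lambda i: abs(hidden_state[i]), reverse=True)
    thought = [0.0] * len(hidden_state)
    for i in ranked[:keep]:
        thought[i] = hidden_state[i]
    return thought

def integrate(hidden_state, thought, alpha=0.5):
    """Receiver blends the incoming latent message directly into its
    own state instead of parsing a textual message."""
    return [(1 - alpha) * h + alpha * t
            for h, t in zip(hidden_state, thought)]

# One round of latent communication between two agents:
sender = [0.9, -0.1, 0.5, 0.05]   # sender's internal state
receiver = [0.2, 0.4, 0.0, 0.8]   # receiver's internal state

message = extract_thought(sender)  # keeps the two strongest components
updated = integrate(receiver, message)
print(updated)
```

The receiver's state moves toward the sender's strongest activations without any text being generated or parsed; repeated over many rounds, updates of this kind drive the agents' states together, which is the consensus behavior the summary describes.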


Related

The rise of agentic AI: homegrown Lawa.AI gains momentum

MBZUAI ·

MBZUAI Provost Timothy Baldwin predicts that 2025 will be a breakout year for agentic AI, and that 33% of enterprise software applications will include agentic AI capabilities by 2028. MBZUAI doctoral students Wafa Alghallabi and Omkar Thawaker have launched Lawa.AI, an AI agent being tested on the university's website to provide faster answers and deeper understanding. Lawa.AI evolved from a research project on multimodal efficiency and LLMs and aims to bridge the gap between people and information in higher education and government. Why it matters: This highlights the UAE's focus on translating AI research into practical applications and the growing importance of agentic AI across sectors.

From Individual to Society: Social Simulation Driven by LLM-based Agent

MBZUAI ·

Fudan University's Zhongyu Wei presented research on social simulation driven by LLMs, covering individual and large-scale social movement simulation. Wei directs the Data Intelligence and Social Computing Lab (Fudan DISC) and has published extensively on multimodal large models and social computing. His work includes the Volcano multimodal model, DISC-MedLLM, and ElectionSim. Why it matters: Using LLMs for social simulation could provide new tools for understanding and potentially predicting social dynamics in the Arab world.

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv ·

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length with multimodal richness, using human-validated samples. An agentic system, LongShOTAgent, is also presented for analyzing long videos; results on both the benchmark and the agent highlight the difficulties that state-of-the-art MLLMs still face on long-video reasoning.