GCC AI Research

How MBZUAI built PAN, an interactive, general world model capable of long-horizon simulation

MBZUAI · Significant research

Summary

MBZUAI's Institute of Foundation Models (IFM) has developed PAN, a novel interactive world model capable of long-horizon simulation. PAN uses a Generative Latent Prediction (GLP) architecture, coupling internal latent reasoning with generative supervision in the visual domain. The model evolves an internal latent state conditioned on history and natural language actions, then decodes that state into a video segment using a Causal Swin-DPM mechanism for smooth transitions. Why it matters: PAN represents a significant advancement in AI's ability to simulate and predict evolving environments, enabling more steerable and coherent long-term video generation and opening new possibilities for interactive AI systems.
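The predict-then-decode loop described above can be illustrated with a toy rollout. This is a minimal sketch, not PAN's implementation: every name in it (`embed_action`, `predict_next_latent`, `decode_segment`, `ToyWorldModel`) is hypothetical, the latent state is a short list of floats, and the decoder is a trivial stand-in for the Causal Swin-DPM video decoder, whose internals the summary does not describe.

```python
from dataclasses import dataclass, field
from typing import List

Latent = List[float]


def embed_action(action: str, dim: int) -> Latent:
    """Hypothetical text encoder: fold character codes into a fixed-size vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(action):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def predict_next_latent(history: List[Latent], action_vec: Latent) -> Latent:
    """Toy latent dynamics: mean of all past latents plus the action embedding."""
    dim = len(action_vec)
    mean = [sum(h[d] for h in history) / len(history) for d in range(dim)]
    return [m + a for m, a in zip(mean, action_vec)]


def decode_segment(latent: Latent, frames: int) -> List[Latent]:
    """Toy decoder: ramps toward the latent, so the last 'frame' equals it.

    Stand-in only; a real decoder would generate video frames.
    """
    return [[x * ((t + 1) / frames) for x in latent] for t in range(frames)]


@dataclass
class ToyWorldModel:
    dim: int = 4
    frames_per_segment: int = 3
    history: List[Latent] = field(default_factory=list)

    def reset(self, initial_latent: Latent) -> None:
        """Start a new simulation from an initial latent state."""
        self.history = [list(initial_latent)]

    def step(self, action: str) -> List[Latent]:
        """Evolve the latent state given history and a text action, then decode."""
        action_vec = embed_action(action, self.dim)
        next_latent = predict_next_latent(self.history, action_vec)
        self.history.append(next_latent)
        return decode_segment(next_latent, self.frames_per_segment)


def rollout(model: ToyWorldModel, initial_latent: Latent,
            actions: List[str]) -> List[List[Latent]]:
    """Long-horizon simulation: one decoded segment per natural-language action."""
    model.reset(initial_latent)
    return [model.step(action) for action in actions]


if __name__ == "__main__":
    model = ToyWorldModel()
    segments = rollout(model, [0.0] * model.dim, ["turn left", "drive forward"])
    print(len(segments), "segments of", len(segments[0]), "frames each")
```

The point of the sketch is the control flow: prediction happens in latent space, each step sees the full history and the current action, and decoding to frames is a separate stage applied to the predicted state.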


Related

Inside PAN, MBZUAI’s groundbreaking world model

MBZUAI

MBZUAI is previewing PAN, a next-generation world model designed to simulate diverse realities and advance machine reasoning. PAN allows researchers to test AI agents in simulated environments before real-world deployment, enabling them to learn from mistakes without real-world consequences. It facilitates complex reasoning about actions, outcomes, and interactions, crucial for reliable AI performance in dynamic environments. Why it matters: PAN represents a significant advancement in AI by enabling comprehensive simulation and testing of AI agents, which can revolutionize fields like disaster management and healthcare where real-world experimentation is risky.

MBZUAI Launches Institute of Foundation Models and Establishes Silicon Valley AI Lab

MBZUAI

MBZUAI has launched the Institute of Foundation Models (IFM) with a new Silicon Valley Lab in Sunnyvale, CA, joining existing facilities in Paris and Abu Dhabi. The launch event showcased PAN, a world model for simulating diverse realities with multimodal inputs. The IFM lab is also advancing K2-65B and JAIS AI systems. Why it matters: This expansion enhances MBZUAI's global presence and connects it with a critical AI ecosystem, supporting the UAE's economic diversification through advanced AI technologies.

LLM-BABYBENCH: Understanding and Evaluating Grounded Planning and Reasoning in LLMs

arXiv

MBZUAI researchers introduce LLM-BabyBench, a benchmark suite for evaluating grounded planning and reasoning in LLMs. The suite, built on a textual adaptation of the BabyAI grid world, assesses LLMs on predicting action consequences, generating action sequences, and decomposing instructions. Datasets, evaluation harness, and metrics are publicly available to facilitate reproducible assessment.

Video-CoM: Interactive Video Reasoning via Chain of Manipulations

arXiv

Researchers at MBZUAI introduce "Interactive Video Reasoning," a new paradigm that lets models actively "think with videos" by performing iterative visual actions to gather and refine evidence. They developed Video-CoM, which reasons through a Chain of Manipulations (CoM), and constructed Video-CoM-Instruct, an 18K instruction-tuning dataset for multi-step manipulation reasoning. The model is further optimized via reinforcement learning with reasoning-aware Group Relative Policy Optimization (GRPO), achieving strong results across nine video reasoning benchmarks.