GCC AI Research


Results for "spatial AI"

Spatial AI to help humans and enable robots

MBZUAI

Marc Pollefeys of ETH Zurich and the Microsoft Spatial AI Lab will discuss building 3D representations of environments to assist both humans and robots. The talk covers visual 3D mapping, localization, spatial data access, and navigation using both geometric and learning-based methods. It also explores rich 3D semantic representations that support scene interaction through open-vocabulary queries built on foundation models. Why it matters: Advances in spatial AI and 3D scene understanding are critical to enabling more capable robots and AI assistants across applications in the region.

Why 3D spatial reasoning still trips up today’s AI systems

MBZUAI

MBZUAI researchers have introduced SURPRISE3D, a benchmark for evaluating 3D spatial reasoning in AI systems, along with a 3D Spatial Reasoning Segmentation (3D-SRS) task. The benchmark includes over 900 indoor scenes and 200,000 language queries paired with 3D masks, emphasizing spatial relationships over object naming. A companion paper, MLLM-For3D, explores adapting 2D multimodal LLMs for 3D reasoning. Why it matters: This work addresses a key limitation in current AI, pushing towards embodied AI that can understand and act in 3D environments based on human-like spatial reasoning.

Modeling Complex Object Changes in Satellite Image Time-Series: Approach based on CSP and Spatiotemporal Graph

arXiv

This paper introduces a novel approach for monitoring and analyzing the evolution of complex geographic objects in satellite image time-series. The method uses a spatiotemporal graph and constraint satisfaction problems (CSP) to model and analyze object changes. Experiments on real-world satellite images from Saudi Arabian cities demonstrate the effectiveness of the proposed approach.
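At a high level, the approach pairs a spatiotemporal graph (observations of an object linked across acquisition dates) with CSP-style constraint checking over those links. The following is a minimal sketch of that idea; the node structure, the area attribute, and the growth constraint are illustrative assumptions, not the paper's actual formulation.

```python
# Hedged sketch: represent an object's evolution across a satellite image
# time-series as a spatiotemporal graph, then flag temporal edges that
# violate a simple CSP-style binary constraint. All names and thresholds
# here are hypothetical.

from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectState:
    object_id: str   # geographic object being tracked
    t: int           # acquisition index in the time-series
    area: float      # observed extent, e.g. in pixels


def build_edges(states):
    """Link consecutive observations of the same object (temporal edges)."""
    by_obj = {}
    for s in sorted(states, key=lambda s: s.t):
        by_obj.setdefault(s.object_id, []).append(s)
    return [(a, b) for seq in by_obj.values() for a, b in zip(seq, seq[1:])]


def consistent(edge, max_growth=2.0):
    """Binary constraint: area may at most double between acquisitions."""
    a, b = edge
    return b.area <= a.area * max_growth


def flag_changes(states):
    """Return edges violating the constraint: candidate complex changes."""
    return [e for e in build_edges(states) if not consistent(e)]
```

For example, an object whose area jumps from 100 to 350 pixels between two acquisitions violates the doubling constraint and would be flagged, while a modest change would not. A real system would use many such constraints over richer attributes (shape, position, spectral signature).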

Safran and the Technology Innovation Institute intend to lead the next evolution in geospatial intelligence

TII

Safran.AI and the Technology Innovation Institute (TII) intend to form a strategic alliance to develop a next-generation Agentic AI geospatial intelligence (GEOINT) platform. The platform will combine Safran.AI’s GEOINT expertise with TII’s expertise in Agentic AI and orchestration platforms, enabling autonomous reasoning and transforming spaceborne imagery into decision-grade intelligence. The collaboration will focus on three major technological streams. Why it matters: This partnership signifies a major advancement in sovereign geospatial intelligence capabilities within the UAE, moving from traditional analysis to autonomous understanding for enhanced national security and decision-making.

Visual SLAM in the era of Deep Learning

MBZUAI

Ian Reid, a Professor of Computer Science at the University of Adelaide, gave a talk at MBZUAI on leveraging deep learning to go beyond purely geometric SLAM. The talk covered using prior domain knowledge to improve map and shape estimation and enabling navigation in previously unvisited environments. The research aims to turn cameras into flexible, large-scale situational-awareness devices, in effect "Spatial AI" sensors. Why it matters: Integrating deep learning with SLAM could significantly advance robotic navigation and spatial understanding, with applications for autonomous systems across industries.

Foundations of Multisensory Artificial Intelligence

MBZUAI

Paul Liang from CMU presented on machine learning foundations for multisensory AI, discussing a theoretical framework for modality interactions. The talk covered cross-modal attention and multimodal transformer architectures, and applications in mental health, pathology, and robotics. Liang's research aims to enable AI systems to integrate and learn from diverse real-world sensory modalities. Why it matters: This highlights the growing importance of multimodal AI research and its potential for advancements across various sectors in the region, including healthcare and robotics.

Structured World Models for Robots

MBZUAI

Krishna Murthy, a postdoc at MIT, researches computational world models to enable robots to understand and operate effectively in the physical world. His work focuses on differentiable computing approaches for spatial perception and interfaces large image, language, and audio models with 3D scenes. Murthy envisions structured world models working with scaling-based approaches to create versatile robot perception and planning algorithms. Why it matters: This research could significantly advance robotics by enabling more sophisticated perception, reasoning, and action capabilities in embodied agents.