MBZUAI researchers have introduced SURPRISE3D, a benchmark for evaluating 3D spatial reasoning in AI systems, along with a 3D Spatial Reasoning Segmentation (3D-SRS) task. The benchmark includes over 900 indoor scenes and 200,000 language queries paired with 3D masks, emphasizing spatial relationships over object naming. A companion paper, MLLM-For3D, explores adapting 2D multimodal LLMs for 3D reasoning. Why it matters: This work addresses a key limitation in current AI, pushing towards embodied AI that can understand and act in 3D environments based on human-like spatial reasoning.
KAUST's Peter Wonka discusses the challenges and advancements in creating data-rich, three-dimensional maps for various applications. His team is working with Boeing on 3D modeling tools for aerospace design. KAUST-funded FalconViz uses drones to create 3D maps of disaster areas for first responders. Why it matters: This highlights KAUST's contribution to cutting-edge 3D modeling and its practical applications in industries like aerospace and disaster response in the region.
This article discusses the evolution of mobile extended reality (MEX) and its potential to revolutionize urban interaction. It highlights the convergence of augmented and virtual reality technologies for mobile usage. A novel approach to 3D models, characterized as urban situated models or “3D-plus-time” (4D.City), is introduced. Why it matters: The development of MEX and 4D.City could significantly enhance user experience and analog-digital convergence in urban environments, offering new possibilities for human-computer interaction.
This paper introduces a self-supervised learning method for point cloud analysis using an upsampling autoencoder (UAE). The model uses subsampling and an encoder-decoder architecture to reconstruct the original point cloud, learning both semantic and geometric information. Experiments show the UAE outperforms existing methods in shape classification, part segmentation, and point cloud upsampling tasks.
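The subsample-then-reconstruct idea can be illustrated with a minimal sketch. This is not the paper's implementation: the summary does not say how points are subsampled or which reconstruction loss is used, so random subsampling and a symmetric Chamfer distance are assumptions here, and the helper names `random_subsample` and `chamfer_distance` are hypothetical. The encoder-decoder itself is left out; in training, its output would take the place of `sparse` in the loss.

```python
import random


def random_subsample(points, ratio, rng):
    """Keep a random fraction of the points (assumed subsampling strategy)."""
    n_keep = max(1, int(len(points) * ratio))
    return rng.sample(points, n_keep)


def squared_distance(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def chamfer_distance(a, b):
    """Symmetric Chamfer distance (assumed reconstruction loss).

    Averages, in both directions, the squared distance from each point
    to its nearest neighbour in the other set.
    """
    a_to_b = sum(min(squared_distance(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(squared_distance(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


rng = random.Random(0)
# Synthetic "original" point cloud: 200 random 3D points.
dense = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
# Input the encoder would see: a sparse subset of the original.
sparse = random_subsample(dense, ratio=0.25, rng=rng)

# A perfect reconstruction scores zero; the untouched sparse input does not,
# which is the gap an upsampling decoder would be trained to close.
loss_perfect = chamfer_distance(dense, dense)
loss_sparse = chamfer_distance(sparse, dense)
```

Because the target is the original dense cloud, no labels are needed: the supervision signal comes entirely from the points that were dropped.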
KAUST researchers have developed a detailed 3D dynamic model using data from the February 2023 Turkiye earthquake to improve earthquake simulations. The model incorporates 3D fault geometry and Earth structure for realistic simulations of ground shaking. It explains complex ground shaking patterns and the impact of supershear ruptures, which can amplify damage far from the epicenter. Why it matters: This research provides a more accurate understanding of earthquake rupture processes, crucial for seismic hazard assessment and infrastructure planning in seismically active regions like the Middle East.
KAUST researchers used 3D mapping technology mounted on a remote-controlled helicopter to survey and create detailed renderings of Jeddah's Al Balad, a UNESCO World Heritage Site. The team, from KAUST's Visual Computing Center and FalconViz, captured high-definition images from about 50 meters above street level. This enabled the creation of accurate 3D models that show building shifts and other potential problems for urban planners. Why it matters: This method provides a rapid and accurate way to document and preserve historical landmarks, especially in areas where traditional surveying is difficult or infeasible, aiding in cultural heritage preservation efforts.
Pascal Fua from EPFL presented an approach to implementing convolutional neural networks that output complex 3D surface meshes. The method overcomes limitations in converting implicit representations to explicit surface representations. Applications include single-view reconstruction, physically driven shape optimization, and biomedical image segmentation. Why it matters: This research advances geometric deep learning by enabling end-to-end trainable models for 3D surface mesh generation, with potential impact on various applications in computer vision and biomedical imaging in the region.
The paper introduces UAE-3D, a multi-modal VAE for 3D molecule generation that compresses molecules into a unified latent space, maintaining near-zero reconstruction error. This approach simplifies latent diffusion modeling by eliminating the need to handle multi-modality and equivariance separately. Experiments on GEOM-Drugs and QM9 datasets show UAE-3D establishes new benchmarks in de novo and conditional 3D molecule generation, with significant improvements in efficiency and quality.