A researcher at the University of Oxford presented new findings on 3D neural reconstruction. The talk introduced a dataset of real-world video captures paired with precise ground-truth 3D models, along with a novel joint optimization method that refines camera poses during the reconstruction process. Why it matters: High-quality 3D reconstruction has broad applicability to robotics and computer vision applications in the region.
This paper introduces ADR-VINS, a monocular visual-inertial state estimation framework for autonomous drone racing, built on an Error-State Kalman Filter (ESKF) that integrates direct pixel reprojection errors from gate corners as innovation terms. It also introduces ADR-FGO, an offline factor-graph optimization framework that generates high-fidelity reference trajectories for post-flight evaluation in GNSS-denied environments. Validated on the TII-RATM dataset, ADR-VINS achieved an average RMS translation error of 0.134 m and was successfully deployed in the A2RL Drone Championship Season 2. Why it matters: The framework provides a robust and efficient solution for drone state estimation in challenging racing environments, and enables performance evaluation without relying on external localization systems.
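The core idea of using a gate-corner reprojection error as the filter's innovation can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a simple pinhole camera with identity rotation and a position-only error state, and all names (`fx`, `gate_corner`, `eskf_update`, etc.) are illustrative.

```python
import numpy as np

# Illustrative pinhole intrinsics (not from the paper)
fx = fy = 400.0          # focal lengths in pixels
cx, cy = 320.0, 240.0    # principal point

def project(p_cam):
    """Pinhole projection of a 3D point in the camera frame to pixel coords."""
    x, y, z = p_cam
    return np.array([fx * x / z + cx, fy * y / z + cy])

def proj_jacobian(p_cam):
    """Jacobian (2x3) of the projection w.r.t. the 3D point."""
    x, y, z = p_cam
    return np.array([
        [fx / z, 0.0, -fx * x / z**2],
        [0.0, fy / z, -fy * y / z**2],
    ])

def eskf_update(p_nom, P, gate_corner, z_px, R_px):
    """One ESKF measurement update with a pixel reprojection innovation.

    p_nom: nominal position estimate (3,); P: error-state covariance (3x3);
    gate_corner: known 3D corner in the world frame; z_px: observed pixel;
    R_px: pixel measurement noise covariance (2x2).
    """
    p_cam = gate_corner - p_nom            # corner in the (identity-rotation) camera frame
    innovation = z_px - project(p_cam)     # pixel reprojection error
    H = -proj_jacobian(p_cam)              # d(predicted pixel)/d(position), chain rule sign
    S = H @ P @ H.T + R_px                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    delta = K @ innovation                 # error-state correction
    p_nom = p_nom + delta                  # inject correction into the nominal state
    P = (np.eye(3) - K @ H) @ P            # covariance update
    return p_nom, P
```

A single corner gives only two constraints, so one update cannot fully resolve the 3D position (depth stays weakly observable); in practice the filter fuses several corners per gate with IMU propagation between updates.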
Ian Reid, a Professor of Computer Science at the University of Adelaide, gave a talk at MBZUAI on leveraging deep learning to go beyond geometric SLAM. The talk covered using prior domain knowledge to improve map and shape estimation, and enabling navigation in previously unvisited environments. The research aims to turn cameras into "Spatial AI" sensors: devices for flexible, large-scale situational awareness. Why it matters: Integrating deep learning with SLAM could significantly advance robotic navigation and spatial understanding, with applications for autonomous systems in various industries.
Marc Pollefeys from ETH Zurich and Microsoft Spatial AI Lab will discuss building 3D environment representations for assisting humans and robots. The talk covers visual 3D mapping, localization, spatial data access, and navigation using geometry and learning-based methods. It also explores building rich 3D semantic representations for scene interaction via open-vocabulary queries that leverage foundation models. Why it matters: Advancements in spatial AI and 3D scene understanding are critical for enabling more capable robots and AI assistants in various applications within the region.
Dr. Xiaoming Liu from Michigan State University discussed computer vision techniques for 3D world understanding at a talk hosted by MBZUAI. The talk covered 3D reconstruction, detection, depth estimation, and velocity estimation, with applications in biometrics and autonomous driving. Dr. Liu also touched on anti-spoofing and fair face recognition research at MSU's Computer Vision Lab. Why it matters: Showcasing international experts and research directions helps to catalyze computer vision and 3D understanding research efforts within the UAE's AI ecosystem.
Gregory Chirikjian presented an overview of research on robot navigation in unstructured environments, spanning computer vision, sensor technology, machine learning, and motion planning. The methods use multi-modal observations from RGB cameras, 3D LiDAR, and robot odometry for scene perception, along with deep reinforcement learning for planning. These methods have been integrated with wheeled, home, and legged robots and tested in crowded indoor scenes, home environments, and dense outdoor terrains. Why it matters: This research pushes the boundaries of robotics in complex environments, paving the way for more versatile and autonomous robots in the Middle East.