Gregory Chirikjian presented an overview of research on robot navigation in unstructured environments, drawing on computer vision, sensing, machine learning, and motion planning. The methods use multi-modal observations from RGB cameras, 3D LiDAR, and robot odometry for scene perception, along with deep reinforcement learning for planning. These methods have been integrated with wheeled, home, and legged robots and tested in crowded indoor scenes, home environments, and dense outdoor terrains. Why it matters: This research pushes the boundaries of robotics in complex environments, paving the way for more versatile and autonomous robots in the Middle East.
MBZUAI researchers led by Dr. Mohammad Yaqub are developing AI algorithms for real-time medical diagnoses, including tools for multiple sclerosis and congenital heart disease. The team developed ScanNav, an AI fetal anomaly assessment system licensed by GE Healthcare for Voluson SWIFT ultrasound machines. ScanNav assists doctors during anomaly scans after 20 weeks of gestation to check for conditions like heart issues and spina bifida. Why it matters: This research has the potential to significantly improve the speed and accuracy of medical diagnoses in the UAE and beyond, addressing critical gaps in healthcare.
This paper introduces Arabic language integration into Vision-and-Language Navigation (VLN) in robotics, evaluating multilingual small language models (SLMs) such as GPT-4o mini, Llama 3 8B, Phi-3 14B, and Jais using the NavGPT framework. The study uses the R2R dataset to assess the impact of language on navigation reasoning through zero-shot sequential action prediction. Results show the framework enables high-level planning in both English and Arabic, though some models struggle with Arabic because of reasoning limitations and output-parsing issues. Why it matters: This work highlights the need to improve language model planning and reasoning for effective navigation, especially to unlock the potential of Arabic-language models in real-world applications.
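The zero-shot loop described above can be sketched in a few lines: at each step the agent formats the instruction, its action history, and the candidate next viewpoints into a prompt, queries a language model, and parses the chosen action. All function names here are illustrative, a minimal sketch rather than the NavGPT implementation; `query_model` stands in for any chat-completion backend (GPT-4o mini, Llama 3 8B, Jais, ...).

```python
# Hypothetical sketch of zero-shot sequential action prediction in a
# NavGPT-style loop. Names are illustrative, not the paper's code.

def build_prompt(instruction, history, candidates):
    """Format the navigation state as a text prompt for the language model."""
    lines = [f"Instruction: {instruction}",
             f"Actions so far: {' -> '.join(history) or 'none'}",
             "Candidate next viewpoints:"]
    lines += [f"  ({i}) {desc}" for i, desc in enumerate(candidates)]
    lines.append("Reply with the index of the best candidate, e.g. 'Answer: 2'.")
    return "\n".join(lines)

def parse_action(reply, n_candidates):
    """Extract the chosen index from free-form model output. Parsing
    failures like this are one reported weakness on Arabic prompts."""
    for token in reply.replace(":", " ").split():
        if token.isdigit() and int(token) < n_candidates:
            return int(token)
    return None  # parsing failure -> caller can fall back or stop

def navigate(instruction, query_model, get_candidates, max_steps=10):
    """Sequentially predict actions until no candidates remain or parsing fails."""
    history = []
    for _ in range(max_steps):
        candidates = get_candidates(history)
        if not candidates:
            break
        reply = query_model(build_prompt(instruction, history, candidates))
        choice = parse_action(reply, len(candidates))
        if choice is None:
            break
        history.append(candidates[choice])
    return history
```

The same loop works for English or Arabic prompts; only the instruction text and candidate descriptions change, which is what lets the study isolate the effect of language on reasoning.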
The paper introduces MedNNS, a neural network search framework designed for medical imaging, addressing challenges in architecture selection and weight initialization. MedNNS uses a Supernetwork-based approach to construct a meta-space that encodes datasets and models based on their performance, expanding the model zoo size by 51x. The framework incorporates a rank loss and a Fréchet Inception Distance (FID) loss to capture inter-model and inter-dataset relationships, improving alignment in the meta-space. In evaluations it outperforms ImageNet-pretrained deep learning models and state-of-the-art neural architecture search (NAS) methods.
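To make the rank-loss idea concrete, here is a minimal sketch (not the authors' code) of a pairwise ranking term of the kind described: if model A outperforms model B on a dataset, A's embedding should sit closer to that dataset's embedding than B's, by some margin. The function names, the Euclidean distance, and the hinge formulation are illustrative assumptions.

```python
# Illustrative pairwise rank loss for a performance-aligned meta-space.
# All names and the hinge/margin formulation are assumptions for exposition.

import math

def euclidean(u, v):
    """Plain Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pairwise_rank_loss(dataset_emb, model_embs, scores, margin=0.1):
    """Hinge-style loss over all ordered model pairs.

    dataset_emb : embedding of the dataset in the meta-space
    model_embs  : list of model embeddings
    scores      : performance of each model on this dataset (higher is better)
    """
    loss, pairs = 0.0, 0
    for i in range(len(model_embs)):
        for j in range(len(model_embs)):
            if scores[i] > scores[j]:  # model i should embed closer than model j
                d_i = euclidean(dataset_emb, model_embs[i])
                d_j = euclidean(dataset_emb, model_embs[j])
                loss += max(0.0, margin + d_i - d_j)
                pairs += 1
    return loss / max(pairs, 1)
```

In the full framework this ranking term would be combined with the FID-based term, which pulls embeddings of statistically similar datasets together so that performance transfers across them.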
Dr. Jeffrey Schnapp from Harvard University discussed the shift from mobility to movability and human-centric autonomy in robotics at KAUST's 2018 Winter Enrichment Program. He presented Gita, a cargo robot designed to move like humans and support pedestrian lifestyles. Piaggio Fast Forward, Schnapp's company, aims to create robots that coexist with humans and enhance the quality of life in pedestrian-friendly environments. Why it matters: This highlights KAUST's engagement with innovative robotics research and its focus on exploring human-robot interaction for future urban development in Saudi Arabia.
Marc Pollefeys from ETH Zurich and Microsoft Spatial AI Lab will discuss building 3D environment representations for assisting humans and robots. The talk covers visual 3D mapping, localization, spatial data access, and navigation using geometry and learning-based methods. It also explores building rich 3D semantic representations for scene interaction via open vocabulary queries leveraging foundation models. Why it matters: Advancements in spatial AI and 3D scene understanding are critical for enabling more capable robots and AI assistants in various applications within the region.
A researcher at the University of Oxford presented new findings on 3D neural reconstruction. The talk introduced a dataset of real-world video captures paired with precise ground-truth 3D models, along with a novel joint optimization method that refines camera poses during the reconstruction process. Why it matters: High-quality 3D reconstruction has broad applicability to robotics and computer vision applications in the region.
Qingbiao Li from the Oxford Robotics Institute is researching decentralized multi-robot coordination using Graph Neural Networks (GNNs). The approach builds an information-sharing mechanism within a decentralized multi-robot system through GNNs and imitation learning. It also uses machine-learning-assisted visual navigation with panoramic cameras to guide robots in unseen environments. Why it matters: This research could improve the effectiveness of automated mobile robot systems in urban rail transit and warehousing logistics in the GCC region, where smart city initiatives are growing.
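The decentralized information-sharing mechanism can be illustrated with one GNN message-passing round: each robot aggregates features only from neighbors within communication range, so no robot needs global state. This is a minimal sketch assuming a simple mean aggregator (GCN-style); the names and aggregation rule are illustrative, not the exact model used in the research.

```python
# Minimal sketch of one message-passing round in a decentralized
# multi-robot team. Mean aggregation is an illustrative assumption.

import math

def neighbors(positions, i, comm_range):
    """Indices of robots within communication range of robot i."""
    xi, yi = positions[i]
    return [j for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xi - xj, yi - yj) <= comm_range]

def message_passing_round(features, positions, comm_range):
    """One round: each robot replaces its feature vector with the mean of
    its own feature and its neighbors' - purely local communication."""
    updated = []
    for i, f in enumerate(features):
        msgs = [features[j] for j in neighbors(positions, i, comm_range)] + [f]
        dim = len(f)
        updated.append([sum(m[k] for m in msgs) / len(msgs) for k in range(dim)])
    return updated
```

Stacking several such rounds lets information propagate multiple hops across the team, which is how a GNN policy trained by imitation learning can approximate coordinated behavior without a central controller.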