Lorenzo Jamone from Queen Mary University of London presented on cognitive robotics, focusing on tactile exploration and manipulation by robots. The talk covered how biology, engineering, and AI can be combined to build more advanced robotic systems. Jamone directs the CRISP group and has over 100 publications in cognitive robotics. Why it matters: This highlights the ongoing research into more sophisticated robotic systems that can interact with complex environments, an area crucial for future applications in manufacturing and human-robot collaboration in the GCC.
Krishna Murthy, a postdoc at MIT, researches computational world models to enable robots to understand and operate effectively in the physical world. His work focuses on differentiable computing approaches for spatial perception and on interfacing large image, language, and audio models with 3D scenes. Murthy envisions structured world models working alongside scaling-based approaches to create versatile robot perception and planning algorithms. Why it matters: This research could significantly advance robotics by enabling more sophisticated perception, reasoning, and action capabilities in embodied agents.
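The core idea behind differentiable spatial perception is that geometric estimates can be recovered by gradient descent on a differentiable loss. A minimal, hypothetical sketch (the hand-derived gradient stands in for the automatic differentiation a real system would use): estimating the 2D translation that aligns observed points to a map.

```python
# Hedged illustration of differentiable spatial perception: recover a 2D
# translation aligning observed points to reference points by gradient
# descent on a mean-squared-error loss. Real systems use autodiff over far
# richer scene representations; this toy derives the gradient by hand.
def estimate_translation(src, dst, lr=0.1, steps=200):
    """Find (tx, ty) minimizing mean ||(s + t) - d||^2 over point pairs."""
    tx, ty = 0.0, 0.0
    n = len(src)
    for _ in range(steps):
        gx = gy = 0.0
        for (sx, sy), (dx, dy) in zip(src, dst):
            # gradient of the squared-error term with respect to (tx, ty)
            gx += 2.0 * (sx + tx - dx) / n
            gy += 2.0 * (sy + ty - dy) / n
        tx -= lr * gx
        ty -= lr * gy
    return tx, ty

tx, ty = estimate_translation([(0.0, 0.0), (1.0, 0.0)],
                              [(1.0, -0.5), (2.0, -0.5)])
# tx converges toward 1.0, ty toward -0.5
```

Because the loss is differentiable end to end, the same pattern extends to rotations, camera poses, and learned scene features, which is what makes the approach composable with large pretrained models.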
Sami Haddadin from the Technical University of Munich (TUM) discusses a shift in robotics towards machines that autonomously develop their own blueprints and controls. He highlights advancements driven by human-centered design, soft control, and model-based machine learning, enabling human-robot collaboration in manufacturing and healthcare. Haddadin also presents progress towards autonomous machine design and modular control architectures for complex manipulation tasks. Why it matters: This research has implications for advancing robotics and AI in the GCC region, especially in manufacturing and healthcare, by enabling safer and more efficient human-robot collaboration.
Yoshihiko Nakamura from the University of Tokyo discusses the computational challenges of humanoid robots, extending beyond sensing and control to understanding human movement, sensation, and relationships. The talk covers recent research on mechanical humanoid robots with a focus on actuators and computational problems related to human movements. Nakamura highlights the need for humanoid robots to interpret human actions and interactions for effective application. Why it matters: Addressing these computational challenges is crucial for developing more sophisticated and human-compatible robots for use in various human-centered applications within the region and globally.
A presentation discusses the evolution of Vision-and-Language Navigation (VLN) from benchmarks like Room-to-Room (R2R). It highlights the role of Large Language Models (LLMs) such as GPT-4 in enabling more natural human-machine interactions. The presentation showcases work using LLMs to decode navigational instructions and improve robotic navigation. Why it matters: This research demonstrates the potential of merging vision, language, and robotics for advanced AI applications in navigation and human-computer interaction.
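The instruction-decoding step can be pictured as mapping a free-form sentence to a sequence of discrete navigation actions. A minimal, hypothetical sketch, with a toy keyword mapper standing in for the GPT-4 call a real VLN system would make:

```python
# Toy stand-in for LLM-based instruction decoding in VLN: map an
# R2R-style natural-language instruction to discrete robot actions.
# A real pipeline would prompt an LLM (e.g. GPT-4) for this mapping.
ACTION_MAP = {
    "forward": "MOVE_FORWARD",
    "left": "TURN_LEFT",
    "right": "TURN_RIGHT",
    "stop": "STOP",
}

def decode_instruction(instruction: str) -> list:
    """Extract an ordered action sequence from a navigation instruction."""
    actions = []
    for word in instruction.lower().replace(",", " ").split():
        if word in ACTION_MAP:
            actions.append(ACTION_MAP[word])
    return actions

print(decode_instruction("Go forward, turn left, then stop"))
# prints ['MOVE_FORWARD', 'TURN_LEFT', 'STOP']
```

The appeal of using an LLM for this step is that it handles phrasing the keyword mapper cannot ("head past the sofa and wait by the door"), grounding language in the agent's action space.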
Gregory Chirikjian presented an overview of research on robot navigation in unstructured environments, combining computer vision, sensor technology, machine learning, and motion planning. The methods use multi-modal observations from RGB cameras, 3D LiDAR, and robot odometry for scene perception, along with deep reinforcement learning (RL) for planning. These methods have been integrated on wheeled, home, and legged robots and tested in crowded indoor scenes, home environments, and dense outdoor terrains. Why it matters: This research pushes the boundaries of robotics in complex environments, paving the way for more versatile and autonomous robots in the Middle East.
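One way to picture such a multi-modal pipeline: per-sensor features are fused into a single state vector that a learned planner consumes. A minimal sketch with hypothetical feature shapes, where a simple reactive rule stands in for the deep RL policy:

```python
# Hypothetical sketch of multi-modal sensor fusion for navigation.
# Feature dimensions and the reactive rule are illustrative, not from
# the talk; a real system feeds the fused state to a deep RL policy.
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    rgb_features: List[float]   # e.g. a CNN embedding of the camera image
    lidar_ranges: List[float]   # downsampled 3D LiDAR range scan (metres)
    odometry: List[float]       # (x, y, heading) from robot odometry

def fuse(obs: Observation) -> List[float]:
    """Concatenate modalities into one state vector for a learned planner."""
    return obs.rgb_features + obs.lidar_ranges + obs.odometry

def reactive_policy(obs: Observation, safe_dist: float = 1.0) -> str:
    """Toy stand-in for an RL policy: halt when an obstacle is too close."""
    return "STOP" if min(obs.lidar_ranges) < safe_dist else "GO"

obs = Observation([0.1, 0.2, 0.3], [2.4, 0.6, 3.1], [0.0, 0.0, 1.57])
state = fuse(obs)           # 9-dimensional fused state
action = reactive_policy(obs)
```

Concatenation is the simplest fusion scheme; the point is that perception and planning share one state representation across sensors and robot platforms.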
Ivan Laptev from INRIA Paris presented a talk at MBZUAI on embodied multi-modal visual understanding, covering advancements in video understanding tasks like question answering and captioning. The talk highlighted recent work on vision-language navigation and manipulation. He argued that detailed understanding of the physical world through vision is still in early stages, discussing open research directions related to robotics and video generation. Why it matters: The discussion of robotics applications and future research directions in embodied AI could influence the direction of AI research and development in the UAE, particularly at MBZUAI.
Tetsunari Inamura's talk explores using virtual reality (VR) to collect human-robot interaction (HRI) data and to tailor assistive robotic functionalities to individual users. He discusses symbol emergence via multimodal interaction, interactive behavior generation through symbol manipulation, and VR for data collection. The talk emphasizes long-term human capability enhancement and avoiding over-reliance on technology. Why it matters: This research promotes independence and growth in human-robot interactions, potentially revolutionizing assistive technologies in the region.