A presentation traces the evolution of Vision-and-Language Navigation (VLN), starting from benchmarks such as Room-to-Room (R2R). It highlights the role of Large Language Models (LLMs) such as GPT-4 in enabling more natural human-machine interaction. The presentation showcases work that uses LLMs to interpret navigational instructions and improve robotic navigation. Why it matters: This research demonstrates the potential of combining vision, language, and robotics for advanced AI applications in navigation and human-computer interaction.
MBZUAI researchers presented EXAMS-V, a new benchmark dataset for evaluating the reasoning and processing abilities of vision language models (VLMs). EXAMS-V contains over 20,000 multiple-choice questions across 26 subjects and 11 languages, including Arabic. The questions are presented inside images, which tests a VLM's ability to integrate visual and textual information. Why it matters: This dataset fills a gap in VLM evaluation, providing a valuable resource for assessing and improving the multimodal reasoning capabilities of these models across languages, including Arabic.
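As a rough illustration of how results on a multilingual multiple-choice benchmark of this kind might be broken down, the sketch below computes accuracy per language. The record layout (language code, predicted option, gold option) and the function name `accuracy_by_language` are assumptions made for this example and are not taken from the EXAMS-V release.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record: (language code, predicted option, gold option).
Record = Tuple[str, str, str]


def accuracy_by_language(records: Iterable[Record]) -> Dict[str, float]:
    """Return the fraction of correctly answered questions for each language."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for language, predicted, gold in records:
        total[language] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if predicted.strip().upper() == gold.strip().upper():
            correct[language] += 1
    return {language: correct[language] / total[language] for language in total}
```

Reporting one score per language, rather than a single pooled number, makes it easier to see whether a model that does well overall still lags on a specific language such as Arabic.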
The inaugural ASPIRE Abu Dhabi Autonomous Racing League (A2RL) will take place on April 27th at the Yas Marina Circuit, with eight teams competing for a $2.25 million prize. Teams will use identical Dallara Super Formula SF23 cars made autonomous by TII, so each team must rely on its own code and AI algorithms to race. The event will feature autonomous cars racing simultaneously and an AI-versus-human race with former F1 driver Daniil Kvyat. Why it matters: This event highlights the UAE's commitment to advancing AI and autonomous systems, potentially establishing Abu Dhabi as a hub for autonomous vehicle innovation in extreme conditions.
This paper introduces Arabic language integration into Vision-and-Language Navigation (VLN) in robotics, evaluating multilingual small language models (SLMs) such as GPT-4o mini, Llama 3 8B, and Phi-3 14B, alongside the Arabic-centric model Jais, using the NavGPT framework. The study uses the R2R dataset to assess the impact of language on navigation reasoning through zero-shot sequential action prediction. Results show that the framework enables high-level planning in both English and Arabic, though some models struggle with Arabic because of reasoning limitations and parsing issues. Why it matters: This work highlights the need to improve language model planning and reasoning for effective navigation, especially to unlock the potential of Arabic-language models in real-world applications.
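To make "zero-shot sequential action prediction" more concrete, here is a minimal sketch of such a loop: at each step the model sees the instruction, its current location, and the reachable viewpoints, and must name the next one. This is not the NavGPT implementation. The prompt wording, the `Action:` reply format, the `STOP` token, and the names `navigate`, `parse_action`, and `llm` are all assumptions made for illustration, and `llm` stands in for whichever model is being evaluated.

```python
import re
from typing import Callable, Dict, List, Optional


def parse_action(reply: str, candidates: List[str]) -> Optional[str]:
    """Extract the chosen viewpoint from a reply, or None if it cannot be parsed."""
    match = re.search(r"Action:\s*(\S+)", reply)
    if match is None:
        return None
    choice = match.group(1).strip(".,;:\"'")
    # Reject anything that is not one of the options offered at this step.
    return choice if choice in candidates else None


def navigate(
    instruction: str,
    start: str,
    graph: Dict[str, List[str]],   # hypothetical map: viewpoint -> reachable viewpoints
    llm: Callable[[str], str],     # hypothetical model call: prompt in, reply text out
    max_steps: int = 10,
) -> List[str]:
    """Ask the model for one action per step, with no task-specific fine-tuning."""
    path = [start]
    for _ in range(max_steps):
        current = path[-1]
        candidates = graph.get(current, []) + ["STOP"]
        prompt = (
            f"Instruction: {instruction}\n"
            f"Visited so far: {', '.join(path)}\n"
            f"You are at: {current}\n"
            f"Options: {', '.join(candidates)}\n"
            "Reply with 'Action: <option>'."
        )
        action = parse_action(llm(prompt), candidates)
        # An unparseable reply ends the episode, as does an explicit STOP.
        if action is None or action == "STOP":
            break
        path.append(action)
    return path
```

In a loop like this, a reply that does not follow the expected format ends the episode early, which is one way parsing issues of the kind the study reports could lower navigation success independently of the model's planning ability.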
Abu Dhabi's Advanced Technology Research Council (ATRC) has launched AI71, a new AI company building on the Falcon generative AI models developed by TII. AI71 will focus on multi-domain specializations and will offer data control options for companies and countries that want to self-host for greater privacy. The company will be taken to market by ATRC's VentureOne subsidiary, initially targeting the medical, educational, and legal sectors. Why it matters: AI71 aims to establish Abu Dhabi and the UAE as a major AI player by providing decentralized data ownership and promoting broader access to AI technology.