MBZUAI researchers have developed MAviS, a new multimodal dataset, benchmark, and chatbot for fine-grained bird species recognition. MAviS includes images, audio, and text to help models identify subtle differences between species, especially rare and regional varieties. The related study was presented at EMNLP 2025 and selected as a "Senior Area Chair Highlight". Why it matters: This work addresses a key limitation in AI's ability to support biodiversity conservation and ecological monitoring in the region and globally.
The paper details the hardware and software systems of ETH Zurich's Micro Aerial Vehicles (MAVs) used in the 2017 Mohamed Bin Zayed International Robotics Challenge (MBZIRC). The team integrated computer vision, sensor fusion, and control to develop autonomous outdoor platforms. They achieved second place in Challenge 3 and the Grand Challenge, demonstrating autonomous landing in under a minute and a 90%+ visual servoing success rate for object pickups. Why it matters: The work highlights the advanced state of robotics research and development showcased at the MBZIRC, contributing to the growth of autonomous systems in the region.
This paper presents a fully autonomous micro aerial vehicle (MAV) developed to pop balloons using onboard sensing and computing. The system was evaluated at the Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2020. The MAV successfully popped all five balloons in under two minutes in each of the three competition runs. Why it matters: This demonstrates the potential of autonomous robotics and computer vision for real-world applications in challenging environments.
The article discusses Team NimbRo's approaches to challenges involving micro aerial vehicles (MAVs) at the Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2017. The challenges included landing on a moving vehicle and a treasure hunt task requiring mission planning and multi-robot coordination. The team's systems achieved third place in both subchallenges and contributed to winning the MBZIRC Grand Challenge. Why it matters: This demonstrates advanced robotics capabilities developed and tested in the UAE, pushing the boundaries of autonomous aerial vehicle operation and multi-robot collaboration.
The Abu Dhabi Autonomous Racing League (A2RL) concluded its inaugural autonomous drone championship in Abu Dhabi, featuring 14 international teams. Team MavLab (TU Delft) won the AI Grand Challenge, AI vs Human Showdown, and AI Drag Race, while TII Racing (Technology Innovation Institute, Abu Dhabi) won the AI Multi-Autonomous Drone Race. In the AI vs Human challenge, MavLab's AI-powered drone outpaced a top human pilot in a complex head-to-head race. Why it matters: This event demonstrates the rapid advancements in AI-driven autonomous flight, positioning the UAE as a hub for innovation in aerial robotics and autonomous systems.
The Autonomous Robotics Research Center (ARRC) at TII won the Nanocopter AI Challenge 2022, part of the International Micro Air Vehicle Conference. The challenge involved developing AI-enabled solutions for Bitcraze’s Crazyflie nanocopters to perform vision-based obstacle avoidance. The ARRC team's nano-drone completed a 110m flight in 5 minutes with no crashes in a dynamic environment. Why it matters: This victory demonstrates the growing expertise in autonomous robotics and AI-powered drone technology within the UAE, with potential applications in search and rescue, industrial inspection, and precision agriculture.
KAUST's Machinist Development Apprenticeship Program (MDAP) graduated its second cohort in August 2020, training Saudi nationals in advanced manufacturing technologies. The 18-month program provides in-depth training at the Workshops Core Lab in collaboration with Yanbu Industrial College. Graduates acquire skills to contribute to Saudi Arabia's Vision 2030 in the manufacturing sector. Why it matters: This program addresses the need for skilled local talent in advanced manufacturing, crucial for diversifying the Saudi economy and achieving its Vision 2030 goals.
MBZUAI researchers introduce PG-Video-LLaVA, a large multimodal model with pixel-level grounding capabilities for videos that integrates audio cues for enhanced understanding. The model uses an off-the-shelf tracker and grounding module to localize objects in videos based on user prompts. PG-Video-LLaVA is evaluated on video question-answering and grounding benchmarks, using Vicuna in place of GPT-3.5 for benchmark evaluation to ensure reproducibility.