GCC AI Research

Results for "YOLO"

Drift-Corrected Monocular VIO and Perception-Aware Planning for Autonomous Drone Racing

arXiv ·

This paper details the autonomous drone racing system developed for the Abu Dhabi Autonomous Racing League (A2RL) x Drone Champions League competition. The system uses drift-corrected monocular Visual-Inertial Odometry (VIO) fused with YOLO-based gate detection for global position measurements, managed via a Kalman filter. A perception-aware planner generates trajectories that balance speed and gate visibility. Why it matters: The system's podium finishes validate the effectiveness of monocular vision-based autonomous drone flight and showcase advancements in AI-powered robotics within the UAE.
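
The drift-correction idea in this summary can be illustrated with a minimal sketch. It is not the paper's filter, whose design the summary does not describe. It assumes a single axis, models VIO drift as a slow random walk, and treats each gate detection as a noisy global position fix. The class `DriftFilter`, its parameters, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DriftFilter:
    """Scalar Kalman filter tracking VIO drift along one axis (illustrative only)."""

    drift: float = 0.0          # current drift estimate (m)
    var: float = 1.0            # variance of the drift estimate (m^2)
    process_var: float = 1e-3   # assumed random-walk growth of drift per step
    meas_var: float = 0.05      # assumed noise variance of a gate-derived fix

    def predict(self) -> None:
        # Random-walk model: the mean stays put, the uncertainty grows.
        self.var += self.process_var

    def update(self, vio_pos: float, gate_pos: float) -> None:
        # The gap between VIO and a global fix is a noisy observation of drift.
        innovation = (vio_pos - gate_pos) - self.drift
        gain = self.var / (self.var + self.meas_var)
        self.drift += gain * innovation
        self.var *= 1.0 - gain

    def corrected(self, vio_pos: float) -> float:
        # Drift-corrected position estimate.
        return vio_pos - self.drift


def run_demo() -> float:
    """Simulate a constant 0.5 m drift with a gate fix every fifth step."""
    kf = DriftFilter()
    true_drift = 0.5
    for step in range(50):
        true_pos = 0.1 * step
        vio_pos = true_pos + true_drift
        kf.predict()
        if step % 5 == 0:  # gates are assumed visible only occasionally
            kf.update(vio_pos, gate_pos=true_pos)
    return kf.drift


if __name__ == "__main__":
    print(f"estimated drift: {run_demo():.3f} m")
```

Between gate sightings the filter only inflates its uncertainty, so the next detection is weighted more heavily the longer the drone has flown without a fix. A real system would estimate a full 3D state and would have to handle outlier detections.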

MonoRace: Winning Champion-Level Drone Racing with Robust Monocular AI

arXiv ·

The paper presents MonoRace, an onboard drone racing approach using a monocular camera and IMU. The system combines neural-network-based gate segmentation with a drone model for robust state estimation, along with offline optimization using gate geometry. MonoRace won the 2025 Abu Dhabi Autonomous Drone Racing Competition (A2RL), outperforming AI teams and human world champions, reaching speeds up to 100 km/h. Why it matters: This demonstrates a significant advancement in autonomous drone racing, achieving champion-level performance with a resource-efficient monocular system, validated in a real-world competition setting in the UAE.

Spot-the-Camel: Computer Vision for Safer Roads

arXiv ·

Researchers in Saudi Arabia are applying computer vision techniques to reduce Camel-Vehicle Collisions (CVCs). They tested object detection models including CenterNet, EfficientDet, Faster R-CNN, SSD, and YOLOv8 on the task, finding YOLOv8 to be the most accurate and efficient. Future work will focus on developing a system to improve road safety in rural areas.

From YOLO to VLMs: Advancing Zero-Shot and Few-Shot Detection of Wastewater Treatment Plants Using Satellite Imagery in MENA Region

arXiv ·

A new study compares vision-language models (VLMs) to YOLOv8 for wastewater treatment plant (WWTP) identification in satellite imagery across the MENA region. VLMs like Gemma-3 demonstrate superior zero-shot performance compared to YOLOv8, which was trained on a dataset of 83,566 satellite images from Egypt, Saudi Arabia, and the UAE. The research suggests VLMs offer a scalable, annotation-free alternative for remote sensing of WWTPs.

A Decentralized Multi-Agent Unmanned Aerial System to Search, Pick Up, and Relocate Objects

arXiv ·

This paper presents a decentralized multi-agent unmanned aerial system designed for search, pickup, and relocation of objects. The system integrates multi-agent aerial exploration, object detection/tracking, and aerial gripping. The decentralized system uses global state estimation, reactive collision avoidance, and sweep planning for exploration. Why it matters: The system's successful deployment in demonstrations and competitions like MBZIRC highlights the potential of integrated robotic solutions for complex tasks such as search and rescue in the region.

A vision in color

KAUST ·

Shozo Yokoyama, a biology professor at Emory University specializing in color vision evolution, was interviewed by KAUST. Yokoyama's lab identified amino acids regulating red-green and UV vision in vertebrates. He emphasizes the importance of young scientists developing fresh perspectives on evolution and learning directly from animals. Why it matters: While not directly an AI story, the piece highlights KAUST's broader research focus and its investment in attracting and showcasing international scientific expertise, relevant to building a strong research ecosystem.