Skip to content
GCC AI Research

Search

Results for "Machine Perception"

Computer vision: Teaching computers how to see the world

KAUST ·

KAUST's Visual Computing Center (VCC) is researching computer vision, image processing, and machine learning, with applications in self-driving cars, surveillance, and security. Professor Bernard Ghanem is working on teaching machines to understand visual data semantically, similar to how humans perceive the world. Self-driving cars use visual sensors to interpret traffic signals and detect obstacles, while computer vision also assists governments and corporations with security applications like facial recognition and detecting unattended luggage. Why it matters: Advancements in computer vision at KAUST can contribute to innovations in autonomous vehicles and enhance security measures in the region.

Beyond self-driving simulations: teaching machines to learn

KAUST ·

KAUST researchers in the Image and Video Understanding Lab are applying machine learning to computer vision for automated navigation, including self-driving cars and UAVs. They tested their algorithms on KAUST roads, aiming to replicate the brain's efficiency in tasks like activity and object recognition. The team is also exploring the possibility of creative algorithms that can transfer skills without direct training. Why it matters: This research contributes to the advancement of autonomous systems and explores the fundamental questions of replicating human intelligence in machines within the GCC region.

Structured World Models for Robots

MBZUAI ·

Krishna Murthy, a postdoc at MIT, researches computational world models to enable robots to understand and operate effectively in the physical world. His work focuses on differentiable computing approaches for spatial perception and interfaces large image, language, and audio models with 3D scenes. Murthy envisions structured world models working with scaling-based approaches to create versatile robot perception and planning algorithms. Why it matters: This research could significantly advance robotics by enabling more sophisticated perception, reasoning, and action capabilities in embodied agents.

A unified theory of all things visual

MBZUAI ·

MBZUAI Professor Fahad Khan is working on a unified theory of machine visual intelligence. His goal is to enable AI systems to better understand and function in complex, chaotic visual environments. The aim is to improve real-world applications like smart cities, personalized healthcare, and autonomous vehicles. Why it matters: This research could significantly advance AI's ability to perceive and interact with the real world, especially in challenging environments common in the developing world.

Machine Learning Integration for Signal Processing

TII ·

Technology Innovation Institute's (TII) Directed Energy Research Center (DERC) is integrating machine learning (ML) techniques into signal processing to accelerate research. One project used convolutional neural networks to predict COVID-19 pneumonia from chest x-rays with 97.5% accuracy. DERC researchers also demonstrated that ML-based signal and image processing can retrieve up to 68% of text information from electromagnetic emanations. Why it matters: This adoption of ML for signal processing at TII highlights the potential for advanced AI techniques to enhance research and security applications in the UAE.

Foundations of Multisensory Artificial Intelligence

MBZUAI ·

Paul Liang from CMU presented on machine learning foundations for multisensory AI, discussing a theoretical framework for modality interactions. The talk covered cross-modal attention and multimodal transformer architectures, and applications in mental health, pathology, and robotics. Liang's research aims to enable AI systems to integrate and learn from diverse real-world sensory modalities. Why it matters: This highlights the growing importance of multimodal AI research and its potential for advancements across various sectors in the region, including healthcare and robotics.

Robot Navigation in the Wild

MBZUAI ·

Gregory Chirikjian presented an overview of research on robot navigation in unstructured environments, using computer vision, sensor tech, ML, and motion planning. The methods use multi-modal observations from RGB cameras, 3D LiDAR, and robot odometry for scene perception, along with deep RL for planning. These methods have been integrated with wheeled, home, and legged robots and tested in crowded indoor scenes, home environments, and dense outdoor terrains. Why it matters: This research pushes the boundaries of robotics in complex environments, paving the way for more versatile and autonomous robots in the Middle East.

To Make Just-Noticeable Difference (JND) Computable toward Visual Intelligence

MBZUAI ·

A professor from Nanyang Technological University (NTU), Singapore gave a talk at MBZUAI about "Just-Noticeable Difference (JND)" models in visual intelligence. The talk covered visual JND models, research and applications, and future opportunities for JND modeling. JND can help tackle big data challenges with limited resources by focusing on user-centric and green systems. Why it matters: Exploring JND could lead to advancements in AI applications related to visual signal processing, image synthesis, and generative AI in the region.