KAUST's Visual Computing Center (VCC) is researching computer vision, image processing, and machine learning, with applications in self-driving cars, surveillance, and security. Professor Bernard Ghanem is working on teaching machines to understand visual data semantically, similar to how humans perceive the world. Self-driving cars use visual sensors to interpret traffic signals and detect obstacles, while computer vision also assists governments and corporations with security applications like facial recognition and detecting unattended luggage. Why it matters: Advancements in computer vision at KAUST can contribute to innovations in autonomous vehicles and enhance security measures in the region.
This paper proposes a smart dome model for mosques that uses AI to control dome movements based on weather conditions and overcrowding. The model utilizes Congested Scene Recognition Network (CSRNet) and fuzzy logic techniques in Python to determine when to open and close the domes to maintain fresh air and sunlight. The goal is to automatically manage dome operation based on real-time data, specifying the duration for which the domes should remain open each hour.
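As a rough illustration of how a crowd estimate and a weather reading could be combined by fuzzy rules into an hourly opening duration, here is a minimal pure-Python sketch. It is not the paper's implementation: the membership breakpoints, the four rules, the `capacity` parameter, and every function name are hypothetical choices made for this example. The only CSRNet-related assumption is that the network outputs a density map whose sum approximates the head count.

```python
# Illustrative sketch only: thresholds, rules, and names are assumptions,
# not values or code taken from the paper.

def rise(x, a, b):
    """Membership that is 0 at or below a, 1 at or above b, linear in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def fall(x, a, b):
    """Membership that is 1 at or below a, 0 at or above b, linear in between."""
    return 1.0 - rise(x, a, b)


def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def crowd_count(density_map):
    """Estimate head count as the sum of a crowd density map (list of rows)."""
    return sum(sum(row) for row in density_map)


def open_minutes(count, temp_c, capacity=1000):
    """Return minutes per hour (0 to 60) that the dome should stay open."""
    occupancy = min(count / capacity, 1.0)

    # Fuzzify the inputs (hypothetical breakpoints).
    crowd_low = fall(occupancy, 0.2, 0.5)
    crowd_high = rise(occupancy, 0.4, 0.8)
    mild = tri(temp_c, 15, 24, 33)
    harsh = max(fall(temp_c, 10, 18), rise(temp_c, 30, 38))

    # Each rule pairs a firing strength (min acts as fuzzy AND)
    # with a crisp output in minutes.
    rules = [
        (min(crowd_high, mild), 60),   # crowded, pleasant weather: fully open
        (min(crowd_low, mild), 30),    # quiet, pleasant weather: half the hour
        (min(crowd_high, harsh), 15),  # crowded, harsh weather: brief airing
        (min(crowd_low, harsh), 0),    # quiet, harsh weather: keep closed
    ]

    # Defuzzify with a weighted average of the rule outputs.
    total = sum(weight for weight, _ in rules)
    if total == 0:
        return 0.0
    return sum(weight * minutes for weight, minutes in rules) / total


if __name__ == "__main__":
    density = [[0.5, 0.5], [1.0, 0.0]]  # toy density map, sums to 2 people
    print(crowd_count(density))
    print(open_minutes(count=900, temp_c=24))
    print(open_minutes(count=100, temp_c=45))
```

The weighted-average defuzzification is a simplification chosen to keep the sketch short; a fuller fuzzy controller would typically define output membership functions and use centroid defuzzification.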
KAUST's Visual Computing Center (VCC) hosted an Open House event on March 28, showcasing its interdisciplinary research in visual computing. Demonstrations included a virtual reality driving simulator by FalconViz, intended for driver education in Saudi Arabia. Researchers also presented a drone trained to autonomously navigate race courses and a neural network for autonomous driving using image-based technology without GPS. Why it matters: The VCC's work highlights KAUST's role in advancing visual computing applications relevant to Saudi Arabia, from driver training to autonomous systems.
Dr. Andrew Bastawrous, CEO and co-founder of Peek, discussed his work on mobile eye clinics at KAUST. He developed Peek Acuity and Peek Retina, which turn smartphones into tools for detecting visual impairment. The technology uses the smartphone screen for vision testing and a camera clip-on to image the inside of the eye. Why it matters: This low-cost mobile ophthalmic tool has the potential to prevent and treat vision loss in underserved communities.
KAUST computer scientist Mohamed Elhoseiny and his VISION CAIR team developed Creative Walk Adversarial Networks (CWAN) for novel art generation. CWAN learns from existing art styles and deviates from them using 'random walk deviation' methods. Human evaluators preferred CWAN-generated art to the output of other methods such as StyleGAN2. Why it matters: The research demonstrates AI's potential as a valuable tool for artists, enabling the creation of unique and meaningful art. The research also explores more effective emotional language in image captioning.
MBZUAI graduate Ahmed Sharshar developed a computer vision application that assesses lung health from a video of a person breathing, estimating Forced Vital Capacity (FVC), Forced Expiratory Volume in 1 second (FEV1), and Peak Expiratory Flow (PEF). The model achieved up to 100% accuracy using thermal video data from 60 participants. Sharshar aims to create lightweight models applicable in developing countries without high-end GPUs. Why it matters: This research showcases the potential of AI to democratize healthcare access through non-invasive, accessible diagnostic tools.
A proposed recognition system aims to identify missing persons, deceased individuals, and lost objects during the Hajj and Umrah pilgrimages in Saudi Arabia. The system is intended to use facial recognition and object identification to help manage crowds that are estimated to reach 20 million pilgrims in the coming decade. It will be integrated into the CrowdSensing system for crowd estimation, management, and safety.
This seminar explores self-supervised representation learning for vision systems, covering the challenges in mainstream self-supervised vision methods and solutions to them. It discusses developing versatile representations across modalities, tasks, and architectures to advance vision foundation models. The talk will be given by Tong Zhang of EPFL, whose background includes Beihang University, New York University, and Australian National University. Why it matters: Advancing vision foundation models is crucial for expanding AI applications, especially in the Middle East, where computer vision can address challenges in areas like urban planning, agriculture, and environmental monitoring.