MBZUAI faculty, researchers, and students will present 34 papers at the 35th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023). Fahad Khan is a co-author on 11 accepted papers, while Salman Khan and Shijian Lu have 10 and 9 papers, respectively. One paper focuses on person image synthesis via a denoising diffusion model, and another introduces PromptCAL for generalized novel category discovery. Why it matters: This large volume of acceptances at a top-tier conference highlights MBZUAI's growing prominence and research contributions in computer vision, with potential impact across various industries from online retail to autonomous driving.
MBZUAI faculty and students presented 31 papers at the 2022 Conference on Computer Vision and Pattern Recognition (CVPR), including 6 oral presentations. Professors Fahad Khan and Shijian Lu had 9 and 8 papers accepted respectively. Researchers collaborated with 57 institutions across 16 countries. Why it matters: MBZUAI's strong showing at a top-tier CV conference demonstrates the rapid growth and international collaboration of AI research in the UAE.
Researchers from MBZUAI and other institutions have developed a new framework called STEREO to improve the safety of text-to-image diffusion models. STEREO uses a two-stage approach: STE (Search Thoroughly Enough) based on adversarial training and REO (Robustly Erase Once) for batch concept erasure. This framework aims to enhance safety without significantly impacting the model's performance on normal queries. Why it matters: The framework addresses vulnerabilities in AI image generation, reducing the creation of inappropriate images while preserving performance on harmless queries.
KAUST's Visual Computing Center (VCC) is researching computer vision, image processing, and machine learning, with applications in self-driving cars, surveillance, and security. Professor Bernard Ghanem is working on teaching machines to understand visual data semantically, similar to how humans perceive the world. Self-driving cars use visual sensors to interpret traffic signals and detect obstacles, while computer vision also assists governments and corporations with security applications like facial recognition and detecting unattended luggage. Why it matters: Advancements in computer vision at KAUST can contribute to innovations in autonomous vehicles and enhance security measures in the region.
Pong C Yuen from Hong Kong Baptist University will present a talk on remote photoplethysmography (rPPG) detection. The talk will review the development of rPPG detection, share recent research, and discuss future directions. rPPG is a computer vision technique that measures physiological signals from video without physical contact, with healthcare applications such as heart rate estimation. Why it matters: Advancements in rPPG could enable new remote patient monitoring and diagnostic tools in the region, reducing the need for physical contact.
Axel Sauer from the University of Tübingen presented research on scaling Generative Adversarial Networks (GANs) using pretrained representations. The work explores shaping GANs into causal structures, training them up to 40 times faster, and achieving state-of-the-art image synthesis. The presentation mentions "Counterfactual Generative Networks", "Projected GANs", "StyleGAN-XL", and "StyleGAN-T". Why it matters: Scaling GANs and improving their training efficiency is crucial for advancing image and video synthesis, with implications for various applications in computer vision, graphics, and robotics.
Researchers at MBZUAI, IBM Research, and other institutions have developed EarthDial, a new vision-language model (VLM) specifically designed to process geospatial data from remote sensing technologies. EarthDial handles data in multiple modalities and resolutions, processing images captured at different times to observe environmental changes. The model outperformed others on over 40 tasks including image classification, object detection, and change detection. Why it matters: This unified model bridges the gap between generic VLMs and domain-specific models, enabling complex geospatial data analysis for applications like disaster assessment and climate monitoring in the region.
MBZUAI had 30 papers accepted at the International Conference on Computer Vision (ICCV) in Paris, out of 8,260 submissions. Visiting Professor Ivan Laptev served as one of the ICCV Program Chairs. Two papers from MBZUAI researchers focused on analyzing moving images, with one introducing Video-FocalNets for action analysis and the other exploring the transfer of knowledge from still image analysis to video. Why it matters: MBZUAI's strong presence at ICCV demonstrates its growing prominence in the global computer vision research landscape.