GCC AI Research


Results for "webcam"

High-quality Neural Reconstruction in Real-world Scenes

MBZUAI

A researcher at the University of Oxford presented new findings on 3D neural reconstruction. The talk introduced a dataset of real-world video captures paired with highly accurate 3D models, along with a novel joint optimization method that refines camera poses during reconstruction. Why it matters: High-quality 3D reconstruction underpins a broad range of robotics and computer vision applications in the region.
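
As a loose illustration of the joint-optimization idea (not the presenter's method), the sketch below backpropagates a photometric loss into both a toy scene model and learnable per-frame camera pose corrections. The TinySceneMLP, the translation-only pose parameterization, and the synthetic data are assumptions made for brevity.

    # Hedged sketch: jointly refining camera poses while fitting a scene model.
    # Everything here is an illustrative stand-in, not the method from the talk.
    import torch
    import torch.nn as nn

    class TinySceneMLP(nn.Module):
        """Toy differentiable scene model mapping 3D points to RGB."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3, 64), nn.ReLU(),
                nn.Linear(64, 3), nn.Sigmoid(),
            )

        def forward(self, points):
            return self.net(points)

    num_frames = 8
    scene = TinySceneMLP()
    # Learnable per-frame pose corrections (translation offsets only, for brevity;
    # a real system would refine full 6-DoF poses, e.g. on SE(3)).
    pose_deltas = nn.Parameter(torch.zeros(num_frames, 3))

    optimizer = torch.optim.Adam(
        [{"params": scene.parameters(), "lr": 1e-3},
         {"params": [pose_deltas], "lr": 1e-4}]
    )

    # Dummy data: per-frame sampled 3D points and observed pixel colors.
    points = torch.randn(num_frames, 128, 3)
    target_rgb = torch.rand(num_frames, 128, 3)

    for step in range(200):
        optimizer.zero_grad()
        # Apply the current pose correction before querying the scene model,
        # so gradients flow into both the scene and the camera poses.
        corrected = points + pose_deltas[:, None, :]
        pred_rgb = scene(corrected)
        loss = ((pred_rgb - target_rgb) ** 2).mean()
        loss.backward()
        optimizer.step()

A real pipeline would render through a volumetric or surface representation rather than querying points directly, but the coupling of scene parameters and pose parameters in one optimization loop is the core of the idea.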

Real-time Few-shot Realistic Avatars

MBZUAI

Ekaterina Radionova from Smarter AI (formerly Samsung AI Center) presented an approach to generating lifelike avatars in real time. The work focuses on producing high-quality video with authentic facial features while supporting online generation. Radionova holds a master's degree in Data Science from Skoltech and a bachelor's degree in Applied Mathematics from the Moscow Institute of Physics and Technology. Why it matters: Realistic real-time avatars are critical for applications in online communication, entertainment, and virtual reality within the region.

Metaverse healthcare in red, green, and blue

MBZUAI

Researchers at MBZUAI developed a method to measure vital signs using webcams by analyzing color intensity changes caused by facial blood flow. They built a digital twin system that uses machine learning to combine heart rate, respiratory rate, and blood oxygen level measurements. The system displays real-time vital sign information, enabling remote patient triage. Why it matters: This research contributes to the advancement of telemedicine, potentially improving healthcare access in underserved regions and aligning with UN Sustainable Development Goal 3 on good health and well-being.
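
To make the measurement idea concrete, the snippet below sketches a remote-photoplethysmography-style heart-rate estimate: average the green channel over a face region, band-pass filter the resulting signal, and read off the dominant frequency. The estimate_heart_rate function, the filter band, and the synthetic test data are illustrative assumptions, not the MBZUAI system.

    # Hedged sketch: heart rate from webcam frames via color changes in a face
    # region (rPPG-style). ROI choice, filter band, and peak picking are assumptions.
    import numpy as np
    from scipy.signal import butter, filtfilt

    def estimate_heart_rate(frames, fps=30.0):
        """frames: array of shape (T, H, W, 3), RGB frames of a cropped face region."""
        # 1. Spatially average the green channel, which is most sensitive to
        #    blood-volume changes, giving one sample per frame.
        signal = frames[:, :, :, 1].mean(axis=(1, 2))
        signal = signal - signal.mean()

        # 2. Band-pass filter to a plausible heart-rate range
        #    (0.7-4 Hz, roughly 42-240 beats per minute).
        low, high = 0.7, 4.0
        b, a = butter(3, [low / (fps / 2), high / (fps / 2)], btype="band")
        filtered = filtfilt(b, a, signal)

        # 3. Take the dominant frequency of the filtered signal as the pulse.
        spectrum = np.abs(np.fft.rfft(filtered))
        freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fps)
        mask = (freqs >= low) & (freqs <= high)
        peak_hz = freqs[mask][np.argmax(spectrum[mask])]
        return peak_hz * 60.0  # beats per minute

    # Example with synthetic data: 10 seconds of noisy frames carrying a
    # 1.2 Hz (72 bpm) intensity oscillation.
    t = np.arange(10 * 30) / 30.0
    pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t)
    frames = np.random.rand(len(t), 8, 8, 3) + pulse[:, None, None, None]
    print(round(estimate_heart_rate(frames)))  # expected to be near 72

Respiratory rate can be estimated the same way with a lower frequency band, which is presumably how a combined vital-signs pipeline would reuse the signal.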

Cross-modal understanding and generation of multimodal content

MBZUAI

Nicu Sebe from the University of Trento presented recent work on video generation, focusing on animating objects in a source image using external information such as labels, driving videos, or text. He introduced a Learnable Game Engine (LGE) trained on monocular annotated videos, which maintains states of scenes, objects, and agents to render controllable viewpoints. Why it matters: This talk highlights advancements in cross-modal AI, potentially enabling new applications in gaming, simulation, and content creation within the region.