GCC AI Research

Results for "Motion Capture"

TII's Secure Systems Research Center in Abu Dhabi Announces Launch of First Motion Capture Facility Outside the United States

TII

TII's Secure Systems Research Center (SSRC) in Abu Dhabi has launched a motion capture (MOCAP) facility for testing drones in augmented reality, the first such facility outside the US. The facility will simulate environments such as Abu Dhabi city to provide high-precision ground truth for experiments, and will support the modelling and operation of a cloud-based, secure autonomous system of drones. Why it matters: This positions the UAE as a leader in drone security research, enabling advanced testing and development of secure drone systems for critical applications.
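One concrete use of MOCAP ground truth in drone experiments is scoring an onboard state estimate against the capture system's measurements. The sketch below is a generic illustration with synthetic data and hypothetical names, not SSRC's actual tooling.

```python
# Sketch: comparing a drone's onboard position estimate against MOCAP
# ground truth. Data and function names are hypothetical.
import numpy as np

def absolute_position_error(gt: np.ndarray, est: np.ndarray) -> dict:
    """gt, est: (N, 3) time-synchronized positions in the same world frame."""
    errors = np.linalg.norm(gt - est, axis=1)        # per-sample error in metres
    return {"rmse_m": float(np.sqrt(np.mean(errors ** 2))),
            "max_m": float(errors.max())}

# Synthetic straight-line flight at 2 m altitude, with a noisy estimate.
t = np.linspace(0.0, 10.0, 500)
ground_truth = np.stack([t, np.zeros_like(t), np.full_like(t, 2.0)], axis=1)
estimate = ground_truth + np.random.default_rng(0).normal(0.0, 0.02, ground_truth.shape)
print(absolute_position_error(ground_truth, estimate))
```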

Is Human Motion a Language without Words?

MBZUAI

This article previews a talk by Gül Varol from Ecole des Ponts ParisTech on bridging natural language and 3D human motions. The talk will cover text-to-motion synthesis using generative models and text-to-motion retrieval models based on the ACTOR, TEMOS, TMR, TEACH, and SINC papers. Varol's research interests include video representation learning, human motion synthesis, and sign languages. Why it matters: Research in this area could enable more intuitive human-computer interaction and new applications in areas like virtual reality and robotics.
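As a rough sketch of the retrieval setting these papers address (TMR in particular), text-to-motion retrieval typically reduces to embedding queries and motion clips in a shared space and ranking clips by cosine similarity. The embeddings below are random stand-ins, not outputs of the actual models.

```python
# Toy text-to-motion retrieval: rank motion-clip embeddings against a query
# embedding by cosine similarity. Real systems would produce these vectors
# with trained text and motion encoders.
import numpy as np

def retrieve(query: np.ndarray, gallery: np.ndarray, k: int = 3):
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q                                   # cosine similarities
    top = np.argsort(-scores)[:k]
    return top, scores[top]

rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 64))                 # stand-in motion embeddings
query = gallery[42] + 0.1 * rng.normal(size=64)      # a query "close to" clip 42
print(retrieve(query, gallery))                      # clip 42 should rank first
```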

Amplifying the Invisible: The Impact of Video Motion Magnification in Healthcare, Engineering, and Beyond

MBZUAI

Video motion magnification amplifies subtle movements in video footage, making the imperceptible visible. In healthcare, it allows non-invasive monitoring of vital signs and micro-expressions; in engineering, it helps detect structural vibrations in infrastructure; and it also finds use in sports science, security, and robotics. Why it matters: The technology's ability to reveal hidden details has the potential to revolutionize diagnostics, monitoring, and decision-making in diverse sectors across the Middle East.
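A well-known instance of this technique is Eulerian video magnification (Wu et al., SIGGRAPH 2012): band-pass filter each pixel's intensity over time, amplify the filtered signal, and add it back. The sketch below is a heavily simplified grayscale version of that idea, not necessarily the method behind the article.

```python
# Simplified Eulerian magnification: temporal band-pass per pixel, then amplify.
import numpy as np
from scipy.signal import butter, filtfilt

def magnify(frames: np.ndarray, fps: float, lo_hz: float, hi_hz: float,
            alpha: float = 50.0) -> np.ndarray:
    """frames: (T, H, W) grayscale video with values in [0, 1]."""
    b, a = butter(2, [lo_hz, hi_hz], btype="bandpass", fs=fps)
    filtered = filtfilt(b, a, frames, axis=0)        # band-pass along time
    return np.clip(frames + alpha * filtered, 0.0, 1.0)

# Synthetic example: a faint 1 Hz flicker (think: a pulse) becomes visible.
t = np.arange(300) / 30.0                            # 10 s at 30 fps
video = 0.5 + 0.002 * np.sin(2 * np.pi * t)[:, None, None] * np.ones((1, 8, 8))
out = magnify(video, fps=30.0, lo_hz=0.8, hi_hz=1.2)
print(np.ptp(video), "->", np.ptp(out))              # amplitude before vs. after
```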

Reconstruction and Animation of Realistic Head Avatars

MBZUAI

Egor Zakharov from the ETH Zurich AIT lab will present research on creating controllable, detailed 3D head avatars from data captured with consumer-grade devices. The presentation will cover high-fidelity image-based facial reconstruction and animation, as well as video-based reconstruction of detailed structures such as hairstyles. He will also showcase the integration of human-centric assets into virtual environments for real-time telepresence and entertainment. Why it matters: This research contributes to advancements in digital human modeling and telepresence, with applications in communication and gaming within the region.

VideoMolmo: Spatio-Temporal Grounding Meets Pointing

arXiv

Researchers from MBZUAI have introduced VideoMolmo, a large multimodal model for spatio-temporal pointing conditioned on textual descriptions. The model incorporates a temporal module with an attention mechanism and a temporal mask fusion pipeline using SAM2 for improved coherence across video sequences. They also curated a dataset of 72k video-caption pairs and introduced VPoS-Bench, a benchmark for evaluating generalization across real-world scenarios, with code and models publicly available.
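VideoMolmo's mask fusion is specific to its SAM2 pipeline, but the reason temporal fusion helps is generic: per-frame masks flicker, and blending each frame's mask with the fused history keeps a briefly missed object segmented. A toy version of that idea, unrelated to the paper's actual implementation:

```python
# Toy temporal mask fusion: exponentially blend per-frame soft masks so the
# track stays coherent across frames. Not the VideoMolmo/SAM2 pipeline.
import numpy as np

def fuse_masks(masks: np.ndarray, decay: float = 0.6, thresh: float = 0.5) -> np.ndarray:
    """masks: (T, H, W) soft masks in [0, 1]; returns binary fused masks."""
    fused = np.empty_like(masks)
    state = masks[0]
    fused[0] = state
    for t in range(1, len(masks)):
        state = decay * state + (1.0 - decay) * masks[t]
        fused[t] = state
    return (fused >= thresh).astype(np.uint8)

m = np.ones((3, 4, 4)); m[1] = 0.0                   # object missed in frame 1
print(fuse_masks(m).sum(axis=(1, 2)))                # area stays 16 in all frames
```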

How AI is building a whole new you

MBZUAI

MBZUAI researchers are working on digital twin technology that can replicate human beings in detail, with data flowing in real time between the physical original and its virtual counterpart. The project aims to extend digital twins from objects to organic entities such as humans, plants, and animals. The technology mines data from cameras, sensors, wearables, and other sources to predict health issues before they arise. Why it matters: This research has the potential to transform healthcare by enabling the prediction and prevention of health issues.
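One small ingredient such a twin would need is a per-signal baseline that updates in real time and flags deviations. The sketch below is entirely hypothetical, a Welford running mean/variance with a z-score alert, and is not MBZUAI's method.

```python
# Hypothetical digital-twin ingredient: track a vital sign's baseline online
# (Welford's algorithm) and alert when a reading deviates strongly from it.
from dataclasses import dataclass

@dataclass
class VitalTwin:
    mean: float = 0.0
    var: float = 0.0
    n: int = 0

    def update(self, x: float, z_alert: float = 3.0) -> bool:
        """Returns True if x is anomalous relative to the current baseline."""
        anomalous = self.n > 10 and abs(x - self.mean) > z_alert * self.var ** 0.5
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.var += (delta * (x - self.mean) - self.var) / self.n
        return anomalous

twin = VitalTwin()
for hr in [62, 64, 61, 63, 62, 65, 63, 62, 64, 63, 62, 110]:  # resting, then a spike
    if twin.update(float(hr)):
        print("alert: anomalous heart rate", hr)
```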

High-quality Neural Reconstruction in Real-world Scenes

MBZUAI

A researcher at the University of Oxford presented new findings on 3D neural reconstruction. The talk introduced a dataset of real-world video captures paired with precise ground-truth 3D models, along with a novel joint optimization method that refines camera poses during the reconstruction process. Why it matters: High-quality 3D reconstruction has broad applicability to robotics and computer vision applications in the region.
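The core idea behind joint pose refinement is to treat camera poses as free variables and descend on them together with the scene, rather than trusting the initial poses. The toy below shows this in a translation-only 2D setting with synthetic data; real pipelines refine full 6-DoF poses alongside a neural scene representation.

```python
# Toy joint optimization: gradient descent on 2D landmarks and per-camera
# translations simultaneously, driving the observation residuals to zero.
import numpy as np

rng = np.random.default_rng(0)
pts_true = rng.normal(size=(20, 2))                  # true scene landmarks
cam_true = rng.normal(scale=0.5, size=(4, 2))        # true camera translations
obs = pts_true[None, :, :] - cam_true[:, None, :]    # each camera's observations

pts = pts_true + rng.normal(scale=0.3, size=pts_true.shape)  # noisy geometry init
cams = np.zeros_like(cam_true)                       # poses initially unknown

for _ in range(1000):
    resid = (pts[None] - cams[:, None]) - obs        # (cams, points, 2)
    pts -= 0.1 * resid.mean(axis=0)                  # step on geometry
    cams += 0.1 * resid.mean(axis=1)                 # step on poses (opposite sign)

print("final RMS residual:", float(np.sqrt((resid ** 2).mean())))
```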

Cross-modal understanding and generation of multimodal content

MBZUAI

Nicu Sebe from the University of Trento presented recent work on video generation, focusing on animating objects in a source image using external information such as labels, driving videos, or text. He introduced a Learnable Game Engine (LGE), trained from annotated monocular videos, which maintains the states of scenes, objects, and agents and renders them from controllable viewpoints. Why it matters: This talk highlights advancements in cross-modal AI, potentially enabling new applications in gaming, simulation, and content creation within the region.
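To make "maintaining the states of scenes, objects, and agents" concrete, here is a hand-coded toy in which a state evolves under action tokens and a renderer turns the state into frames. The actual LGE learns both the dynamics and the renderer from annotated video; nothing below comes from the paper.

```python
# Hand-coded stand-in for a game-engine loop: state + action -> new state,
# then a renderer maps state to a frame. An LGE would learn both pieces.
from dataclasses import dataclass

@dataclass
class Agent:
    x: int
    y: int

MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}

def step(agent: Agent, action: str, size: int = 8) -> Agent:
    dx, dy = MOVES[action]
    return Agent(min(max(agent.x + dx, 0), size - 1),
                 min(max(agent.y + dy, 0), size - 1))

def render(agent: Agent, size: int = 8) -> str:
    grid = [["." for _ in range(size)] for _ in range(size)]
    grid[agent.y][agent.x] = "A"
    return "\n".join("".join(row) for row in grid)

agent = Agent(3, 3)
for action in ["right", "right", "down"]:            # an "action script"
    agent = step(agent, action)
print(render(agent))
```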