Researchers at KAUST have developed a new method, Deep State Identifier, for extracting information relevant to reinforcement learning from videos. The method learns to predict returns from video-encoded episodes and then identifies critical states using mask-based sensitivity analysis. Experiments demonstrate the method's potential for understanding and improving agent behavior in deep reinforcement learning (DRL).
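The core idea behind mask-based sensitivity analysis can be illustrated with a much simpler leave-one-out variant: mask each frame in turn and measure how much the predicted return changes. This is a toy sketch, not the paper's method (which learns a mask end-to-end); the `predict_return` function, feature shapes, and zero-masking scheme here are all illustrative assumptions.

```python
import numpy as np

def frame_sensitivity(frames, predict_return):
    """Score each frame by how much masking it shifts the predicted return.

    A simplified, leave-one-out stand-in for learned-mask sensitivity
    analysis; setup and names are illustrative, not from the paper.
    """
    base = predict_return(frames)
    scores = []
    for i in range(len(frames)):
        masked = frames.copy()
        masked[i] = 0.0  # zero out one frame's features
        scores.append(abs(predict_return(masked) - base))
    return np.array(scores)

# Toy episode: 5 frames of 3-d features, with a hypothetical linear
# return predictor that depends only on frame 2.
rng = np.random.default_rng(0)
frames = rng.normal(size=(5, 3))
w = np.zeros((5, 3))
w[2] = 1.0  # only frame 2 contributes to the return
predict = lambda x: float((w * x).sum())

scores = frame_sensitivity(frames, predict)
critical = int(scores.argmax())  # index of the most critical frame
```

Frames whose removal barely changes the prediction are unimportant; frames whose removal changes it a lot are the "critical states" an analyst would inspect.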
KAUST's Image and Video Understanding Lab is developing machine learning algorithms for computer vision and object tracking, with applications in video content search and UAV navigation. Their algorithms can recognize specific activities in videos, helping platforms flag unwanted content and deliver relevant ads. The object tracking algorithm also enables UAVs to follow objects autonomously. Why it matters: This research enhances video content analysis and UAV capabilities, positioning KAUST as a leader in computer vision and AI applications within the region.
KAUST led a session at the World Economic Forum's Meeting of the New Champions in Dalian, China, focusing on sustainability science. President Tony Chan and faculty members Peiying Hong, Mohamed Eddaoudi, and Derya Baran presented KAUST's research in water reuse, carbon capture, and transparent solar cells. Derya Baran highlighted KAUST spinoff iyris, which aims to turn windows into solar power plants. Why it matters: This showcases KAUST's role as an innovative hub for global research and education, particularly in green technologies, and highlights the university's commitment to addressing environmental challenges.
A new paper at ICCV 2025, co-authored by MBZUAI Ph.D. student Dmitry Demidov, introduces Dense-WebVid-CoVR, a 1.6-million sample benchmark for composed video retrieval (CoVR). The benchmark features longer, context-rich descriptions and modification texts, generated using Gemini Pro and GPT-4o, with manual verification. The paper also presents a unified fusion approach that jointly reasons across video and text inputs, improving performance on fine-grained edit details. Why it matters: This work advances video search capabilities by enabling more human-like queries, which is crucial for creative and analytic workflows that require nuanced video retrieval.
MBZUAI researchers introduce VideoMathQA, a new benchmark for evaluating mathematical reasoning in videos, requiring models to interpret visual information, text, and spoken cues. The dataset spans 10 mathematical domains with videos ranging from 10 seconds to over 1 hour, and includes multi-step reasoning annotations. The benchmark aims to evaluate temporal cross-modal reasoning and highlights the limitations of existing approaches in complex video-based mathematical problem solving.
The UAE government has issued a warning to the public regarding the dangers of misleading AI-generated videos, particularly those used to spread rumors and false information. Authorities emphasized the importance of verifying the credibility of video content before sharing it on social media, and the warning highlights potential legal consequences for individuals involved in creating or disseminating such content. Why it matters: This proactive stance reflects the UAE's growing concern about the misuse of AI-driven technologies and its commitment to combating disinformation.
KAUST doctoral student Royale Hardenstine is conducting whale shark research in the Red Sea. The research is captured in a video produced by KAUST. Why it matters: This highlights KAUST's ongoing research efforts in marine biology and Red Sea ecosystems.