Video search gets closer to how humans look for clips

MBZUAI · Significant research

Summary

A new paper at ICCV 2025, co-authored by MBZUAI Ph.D. student Dmitry Demidov, introduces Dense-WebVid-CoVR, a 1.6-million-sample benchmark for composed video retrieval (CoVR), the task of finding a target video given a query video and a text describing how it should be modified. The benchmark features longer, context-rich video descriptions and modification texts, generated with Gemini Pro and GPT-4o and then manually verified. The paper also presents a unified fusion approach that reasons jointly over the video and text inputs, improving retrieval when the query hinges on fine-grained edit details.

Why it matters: This work moves video search closer to the way people actually describe the clips they want, which matters for creative and analytic workflows that depend on nuanced video retrieval.

Keywords

video retrieval · composed video retrieval · CoVR · MBZUAI · ICCV

