GCC AI Research

I see what you’re saying: the Abu Dhabi AI researchers making video dubbing sync

MBZUAI · Notable

Summary

Researchers at MBZUAI have developed Auto-DUB, a system that uses deep learning, natural language processing (NLP), and computer vision (CV) to improve audio-visual dubbing, particularly for educational videos. Its three-step process generates subtitles, creates an audio representation, and synchronizes the audio with the speaker's lip movements. The system aims to overcome language barriers in e-learning by providing accurate translations and lip-synced audio. Why it matters: This research addresses a critical need in online education by making content more accessible to non-native English speakers, potentially expanding access to global educational resources in the Arab world.

Keywords

MBZUAI · dubbing · e-learning · NLP · CV

Related

MBZUAI team wins top prize at inaugural Arabic Natural Language Processing Conference

MBZUAI

An MBZUAI team won the best paper award at the inaugural Arabic Natural Language Processing Conference for their work on processing Arabic speech. Their study establishes a new approach to the complexities of spoken Arabic, which differs significantly from the written text that language models typically handle. The approach aims to enable new tools for Arabic speakers by addressing challenges such as intonation and the continuous nature of speech. Why it matters: This award highlights the importance of specialized research in Arabic NLP, as mainstream LLMs often face limitations in accurately processing the nuances of Arabic speech.

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

arXiv

MBZUAI researchers introduce PG-Video-LLaVA, a large multimodal model with pixel-level grounding capabilities for videos, integrating audio cues for enhanced understanding. The model uses an off-the-shelf tracker and grounding module to localize objects in videos based on user prompts. PG-Video-LLaVA is evaluated on video question-answering and grounding benchmarks, using Vicuna instead of GPT-3.5 for reproducibility.

20 million words and counting: UAE’s grand plan to power Arabic with AI - Gulf Business

WAM

The UAE government is developing large language models (LLMs) specifically for the Arabic language, with a target training dataset of 20 million words. This initiative aims to overcome the underrepresentation of Arabic in existing AI models. The project seeks to enhance AI's ability to understand and generate nuanced Arabic content. Why it matters: A national Arabic LLM can enable culturally relevant AI applications across various sectors in the region, from education to government services.