GCC AI Research

Weekly Digest

Nov 24 – Nov 30, 2025

Top Stories

Video-R2: Reinforcing Consistent and Grounded Reasoning in Multimodal Language Models

arXiv · CV · RL

Researchers at MBZUAI have introduced Video-R2, a reinforcement learning approach to improve the consistency and visual grounding of reasoning in multimodal language models. Video-R2 combines timestamp-aware supervised fine-tuning with Group Relative Policy Optimization (GRPO) guided by a Temporal Alignment Reward (TAR). The model demonstrates higher Think Answer Consistency (TAC), Video Attention Score (VAS), and accuracy across multiple benchmarks, showing improved temporal alignment and reasoning coherence for video understanding.
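The core training recipe pairs a group-relative advantage (GRPO) with a temporal reward. A minimal sketch of that idea, with a hypothetical Temporal Alignment Reward based on timestamp proximity (the paper's exact formulation may differ; the tolerance and reward shape here are assumptions):

```python
import statistics

def temporal_alignment_reward(pred_ts, gold_ts, tol=1.0):
    # Hypothetical TAR: fraction of predicted timestamps that land
    # within `tol` seconds of some gold timestamp.
    if not pred_ts:
        return 0.0
    hits = sum(1 for p in pred_ts if any(abs(p - g) <= tol for g in gold_ts))
    return hits / len(pred_ts)

def grpo_advantages(rewards):
    # GRPO: normalize each sampled response's reward by the group's
    # mean and standard deviation instead of a learned value baseline.
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0
    return [(r - mu) / sigma for r in rewards]

# Three sampled responses for one video, scored against gold timestamps.
rewards = [temporal_alignment_reward(p, [3.0, 8.0])
           for p in ([2.5, 8.2], [1.0], [3.1, 7.9, 20.0])]
advs = grpo_advantages(rewards)
```

Responses whose cited timestamps align better with the gold moments get positive advantages and are reinforced; the group mean acts as the baseline, so no critic network is needed.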

Video-CoM: Interactive Video Reasoning via Chain of Manipulations

arXiv · CV · RL

Researchers at MBZUAI introduce "Interactive Video Reasoning," a new paradigm enabling models to actively "think with videos" by performing iterative visual actions to gather and refine evidence. They developed Video-CoM, which reasons through a Chain of Manipulations (CoM), and constructed Video-CoM-Instruct, an 18K-example instruction-tuning dataset for multi-step manipulation reasoning. The model is further optimized via reinforcement learning with reasoning-aware Group Relative Policy Optimization (GRPO), achieving strong results across nine video reasoning benchmarks.
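The interactive loop described above can be sketched as follows. Everything here is a toy illustration, not the paper's implementation: the manipulation (narrowing to a frame range) and the stand-in model are assumptions chosen to show the alternation between acting on the video and reasoning over gathered evidence:

```python
from dataclasses import dataclass

@dataclass
class Step:
    kind: str        # "manipulate" or "answer"
    payload: object  # a frame range, or the final answer

def chain_of_manipulations(model, video, question, max_steps=4):
    # Hypothetical loop in the spirit of Chain of Manipulations: the
    # model alternates between requesting a visual manipulation and
    # reasoning over the refined evidence until it commits to an answer.
    evidence = [video]
    for _ in range(max_steps):
        step = model(question, evidence)
        if step.kind == "answer":
            return step.payload
        lo, hi = step.payload            # frame range to zoom into
        evidence.append(video[lo:hi])
    return None

# Toy stand-in model: first narrows to frames 2..5, then answers with
# the largest frame value among the evidence it gathered.
def toy_model(question, evidence):
    if len(evidence) == 1:
        return Step("manipulate", (2, 5))
    return Step("answer", max(evidence[-1]))

answer = chain_of_manipulations(toy_model, [1, 9, 4, 7, 2, 8], "peak?")
```

The point of the paradigm is that evidence gathering is part of the reasoning trace itself, rather than a single pass over pre-extracted frames.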

Mastiska Raises $10M Seed For Sovereign AI Chip Play in UAE - EE Times

GCC AI Funding · Funding · Infrastructure

UAE-based sovereign AI chip company Mastiska has raised a $10 million seed round. The funding will be used to develop sovereign AI chips tailored to the needs of the UAE and the wider region. Mastiska aims to address data privacy and security concerns by providing locally controlled AI infrastructure. Why it matters: This investment signals the UAE's commitment to building its own AI capabilities and reducing reliance on foreign technology.

FanarGuard: A Culturally-Aware Moderation Filter for Arabic Language Models

arXiv · NLP · LLM

The paper introduces FanarGuard, a bilingual moderation filter for Arabic and English language models that considers both safety and cultural alignment. A dataset of 468K prompt-response pairs was created and scored by LLM judges on harmlessness and cultural awareness to train the filter. The first benchmark targeting Arabic cultural contexts was developed to evaluate cultural alignment. Why it matters: FanarGuard advances context-sensitive AI safeguards by integrating cultural awareness into content moderation, addressing a critical gap in current alignment techniques.
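A filter trained on both harmlessness and cultural-awareness scores effectively enforces two gates rather than one. A minimal sketch of that decision rule, where the thresholds and label names are assumptions for illustration (the paper trains a learned filter on judge scores; this only shows how the two axes combine):

```python
def moderation_label(harmlessness, cultural_awareness,
                     h_thresh=0.5, c_thresh=0.5):
    # Hypothetical two-axis rule: a prompt-response pair passes only
    # if its judge scores for BOTH harmlessness and cultural
    # awareness clear their thresholds; each axis fails separately.
    if harmlessness < h_thresh:
        return "unsafe"
    if cultural_awareness < c_thresh:
        return "culturally_misaligned"
    return "pass"
```

Separating the two failure modes matters because a response can be perfectly harmless in a generic sense yet still violate cultural norms, which a safety-only filter would miss.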

More This Week