Middle East AI

Weekly Digest

Oct 6 – Oct 12, 2025

Top Stories

MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning

arXiv · CV · LLM

Researchers introduce MATRIX, a vision-centric agent tuning framework for robust tool-use reasoning in vision-language models (VLMs). The framework includes M-TRACE, a dataset of 28.5K multimodal tasks with 177K verified trajectories, and Pref-X, a set of 11K automatically generated preference pairs. In the reported experiments, MATRIX consistently outperforms both open- and closed-source VLMs across three benchmarks.