MBZUAI Provost Timothy Baldwin predicts that 2025 will be a breakout year for agentic AI, with 33% of enterprise software applications including agentic AI capabilities by 2028. MBZUAI doctoral students Wafa Alghallabi and Omkar Thawakar have launched Lawa.AI, an AI agent being tested on the university's website to provide faster answers and deeper understanding. Lawa.AI evolved from a research project in multimodal efficiency and LLMs and aims to bridge the gap between people and information in higher education and government. Why it matters: This highlights the UAE's focus on translating AI research into practical applications and the growing importance of agentic AI in various sectors.
MBZUAI researchers introduce PG-Video-LLaVA, a large multimodal model with pixel-level grounding capabilities for videos, integrating audio cues for enhanced understanding. The model uses an off-the-shelf tracker and grounding module to localize objects in videos based on user prompts. PG-Video-LLaVA is evaluated on video question-answering and grounding benchmarks, using Vicuna instead of GPT-3.5 for reproducibility.
LAraBench introduces a benchmark for Arabic NLP and speech processing, evaluating LLMs like GPT-3.5-turbo, GPT-4, BLOOMZ, Jais-13b-chat, Whisper, and USM. The benchmark covers 33 tasks across 61 datasets, using zero-shot and few-shot learning techniques. Results show that state-of-the-art (SOTA) specialized models generally outperform LLMs in zero-shot settings, though larger LLMs with few-shot learning reduce the gap. Why it matters: This benchmark helps assess and improve the performance of LLMs on Arabic language tasks, highlighting areas where specialized models still excel.
This paper introduces a predictive analysis of Arabic court decisions, utilizing 10,813 real commercial court cases. The study evaluates LLaMA-7b, JAIS-13b, and GPT-3.5-turbo models under zero-shot, one-shot, and fine-tuned training paradigms, also experimenting with summarization and translation. GPT-3.5-turbo significantly outperformed the others, exceeding JAIS model performance by 50%, and the results also show that most automated metrics are unreliable. Why it matters: This research bridges computational linguistics and Arabic legal analytics, offering insights for enhancing judicial processes and legal strategies in the Arabic-speaking world.
PwC has published a report offering strategic guidance to CEOs on navigating the landscape of artificial intelligence. The report likely outlines frameworks for determining where companies should proactively invest and innovate ('lead'), adopt standard industry practices ('lag'), or deprioritize ('exit') specific AI initiatives. It probably addresses critical aspects such as resource allocation, risk management, and competitive differentiation through AI adoption. Why it matters: This strategic counsel can assist businesses in the Middle East in formulating robust AI strategies, optimizing their investments, and enhancing their market competitiveness.
This paper presents a UI-level evaluation of ALLaM-34B, an Arabic-centric LLM developed by SDAIA and deployed in the HUMAIN Chat service. The evaluation used a prompt pack spanning various Arabic dialects, code-switching, reasoning, and safety, with outputs scored by frontier LLM judges. Results indicate strong performance in generation, code-switching, MSA handling, reasoning, and improved dialect fidelity, positioning ALLaM-34B as a robust Arabic LLM suitable for real-world use.
Saudi Arabia has officially designated 2026 as the "Year of Artificial Intelligence," according to a recent announcement. This initiative aims to highlight the Kingdom's commitment to AI development and its potential impact on various sectors. The Saudi government plans to organize events, conferences, and workshops throughout the year to promote AI awareness and adoption. Why it matters: This designation signals Saudi Arabia's intent to become a regional leader in AI and a hub for AI innovation, potentially attracting investment and talent to the country.