GCC AI Research


Atlas-Chat: Adapting Large Language Models for Low-Resource Moroccan Arabic Dialect

arXiv

Researchers developed Atlas-Chat, a collection of LLMs for dialectal Arabic, focusing on Moroccan Arabic (Darija). They constructed an instruction dataset by consolidating existing Darija language resources and translating English instructions. The Atlas-Chat models (2B, 9B, 27B) outperform both state-of-the-art general-purpose and Arabic-specialized LLMs, such as LLaMA, Jais, and AceGPT, on Darija NLP tasks. Why it matters: This work addresses the gap in LLM support for low-resource Arabic dialects, providing a methodology for instruction-tuning and benchmarks for future research.

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

arXiv

Researchers at MBZUAI have developed GeoChat, a new vision-language model (VLM) specifically designed for remote sensing imagery. GeoChat addresses the limitations of general-domain VLMs in accurately interpreting high-resolution remote sensing data, offering both image-level and region-specific dialogue capabilities. The model is trained on a novel remote sensing multimodal instruction-following dataset and demonstrates strong zero-shot performance across tasks like image captioning and visual question answering.

UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat

arXiv

This paper presents a UI-level evaluation of ALLaM-34B, an Arabic-centric LLM developed by SDAIA and deployed in the HUMAIN Chat service. The evaluation used a prompt pack spanning various Arabic dialects, code-switching, reasoning, and safety, with outputs scored by frontier LLM judges. Results indicate strong performance in generation, code-switching, MSA handling, reasoning, and improved dialect fidelity, positioning ALLaM-34B as a robust Arabic LLM suitable for real-world use.

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length with multimodal richness, using human-validated samples. LongShOTAgent, an agentic system for analyzing long videos, is also presented; results on both the benchmark and the agent expose the challenges that state-of-the-art MLLMs face with long-video content.

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

arXiv

Video-ChatGPT is a new multimodal model that combines a video-adapted visual encoder with a large language model (LLM) to enable detailed video understanding and conversation. The authors introduce a new dataset of 100,000 video-instruction pairs for training the model. They also develop a quantitative evaluation framework for video-based dialogue models.

MBZUAI Strengthens Research Ties with France’s École Polytechnique

MBZUAI

MBZUAI and École Polytechnique are deepening their research collaboration through a Collaborative Research Agreement focused on large language models, foundation models for reasoning, AI applications in biology and health, and AI safety. The partnership builds on a previous MoU and a Scholars Exchange Program Agreement between the two institutions. MBZUAI's France Lab has developed Atlas-Chat, a family of open-source LLMs for the Moroccan Arabic dialect Darija, with models including Atlas-Chat-2B and Atlas-Chat-9B. Why it matters: This collaboration strengthens the AI ecosystems of both France and the UAE, fostering joint research and supporting the next generation of AI researchers and innovators, with a specific focus on Arabic NLP.