GCC AI Research

Large Language Models for the Real World: Explorations of Sparse, Cross-lingual Understanding and Instruction-Tuned LLMs

MBZUAI · Notable

Summary

Veselin Stoyanov of Tome (previously Facebook AI) gave a talk at MBZUAI on challenges to LLM adoption. The talk covered sparse models, multilingual LLMs, and instruction finetuning. Stoyanov previously led development of RoBERTa, XLM-R, and MultiRay. Why it matters: This talk highlights MBZUAI's role as a forum for discussing key challenges and advancements in large language models, with implications for Arabic NLP and regional AI development.


Related

Towards Inclusive NLP: Assessing Compressed Multilingual Transformers across Diverse Language Benchmarks

arXiv ·

This paper benchmarks multilingual and monolingual LLM performance across Arabic, English, and Indic languages, examining the effects of model compression techniques such as pruning and quantization. Multilingual models outperform language-specific counterparts, demonstrating cross-lingual transfer. Quantization maintains accuracy while improving efficiency, but aggressive pruning compromises performance, particularly in larger models. Why it matters: The findings highlight strategies for scalable and fair multilingual NLP, addressing hallucination and generalization errors in low-resource languages.
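The contrast between quantization and aggressive pruning can be illustrated with a minimal sketch (toy random weights, not the paper's models or settings): post-training symmetric int8 quantization introduces only a small rounding error per weight, while zeroing out the smallest 90% of weights discards real signal.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0, 0.02, size=1000)  # toy flattened weight matrix

# Post-training symmetric int8 quantization: map floats onto [-127, 127]
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequantized = q.astype(np.float32) * scale
quant_err = np.abs(weights - dequantized).max()  # bounded by scale / 2

# "Aggressive" magnitude pruning: zero the smallest 90% of weights
threshold = np.quantile(np.abs(weights), 0.9)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)
prune_err = np.abs(weights - pruned).max()  # the largest weight dropped

print(f"max quantization error: {quant_err:.5f}")
print(f"max pruning error:      {prune_err:.5f}")
```

On this toy example the worst-case quantization error stays below one quantization step, while the pruning error is on the order of the weights themselves, mirroring the paper's qualitative finding.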

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

arXiv ·

A new survey paper provides a deep dive into post-training methodologies for Large Language Models (LLMs), analyzing their role in refining LLMs beyond pretraining. It addresses key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs, and highlights emerging directions in model alignment, scalable adaptation, and inference-time reasoning. The paper also provides a public repository to continually track developments in this fast-evolving field.

Towards Real-world Fact-Checking with Large Language Models

MBZUAI ·

Iryna Gurevych from TU Darmstadt presented research on using large language models for real-world fact-checking, focusing on dismantling misleading narratives from misinterpreted scientific publications and detecting misinformation via visual content. The research aims to explain why a false claim was believed, why it is false, and why the alternative is correct. Why it matters: Addressing misinformation, especially when supported by seemingly credible sources, is critical for public health, conflict resolution, and maintaining trust in institutions in the Middle East and globally.

Fine-tuned across languages: improving LLM instruction following beyond English

MBZUAI ·

MBZUAI researchers created Bactrian-X, a new dataset to improve LLM instruction following in low-resource languages. The dataset leverages instruction tuning, pairing instructions in various languages with expected responses. Bactrian-X builds upon existing open-source instruction tuning models. Why it matters: This work aims to democratize access to LLMs by enabling users to interact with them in their native languages, even when English proficiency is limited.
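The instruction–response pairing described above can be sketched as follows. The field names and prompt template here are illustrative assumptions, not Bactrian-X's exact schema: each record holds an instruction (in some target language or paired with one), an optional input, and the expected response.

```python
# A hypothetical instruction-tuning record; field names are assumptions.
record = {
    "instruction": "Translate the following sentence to Arabic.",
    "input": "Good morning",
    "output": "صباح الخير",
}

def format_prompt(rec: dict) -> str:
    """Render one record into a single training-prompt string."""
    prompt = f"### Instruction:\n{rec['instruction']}\n"
    if rec.get("input"):  # the input field is optional
        prompt += f"### Input:\n{rec['input']}\n"
    prompt += f"### Response:\n{rec['output']}"
    return prompt

print(format_prompt(record))
```

A model finetuned on many such rendered prompts learns to produce the text after "### Response:" given the preceding sections, which is what lets users issue instructions in their native language.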