GCC AI Research

Large Language Models for the Real World: Explorations of Sparse, Cross-lingual Understanding and Instruction-Tuned LLMs

MBZUAI · Notable

Summary

Veselin Stoyanov of Tome (previously Facebook AI) gave a talk at MBZUAI on challenges to LLM adoption. The talk covered sparse models, multilingual LLMs, and instruction finetuning. Stoyanov previously led development of RoBERTa, XLM-R, and MultiRay. Why it matters: This talk highlights MBZUAI's role as a forum for discussing key challenges and advancements in large language models, with implications for Arabic NLP and regional AI development.


Related

Towards Inclusive NLP: Assessing Compressed Multilingual Transformers across Diverse Language Benchmarks

arXiv ·

This paper benchmarks multilingual and monolingual LLM performance across Arabic, English, and Indic languages, examining the effects of model compression techniques such as pruning and quantization. Multilingual models outperform language-specific counterparts, demonstrating cross-lingual transfer. Quantization maintains accuracy while improving efficiency, but aggressive pruning compromises performance, particularly in larger models. Why it matters: The findings highlight strategies for scalable and fair multilingual NLP, addressing hallucination and generalization errors in low-resource languages.
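The contrast between quantization and aggressive pruning can be illustrated with a minimal sketch (toy random weights, not the paper's models or settings): post-training symmetric int8 quantization introduces only a small rounding error per weight, while zeroing out the smallest 90% of weights discards real signal.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0, 0.02, size=1000)  # toy flattened weight matrix

# Post-training symmetric int8 quantization: map floats onto [-127, 127]
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequantized = q.astype(np.float32) * scale
quant_err = np.abs(weights - dequantized).max()  # bounded by scale / 2

# "Aggressive" magnitude pruning: zero the smallest 90% of weights
threshold = np.quantile(np.abs(weights), 0.9)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)
prune_err = np.abs(weights - pruned).max()  # the largest weight dropped

print(f"max quantization error: {quant_err:.5f}")
print(f"max pruning error:      {prune_err:.5f}")
```

On this toy example the worst-case quantization error stays below one quantization step, while the pruning error is on the order of the weights themselves, mirroring the paper's qualitative finding.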

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

arXiv ·

A new survey paper provides a deep dive into post-training methodologies for Large Language Models (LLMs), analyzing their role in refining LLMs beyond pretraining. It addresses key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs, and highlights emerging directions in model alignment, scalable adaptation, and inference-time reasoning. The paper also provides a public repository to continually track developments in this fast-evolving field.

Towards Real-world Fact-Checking with Large Language Models

MBZUAI ·

Iryna Gurevych from TU Darmstadt presented research on using large language models for real-world fact-checking, focusing on dismantling misleading narratives from misinterpreted scientific publications and detecting misinformation via visual content. The research aims to explain why a false claim was believed, why it is false, and why the alternative is correct. Why it matters: Addressing misinformation, especially when supported by seemingly credible sources, is critical for public health, conflict resolution, and maintaining trust in institutions in the Middle East and globally.

Fine-tuned across languages: improving LLM instruction following beyond English

MBZUAI ·

MBZUAI researchers created Bactrian-X, a new dataset to improve LLM instruction following in low-resource languages. The dataset leverages instruction tuning, pairing instructions in various languages with expected responses. Bactrian-X builds upon existing open-source instruction tuning models. Why it matters: This work aims to democratize access to LLMs by enabling users to interact with them in their native languages, even when English proficiency is limited.
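The instruction–response pairing described above can be sketched as follows. The field names and prompt template here are illustrative assumptions, not Bactrian-X's exact schema: each record holds an instruction (in some target language or paired with one), an optional input, and the expected response.

```python
# A hypothetical instruction-tuning record; field names are assumptions.
record = {
    "instruction": "Translate the following sentence to Arabic.",
    "input": "Good morning",
    "output": "صباح الخير",
}

def format_prompt(rec: dict) -> str:
    """Render one record into a single training-prompt string."""
    prompt = f"### Instruction:\n{rec['instruction']}\n"
    if rec.get("input"):  # the input field is optional
        prompt += f"### Input:\n{rec['input']}\n"
    prompt += f"### Response:\n{rec['output']}"
    return prompt

print(format_prompt(record))
```

A model finetuned on many such rendered prompts learns to produce the text after "### Response:" given the preceding sections, which is what lets users issue instructions in their native language.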