Performance Prediction via Bayesian Matrix Factorisation for Multilingual Natural Language Processing Tasks

MBZUAI · Notable

Summary

The paper explores a Bayesian matrix factorization approach to performance prediction in multilingual NLP, aiming to reduce the experimental burden of evaluating many language combinations. The approach outperforms state-of-the-art methods on NLP benchmarks such as machine translation and cross-lingual entity linking. It also avoids hyperparameter tuning and provides uncertainty estimates for its predictions. Why it matters: Accurate performance prediction accelerates multilingual NLP research by reducing computational costs and improving experimental efficiency, which is especially valuable for Arabic NLP tasks.
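To make the idea concrete, here is a minimal sketch of performance prediction by Bayesian matrix factorization. It is not the paper's implementation: the summary does not describe the model, so this uses a generic Gibbs sampler with Gaussian priors. The precisions `alpha` and `lam` are fixed here for brevity, whereas a method that avoids hyperparameter tuning would presumably infer them. The function name `gibbs_bmf`, its parameters, and the synthetic score matrix are all assumptions for illustration.

```python
import numpy as np

def _sample_gaussian(prec, b, rng):
    # Draw from N(prec^{-1} b, prec^{-1}) using a Cholesky factor of the precision.
    L = np.linalg.cholesky(prec)
    mean = np.linalg.solve(prec, b)
    z = rng.standard_normal(len(b))
    return mean + np.linalg.solve(L.T, z)

def gibbs_bmf(R, mask, rank=2, n_samples=300, burn_in=100,
              alpha=25.0, lam=1.0, seed=0):
    """Hypothetical helper: R holds scores (e.g. one row per source language,
    one column per target language); mask is True where a score was measured.
    alpha (noise precision) and lam (prior precision) are fixed assumptions."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    mu = R[mask].mean()                      # centre on the observed mean
    Rc = np.where(mask, R - mu, 0.0)
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    draws = []
    for it in range(n_samples):
        for i in range(n):                   # resample each row factor
            Vj = V[mask[i]]
            prec = lam * np.eye(rank) + alpha * Vj.T @ Vj
            U[i] = _sample_gaussian(prec, alpha * Vj.T @ Rc[i, mask[i]], rng)
        for j in range(m):                   # resample each column factor
            Ui = U[mask[:, j]]
            prec = lam * np.eye(rank) + alpha * Ui.T @ Ui
            V[j] = _sample_gaussian(prec, alpha * Ui.T @ Rc[mask[:, j], j], rng)
        if it >= burn_in:
            draws.append(U @ V.T + mu)
    draws = np.stack(draws)
    # Posterior mean is the prediction; posterior std is the uncertainty estimate.
    return draws.mean(axis=0), draws.std(axis=0)

# Synthetic low-rank "score" matrix with roughly 30% of entries unmeasured.
demo_rng = np.random.default_rng(1)
true_scores = demo_rng.normal(size=(6, 2)) @ demo_rng.normal(size=(2, 5))
observed = demo_rng.random(true_scores.shape) > 0.3
pred_mean, pred_std = gibbs_bmf(np.where(observed, true_scores, np.nan), observed)
```

The entries of `pred_mean` where `observed` is `False` are predictions for language pairs that were never evaluated, and the matching entries of `pred_std` indicate how much to trust each one.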

Keywords

Bayesian matrix factorization · performance prediction · multilingual NLP · machine translation · cross-lingual entity linking

Language Models' Factuality Depends on the Language of Inquiry

arXiv · Feb 25

Researchers introduce a benchmark to evaluate the factual recall and knowledge transferability of multilingual language models across 13 languages. The study reveals that language models often fail to transfer knowledge between languages, even when they possess the correct information in one language. The benchmark and evaluation framework are released to drive future research in multilingual knowledge transfer.

Towards Inclusive NLP: Assessing Compressed Multilingual Transformers across Diverse Language Benchmarks

arXiv · Jul 25

This paper benchmarks multilingual and monolingual LLM performance across Arabic, English, and Indic languages, examining the effects of model compression techniques such as pruning and quantization. Multilingual models outperform language-specific counterparts, demonstrating cross-lingual transfer. Quantization maintains accuracy while improving efficiency, but aggressive pruning degrades performance, particularly in larger models. Why it matters: The findings highlight strategies for scalable and fair multilingual NLP, addressing hallucination and generalization errors in low-resource languages.

Predicting and Explaining Cross-lingual Zero-shot and Few-shot Transfer in LLMs

MBZUAI

Project LITMUS explores predicting cross-lingual transfer accuracy in multilingual language models, even without test data in target languages. The goal is to estimate model performance in low-resource languages and optimize training data for desired cross-lingual performance. This research aims to identify factors influencing cross-lingual transfer, contributing to linguistically fair massively multilingual language models (MMLMs). Why it matters: Improving cross-lingual transfer is vital for creating more equitable and effective multilingual AI systems, especially for languages with limited resources.