Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation
MBZUAI releases Bactrian-X, a multilingual parallel dataset of 3.4 million instruction-response pairs across 52 languages. They trained low-rank adaptation (LoRA) adapters using this dataset, creating lightweight, replaceable components for large language models. Experiments show the LoRA-based models outperform vanilla and existing instruction-tuned models in multilingual settings.