Skip to content
GCC AI Research

Search

Results for "fine-tuning"

Parameter-Efficient Fine-Tuning for NLP Models

MBZUAI ·

The article discusses parameter-efficient fine-tuning methods for large NLP models, highlighting their importance due to the increasing size and computational demands of state-of-the-art language models. It provides an overview of these methods, presenting them in a unified view to emphasize their similarities and differences. Indraneil, a PhD candidate at TU Darmstadt's UKP Lab, is researching parameter-efficient fine-tuning, sparsity, and conditional computation methods to improve LLM performance in multilingual, multi-task settings. Why it matters: Efficient fine-tuning techniques are crucial for democratizing access to and accelerating the deployment of large language models in the region and beyond.

Fine-tuning Text-to-Image Models: Reinforcement Learning and Reward Over-Optimization

MBZUAI ·

The article discusses research on fine-tuning text-to-image diffusion models, including reward function training, online reinforcement learning (RL) fine-tuning, and addressing reward over-optimization. A Text-Image Alignment Assessment (TIA2) benchmark is introduced to study reward over-optimization. TextNorm, a method for confidence calibration in reward models, is presented to reduce over-optimization risks. Why it matters: Improving the alignment and fidelity of text-to-image models is crucial for generating high-quality content, and addressing over-optimization enhances the reliability of these models in creative applications.

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

arXiv ·

Researchers from the National Center for AI in Saudi Arabia investigated the sensitivity of Large Language Model (LLM) leaderboards to minor benchmark perturbations. They found that small changes, like choice order, can shift rankings by up to 8 positions. The study recommends hybrid scoring and warns against over-reliance on simple benchmark evaluations, providing code for further research.

Nexus at ArAIEval Shared Task: Fine-Tuning Arabic Language Models for Propaganda and Disinformation Detection

arXiv ·

This paper describes the Nexus team's participation in the ArAIEval shared task focused on detecting propaganda and disinformation in Arabic. The team fine-tuned transformer models and experimented with zero- and few-shot learning using GPT-4. Nexus's system achieved 9th place in subtask 1A and 10th place in subtask 2A. Why it matters: The work contributes to the important goal of automatically identifying and mitigating the spread of disinformation in Arabic content, which is critical for maintaining societal trust and informed public discourse.

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

arXiv ·

A new survey paper provides a deep dive into post-training methodologies for Large Language Models (LLMs), analyzing their role in refining LLMs beyond pretraining. It addresses key challenges such as catastrophic forgetting, reward hacking, and inference-time trade-offs, and highlights emerging directions in model alignment, scalable adaptation, and inference-time reasoning. The paper also provides a public repository to continually track developments in this fast-evolving field.

QU-NLP at QIAS 2025 Shared Task: A Two-Phase LLM Fine-Tuning and Retrieval-Augmented Generation Approach for Islamic Inheritance Reasoning

arXiv ·

The QU-NLP team presented their approach to the QIAS 2025 shared task on Islamic Inheritance Reasoning, fine-tuning the Fanar-1-9B model using LoRA and integrating it into a RAG pipeline. Their system achieved an accuracy of 0.858 on the final test, outperforming models like GPT 4.5, LLaMA, and Mistral in zero-shot settings. The system particularly excelled in advanced reasoning, achieving 97.6% accuracy. Why it matters: This demonstrates the effectiveness of domain-specific fine-tuning and retrieval augmentation for Arabic LLMs in complex reasoning tasks, even surpassing frontier models.

Saudi-Dialect-ALLaM: LoRA Fine-Tuning for Dialectal Arabic Generation

arXiv ·

This paper introduces Saudi-Dialect-ALLaM, a LoRA fine-tuned version of the Saudi Arabian foundation model ALLaM-7B-Instruct-preview, designed to improve the generation of Saudi dialects (Najdi and Hijazi). The model is trained on a private dataset of 5,466 synthetic instruction-response pairs, with two variants explored: Dialect-Token and No-Token training. Results indicate that the Dialect-Token model achieves superior dialect control and fidelity compared to generic instruction models, although the dataset and model weights are not released.