Results for "supervised fine-tuning"

Parameter-Efficient Fine-Tuning for NLP Models

MBZUAI

The article discusses parameter-efficient fine-tuning methods for large NLP models, highlighting their importance due to the increasing size and computational demands of state-of-the-art language models. It provides an overview of these methods, presenting them in a unified view to emphasize their similarities and differences. Indraneil, a PhD candidate at TU Darmstadt's UKP Lab, is researching parameter-efficient fine-tuning, sparsity, and conditional computation methods to improve LLM performance in multilingual, multi-task settings. Why it matters: Efficient fine-tuning techniques are crucial for democratizing access to and accelerating the deployment of large language models in the region and beyond.

Fine-tuning Text-to-Image Models: Reinforcement Learning and Reward Over-Optimization

MBZUAI

The article discusses research on fine-tuning text-to-image diffusion models, including reward function training, online reinforcement learning (RL) fine-tuning, and addressing reward over-optimization. A Text-Image Alignment Assessment (TIA2) benchmark is introduced to study reward over-optimization. TextNorm, a method for confidence calibration in reward models, is presented to reduce over-optimization risks. Why it matters: Improving the alignment and fidelity of text-to-image models is crucial for generating high-quality content, and addressing over-optimization enhances the reliability of these models in creative applications.

Severity-Aware Weighted Loss for Arabic Medical Text Generation

arXiv · Apr 7

Researchers proposed a severity-aware weighted loss method to fine-tune Arabic language models for medical text generation, prioritizing severe clinical cases. This approach utilizes soft severity probabilities, derived from an AraBERT-based classifier, to dynamically scale token-level loss contributions during optimization on the MAQA dataset. The method consistently improved performance across ten Arabic LLMs, with AraGPT2-Base increasing from 54.04% to 66.14% and AraGPT2-Medium from 59.16% to 67.18%. Why it matters: This novel fine-tuning strategy addresses a critical limitation in medical AI by enhancing the safety and reliability of Arabic medical large language models, particularly in high-stakes clinical scenarios.
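
To make the idea of scaling token-level loss by a soft severity probability more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the weighting rule `1 + alpha * p`, the `alpha` parameter, and the function names `token_nll` and `severity_weighted_loss` are all assumptions chosen for illustration, and the severity probabilities are simply passed in rather than produced by an AraBERT-based classifier.

```python
import math


def token_nll(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def severity_weighted_loss(batch_logits, batch_targets, severity_probs, alpha=1.0):
    """Mean token loss where each example's tokens are scaled by its severity.

    batch_logits:   list of sequences, each a list of per-token logit lists
    batch_targets:  list of sequences, each a list of target token ids
    severity_probs: one soft probability in [0, 1] per example
    alpha:          hypothetical strength of the severity weighting
    """
    total, n_tokens = 0.0, 0
    for seq_logits, seq_targets, p in zip(batch_logits, batch_targets, severity_probs):
        weight = 1.0 + alpha * p  # assumed weighting rule, not from the source
        for logits, target in zip(seq_logits, seq_targets):
            total += weight * token_nll(logits, target)
            n_tokens += 1
    return total / n_tokens


# Two one-token examples over a 4-token vocabulary with uniform logits.
uniform = [0.0, 0.0, 0.0, 0.0]
loss = severity_weighted_loss([[uniform], [uniform]], [[0], [1]], [0.0, 1.0])
print(loss)
```

With `alpha=0.0` the function reduces to the ordinary mean token-level cross-entropy, so in this sketch the severity signal only changes how much each example contributes to the gradient, not the form of the loss itself.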