Results for "knowledge distillation"

Knowledge distillation and the greening of LLMs

MBZUAI

Researchers from MBZUAI, the University of British Columbia, and Monash University have created LaMini-LM, a collection of small language models distilled from ChatGPT. LaMini-LM is trained on a dataset of 2.58M instructions and can be deployed on consumer laptops and mobile devices. The smaller models perform almost as well as their larger counterparts while addressing security concerns. Why it matters: This work enables the deployment of LLMs in resource-constrained environments and enhances data security by reducing reliance on cloud-based LLMs.

Distillation Policy Optimization

arXiv · Feb 1

The paper introduces a novel actor-critic framework called Distillation Policy Optimization that combines on-policy and off-policy data for reinforcement learning. It incorporates variance reduction mechanisms like a unified advantage estimator (UAE) and a residual baseline. The empirical results demonstrate improved sample efficiency for on-policy algorithms, bridging the gap with off-policy methods.

Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation

arXiv · Dec 15

Researchers at MBZUAI have demonstrated a method called "Data Laundering" to artificially boost language model benchmark scores using knowledge distillation. The technique covertly transfers benchmark-specific knowledge, leading to inflated accuracy without genuine improvements in reasoning. The study highlights a vulnerability in current AI evaluation practices and calls for more robust benchmarks.

Domain Adaptable Fine-Tune Distillation Framework For Advancing Farm Surveillance

arXiv · Feb 10

The paper introduces a framework for camel farm monitoring using a combination of automated annotation and fine-tune distillation. The Unified Auto-Annotation framework uses GroundingDINO and SAM to automatically annotate surveillance video data. The Fine-Tune Distillation framework then fine-tunes student models like YOLOv8, transferring knowledge from a larger teacher model, using data from Al-Marmoom Camel Farm in Dubai.

Aligning Dense Retrievers with LLM Utility via Distillation

arXiv · Apr 24

Researchers proposed Utility-Aligned Embeddings (UAE), a new framework designed to enhance Retrieval-Augmented Generation (RAG) by merging the precision of LLM re-ranking with the efficiency of dense vector retrieval. UAE trains a bi-encoder to imitate an LLM utility distribution using a Utility-Modulated InfoNCE objective, injecting graded utility signals directly into the embedding space. On the QASPER benchmark, UAE improved retrieval Recall@1 by 30.59% and was over 180 times faster than efficient LLM re-ranking methods while preserving competitive performance. Why it matters: This approach offers a practical way to significantly improve the accuracy and speed of RAG systems by providing more reliable contexts at scale without heavy computational cost.
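
To make the idea of a bi-encoder imitating an LLM utility distribution concrete, here is a minimal sketch of a soft-label contrastive loss. It is not the paper's Utility-Modulated InfoNCE objective, whose exact form the summary does not give. It is a generic distillation loss, assuming the teacher distribution is a softmax over LLM utility scores and the student distribution is a temperature-scaled softmax over query-passage dot products. The function names, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_z for s in scaled]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def utility_distillation_loss(query_vec, passage_vecs, llm_utilities,
                              temperature=0.1):
    """Cross-entropy between a teacher and a student distribution.

    Teacher (assumed): softmax over LLM-assigned utility scores.
    Student (assumed): softmax over query-passage dot products,
    scaled by `temperature`.
    """
    teacher = [math.exp(lp) for lp in log_softmax(llm_utilities)]
    sims = [dot(query_vec, p) for p in passage_vecs]
    student_log_probs = log_softmax(sims, temperature)
    return -sum(t * lp for t, lp in zip(teacher, student_log_probs))


# Toy data: the LLM rates the first passage as more useful.
query = [1.0, 0.0]
utilities = [2.0, 0.0]
aligned = [[1.0, 0.0], [0.0, 1.0]]     # useful passage is closest to the query
misaligned = [[0.0, 1.0], [1.0, 0.0]]  # useful passage is farthest from the query

loss_aligned = utility_distillation_loss(query, aligned, utilities)
loss_misaligned = utility_distillation_loss(query, misaligned, utilities)
```

In this toy setup the loss is lower when the embedding space ranks passages the way the LLM does, which is the signal a bi-encoder would be trained on. Graded utility scores carry more information than the single positive label used in standard InfoNCE.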