Aligning Dense Retrievers with LLM Utility via Distillation

arXiv · April 24, 2026

Summary

Researchers propose Utility-Aligned Embeddings (UAE), a framework for Retrieval-Augmented Generation (RAG) that combines the precision of LLM re-ranking with the efficiency of dense vector retrieval. UAE trains a bi-encoder to imitate an LLM-derived utility distribution with a Utility-Modulated InfoNCE objective, which injects graded utility signals directly into the embedding space. On the QASPER benchmark, UAE improves retrieval Recall@1 by 30.59% and runs over 180 times faster than efficient LLM re-ranking methods while remaining competitive in performance.

Why it matters: The approach offers a practical way to make RAG systems both more accurate and faster, supplying more reliable contexts at scale without the computational cost of LLM re-ranking.
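The summary does not state the exact form of the Utility-Modulated InfoNCE objective. As a rough illustration of the general idea of distilling graded LLM utility scores into a bi-encoder, the sketch below replaces the one-hot target of standard InfoNCE with a softmax over utility scores and takes the cross-entropy against the retriever's similarity distribution. The function names, temperatures, and toy vectors are assumptions made for illustration and are not taken from the paper.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def utility_soft_infonce(query_vec, passage_vecs, llm_utilities,
                         tau_student=0.05, tau_teacher=1.0):
    """Hypothetical soft-target contrastive loss (not the paper's formula).

    Student: softmax over query-passage dot products (the bi-encoder).
    Teacher: softmax over LLM utility scores for the same passages.
    Returns the cross-entropy H(teacher, student).
    """
    sims = [dot(query_vec, p) for p in passage_vecs]
    student_log_probs = log_softmax(sims, tau_student)
    teacher_probs = [math.exp(lp)
                     for lp in log_softmax(llm_utilities, tau_teacher)]
    return -sum(t * lp for t, lp in zip(teacher_probs, student_log_probs))


# Toy example: three candidate passages for one query.
query = [1.0, 0.0]
passages = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]

# Utilities that agree with the embedding geometry give a small loss;
# utilities that contradict it give a large loss.
loss_aligned = utility_soft_infonce(query, passages, [3.0, 1.0, 0.0])
loss_reversed = utility_soft_infonce(query, passages, [0.0, 1.0, 3.0])
print(loss_aligned < loss_reversed)
```

As the teacher distribution concentrates on a single passage, this loss approaches standard InfoNCE with that passage as the positive. With graded utilities, partially useful passages also pull the query embedding toward them, in proportion to their score.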

Keywords

Dense Retrieval · RAG · LLM · Embeddings · Information Retrieval

