Aligning Dense Retrievers with LLM Utility via Distillation
arXiv ·
Summary
Researchers propose Utility-Aligned Embeddings (UAE), a framework for Retrieval-Augmented Generation (RAG) that combines the precision of LLM re-ranking with the efficiency of dense vector retrieval. UAE trains a bi-encoder to imitate an LLM utility distribution using a Utility-Modulated InfoNCE objective, which injects graded utility signals directly into the embedding space. On the QASPER benchmark, UAE improved retrieval Recall@1 by 30.59% and ran over 180 times faster than efficient LLM re-ranking methods while remaining competitive with them in performance.
Why it matters: The approach offers a practical way to make RAG systems both more accurate and faster, supplying more reliable contexts at scale without heavy computational cost.
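The summary does not give the exact form of the Utility-Modulated InfoNCE objective, so the following is only a minimal sketch of one plausible reading: a soft-label contrastive loss in which the bi-encoder's similarity distribution over candidate passages is pulled toward a teacher distribution derived from LLM utility scores. The function names, the two temperature parameters, and the use of a softmax over raw utility scores are all assumptions made for illustration, not details taken from the paper.

```python
import math


def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over a list of floats."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_z for s in scaled]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def utility_weighted_contrastive_loss(query_vec, passage_vecs, utilities,
                                      tau_student=0.05, tau_teacher=1.0):
    """Cross-entropy between a teacher and a student distribution.

    Hypothetical formulation (not confirmed by the source):
      teacher = softmax(LLM utility scores / tau_teacher)
      student = softmax(query-passage similarity / tau_student)
    With a one-hot teacher this reduces to the standard InfoNCE loss.
    """
    sims = [dot(query_vec, p) for p in passage_vecs]
    student_log_probs = log_softmax(sims, tau_student)
    teacher_probs = [math.exp(lp) for lp in log_softmax(utilities, tau_teacher)]
    return -sum(t * lp for t, lp in zip(teacher_probs, student_log_probs))


# Toy example: two passages, the first judged more useful by the LLM.
query = [1.0, 0.0]
utilities = [2.0, 0.0]
aligned = utility_weighted_contrastive_loss(query, [[1.0, 0.0], [0.0, 1.0]], utilities)
misaligned = utility_weighted_contrastive_loss(query, [[0.0, 1.0], [1.0, 0.0]], utilities)
print(aligned < misaligned)
```

In this toy case the loss is lower when the passage with the higher utility score is also the one closest to the query in embedding space, which is the behavior a graded utility signal is meant to encourage. At query time only the bi-encoder is needed, which is consistent with the speed advantage the summary reports.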
Keywords
Dense Retrieval · RAG · LLM · Embeddings · Information Retrieval