Aligning Dense Retrievers with LLM Utility via Distillation

arXiv · April 24, 2026

Summary

Researchers propose Utility-Aligned Embeddings (UAE), a framework that improves dense vector retrieval for Retrieval-Augmented Generation (RAG) by aligning the retriever with LLM utility. UAE trains a bi-encoder to imitate an LLM's utility distribution, derived from perplexity reduction, using a Utility-Modulated InfoNCE objective. On the QASPER benchmark, UAE improved Recall@1 by 30.59% and ran over 180 times faster than efficient LLM re-ranking methods while remaining competitive in performance.

Why it matters: The approach aligns retrieval with generative utility without requiring LLM inference at test time, which could make RAG applications more scalable and more precise.
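To make the distillation idea concrete, here is a minimal sketch of training against a utility distribution. It is not the paper's Utility-Modulated InfoNCE objective, whose exact form the summary does not give. It assumes that utility is the drop in the LLM's negative log-likelihood (log-perplexity) of the answer when a passage is added to the prompt, that a softmax turns these drops into a target distribution, and that the bi-encoder is trained with a soft-label contrastive loss (cross-entropy against that target). All function names, temperatures, and numbers are hypothetical.

```python
import math

def log_softmax(scores, temperature=1.0):
    """Numerically stable log-softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - lse for s in scaled]

def utility_targets(nll_without, nll_with, temperature=1.0):
    """Assumed utility: how much each passage lowers the answer's NLL
    (log-perplexity) under the LLM. Softmax turns the gains into a
    target distribution over the candidate passages."""
    gains = [nll_without - n for n in nll_with]
    return [math.exp(lp) for lp in log_softmax(gains, temperature)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def soft_contrastive_loss(query_vec, passage_vecs, targets, temperature=0.05):
    """Cross-entropy between the utility targets and the retriever's
    softmax over query-passage similarities. With a one-hot target this
    reduces to the standard InfoNCE loss."""
    sims = [dot(query_vec, p) for p in passage_vecs]
    log_probs = log_softmax(sims, temperature)
    return -sum(t * lp for t, lp in zip(targets, log_probs))

# Hypothetical numbers: answer NLL without context, then with each passage.
targets = utility_targets(2.0, [0.5, 1.9, 2.4])
passages = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_aligned = soft_contrastive_loss([1.0, 0.0], passages, targets)
loss_misaligned = soft_contrastive_loss([-1.0, 0.0], passages, targets)
print(loss_aligned < loss_misaligned)
```

In this sketch the LLM is needed only to compute the training targets. At query time the retriever scores passages with dot products alone, which is consistent with the summary's point that no test-time LLM inference is required.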

Keywords

Dense Retrieval · RAG · LLM · Distillation · Utility-Aligned Embeddings
