MBZUAI researchers have introduced MIRA, a novel framework for improving the factual accuracy of multimodal large language models in medical applications. MIRA uses calibrated retrieval to manage factual risk and integrates image embeddings with a medical knowledge base for efficient reasoning. Evaluated on medical VQA and report generation benchmarks, MIRA achieves state-of-the-art results, with code available on GitHub.
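MIRA's exact calibration mechanism isn't detailed in this summary; the following is a minimal sketch of one way a calibrated retrieval gate over a medical knowledge base could look, assuming cosine-similarity scores between a fused image-text query embedding and pre-embedded passages, with a threshold `tau` tuned on held-out data (all names here are illustrative, not MIRA's API).

```python
import numpy as np

def calibrated_retrieve(query_emb, kb_embs, kb_texts, tau=0.35, k=3):
    """Return up to k knowledge-base passages whose similarity to the fused
    image-text query embedding clears a calibrated threshold tau; if none do,
    return an empty list so the generator answers without retrieved context."""
    q = query_emb / np.linalg.norm(query_emb)
    kb = kb_embs / np.linalg.norm(kb_embs, axis=1, keepdims=True)
    sims = kb @ q                          # cosine similarity to each passage
    top = np.argsort(-sims)[:k]            # best k candidates
    # tau trades retrieval coverage against factual risk; it would be chosen
    # on held-out data rather than fixed by hand.
    return [(kb_texts[i], float(sims[i])) for i in top if sims[i] >= tau]
```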
Keywords
MLLM · RAG · Medical AI · Multimodal · Retrieval-Augmented Generation
This paper introduces MOTOR, a multimodal retrieval and re-ranking approach for medical visual question answering (MedVQA). MOTOR uses grounded captions and optimal transport to capture the relationship between a query and each retrieved context, leveraging both textual and visual information, and selects clinically relevant contexts to augment the vision-language model's input. Across MedVQA datasets, MOTOR outperforms state-of-the-art methods by an average of 6.45%.
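MOTOR's actual optimal-transport formulation is not reproduced here; below is a small numpy sketch of the general idea, re-ranking retrieved contexts by an entropic (Sinkhorn) transport cost between query and context token embeddings. The cosine cost matrix, uniform marginals, and regularization value are assumptions for illustration only.

```python
import numpy as np

def sinkhorn_cost(X, Y, reg=0.1, n_iter=100):
    """Entropic optimal-transport cost between two sets of L2-normalized
    token embeddings (rows of X and Y) under uniform marginals.
    A lower cost means the query and context are better aligned."""
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    C = 1.0 - X @ Y.T                      # cosine-distance cost matrix
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):                # Sinkhorn-Knopp iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = np.diag(u) @ K @ np.diag(v)        # transport plan
    return float((P * C).sum())

def rerank_contexts(query_tokens, candidates):
    """Sort retrieved contexts (text, token-embedding matrix) by OT cost."""
    scored = sorted((sinkhorn_cost(query_tokens, toks), text)
                    for text, toks in candidates)
    return [text for _, text in scored]
```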
The paper introduces MedPromptX, a clinical decision support system for chest X-ray diagnosis that combines multimodal large language models (MLLMs), few-shot prompting (FP), and visual grounding (VG), integrating imagery with electronic health record (EHR) data. MedPromptX dynamically refines its few-shot examples to adapt in real time to new patient scenarios and uses visual grounding to narrow attention to the relevant regions of the X-ray. The study also introduces MedPromptX-VQA, a new visual question answering dataset, and demonstrates state-of-the-art performance with an 11% improvement in F1-score over baselines.
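As a rough illustration of dynamic few-shot refinement (not MedPromptX's actual pipeline), the sketch below picks the exemplars most similar to the current case embedding and assembles them into a prompt alongside the patient's EHR summary; the field names and selection rule are hypothetical.

```python
import numpy as np

def build_fewshot_prompt(case_emb, exemplars, ehr_summary, question, k=3):
    """Pick the k exemplars most similar to the current case embedding and
    assemble them into a few-shot prompt together with the patient's EHR.
    The grounded X-ray region would be passed to the MLLM as an image input;
    it is only referenced here."""
    sims = np.array([ex["emb"] @ case_emb for ex in exemplars])
    chosen = [exemplars[i] for i in np.argsort(-sims)[:k]]
    shots = "\n\n".join(
        f"EHR: {ex['ehr']}\nFindings: {ex['findings']}\nAnswer: {ex['answer']}"
        for ex in chosen
    )
    return f"{shots}\n\nEHR: {ehr_summary}\nQuestion: {question}\nAnswer:"
```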
This paper introduces Cross-Document Topic-Aligned (CDTA) chunking to address knowledge fragmentation in Retrieval-Augmented Generation (RAG) systems. CDTA identifies topics across documents, maps segments to topics, and synthesizes them into unified chunks. Experiments on HotpotQA and UAE legal texts show that CDTA improves faithfulness and citation accuracy compared to existing chunking methods, especially for complex queries requiring multi-hop reasoning.
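The paper's topic-identification and synthesis steps are not specified in this summary; the sketch below substitutes a simple embedding-plus-k-means grouping to show the general shape of cross-document, topic-aligned chunking. The embedding model, cluster count, and merging-by-concatenation are assumptions, not CDTA's method.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def topic_aligned_chunks(segments, n_topics=8):
    """Group segments drawn from many documents by topic, then merge each
    group into one cross-document chunk for retrieval."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embs = model.encode(segments, normalize_embeddings=True)
    labels = KMeans(n_clusters=n_topics, n_init=10).fit_predict(embs)
    chunks = {}
    for seg, label in zip(segments, labels):
        chunks.setdefault(label, []).append(seg)
    # Each chunk now gathers topically related material from across documents.
    return ["\n".join(parts) for parts in chunks.values()]
```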
MBZUAI researchers introduce UniMed-CLIP, a unified Vision-Language Model (VLM) for diverse medical imaging modalities, trained on the new large-scale, open-source UniMed dataset. UniMed comprises over 5.3 million image-text pairs spanning six modalities (X-ray, CT, MRI, Ultrasound, Pathology, and Fundus), built by using LLMs to transform existing classification datasets into image-text format. UniMed-CLIP significantly outperforms existing generalist VLMs and matches modality-specific medical VLMs in zero-shot evaluations, improving over BiomedCLIP by +12.61 on average across 21 datasets while using 3x less training data.
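UniMed's captions were produced with LLMs; as a stand-in, the sketch below uses fixed templates to show how a classification label and its modality could be turned into an image-text training pair. The template wording is illustrative, not the dataset's actual prompts.

```python
def label_to_caption(label: str, modality: str) -> str:
    """Turn a classification label into a descriptive caption so that
    (image, caption) pairs can be used for contrastive training."""
    templates = {
        "xray": "A chest X-ray showing findings consistent with {}.",
        "fundus": "A retinal fundus photograph with signs of {}.",
        "pathology": "A histopathology slide exhibiting {}.",
    }
    return templates.get(modality, "A medical image of {}.").format(label)

# e.g. ("img_001.png", "cardiomegaly", "xray") -> paired caption
print(label_to_caption("cardiomegaly", "xray"))
```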