Results for "MedPromptX"

MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis

arXiv · Mar 22

The paper introduces MedPromptX, a clinical decision support system using multimodal large language models (MLLMs), few-shot prompting (FP), and visual grounding (VG) for chest X-ray diagnosis, integrating imagery with EHR data. MedPromptX refines few-shot data dynamically for real-time adjustment to new patient scenarios and narrows the search area in X-ray images. The study introduces MedPromptX-VQA, a new visual question answering dataset, and demonstrates state-of-the-art performance with an 11% improvement in F1-score compared to baselines.

A multimodal approach for developing medical diagnoses with AI

MBZUAI

MBZUAI doctoral student Mai A. Shaaban and colleagues developed MedPromptX, a system that analyzes chest X-rays and patient data to aid lung disease diagnoses. MedPromptX uses multimodal large language models with visual grounding and few-shot prompting, trained on a new dataset of 6,000 patient records (MedPromptX-VQA) derived from MIMIC-IV and MIMIC-CXR. The system addresses the challenge of incomplete electronic health records by leveraging the knowledge embedded in large language models to interpret lab results. Why it matters: This research advances AI-driven medical diagnostics by integrating diverse data sources and addressing data gaps, potentially leading to quicker and more accurate diagnoses.

BiMediX: Bilingual Medical Mixture of Experts LLM

arXiv · Feb 20

MBZUAI researchers introduce BiMediX, a bilingual (English and Arabic) mixture-of-experts LLM for medical applications. The model is trained on BiMed1.3M, a new bilingual instruction dataset of 1.3 million entries, and outperforms existing models such as Med42 and Jais-30B on medical benchmarks. Code and models are available on GitHub.

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities

arXiv · Dec 10

MBZUAI releases BiMediX2, a bilingual (Arabic-English) Bio-Medical Large Multimodal Model, along with the BiMed-V dataset (1.6M samples) and BiMed-MBench evaluation benchmark. BiMediX2 supports multi-turn conversation in Arabic and English and handles diverse medical imaging modalities. The model achieves state-of-the-art results on medical LLM and LMM benchmarks, outperforming existing methods and GPT-4 in specific evaluations.

MOTOR: Multimodal Optimal Transport via Grounded Retrieval in Medical Visual Question Answering

arXiv · Jun 28

This paper introduces MOTOR, a multimodal retrieval and re-ranking approach for medical visual question answering (MedVQA) that uses grounded captions and optimal transport to capture relationships between queries and retrieved context, leveraging both textual and visual information. MOTOR identifies clinically relevant contexts to augment VLM input, achieving higher accuracy on MedVQA datasets. Empirical analysis shows MOTOR outperforms state-of-the-art methods by an average of 6.45%.