Results for "MedAgentSim"

The diagnosis game: A simulated hospital environment to measure AI agents’ diagnostic abilities

MBZUAI

MBZUAI researchers developed MedAgentSim, a simulated hospital environment to evaluate AI diagnostic abilities. The simulation uses LLM-powered agents to mimic doctor-patient conversations, providing a dynamic assessment of diagnostic skills. The system includes doctor, patient, and evaluator agents that interact within the simulated hospital, making real-time decisions. Why it matters: This research offers a more realistic evaluation of AI in clinical settings, addressing limitations of current benchmarks and potentially improving AI's use in healthcare.

From Individual to Society: Social Simulation Driven by LLM-based Agent

MBZUAI

Fudan University's Zhongyu Wei presented research on social simulation driven by LLMs, covering individual and large-scale social movement simulation. Wei directs the Data Intelligence and Social Computing Lab (Fudan DISC) and has published extensively on multimodal large models and social computing. His work includes the Volcano multimodal model, DISC-MedLLM, and ElectionSim. Why it matters: Using LLMs for social simulation could provide new tools for understanding and potentially predicting social dynamics in the Arab world.

Multi-agent Time-based Decision-making for the Search and Action Problem

arXiv · Feb 27

This paper introduces a decentralized multi-agent decision-making framework for search and action problems under time constraints, treating time as a budgeted resource where actions have costs and rewards. The approach uses probabilistic reasoning to choose actions that maximize reward within the available time. Evaluated in a simulated search, pick, and place scenario inspired by the Mohamed Bin Zayed International Robotics Challenge (MBZIRC), the algorithm outperformed benchmark strategies. Why it matters: The framework's validation in a Gazebo environment signals potential for real-world robotic applications, particularly for time-sensitive, cooperative robotics tasks in the UAE.

Scaling Arabic Medical Chatbots Using Synthetic Data: Enhancing Generative AI with Synthetic Patient Records

arXiv · Sep 12

Researchers address the challenge of limited Arabic medical dialogue data by generating 80,000 synthetic question-answer pairs using ChatGPT-4o and Gemini 2.5 Pro, expanding an initial dataset of 20,000 records. They fine-tuned five LLMs, including Mistral-7B and AraGPT2, and evaluated performance using BERTScore and expert review. Results showed that training with ChatGPT-4o-generated data led to higher F1-scores and fewer hallucinations across models. Why it matters: This demonstrates the potential of synthetic data augmentation to improve domain-specific Arabic language models, particularly for low-resource medical NLP applications.

Learning to Cooperate in Multi-Agent Systems

MBZUAI

Dr. Yali Du from King's College London will give a presentation on learning to cooperate in multi-agent systems. Her research focuses on enabling cooperative and responsible behavior in machines using reinforcement learning and foundation models. She will discuss enhancing collaboration within social contexts, fostering human-AI coordination, and achieving scalable alignment. Why it matters: This highlights the growing importance of research into multi-agent systems and human-AI interaction, crucial for developing AI that integrates effectively and ethically into society.

A new model for drug development

MBZUAI

MBZUAI's Professor Le Song is developing an AI-driven simulation to model the human body at societal, organ, tissue, cellular, and molecular levels. The goal is to reduce the time and cost associated with bringing new medicines to market by removing the need for wet lab biological research. Song aims to create a comprehensive model using machine learning. Why it matters: This research could revolutionize drug discovery in the region by accelerating the development process and reducing reliance on traditional research methods.