GCC AI Research

Scaling Arabic Medical Chatbots Using Synthetic Data: Enhancing Generative AI with Synthetic Patient Records

arXiv · Significant research

Summary

Researchers address the challenge of limited Arabic medical dialogue data by generating 80,000 synthetic question-answer pairs using ChatGPT-4o and Gemini 2.5 Pro, expanding an initial dataset of 20,000 records. They fine-tuned five LLMs, including Mistral-7B and AraGPT2, and evaluated performance using BERTScore and expert review. Results showed that training with ChatGPT-4o-generated data led to higher F1-scores and fewer hallucinations across models. Why it matters: This demonstrates the potential of synthetic data augmentation to improve domain-specific Arabic language models, particularly for low-resource medical NLP applications.
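BERTScore reports precision, recall, and F1 by matching candidate tokens against reference tokens (in the real metric, via contextual embedding similarity). As a minimal illustration of the F1 computation only, here is a simplified exact-token-overlap variant; `token_f1` is a hypothetical helper, not the paper's evaluation code, and real BERTScore would use embedding cosine similarities instead of exact matches:

```python
from collections import Counter

def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1: a simplified stand-in for BERTScore's
    embedding-based precision/recall matching (illustrative only)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand or not ref:
        return 0.0
    # Count tokens shared between candidate and reference (with multiplicity).
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)   # fraction of candidate tokens matched
    recall = overlap / len(ref)      # fraction of reference tokens matched
    return 2 * precision * recall / (precision + recall)

print(token_f1("the patient has fever", "patient has a fever"))  # → 0.75
```

Averaging such per-pair F1 scores over a test set yields a single model-level number, which is how results like the 68.5% BERT F1-score cited below are typically reported.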


Related

Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models

arXiv

A new study introduces Sporo AraSum, a language model designed for Arabic clinical documentation, and compares it to JAIS using synthetic datasets and modified PDQI-9 metrics. Sporo AraSum significantly outperformed JAIS in quantitative AI metrics and qualitative attributes related to accuracy, utility, and cultural competence. The model addresses the nuances of Arabic while reducing AI hallucinations, making it suitable for Arabic-speaking healthcare. Why it matters: The model offers a more culturally and linguistically sensitive solution for Arabic clinical documentation, potentially improving healthcare workflows and patient outcomes in the region.

Arabic Large Language Models for Medical Text Generation

arXiv

This study explores fine-tuning large language models (LLMs) for Arabic medical text generation to improve hospital management systems. A unique dataset was collected from social media, capturing medical conversations between patients and doctors, and used to fine-tune models like Mistral-7B, LLaMA-2-7B, and GPT-2. The fine-tuned Mistral-7B model outperformed the others with a BERT F1-score of 68.5%. Why it matters: The research demonstrates the potential of generative AI to provide scalable and culturally relevant solutions for healthcare challenges in Arabic-speaking regions.

Arabic Stable LM: Adapting Stable LM 2 1.6B to Arabic

arXiv

The paper introduces Arabic Stable LM, a 1.6B parameter Arabic-centric language model, in both base and chat versions. The Arabic Stable LM 1.6B chat model achieves strong results on several benchmarks, outperforming models with up to 8x more parameters. The study also demonstrates the benefit of incorporating synthetic instruction tuning data through a large synthetic dialogue dataset. Why it matters: This work makes Arabic LLMs more accessible by reducing the parameter size while maintaining strong performance, facilitating deployment in resource-constrained environments.

Benchmarking the Medical Understanding and Reasoning of Large Language Models in Arabic Healthcare Tasks

arXiv

This paper benchmarks the performance of large language models (LLMs) on Arabic medical natural language processing tasks using the AraHealthQA dataset. The study evaluated LLMs in multiple-choice question answering, fill-in-the-blank, and open-ended question answering scenarios. The results showed that a majority voting solution using Gemini Flash 2.5, Gemini Pro 2.5, and GPT o3 achieved 77% accuracy on MCQs, while other LLMs achieved a BERTScore of 86.44% on open-ended questions. Why it matters: The research highlights both the potential and limitations of current LLMs in Arabic clinical contexts, providing a baseline for future improvements in Arabic medical AI.
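The majority-voting scheme described above can be sketched in a few lines: each model answers a multiple-choice question, and the most frequent answer wins. The paper does not specify its tie-breaking rule; in this sketch (with hypothetical question IDs and votes), ties resolve in favor of the answer seen first:

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most common answer among model outputs.
    Counter.most_common breaks ties by first insertion order,
    so the earliest-listed model wins a tie (an assumption here)."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical MCQ answers from three models (e.g. Gemini Flash 2.5,
# Gemini Pro 2.5, GPT o3) on two questions:
votes = {"q1": ["B", "B", "C"], "q2": ["A", "D", "A"]}
final = {q: majority_vote(a) for q, a in votes.items()}
print(final)  # → {'q1': 'B', 'q2': 'A'}
```

With three voters, any answer chosen by at least two models is selected outright, which is why ensembles like this tend to filter out one model's isolated errors.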