GCC AI Research

Results for "chatbot"

Scaling Arabic Medical Chatbots Using Synthetic Data: Enhancing Generative AI with Synthetic Patient Records

arXiv

To address the scarcity of Arabic medical dialogue data, researchers generated 80,000 synthetic question-answer pairs with ChatGPT-4o and Gemini 2.5 Pro, expanding an initial dataset of 20,000 records. They fine-tuned five LLMs, including Mistral-7B and AraGPT2, and evaluated them using BERTScore and expert review. Across models, training on ChatGPT-4o-generated data led to higher F1 scores and fewer hallucinations. Why it matters: The work demonstrates the potential of synthetic data augmentation to improve domain-specific Arabic language models, particularly for low-resource medical NLP applications.

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv

LongShOTBench is a new benchmark for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. It addresses the limitations of existing datasets by combining temporal length with multimodal richness, and its samples are human-validated. The authors also present LongShOTAgent, an agentic system for analyzing long videos. Together, the benchmark and the agent expose the challenges that state-of-the-art multimodal large language models (MLLMs) face.