GCC AI Research

Results for "chatbot"

A Cross-cultural Corpus of Annotated Verbal and Nonverbal Behaviors in Receptionist Encounters

arXiv ·

Researchers created a cross-cultural corpus of annotated verbal and nonverbal behaviors in receptionist interactions. The corpus includes native speakers of American English and Arabic role-playing scenarios at university reception desks in Doha, Qatar, and Pittsburgh, USA. The manually annotated nonverbal behaviors include gaze direction, hand gestures, torso positions, and facial expressions. Why it matters: This resource can be valuable for the human-robot interaction community, especially for building culturally aware AI systems.

Collaboration releases Vicuna – environmentally friendly, cost-effective rival to ChatGPT

MBZUAI ·

Researchers from MBZUAI, UC Berkeley, CMU, Stanford, and UC San Diego collaborated to create Vicuna, an open-source chatbot that cost roughly $300 to train, compared with the more than $4 million reportedly spent training ChatGPT. Vicuna achieves 90% of ChatGPT's subjective language quality while being far more energy-efficient, and it can run on a single GPU. It was fine-tuned from Meta AI's LLaMA model using user-shared conversations and has gained significant traction on GitHub. Why it matters: This research demonstrates that high-quality chatbots can be developed at a fraction of the cost and environmental impact, opening up new possibilities for sustainable AI development in the region.

LLM-based Multi-class Attack Analysis and Mitigation Framework in IoT/IIoT Networks

arXiv ·

This paper introduces a framework that combines machine learning for multi-class attack detection in IoT/IIoT networks with large language models (LLMs) for analyzing attack behavior and suggesting mitigations. The framework uses role-play prompt engineering with retrieval-augmented generation (RAG) to guide LLMs such as ChatGPT-o3 and DeepSeek-R1, and introduces new evaluation metrics for quantitative assessment. Experiments on the Edge-IIoTset and CICIoT2023 datasets showed Random Forest to be the best detection model, with ChatGPT-o3 outperforming DeepSeek-R1 in attack analysis and mitigation.
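The role-play-plus-RAG pattern described above can be sketched in a few lines: retrieve the knowledge snippets most relevant to a detected attack, then embed them in a prompt that assigns the LLM an analyst role. The knowledge base, keyword-overlap retrieval, and prompt template below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of role-play prompting with retrieval-augmented
# generation (RAG). Retrieval here is naive keyword overlap; real
# systems typically use dense vector search.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query, keep top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(alert: str, context: list[str]) -> str:
    """Assemble a role-play prompt grounding the LLM in retrieved context."""
    ctx = "\n".join(f"- {c}" for c in context)
    return (
        "You are a senior IoT security analyst.\n"
        f"Relevant knowledge:\n{ctx}\n"
        f"Detected attack: {alert}\n"
        "Explain the attack behavior and suggest mitigations."
    )

# Hypothetical knowledge base entries for illustration only.
knowledge_base = [
    "MQTT flooding exhausts broker resources via rapid connect requests",
    "SQL injection targets web-facing IIoT dashboards",
    "Mirai botnets exploit default credentials on IoT devices",
]
prompt = build_prompt("MQTT flooding", retrieve("MQTT flooding attack", knowledge_base))
print(prompt)
```

The prompt string would then be sent to whichever LLM backs the framework; only the retrieved snippets, not the whole knowledge base, reach the model.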

Meeting unmet legal needs with NLP

MBZUAI ·

Justice Connect, an Australian charity, collaborated with MBZUAI's Prof. Timothy Baldwin to improve their legal intake tool using NLP. The tool helps route legal requests, but users struggled to identify the relevant area of law, leading to delays and frustration. By applying NLP, the collaboration aims to help users more easily navigate the tool and access appropriate legal resources. Why it matters: This project demonstrates how NLP can be applied to improve access to justice and address unmet legal needs, particularly for those unfamiliar with legal terminology.

Automated Generation of Personalized Pedagogical Interventions in Intelligent Tutoring Systems

MBZUAI ·

Ekaterina Kochmar from the University of Bath presented the Korbit Intelligent Tutoring System (ITS), an AI-powered dialogue-based platform providing personalized learning experiences. A comparative study showed that students using Korbit achieved 2-2.5 times higher learning gains and higher completion rates compared to a traditional MOOC platform. Kochmar is also a co-founder and CSO of Korbit AI. Why it matters: The research highlights the potential of AI to deliver personalized education and significantly improve learning outcomes in online STEM education, an area of focus for many GCC universities.

Scaling Arabic Medical Chatbots Using Synthetic Data: Enhancing Generative AI with Synthetic Patient Records

arXiv ·

Researchers address the challenge of limited Arabic medical dialogue data by generating 80,000 synthetic question-answer pairs using ChatGPT-4o and Gemini 2.5 Pro, expanding an initial dataset of 20,000 records. They fine-tuned five LLMs, including Mistral-7B and AraGPT2, and evaluated performance using BERTScore and expert review. Results showed that training with ChatGPT-4o-generated data led to higher F1-scores and fewer hallucinations across models. Why it matters: This demonstrates the potential of synthetic data augmentation to improve domain-specific Arabic language models, particularly for low-resource medical NLP applications.
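BERTScore, the automatic metric used above, greedily matches each candidate token to its most similar reference token (and vice versa) in embedding space, then combines the two directions into an F1. The toy 2-D vectors below are made up purely to show those mechanics; the real metric uses contextual BERT embeddings.

```python
# Toy illustration of the greedy-matching F1 behind BERTScore.
# Token "embeddings" are hand-picked 2-D vectors, not real BERT output.
import math

def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def bertscore_f1(cand_emb, ref_emb):
    """Greedy-match each token to its best counterpart, then combine."""
    precision = sum(max(cosine(c, r) for r in ref_emb) for c in cand_emb) / len(cand_emb)
    recall = sum(max(cosine(r, c) for c in cand_emb) for r in ref_emb) / len(ref_emb)
    return 2 * precision * recall / (precision + recall)

# A candidate identical to the reference scores a perfect 1.0.
ref = [(1.0, 0.0), (0.0, 1.0)]
print(round(bertscore_f1(ref, ref), 3))  # 1.0
```

In practice the `bert-score` Python package computes this over real model embeddings; the paper additionally backed the automatic scores with expert review, which matters for clinical text where embedding similarity alone can miss hallucinations.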