Thamar Solorio of MBZUAI served as general chair of EMNLP 2024, which hosted over 4,000 attendees. MBZUAI researchers presented nearly 50 studies, including one co-authored by Solorio and Monojit Choudhury that received an Outstanding Paper Award. Key themes included cultural awareness, machine-generated content detection, and LLM empathy and cultural representation. Why it matters: MBZUAI's strong presence at EMNLP highlights its growing influence in the international NLP research community and its focus on culturally aware AI.
Thamar Solorio from the University of Houston will discuss machine learning approaches for spontaneous human language processing. The talk will cover adapting multilingual transformers to code-switching data and using data augmentation for domain adaptation in sequence labeling tasks. Solorio will also provide an overview of other research projects at the RiTUAL lab, focusing on the scarcity of labeled data. Why it matters: This presentation addresses key challenges in Arabic NLP related to data scarcity, which is a persistent obstacle in developing effective AI applications for the region.
The first Workshop on Language Models for Low-Resource Languages (LoResLM 2025) was held in Abu Dhabi as part of COLING 2025. It provided a forum for researchers to share work on language models for low-resource languages. The workshop accepted 35 papers from 52 submissions, covering diverse languages and research areas.
NYUAD and MBZUAI co-hosted the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP) in Abu Dhabi from December 7 to 11. EMNLP is a top-tier NLP and AI conference organized by SIGDAT, the ACL Special Interest Group on Linguistic Data. MBZUAI's Natural Language Processing Department is actively developing NLP datasets and methods to solve social problems. Why it matters: Hosting EMNLP in the UAE highlights the growing importance of NLP research in the region and the increasing contributions of local institutions like MBZUAI to the field.
Dr. Teresa Lynn from Dublin City University (DCU) discussed the challenges in developing NLP tools for Irish, a low-resource language facing digital extinction. She highlighted the lack of speech and language applications and fundamental language resources for Irish. Lynn also mentioned her work at DCU on the GaelTech project and her involvement in the European Language Equality project. Why it matters: The development of NLP tools for low-resource languages like Irish is crucial for preserving linguistic diversity and preventing digital marginalization in the AI era.