GCC AI Research

Results for "COLING"

MBZUAI welcomes the world to Abu Dhabi as COLING 2025 opens

MBZUAI

The 31st International Conference on Computational Linguistics (COLING 2025) is being held in Abu Dhabi from January 18-24, hosted by MBZUAI. The conference features paper presentations, demonstrations, keynote speeches, workshops, and tutorials, with over 1,500 attendees. MBZUAI faculty and students contributed 22 papers to the conference, including research on fact-checking and cross-cultural content. Why it matters: Hosting COLING 2025 highlights the UAE's growing role as a hub for AI and NLP research, particularly in Arabic language processing.

Leading natural language processing conference to take place in Abu Dhabi

MBZUAI

The 31st International Conference on Computational Linguistics (COLING 2025) will be held in Abu Dhabi in January 2025, hosted by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). COLING is a major biennial NLP and AI conference that brings together leaders from research centers, academia, and industry. The conference will feature keynote talks, presentations, workshops, and tutorials, with 1,500 expected participants. Why it matters: Hosting COLING underscores the UAE's growing role in AI and NLP research and provides a platform to address regional linguistic challenges and advance AI technologies.

GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human

arXiv

The GenAI Content Detection Task 1 is a shared task on detecting machine-generated text, featuring monolingual (English) and multilingual subtasks. The task, part of the GenAI workshop at COLING 2025, attracted 36 teams for the English subtask and 26 for the multilingual one. The organizers provide a detailed overview of the data, results, system rankings, and analysis of the submitted systems.

Towards Real-world Fact-Checking with Large Language Models

MBZUAI

Iryna Gurevych from TU Darmstadt presented research on using large language models for real-world fact-checking, focusing on dismantling misleading narratives built on misinterpreted scientific publications and on detecting misinformation conveyed through visual content. The research aims to explain why a false claim was believed, why it is false, and why the alternative is correct. Why it matters: Addressing misinformation, especially when it is supported by seemingly credible sources, is critical for public health, conflict resolution, and maintaining trust in institutions in the Middle East and globally.

Transformer Models: from Linguistic Probing to Outlier Weights

MBZUAI

Giovanni Puccetti from ISTI-CNR presented research on linguistic probing of language models such as BERT and RoBERTa. The research investigates how well these models encode linguistic properties and links this ability to outlier parameters. Preliminary work on fine-tuning LLMs for Italian and on detecting synthetically generated news was also presented. Why it matters: Understanding the inner workings and linguistic capabilities of LLMs is crucial for improving their reliability and adapting them to diverse languages like Arabic.

SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs

arXiv

The Qatar Computing Research Institute (QCRI) has released SpokenNativQA, a multilingual spoken question-answering dataset for evaluating LLMs in conversational settings. The dataset contains 33,000 naturally spoken questions and answers across multiple languages, including low-resource and dialect-rich languages. It aims to address the limitations of text-based QA datasets by incorporating speech variability, accents, and linguistic diversity. Why it matters: This benchmark enables more robust evaluation of LLMs in speech-based interactions, particularly for Arabic dialects and other low-resource languages.

Addressing NLP problems in low resource settings

MBZUAI

Thamar Solorio from the University of Houston will discuss machine learning approaches for processing spontaneous human language. The talk will cover adapting multilingual transformers to code-switching data and using data augmentation for domain adaptation in sequence labeling tasks. Solorio will also give an overview of other research projects at the RiTUAL lab that address the scarcity of labeled data. Why it matters: This presentation addresses key challenges in Arabic NLP related to data scarcity, which is a persistent obstacle in developing effective AI applications for the region.