The paper introduces MultiProSE, the first multi-label Arabic dataset for propaganda, sentiment, and emotion detection. It extends the existing ArPro dataset with sentiment and emotion annotations, resulting in 8,000 annotated news articles. Baseline models, including GPT-4o-mini and BERT-based models, were developed for each task, and the dataset, guidelines, and code are publicly available. Why it matters: This resource enables further research into Arabic language models and a better understanding of opinion dynamics within Arabic news media.
Michael Hickner, an associate professor at Penn State University, visited KAUST as part of the CRDF-KAUST-OSR Visiting Scholar Fellowship Program. Hickner specializes in materials science and engineering, chemistry, and chemical engineering. The visit was documented with photos by Meres J. Weche. Why it matters: Such programs foster international collaboration and knowledge exchange in science and engineering between KAUST and other leading institutions.
This paper introduces a new task: detecting propaganda techniques in code-switched text. The authors created and released a corpus of 1,030 English-Roman Urdu code-switched texts annotated with 20 propaganda techniques. Experiments show the importance of directly modeling multilinguality and using the right fine-tuning strategy for this task.
Nicu Sebe from the University of Trento presented recent work on video generation, focusing on animating objects in a source image using external information like labels, driving videos, or text. He introduced a Learnable Game Engine (LGE) trained from monocular annotated videos, which maintains states of scenes, objects, and agents to render controllable viewpoints. Why it matters: This talk highlights advancements in cross-modal AI, potentially enabling new applications in gaming, simulation, and content creation within the region.
Natasa Przulj at the Barcelona Supercomputing Center is developing an AI framework that fuses multi-omic data to improve precision medicine. The framework uses graph-regularized non-negative matrix tri-factorization (NMTF) and network science algorithms for patient stratification, biomarker prediction, and drug repurposing. It has been applied to diseases including cancer, COVID-19, and Parkinson's disease. Why it matters: This research can enable more personalized and effective treatments by leveraging complex biological data to understand disease mechanisms and tailor therapies.
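The brief names NMTF without describing it. As a rough illustration only, the sketch below factorizes a non-negative matrix X into three non-negative factors, X ≈ F S Gᵀ, using standard multiplicative updates for the squared-error objective. It is not the framework described above: the graph regularization and network science steps are omitted, and the function names, toy matrix, and parameter values are all hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def mult_update(m, numer, denom, eps=1e-9):
    """Element-wise multiplicative update m * numer / denom (keeps entries >= 0)."""
    return [[m_ij * n_ij / (d_ij + eps) for m_ij, n_ij, d_ij in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, numer, denom)]


def frob_error(x, f, s, g):
    """Squared Frobenius norm of X - F S G^T."""
    approx = matmul(matmul(f, s), transpose(g))
    return sum((a - b) ** 2 for xr, ar in zip(x, approx) for a, b in zip(xr, ar))


def random_matrix(rows, cols, rng):
    """Strictly positive random initialisation."""
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]


def nmtf(x, k1, k2, iters=200, seed=0):
    """Plain (unregularized) NMTF: X (n x m) ~ F (n x k1) S (k1 x k2) G^T (k2 x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    f = random_matrix(n, k1, rng)
    s = random_matrix(k1, k2, rng)
    g = random_matrix(m, k2, rng)
    xt = transpose(x)
    for _ in range(iters):
        # F <- F * (X G S^T) / (F S G^T G S^T)
        gst = matmul(g, transpose(s))
        f = mult_update(f, matmul(x, gst), matmul(f, matmul(transpose(gst), gst)))
        # G <- G * (X^T F S) / (G S^T F^T F S)
        fs = matmul(f, s)
        g = mult_update(g, matmul(xt, fs), matmul(g, matmul(transpose(fs), fs)))
        # S <- S * (F^T X G) / (F^T F S G^T G)
        ft, gt = transpose(f), transpose(g)
        numer = matmul(matmul(ft, x), g)
        denom = matmul(matmul(matmul(ft, f), s), matmul(gt, g))
        s = mult_update(s, numer, denom)
    return f, s, g


# Toy "patients x features" matrix with two visible blocks (made-up values).
demo = [
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [5, 5, 1, 0],
    [0, 1, 4, 5],
    [1, 0, 5, 4],
    [0, 0, 5, 5],
]
f, s, g = nmtf(demo, k1=2, k2=2)
print("reconstruction error:", frob_error(demo, f, s, g))
```

In a stratification setting, the rows of F would be read as soft cluster memberships for the row entities (for example, patients) and the rows of G as memberships for the column entities, with S linking the two sets of clusters. A graph-regularized variant would add penalty terms built from network Laplacians to the objective, which this sketch does not attempt.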
MBZUAI has appointed Professor Timothy Baldwin as Associate Provost and acting chair of its new NLP Department. Baldwin will focus on strengthening the curriculum and building a world-class faculty team. He previously spent 17 years at the University of Melbourne. Why it matters: The recruitment signals MBZUAI's commitment to becoming a leading center for NLP research and education in the region.
MBZUAI Professor Timothy Baldwin delivered the presidential keynote at the 60th Annual Meeting of the Association for Computational Linguistics (ACL). Baldwin also published three papers at the conference, including work on biomedical literature summarization, NLP for Indonesian languages, and understanding procedural texts. The papers address challenges such as reducing human effort in reviewing medical documents and digitally preserving Indonesian indigenous languages. Why it matters: Baldwin's contributions and leadership role at ACL highlight the growing prominence of MBZUAI and GCC-based researchers in the global NLP community.
KAUST researchers developed a statistical approach to improve the identification of cancer-related protein mutations by reducing false positives. The method uses Bayesian statistics to analyze protein domain data from tumor samples, accounting for potential errors due to limited data. The team tested their method on prostate cancer data, successfully identifying a known cancer-linked mutation in the DNA-binding protein domain cd00083. Why it matters: This enhances the reliability of cancer research at the molecular level, potentially accelerating the discovery of new therapeutic targets.