MBZUAI researchers have developed MAviS, a new multimodal dataset, benchmark, and chatbot for fine-grained bird species recognition. MAviS includes images, audio, and text to help models identify subtle differences between species, especially rare and regional varieties. The related study was presented at EMNLP 2025 and selected as a "Senior Area Chair Highlight". Why it matters: This work addresses a key limitation in AI's ability to support biodiversity conservation and ecological monitoring in the region and globally.
MBZUAI researchers received high honors at EMNLP 2025 for two research papers, both of which placed in the top 2% of accepted work. One paper, MAviS, presents a multimodal AI system that identifies bird species by combining images, sounds, and text. The other award-winning paper focuses on uncertainty in LLM-as-a-Judge. Why it matters: The recognition highlights MBZUAI's growing influence in NLP and multimodal AI research, particularly in domain-specific applications like biodiversity conservation.
This paper introduces a convolutional transformer model for classifying tomato maturity, along with KUTomaData, a new UAE-sourced dataset for training segmentation and classification models. The model combines CNNs and transformers and was evaluated on KUTomaData and two public datasets. It achieved state-of-the-art performance, outperforming existing methods by significant margins in mAP on all three datasets.
This paper introduces a hybrid deep learning and machine learning pipeline for classifying construction and demolition waste. A dataset of 1,800 images from UAE construction sites was created, and deep features were extracted using a pre-trained Xception network. The combination of Xception features with machine learning classifiers achieved up to 99.5% accuracy, demonstrating state-of-the-art performance for debris identification.
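The two-stage design described above (a frozen feature extractor followed by a separate classifier) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `extract_features` is a hypothetical stand-in for the pre-trained Xception network, the nearest-centroid classifier is swapped in for whichever machine learning classifiers the authors used, and the class names and pixel values are made up.

```python
import math


def extract_features(image):
    """Hypothetical stand-in for a frozen, pre-trained backbone such as Xception.

    Maps a flat list of pixel intensities to a small feature vector
    (mean and standard deviation). A real backbone would return a
    high-dimensional deep feature vector instead.
    """
    n = len(image)
    mean = sum(image) / n
    variance = sum((x - mean) ** 2 for x in image) / n
    return [mean, math.sqrt(variance)]


class NearestCentroid:
    """Simple classifier used here in place of the paper's ML classifiers."""

    def fit(self, features, labels):
        sums, counts = {}, {}
        for vector, label in zip(features, labels):
            acc = sums.setdefault(label, [0.0] * len(vector))
            for i, value in enumerate(vector):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        # One centroid per class: the mean feature vector of its samples.
        self.centroids = {
            label: [value / counts[label] for value in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, features):
        # Assign each sample to the class with the closest centroid.
        return [
            min(self.centroids, key=lambda c: math.dist(vector, self.centroids[c]))
            for vector in features
        ]


def classify(train_images, train_labels, test_images):
    """Stage 1: extract features. Stage 2: fit and apply the classifier."""
    train_features = [extract_features(img) for img in train_images]
    test_features = [extract_features(img) for img in test_images]
    model = NearestCentroid().fit(train_features, train_labels)
    return model.predict(test_features)


# Made-up toy data: four-pixel "images" for two illustrative waste classes.
train_images = [
    [200, 210, 190, 205],
    [198, 202, 207, 193],
    [90, 60, 120, 80],
    [100, 70, 110, 60],
]
train_labels = ["concrete", "concrete", "wood", "wood"]
predictions = classify(train_images, train_labels,
                       [[195, 205, 200, 198], [85, 70, 110, 95]])
```

The point of the split is that the feature extractor stays fixed, so only the lightweight classifier has to be fitted on the new dataset.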
This paper introduces Adaptive Entropy-aware Optimization (AEO), a new framework for Multimodal Open-set Test-time Adaptation (MM-OSTTA). AEO uses Unknown-aware Adaptive Entropy Optimization (UAE) and Adaptive Modality Prediction Discrepancy Optimization (AMP) to distinguish unknown-class samples during online adaptation by amplifying the entropy difference between known and unknown samples. The study establishes a new benchmark derived from existing datasets with five modalities and evaluates AEO across various domain shift scenarios, demonstrating its effectiveness in long-term and continual MM-OSTTA settings.
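The idea of amplifying the entropy gap between known and unknown samples can be illustrated with a small sketch. It is an assumption-laden illustration, not the paper's loss: the `tanh` weighting, the threshold `alpha`, and the scale `beta` are hypothetical choices made here to show the mechanism.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(num_classes), so it lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def adaptive_weight(h_norm, alpha=0.5, beta=4.0):
    """Hypothetical weighting (alpha and beta are illustrative assumptions).

    Positive for confident predictions (entropy below alpha) and
    negative for uncertain ones (entropy above alpha).
    """
    return math.tanh(beta * (alpha - h_norm))


def entropy_objective(batch_logits, alpha=0.5, beta=4.0):
    """Mean of weight * entropy over a batch.

    If the weight is treated as a constant, minimizing this value lowers
    the entropy of likely-known samples and raises the entropy of
    likely-unknown samples, which widens the gap between the two groups.
    """
    terms = []
    for logits in batch_logits:
        h = normalized_entropy(softmax(logits))
        terms.append(adaptive_weight(h, alpha, beta) * h)
    return sum(terms) / len(terms)


confident = normalized_entropy(softmax([10.0, 0.0, 0.0, 0.0]))
uncertain = normalized_entropy(softmax([1.0, 1.0, 1.0, 1.0]))
```

Here `confident` falls below the threshold and receives a positive weight, while `uncertain` (a uniform prediction) receives a negative one. At test time, a threshold on the entropy could then be used to flag samples as unknown.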