GCC AI Research

Results for "LibMultiLabel"

Algorithms and Software for Text Classification

MBZUAI

The article discusses why text classification techniques remain hard to apply effectively, even with tools like LibMultiLabel available. It argues that users need guidance on applying machine learning methods appropriately, because practical applications raise questions such as which evaluation criteria and data strategies to use. The piece also mentions a panel discussion hosted by MBZUAI in collaboration with the Manara Center for Coexistence and Dialogue. Why it matters: This signals ongoing efforts within the UAE AI ecosystem to address practical challenges and promote responsible AI usage in NLP applications.

A Culturally-diverse Multilingual Multimodal Video Benchmark & Model

arXiv

A new benchmark, ViMUL-Bench, is introduced to evaluate video LLMs across 14 languages, including Arabic, with a focus on cultural inclusivity. The benchmark includes 8k manually verified samples across 15 categories and varying video durations. A multilingual video LLM, ViMUL, is also presented, along with a training set of 1.2 million samples, with both to be publicly released.

Short course on the development of open-source machine learning packages

MBZUAI

MBZUAI is hosting a short course on developing open-source machine learning packages. The course will be led by Chih-Jen Lin, an affiliated professor at MBZUAI and distinguished professor at National Taiwan University, who has developed widely used ML packages like LIBSVM and LibMultiLabel. The course will cover topics such as starting a project, choosing functionalities, and identifying research problems from user feedback. Why it matters: This course can help improve the quality and usability of open-source machine learning tools coming from the region's research institutions.

Performance Prediction via Bayesian Matrix Factorisation for Multilingual Natural Language Processing Tasks

MBZUAI

A new Bayesian matrix factorization approach is explored for performance prediction in multilingual NLP, aiming to reduce the experimental burden of evaluating various language combinations. The approach outperforms state-of-the-art methods in NLP benchmarks like machine translation and cross-lingual entity linking. It also avoids hyperparameter tuning and provides uncertainty estimates over predictions. Why it matters: Accurate performance prediction methods accelerate multilingual NLP research by reducing computational costs and improving experimental efficiency, especially valuable for Arabic NLP tasks.

When disagreement becomes a signal for AI models

MBZUAI

A new paper coauthored by researchers at The University of Melbourne and MBZUAI explores disagreement in human annotation for AI training. The paper treats disagreement as a signal (human label variation or HLV) rather than noise, and proposes new evaluation metrics based on fuzzy set theory. These metrics adapt accuracy and F-score to cases where multiple labels may plausibly apply, aligning model output with the distribution of human judgments. Why it matters: This research addresses a key challenge in NLP by accounting for the inherent ambiguity in human language, potentially leading to more robust and human-aligned AI systems.
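
To make the idea of fuzzy-set-based metrics concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's definitions: it assumes each example has a human label distribution and a model distribution over the same classes, and it uses the standard fuzzy-set intersection (the element-wise minimum) to measure overlap. The function names `soft_accuracy` and `soft_f1` are hypothetical.

```python
def soft_accuracy(human, model):
    """Average fuzzy overlap between human and model label distributions.

    Assumption: overlap for one example is the sum over classes of
    min(human_prob, model_prob), i.e. the fuzzy-set intersection.
    With one-hot inputs this reduces to ordinary accuracy.
    """
    if not human or len(human) != len(model):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for h, m in zip(human, model):
        total += sum(min(hp, mp) for hp, mp in zip(h, m))
    return total / len(human)


def soft_f1(human, model, label):
    """F-score for one class, with counts replaced by fuzzy memberships.

    Assumption: "true positives" are the summed min() overlap for the
    class, while predicted and gold totals are the summed probabilities.
    """
    overlap = sum(min(h[label], m[label]) for h, m in zip(human, model))
    predicted = sum(m[label] for m in model)
    gold = sum(h[label] for h in human)
    precision = overlap / predicted if predicted else 0.0
    recall = overlap / gold if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Two examples, two classes: annotators split 60/40 on the first
# example and agree fully on the second.
human = [[0.6, 0.4], [1.0, 0.0]]
model = [[0.5, 0.5], [0.0, 1.0]]
print(soft_accuracy(human, model))
print(soft_f1(human, model, label=0))
```

In this sketch a model is rewarded for matching the spread of human judgments on the first example rather than only for picking the majority label, which is the behavior the summary describes.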