Results for "zero-shot learning"

On Transferability of Machine Learning Models

MBZUAI

This article discusses domain shift in machine learning, where testing data differs from training data, and methods to mitigate it via domain adaptation and domain generalization. Domain adaptation uses labeled source data and unlabeled target data. Domain generalization uses labeled data from one or more source domains to generalize to unseen target domains. Why it matters: Research on mitigating domain shift enhances the robustness and applicability of AI models in diverse real-world scenarios.

Representation learning for deep clustering and few-shot learning

MBZUAI

Michael Kampffmeyer from UiT The Arctic University of Norway presented a talk at MBZUAI on representation learning for deep clustering and few-shot learning. The talk covered deep clustering in multi-view settings and the influence of geometrical representation properties on few-shot classification performance. He specifically discussed embedding representations on the hypersphere and how this relates to the hubness phenomenon. Why it matters: This highlights MBZUAI's role in hosting discussions on advanced machine learning topics like few-shot learning, which are crucial for addressing data scarcity challenges in the region and beyond.

Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization

arXiv · Jan 23

This paper introduces Adaptive Entropy-aware Optimization (AEO), a new framework to tackle Multimodal Open-set Test-time Adaptation (MM-OSTTA). AEO uses Unknown-aware Adaptive Entropy Optimization (UAE) and Adaptive Modality Prediction Discrepancy Optimization (AMP) to distinguish unknown class samples during online adaptation by amplifying the entropy difference between known and unknown samples. The study establishes a new benchmark derived from existing datasets with five modalities and evaluates AEO's performance across various domain shift scenarios, demonstrating its effectiveness in long-term and continual MM-OSTTA settings.
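
The entropy-based idea in the summary above can be illustrated with a minimal sketch. This is not the AEO method: it only shows why prediction entropy can separate confident (likely known-class) predictions from uncertain (possibly unknown-class) ones. The function names, the example logits, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (max is subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the value lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def flag_unknown(logits, threshold=0.5):
    """Flag a sample as possibly unknown when its prediction entropy is high.

    The threshold is an assumed value for this sketch, not one from the paper.
    """
    return normalized_entropy(softmax(logits)) > threshold

# Hypothetical logits over four known classes.
confident = [8.0, 0.5, 0.2, 0.1]   # one class dominates: low entropy
uncertain = [1.0, 0.9, 1.1, 1.0]   # nearly uniform: high entropy

print(flag_unknown(confident), flag_unknown(uncertain))
```

A method that widens the entropy gap between the two groups, as the summary describes, would make a simple threshold like this one more reliable.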

A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos

arXiv · Dec 18

A new benchmark, LongShOTBench, is introduced for evaluating multimodal reasoning and tool use in long videos, featuring open-ended questions and diagnostic rubrics. The benchmark addresses the limitations of existing datasets by combining temporal length and multimodal richness, using human-validated samples. LongShOTAgent, an agentic system, is also presented for analyzing long videos, with both the benchmark and agent demonstrating the challenges faced by state-of-the-art MLLMs.

Teaching machines what they don’t know: a new approach to open-world object detection

MBZUAI

MBZUAI researchers are presenting a new approach to open-world object detection at the AAAI conference. The method enables machines to distinguish between known and unknown objects in images, and then learn to classify the unknown objects. PhD student Sahal Shaji Mullappilly is the lead author of the study, titled "Semi-Supervised Open-World Detection". Why it matters: This research addresses a key limitation in current object detection systems, allowing for more adaptable and robust AI in real-world applications.

A new approach to improve vision-language models

MBZUAI

MBZUAI researchers have developed a new approach to enhance the generalizability of vision-language models when processing out-of-distribution data. The study, led by Sheng Zhang and involving multiple MBZUAI professors and researchers, addresses the challenge of AI applications needing to manage unforeseen circumstances. The new method aims to improve how these models, which combine natural language processing and computer vision, handle new information not used during training. Why it matters: Improving the adaptability of vision-language models is critical for real-world AI applications like autonomous driving and medical imaging, especially in diverse and changing environments.