Middle East AI

Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization

arXiv · Significant research

Summary

This paper introduces Adaptive Entropy-aware Optimization (AEO), a new framework to tackle Multimodal Open-set Test-time Adaptation (MM-OSTTA). AEO uses Unknown-aware Adaptive Entropy Optimization (UAE) and Adaptive Modality Prediction Discrepancy Optimization (AMP) to distinguish unknown class samples during online adaptation by amplifying the entropy difference between known and unknown samples. The study establishes a new benchmark derived from existing datasets with five modalities and evaluates AEO's performance across various domain shift scenarios, demonstrating its effectiveness in long-term and continual MM-OSTTA settings.
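The idea of separating known from unknown samples by prediction entropy can be illustrated with a minimal sketch. This is not the paper's UAE or AMP objective: the function names, the normalized-entropy score, the L1 discrepancy between two modalities, and the 0.5 threshold are all assumptions chosen only to show the underlying quantities.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the score lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def modality_discrepancy(probs_a, probs_b):
    """L1 distance between two modalities' predictions (range [0, 2])."""
    return sum(abs(a - b) for a, b in zip(probs_a, probs_b))

def is_unknown(logits, threshold=0.5):
    """Flag a sample as unknown when its normalized entropy is high.

    The threshold is a hypothetical value for illustration only.
    """
    return normalized_entropy(softmax(logits)) > threshold

# A confident prediction (low entropy) versus a flat one (high entropy).
confident = [10.0, 0.0, 0.0]
flat = [0.0, 0.0, 0.0]
print(is_unknown(confident), is_unknown(flat))
```

A method that amplifies the entropy gap would push scores for known samples toward 0 and scores for unknown samples toward 1, which makes a simple threshold like the one above more reliable during online adaptation.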

Keywords

test-time adaptation · multimodal · entropy optimization · domain shift · open-set

Related

Universal Adversarial Examples in Remote Sensing: Methodology and Benchmark

arXiv

This paper introduces a novel black-box adversarial attack method, Mixup-Attack, to generate universal adversarial examples for remote sensing data. The method identifies common vulnerabilities in neural networks by attacking features in the shallow layer of a surrogate model. The authors also present UAE-RS, the first dataset of black-box adversarial samples in remote sensing, to benchmark the robustness of deep learning models against adversarial attacks.

MOTOR: Multimodal Optimal Transport via Grounded Retrieval in Medical Visual Question Answering

arXiv

This paper introduces MOTOR, a multimodal retrieval and re-ranking approach for medical visual question answering (MedVQA) that uses grounded captions and optimal transport to capture relationships between queries and retrieved context, leveraging both textual and visual information. MOTOR identifies clinically relevant contexts to augment VLM input, achieving higher accuracy on MedVQA datasets. Empirical analysis shows MOTOR outperforms state-of-the-art methods by an average of 6.45%.

Provable Unrestricted Adversarial Training without Compromise with Generalizability

arXiv

This paper introduces Provable Unrestricted Adversarial Training (PUAT), a novel adversarial training approach. PUAT enhances robustness against both unrestricted and restricted adversarial examples while improving standard generalizability by aligning the distributions of adversarial examples, natural data, and the classifier's learned distribution. The approach uses partially labeled data and an augmented triple-GAN to generate effective unrestricted adversarial examples, demonstrating superior performance on benchmarks.

VENOM: Text-driven Unrestricted Adversarial Example Generation with Diffusion Models

arXiv

The paper introduces VENOM, a text-driven framework for generating high-quality unrestricted adversarial examples using diffusion models. VENOM unifies image content generation and adversarial synthesis into a single reverse diffusion process, enhancing both attack success rate and image quality. The framework incorporates an adaptive adversarial guidance strategy with momentum to ensure the generated adversarial examples align with the distribution of natural images.