Adversarial Training: Improvements and Applications

MBZUAI · Notable

Summary

This article discusses adversarial training (AT), a method for improving the robustness of machine learning models against adversarial attacks. AT simulates adversarial attacks during training so that the model both classifies data correctly and keeps data away from its decision boundaries. Dr. Jingfeng Zhang of RIKEN-AIP will present improvements to AT and its application to evaluating and enhancing the reliability of ML methods. Why it matters: As ML models are deployed more widely in real-world applications in the GCC region, robustness against adversarial attacks is crucial to their reliability and security.
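As a rough illustration of what "simulating adversarial attacks during training" can look like, the sketch below trains a toy logistic-regression classifier on perturbed inputs. It is not taken from the talk: the single-step FGSM attack, the synthetic two-cluster data, and every name and hyperparameter (`fgsm`, `adversarial_train`, `eps`, `lr`, `epochs`) are assumptions chosen for brevity.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, y, w, b, eps):
    """One-step attack: move each feature by eps in the direction that raises the loss.
    For logistic regression, d(cross-entropy)/dx = (p - y) * w."""
    err = predict(x, w, b) - y
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]

def adversarial_train(data, eps=0.3, lr=0.1, epochs=200):
    """Gradient descent where every update uses attacked inputs, not clean ones."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in data:
            x_adv = fgsm(x, y, w, b, eps)        # simulate the attack
            err = predict(x_adv, w, b) - y       # loss gradient on the attacked input
            grad_w = [g + err * xi for g, xi in zip(grad_w, x_adv)]
            grad_b += err
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return w, b

# Hypothetical toy data: two Gaussian clusters, labels 0 and 1.
random.seed(0)
data = [([random.gauss(-2, 0.5), random.gauss(-2, 0.5)], 0) for _ in range(100)]
data += [([random.gauss(2, 0.5), random.gauss(2, 0.5)], 1) for _ in range(100)]

w, b = adversarial_train(data)

# Accuracy measured on attacked inputs rather than clean ones.
robust_acc = sum(
    (predict(fgsm(x, y, w, b, 0.3), w, b) > 0.5) == (y == 1) for x, y in data
) / len(data)
print(f"robust accuracy: {robust_acc:.2f}")
```

Because each update sees inputs pushed toward the decision boundary, the learned boundary tends to sit farther from the training data. Practical AT on deep networks typically uses stronger multi-step attacks and an autodiff framework, which this sketch leaves out.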

Keywords

adversarial training · machine learning · RIKEN-AIP · robustness · adversarial attacks


Provable Unrestricted Adversarial Training without Compromise with Generalizability

arXiv · Jan 22

This paper introduces Provable Unrestricted Adversarial Training (PUAT), a novel adversarial training approach. PUAT enhances robustness against both unrestricted and restricted adversarial examples while improving standard generalizability by aligning the distributions of adversarial examples, natural data, and the classifier's learned distribution. The approach uses partially labeled data and an augmented triple-GAN to generate effective unrestricted adversarial examples, demonstrating superior performance on benchmarks.

ScoreAdv: Score-based Targeted Generation of Natural Adversarial Examples via Diffusion Models

arXiv · Jul 8

The paper introduces ScoreAdv, a novel approach for generating natural unrestricted adversarial examples (UAEs) using diffusion models. It incorporates an adversarial guidance mechanism to shift the sampling distribution and uses saliency maps to inject visual information. Experiments on the ImageNet and CelebA datasets demonstrate state-of-the-art attack success rates, image quality, and robustness against defenses.
