ICLR Poster ILA-DA: Improving Transferability of Intermediate Level Attack with Data Augmentation

Virtual presentation / poster accept

Chiu Wai Yan · Tsz Him Cheung · Dit-Yan Yeung

Keywords: [ Deep Learning and representational learning ] [ adversarial examples ] [ data augmentation ] [ Adversarial Transferability ]

Abstract:

An adversarial attack aims to generate deceptive inputs that fool a machine learning model. In deep learning, an adversarial input created for a specific neural network can also trick other neural networks. This intriguing property is known as black-box transferability of adversarial examples. To improve black-box transferability, a previously proposed method called Intermediate Level Attack (ILA) fine-tunes an adversarial example by maximizing its perturbation on an intermediate layer of the source model. Meanwhile, it has been shown that simple image transformations can also enhance attack transferability. Based on these two observations, we propose ILA-DA, which employs three novel augmentation techniques to enhance ILA. Specifically, we propose (1) an automated way to apply effective image transformations, (2) an efficient reverse adversarial update technique, and (3) an attack interpolation method to create more transferable adversarial examples. Extensive experiments show that ILA-DA outperforms ILA and other state-of-the-art attacks by a large margin. On ImageNet, we attain an average attack success rate of 84.5%, which is 19.5% better than ILA and 4.7% better than the previous state-of-the-art across nine undefended models. For defended models, ILA-DA also leads existing attacks and provides further gains when incorporated into more advanced attack methods.
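As a rough illustration of what "maximizing its perturbation on an intermediate layer" can look like for the baseline ILA step, here is a minimal toy sketch. It is not the authors' implementation and does not cover the three ILA-DA augmentations. It assumes the projection-style objective commonly associated with ILA (push the feature shift of the fine-tuned example along the feature shift of a reference adversarial example), stands in a hypothetical linear map `W` for the intermediate layer so the gradient can be written by hand, and uses an L-infinity budget `eps`. All names (`ila_projection`, `ila_finetune`, `W`, `eps`, `alpha`) are illustrative.

```python
def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sub(a, b):
    """Element-wise a - b."""
    return [x - y for x, y in zip(a, b)]


def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)


def ila_projection(W, x, x_ref, x_adv):
    """Projection of the current feature shift onto the reference feature shift.

    Assumption: the 'layer' is the linear map W; a real model would use the
    activations of a chosen intermediate layer instead.
    """
    fx = matvec(W, x)
    ref_shift = sub(matvec(W, x_ref), fx)
    cur_shift = sub(matvec(W, x_adv), fx)
    return sum(c * r for c, r in zip(cur_shift, ref_shift))


def ila_finetune(W, x, x_ref, eps, alpha, steps):
    """Sign-gradient ascent on the projection, kept inside the eps-ball around x."""
    ref_shift = sub(matvec(W, x_ref), matvec(W, x))
    # For a linear layer the gradient w.r.t. x_adv is W^T @ ref_shift (constant).
    grad = [sum(W[i][j] * ref_shift[i] for i in range(len(W)))
            for j in range(len(x))]
    x_adv = list(x_ref)  # start from the reference adversarial example
    for _ in range(steps):
        stepped = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Project back into the L-infinity ball of radius eps around the clean input.
        x_adv = [min(max(s, xc - eps), xc + eps) for s, xc in zip(stepped, x)]
    return x_adv


# Toy usage with made-up numbers (a 2x3 'layer' and 3 'pixels').
W = [[1.0, -2.0, 0.5], [0.0, 1.0, 1.0]]
x_clean = [0.2, 0.5, 0.8]
x_reference = [0.25, 0.45, 0.8]  # pretend output of some baseline attack
x_tuned = ila_finetune(W, x_clean, x_reference, eps=0.1, alpha=0.02, steps=10)
print(ila_projection(W, x_clean, x_reference, x_reference))
print(ila_projection(W, x_clean, x_reference, x_tuned))
```

In a real setting the gradient would come from automatic differentiation through the source network, and the result would also be clipped to the valid pixel range; both are omitted here to keep the sketch self-contained.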
