Poster

Rethinking CNN’s Generalization to Backdoor Attack from Frequency Domain

Quanrui Rao · Lin Wang · Wuying Liu

Halle B #212
Fri 10 May 1:45 a.m. PDT — 3:45 a.m. PDT

Abstract:

Convolutional neural networks (CNNs) are easily affected by backdoor injection: the compromised model performs normally on clean samples but produces attacker-specified outputs on poisoned ones. Most existing studies have focused on how trigger-induced feature changes in poisoned samples affect model generalization in the spatial domain. We instead study the mechanism by which CNNs memorize poisoned samples in the frequency domain, and find that CNNs generalize to poisoned samples by memorizing the frequency-domain distribution of the trigger's changes. We also examine how trigger perturbations in different frequency components affect the generalization of models poisoned by visible and invisible backdoor attacks, and show that high-frequency components are more susceptible to perturbation than low-frequency components. Based on these findings, we propose a universal invisibility strategy for visible triggers, which achieves trigger invisibility while maintaining the original attack performance. We also design a novel frequency-domain backdoor attack method based on low-frequency semantic information, which achieves 100% attack accuracy on multiple models and datasets and can bypass multiple defenses.
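To make the low-frequency idea concrete, here is a minimal Python sketch (not the authors' released code) of embedding a trigger in the low-frequency components of an image via a 2D DCT and inspecting the frequency-domain residual a model could memorize. The function names `poison_low_frequency` and `frequency_residual`, the blend strength `alpha`, and the band size are hypothetical illustration choices, not parameters from the paper.

```python
# Illustrative sketch: low-frequency DCT trigger embedding.
# Assumes grayscale images as float arrays in [0, 1].
import numpy as np
from scipy.fft import dctn, idctn

def poison_low_frequency(image, trigger, alpha=0.05, band=8):
    """Blend `trigger` into the lowest `band` x `band` DCT coefficients of `image`."""
    coeffs = dctn(image, norm="ortho")         # 2D DCT of the clean image
    trig_coeffs = dctn(trigger, norm="ortho")  # 2D DCT of the trigger pattern
    # Perturb only the low-frequency (top-left) block of coefficients.
    coeffs[:band, :band] += alpha * trig_coeffs[:band, :band]
    poisoned = idctn(coeffs, norm="ortho")     # back to the spatial domain
    return np.clip(poisoned, 0.0, 1.0)

def frequency_residual(clean, poisoned):
    """Frequency-domain change introduced by poisoning."""
    return dctn(poisoned, norm="ortho") - dctn(clean, norm="ortho")

# Usage: poison a random 32x32 image with a random trigger pattern.
rng = np.random.default_rng(0)
clean = rng.random((32, 32))
trigger = rng.random((32, 32))
poisoned = poison_low_frequency(clean, trigger)
residual = frequency_residual(clean, poisoned)
print("max spatial change:", np.abs(poisoned - clean).max())
print("residual energy in low band:", np.abs(residual[:8, :8]).sum())
```

Restricting the perturbation to the top-left DCT block keeps the spatial change diffuse and visually subtle, which is one plausible reading of why low-frequency triggers can be both invisible and robust to perturbation.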
