Detecting Adversarial Examples by Input Transformations, Defense Perturbations, and Voting
Nesti, Federico; Biondi, Alessandro; Buttazzo, Giorgio
2021-01-01
Abstract
Over the past few years, convolutional neural networks (CNNs) have been shown to reach superhuman performance in visual recognition tasks. However, CNNs can easily be fooled by adversarial examples (AEs), i.e., maliciously crafted images that force the networks to predict an incorrect output while being extremely similar to images for which a correct output is predicted. Regular AEs are not robust to input image transformations, which can therefore be used to detect whether an AE is presented to the network. Nevertheless, it is still possible to generate AEs that are robust to such transformations. This article extensively explores the detection of AEs via image transformations and proposes a novel methodology, called defense perturbation, to detect robust AEs with the same input transformations the AEs are robust to. Such a defense perturbation is shown to be an effective countermeasure to robust AEs. Furthermore, multinetwork AEs are introduced. AEs of this kind can be used to fool multiple networks simultaneously, which is critical in systems that use network redundancy, such as those based on architectures with majority voting over multiple CNNs. Finally, an extensive set of experiments based on state-of-the-art CNNs trained on the ImageNet dataset is reported.