Attacks and Defences for Adversarial Examples in Convolutional Neural Networks
Convolutional neural networks (CNNs) are extremely susceptible to adversarial examples: inputs carrying imperceptible perturbations that fool a CNN into misclassifying or failing to recognise them. Adversarial noise is added to images, videos or speech files in such a targeted manner that the CNN produces a wrong result. An attacker can poison the training database by injecting adversarial examples, or tamper with a physical-world object so that the CNN fails to classify it correctly. Examples of systems that can be attacked with malicious intent include face recognition systems and autonomous cars; for instance, patches placed on a stop sign can cause the CNN to fail and may lead to an accident. Attacks are classified as white-box or black-box depending on the information available to the attacker: in a white-box attack the attacker has complete knowledge of the CNN architecture and its parameters, while in a black-box attack no such information is available. The talk will review some of the state-of-the-art attacks.
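As a concrete illustration of the white-box setting, the sketch below follows the well-known fast gradient sign method (FGSM). It is a minimal example assuming a differentiable PyTorch classifier; the names `model`, `images`, `labels` and the value of `epsilon` are illustrative and not taken from the specific attacks reviewed in the talk.

```python
# Minimal FGSM-style white-box attack: perturb the input along the sign of the
# loss gradient. Assumes a differentiable PyTorch classifier `model` (illustrative).
import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=0.03):
    """Return adversarially perturbed copies of `images` (white-box access required)."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Step in the direction that increases the loss, then clip back to a valid pixel range.
    adversarial = images + epsilon * images.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```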
Defence against adversarial examples can be approached in the following ways:
1. During the CNN training phase, through adversarial training, gradient hiding or blocking transferability (a sketch of adversarial training follows this list);
2. Designing a robust CNN by adjusting the architecture so that it is immune to adversarial noise;
3. Using preprocessing filters to remove adversarial noise;
4. Detecting adversarial examples through feature squeezing.
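As an example of the first defence, adversarial training augments each training step with perturbed inputs so the network learns to classify them correctly. The sketch below is a hedged illustration assuming the `fgsm_attack` helper above and a standard PyTorch training loop; it is not the exact procedure covered in the talk.

```python
# Hedged sketch of adversarial training: optimise on a mix of clean and
# FGSM-perturbed examples. Assumes `model`, `optimizer` and `fgsm_attack` (above).
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels, epsilon=0.03):
    adv_images = fgsm_attack(model, images, labels, epsilon)
    optimizer.zero_grad()
    # Combine clean and adversarial losses so the network learns to resist the noise.
    loss = (F.cross_entropy(model(images), labels)
            + F.cross_entropy(model(adv_images), labels))
    loss.backward()
    optimizer.step()
    return loss.item()
```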
One such defence technique detects adversarial images by observing how the outputs of a CNN-based system change when noise removal filters are applied; these operation-oriented characteristics enable detection of adversarial examples (a minimal sketch of this idea is given below). In this talk, I will present state-of-the-art techniques for both attacks with adversarial examples and defences against them.
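The sketch below illustrates the detection idea in the spirit of feature squeezing: run the CNN on the original input and on a denoised copy, and flag the input as adversarial when the two prediction vectors disagree strongly. The median filter and the threshold are illustrative assumptions, not the exact configuration discussed in the talk.

```python
# Hedged sketch of detection via a noise removal filter (feature-squeezing style).
import torch
import torch.nn.functional as F

def median_filter(images, kernel_size=3):
    """Per-channel 2D median filter implemented with unfold (a simple noise remover)."""
    pad = kernel_size // 2
    patches = F.unfold(F.pad(images, (pad, pad, pad, pad), mode="reflect"),
                       kernel_size=kernel_size)
    b, c = images.shape[0], images.shape[1]
    patches = patches.view(b, c, kernel_size * kernel_size, -1)
    return patches.median(dim=2).values.view_as(images)

def is_adversarial(model, images, threshold=0.5):
    """Flag inputs whose predictions change sharply after denoising."""
    with torch.no_grad():
        p_raw = F.softmax(model(images), dim=1)
        p_filtered = F.softmax(model(median_filter(images)), dim=1)
    # A large L1 distance between the two prediction vectors suggests adversarial noise.
    return (p_raw - p_filtered).abs().sum(dim=1) > threshold
```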