Deep neural networks (DNNs) offer superior performance in machine learning tasks such as image recognition, speech recognition, pattern analysis, and intrusion detection. In this paper, we propose a one-pixel adversarial example that is safe for friendly deep neural networks.
By modifying only one pixel, the proposed method generates a one-pixel-safe adversarial example that is misclassified by an enemy classifier but still correctly classified by a friendly classifier.
To verify the performance of the proposed method, we used the CIFAR-10 dataset, ResNet classifiers, and the TensorFlow library in our experiments.
The results show that, by modifying only one pixel, the proposed method achieves success rates of 13.5% and 26.0% in targeted and untargeted attacks, respectively.
These success rates are slightly lower than those of the conventional one-pixel method, which achieves 15% and 33.5% in targeted and untargeted attacks, respectively; however, the proposed method protects 100% of the friendly classifiers.
In addition, when it modifies five pixels, the proposed method achieves success rates of 20.5% and 52.0% in targeted and untargeted attacks, respectively.
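
As an illustration of the acceptance criterion stated above, the following minimal sketch checks whether a one-pixel modification is "safe": the friendly classifier must keep the true label, while the enemy classifier must output a wrong label (untargeted) or a chosen label (targeted). Everything in it is an assumption made for illustration: the function names, the two toy classifiers, and the plain random search, which is a stand-in and not the search procedure used in the paper. The experiments described above use ResNet classifiers on CIFAR-10 instead.

```python
import numpy as np


def apply_pixel(image, x, y, rgb):
    """Return a copy of `image` (H x W x 3, uint8) with one pixel replaced."""
    out = image.copy()
    out[y, x] = rgb
    return out


def is_one_pixel_safe(candidate, true_label, friendly_predict, enemy_predict,
                      target=None):
    """Check the two conditions: friendly stays correct, enemy is fooled."""
    if friendly_predict(candidate) != true_label:
        return False  # the friendly classifier must not be affected
    enemy_label = enemy_predict(candidate)
    if target is None:
        return enemy_label != true_label  # untargeted: any wrong label
    return enemy_label == target          # targeted: one specific label


def random_search(image, true_label, friendly_predict, enemy_predict,
                  target=None, trials=500, seed=0):
    """Hypothetical stand-in search: try random single-pixel changes."""
    rng = np.random.default_rng(seed)
    height, width, _ = image.shape
    for _ in range(trials):
        x = int(rng.integers(width))
        y = int(rng.integers(height))
        rgb = rng.integers(0, 256, size=3, dtype=np.uint8)
        candidate = apply_pixel(image, x, y, rgb)
        if is_one_pixel_safe(candidate, true_label, friendly_predict,
                             enemy_predict, target):
            return candidate
    return None  # no safe example found within the trial budget


# Toy classifiers (assumptions, not real models): the friendly one looks at
# mean brightness, so one pixel barely moves it; the enemy one reacts to a
# single bright red value anywhere in the image.
def friendly_predict(img):
    return int(img.mean() > 127)


def enemy_predict(img):
    return int(img[..., 0].max() > 200)


image = np.zeros((32, 32, 3), dtype=np.uint8)  # CIFAR-10-sized dummy input
true_label = 0
adversarial = random_search(image, true_label, friendly_predict, enemy_predict)
```

The friendly-classifier check comes first so that any candidate that would harm the friendly classifier is rejected regardless of its effect on the enemy classifier. Extending the sketch to five pixels would amount to applying `apply_pixel` five times per candidate before the check.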