In recent years, deep neural networks (DNNs) have demonstrated their effectiveness in computer vision applications such as image classification and segmentation, and the technique has become essential in almost every field. However, recent studies have shown that these networks are vulnerable to slight distortions or additive perturbations in the input images, raising concerns about the robustness and safety of DNNs.
Two main strategies are followed to deal with these issues. The first approach looks for more robust network architectures: researchers keep proposing more powerful deep architectures, such as LeNet, AlexNet, VGG, ResNet, etc., built from convolutional layers, pooling layers, and activation functions. The other approach studies more robust training processes to make the network less vulnerable to input distortions or adversarial attacks. Recently, a Chinese research team proposed ImageNet-CS, a scaling-distortion dataset created by scaling a subset of the ImageNet Challenge dataset. The goal of their work is to investigate the impact of scaled images on the performance of DNNs. In addition, they tested training techniques such as Augmix, Revisiting, and Normalizer Free on the proposed ImageNet-CS to demonstrate the impact of the training strategy on model robustness.
The new image dataset, ImageNet-CS, is created by applying several scaling operations to ImageNet-C, which itself derives from the very famous ImageNet database. Many well-known deep networks perform poorly on the 15 automatically generated perturbations present in ImageNet-C, which is why the authors chose it as the starting dataset for generating ImageNet-CS. In the first step, 50 images were randomly selected from each of ImageNet's categories to create a clean sample dataset of 50,000 raw images in total. Second, ten different degrees of magnification were applied to obtain the 500,000 pictures of the ImageNet-CS dataset. As the magnification grows, the image shows only a portion of the object, but the object's semantic content remains unaltered.
The generation of the scaled image is made in two steps:
1) Crop the image to (X − M) × (Y − N) pixels, where X and Y are the length and width of the raw picture, M is the number of pixels clipped in the long direction, and N the number clipped in the wide direction.
2) Upscale the cropped image back to its original size using the bilinear interpolation algorithm.
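The two steps above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' code: the function names (`bilinear_resize`, `scale_distort`) are ours, and the paper does not specify which side of the image the M × N pixels are clipped from, so the sketch simply crops from one corner.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D (H, W) array to (out_h, out_w) with bilinear interpolation."""
    h, w = img.shape
    # Map each output pixel back to a (fractional) input coordinate.
    ys = np.linspace(0, h - 1, out_h)
    xs = np.linspace(0, w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]   # vertical interpolation weights
    wx = (xs - x0)[None, :]   # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def scale_distort(img, m, n):
    """Step 1: crop m rows and n columns. Step 2: upscale back to the original size."""
    h, w = img.shape
    cropped = img[: h - m, : w - n]
    return bilinear_resize(cropped, h, w)
```

Because the output keeps the original resolution, the distorted copies can be fed to a classifier with no further preprocessing; only the effective magnification changes with M and N.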
Six of the best-known deep architectures were evaluated on the generated ImageNet-CS and compared against their classification accuracy on the same images without any scaling. The outcomes demonstrate that upscaling negatively impacted performance for all of the architectures employed. This shows that, despite the minimal change in the semantic properties of the images, the scaling operation still has a considerable impact on the behavior of DNNs. In addition, the ResNet50 network was chosen to evaluate the effect of two training strategies, Augmix and Normalizer Free, on the model's accuracy against ImageNet-CS. In both cases, the outcomes are superior to those of standard ResNet50 training.
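The evaluation protocol amounts to comparing top-1 accuracy on the clean images against the same metric at each of the ten magnification levels. A minimal sketch of that comparison, assuming a hypothetical `predict` callable that maps an image to a class label (the function names here are ours, not from the paper):

```python
def top1_accuracy(predict, images, labels):
    """Fraction of images whose predicted class matches the ground-truth label."""
    correct = sum(1 for img, y in zip(images, labels) if predict(img) == y)
    return correct / len(labels)

def robustness_report(predict, clean_images, scaled_by_level, labels):
    """Top-1 accuracy on the clean set and on each scaled variant of the same images."""
    report = {"clean": top1_accuracy(predict, clean_images, labels)}
    for level, images in scaled_by_level.items():
        report[level] = top1_accuracy(predict, images, labels)
    return report
```

Plotting the resulting accuracies against magnification level is what reveals the monotone degradation the authors report.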
Since DNNs have become so widely used, studying the robustness of deep learning models is essential to making them more robust and accurate under perturbations. The article summarized here reports interesting results on the impact of image-scaling transformations on the performance of DNN models: the accuracy of these networks drops significantly as the scaling magnification increases. The authors presented the classification performance of six of the most widely used architectures under different scale distortions, and these results should motivate studies of the scale sensitivity of other architectures in the future.
This article is a research summary written by Marktechpost Staff based on the research paper 'Impact of Scaled Image on Robustness of Deep Neural Networks'. All credit for this research goes to the researchers on this project.
Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical science and a master's degree in telecommunications and networking systems. His current research interests include computer vision, stock market prediction, and deep learning. He has produced several scientific articles on person re-identification and on the robustness and stability of deep learning models.