An Introduction to Saliency Maps in Deep Learning

Deep learning has changed the way visual problems are addressed. Convolutional neural networks (CNNs) have become the standard architecture for image recognition and other computer vision applications. Training such networks demands a large amount of image data. Apart from the object we’re interested in, these images frequently contain other content that is not needed and acts as noise in the dataset. Saliency maps help by highlighting the pixels that matter most for a specific object or class while ignoring the rest of the image, such as the background.

The saliency map is a way to measure the spatial support of a particular class in each image. Saliency maps in deep learning were introduced by researchers at the University of Oxford in the paper ‘Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps’. A saliency map is built by computing the gradient of the network’s output (the score of the class of interest) with respect to the input image.
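As a minimal sketch of the gradient idea, the example below uses a toy two-layer ReLU network written in NumPy rather than a real CNN. The model, its random weights `w1` and `w2`, and the function names are all assumptions made for illustration. The gradient of the class score is computed by hand with the chain rule, and the saliency of each pixel is taken as the largest absolute gradient across its color channels, as the paper describes for RGB images.

```python
import numpy as np

def class_score(image, w1, w2, class_idx):
    """Class score of a toy two-layer ReLU network (hypothetical model)."""
    hidden = np.maximum(0.0, w1 @ image.ravel())
    return float((w2 @ hidden)[class_idx])

def score_gradient(image, w1, w2, class_idx):
    """Gradient of the class score with respect to every input value."""
    pre_activation = w1 @ image.ravel()
    relu_mask = (pre_activation > 0).astype(float)
    # Chain rule: d(score)/d(x) = w1^T (relu_mask * w2[class_idx])
    grad = w1.T @ (relu_mask * w2[class_idx])
    return grad.reshape(image.shape)

def saliency_map(image, w1, w2, class_idx):
    """Per-pixel saliency: max absolute gradient over the color channels."""
    grad = score_gradient(image, w1, w2, class_idx)
    return np.abs(grad).max(axis=-1)

# Illustrative data: a random 4x4 RGB "image" and random weights.
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
w1 = rng.standard_normal((8, image.size))  # input -> 8 hidden units
w2 = rng.standard_normal((3, 8))           # hidden -> 3 class scores
sal = saliency_map(image, w1, w2, class_idx=1)
print(sal.shape)  # (4, 4): one saliency value per pixel
```

With a real network, a deep learning framework would typically produce the same gradient through automatic differentiation in a single backward pass; the manual chain rule here only keeps the sketch self-contained.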

The saliency map’s objective is to describe the conspicuity at every point in the visual field with a single scalar value, and to guide the selection of attended locations based on the spatial distribution of saliency. Its input is a combination of feature maps, and in the classical attention model the saliency map itself is modeled as a dynamical neural network.

Saliency maps are used to process images so that distinct visual features stand out. Color photos, for example, can be converted to black-and-white images so that the strongest colors can be identified. Two other examples are infrared imaging to detect temperature (red is hot, blue is cold) and night vision to identify light sources (green is bright, black is dark).

To create a saliency map of an image, we first extract its basic features, such as color, orientation, and intensity. These feature channels are then used to build Gaussian pyramids, from which the feature maps are produced. Finally, the saliency map is constructed by averaging all the feature maps.
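The steps above can be sketched in NumPy as follows. This is a simplified illustration under several assumptions, not a reference implementation: the orientation channel is left out for brevity, a 5-tap binomial kernel stands in for a true Gaussian, each feature map is the absolute difference between the full-resolution channel and its coarser pyramid levels, and coarse levels are resized with nearest-neighbour sampling. All function names are hypothetical.

```python
import numpy as np

KERNEL = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0  # binomial approximation of a Gaussian

def blur(channel):
    """Separable 5-tap blur with edge padding."""
    h, w = channel.shape
    padded = np.pad(channel, 2, mode="edge")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def gaussian_pyramid(channel, levels):
    """Repeatedly blur and halve the resolution."""
    pyramid = [channel]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

def upsample_to(channel, shape):
    """Nearest-neighbour resize of a coarse level back to `shape`."""
    rows = np.arange(shape[0]) * channel.shape[0] // shape[0]
    cols = np.arange(shape[1]) * channel.shape[1] // shape[1]
    return channel[np.ix_(rows, cols)]

def normalize(feature_map):
    """Rescale to [0, 1]; a constant map becomes all zeros."""
    span = feature_map.max() - feature_map.min()
    if span == 0:
        return np.zeros_like(feature_map)
    return (feature_map - feature_map.min()) / span

def feature_map(channel, levels=4):
    """Contrast between the fine channel and its coarser pyramid levels."""
    pyramid = gaussian_pyramid(channel, levels)
    diffs = [np.abs(channel - upsample_to(coarse, channel.shape))
             for coarse in pyramid[1:]]
    return normalize(np.mean(diffs, axis=0))

def classical_saliency(rgb, levels=4):
    """Average the feature maps of the intensity and color channels."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    channels = [
        (r + g + b) / 3.0,   # intensity
        r - g,               # red-green opponency
        b - (r + g) / 2.0,   # blue-yellow opponency
    ]
    maps = [feature_map(c, levels) for c in channels]
    return normalize(np.mean(maps, axis=0))

# Illustrative input: a dark gray 32x32 image with a red square in the middle.
rgb = np.full((32, 32, 3), 0.1)
rgb[12:20, 12:20] = [1.0, 0.1, 0.1]
saliency = classical_saliency(rgb)
print(saliency.shape)  # (32, 32)
```

In this toy image the uniform background far from the square produces no contrast at any pyramid level, so the saliency concentrates on and around the red square.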

Saliency maps have many applications in image processing and computer vision projects. They can be used in image cropping, medical imaging, image captioning, video surveillance, traffic light detection, and more.

References:

  • https://arxiv.org/pdf/1312.6034.pdf
  • https://analyticsindiamag.com/what-are-saliency-maps-in-deep-learning/
  • https://www.geeksforgeeks.org/what-is-saliency-map/
  • https://towardsdatascience.com/practical-guide-for-visualizing-cnns-using-saliency-maps-4d1c2e13aeca