Chris McCormick

Filter Masks

Filter masks are fundamental to the implementation of image filters, which are used in many computer vision algorithms.

A filter mask represents an operation, performed on each pixel of an image, that incorporates the values of the neighboring pixels. Because the mask can be written as a matrix, it gives us a way to express the operation as a mathematical equation.

Reference

  1. University of Central Florida (UCF) Lecture on YouTube: Lecture 02 – Filtering. The discussion of convolution and correlation runs from 17:45 to 22:00 in the video.
  2. University of Nevada, Reno (UNR) Lecture Slides: Image Processing Fundamentals
  3. ImageMagick documentation on Convolution.

Correlation

The discussion of filter masks starts at about 17:45 in Lecture 2 from Dr. Shah. He jumps here from the concept of the derivative mask to correlation, which can be confusing. Correlation is a broader concept than image derivatives; it can be used to apply other kinds of filters as well.

We are working towards the broader concept of image filters. With an image filter, you apply a transformation to an image in which the value of each pixel is changed by considering the values of its neighbors.

He uses the equation below to express this mathematically, where f is the image and h is the mask to be applied. However, f and h are reversed on the right-hand side of this equation.

Image filter equation

I prefer this version of the equation from the ‘Image Processing Fundamentals’ slides:

Image filter equation 2

To compute the output value for the pixel at x, y: for each position in the mask (for each column s of the mask, and each row t of the mask), multiply the mask value w(s, t) by the corresponding neighbor of pixel x, y in the original image (the neighbor at x + s, y + t). Take the sum of all of those products, and that is the pixel value at x, y in the result image.
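Written out as a formula, the procedure just described would look something like the following. Only w, f, s, t, x, and y come from the description; the output name g and the summation limits a and b (for a mask with 2a + 1 columns and 2b + 1 rows, centered on the pixel) are assumptions that follow the usual textbook convention.

```latex
g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s,\; y + t)
```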

The symbol between f and h denotes correlation.

[19:26]

The mask is also referred to as a kernel.
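The correlation described in this section (multiply each mask value by the corresponding neighbor, then sum the products) can be sketched in NumPy as follows. This is a minimal illustration, not code from the lecture or the slides: the function name `correlate2d`, the zero padding at the image border, and the assumption of an odd-sized mask centered on the pixel are all choices made for this example.

```python
import numpy as np

def correlate2d(f, w):
    """Correlate image f with mask w (hypothetical helper for illustration).

    Assumes w has an odd number of rows and columns, and treats pixels
    outside the image as zero.
    """
    height, width = f.shape
    mask_h, mask_w = w.shape
    a, b = mask_w // 2, mask_h // 2          # half-width and half-height of the mask
    padded = np.pad(f.astype(float), ((b, b), (a, a)), mode="constant")
    g = np.zeros((height, width))
    for t in range(-b, b + 1):               # each row t of the mask
        for s in range(-a, a + 1):           # each column s of the mask
            # Add w(s, t) * f(x + s, y + t) for every pixel (x, y) at once.
            shifted = padded[t + b : t + b + height, s + a : s + a + width]
            g += w[t + b, s + a] * shifted
    return g

image = np.arange(9).reshape(3, 3)           # small test image with values 0..8
box = np.ones((3, 3)) / 9.0                  # 3x3 averaging mask
result = correlate2d(image, box)
print(result[1, 1])                          # center pixel: average of all nine values
```

The loop runs over mask positions rather than over pixels, so each pass adds one weighted, shifted copy of the image to the result. That computes the same sum as the per-pixel description, with far fewer Python-level iterations.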

Convolution

Convolution is similar, except that you flip the mask matrix vertically and then horizontally (equivalent to rotating it by 180 degrees) before applying it.
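To make the flip concrete, here is a small sketch that builds convolution out of correlation. The helper names `correlate2d` and `convolve2d`, the zero padding at the border, and the odd-sized mask are assumptions made for this example. Applying both operations to an image containing a single bright pixel shows the difference: correlation stamps a flipped copy of the mask into the output, while convolution stamps the mask itself.

```python
import numpy as np

def correlate2d(f, w):
    """Correlate image f with an odd-sized mask w, zero-padding the border."""
    height, width = f.shape
    mask_h, mask_w = w.shape
    a, b = mask_w // 2, mask_h // 2
    padded = np.pad(f.astype(float), ((b, b), (a, a)), mode="constant")
    g = np.zeros((height, width))
    for t in range(-b, b + 1):
        for s in range(-a, a + 1):
            g += w[t + b, s + a] * padded[t + b : t + b + height, s + a : s + a + width]
    return g

def convolve2d(f, w):
    """Convolve by flipping the mask vertically, then horizontally, and correlating."""
    flipped = np.fliplr(np.flipud(w))
    return correlate2d(f, flipped)

impulse = np.zeros((3, 3))
impulse[1, 1] = 1.0                          # a single bright pixel in the center

mask = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])                 # deliberately asymmetrical

print(correlate2d(impulse, mask))            # the mask rotated by 180 degrees
print(convolve2d(impulse, mask))             # the mask itself
```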

Neither the video lecture nor the slides explicitly explain the practical importance of the distinction between convolution and correlation. What’s more, in all of the applications of image masks that I have seen (the Gaussian filter and the Laplacian of Gaussian filter), it is always convolution that is used, even though the masks are symmetrical and rotating them has no effect!

Fortunately, I found this site which does explain the distinction. It only matters when you have an asymmetrical mask, as you might find, for example, in a Convolutional Neural Network. When the mask is asymmetrical, only convolution has the commutative and associative properties that you expect from multiplication; correlation does not.
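These properties are easy to check numerically. The sketch below uses 1-D signals for brevity, with NumPy's `np.convolve` and `np.correlate` in "full" mode; the same properties hold for 2-D masks. The arrays `a`, `b`, and `c` are arbitrary values chosen for illustration, with `b` and `c` deliberately asymmetrical.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([0.0, 1.0, 0.5])    # asymmetrical
c = np.array([1.0, 0.0, -1.0])   # asymmetrical

# Commutativity: does swapping the two operands change the result?
conv_commutes = np.allclose(np.convolve(a, b), np.convolve(b, a))
corr_commutes = np.allclose(np.correlate(a, b, mode="full"),
                            np.correlate(b, a, mode="full"))

# Associativity: does the grouping of three operands change the result?
conv_associates = np.allclose(np.convolve(np.convolve(a, b), c),
                              np.convolve(a, np.convolve(b, c)))
corr_associates = np.allclose(
    np.correlate(np.correlate(a, b, mode="full"), c, mode="full"),
    np.correlate(a, np.correlate(b, c, mode="full"), mode="full"),
)

print(conv_commutes, corr_commutes, conv_associates, corr_associates)
# prints: True False True False
```

Associativity is the property that lets you combine two masks into one ahead of time and apply the combined mask in a single pass, which is one practical reason to prefer convolution when the masks are asymmetrical.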