Convolution

A fundamental image processing operation that slides a kernel (small matrix) across an image, computing weighted sums at each position for blurring, edge detection, and sharpening.

Convolution is the core operation of spatial image filtering: a small matrix called a kernel slides across the image, computing a weighted sum of surrounding pixels at each position. Nearly all spatial filters - blur, sharpen, edge detection, emboss - are implemented as convolutions.

The convolution procedure:

Slide the kernel (e.g. 3×3) across the image from top-left to bottom-right, one pixel at a time
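At each position, multiply each kernel element by the pixel value it overlaps (strictly, convolution first rotates the kernel by 180°; for symmetric kernels the result is the same)
Sum all products to produce the output pixel value
Repeat for every pixel in the image, padding the borders or skipping positions where the kernel would extend past the edge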
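
The procedure above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name convolve2d, the nested-list image format, and the choice to skip border pixels (so the output is smaller than the input) are all assumptions made for the example. It applies the kernel exactly as the steps describe, without rotating it.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` and return the weighted sums.

    Hypothetical helper: only positions where the kernel fits entirely
    inside the image are computed, so no padding is needed.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):          # top to bottom
        row = []
        for x in range(iw - kw + 1):      # left to right, one pixel at a time
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    # multiply each kernel element by the pixel under it
                    acc += kernel[j][i] * image[y + j][x + i]
            row.append(acc)               # the sum is the output pixel
        out.append(row)
    return out


# 3x3 box kernel: every weight is 1/9
box = [[1 / 9] * 3 for _ in range(3)]

# 4x4 test image with a bright 2x2 square in the middle
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]

blurred = convolve2d(image, box)  # 2x2 result, each value close to 4.0
```

Every 3×3 window in this image covers all four bright pixels, so each output value is the average 36/9. Production code would normally rely on an optimized library routine and an explicit border policy rather than loops like these.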

Representative kernels and their effects:

Box filter: All elements equal 1/9 in a 3×3 kernel. Produces simple averaging blur
Gaussian filter: Weights follow a Gaussian distribution centered on the kernel. Produces a smoother, more natural-looking blur than the box filter and is widely used for noise reduction
Sobel filter: Detects horizontal and vertical edges by approximating first-order derivatives
Laplacian filter: Detects edges in all directions via second-order derivatives. Zero-crossings indicate edge locations
Unsharp mask: Subtracts a blurred version from the original to isolate fine detail, then adds that (scaled) difference back to the original to enhance sharpness. A staple in print and photography
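
To make some of these kernels concrete, the sketch below writes out commonly used 3×3 forms of the box, Sobel, and Laplacian kernels and applies the edge detectors to a small vertical step edge. The helper apply_kernel and the constant names are hypothetical, border pixels are skipped, and the kernel is applied without rotation, so the sign of the Sobel response follows that convention.

```python
def apply_kernel(image, kernel):
    """Hypothetical helper: weighted sum at every position where the kernel fits."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


BOX = [[1 / 9] * 3 for _ in range(3)]            # averaging blur
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal intensity change
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical intensity change
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]   # second derivative, 4-neighbour form

# Vertical step edge: dark on the left, bright on the right
step = [[0, 0, 10, 10] for _ in range(4)]

gx = apply_kernel(step, SOBEL_X)     # [[40, 40], [40, 40]]: strong horizontal gradient
gy = apply_kernel(step, SOBEL_Y)     # [[0, 0], [0, 0]]: no change from row to row
lap = apply_kernel(step, LAPLACIAN)  # [[10, -10], [10, -10]]: sign flips across the edge
```

The Laplacian output changes sign between neighbouring columns, which is the zero-crossing that marks the edge location.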

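Computationally, applying an N×N kernel to an M×M image directly requires O(M²N²) operations. For large kernels it is faster to transform the image and kernel to the frequency domain with an FFT, multiply them pointwise, and transform back, which costs O(M² log M) regardless of kernel size. Separable kernels such as the Gaussian can be decomposed into a horizontal and a vertical 1D pass, reducing the cost to O(M²N).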
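
The separability claim can be checked with a short sketch. It assumes a 3-tap binomial kernel [1/4, 1/2, 1/4] as a small stand-in for a Gaussian; the helper names filter2d, filter_rows, and filter_cols are hypothetical, and border pixels are skipped. The 2D kernel is the outer product of the 1D kernel with itself, and one horizontal pass followed by one vertical pass gives the same result as the full 2D pass.

```python
def filter2d(image, kernel):
    """Direct 2D pass: kh * kw multiplications per output pixel."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(kernel[j][i] * image[y + j][x + i]
                for j in range(kh) for i in range(kw))
            for x in range(len(image[0]) - kw + 1)
        ]
        for y in range(len(image) - kh + 1)
    ]


def filter_rows(image, k):
    """Horizontal 1D pass along each row."""
    n = len(k)
    return [
        [sum(k[i] * row[x + i] for i in range(n)) for x in range(len(row) - n + 1)]
        for row in image
    ]


def filter_cols(image, k):
    """Vertical 1D pass down each column."""
    n = len(k)
    return [
        [sum(k[j] * image[y + j][x] for j in range(n)) for x in range(len(image[0]))]
        for y in range(len(image) - n + 1)
    ]


k1 = [0.25, 0.5, 0.25]                      # 1D binomial approximation of a Gaussian
k2 = [[a * b for b in k1] for a in k1]      # 3x3 kernel as an outer product

# Small deterministic 6x6 test image
image = [[(3 * x + 5 * y) % 7 for x in range(6)] for y in range(6)]

full = filter2d(image, k2)                          # 9 multiplications per pixel
two_pass = filter_cols(filter_rows(image, k1), k1)  # 3 + 3 multiplications per pixel

# The two results agree up to floating-point rounding
max_diff = max(abs(a - b) for ra, rb in zip(full, two_pass) for a, b in zip(ra, rb))
```

For a 3×3 kernel the saving is modest (6 multiplications instead of 9), but it grows linearly with kernel width: 2N instead of N² per pixel.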

Deep learning CNNs (Convolutional Neural Networks) use the same convolution operation, but learn kernel values automatically through training rather than manual design.
