JA EN

Pooling

A downsampling operation that reduces the spatial dimensions of feature maps by aggregating values within local regions, lowering computation while adding translation invariance.

Pooling is a spatial downsampling operation in CNNs that reduces feature map dimensions by summarizing values within a fixed-size window. By collapsing local regions into single values, pooling decreases computation while introducing translation invariance to small spatial shifts.

The most common configuration is 2x2 max pooling with stride 2, selecting the maximum in each non-overlapping region. This halves width and height, reducing spatial area to one quarter. VGG-16 applies max pooling five times, shrinking 224x224 input to 7x7 before fully connected layers.

Modern architectures increasingly replace pooling with strided convolutions, though GAP remains standard as a classifier head. For segmentation, techniques like pooling index storage and atrous convolutions preserve spatial precision.

Related Terms

Related Articles