Pooling#

Pooling layers are commonly used in Convolutional Neural Networks (CNNs) for several reasons.

  1. Downsampling and Dimensionality Reduction

    Pooling layers help reduce the spatial dimensions (width and height) of the input volume. By downsampling the feature maps, pooling layers decrease the computational load in the subsequent layers of the network.

  2. Translation Invariance

    Pooling layers provide a certain degree of translation invariance, making the network less sensitive to the exact position of features in the input. This is particularly useful in tasks like image classification, where the location of an object within an image may vary.

  3. Local Feature Extraction

    Pooling focuses on local features and helps retain the most essential information from a region. By selecting the maximum or average value from a pool of neighboring pixels, pooling layers capture the dominant features in a local region.

The pooling operation has the same parameters as convolution:

  • kernel_size

  • padding

  • stride

  • dilation

Unlike convolution, the stride of a pooling kernel is usually equal to the kernel size (this is the default value), as in the sketch below.
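
A minimal sketch of these parameters, assuming PyTorch (the parameter names above match `nn.MaxPool2d` / `nn.AvgPool2d`):

```python
import torch
from torch import nn

# stride=None means "use the kernel size as the stride" -- the default for pooling.
pool = nn.MaxPool2d(kernel_size=2, stride=None, padding=0, dilation=1)

x = torch.randn(1, 3, 8, 8)   # (batch, channels, height, width)
print(pool(x).shape)          # torch.Size([1, 3, 4, 4]) -- spatial size halved
```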

Max pooling#

The max pooling operation returns the maximum value of each window.

https://miro.medium.com/v2/resize:fit:1400/1*WvHC5bKyrHa7Wm3ca-pXtg.gif
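
A small worked example, assuming PyTorch; the input values are arbitrary and need not match the animation above:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[ 1.,  2.,  3.,  4.],
                  [ 5.,  6.,  7.,  8.],
                  [ 9., 10., 11., 12.],
                  [13., 14., 15., 16.]]).reshape(1, 1, 4, 4)  # (N, C, H, W)

# With the default stride (= kernel_size), each non-overlapping 2x2 window
# contributes its maximum value.
print(F.max_pool2d(x, kernel_size=2))
# tensor([[[[ 6.,  8.],
#           [14., 16.]]]])
```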

Exercise

What will the output of the max pooling layer in the previous example be if the stride is \(1\)?

Average pooling#

Average pooling is very similar to max pooling, but it calculates the mean of each window instead of the maximum.

https://www.researchgate.net/publication/349921480/figure/fig2/AS:999677281460235@1615353045589/Max-pooling-and-average-pooling.png
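
The same example as above with average pooling (again a sketch assuming PyTorch, with illustrative values):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[ 1.,  2.,  3.,  4.],
                  [ 5.,  6.,  7.,  8.],
                  [ 9., 10., 11., 12.],
                  [13., 14., 15., 16.]]).reshape(1, 1, 4, 4)  # (N, C, H, W)

# Each 2x2 window is replaced by its mean rather than its maximum.
print(F.avg_pool2d(x, kernel_size=2))
# tensor([[[[ 3.5000,  5.5000],
#           [11.5000, 13.5000]]]])
```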

Global average pooling#

https://www.guidetomlandai.com/assets/img/machine_learning/global_average_pooling.png
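
Global average pooling averages each feature map over its entire spatial extent, so an input of shape \((N, C, H, W)\) becomes \((N, C, 1, 1)\) (or \((N, C)\) after flattening): one value per channel. A minimal sketch, assuming PyTorch's `nn.AdaptiveAvgPool2d`:

```python
import torch
from torch import nn

gap = nn.AdaptiveAvgPool2d(output_size=1)   # average over the full H x W extent

x = torch.randn(8, 64, 7, 7)   # e.g. the last convolutional feature map
out = gap(x)                   # shape: (8, 64, 1, 1)
print(out.flatten(1).shape)    # torch.Size([8, 64]) -- one value per channel
```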