
Activation Function

A non-linear function applied to each neuron's output in a neural network, enabling the model to learn complex patterns beyond linear transformations.

An activation function is a non-linear transformation applied to a neuron's linear output z = Wx + b. Without one, a stack of layers collapses into a single linear map, so the network cannot represent non-linear decision boundaries. The choice of activation function directly affects training speed and final accuracy.
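The collapse argument can be checked numerically. The following sketch (assuming NumPy; the layer sizes are arbitrary) composes two bias-affine layers without an activation and shows the result equals one linear layer with W = W2·W1 and b = W2·b1 + b2:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation between them: z = W2 (W1 x + b1) + b2
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x + b1) + b2

# The equivalent single linear layer: W = W2 W1, b = W2 b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

# Both paths compute the same function for every input x
assert np.allclose(two_layers, one_layer)
```

No matter how many such layers are stacked, the composition stays linear, which is why a non-linearity between layers is essential.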

In computer vision, ReLU is the de facto standard for hidden layers. Defined as f(x) = max(0, x), it passes positive values unchanged and zeros out negatives. Compared to sigmoid and tanh, ReLU avoids gradient saturation and is computationally cheap.
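A minimal sketch of these functions and the saturation contrast, assuming NumPy (the sample inputs are illustrative):

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x): passes positives unchanged, zeros out negatives
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
assert np.array_equal(relu(x), np.array([0.0, 0.0, 0.0, 0.5, 2.0]))

# ReLU's gradient is 1 for any positive input, so it does not saturate.
# Sigmoid's gradient sigma(x) * (1 - sigma(x)) is at most 0.25 and
# shrinks toward 0 for large |x|, which slows learning in deep stacks.
grad_sigmoid = sigmoid(x) * (1.0 - sigmoid(x))
assert np.all(grad_sigmoid <= 0.25)
```

The `max` operation is also far cheaper than the exponentials required by sigmoid and tanh, which matters at the scale of modern vision models.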

For super-resolution and generation, output layers use tanh (range -1 to 1) or sigmoid (range 0 to 1) to constrain pixel values. The principle: ReLU variants for hidden layers, task-specific functions for outputs.
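As an illustration of why bounded output activations suit image generation, this sketch (assuming NumPy; the rescaling to 8-bit pixels is one common convention, not the only one) maps unbounded logits through tanh into a valid pixel range:

```python
import numpy as np

def tanh_to_pixels(z):
    # tanh squashes any real-valued logit into (-1, 1);
    # rescaling maps that interval onto the [0, 255] pixel range
    return (np.tanh(z) + 1.0) / 2.0 * 255.0

# Even extreme logits produce in-range pixel values
z = np.array([-10.0, 0.0, 10.0])
pixels = tanh_to_pixels(z)
assert pixels.min() >= 0.0 and pixels.max() <= 255.0
```

With an unbounded activation such as ReLU at the output, the model could emit values far outside the displayable range; the bounded function enforces the constraint by construction.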
