Detailed Explanation
Batch Normalization
Introduction
Batch Normalization is a technique used in training deep neural networks to improve their performance and stability. It addresses the issue of internal covariate shift by standardizing the inputs to each layer within a mini-batch.
How It Works
The process normalizes the output of each layer so that every feature has a mean of zero and a standard deviation of one, using statistics computed over the current mini-batch during training. Keeping the distribution of layer inputs consistent in this way allows faster convergence and reduces sensitivity to hyperparameter choices such as the learning rate and weight initialization.
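
To make the normalization step concrete, here is a minimal NumPy sketch of the per-feature computation on a single mini-batch. The function name batch_norm and the small epsilon added for numerical stability are illustrative choices, not taken from any particular library.

    import numpy as np

    def batch_norm(x, eps=1e-5):
        """Normalize a mini-batch x of shape (batch_size, num_features)
        so every feature has roughly zero mean and unit standard deviation."""
        mean = x.mean(axis=0)            # per-feature mean over the batch
        std = x.std(axis=0)              # per-feature standard deviation
        return (x - mean) / (std + eps)  # eps guards against division by zero

    batch = np.random.randn(8, 4) * 3.0 + 10.0  # 8 samples, 4 features
    normalized = batch_norm(batch)
    print(normalized.mean(axis=0))              # close to 0 for each feature
    print(normalized.std(axis=0))               # close to 1 for each feature

In a real network this normalization is followed by a learnable scale and shift, as described in the Implementation section below.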
Benefits
Batch Normalization offers several advantages: it allows higher learning rates, acts as a mild regularizer that can reduce overfitting, and makes saturating nonlinearities such as sigmoid and tanh easier to use by keeping their inputs away from the flat, saturated regions. Together, these effects improve training speed and model accuracy.
Implementation
- The technique is implemented by adding a Batch Norm layer before or after each activation function in a neural network.
- During training, the layer normalizes each feature independently across the mini-batch, then applies a learnable scale and shift so the network keeps its representational capacity (see the sketch after this list).
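
The following sketch, assuming PyTorch, shows one way to place a Batch Norm layer in a small fully connected network, here before the activation. The layer sizes (784, 256, 10) and batch size are illustrative only.

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.BatchNorm1d(256),  # normalizes each of the 256 features per mini-batch,
        nn.ReLU(),            # then applies a learnable scale (weight) and shift (bias)
        nn.Linear(256, 10),
    )

    x = torch.randn(32, 784)  # a mini-batch of 32 samples
    model.train()             # training mode: use mini-batch statistics
    y = model(x)
    print(y.shape)            # torch.Size([32, 10])

At evaluation time the layer switches to running averages of the training statistics (model.eval() in this framework), so predictions do not depend on the composition of the test batch.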
Conclusion
Overall, Batch Normalization has become a standard component of deep learning models, making it practical to train deeper, more robust networks.