What is Batch Normalization?
It is one way to perform regularization. Batch normalization normalizes activations using the mean and variance of the mini-batch, then scales and shifts the result using the parameters $\gamma$ and $\beta$. Both $\gamma$ and $\beta$ are learnable through backpropagation.
Now, let's walk through what happens in batch normalization, step by step.
Input : Values of $x$ over a mini-batch: $B = \{x_{1},\ \ldots,\ x_{m}\}$
Parameters to be learned : $\gamma,\ \beta$
Output : $\{y_{i} = BN_{\gamma,\beta}(x_{i})\}$
$\mu_{B} \leftarrow \frac{1}{m}\sum_{i=1}^{m}x_{i}$ : Mini-batch mean
$\sigma_{B}^{2} \leftarrow \frac{1}{m}\sum_{i=1}^{m}(x_{i}-\mu_{B})^{2}$ : Mini-batch variance
$\hat{x}_{i} \leftarrow \frac{x_{i}-\mu_{B}}{\sqrt{\sigma_{B}^{2}+\epsilon}}$ : Normalize ($\epsilon$ is a small constant added for numerical stability)
$y_{i} \leftarrow \gamma\hat{x}_{i} + \beta \equiv BN_{\gamma,\beta}(x_{i})$ : Scale and shift
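As a minimal sketch, the four steps above can be written as a forward pass in NumPy. The function name `batch_norm_forward`, the 2-D input shape `(m, d)`, and the example values below are illustrative assumptions, not part of the original algorithm:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch x of shape (m, d), feature-wise."""
    mu = x.mean(axis=0)                    # mini-batch mean
    var = x.var(axis=0)                    # mini-batch variance (1/m, matching the formula)
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize
    return gamma * x_hat + beta            # scale and shift

# Illustrative usage: m = 4 samples, d = 3 features
x = np.random.randn(4, 3)
gamma = np.ones(3)   # commonly initialized to 1 (identity scale)
beta = np.zeros(3)   # commonly initialized to 0 (no shift)
y = batch_norm_forward(x, gamma, beta)
```

In a real framework, $\gamma$ and $\beta$ would be trainable parameters updated through backpropagation, as noted above.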