How to use Batch Normalization with TensorFlow and tf.keras to train deep neural networks faster
Perhaps the most powerful tool for combating the vanishing and
exploding gradients issue is Batch Normalization. Batch Normalization
works like this: for each unit in a given layer, first compute the z-score,
and then apply a linear transformation using two trained variables γ
and β. Batch Normalization is typically done prior to the non-linear
activation function (see the figure below and the sketch that follows),
however applying it after the activation function can also be beneficial.
Check out this lecture for more detail on how the technique works.
During backpropagation, gradients tend to get smaller at lower layers, slowing down weight updates and thus training. Batch Normalization helps combat these so-called vanishing gradients.
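To make those two steps concrete, here is a minimal NumPy sketch of the training-time transform. The function name and the eps argument are illustrative assumptions; a real layer also tracks moving averages of the mean and variance for use at inference time.

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # x has shape (batch, units); gamma and beta are the trained
    # scale and shift variables. eps guards against division by zero.
    mean = x.mean(axis=0)                    # per-unit mean over the batch
    var = x.var(axis=0)                      # per-unit variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # the z-score
    return gamma * x_hat + beta              # learned linear transformation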
In TensorFlow, Batch Normalization can be implemented as an additional layer, and there are three implementations to choose from:
1. tf.keras.layers.BatchNormalization
2. tf.layers.batch_normalization
3. tf.nn.batch_normalization
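As a minimal sketch of the first option (the layer widths and the flattened MNIST input shape here are assumptions for illustration), the BatchNormalization layer is simply placed between a Dense layer and its activation:

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(100, input_shape=(784,)),   # linear transformation
    tf.keras.layers.BatchNormalization(),             # normalize before the non-linearity
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(10, activation='softmax'),  # probabilities for the 10 digits
])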
One subtlety with tf.layers.batch_normalization: it creates update ops for the moving mean and variance and places them in the tf.GraphKeys.UPDATE_OPS collection, so they must be run explicitly during training. You can inspect them like this:

extra_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
print(extra_ops)
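A common pattern in TF 1.x graph mode is to attach those update ops as a control dependency of the train op (a sketch: optimizer and loss are assumed to be defined elsewhere):

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = optimizer.minimize(loss)  # moving statistics update with each step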
Batch Normalization on MNIST
Below, I apply Batch Normalization to the prominent MNIST dataset
using TensorFlow. Check out the code here. MNIST is an easy dataset to
analyze and doesn’t require many layers to achieve low classification
error. However, we can still build a deep network and observe how
Batch Normalization affects convergence.
def train_and_evaluate(output_dir):
    # Each MNIST image arrives as a flattened 28x28 vector.
    features = [tf.feature_column.numeric_column(key='image', shape=(28 * 28,))]
    # The upper-case hyperparameter values are assumed to be defined
    # elsewhere in the module, e.g. parsed from command-line flags.
    classifier = tf.estimator.Estimator(model_fn=dnn_custom_estimator,
                                        model_dir=output_dir,
                                        params={'features': features,
                                                'batch_norm': USE_BATCH_NORMALIZATION,
                                                'activation': ACTIVATION,
                                                'hidden_units': HIDDEN_UNITS,
                                                'learning_rate': LEARNING_RATE})

    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn,
                                        max_steps=NUM_STEPS)
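A typical way to complete this function with the Estimator API is to pair the TrainSpec with an EvalSpec and hand both to tf.estimator.train_and_evaluate (a sketch: eval_input_fn is assumed to be defined analogously to train_input_fn):

    eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn)
    tf.estimator.train_and_evaluate(classifier, train_spec, eval_spec)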
The networks are then trained as separate Cloud ML Engine jobs, launched from a small shell script; $net holds the hidden-layer sizes and $batchNorm toggles the --use_batch_normalization flag:

# define ml-engine function
submitMLEngineJob() {
    gcloud ml-engine jobs submit training $JOBNAME \
        --package-path=$(pwd)/mnist_classifier/trainer \
        --module-name trainer.task \
        --region $REGION \
        --staging-bucket=gs://$BUCKET \
        --scale-tier=BASIC \
        --runtime-version=1.4 \
        -- \
        --outdir $OUTDIR \
        --hidden_units $net \
        --num_steps 1000 \
        $batchNorm
}

# launch jobs in parallel
export PYTHONPATH=${PYTHONPATH}:${PWD}/mnist_classifier
for batchNorm in '' '--use_batch_normalization'
do
    net=''
    for layer in 500 400 300 200 100 50 25;
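The inner loop body is cut off above; as a hedged sketch of the intent (the exact variable handling in the original script may differ), each pass appends one more layer width to $net and submits a progressively deeper network:

    do
        if [ -z "$net" ]; then net=$layer; else net=$net,$layer; fi
        submitMLEngineJob
    done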