[TensorFlow 101] What does it mean to reduce an axis?
In TensorFlow code, you may have seen “reduce_*” many times. When I first used tf.reduce_sum, I thought: if it’s a sum, just say sum! Why put the prefix “reduce_” in front of every command?
Soon I realized it was because every time you take a sum, max, mean, etc., you inherently reduce the dimension of the tensor. For example, if you have five numbers [1,2,3,4,5] and you sum them, you get a single number, 15: a rank-1 tensor collapses into a rank-0 scalar.
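You can watch the rank drop in a two-liner (a minimal sketch using the TF 1.x Session API that the rest of this post is written in):

import tensorflow as tf

x = tf.constant([1, 2, 3, 4, 5])  # shape (5,), rank 1
total = tf.reduce_sum(x)          # shape (),  rank 0: the only axis is gone

with tf.Session() as sess:
    print(sess.run(total))  # 15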
Let’s look at a real-life example from the TensorFlow repository.
def _scale_l2(x, norm_length):
    alpha = tf.reduce_max(tf.abs(x), (1, 2), keep_dims=True) + 1e-12
    l2_norm = alpha * tf.sqrt(
        tf.reduce_sum(tf.pow(x / alpha, 2), (1, 2), keep_dims=True) + 1e-6)
    x_unit = x / l2_norm
    return norm_length * x_unit
It took some time for me to be able to quickly visualize which dimensions are getting reduced and to see exactly how the dimensions in reduce_max and reduce_sum behave. I decided to write this post because many people struggle with juggling dimensions/axes, yet the TensorFlow tutorials assume you are already familiar with this way of thinking. Having a good command of tensor shapes/dimensions/indexing will save you a lot of debugging later.
So, what does (1,2) mean in the above lines?
To understand how TensorFlow treats dimensions, first read my blog post Numpy Sum Axis Intuition, because axes in NumPy, TensorFlow, and PyTorch all behave in the same way.
TL;DR: the way to understand the “axis” argument in NumPy/TensorFlow is that it collapses the specified axis.
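Since axes behave identically across the three libraries, a minimal NumPy sketch is enough to see the collapsing in action:

import numpy as np

x = np.ones((3, 2, 5))           # shape (3, 2, 5)
print(x.sum(axis=0).shape)       # (2, 5): axis 0 collapsed
print(x.sum(axis=1).shape)       # (3, 5): axis 1 collapsed
print(x.sum(axis=(1, 2)).shape)  # (3,):   axes 1 and 2 both collapsed

That is exactly what (1, 2) does inside _scale_l2: it collapses the time and dimension axes, leaving one max (or one sum) per example in the batch, and keep_dims=True keeps the collapsed axes around as size-1 placeholders so the result still broadcasts against x.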
In deep learning models, the shape of the tensor is usually (batch_size, time_steps, dimensions).
Let’s say we have a tensor of shape (3, 2, 5).
# Let's initialize the tensor.
In [3]: x = tf.constant([[[1,2,3,4,5], [4,5,6,7,8]]…
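The constant is truncated above, so here is a hypothetical completion: only the first two rows come from the original snippet, and the remaining values are made up purely so the shapes can be checked end to end.

import tensorflow as tf

# First two rows are from the snippet above; the rest are made-up filler.
x = tf.constant([[[1, 2, 3, 4, 5], [4, 5, 6, 7, 8]],
                 [[1, 1, 1, 1, 1], [2, 2, 2, 2, 2]],
                 [[3, 3, 3, 3, 3], [4, 4, 4, 4, 4]]])

with tf.Session() as sess:
    print(x.shape)                                                    # (3, 2, 5)
    print(sess.run(tf.reduce_sum(x, (1, 2))).shape)                   # (3,)
    print(sess.run(tf.reduce_sum(x, (1, 2), keep_dims=True)).shape)   # (3, 1, 1)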