I have wanted to jump on the Deep Learning bandwagon for quite some time. As an avid R user I have been very interested in using MXNET, but I have found it very troublesome to couple the program with my GPU. Instead, I decided to tackle installing TensorFlow on my Windows machine. This was fairly easy. All that I had to do was:

  1. Download/Install Anaconda 3
  2. Create an environment (which I named tensorflow)
  3. Install TensorFlow & Jupyter into that environment
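For reference, the three steps above looked roughly like this at the Anaconda Prompt. The Python version and package names here are a sketch from memory, so check the current install docs rather than copying blindly:

```shell
# 1. (after downloading and installing Anaconda 3) create an environment
conda create --name tensorflow python=3.5

# 2. activate it -- note there is no 'source' prefix on Windows
activate tensorflow

# 3. install TensorFlow and Jupyter into the environment
pip install tensorflow jupyter
```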

I had some trouble at first activating the environment, but then learned that the problem was not with my setup but with Windows PowerShell. There were also a few warning messages, but I was able to work through a simple tutorial without any trouble. This tutorial can be found here.

The first step is to start up the environment, which is accomplished by simply typing

activate tensorflow

because ‘source’ is only relevant to Mac/Linux. Then, within the environment, I fired up Jupyter with

jupyter notebook 

because ‘ipython’ has been deprecated and will give you strange results. Finally, I began writing a new notebook based on the tutorial. The first step in the notebook is to import TensorFlow!

import tensorflow as tf

I ran the following code to test that TensorFlow imported successfully.

hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

a = tf.constant(10)
b = tf.constant(32)
print(sess.run(a+b))

But now we need data. The MNIST dataset is an industry standard.

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

The data downloaded without a hitch:

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
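The one_hot=True flag means each digit label arrives as a length-10 vector with a single 1 in the position of the correct class. A rough numpy sketch of the idea (my own helper, not part of TensorFlow):

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn integer labels into rows with a 1 at the label's index."""
    encoded = np.zeros((len(labels), num_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

# the digit 5 becomes a row with a 1 in position 5, zeros elsewhere
print(one_hot([5, 0]))
```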

What I find very interesting, and which only made sense to me after I began to understand how backpropagation works, is that TensorFlow is organized as a computational graph. So, instead of just defining a bunch of variables and executing them sequentially, a user defines tensor ‘nodes’ which are then connected together. The entire computational graph is then run with a call to sess.run(). To get started, we first define some placeholder variables.
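To make the define-then-run idea concrete, here is a toy sketch in plain Python (my own illustration, nothing to do with TensorFlow's actual internals): nodes are defined up front, and no arithmetic happens until the whole graph is run.

```python
class Node:
    """A tiny deferred-computation node: defining it does no work."""
    def __init__(self, fn, *parents):
        self.fn = fn
        self.parents = parents

    def run(self):
        # computation only happens here, pulling values from parent nodes
        return self.fn(*(p.run() for p in self.parents))

a = Node(lambda: 10)
b = Node(lambda: 32)
total = Node(lambda x, y: x + y, a, b)  # the graph: total depends on a and b
print(total.run())  # → 42
```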

## x is our data
x = tf.placeholder(tf.float32,[None, 784])

## W holds our weights and b is our bias (intercept)
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
## logits are the raw scores; y is the generated output
logits = tf.matmul(x, W) + b
y = tf.nn.softmax(logits)

Now we define a ‘placeholder’ for our actual output, so that we can compare it against the output we generated. The comparison is accomplished by the softmax_cross_entropy_with_logits function, which expects the raw logits rather than the softmax output (it applies softmax internally, which is more numerically stable).

## Define another placeholder for the correct y
y_ = tf.placeholder(tf.float32, [None, 10])

## cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
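To see what those two functions are doing numerically, here is a hand-rolled numpy version of softmax and cross-entropy. This is a sketch of the math, not TensorFlow's implementation:

```python
import numpy as np

def softmax(z):
    # subtract the row max for numerical stability before exponentiating
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(one_hot_labels, probs):
    # average of -log(probability assigned to the correct class)
    return -np.mean(np.sum(one_hot_labels * np.log(probs), axis=-1))

scores = np.array([[2.0, 1.0, 0.1]])   # made-up logits for one example
probs = softmax(scores)
labels = np.array([[1.0, 0.0, 0.0]])   # the correct class is the first one
print(probs.sum())                     # probabilities sum to 1
print(cross_entropy(labels, probs))    # small when the right class gets high probability
```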

What we want to do is minimize this cross entropy, which is done with the optimizer’s minimize() method.

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
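GradientDescentOptimizer(0.5) just means: repeatedly nudge every variable against its gradient, scaled by the learning rate. The same idea on a one-variable toy problem of my own (note the smaller learning rate here, chosen for the toy function):

```python
# minimize f(w) = (w - 3)^2 by hand with plain gradient descent
w = 0.0
learning_rate = 0.1
for _ in range(100):
    grad = 2 * (w - 3)        # derivative of (w - 3)^2
    w -= learning_rate * grad  # step against the gradient
print(w)  # converges to the minimum at w = 3
```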

And…apparently there is some additional initializing that needs to be done.

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

Graph initialized!

Loop through training

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

You will notice that ‘i’ is not actually used by sess.run() or mnist.train.next_batch(). Apparently next_batch() keeps track of where it is in the training set automatically. Interesting.
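Presumably next_batch() does this by keeping an internal cursor into the data. A toy version of that behavior (my own sketch; the real one also shuffles the data and wraps around between epochs):

```python
class Batcher:
    def __init__(self, data):
        self.data = data
        self.pos = 0  # internal cursor: the caller never passes an index

    def next_batch(self, n):
        batch = self.data[self.pos:self.pos + n]
        self.pos += n
        return batch

b = Batcher(list(range(10)))
print(b.next_batch(4))  # [0, 1, 2, 3]
print(b.next_batch(4))  # [4, 5, 6, 7]
```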

Determine Accuracy

After training the model, we need to find out how well it performs on new data. So, we add additional nodes to our existing computational graph: tf.equal() checks whether the predicted class matches the true class, and tf.reduce_mean() averages those results into an accuracy. We then feed a dictionary of test images and labels into it, which fills in x and y_, and the accuracy is calculated and printed. You can see that the accuracy is a little less than 91%. How cool!

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
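The accuracy calculation itself is simple enough to reproduce in numpy. Here is a made-up three-example illustration of the same argmax comparison:

```python
import numpy as np

preds = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])  # softmax outputs
labels = np.array([[0, 1], [1, 0], [1, 0]])             # one-hot truth
# tf.equal on the argmaxes, then tf.reduce_mean on the cast to float
correct = np.argmax(preds, axis=1) == np.argmax(labels, axis=1)
accuracy = correct.astype(np.float32).mean()
print(accuracy)  # two of the three predictions match
```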