
 

Introduction

 

 

Why do we need TensorFlow? Why are people crazy about it? In a way, it offers lazy computing and flexibility in the way you run your code. What is this thing with flexibility and lazy computing? We are glad you asked!

Lazy Computing: TensorFlow is a way of representing computation without actually performing it until asked. The first step in learning TensorFlow is to understand its key feature, the "computational graph" approach. Basically, all TensorFlow code contains two important parts:

Part 1: building the GRAPH, which represents the data flow of the computations

Part 2: running a SESSION, which executes the operations in the graph

In the first step, you create the graph, i.e. what you want to do with the data; then you run it separately using a session (don't struggle to wrap your head around it, it will come to you eventually).

Flexibility: When you create a graph, you are not bound to run the whole graph; you can control which parts of the graph are executed separately. This provides huge flexibility with your models.

Bonus: One of the biggest advantages of TensorFlow is its visualization of the computational graph. It's called TensorBoard and will be discussed in a future tutorial. Now that we have discussed the what and why of TensorFlow, let's dive into the actual thing.

TensorFlow separates the definition of computations from their execution. These two parts are explained in more detail in the following sections. Before that, remember that the first step is to import the TensorFlow library!

In [6]:
import tensorflow as tf
 
 

This gives Python access to all of TensorFlow's classes, methods, and symbols. With this command, the TensorFlow library is imported under the alias tf, so that later we can write tf instead of typing out tensorflow each time.

What is a Tensor? TensorFlow programs use a data structure called a tensor to represent all of their data. Any type of data you plan to use for your model can be stored in Tensors. Simply put, a Tensor is a multi-dimensional array (0-D tensor: scalar, 1-D tensor: vector, 2-D tensor: matrix, and so on). Hence, the name TensorFlow simply refers to the flow of Tensors through the computational graph.

Fig1. A sample computational graph in TensorFlow (Source: TensorFlow website)
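To make the rank terminology concrete, here is a minimal sketch (the values are arbitrary illustrations; tf.constant(), used here to create the Tensors, is covered in detail in the next tutorial):

import tensorflow as tf

scalar = tf.constant(7)                   # 0-D tensor (scalar)
vector = tf.constant([1, 2, 3])           # 1-D tensor (vector)
matrix = tf.constant([[1, 2], [3, 4]])    # 2-D tensor (matrix)

# shapes are known at graph-construction time, no session needed
print(scalar.shape)   # ()
print(vector.shape)   # (3,)
print(matrix.shape)   # (2, 2)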

 

GRAPH

The biggest idea about TensorFlow is that all the numerical computations are expressed as a computational graph. In other words, the backbone of any TensorFlow program is a Graph. Anything that happens in your model is represented by the computational graph. This makes it the go-to place for anything related to your model. Quoted from the TensorFlow website, "A computational graph (or graph in short) is a series of TensorFlow operations arranged into a graph of nodes". Basically, it means a graph is just an arrangement of nodes that represent the operations in your model.

So first, let's see what nodes and operations mean. The best way to explain is by looking at a simple example. Suppose we want to write the code for the function $f(x,y)=x^2y+y+2$. The graph in TensorFlow will be something like:

Fig2. Schematic of the constructed computational graph in TensorFlow

As the image above shows, the graph is composed of a series of nodes connected to each other by edges. Each node in the graph is called an op (short for operation). So we'll have one node for each operation, either for operations on tensors (like math operations) or for generating tensors (like variables and constants). Each node takes zero or more tensors as inputs and produces a tensor as an output.
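As a rough sketch (the input values 3 and 4 and the node names below are our own choices for illustration, not prescribed by the figure), the graph of Fig2 could be built like this; evaluating it requires a session, which is discussed later:

import tensorflow as tf

# build the graph: one node per operation
x = tf.constant(3, name='x')
y = tf.constant(4, name='y')
x_squared = tf.multiply(x, x, name='x_squared')    # x^2
term = tf.multiply(x_squared, y, name='term')      # x^2 * y
f = tf.add(tf.add(term, y), 2, name='f')           # x^2*y + y + 2

# printing f shows only the Tensor's metadata, not its value
print(f)   # Tensor("f:0", shape=(), dtype=int32)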

Now let's build a simple computational graph.

Example 1:

Let's start with a basic arithmetic operation like addition to demonstrate a graph. The code adds two values, say a=2 and b=3, using TensorFlow. To do so, we need to call tf.add(). From here on, we recommend checking out the documentation of each method/class to get a clear idea of what it can do (the documentation can be found at tensorflow.org, or you can just use Google to get to the required page). tf.add() has three arguments, 'x', 'y', and 'name', where x and y are the values to be added together and name is the operation name, i.e. the name associated with the addition node in the graph.

If we call the operation "Add", the code will be as follows:

In [2]:
import tensorflow as tf
a = 2
b = 3
c = tf.add(a, b, name='Add')
print(c)
 
 
Tensor("Add:0", shape=(), dtype=int32)
 
 

The generated graph and variables are:

*Note: The graph is generated using Tensorboard. As discussed earlier, it is a visualization tool for the graph and will be discussed in detail in a future tutorial.

Fig3. Left: generated graph visualized in Tensorboard, Right: generated variables (screenshot captured from PyCharm debugger when running in debug mode)

This code creates two input nodes (for the inputs a=2 and b=3) and one output node for the addition operation (named Add). When we print out the variable c (i.e. the output Tensor of the addition operation), it prints out the Tensor's information: its name (Add), shape (() means scalar), and type (32-bit integer). However, it does not spit out the result (2+3=5). Why?!

Remember earlier in this post, we talked about the two parts of a TensorFlow code. The first step is to create a graph; to actually evaluate the nodes, we must run the computational graph within a Session. In simple words, the written code only generates the graph, which determines the expected sizes of the Tensors and the operations to be executed on them. It doesn't assign a numeric value to any of the Tensors, i.e. TensorFlow does not execute the graph unless it is told to do so with a session. Hence, to assign these values and make them flow through the graph, we need to create and run a session.

Therefore, a TensorFlow Graph is something like a function definition in Python. It WILL NOT do any computation for you (just like a function definition does not produce any result by itself). It ONLY defines computation operations.
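Along the same lines, a Tensor's metadata (name, shape, and data type) is available as soon as the graph is defined; only its value requires a session. A quick sketch, reusing the c from the example above:

# no session needed: these are graph-time attributes, not computed values
print(c.name)    # Add:0  (name of the producing op, plus its output index)
print(c.shape)   # ()     (a scalar)
print(c.dtype)   # <dtype: 'int32'>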

 

Session

To compute anything, a graph must be launched in a session. Technically, a session places the graph ops onto hardware such as CPUs or GPUs and provides methods to execute them. In our example, to run the graph and get the value of c, the following code creates a session and executes the graph by running 'c':

In [3]:
sess = tf.Session()
print(sess.run(c))
sess.close()
 
 
5
 
 

This code creates a Session object (assigned to sess) and then (in the second line) invokes its run method to run enough of the computational graph to evaluate c. This means it runs only the part of the graph that is necessary to get the value of c (remember the flexibility of using TensorFlow? In this simple example, it happens to run the whole graph). Remember to close the session when you are done with it; that is done using the last line in the above code.

The following code does the same thing and is more commonly used. The only difference is that there is no need to close the session at the end as it gets closed automatically.

In [4]:
with tf.Session() as sess:
    print(sess.run(c))
 
 
5
 
 

Now let's look at the created graph one more time. Do you notice anything weird?

Fig4. The generated graph visualized by Tensorboard

Exactly! What are x and y?! Where did these two things come from? We didn't define any x or y variables!

Well... To explain clearly, let's make up two terms: "Python-name" and "TensorFlow-name". In this piece of code, we generated 3 variables (look at the right panel of Fig. 3) with the Python-names a, b, and c. Here, a and b are Python variables and thus have no TensorFlow-name, while c is a Tensor with Add as its TensorFlow-name.

Clear? Okay, let's get back to our question: what are x and y then?

In an ideal TensorFlow case, tf.add() receives two Tensors with defined TensorFlow-names as inputs (these names are separate from the Python-names). For example, by writing c = tf.add(a, b, name='Add'), we're actually creating a Tensor with c as its Python-name and Add as its TensorFlow-name.

In the above code, however, we passed two Python variables (a=2 and b=3) which only have Python-names (a and b) and no TensorFlow-names. TensorFlow uses the TensorFlow-names for visualizing the graphs. Since a and b have no TensorFlow-names, it uses some default names, x and y.

*Note: This name mismatch can easily be solved by using tf.constant() for creating the input variables as Tensors instead of simply using Python variables (a=2, b=3). This is explained thoroughly in the next tutorial where we talk about TensorFlow DataTypes.
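As a small preview of that fix (tf.constant() itself is explained in the next tutorial), here is a minimal sketch of the same addition with explicitly named Tensor inputs:

import tensorflow as tf

# create the inputs as Tensors with explicit TensorFlow-names
a = tf.constant(2, name='a')
b = tf.constant(3, name='b')
c = tf.add(a, b, name='Add')

with tf.Session() as sess:
    print(sess.run(c))   # 5

With this version, Tensorboard would label the input nodes a and b instead of the default x and y.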

For now, we'll continue using Python variables and change the Python variable names a and b into x and y to solve the name mismatch temporarily.

Now let's look at a more complicated example.

Example 2:

Creating a graph with multiple math operations

In [5]:
import tensorflow as tf
x = 2
y = 3
add_op = tf.add(x, y, name='Add')
mul_op = tf.multiply(x, y, name='Multiply')
pow_op = tf.pow(add_op, mul_op, name='Power')
useless_op = tf.multiply(x, add_op, name='Useless')

with tf.Session() as sess:
    pow_out, useless_out = sess.run([pow_op, useless_op])
 
 

The created graph and the defined variables (Tensors and Python variables) are:

Fig5. Left: generated graph visualized in Tensorboard, Right: generated variables (screenshot captured from PyCharm debugger when running in debug mode)

I called one of the operations useless_op because its output is not used by other operations. Let's talk about an IMPORTANT point. Given this graph, if we fetch the pow_op operation, it will first run add_op and mul_op to get their output tensors and then run pow_op on them to compute the required output value. In other words, useless_op will not be executed, as its output tensor is not used in executing the pow_op operation.

This is one of the advantages of defining a graph and running a session on it! It lets us run only the required operations of the graph and skip the rest (remember flexibility). This saves a significant amount of time when dealing with huge networks with hundreds or thousands of operations.

In the above code, however, in the defined session we're fetching the values of two tensors (i.e. the output tensors of pow_op and useless_op) at the same time. This runs the whole graph to get the required output tensors.
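To see the difference, here is a minimal sketch reusing the ops from Example 2: fetching only pow_op skips useless_op, while fetching both runs the whole graph:

with tf.Session() as sess:
    # only add_op, mul_op, and pow_op are executed; useless_op is skipped
    print(sess.run(pow_op))                  # (2+3)**(2*3) = 15625
    # fetching both tensors forces useless_op to run as well
    print(sess.run([pow_op, useless_op]))    # [15625, 10]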

I hope this post has helped you understand the concepts of Graph and Session in TensorFlow. Thank you so much for reading! If you have any questions, feel free to leave a comment on our webpage. You can also send us feedback through the contacts page.

 

© 2018 Easy-TensorFlow team. All Rights Reserved.