Visualize Machine Learning metrics with Tensorflow and Tensorboard
It can be very time-consuming to read individual results numerically, particularly if the model has been training for many epochs.
Tensorboard can offer a remedy for this. It is a convenient and graphical tool for:
- Visually tracking model performance in real time
- Reviewing historical model performances
- Comparing the performance of different model architectures and hyperparameter settings trained on the same data
In this post I want to show how Tensorboard lets you visually track your model’s cost (loss) and accuracy (acc), epoch by epoch, on both your training data and your validation (val) data.
Simple computational graph in Tensorboard
But let’s start with the basic framework Tensorboard needs in order to visualize a computational graph: the Tensorflow code.
Every Tensorflow program contains two essential elements:
- build a graph that represents the data flow of the computation
- run a session that executes the computation from the graph and provides the corresponding resources (CPU, GPU, TPU).
What does such a graph look like?
A simple example:
f(x,y) = x * y + 10
The computational graph for the above function looks like this in Tensorboard:
One can take a closer look at the elements by double clicking and opening them:
The example can be created, e.g. in a Jupyter notebook, with the following code. I am using Tensorflow V2, which is why the compatibility namespace (compat.v1) is used. If you are using V1, just drop the "compat.v1." part:
import tensorflow as tf
import datetime

# in TF2, graph-mode session code needs eager execution switched off
tf.compat.v1.disable_eager_execution()

LOGDIR = '../logger/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S') + '-computational_graph'

with tf.compat.v1.Session() as sess:
    # Build a graph.
    x = tf.compat.v1.Variable(5, name='x')
    y = tf.compat.v1.Variable(10, name='y')
    ten = tf.constant(10, name='ten')
    f = tf.add(tf.multiply(x, y, 'multiply'), ten)
Before anything can be logged, a FileWriter pointing to the log directory must be created, and the variables must be initialized (the following lines still belong inside the with block):
    writer = tf.compat.v1.summary.FileWriter(LOGDIR, sess.graph)
    init = tf.compat.v1.global_variables_initializer()
    sess.run(init)
Now the computation can be executed in the session; the computational graph has been written to the LOGDIR directory by the FileWriter.
    result = sess.run(f)
    print(result)  # --> 60
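For comparison, here is a minimal sketch of how the same graph could be logged natively in Tensorflow V2, without the compat.v1 session, using tf.function together with the tf.summary tracing API (run it in a fresh session without the eager-execution switch above; the function name and log directory suffix are just illustrative):

import tensorflow as tf
import datetime

LOGDIR = '../logger/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S') + '-tf2_graph'
writer = tf.summary.create_file_writer(LOGDIR)

@tf.function
def f(x, y):
    return x * y + 10

tf.summary.trace_on(graph=True)                # start recording the graph
result = f(tf.constant(5), tf.constant(10))    # the graph is traced on the first call
with writer.as_default():
    tf.summary.trace_export(name='f_graph', step=0)

print(result.numpy())  # --> 60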
The dashboard is started with tensorboard --logdir=LOGDIR (with the actual log directory, e.g. ../logger) in any console or terminal program. Then the localhost (127.0.0.1) can be called via the browser on port 6006 and the graph can be displayed.
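If you work directly in a Jupyter notebook, the dashboard can alternatively be embedded in the notebook itself via the TensorBoard notebook extension (assuming a Tensorflow V2 installation):

%load_ext tensorboard
%tensorboard --logdir ../logger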
Graphs for accuracy and loss in Tensorboard
Let’s move on to a workflow in which the data has already been prepared in a previous step.
The data may have been extracted from a CSV file, for example; for what follows this does not matter, so the preprocessing part is skipped here. The only important point is that the data already contains the features Xn and the corresponding labels Yn. A hypothetical sketch of such a step is shown below.
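Just to make the interface concrete, a hypothetical PreProcessing function could, for example, read such a CSV file with pandas and return ten feature columns as x and a label column as y (the column names here are purely illustrative and not taken from the original project):

import pandas as pd

def PreProcessing(rawfile):
    # purely illustrative: a CSV with ten feature columns and one 'label' column
    df = pd.read_csv(rawfile)
    x = df.drop(columns=['label']).values   # features Xn, shape (n_samples, 10)
    y = df['label'].values                  # labels Yn
    return x, y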
The main program could look like this:
# Main program
x, y = PreProcessing(rawfile)
ExecuteNeuralNetwork(x, y)
Let’s dive into the details of the ExecuteNeuralNetwork method:
def ExecuteNeuralNetwork(x, y):
    print('Running Neural Network')
    import tensorflow as tf
    from tensorflow.keras.callbacks import TensorBoard
    import datetime
    from sklearn.model_selection import train_test_split

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

    # some hyperparameters
    modelActivation = 'tanh'                    # relu, tanh
    compileOptimizer = 'adam'                   # adam, sgd
    loss = 'sparse_categorical_crossentropy'    # sparse_categorical_crossentropy, mean_squared_error

    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(10,)),
        tf.keras.layers.Dense(128, activation=modelActivation),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='sigmoid')    # sigmoid, softmax
    ])

    print('Running Compile...')
    model.compile(optimizer=compileOptimizer,
                  loss=loss,
                  metrics=['accuracy'])
    print('Running compile done')

    # define LOGDIR with timestamp and some important hyperparameters
    LOGDIR = ('logger/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
              + '-' + compileOptimizer + '-' + modelActivation + '-' + loss)
    tensorboard_callback = TensorBoard(log_dir=LOGDIR)

    print('Processing Epochs...')
    model.fit(x_train, y_train,
              epochs=20,
              validation_data=(x_test, y_test),
              callbacks=[tensorboard_callback])
    print('Processing Epochs done')
Here we see some important steps and hyperparameters that are essential for training. First, the data is separated into a training and a test portion with scikit-learn’s train_test_split function and the corresponding test_size parameter. The data is shuffled by default; setting random_state to e.g. 42 simply makes that shuffle reproducible. If shuffling were switched off with shuffle=False, the list would be used as it is, i.e. the first 80% would be processed as training data and the remaining 20% as test data, which can sometimes have an unfavourable effect on the result (see the short sketch below).
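A short sketch of the difference (random_state only fixes the seed of the shuffle that scikit-learn performs by default, while shuffle=False switches shuffling off entirely):

from sklearn.model_selection import train_test_split

# reproducible shuffle: the same random split on every run
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# no shuffling at all: the first 80% become training data, the last 20% test data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, shuffle=False)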
In our example, a neural network is used as the algorithm (the Sequential model definition). Of course, other methods such as a RandomForest or a DecisionTreeClassifier are also conceivable. Here, one must proceed according to the objectives and prepare the choice very carefully. However, that is not the subject of this post; you can find examples for the mentioned algorithms in my GitHub repository.
The only important thing for our consideration is that the results of the training end up in the log directory defined in our TensorBoard callback, which is passed to model.fit via the callbacks parameter. It is helpful that every run is time-stamped and concatenated with the most important hyperparameter strings: the resulting subfolders in the log directory can then be used for the historical display of the runs in the Tensorboard dashboard. A possible way to organise such comparison runs is sketched below.
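As a sketch of how such comparison runs could be organised (assuming the training code above were wrapped in a function that takes the optimizer and activation as parameters, which is not shown in this post), every combination would get its own hyperparameter-tagged log subfolder:

import datetime
import itertools
from tensorflow.keras.callbacks import TensorBoard

for optimizer, activation in itertools.product(['adam', 'sgd'], ['relu', 'tanh']):
    logdir = ('logger/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
              + '-' + optimizer + '-' + activation)
    tensorboard_callback = TensorBoard(log_dir=logdir)
    # build and fit the model as in ExecuteNeuralNetwork above,
    # passing the optimizer/activation and callbacks=[tensorboard_callback]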
In order to also get the validation data metrics, one must pass the test data to model.fit via the validation_data parameter. They will then appear as separate curves in the Tensorboard dashboard.
The results of the different settings can even be observed during the training run. To do this, simply refresh the web browser from time to time. You can also filter the results by selecting the appropriate runs on the left-hand side of the dashboard.
The architecture of the neural network, or of the underlying algorithm, can be viewed in the GRAPHS tab, just like we already did with the simple graph at the beginning of this article:
Finally, let’s look at the two metrics we recorded during our training:
Accuracy
Accuracy is a method of measuring the performance of a classification model. It is usually expressed as a percentage and is easier to interpret than loss. For a single sample, the outcome is binary: the predicted value either equals the actual value or it does not. Over the whole data set, accuracy is then the fraction of predictions where the predicted value equals the actual value.
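A small numerical example of this definition (the predictions are made up, not taken from the training run above):

import numpy as np

y_true = np.array([1, 0, 2, 1])   # actual classes
y_pred = np.array([1, 0, 1, 1])   # predicted classes
accuracy = np.mean(y_true == y_pred)
print(accuracy)  # 3 of 4 predictions are correct --> 0.75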
As already mentioned above, accuracy can be plotted and monitored in Tensorboard while the training is still running, although the value is usually associated with the overall or final model accuracy.
Loss
A loss function, also called a cost function, takes into account the probabilities of a prediction. It is based on how much the prediction deviates from the true value. This gives us a more sophisticated view of how well the model is performing.
Unlike accuracy, loss is not a percentage – it is a summation of the errors made for each sample in the training or validation sets. Loss can be used in the training process to find the "best" parameter values for the model (e.g. weights in a neural network). During the training process, the goal is to minimize this value and obtain a falling curve.
The most common loss functions are the logarithmic loss and the cross-entropy loss (which amount to the same calculation for predicted probabilities between 0 and 1), as well as the mean squared error and the likelihood loss.
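As a small illustration of the cross-entropy loss used in the training code above (sparse_categorical_crossentropy): for a single sample it is the negative logarithm of the probability the model assigned to the true class (the probabilities below are made up):

import numpy as np

p = np.array([0.1, 0.7, 0.2])   # predicted probabilities for 3 classes
true_class = 1
loss = -np.log(p[true_class])
print(round(loss, 3))   # 0.357 -> the more confident the correct prediction, the smaller the loss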
Unlike accuracy, loss can be used in both classification and regression problems.
Most often, one observes that accuracy increases as loss decreases, but this is not always the case. Accuracy and loss have different definitions and measure different things. At first glance they often appear to be inversely proportional, but there is no direct mathematical relationship between the two metrics. In our example, the two runs are almost identical, so one cannot really speak of an advantage for one or the other hyperparameter setting. Of course, this can look quite different for other data sets.
All in all, one can say that Tensorboard and its various overviews provide a tool that greatly simplifies the evaluation of machine learning algorithms and their metrics. I hope this post gave you a better understanding of how it works. Have fun exploring further.
You can also find more details on LinkedIn.