|
|
|
|
|
|
|
### Overview
|
|
|
|
This tutorial will guide you through the steps needed to export a C++ code implementation of a
|
|
|
|
trained TensorFlow model. We will use the standard MNIST classification task for this since
|
|
|
|
a functioning model can be quickly trained using a very small dataset. This tutorial will
|
|
|
|
cover exporting the model from Python, building it into a simple C++ project, and finally
|
|
|
|
compiling and executing this code.
|
|
|
|
|
|
|
|
### Building and Training the Model
|
|
|
|
We will use the convolutional MNIST example for this tutorial, which can be found in the
|
|
|
|
`tutorials/mnist_conv` directory. The script `mnist_train_and_export.py` will be described step
|
|
|
|
by step, showing how the TFMin library has been integrated with this simple example, adapted
|
|
|
|
from TensorFlow's own tutorials.
|
|
|
|
|
|
|
|
**Note**: This code will attempt to download the MNIST training dataset the first time it is run, so an internet
|
|
|
|
connection is required.
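
For reference, examples adapted from TensorFlow's tutorials of this era usually load MNIST with the `input_data` helper, which downloads the files on first use. The call below is shown only for context and as an assumption; the exact loading code in `mnist_train_and_export.py` may differ.

```python
# Context only: the historical TensorFlow 1.x MNIST helper downloads the
# dataset into the given directory the first time it is called.
from tensorflow.examples.tutorials.mnist import input_data

mnist_data = input_data.read_data_sets("MNIST_data/", one_hot=True)
```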
|
|
|
|
|
|
|
|
The flowgraph created has a variable batch size, which is essential for it to be
|
|
|
|
exported correctly by TFMin. TFMin generates code for a single inference step from a flowgraph
|
|
|
|
which processes batches; this requires the number of dimensions of the tensors processed by the
|
|
|
|
network to be reduced by one. The library assumes that the batch dimension will be undefined, i.e.
|
|
|
|
has a value equal to `tf.Dimension(None)`, allowing it to be automatically detected and reduced.
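
In practice this just means declaring the input placeholder with `None` as its first dimension, matching the `(?, 784)` input tensor shown in the graph printout later in this tutorial. A minimal sketch:

```python
import tensorflow as tf

# Batch dimension left undefined (None) so TFMin can detect and reduce it;
# 784 = 28 x 28 flattened MNIST pixels.
x = tf.placeholder(tf.float32, shape=[None, 784], name='x-input')
```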
|
|
|
|
|
|
|
|
### Creating the Exporter Object
|
|
|
|
Now that our flowgraph contains a trained network model with a variable batch size, it's ready to be
|
|
|
|
exported by TFMin. First, the library itself needs to be imported; this line can be found at the
|
|
|
|
top of the mnist_conv example.
|
|
|
|
```python
|
|
|
|
from tf_min import exporter as tfm_ex
|
|
|
|
```
|
|
|
|
Constructing the Exporter object requires two parameters: the [tf.Session](https://www.tensorflow.org/api_docs/python/tf/Session) object within which the flowgraph
|
|
|
|
is being executed, and a list of output tensors that the inference operation will generate.
|
|
|
|
The exporter uses this list of output tensors to trace back through the flowgraph and work out which operations are
|
|
|
|
needed for inference and which can be ignored (training ops, introspection ops, etc.).
|
|
|
|
```python
|
|
|
|
c_exporter = tfm_ex.Exporter(sess, ['layer2/activation:0'])
|
|
|
|
```
|
|
|
|
This line creates an instance of the Exporter object using the `tf.Session` object `sess` and defines
|
|
|
|
a single output tensor, the first (and only) tensor produced by the `layer2/activation` operation. Output
|
|
|
|
tensors are identified by their name strings rather than by `tf.Operation` objects.
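
Tensor names follow TensorFlow's usual `<operation_name>:<output_index>` convention, so if you already hold a reference to the output tensor in Python you can pass its `name` attribute instead of typing the string by hand. A small sketch, where `logits` is a hypothetical variable holding the model's final output tensor:

```python
# 'logits' is assumed to be the tf.Tensor produced by the final layer,
# i.e. the tensor named "layer2/activation:0" in this example.
c_exporter = tfm_ex.Exporter(sess, [logits.name])
```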
|
|
|
|
|
|
|
|
### Analysing the Model to Export
|
|
|
|
Now that we have created the `Exporter` object it
|
|
|
|
is useful to check which operations the library will export. The `print_graph()`
|
|
|
|
method does exactly this, by printing out a tree representation of the tensors and operations that
|
|
|
|
will be exported.
|
|
|
|
```python
|
|
|
|
c_exporter.print_graph()
|
|
|
|
```
|
|
|
|
When this method is called in the example code, the following debugging information is printed
|
|
|
|
to the terminal.
|
|
|
|
```
|
|
|
|
[1] output tensors found okay.
|
|
|
|
<"layer2/activation:0" with size (?, 10)>
|
|
|
|
[Add "layer2/Wx_plus_b/add"] Grads
|
|
|
|
|<"layer2/Wx_plus_b/MatMul:0" with size (?, 10)>
|
|
|
|
| [MatMul "layer2/Wx_plus_b/MatMul"] Grads
|
|
|
|
| |<"Dense1/activation/Maximum:0" with size (?, 300)>
|
|
|
|
| | [Maximum "Dense1/activation/Maximum"] Grads
|
|
|
|
| | |<"Dense1/activation/mul:0" with size (?, 300)>
|
|
|
|
| | | [Mul "Dense1/activation/mul"] Grads
|
|
|
|
| | | |<"Dense1/activation/alpha:0" with size ()>
|
|
|
|
| | | | [Const "Dense1/activation/alpha"]
|
|
|
|
| | | |<"Dense1/Wx_plus_b/add:0" with size (?, 300)>
|
|
|
|
| | | [Add "Dense1/Wx_plus_b/add"] Grads
|
|
|
|
| | | |<"Dense1/Wx_plus_b/MatMul:0" with size (?, 300)>
|
|
|
|
| | | | [MatMul "Dense1/Wx_plus_b/MatMul"] Grads
|
|
|
|
| | | | |<"Conv1/Reshape_4:0" with size (?, 400)>
|
|
|
|
| | | | | [Reshape "Conv1/Reshape_4"] Grads
|
|
|
|
| | | | | |<"Conv1/pooling:0" with size (?, 5, 5, 16)>
|
|
|
|
| | | | | | [MaxPool "Conv1/pooling"] Grads
|
|
|
|
| | | | | | |<"Conv1/activations/Maximum:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | [Maximum "Conv1/activations/Maximum"] Grads
|
|
|
|
| | | | | | |<"Conv1/activations/mul:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | | [Mul "Conv1/activations/mul"] Grads
|
|
|
|
| | | | | | | |<"Conv1/activations/alpha:0" with size ()>
|
|
|
|
| | | | | | | | [Const "Conv1/activations/alpha"]
|
|
|
|
| | | | | | | |<"Conv1/convolution:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | | [Conv2D "Conv1/convolution"] Grads
|
|
|
|
| | | | | | | |<"Reshape:0" with size (?, 28, 28, 1)>
|
|
|
|
| | | | | | | | [Reshape "Reshape"] Grads
|
|
|
|
| | | | | | | | |<"input/x-input:0" with size (?, 784)>
|
|
|
|
| | | | | | | | | [Placeholder "input/x-input"]
|
|
|
|
| | | | | | | | |<"Reshape/shape:0" with size (4,)>
|
|
|
|
| | | | | | | | [Const "Reshape/shape"]
|
|
|
|
| | | | | | | |<"Conv1/filter_weights/Variable/read:0" with size (5, 5, 1, 16)>
|
|
|
|
| | | | | | | [VariableV2 "Conv1/filter_weights/Variable"]
|
|
|
|
| | | | | | |<"Conv1/convolution:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | [Conv2D "Conv1/convolution"] Grads
|
|
|
|
| | | | | | . . .
|
|
|
|
| | | | | |<"Conv1/Reshape_4/shape:0" with size (2,)>
|
|
|
|
| | | | | [Const "Conv1/Reshape_4/shape"]
|
|
|
|
| | | | |<"Dense1/weights/Variable/read:0" with size (400, 300)>
|
|
|
|
| | | | [VariableV2 "Dense1/weights/Variable"]
|
|
|
|
| | | |<"Dense1/biases/Variable/read:0" with size (300,)>
|
|
|
|
| | | [VariableV2 "Dense1/biases/Variable"]
|
|
|
|
| | |<"Dense1/Wx_plus_b/add:0" with size (?, 300)>
|
|
|
|
| | [Add "Dense1/Wx_plus_b/add"] Grads
|
|
|
|
| | . . .
|
|
|
|
| |<"layer2/weights/Variable/read:0" with size (300, 10)>
|
|
|
|
| [VariableV2 "layer2/weights/Variable"]
|
|
|
|
|<"layer2/biases/Variable/read:0" with size (10,)>
|
|
|
|
[VariableV2 "layer2/biases/Variable"]
|
|
|
|
-------------------------------------
|
|
|
|
```
|
|
|
|
Separate trees are shown for each of the output tensors that were defined; these trees will probably
|
|
|
|
have a significant overlap in most cases.
|
|
|
|
Tensors are shown in the tree between `<>` brackets, with their names and sizes. The sizes should
|
|
|
|
show the variable batch dimension as a `?`. Operations are shown between `[]` brackets, with the type
|
|
|
|
of the operation first followed by the name of the instance of the operation.
|
|
|
|
Because flowgraphs are networks, not trees, a single tensor can be the parent of multiple operations.
|
|
|
|
To avoid showing parts of the network multiple times, three dots `. . .` are shown if the parent
|
|
|
|
of the operation has already been shown earlier in the tree. This also helps keep the size of these
|
|
|
|
trees down in the case of much larger networks.
|
|
|
|
|
|
|
|
These operation trees can be used to confirm that only the parts of the flowgraph we want will be exported
|
|
|
|
to C++. In the example above, as we would expect, our final output will be a vector of ten elements,
|
|
|
|
which will contain the estimates for each of the digits 0-9.
|
|
|
|
|
|
|
|
### Exporting the C++ Implementation
|
|
|
|
Now that the Exporter has been created and the parts of the flowgraph that will be exported have
|
|
|
|
been checked, we just need to generate the actual C++ code files by calling the `generate()` method as
|
|
|
|
shown below.
|
|
|
|
```python
|
|
|
|
c_exporter.generate("tfmin_generated/mnist_model",
|
|
|
|
"mnistModel",
|
|
|
|
layout='RowMajor')
|
|
|
|
```
|
|
|
|
The first parameter defines the base name and path of the generated source files (if the directory
|
|
|
|
`tfmin_generated` doesn't already exist, the library will automatically create it). In this example
|
|
|
|
the following files will be generated:
|
|
|
|
* tfmin_generated/mnist_model.h
|
|
|
|
* tfmin_generated/mnist_model.cpp
|
|
|
|
* tfmin_generated/mnist_model_data.h
|
|
|
|
|
|
|
|
Here `mnist_model_data.h` will include literal definitions of the weights of the model.
|
|
|
|
|
|
|
|
The second parameter defines the identifier of the model object that will be defined by the generated C++ code.
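
For orientation, the main header `mnist_model.h` declares the model class itself. The sketch below is purely illustrative and is inferred from how the class is used later in this tutorial; the real generated header will differ in detail:

```cpp
// Illustrative sketch only -- not the actual TFMin output.
#include <unsupported/Eigen/CXX11/Tensor>

class MNISTModel
{
public:
    MNISTModel();   // allocates buffers and loads the weight literals

    // Runs one inference step: 784 input floats in, 10 output floats out.
    void eval(const Eigen::DefaultDevice &device,
              const float *input,
              float *output);
};
```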
|
|
|
|
|
|
|
|
The final optional parameter defines the internal layout of the tensors used. This defaults to
|
|
|
|
`ColMajor`.
|
|
|
|
|
|
|
|
**Note**: Currently only `RowMajor` works for all the operations defined in TFMin; this is an open issue.
|
|
|
|
|
|
|
|
When the call above is executed in the example code, it will print the following to the terminal:
|
|
|
|
```
|
|
|
|
Analysed flow-graph
|
|
|
|
Optimised memory map.
|
|
|
|
Generated constructor
|
|
|
|
Generated inference method.
|
|
|
|
Generated data header.
|
|
|
|
Complete
|
|
|
|
```
|
|
|
|
|
|
|
|
### Integration with a Simple C++ Project
|
|
|
|
After the `mnist_train_and_export.py` script has been successfully executed and the C++ files described
|
|
|
|
above have been generated, we are ready to build this code into a larger project. A simple example
|
|
|
|
project is included in the directory `cpp_project`. The `test_mnist.cpp` source file contains the entry point for
|
|
|
|
the application and everything necessary to execute the inference model:
|
|
|
|
```cpp
|
|
|
|
/*----------------------------------------------------------------------------------
|
|
|
|
|
|
|
|
Simple example integrating the c++ MNIST classification inference model
|
|
|
|
generated by TFMin with an application
|
|
|
|
|
|
|
|
----------------------------------------------------------------------------------*/
|
|
|
|
|
|
|
|
#include <iostream>
|
|
|
|
#include "mnist_model.h"
|
|
|
|
#include "example_mnist_input.h"
|
|
|
|
|
|
|
|
int main()
|
|
|
|
{
|
|
|
|
// Instantiate mnist inference model object
|
|
|
|
MNISTModel mnist;
|
|
|
|
|
|
|
|
  // Create single-threaded Eigen execution device
|
|
|
|
Eigen::DefaultDevice device;
|
|
|
|
|
|
|
|
std::cout << "Running inference model." << std::endl;
|
|
|
|
float *input = (float*)exampleInputDataHex;
|
|
|
|
float output[10];
|
|
|
|
mnist.eval(device, input, output);
|
|
|
|
|
|
|
|
std::cout << "Completed output was." << std::endl;
|
|
|
|
for (int i=0; i<10; ++i)
|
|
|
|
std::cout << "[" << i << "] = " << output[i] << std::endl;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
```
|
|
|
|
Now let's break this code down.
|
|
|
|
```cpp
|
|
|
|
#include "mnist_model.h"
|
|
|
|
#include "example_mnist_input.h"
|
|
|
|
```
|
|
|
|
The first local include adds the header for the MNISTModel object that was generated
|
|
|
|
by the Python code earlier. **Note** that the relative path `../tfmin_generated` needs to be passed to the
|
|
|
|
compiler for this to build successfully.
|
|
|
|
|
|
|
|
The second local include defines an example input for the MNIST classifier, an image of a handwritten seven.
|
|
|
|
```cpp
|
|
|
|
// Instantiate mnist inference model object
|
|
|
|
MNISTModel mnist;
|
|
|
|
|
|
|
|
// Create single-threaded Eigen execution device
|
|
|
|
Eigen::DefaultDevice device;
|
|
|
|
```
|
|
|
|
Now the objects required to perform the inference are created: an instance of the MNISTModel object
|
|
|
|
generated by the Python code, and an Eigen default device object. The latter instructs the Eigen library
|
|
|
|
to use a single thread to execute tensor operations.
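
If the generated `eval()` method is written against a generic Eigen device type rather than `Eigen::DefaultDevice` specifically (an assumption; check the generated header), a multi-threaded device could be constructed instead, for example:

```cpp
// Assumption: only useful if the generated eval() accepts any Eigen device type.
#define EIGEN_USE_THREADS
#include <unsupported/Eigen/CXX11/Tensor>

Eigen::ThreadPool pool(4);                 // four worker threads
Eigen::ThreadPoolDevice device(&pool, 4);  // Eigen device backed by the pool
```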
|
|
|
|
```cpp
|
|
|
|
std::cout << "Running inference model." << std::endl;
|
|
|
|
float *input = (float*)exampleInputDataHex;
|
|
|
|
float output[10];
|
|
|
|
mnist.eval(device, input, output);
|
|
|
|
```
|
|
|
|
Now we can perform the inference itself using the `eval()` method of the MNISTModel object.
|
|
|
|
We need to pass the Eigen device object, a pointer to the example input defined in `example_mnist_input.h`
|
|
|
|
and a pointer to an output buffer.
|
|
|
|
|
|
|
|
The example data has been defined as an int array using hex literals; these actually represent
|
|
|
|
32-bit floating-point values, but have been represented like this to avoid any loss of precision when
|
|
|
|
converting to and from decimal representation. This is why the input data needs to be cast to a float pointer.
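
A tiny self-contained illustration of this trick (the value and names below are made up, not taken from `example_mnist_input.h`): the bit pattern `0x3f800000` is the IEEE-754 encoding of `1.0f`, so storing that integer and reinterpreting the memory as `float` recovers the value exactly.

```cpp
#include <iostream>

// Hypothetical example of a float stored as its exact bit pattern.
const unsigned int dataHex[1] = { 0x3f800000 };  // IEEE-754 bits of 1.0f

int main()
{
    const float *asFloat = (const float*)dataHex;  // same cast as in test_mnist.cpp
    std::cout << asFloat[0] << std::endl;          // prints 1
    return 0;
}
```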
|
|
|
|
```cpp
|
|
|
|
std::cout << "Completed output was." << std::endl;
|
|
|
|
for (int i=0; i<10; ++i)
|
|
|
|
std::cout << "[" << i << "] = " << output[i] << std::endl;
|
|
|
|
```
|
|
|
|
Finally, we print out the results of the inference operation. Assuming that the model trained
|
|
|
|
correctly when it was generated by the Python script, the highest value printed should be
|
|
|
|
in the row for digit seven, as shown below (actual values will
|
|
|
|
vary based upon training).
|
|
|
|
```
|
|
|
|
Running inference model.
|
|
|
|
Completed output was.
|
|
|
|
[0] = -0.111315
|
|
|
|
[1] = -0.948662
|
|
|
|
[2] = 4.31211
|
|
|
|
[3] = 3.83368
|
|
|
|
[4] = -5.44246
|
|
|
|
[5] = -5.67521
|
|
|
|
[6] = -13.4396
|
|
|
|
[7] = 13.9334
|
|
|
|
[8] = -2.29676
|
|
|
|
[9] = 1.56804
|
|
|
|
```
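
If you want the program to report the predicted digit directly rather than the raw scores, a simple argmax over the output buffer is enough. This is a small addition of our own, not part of the generated code:

```cpp
// Find the index of the largest output value -- this is the predicted digit.
int predicted = 0;
for (int i = 1; i < 10; ++i)
    if (output[i] > output[predicted])
        predicted = i;
std::cout << "Predicted digit: " << predicted << std::endl;
```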
|
|
|
|
|
|
|
|
### Building and Running the C++ Project
|
|
|
|
The example code described above can be built using the included Makefile, or you can use the
|
|
|
|
following direct call to g++:
|
|
|
|
```
|
|
|
|
g++ -std=c++11 -O3 -o native_test test_mnist.cpp ../tfmin_generated/mnist_model.cpp -I ../tfmin_generated/
|
|
|
|
```
|
|
|
|
First, `-std=c++11 -O3` tells the compiler to use the C++11 standard and full optimisation (which is important to get
|
|
|
|
the best out of the Eigen library).
|
|
|
|
|
|
|
|
Next, `-o native_test test_mnist.cpp` defines the output binary name and the entry-point source file described above.
|
|
|
|
|
|
|
|
Then `../tfmin_generated/mnist_model.cpp` compiles the MNISTModel object which was generated by the Python script.
|
|
|
|
|
|
|
|
Finally, `-I ../tfmin_generated` adds this directory to the include path so that our entry-point code can find the
|
|
|
|
`mnist_model.h` header file.
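
Once the build succeeds, run the resulting binary from the directory where it was created to reproduce the output shown earlier:

```
./native_test
```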
|
|
|
|
|
|
|
|
### Summary
|
|
|
|
This tutorial should have explained how to integrate the TFMin code generator into an existing Python
|
|
|
|
TensorFlow script, generate a C++ version of a model, and integrate it into a simple project. Next we will
|
|
|
|
look at generating additional methods to automatically analyse the generated model.
|
|
|
|
|
|
|
|
[Next Tutorial](/Tutorials/Tutorial-2-Evaluating-the-Runtime-of-a-Model) - Evaluating the Runtime of a Model
|
|
|
|
|