|
|
|
|
|
|
|
### Overview
|
|
|
|
This tutorial will guide you through the steps needed to export a C++ code implementation of a
|
|
|
|
trained TensorFlow model. We will use the standard MNIST classification task for this since
|
|
|
|
a functioning model can be quickly trained using a very small dataset. This tutorial will
|
|
|
|
cover exporting the model from Python, building it into a simple C++ project, and finally
|
|
|
|
compiling and executing this code.
|
|
|
|
|
|
|
|
### Building and Training the Model
|
|
|
|
We will use the convolutional MNIST example for this tutorial, which can be found in the
|
|
|
|
`tutorials/mnist_conv` directory. The script `mnist_train_and_export.py` will be described step
|
|
|
|
by step, showing how the TFMin library has been integrated with this simple example, adapted
|
|
|
|
from TensorFlow's own tutorials.
|
|
|
|
|
|
|
|
**Note**: This code will attempt to download the MNIST training dataset the first time it is run, so an internet
|
|
|
|
connection is required.
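
For reference, examples adapted from TensorFlow's tutorials of this era usually load MNIST with the `input_data` helper, which downloads the files on first use. The call below is shown only for context and as an assumption; the exact loading code in `mnist_train_and_export.py` may differ.

```python
# Context only: the historical TensorFlow 1.x MNIST helper downloads the
# dataset into the given directory the first time it is called.
from tensorflow.examples.tutorials.mnist import input_data

mnist_data = input_data.read_data_sets("MNIST_data/", one_hot=True)
```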
|
|
|
|
|
|
|
|
The flowgraph created has a variable batch size, which is essential for it to be
|
|
|
|
exported correctly by TFMin. TFMin generates code for a single inference step from a flowgraph
|
|
|
|
which processes batches; this requires the number of dimensions of the tensors processed by the
|
|
|
|
network to be reduced by one. The library assumes that the batch dimension will be undefined, i.e.
|
|
|
|
has a value equal to `tf.Dimension(None)`, allowing it to be automatically detected and reduced.
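
In practice this just means declaring the input placeholder with `None` as its first dimension, matching the `(?, 784)` input tensor shown in the graph printout later in this tutorial. A minimal sketch:

```python
import tensorflow as tf

# Batch dimension left undefined (None) so TFMin can detect and reduce it;
# 784 = 28 x 28 flattened MNIST pixels.
x = tf.placeholder(tf.float32, shape=[None, 784], name='x-input')
```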
|
|
|
|
|
|
|
|
### Creating the Exporter Object
|
|
|
|
Now that our flowgraph contains a trained network model with a variable batch size, it's ready to be
|
|
|
|
exported by TFMin. First, the library itself needs to be imported; this line can be found at the
|
|
|
|
top of the mnist_conv example.
|
|
|
|
```python
|
|
|
|
from tf_min import exporter as tfm_ex
|
|
|
|
```
|
|
|
|
Constructing the Exporter object requires two parameters: the [tf.Session](https://www.tensorflow.org/api_docs/python/tf/Session) object within which the flowgraph
|
|
|
|
is being executed, and a list of output tensors that the inference operation will generate.
|
|
|
|
The exporter uses this list of output tensors to trace back through the flowgraph and work out which operations are
|
|
|
|
needed for inference and which can be ignored (training ops, introspection ops, etc.).
|
|
|
|
```python
|
|
|
|
c_exporter = tfm_ex.Exporter(sess, ['layer2/activation:0'])
|
|
|
|
```
|
|
|
|
This line creates an instance of the Exporter object using the `tf.Session` object `sess` and defines
|
|
|
|
a single output tensor, the first (and only) tensor produced by the `layer2/activation` operation. Output
|
|
|
|
tensors are identified by their name strings rather than by `tf.Operation` objects.
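
Tensor names follow TensorFlow's usual `<operation_name>:<output_index>` convention, so if you already hold a reference to the output tensor in Python you can pass its `name` attribute instead of typing the string by hand. A small sketch, where `logits` is a hypothetical variable holding the model's final output tensor:

```python
# 'logits' is assumed to be the tf.Tensor produced by the final layer,
# i.e. the tensor named "layer2/activation:0" in this example.
c_exporter = tfm_ex.Exporter(sess, [logits.name])
```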
|
|
|
|
|
|
|
|
### Analysing the Model to Export
|
|
|
|
Now that we have created the `Exporter` object it
|
|
|
|
is useful to check which operations the library will export. The `print_graph()`
|
|
|
|
method does exactly this, by printing out a tree representation of the tensors and operations that
|
|
|
|
will be exported.
|
|
|
|
```python
|
|
|
|
c_exporter.print_graph()
|
|
|
|
```
|
|
|
|
When this method is called in the example code, the following debugging information is printed
|
|
|
|
to the terminal.
|
|
|
|
```
|
|
|
|
[1] output tensors found okay.
|
|
|
|
<"layer2/activation:0" with size (?, 10)>
|
|
|
|
[Add "layer2/Wx_plus_b/add"] Grads
|
|
|
|
|<"layer2/Wx_plus_b/MatMul:0" with size (?, 10)>
|
|
|
|
| [MatMul "layer2/Wx_plus_b/MatMul"] Grads
|
|
|
|
| |<"Dense1/activation/Maximum:0" with size (?, 300)>
|
|
|
|
| | [Maximum "Dense1/activation/Maximum"] Grads
|
|
|
|
| | |<"Dense1/activation/mul:0" with size (?, 300)>
|
|
|
|
| | | [Mul "Dense1/activation/mul"] Grads
|
|
|
|
| | | |<"Dense1/activation/alpha:0" with size ()>
|
|
|
|
| | | | [Const "Dense1/activation/alpha"]
|
|
|
|
| | | |<"Dense1/Wx_plus_b/add:0" with size (?, 300)>
|
|
|
|
| | | [Add "Dense1/Wx_plus_b/add"] Grads
|
|
|
|
| | | |<"Dense1/Wx_plus_b/MatMul:0" with size (?, 300)>
|
|
|
|
| | | | [MatMul "Dense1/Wx_plus_b/MatMul"] Grads
|
|
|
|
| | | | |<"Conv1/Reshape_4:0" with size (?, 400)>
|
|
|
|
| | | | | [Reshape "Conv1/Reshape_4"] Grads
|
|
|
|
| | | | | |<"Conv1/pooling:0" with size (?, 5, 5, 16)>
|
|
|
|
| | | | | | [MaxPool "Conv1/pooling"] Grads
|
|
|
|
| | | | | | |<"Conv1/activations/Maximum:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | [Maximum "Conv1/activations/Maximum"] Grads
|
|
|
|
| | | | | | |<"Conv1/activations/mul:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | | [Mul "Conv1/activations/mul"] Grads
|
|
|
|
| | | | | | | |<"Conv1/activations/alpha:0" with size ()>
|
|
|
|
| | | | | | | | [Const "Conv1/activations/alpha"]
|
|
|
|
| | | | | | | |<"Conv1/convolution:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | | [Conv2D "Conv1/convolution"] Grads
|
|
|
|
| | | | | | | |<"Reshape:0" with size (?, 28, 28, 1)>
|
|
|
|
| | | | | | | | [Reshape "Reshape"] Grads
|
|
|
|
| | | | | | | | |<"input/x-input:0" with size (?, 784)>
|
|
|
|
| | | | | | | | | [Placeholder "input/x-input"]
|
|
|
|
| | | | | | | | |<"Reshape/shape:0" with size (4,)>
|
|
|
|
| | | | | | | | [Const "Reshape/shape"]
|
|
|
|
| | | | | | | |<"Conv1/filter_weights/Variable/read:0" with size (5, 5, 1, 16)>
|
|
|
|
| | | | | | | [VariableV2 "Conv1/filter_weights/Variable"]
|
|
|
|
| | | | | | |<"Conv1/convolution:0" with size (?, 12, 12, 16)>
|
|
|
|
| | | | | | [Conv2D "Conv1/convolution"] Grads
|
|
|
|
| | | | | | . . .
|
|
|
|
| | | | | |<"Conv1/Reshape_4/shape:0" with size (2,)>
|
|
|
|
| | | | | [Const "Conv1/Reshape_4/shape"]
|
|
|
|
| | | | |<"Dense1/weights/Variable/read:0" with size (400, 300)>
|
|
|
|
| | | | [VariableV2 "Dense1/weights/Variable"]
|
|
|
|
| | | |<"Dense1/biases/Variable/read:0" with size (300,)>
|
|
|
|
| | | [VariableV2 "Dense1/biases/Variable"]
|
|
|
|
| | |<"Dense1/Wx_plus_b/add:0" with size (?, 300)>
|
|
|
|
| | [Add "Dense1/Wx_plus_b/add"] Grads
|
|
|
|
| | . . .
|
|
|
|
| |<"layer2/weights/Variable/read:0" with size (300, 10)>
|
|
|
|
| [VariableV2 "layer2/weights/Variable"]
|
|
|
|
|<"layer2/biases/Variable/read:0" with size (10,)>
|
|
|
|
[VariableV2 "layer2/biases/Variable"]
|
|
|
|
-------------------------------------
|
|
|
|
```
|
|
|
|
Separate trees are shown for each of the output tensors that were defined; these trees will probably
|
|
|
|
have a significant overlap in most cases.
|
|
|
|
Tensors are shown in the tree between `<>` brackets, with their names and sizes. The sizes should
|
|
|
|
show the variable batch dimension as a `?`. Operations are shown between `[]` brackets, with the type
|
|
|
|
of the operation first followed by the name of the instance of the operation.
|
|
|
|
Because flowgraphs are networks, not trees, a single tensor can be the parent of multiple operations.
|
|
|
|
To avoid showing parts of the network multiple times, three dots `. . .` are shown if the parent
|
|
|
|
of the operation has already been shown earlier in the tree. This also helps keep the size of these
|
|
|
|
trees down in the case of much larger networks.
|
|
|
|
|
|
|
|
These operation trees can be used to confirm that only the parts of the flowgraph we want will be exported
|
|
|
|
to C++. In the example above, as we would expect, our final output will be a vector of ten elements,
|
|
|
|
which will contain the estimates for each of the digits 0-9.
|
|
|
|
|
|
|
|
### Exporting the C++ Implementation
|
|
|
|
Now that the Exporter has been created and the parts of the flowgraph that will be exported have
|
|
|
|
been checked, we just need to generate the actual C++ code files by calling the `generate()` method as
|
|
|
|
shown below.
|
|
|
|
```python
|
|
|
|
c_exporter.generate("tfmin_generated/mnist_model",
|
|
|
|
"mnistModel",
|
|
|
|
layout='RowMajor')
|
|
|
|
```
|
|
|
|
The first parameter defines the base name and path of the generated source files (if the directory
|
|
|
|
`tfmin_generated` doesn't already exist, the library will automatically create it). In this example
|
|
|
|
the following files will be generated:
|
|
|
|
* tfmin_generated/mnist_model.h
|
|
|
|
* tfmin_generated/mnist_model.cpp
|
|
|
|
* tfmin_generated/mnist_model_data.h
|
|
|
|
|
|
|
|
Here `mnist_model_data.h` will include literal definitions of the weights of the model.
|
|
|
|
|
|
|
|
The second parameter defines the identifier of the model object that will be defined by the generated C++ code.
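
For orientation, the main header `mnist_model.h` declares the model class itself. The sketch below is purely illustrative and is inferred from how the class is used later in this tutorial; the real generated header will differ in detail:

```cpp
// Illustrative sketch only -- not the actual TFMin output.
#include <unsupported/Eigen/CXX11/Tensor>

class MNISTModel
{
public:
    MNISTModel();   // allocates buffers and loads the weight literals

    // Runs one inference step: 784 input floats in, 10 output floats out.
    void eval(const Eigen::DefaultDevice &device,
              const float *input,
              float *output);
};
```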
|
|
|
|
|
|
|
|
The final optional parameter defines the internal layout of the tensors used. This defaults to
|
|
|
|
`ColMajor`.
|
|
|
|
|
|
|
|
**Note**: Currently only `RowMajor` works for all the operations defined in TFMin; this is an open issue.
|
|
|
|
|
|
|
|
When the call above is executed in the example code, it will print the following to the terminal:
|
|
|
|
```
|
|
|
|
Analysed flow-graph
|
|
|
|
Optimised memory map.
|
|
|
|
Generated constructor
|
|
|
|
Generated inference method.
|
|
|
|
Generated data header.
|
|
|
|
Complete
|
|
|
|
```
|
|
|
|
|
|
|
|
### Integration with a Simple C++ Project
|
|
|
|
After the `mnist_train_and_export.py` script has been successfully executed and the C++ files described
|
|
|
|
above have been generated, we are ready to build this code into a larger project. A simple example
|
|
|
|
project is included in the directory `cpp_project`. The `test_mnist.cpp` source file contains the entry point for
|
|
|
|
the application and everything necessary to execute the inference model:
|
|
|
|
```cpp
|
|
|
|
/*----------------------------------------------------------------------------------
|
|
|
|
|
|
|
|
Simple example integrating the c++ MNIST classification inference model
|
|
|
|
generated by TFMin with an application
|
|
|
|
|
|
|
|
----------------------------------------------------------------------------------*/
|
|
|
|
|
|
|
|
#include <iostream>
|
|
|
|
#include "mnist_model.h"
|
|
|
|
#include "example_mnist_input.h"
|
|
|
|
|
|
|
|
int main()
|
|
|
|
{
|
|
|
|
// Instantiate mnist inference model object
|
|
|
|
MNISTModel mnist;
|
|
|
|
|
|
|
|
  // Create single-threaded Eigen execution device
|
|
|
|
Eigen::DefaultDevice device;
|
|
|
|
|
|
|
|
std::cout << "Running inference model." << std::endl;
|
|
|
|
float *input = (float*)exampleInputDataHex;
|
|
|
|
float output[10];
|
|
|
|
mnist.eval(device, input, output);
|
|
|
|
|
|
|
|
std::cout << "Completed output was." << std::endl;
|
|
|
|
for (int i=0; i<10; ++i)
|
|
|
|
std::cout << "[" << i << "] = " << output[i] << std::endl;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
```
|
|
|
|
Now let's break this code down.
|
|
|
|
```cpp
|
|
|
|
#include "mnist_model.h"
|
|
|
|
#include "example_mnist_input.h"
|
|
|
|
```
|
|
|
|
The first local include adds the header for the MNISTModel object that was generated
|
|
|
|
by the Python code earlier. **Note** that the relative path `../tfmin_generated` needs to be passed to the
|
|
|
|
compiler for this to build successfully.
|
|
|
|
|
|
|
|
The second local include defines an example input for the MNIST classifier, an image of a handwritten seven.
|
|
|
|
```cpp
|
|
|
|
// Instantiate mnist inference model object
|
|
|
|
MNISTModel mnist;
|
|
|
|
|
|
|
|
// Create single-threaded Eigen execution device
|
|
|
|
Eigen::DefaultDevice device;
|
|
|
|
```
|
|
|
|
Now the objects required to perform the inference are created: an instance of the MNISTModel object
|
|
|
|
generated by the Python code, and an Eigen default device object. The latter instructs the Eigen library
|
|
|
|
to use a single thread to execute tensor operations.
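
If the generated `eval()` method is written against a generic Eigen device type rather than `Eigen::DefaultDevice` specifically (an assumption; check the generated header), a multi-threaded device could be constructed instead, for example:

```cpp
// Assumption: only useful if the generated eval() accepts any Eigen device type.
#define EIGEN_USE_THREADS
#include <unsupported/Eigen/CXX11/Tensor>

Eigen::ThreadPool pool(4);                 // four worker threads
Eigen::ThreadPoolDevice device(&pool, 4);  // Eigen device backed by the pool
```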
|
|
|
|
```cpp
|
|
|
|
std::cout << "Running inference model." << std::endl;
|
|
|
|
float *input = (float*)exampleInputDataHex;
|
|
|
|
float output[10];
|
|
|
|
mnist.eval(device, input, output);
|
|
|
|
```
|
|
|
|
Now we can perform the inference itself using the `eval()` method of the MNISTModel object.
|
|
|
|
We need to pass the Eigen device object, a pointer to the example input defined in `example_mnist_input.h`
|
|
|
|
and a pointer to an output buffer.
|
|
|
|
|
|
|
|
The example data has been defined as an int array using hex literals; these actually represent
|
|
|
|
32-bit floating-point values, but have been represented like this to avoid any loss of precision when
|
|
|
|
converting to and from decimal representation. This is why the input data needs to be cast to a float pointer.
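
A tiny self-contained illustration of this trick (the value and names below are made up, not taken from `example_mnist_input.h`): the bit pattern `0x3f800000` is the IEEE-754 encoding of `1.0f`, so storing that integer and reinterpreting the memory as `float` recovers the value exactly.

```cpp
#include <iostream>

// Hypothetical example of a float stored as its exact bit pattern.
const unsigned int dataHex[1] = { 0x3f800000 };  // IEEE-754 bits of 1.0f

int main()
{
    const float *asFloat = (const float*)dataHex;  // same cast as in test_mnist.cpp
    std::cout << asFloat[0] << std::endl;          // prints 1
    return 0;
}
```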
|
|
|
|
```cpp
|
|
|
|
std::cout << "Completed output was." << std::endl;
|
|
|
|
for (int i=0; i<10; ++i)
|
|
|
|
std::cout << "[" << i << "] = " << output[i] << std::endl;
|
|
|
|
```
|
|
|
|
Finally, we print out the results of the inference operation. Assuming that the model trained
|
|
|
|
correctly when it was generated by the Python script, the highest value printed should be
|
|
|
|
in the row for digit seven, as shown below (actual values will
|
|
|
|
vary based upon training).
|
|
|
|
```
|
|
|
|
Running inference model.
|
|
|
|
Completed output was.
|
|
|
|
[0] = -0.111315
|
|
|
|
[1] = -0.948662
|
|
|
|
[2] = 4.31211
|
|
|
|
[3] = 3.83368
|
|
|
|
[4] = -5.44246
|
|
|
|
[5] = -5.67521
|
|
|
|
[6] = -13.4396
|
|
|
|
[7] = 13.9334
|
|
|
|
[8] = -2.29676
|
|
|
|
[9] = 1.56804
|
|
|
|
```
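
If you want the program to report the predicted digit directly rather than the raw scores, a simple argmax over the output buffer is enough. This is a small addition of our own, not part of the generated code:

```cpp
// Find the index of the largest output value -- this is the predicted digit.
int predicted = 0;
for (int i = 1; i < 10; ++i)
    if (output[i] > output[predicted])
        predicted = i;
std::cout << "Predicted digit: " << predicted << std::endl;
```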
|
|
|
|
|
|
|
|
### Building and Running the C++ Project
|
|
|
|
The example code described above can be built using the included Makefile, or you can use the
|
|
|
|
following direct call to g++:
|
|
|
|
```
|
|
|
|
g++ -std=c++11 -O3 -o native_test test_mnist.cpp ../tfmin_generated/mnist_model.cpp -I ../tfmin_generated/
|
|
|
|
```
|
|
|
|
First, `-std=c++11 -O3` tells the compiler to use the C++11 standard and full optimisation (which is important to get
|
|
|
|
the best out of the Eigen library).
|
|
|
|
|
|
|
|
Next, `-o native_test test_mnist.cpp` defines the output binary name and the entry-point source file described above.
|
|
|
|
|
|
|
|
Then `../tfmin_generated/mnist_model.cpp` compiles the MNISTModel object which was generated by the Python script.
|
|
|
|
|
|
|
|
Finally, `-I ../tfmin_generated` adds this directory to the include path so that our entry-point code can find the
|
|
|
|
`mnist_model.h` header file.
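
Once the build succeeds, run the resulting binary from the directory where it was created to reproduce the output shown earlier:

```
./native_test
```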
|
|
|
|
|
|
|
|
### Summary
|
|
|
|
This tutorial should have explained how to integrate the TFMin code generator into an existing Python
|
|
|
|
TensorFlow script, generate a C++ version of a model, and integrate it into a simple project. Next we will
|
|
|
|
look at generating additional methods to automatically analyse the generated model.
|
|
|
|
|
|
|
|
[Next Tutorial](/Tutorials/Tutorial-2-Evaluating-the-Runtime-of-a-Model) - Evaluating the Runtime of a Model
|
|
|
|
|