### Overview

This tutorial will show you how to generate the `timing()` method in addition to the standard `eval()` method, and how to use it to collect per-operation runtime information from a model. It builds upon the convolutional MNIST example used in tutorial 1, but this time you'll need to add some code to the Python and C++ sources.
### Generating the Timing Method

Since the timing method does not process information differently than the original eval method, it can easily be generated alongside it. Inside the `mnist_train_and_export.py` Python script, modify the call to `generate()` so it includes the parameter `timing=True`, as shown below:
```python
c_exporter.generate("tfmin_generated/mnist_model",
                    "mnistModel",
                    timing=True,
                    layout='RowMajor')
```
If you now run the Python script again, the output should look almost exactly the same as in tutorial 1, except at the bottom it should now print:
```
Analysed flow-graph
Optimised memory map.
Generated constructor
Generated inference method.
Generated timing method.
Generated data header.
Complete
```
### The Timing Method

The declaration of the timing method generated by the modified Python script is shown below:
```cpp
template <typename Device>
std::vector<TFMin::OperationTime> YourObjectName::timing(const Device &d, <buffer ptrs...>, bool print = true);
```
This method can be used in exactly the same way as the eval method: it takes the same input and output buffer pointers and performs the same inference operation. By default it will print per-operation runtime results to the terminal, but this can be disabled by passing `false` to the optional `print` parameter. In all cases the timing method also returns a vector of `TFMin::OperationTime` objects, allowing a C++ project to make use of the results directly.
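For example, the snippet below is a minimal sketch of calling the timing method silently and summing the per-operation durations. It assumes the `mnist` model object and the `device`, `input` and `output` buffers from tutorial 1 are already set up, as in the larger example later in this tutorial.

```cpp
// Minimal sketch: one silent timing pass, then report the total runtime.
// Assumes the mnist model object and the device, input and output buffers
// from tutorial 1 are already in scope.
std::vector<TFMin::OperationTime> times =
    mnist.timing(device, input, output, false);

float total = 0.0f;
for (const auto &op : times)
    total += op.duration;

std::cout << "Ran " << times.size() << " operations in "
          << total << " seconds." << std::endl;
```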
##### The OperationTime Object

```cpp
class OperationTime
{
public:
    OperationTime(std::string name, float duration)
    {
        this->name = name;
        this->duration = duration;
    };

    std::string name;
    float duration;
};

typedef std::vector<OperationTime> TimingResult;
```
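Because `TimingResult` is just a `std::vector`, standard algorithms work on it directly. As an illustrative sketch (not part of the generated code), the snippet below uses `std::max_element` to find the slowest operation; it requires the `<algorithm>` header and assumes the same `mnist` object and buffers as before.

```cpp
// Illustrative sketch only: find the slowest operation in a TimingResult.
// Requires #include <algorithm>; times comes from a call to timing().
TFMin::TimingResult times = mnist.timing(device, input, output, false);

auto slowest = std::max_element(times.begin(), times.end(),
    [](const TFMin::OperationTime &a, const TFMin::OperationTime &b)
    { return a.duration < b.duration; });

if (slowest != times.end())
    std::cout << "Slowest operation: " << slowest->name
              << " (" << slowest->duration << " seconds)" << std::endl;
```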
### Using the Timing Method in your C++ Project

This example uses the timing method to find the average timing results of a hundred executions of the model. Averaging over many runs is important when evaluating timing on complex operating systems, where individual runs can vary. The code disables printing of timing results by the timing method itself, but later uses the model's `printTiming()` method to print the accumulated data structure.
```cpp
std::cout << "getting performance of model." << std::endl;
TFMin::TimingResult sum;
int runCount = 100;
for (int i = 0; i < runCount; ++i)
{
    // Run a silent timing pass (printing disabled by the final 'false').
    TFMin::TimingResult times = mnist.timing(device, input, output, false);

    // Accumulate per-operation durations across runs.
    if (sum.empty())
        sum = times;
    else
    {
        for (std::size_t a = 0; a < times.size(); ++a)
            sum[a].duration += times[a].duration;
    }
}

// Divide by the run count to get averages. Note: iterate by reference,
// otherwise the division would only affect a copy of each element.
for (auto &s : sum)
    s.duration /= runCount;
mnist.printTiming(sum);
```
If you copy the code snippet above into the `test_mnist.cpp` file after the inference operation and re-make the project, it should now produce the following output:
```
Running inference model.
Completed output was.
[0] = -1.88451
[1] = -1.98022
[2] = 1.95988
[3] = 3.76175
[4] = -5.40039
[5] = -3.67301
[6] = -11.9938
[7] = 12.6651
[8] = -0.731242
[9] = 2.37594
getting performance of model.
Operating timing results for
5.70007e-05, seconds, Reshape
0.008144, seconds, Conv1/convolution
8.10022e-05, seconds, Conv1/activations/mul
8.50009e-05, seconds, Conv1/activations/Maximum
0.00770898, seconds, Conv1/pooling
4.90015e-05, seconds, Conv1/Reshape_4
0.00363198, seconds, Dense1/Wx_plus_b/MatMul
5.29992e-05, seconds, Dense1/Wx_plus_b/add
4.70006e-05, seconds, Dense1/activation/mul
4.0005e-05, seconds, Dense1/activation/Maximum
0.000275009, seconds, layer2/Wx_plus_b/MatMul
4.69848e-05, seconds, layer2/Wx_plus_b/add
0.020219, seconds, Total duration
```
The text output of the `printTiming()` method is in CSV format as well as being human-readable, which means it can easily be copied and pasted into a range of programs for further analysis.
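As a quick sketch of such analysis, the program below shows one way the rows could be read back in C++. It assumes the timing output has been captured to a hypothetical file named `timing.csv`, with each row in the `duration, seconds, name` layout shown above; the file name and capture step are not part of the tutorial.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    // Hypothetical file holding captured timing output in the
    // "duration, seconds, name" layout shown above.
    std::ifstream file("timing.csv");
    std::string line;

    while (std::getline(file, line))
    {
        std::istringstream row(line);
        std::string duration, units, name;

        // Split each row on commas into its three fields.
        if (std::getline(row, duration, ',') &&
            std::getline(row, units, ',') &&
            std::getline(row, name))
        {
            // Trim the leading space left after the final comma.
            name.erase(0, name.find_first_not_of(' '));
            std::cout << name << ": " << duration << " seconds\n";
        }
    }
    return 0;
}
```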
### Summary

This tutorial has demonstrated how to generate and use the timing method of the MNIST inference model.

---

[Previous Tutorial](/Tutorials/Tutorial-1-Exporting-a-Basic-MNIST-Classifier) - Exporting a basic MNIST Classifier

[Next Tutorial](/Tutorials/Tutorial-3-Validating-the-Accuracy-of-a-Model) - Validating the Accuracy of a Model