Overview
This tutorial shows how to generate the timing() method in addition to the standard eval() method, and how to use it to collect per-operation runtime information from a model. It builds upon the convolutional MNIST example used in tutorial 1, but this time you'll need to add some code to the Python and C++ sources.
Generating the Timing Method
Since the timing method does not process information any differently from the original eval method, it can easily be generated alongside it. Inside the mnist_train_and_export.py python script, modify the call to generate() so it includes the parameter timing=True as shown below:
c_exporter.generate("tfmin_generated/mnist_model",
                    "mnistModel",
                    timing=True,
                    layout='RowMajor')
If you now run the python script again, the output should look almost exactly the same as in tutorial 1, except at the bottom it should now print:
Analysed flow-graph
Optimised memory map.
Generated constructor
Generated inference method.
Generated timing method.
Generated data header.
Complete
The Timing Method
The declaration of the timing method generated by the modified python script is shown below:
template <typename Device>
std::vector<TFMin::OperationTime> YourObjectName::timing(const Device &d, <buffer ptrs...>, bool print = true);
This method can be used in exactly the same way as the eval method: it takes the same input and output buffer pointers and performs the same inference operation. By default the method prints per-operation runtime results to the terminal, but this can be disabled by passing false to the optional print parameter. In all cases the timing method also returns a vector of TFMin::OperationTime objects, allowing a C++ project to make use of the results directly.
The OperationTime Object
class OperationTime
{
public:
    OperationTime(std::string name, float duration)
    {
        this->name = name;
        this->duration = duration;
    }

    std::string name;
    float duration;
};

typedef std::vector<OperationTime> TimingResult;
Using the Timing Method in your C++ Project
This example uses the timing method to find the average timing results over a hundred executions of the model. Averaging over many runs is important when evaluating timing on complex operating systems, where individual run times can vary. The code disables printing inside the timing method itself, then uses the model's printTiming() method to print the averaged data structure.
std::cout << "getting performance of model." << std::endl;

TFMin::TimingResult sum;
int runCount = 100;
for (int i = 0; i < runCount; ++i)
{
    TFMin::TimingResult times = mnist.timing(device, input, output, false);
    if (sum.size() == 0)
        sum = times;
    else
    {
        for (size_t a = 0; a < times.size(); ++a)
            sum[a].duration += times[a].duration;
    }
}

// Note the reference: iterating by value would leave sum unchanged.
for (auto &s : sum)
    s.duration /= runCount;

mnist.printTiming(sum);
If you copy the code snippet above into the test_mnist.cpp file after the inference operation and re-make the project, it should now produce the following output:
Running inference model.
Completed output was.
[0] = -1.88451
[1] = -1.98022
[2] = 1.95988
[3] = 3.76175
[4] = -5.40039
[5] = -3.67301
[6] = -11.9938
[7] = 12.6651
[8] = -0.731242
[9] = 2.37594
getting performance of model.
Operating timing results for
5.70007e-05, seconds, Reshape
0.008144, seconds, Conv1/convolution
8.10022e-05, seconds, Conv1/activations/mul
8.50009e-05, seconds, Conv1/activations/Maximum
0.00770898, seconds, Conv1/pooling
4.90015e-05, seconds, Conv1/Reshape_4
0.00363198, seconds, Dense1/Wx_plus_b/MatMul
5.29992e-05, seconds, Dense1/Wx_plus_b/add
4.70006e-05, seconds, Dense1/activation/mul
4.0005e-05, seconds, Dense1/activation/Maximum
0.000275009, seconds, layer2/Wx_plus_b/MatMul
4.69848e-05, seconds, layer2/Wx_plus_b/add
0.020219, seconds, Total duration
The text output of the printTiming() method is in CSV format as well as being human readable, which means it can easily be copied and pasted into a range of programs for further analysis.
Summary
This tutorial has demonstrated how to generate and use the timing method of the MNIST inference model.
Previous Tutorial - Exporting a basic MNIST Classifier
Next Tutorial - Validating the Accuracy of a Model