
TFLite perf regression with in-graph feature computation #2139

Closed
reuben opened this issue May 29, 2019 · 7 comments
@reuben
Contributor

reuben commented May 29, 2019

The TFLite interpreter API runs the entire graph when you call Invoke(), which means every time we try to compute features we run the acoustic model as well.

The Interpreter API has experimental support for subgraphs, but InterpreterBuilder doesn't support that yet, so we would have to build the graphs manually (by copy-pasting and modifying the InterpreterBuilder code), which is hacky and makes upgrading TF versions difficult, as well as relying on an experimental API.

Alternatively, we can use two different models/interpreters. One with the acoustic model graph and one with the MFCC feature computation graph. The question is then how do we package this secondary graph:

  1. Separate .tflite file that has to be specified in the API (terrible for users)
  2. Separate .tflite file that we then convert into a C source file and embed into libdeepspeech.so, then load with tflite::FlatBufferModel::BuildFromBuffer (hacky, especially process-wise: we'd need to check in generated code, remember to update it when the source changes, etc.)
  3. Separate .tflite file that we then append to the main .tflite file, then mmap the whole thing ourselves, and use tflite::FlatBufferModel::BuildFromBuffer to create the two separate FlatBuffer models and interpreters (cleaner than other options IMO, but would require implementing or importing some cross-platform mmap abstraction)
  4. Something else that I'm missing.
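For illustration, the append-and-split packaging in option 3 could look something like the Python sketch below. The `pack`/`unpack` names and the trailer format are hypothetical, invented for this sketch, not anything DeepSpeech ships:

```python
import struct

MAGIC = b"DSPK"  # hypothetical marker identifying a combined file

def pack(main_model: bytes, feature_model: bytes) -> bytes:
    """Concatenate two .tflite buffers, followed by a fixed-size trailer
    recording the length of the first, so both can be recovered later."""
    return main_model + feature_model + struct.pack("<4sQ", MAGIC, len(main_model))

def unpack(blob: bytes):
    """Split a packed blob back into (main_model, feature_model)."""
    magic, main_len = struct.unpack("<4sQ", blob[-12:])
    if magic != MAGIC:
        raise ValueError("not a packed model file")
    return blob[:main_len], blob[main_len:-12]
```

In practice the combined file would be mmapped and each recovered slice handed to tflite::FlatBufferModel::BuildFromBuffer; note that FlatBuffers expects suitably aligned buffers, so a real implementation would also need to pad the first model to an alignment boundary.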

I've implemented option 2 as a proof of concept to validate that it fixes the perf regression, it's here: 0c8017a
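For reference, the "convert into a C source file" step in option 2 amounts to an `xxd -i`-style conversion. A minimal Python sketch (the symbol name is hypothetical; this is not the actual script used in the commit):

```python
def to_c_source(data: bytes, symbol: str = "g_model_data") -> str:
    """Render a binary blob as a C array plus a length constant,
    similar to what `xxd -i` produces."""
    body = ",".join(str(b) for b in data)
    return (
        f"const unsigned char {symbol}[] = {{{body}}};\n"
        f"const unsigned int {symbol}_len = {len(data)};\n"
    )
```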

@lissyx thoughts?

@reuben reuben added the bug label May 29, 2019
@reuben reuben self-assigned this May 29, 2019
@lissyx
Collaborator

lissyx commented May 30, 2019

The Interpreter API has experimental support for subgraphs, but InterpreterBuilder doesn't support that yet, so we would have to build the graphs manually (by copy-pasting and modifying the InterpreterBuilder code), which is hacky and makes upgrading TF versions difficult, as well as relying on an experimental API.

Right. Something I was wondering but could not check in time: is the Mfcc node connected to the rest of the graph? If it weren't connected, why would the TensorFlow runtime run anything beyond the Mfcc node?
Could it be a bug, either in toco or in our export graph, that makes the Mfcc node connected to the rest of the graph?

@reuben
Contributor Author

reuben commented May 30, 2019

The two graphs are not connected. This behavior is documented in tflite::Interpreter::Invoke: the whole graph is run in topological order.
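For intuition, this can be simulated with a toy topological sort: even when two subgraphs share no edges, a single execution plan over all nodes still visits both. The node names below are illustrative, not actual TFLite ops:

```python
from graphlib import TopologicalSorter

# Two disconnected subgraphs in one model: MFCC feature computation
# and the acoustic model. No edge links the two components.
deps = {
    "mfcc_out": {"audio_in"},
    "am_lstm": {"am_input"},
    "am_logits": {"am_lstm"},
}
plan = list(TopologicalSorter(deps).static_order())
# A single Invoke() walks the entire plan, so running the graph just to
# compute mfcc_out still executes every acoustic-model node as well.
assert {"am_input", "am_lstm", "am_logits"} <= set(plan)
```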

@reuben
Contributor Author

reuben commented May 30, 2019

This could be fixed cleanly if TF had better support for subgraphs. Right now the .tflite produced by toco has all nodes in the primary subgraph, because other APIs like InterpreterBuilder don't support multiple subgraphs yet.

@lissyx
Collaborator

lissyx commented May 30, 2019

TF had better support for subgraphs.

It might be the case, I saw some commits from February toying with that. Not sure which release it will land in, though, and we may want to have something ready now anyway.

@reuben
Contributor Author

reuben commented May 30, 2019

Yeah, it'd be better to release 0.5 without this regression. But if better subgraph support is coming in, say, TF 1.14, we can always replace the solution we use now with a better one.

@lissyx
Collaborator

lissyx commented May 30, 2019

So, as discussed on IRC, we might have a way forward by playing with the execution plan: by limiting it to nodes 1 and 2 for Mfcc, I could get the expected behavior of not traversing the whole graph. The only limitation so far is that this overly naive approach produces an incorrect final result. It may just need more refinement.

@reuben reuben closed this as completed in 4fcf47c May 31, 2019
@lock

lock bot commented Jun 30, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Jun 30, 2019