Unable to test and deploy a deeplabv3-mobilenetv2 tensorflow-lite segmentation model for inference

Asked 9/11, 2018 at 15:47 Answered 29/11, 2019 at 16:35

android tensorflow tensorflow-lite deeplab

We are trying to run a semantic segmentation model on Android using DeepLabv3 and MobileNetV2. We followed the official TensorFlow Lite conversion procedure using TOCO and tflite_convert, built with Bazel. The source frozen graph was obtained from the official TensorFlow DeepLab Model Zoo.

We were able to convert the model successfully with the following command:

CUDA_VISIBLE_DEVICES="0" toco \
  --output_file=toco256.tflite \
  --graph_def_file=path/to/deeplab/deeplabv3_mnv2_pascal_trainval/frozen_inference_graph.pb \
  --input_arrays=ImageTensor \
  --output_arrays=SemanticPredictions \
  --input_shapes=1,256,256,3 \
  --inference_input_type=QUANTIZED_UINT8 \
  --inference_type=FLOAT \
  --mean_values=128 \
  --std_dev_values=127 \
  --allow_custom_ops \
  --post_training_quantize

The size of the tflite file was around 2.25 MB. But when we tried to test the model using the official benchmark tool, it failed with the following error report:

bazel run -c opt tensorflow/contrib/lite/tools/benchmark:benchmark_model -- --graph=`realpath toco256.tflite`
INFO: Analysed target //tensorflow/contrib/lite/tools/benchmark:benchmark_model (0 packages loaded).
INFO: Found 1 target...
Target //tensorflow/contrib/lite/tools/benchmark:benchmark_model up-to-date:
  bazel-bin/tensorflow/contrib/lite/tools/benchmark/benchmark_model
INFO: Elapsed time: 0.154s, Critical Path: 0.00s
INFO: 0 processes.
INFO: Build completed successfully, 1 total action
INFO: Running command line: bazel-bin/tensorflow/contrib/lite/tools/benchmark/benchmark_model '--graph=path/to/deeplab/venINFO: Build completed successfully, 1 total action
STARTING!
Num runs: [50]
Inter-run delay (seconds): [-1]
Num threads: [1]
Benchmark name: []
Output prefix: []
Warmup runs: [1]
Graph: [path/to/venv/tensorflow/toco256.tflite]
Input layers: []
Input shapes: []
Use nnapi : [0]
Loaded model path/to/venv/tensorflow/toco256.tflite
resolved reporter
Initialized session in 45.556ms
Running benchmark for 1 iterations 
tensorflow/contrib/lite/kernels/pad.cc:96 op_context.dims != 4 (3 != 4)
Node number 24 (PAD) failed to prepare.

Failed to invoke!
Aborted (core dumped)

We also tried the same command without the 'allow_custom_ops' and 'post_training_quantize' options, and even with an input size of 1,513,513,3, but the result was the same.

This issue seems to be similar to the following GitHub issue: https://github.com/tensorflow/tensorflow/issues/21266. However, that issue is supposed to be fixed in the latest version of TensorFlow.

Model: http://download.tensorflow.org/models/deeplabv3_mnv2_pascal_trainval_2018_01_29.tar.gz
TensorFlow version: 1.11
Bazel version: 0.17.2
OS: Ubuntu 18.04

Also, the Android application (using the tflite interpreter) was not able to load the model properly.

So, how can we properly convert a segmentation model to a tflite format that can be used for inference on an Android device?

UPDATE:

Using TensorFlow 1.12, we got a new error:

$ bazel run -c opt tensorflow/lite/tools/benchmark:benchmark_model -- --graph=`realpath /path/to/research/deeplab/venv/tensorflow/toco256.tflite`

    tensorflow/lite/kernels/depthwise_conv.cc:99 params->depth_multiplier * SizeOfDimension(input, 3) != SizeOfDimension(filter, 3) (0 != 32)
    Node number 30 (DEPTHWISE_CONV_2D) failed to prepare.
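
The message quotes the condition that failed. As a minimal sketch in plain Python (not the TensorFlow Lite source; the function name and the example shapes are made up for illustration), the check amounts to the following:

```python
def depthwise_channels_match(depth_multiplier, input_shape, filter_shape):
    """Mirror the condition quoted in the error message.

    Shapes are 4-element sequences; index 3 is the channel dimension.
    """
    return depth_multiplier * input_shape[3] == filter_shape[3]


# Hypothetical shapes, for illustration only.
print(depthwise_channels_match(1, (1, 128, 128, 32), (1, 3, 3, 32)))  # True

# The log reports "0 != 32": the left-hand product evaluated to 0, so either
# the depth multiplier or the input channel count was 0 in the converted model.
print(depthwise_channels_match(0, (1, 128, 128, 32), (1, 3, 3, 32)))  # False
```

Which of the two factors was zero cannot be read from the log alone.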

Also, while using a newer version of the same model (3 MB .pb file) with depth_multiplier=0.5 from the TensorFlow DeepLab model zoo, we got a different error:

F tensorflow/lite/toco/graph_transformations/propagate_fixed_sizes.cc:116] Check failed: dim_x == dim_y (3 vs. 32)Dimensions must match

In this case we used the same conversion command as above, but we were not even able to produce a tflite file as output. It seems to be an issue with the depth multiplier values. (We even tried passing the depth_multiplier parameter as an argument at conversion time.)

Cleanser answered 9/11, 2018 at 15:47 Comment(0)

I also ran into this problem. There seem to be two issues in the conversion:

The input tensor has a dynamic shape, that is, [?,?,?,3].
The pad_to_bounding_box part of the graph is not automatically converted to a static shape.
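
To make the first point concrete, here is a small plain-Python sketch (the helper names are hypothetical and nothing here touches TensorFlow) that parses a printed shape and reports which dimensions are still unknown. The steps below aim to get this list empty for the input tensor.

```python
def parse_shape(text):
    """Parse a shape such as "[?,?,?,3]" into a list; '?' becomes None."""
    items = text.strip().strip("[]").split(",")
    return [None if item.strip() == "?" else int(item) for item in items]


def dynamic_dims(shape):
    """Return the indices of the dimensions whose size is unknown."""
    return [i for i, dim in enumerate(shape) if dim is None]


shape = parse_shape("[?,?,?,3]")
print(shape)                                       # [None, None, None, 3]
print(dynamic_dims(shape))                         # [0, 1, 2]
print(dynamic_dims(parse_shape("[1,400,225,3]")))  # []
```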

The solution below was tested on:

TensorFlow 1.15
Ubuntu 16.04

Solution

I assume you've already created a .pb file using the export_model.py file in the deeplab folder and named this file deeplab_mobilenet_v2.pb. From here on:

STEP 1: OPTIMIZE FOR INFERENCE

Download optimize_for_inference.py: https://raw.githubusercontent.com/benoitsteiner/tensorflow-opencl/master/tensorflow/python/tools/optimize_for_inference.py
Run the optimization (change the parameters according to your configuration):

python3 optimize_for_inference.py \
        --input "path/to/your/deeplab_mobilenet_v2.pb" \
        --output "path/to/deeplab_mobilenet_v2_opt.pb" \
        --frozen_graph True \
        --input_names ImageTensor \
        --output_names SemanticPredictions \
        --placeholder_type_enum=4

placeholder_type_enum=4 is the uint8 data type (dtypes.uint8.as_datatype_enum)
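
If you need a different placeholder type, the number comes from TensorFlow's DataType enum. The sketch below hard-codes a small subset of that enum so it runs without TensorFlow; the table and the helper name are illustrative, and with TensorFlow installed you can confirm a value with, for example, tf.uint8.as_datatype_enum.

```python
# Subset of TensorFlow's DataType enum (tensorflow/core/framework/types.proto).
# Hard-coded here for illustration; verify against your TensorFlow version.
DATATYPE_ENUM = {
    "float32": 1,
    "int32": 3,
    "uint8": 4,
    "quint8": 12,
}


def placeholder_type_flag(dtype_name):
    """Build the --placeholder_type_enum flag for a given dtype name."""
    return "--placeholder_type_enum={}".format(DATATYPE_ENUM[dtype_name])


print(placeholder_type_flag("uint8"))  # --placeholder_type_enum=4
```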

STEP 2: APPLY GRAPH TRANSFORM TOOL

Make sure you have installed Bazel and have downloaded the TensorFlow r1.15 branch from GitHub. Then build the transform_graph tool from the TensorFlow repo:

bazel build tensorflow/tools/graph_transforms:transform_graph

Then run the transform_graph tool (make sure to set the shape to whatever shape you are using as input):

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph="/path/to/deeplab_mobilenet_v2_opt.pb" \
--out_graph="/path/to/deeplab_mobilenet_v2_opt_flatten.pb" \
--inputs='ImageTensor' \
--outputs='SemanticPredictions' \
--transforms='
    strip_unused_nodes(type=quint8, shape="1,400,225,3")
    flatten_atrous_conv
    fold_constants(ignore_errors=true, clear_output_shapes=false)
    fold_batch_norms
    fold_old_batch_norms
    remove_device
    sort_by_execution_order'
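
The same input shape appears in three places: the strip_unused_nodes transform above, the placeholder in step 3 and the conversion command in step 4. One way to keep them in sync is to derive all three from a single constant. This is only a sketch; INPUT_SHAPE and shape_csv are names chosen here, and the NHWC ordering in the comment is an assumption.

```python
INPUT_SHAPE = (1, 400, 225, 3)  # assumed NHWC: batch, height, width, channels


def shape_csv(shape):
    """Format a shape tuple as the comma-separated string the CLI tools expect."""
    return ",".join(str(dim) for dim in shape)


# Step 2: argument of strip_unused_nodes inside --transforms
print('strip_unused_nodes(type=quint8, shape="{}")'.format(shape_csv(INPUT_SHAPE)))
# Step 3: shape argument passed to tf.placeholder
print(list(INPUT_SHAPE))
# Step 4: value of the input-shape flag passed to tflite_convert
print(shape_csv(INPUT_SHAPE))
```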

STEP 3: BYPASS pad_to_bounding_box NODE AND MAKE INPUT STATIC

Run the Python file below, making sure to change model_filepath, save_folder and save_name to suit your needs.

import tensorflow as tf
import numpy as np
from tensorflow.contrib import graph_editor as ge

def freeze_session(session, keep_var_names=None, output_names=None, clear_devices=True):
    """
    Freezes the state of a session into a pruned computation graph.

    Creates a new computation graph where variable nodes are replaced by
    constants taking their current value in the session. The new graph will be
    pruned so subgraphs that are not necessary to compute the requested
    outputs are removed.
    @param session The TensorFlow session to be frozen.
    @param keep_var_names A list of variable names that should not be frozen,
                          or None to freeze all the variables in the graph.
    @param output_names Names of the relevant graph outputs.
    @param clear_devices Remove the device directives from the graph for better portability.
    @return The frozen graph definition.
    """
    graph = session.graph
    with graph.as_default():
        freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(keep_var_names or []))
        output_names = output_names or []
        output_names += [v.op.name for v in tf.global_variables()]
        input_graph_def = graph.as_graph_def()
        if clear_devices:
            for node in input_graph_def.node:
                node.device = ""
        frozen_graph = tf.graph_util.convert_variables_to_constants(
            session, input_graph_def, output_names, freeze_var_names)
        return frozen_graph

def load_convert_save_graph(model_filepath, save_folder, save_name):
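    '''
    Load the transformed graph, bypass pad_to_bounding_box, make the input
    shape static and save the result as a frozen graph.
    '''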
    print('Loading model...')
    graph = tf.Graph()
    sess = tf.InteractiveSession(graph = graph)

    with tf.gfile.GFile(model_filepath, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    print('Check out the input placeholders:')
    # Compare with ==; "in ('Placeholder')" would do a substring test on a plain string
    nodes = [n.name + ' => ' + n.op for n in graph_def.node if n.op == 'Placeholder']
    for node in nodes:
        print(node)

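    # Define a static-shape input tensor that replaces the original placeholder
    # (named input_tensor to avoid shadowing the built-in input())
    input_tensor = tf.placeholder(np.uint8, shape=[1, 400, 225, 3], name='ImageTensor')

    tf.import_graph_def(graph_def, {'ImageTensor': input_tensor}, name='')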

    print('Model loading complete!')

    # remove the pad to bounding box node
    name = "pad_to_bounding_box"
    print(name)
    sgv = ge.make_view_from_scope(name, tf.get_default_graph())
    print("\t" + sgv.inputs[0].name)
    for node in sgv.inputs:
        print("name in = " + node.name)
    for node in sgv.outputs:
        print("name out = " + node.name)
    print("\t" + sgv.outputs[len(sgv.outputs)-1].name)
    sgv = sgv.remap_inputs([0])
    sgv = sgv.remap_outputs([len(sgv.outputs)-1])
    (sgv2, det_inputs) = ge.bypass(sgv)


    frozen_graph = freeze_session(sess,
                              output_names=['SemanticPredictions'])
    tf.train.write_graph(frozen_graph, save_folder, save_name, as_text=False)


load_convert_save_graph("path/to/deeplab_mobilenet_v2_opt_flatten.pb", "/path/to", "deeplab_mobilenet_v2_opt_flatten_static.pb")

STEP 4: CONVERT TO TFLITE

tflite_convert \
  --graph_def_file="/path/to/deeplab_mobilenet_v2_opt_flatten_static.pb" \
  --output_file="/path/to/deeplab_mobilenet_v2_opt_flatten_static.tflite" \
  --output_format=TFLITE \
  --input_shapes=1,400,225,3 \
  --input_arrays="ImageTensor" \
  --inference_type=FLOAT \
  --inference_input_type=QUANTIZED_UINT8 \
  --std_dev_values=128 \
  --mean_values=128 \
  --change_concat_input_ranges=true \
  --output_arrays="SemanticPredictions" \
  --allow_custom_ops
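
A note on --mean_values and --std_dev_values: the TF 1.x converter documentation describes them through the relation real_value = (quantized_value - mean_value) / std_dev_value. The sketch below just evaluates that formula for the values used in the question (128, 127) and in this answer (128, 128). Whether the mapping is actually applied to this graph, whose ImageTensor input is already uint8, has not been verified here.

```python
def dequantize(q, mean_value, std_dev_value):
    """Apply real_value = (quantized_value - mean_value) / std_dev_value."""
    return (q - mean_value) / std_dev_value


for label, mean, std in (("question", 128, 127), ("answer", 128, 128)):
    low = dequantize(0, mean, std)     # smallest uint8 value
    high = dequantize(255, mean, std)  # largest uint8 value
    print("{}: [{:.4f}, {:.4f}]".format(label, low, high))
# question: [-1.0079, 1.0000]
# answer: [-1.0000, 0.9922]
```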

DONE

You can now run your tflite model
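
Before copying the file into an Android project, a cheap sanity check is to look for the TFLite flatbuffer file identifier ("TFL3", stored at bytes 4 to 7 of the file). The sketch below uses only the standard library; the function names are made up here, and passing the check only rules out an obviously wrong or truncated file, it does not prove that the interpreter can run the model.

```python
import struct

TFLITE_IDENTIFIER = b"TFL3"  # file identifier declared in the TFLite schema


def looks_like_tflite(data):
    """Return True if bytes 4..8 carry the TFLite flatbuffer identifier."""
    return len(data) >= 8 and data[4:8] == TFLITE_IDENTIFIER


def file_looks_like_tflite(path):
    """Read only the first 8 bytes of a file and check the identifier."""
    with open(path, "rb") as f:
        return looks_like_tflite(f.read(8))


# Self-contained demo with synthetic headers (not a real model).
fake_header = struct.pack("<I", 16) + TFLITE_IDENTIFIER
print(looks_like_tflite(fake_header))        # True
print(looks_like_tflite(b"PK\x03\x04abcd"))  # False
```

For the file produced in step 4 you would call file_looks_like_tflite("/path/to/deeplab_mobilenet_v2_opt_flatten_static.tflite").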

Dartmoor answered 29/11, 2019 at 16:35 Comment(1)

If there were an option to upvote many times, I would do that. This answer saved me big time! – Yarn 16/11, 2020 at 17:5

I have the same issue. From https://github.com/tantara/JejuNet I see that the author successfully converted the model to tflite. I sent him a private message asking for help, but unfortunately there has been no response so far.

Cockalorum answered 23/11, 2018 at 1:25 Comment(3)

Actually, we were trying the same thing, and we even had a discussion with the same person. Unfortunately, we still couldn't resolve the issue. It looks like the problem lies in the initial layers of the frozen graph, where some preprocessing operations related to size or dimension adjustment are carried out (the input is [1, ?, ?, 3], to accept arbitrary sizes). Some of them may not be supported by TensorFlow Lite (which expects fixed-size inputs). Maybe it will work if we remove or skip them; otherwise we may have to retrain the network after modifying it. – Cleanser 24/11, 2018 at 3:51

I dumped the JejuNet tflite model; it is very different from the one we converted from the official model. Its input node is MobilenetV2/MobilenetV2/input with type uint8[1,256,256,3], and it has only 71 ops in total, whereas ours has 156. – Cockalorum 26/11, 2018 at 2:11

Is the issue resolved now? If so, could you please share how you resolved it? In my case I could generate the .tflite file, but when I place it in JejuNet I get an error like "java.lang.IllegalArgumentException: ByteBuffer is not a valid flatbuffer model". If possible, please share the parameters used to convert the .pb to .tflite. If you are not using JejuNet, what else would you suggest I use? – Fanya 19/4, 2019 at 6:8