How to evaluate a pretrained model in the TensorFlow Object Detection API

I'm trying to work with the recently released TensorFlow Object Detection API, and was wondering how I could evaluate one of the pretrained models provided in the model zoo, e.g. how can I get the mAP value for a pretrained model?

Since the script they've provided seems to use checkpoints (according to their documentation), I've tried making a dumb copy of a checkpoint file that pointed to the provided model.ckpt.data-00000-of-00001 in their model zoo, but eval.py didn't like that:

checkpoint
   model_checkpoint_path: "model.ckpt.data-00000-of-00001"

I've considered briefly training on top of the pretrained model and then evaluating that... but I'm not sure this would give me the right metric.

Sorry if this is a rudimentary question - I'm just starting out with TensorFlow and wanted to verify I was getting the right numbers. Would appreciate any pointers!

EDIT:

I made a checkpoint file as per Jonathan's answer:

model_checkpoint_path: "model.ckpt"
all_model_checkpoint_paths: "model.ckpt"

which the evaluation script accepted and ran against the COCO dataset. However, the evaluation stopped and reported a shape mismatch:

...
[[Node: save/Assign_19 = Assign[T=DT_FLOAT, _class=["loc:@BoxPredictor_4/ClassPredictor/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](BoxPredictor_4/ClassPredictor/weights, save/RestoreV2_19/_15)]]
2017-07-05 18:40:11.969641: W tensorflow/core/framework/op_kernel.cc:1158] Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [1,1,256,486] rhs shape= [1,1,256,546]
[[Node: save/Assign_19 = Assign[T=DT_FLOAT, _class=["loc:@BoxPredictor_4/ClassPredictor/weights"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](BoxPredictor_4/ClassPredictor/weights, save/RestoreV2_19/_15)]]
2017-07-05 18:40:11.969725: W tensorflow/core/framework/op_kernel.cc:1158] 
...
Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [1,1,256,486] rhs shape= [1,1,256,546]
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [1,1,256,486] rhs shape= [1,1,256,546]

What might have caused this shape mismatch? And how do I fix it?

Indebted answered 22/6, 2017 at 18:53 Comment(0)

You can evaluate the pretrained models by running the eval.py script. It will ask you to point to a config file (found in the samples/configs directory) and to a checkpoint; for the latter you provide a path of the form .../.../model.ckpt (dropping any extensions, like .meta or .data-00000-of-00001).

You also have to create a file named "checkpoint" inside the directory that contains the checkpoint you'd like to evaluate, and write the following two lines inside it:

model_checkpoint_path: "path/to/model.ckpt"
all_model_checkpoint_paths: "path/to/model.ckpt"

(where you modify path/to/ appropriately)
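
For concreteness, here's a minimal shell sketch of the whole procedure. The model directory name, eval directory, and config file below are placeholders of my own; I'm assuming an SSD MobileNet COCO checkpoint extracted from the model zoo and that you run from the object_detection directory:

    # Placeholder path to an extracted model zoo tarball.
    MODEL_DIR=~/models/ssd_mobilenet_v1_coco_11_06_2017

    # Write the two-line "checkpoint" index file next to the model.ckpt.* files.
    printf 'model_checkpoint_path: "%s"\nall_model_checkpoint_paths: "%s"\n' \
        "$MODEL_DIR/model.ckpt" "$MODEL_DIR/model.ckpt" > "$MODEL_DIR/checkpoint"

    # Run evaluation with the sample config matching this architecture.
    python eval.py --logtostderr \
        --checkpoint_dir="$MODEL_DIR" \
        --eval_dir=/tmp/eval \
        --pipeline_config_path=samples/configs/ssd_mobilenet_v1_coco.config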

The number that you get at the end is mean Average Precision using 50% IoU as the cutoff threshold for true positives. This is slightly different from the metric reported in the model zoo, which uses the COCO mAP metric and averages over multiple IoU values.
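
(Concretely, the COCO metric averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05: mAP_COCO = (mAP@0.50 + mAP@0.55 + ... + mAP@0.95) / 10. The number eval.py reports here corresponds to just the mAP@0.50 term.)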

Ulani answered 29/6, 2017 at 0:56 Comment(8)
Thanks for the reply Jonathan! I tried running python eval.py --logtostderr --checkpoint_dir=path/to/model.ckpt --eval_dir=path/to/eval --pipeline_config_path=path/to/.config but this didn't work; to clarify, where exactly do I indicate what to point to? (Currently I'm using the .config file to point to the ckpt file as well.) Also just to be sure: is it a single mAP value that I get at the end?Indebted
You will get a single mAP value at the end, yes. Regarding config files, check out this directory: github.com/tensorflow/models/tree/master/object_detection/… --- you will have to point at the file inside that directory that matches the checkpoint that you'd like to evaluate.Ulani
Sorry, is this in regards to evaluating the model in general? I was hoping to reproduce the model zoo results. I ended up converting the COCO dataset to a TFRecord and running training/evaluating on that for a couple of iterations to get the mAP.... the information about the mAP value difference was helpful though!Indebted
Hi Jon - I missed your edit about the checkpoint file. Tried that and ran into another problem, which I edited into my question. Thanks!Indebted
Not sure I've seen this error before. Can you provide your exact command line and config file?Ulani
After running eval.py, where is the mAP value located? I can't find it in the eval folder, and if I run TensorBoard it doesn't show me anything either.Egwin
When you say "running the eval.py script", I don't know which script you are referring to. Could you please specify where I can find it? Thank you!Incommunicado
@jan-pisl In TF2, eval.py was moved to the models/research/object_detection/legacy directory. The preferred way to run evaluation in TF2 is to use models/research/object_detection/model_main_tf2.py and pass a checkpoint_dir argument.Weiner

You can also use model_main.py to evaluate your model.

If you want to evaluate your model on validation data you should use:

python models/research/object_detection/model_main.py --pipeline_config_path=/path/to/pipeline_file --model_dir=/path/to/output_results --checkpoint_dir=/path/to/directory_holding_checkpoint --run_once=True

If you want to evaluate your model on the training data, you should set eval_training_data to True, that is:

python models/research/object_detection/model_main.py --pipeline_config_path=/path/to/pipeline_file --model_dir=/path/to/output_results --eval_training_data=True --checkpoint_dir=/path/to/directory_holding_checkpoint --run_once=True

I'll also add some comments to clarify the previous options:

--pipeline_config_path: path to the "pipeline.config" file used to train the detection model. This file should include the paths to the TFRecord files (train and test) that you want to evaluate, e.g.:

    ...
    train_input_reader: {
        tf_record_input_reader {
                #path to the training TFRecord
                input_path: "/path/to/train.record"
        }
        #path to the label map 
        label_map_path: "/path/to/label_map.pbtxt"
    }
    ...
    eval_input_reader: {
        tf_record_input_reader {
            #path to the testing TFRecord
            input_path: "/path/to/test.record"
        }
        #path to the label map 
        label_map_path: "/path/to/label_map.pbtxt"
    }
    ...

--model_dir: output directory where the resulting metrics will be written, in particular the "events.*" files that TensorBoard can read (see the TensorBoard example after this list).

--checkpoint_dir: directory holding a checkpoint, i.e. the model directory where the checkpoint files ("model.ckpt.*") have been written, either during training or after exporting with "export_inference_graph.py". In your case, you should point to the pretrained model folder downloaded from https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md.

--run_once: True to run just one round of evaluation.
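
Once evaluation has run, the resulting metrics can be inspected with TensorBoard (the path below is the same placeholder used in the commands above):

    # Read the events.* summaries written to --model_dir, then open
    # http://localhost:6006 and look for DetectionBoxes_Precision/mAP
    # under the Scalars tab.
    tensorboard --logdir=/path/to/output_results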

Bambibambie answered 7/5, 2020 at 0:5 Comment(2)
There's a bug in this method anyway: github.com/tensorflow/models/pull/5450Diachronic
But how can someone evaluate a pre-trained model? E.g. if I have just a TFLite model file, can I evaluate that?Etana

Try:

python eval.py --logtostderr --checkpoint_dir=training --eval_dir=path/to/eval_dir --pipeline_config_path=path/to/pretrained_model.config

For example:

python eval.py --logtostderr --checkpoint_dir=training --eval_dir=images/val \
  --pipeline_config_path=training/faster_rcnn_inception_v2.config

Note:

The training dir contains all your training checkpoints. During training, TensorFlow generates a checkpoint file inside this directory with all your checkpoint metadata in it, so you do not need to create another one. If you wish to evaluate your trained custom model after generating your inference graph, make sure you change your original pretrained_model/model.ckpt to your new_trained_model/model.ckpt in the .config you used for training.
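
For example, one way to make that swap from the command line (a sketch with placeholder paths; I'm assuming the checkpoint path lives in the config's fine_tune_checkpoint field):

    # Replace the pretrained checkpoint path with the newly trained one
    # inside the pipeline config. GNU sed syntax; on macOS use `sed -i ''`.
    sed -i 's|pretrained_model/model.ckpt|new_trained_model/model.ckpt|' \
        training/faster_rcnn_inception_v2.config

You should then get output similar to this: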

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.457
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.729
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.502
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.122
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.297
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.659
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.398
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.559
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.590
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.236
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.486
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.746
INFO:tensorflow:Writing metrics to tf summary.
INFO:tensorflow:DetectionBoxes_Precision/mAP: 0.456758
INFO:tensorflow:DetectionBoxes_Precision/mAP (large): 0.659280
INFO:tensorflow:DetectionBoxes_Precision/mAP (medium): 0.296693
INFO:tensorflow:DetectionBoxes_Precision/mAP (small): 0.122108
INFO:tensorflow:DetectionBoxes_Precision/[email protected]: 0.728587
INFO:tensorflow:DetectionBoxes_Precision/[email protected]: 0.502194
INFO:tensorflow:DetectionBoxes_Recall/AR@1: 0.397509
INFO:tensorflow:DetectionBoxes_Recall/AR@10: 0.558966
INFO:tensorflow:DetectionBoxes_Recall/AR@100: 0.590182
INFO:tensorflow:DetectionBoxes_Recall/AR@100 (large): 0.745691
INFO:tensorflow:DetectionBoxes_Recall/AR@100 (medium): 0.485964
INFO:tensorflow:DetectionBoxes_Recall/AR@100 (small): 0.236275
INFO:tensorflow:Losses/Loss/BoxClassifierLoss/classification_loss: 0.234645
INFO:tensorflow:Losses/Loss/BoxClassifierLoss/localization_loss: 0.139109
INFO:tensorflow:Losses/Loss/RPNLoss/localization_loss: 0.603733
INFO:tensorflow:Losses/Loss/RPNLoss/objectness_loss: 0.206419
Panne answered 19/5, 2019 at 19:40 Comment(0)
