How to run TF object detection API model_main.py in evaluation mode only

I would like to evaluate a custom-trained Tensorflow object detection model on a new test set using Google Cloud.

I obtained the initial checkpoints from: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

I know that the Tensorflow object-detection API allows me to run training and evaluation simultaneously by using:

https://github.com/tensorflow/models/blob/master/research/object_detection/model_main.py

To start such a job, I submit the following ML Engine job:

gcloud ml-engine jobs submit training [JOBNAME] \
    --runtime-version 1.9 \
    --job-dir=gs://path_to_bucket/model-dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    -- \
    --model_dir=gs://path_to_bucket/model_dir \
    --pipeline_config_path=gs://path_to_bucket/data/model.config

However, after I have successfully transfer-trained a model, I would like to calculate performance metrics such as COCO mAP (http://cocodataset.org/#detection-eval) or PASCAL mAP (http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf) on a new test data set which has not been used before (neither during training nor during evaluation).

I have seen that there is a flag in model_main.py:

flags.DEFINE_string(
    'checkpoint_dir', None, 'Path to directory holding a checkpoint. If '
    '`checkpoint_dir` is provided, this binary operates in eval-only mode, '
    'writing resulting metrics to `model_dir`.')

But I don't know whether this really means that model_main.py can be run in an evaluation-only mode. If so, how should I submit the ML Engine job?
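
Would something like the following work, i.e. simply appending --checkpoint_dir after the -- separator of the training command above? (This is just my guess, I have not tried it.)

gcloud ml-engine jobs submit training [JOBNAME] \
    --runtime-version 1.9 \
    --job-dir=gs://path_to_bucket/model-dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    -- \
    --model_dir=gs://path_to_bucket/model_dir \
    --pipeline_config_path=gs://path_to_bucket/data/model.config \
    --checkpoint_dir=gs://path_to_bucket/model_dir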

Alternatively, are there any functions in the Tensorflow API that allow me to evaluate an existing output dictionary (containing bounding boxes, class labels, scores) against COCO and/or PASCAL mAP? If there are, I could easily read in a Tensorflow record file locally, run inference, and then evaluate the output dictionaries.
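
From browsing the repository, the object_detection package itself seems to ship evaluator classes that accept plain groundtruth/detection dictionaries, which might be exactly this. A minimal sketch of what I have in mind (I have not verified the exact expected box format, so please treat the field contents as assumptions):

import numpy as np
from object_detection.core import standard_fields
from object_detection.metrics import coco_evaluation

# Categories as in my label map: a list of dicts with 'id' and 'name'.
categories = [{'id': 1, 'name': 'my_class'}]
evaluator = coco_evaluation.CocoDetectionEvaluator(categories)

# For every image in my test record, add groundtruth and detections.
evaluator.add_single_ground_truth_image_info(
    image_id='image_0',
    groundtruth_dict={
        standard_fields.InputDataFields.groundtruth_boxes:
            np.array([[10., 10., 100., 100.]], dtype=np.float32),  # [ymin, xmin, ymax, xmax]
        standard_fields.InputDataFields.groundtruth_classes:
            np.array([1]),
    })
evaluator.add_single_detected_image_info(
    image_id='image_0',
    detections_dict={
        standard_fields.DetectionResultFields.detection_boxes:
            np.array([[12., 9., 99., 101.]], dtype=np.float32),
        standard_fields.DetectionResultFields.detection_scores:
            np.array([0.9], dtype=np.float32),
        standard_fields.DetectionResultFields.detection_classes:
            np.array([1]),
    })

# Returns a dict of metrics such as 'DetectionBoxes_Precision/mAP'.
metrics = evaluator.evaluate()
print(metrics)

There also appears to be a PascalDetectionEvaluator in object_detection.utils.object_detection_evaluation with the same add/evaluate interface, which would cover the PASCAL mAP case.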

I know how to obtain these metrics for the evaluation data set that is evaluated during training in model_main.py. However, from my understanding I should still report model performance on a new test data set: since I compare multiple models and do some hyper-parameter optimization, I should not report results on the evaluation data set, right? On a more general note, I really cannot comprehend why one would switch from separate training and evaluation scripts (as in the legacy code) to a combined training and evaluation script.

Edit: I found two related posts. However, I do not think that the answers provided are complete:

how to check both training/eval performances in tensorflow object_detection

How to evaluate a pretrained model in Tensorflow object detection api

The latter was written while TF's object detection API still had separate evaluation and training scripts, which is no longer the case.

Thank you very much for any help.

Irrefrangible answered 1/4, 2019 at 13:04

Comment: Thanks, I also was searching for that functionality of running just the evaluation. – Tris

If you specify checkpoint_dir and set run_once to true, then it should run evaluation exactly once on the eval dataset. I believe the metrics will be written to model_dir and should also appear in your console logs. Since it's just doing one pass over the dataset and is not a distributed job, I usually just run this on my local machine. Unfortunately I haven't tried running this particular code path on CMLE.
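
As a sketch (not something I have run end-to-end here), a local eval-only invocation with the paths from your submission would look roughly like this; checkpoint_dir and run_once are the flags defined in model_main.py:

python object_detection/model_main.py \
    --pipeline_config_path=gs://path_to_bucket/data/model.config \
    --model_dir=gs://path_to_bucket/model_dir \
    --checkpoint_dir=gs://path_to_bucket/model_dir \
    --run_once=True

To evaluate on your new test set rather than the original eval split, you would also point the eval_input_reader in the pipeline config at the test TFRecord.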

Regarding why we have a combined script: from the perspective of the Object Detection API, we were trying to write things in the tf.Estimator paradigm, but you are right that, personally, I found it a bit easier when the two functionalities lived in separate binaries. If you want, you can always wrap this functionality up in another binary :)

Cough answered 3/4, 2019 at 21:54

Comments:

Thank you for the answer. So I first run model_main.py (until the loss and/or mAP converges), right? Then I add the flag checkpoint_dir (the same directory as model_dir?) so that model_main runs in evaluation-only mode based on the latest checkpoint stored in model_dir? Or should I move the latest checkpoint to a separate directory and specify that directory as checkpoint_dir? Regarding CMLE: you are right, it should not make any difference whether this runs in the cloud or locally. – Outface

Correct, you should be able to just set checkpoint_dir to be the same as model_dir after having run model_main once to train a model. – Cough
