Some Python objects were not bound to checkpointed values

4

17

I am trying to get started with the TensorFlow 2 Object Detection API. I followed the official tutorial for the installation, and all the tests pass. However, when I try to run the main module I keep getting an error message that I don't understand. This is how I run it:

python model_main_tf2.py --model_dir=ssd_resnet50_v1_fpn_640x640_coco17_tpu-8 --pipeline_config_path=ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/pipeline.config

This is the beginning of the error message:

Traceback (most recent call last):
  File "model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "/home/hd/hd_hd/hd_rs239/.conda/envs/jan_tf2/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/hd/hd_hd/hd_rs239/.conda/envs/jan_tf2/lib/python3.7/site-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/home/hd/hd_hd/hd_rs239/.conda/envs/jan_tf2/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "model_main_tf2.py", line 110, in main
    record_summaries=FLAGS.record_summaries)
  File "/home/hd/hd_hd/hd_rs239/.conda/envs/jan_tf2/lib/python3.7/site-packages/object_detection/model_lib_v2.py", line 569, in train_loop
    unpad_groundtruth_tensors)
  File "/home/hd/hd_hd/hd_rs239/.conda/envs/jan_tf2/lib/python3.7/site-packages/object_detection/model_lib_v2.py", line 383, in load_fine_tune_checkpoint
    ckpt.restore(checkpoint_path).assert_existing_objects_matched()
  File "/home/hd/hd_hd/hd_rs239/.conda/envs/jan_tf2/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 791, in assert_existing_objects_matched
    (list(unused_python_objects),))
AssertionError: Some Python objects were not bound to checkpointed values, likely due to changes in the Python program: [SyncOnReadVariable:{
  0: <tf.Variable 'conv2_block1_0_bn/moving_variance:0' shape=(256,) dtype=float32, numpy=
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,

In the pipeline.config, I specify a checkpoint like this:

  fine_tune_checkpoint: "ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0" 

These are the contents of ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ :

checkpoint  
ckpt-0.data-00000-of-00001  
ckpt-0.index

I have searched Google but couldn't find any answer. In this issue, the suggested solution is outdated (the code they suggest to replace is not there anymore).

Question: What is the problem and how can I solve it?

I am doing this on a server running CentOS Linux 7, with Python 3.7. I am new to TensorFlow, so please let me know if I have left out any important information.

Cholecyst answered 23/8, 2020 at 21:29 Comment(2)
Most likely a bug in the latest TensorFlow, 2.3.0. - Piecrust
I hit the same bug in TensorFlow 2.2.0 when exporting my custom model. - Middlebrow
52

The model name you provided (ssd_resnet50_v1_fpn_640x640_coco17_tpu-8) shows that you are working on an object detection task, and the checkpoint shipped with that model is a detection checkpoint. Therefore, in your pipeline.config file, change this line:

fine_tune_checkpoint_type: "classification"

To:

fine_tune_checkpoint_type: "detection"

This should solve your problem.
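
If you prefer to make this change with a script rather than by hand, here is a minimal sketch. It assumes the field appears in pipeline.config as a quoted string, as in the lines above; the helper name set_fine_tune_checkpoint_type and the sample text are only illustrative and are not part of the Object Detection API.

```python
import re


def set_fine_tune_checkpoint_type(config_text: str, new_type: str = "detection") -> str:
    """Return config_text with fine_tune_checkpoint_type set to new_type."""
    # Capture the field name and colon, then replace only the quoted value.
    pattern = re.compile(r'(fine_tune_checkpoint_type\s*:\s*)"[^"]*"')
    updated, count = pattern.subn(lambda m: f'{m.group(1)}"{new_type}"', config_text)
    if count == 0:
        raise ValueError("fine_tune_checkpoint_type not found in config text")
    return updated


# Illustrative fragment only; a real pipeline.config has many more fields.
sample = '''train_config {
  fine_tune_checkpoint: "ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "classification"
}
'''

fixed = set_fine_tune_checkpoint_type(sample)
print(fixed)
```

To apply it to a real file, read pipeline.config into a string, pass it through the function, and write the result back. Keep a backup copy of the original first.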

Laywoman answered 1/9, 2020 at 2:26 Comment(0)
4

For me it was useful to check the type of the feature extractor. I changed type: "mobilenet_v2" to type: "mobilenet_v2_fpn_sep_conv" in pipeline.config, and it started working.

Landholder answered 23/6, 2021 at 13:49 Comment(0)
1

I've been running into the same issue trying to get MobileNet and CenterNet to work. First of all, this error seems to depend on which TensorFlow version you are using. In my case, a colleague used TF 2.2 and it worked, whereas my TF 2.10 threw this error.

However, there are reasons why you would not want to downgrade. If you are training a custom dataset and don't need the pre-trained COCO weights, there is an easy workaround:

Simply don't use the fine tune checkpoint which you downloaded from the Model Zoo. To do so, in pipeline.config delete the line fine_tune_checkpoint: "your_path" and this error will disappear.
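
As a rough sketch of that edit, the snippet below removes the fine_tune_checkpoint line from the config text and leaves every other field untouched, including fine_tune_checkpoint_type. The helper name drop_fine_tune_checkpoint and the sample text are made up for illustration, and the sketch assumes the field sits on a line of its own.

```python
import re


def drop_fine_tune_checkpoint(config_text: str) -> str:
    """Return config_text without the fine_tune_checkpoint line."""
    # The colon must follow the field name directly (optional spaces allowed),
    # so longer names such as fine_tune_checkpoint_type are not matched.
    pattern = re.compile(r'^[ \t]*fine_tune_checkpoint[ \t]*:.*\n?', re.MULTILINE)
    return pattern.sub("", config_text)


# Illustrative fragment only; "your_path" is the placeholder used above.
sample = '''train_config {
  batch_size: 8
  fine_tune_checkpoint: "your_path"
  fine_tune_checkpoint_type: "detection"
}
'''

stripped = drop_fine_tune_checkpoint(sample)
print(stripped)
```

Training then starts from freshly initialized weights, which is the trade-off this workaround accepts.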

Ubana answered 22/11, 2022 at 8:42 Comment(0)
0

I had the same error, but for me it was a simple copy-and-paste mistake. My fine_tune_checkpoint pointed to faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8/checkpoint/ckpt-0 instead of faster_rcnn_resnet50_v1_640x640_coco17_tpu-8/checkpoint/ckpt-0.
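
A quick sanity check along these lines can catch such a mix-up before training starts. The sketch below only confirms that the files for a checkpoint prefix exist and prints the model folder the prefix points into, so you can compare it by eye with the model named in your config. It cannot verify that the architectures actually match. The function names are hypothetical, and the sketch assumes the <model>/checkpoint/ckpt-N layout shown in the question.

```python
import glob
import os
import tempfile


def checkpoint_prefix_exists(prefix: str) -> bool:
    """True if the .index file and at least one .data shard exist for prefix."""
    has_index = os.path.isfile(prefix + ".index")
    has_data = bool(glob.glob(glob.escape(prefix) + ".data-*"))
    return has_index and has_data


def model_dir_name(prefix: str) -> str:
    """Name of the model folder, assuming a <model>/checkpoint/ckpt-N layout."""
    return os.path.basename(os.path.dirname(os.path.dirname(prefix)))


# Demo on a throwaway directory that mimics the layout from the question.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = os.path.join(tmp, "some_model", "checkpoint")
    os.makedirs(ckpt_dir)
    for name in ("ckpt-0.index", "ckpt-0.data-00000-of-00001"):
        open(os.path.join(ckpt_dir, name), "w").close()

    prefix = os.path.join(ckpt_dir, "ckpt-0")
    found = checkpoint_prefix_exists(prefix)
    missing = checkpoint_prefix_exists(os.path.join(ckpt_dir, "ckpt-1"))
    folder = model_dir_name(prefix)

print(found, missing, folder)
```

In practice you would pass the value of fine_tune_checkpoint from pipeline.config as the prefix.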

Crust answered 15/11, 2022 at 7:1 Comment(0)
