I am running a script in Google Cloud Shell that calls the preprocess step below on a set of images, following the steps of the flowers example.
Preprocessing succeeded on both the eval set and the train set, but the generated .tfrecord.gz files do not seem to match the number of images in eval/train_set.csv.
For example, the file name eval-00000-of-00157.tfrecord.gz suggests there are 158 TFRecords, while eval_set.csv has 35227 rows. Every row includes a valid image_url (all images are uploaded to Storage) and a valid label.
Is there a way to monitor and control the number of images per TFRecord file in the config?
Update: I got this working. The following counts the records across all of the .tfrecord.gz files:
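import tensorflow as tf
import os
from tensorflow.python.lib.io import file_io
# The shards are gzip-compressed, so the reader needs matching options.
options = tf.python_io.TFRecordOptions(
    tf.python_io.TFRecordCompressionType.GZIP)
# 'url/path' is a placeholder for the directory that holds the shards.
pattern = os.path.join('url/path', '*.tfrecord.gz')
# Count one per record, over every file that matches the pattern.
total = sum(1 for f in file_io.get_matching_files(pattern)
            for example in tf.python_io.tf_record_iterator(f, options=options))
print(total)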
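
To see how the records are spread over the individual files rather than only the total, a per-file count may help. The following is a minimal sketch, not part of the original script: it assumes the files have been copied to a local directory, and it relies on the standard TFRecord framing (an 8-byte little-endian length, a 4-byte CRC of the length, the payload, then a 4-byte CRC of the payload). The helper names count_records and count_per_shard are hypothetical, and the CRCs are skipped rather than verified.

```python
import glob
import gzip
import os
import struct


def count_records(path):
    """Count the records in one gzip-compressed TFRecord file (CRCs not checked)."""
    count = 0
    with gzip.open(path, 'rb') as f:
        while True:
            header = f.read(12)  # 8-byte record length + 4-byte CRC of the length
            if not header:
                break  # clean end of file
            if len(header) < 12:
                raise ValueError('truncated record header in %s' % path)
            (length,) = struct.unpack('<Q', header[:8])
            body = f.read(length + 4)  # payload + 4-byte CRC of the payload
            if len(body) < length + 4:
                raise ValueError('truncated record body in %s' % path)
            count += 1
    return count


def count_per_shard(pattern):
    """Map each local file name matching the glob pattern to its record count."""
    return {os.path.basename(p): count_records(p)
            for p in sorted(glob.glob(pattern))}
```

Calling count_per_shard with a local glob pattern ending in '*.tfrecord.gz' returns one entry per file, and summing the values should agree with the total from the TensorFlow snippet above. Note that glob only sees local files, so paths in Storage would first need to be downloaded or listed with file_io instead.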