What is the suggested way of loading data from GCS? The sample code shows copying the data from GCS to the /tmp/ directory. If this is the suggested approach, how much data may be copied to /tmp/?
While you have that option, you shouldn't need to copy the data to local disk. You can reference training and evaluation data directly from GCS by using your files'/objects' GCS URIs -- e.g. gs://bucket/path/to/file -- anywhere TensorFlow APIs accept a local file system path. TensorFlow supports reading data from (and writing to) GCS.
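As a minimal sketch of what that looks like (the gs:// bucket name below is a placeholder, and the demo writes/reads a local file so no GCS credentials are needed to run it):

```python
import os
import tempfile

import tensorflow as tf


def read_text(path):
    # tf.io.gfile treats "/tmp/train.csv" and
    # "gs://my-bucket/path/to/train.csv" (placeholder bucket) the same;
    # only the path string changes.
    with tf.io.gfile.GFile(path, "r") as f:
        return f.read()


# Demonstrate with a local file; swapping in a gs:// URI is the only change
# needed to read the same data from GCS.
tmp_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with tf.io.gfile.GFile(tmp_path, "w") as f:
    f.write("hello from gfile")

print(read_text(tmp_path))
```

The same substitution applies to higher-level APIs (e.g. passing gs:// paths to dataset constructors) since they route file access through the same filesystem layer.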
You should also be able to use a prefix or wildcard pattern to reference a set of matching files, rather than referencing each file individually.
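A sketch of that pattern matching with `tf.io.gfile.glob` (the gs:// bucket is a placeholder; the demo runs against a local directory, since the same call works on both):

```python
import os
import tempfile

import tensorflow as tf

# Against GCS you'd match sharded files with something like (placeholder bucket):
#   tf.io.gfile.glob("gs://my-bucket/data/train-*")
# The identical API works on local paths, demonstrated here:
data_dir = tempfile.mkdtemp()
for i in range(3):
    with tf.io.gfile.GFile(os.path.join(data_dir, "train-%05d" % i), "w") as f:
        f.write("shard %d" % i)

shards = sorted(tf.io.gfile.glob(os.path.join(data_dir, "train-*")))
print(len(shards))  # number of matching shard files
```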
Followup note -- you'll want to check out https://cloud.google.com/ml/docs/how-tos/using-external-buckets in case you need to set ACLs on your data so it's accessible to training.
Hope that helps.