I want to use the TensorFlow dataset saving and loading functions, but I am not sure I understand the sharding method.
The documentation states:
The saved dataset is saved in multiple file "shards". By default, the dataset output is divided to shards in a round-robin fashion but custom sharding can be specified via the shard_func function.
However, when I save a dataset with the save function, it seems that only one huge shard is generated:
import os  # needed for os.path.join below
import tempfile
import tensorflow as tf
path = os.path.join(tempfile.gettempdir(), "saved_data")
dataset = tf.data.Dataset.range(10**8)
dataset.save(path)
Am I missing something?
I am using TensorFlow 2.10.0 and Python 3.9.7.
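For reference, here is a minimal sketch of how a custom shard_func might be passed to save, based on the shard_func parameter mentioned in the documentation quoted above. The names NUM_SHARDS and assign_shard are hypothetical, and the approach (enumerating the dataset and using the index modulo the shard count) is just one assumed way to spread elements over several shards. It uses a much smaller range than the example above to keep it quick.

```python
import os
import tempfile

import tensorflow as tf

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration

# Use a fresh temporary directory so earlier runs do not interfere.
path = os.path.join(tempfile.mkdtemp(), "saved_data_sharded")

# enumerate() pairs each element with an int64 index: (index, value).
dataset = tf.data.Dataset.range(1000).enumerate()


def assign_shard(index, value):
    # Hypothetical helper: return an int64 shard ID in [0, NUM_SHARDS).
    return index % NUM_SHARDS


dataset.save(path, shard_func=assign_shard)

# Load the dataset back. Element order may differ from the original,
# so the values are sorted before being compared with the input.
loaded = tf.data.Dataset.load(path)
loaded_values = sorted(int(value) for _, value in loaded.as_numpy_iterator())
```

After this runs, loaded_values should contain the same 1000 values that were saved. Whether the result is actually split across several files can be checked by listing the contents of path.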