The TensorFlow documentation for tf.train.Example and tf.train.SequenceExample doesn't provide much detail. It refers to the protocol buffer definitions here: example.proto.
While both classes are used to serialize data in TensorFlow, there are important distinctions:
tf.train.Example:
- Best for handling data with a fixed number of features.
- Use when each example in your dataset is independent of the others and has a constant structure.
tf.train.SequenceExample:
- Ideal for sequential data, such as time series or sentences.
- Useful when each example contains a sequence of elements (e.g., a list of features over time).
Here's a more detailed explanation to help you decide which to use:
tf.train.Example:
- Use this for standard, non-sequential data.
- Each example is self-contained.
Examples:
- Images with fixed attributes (height, width, label).
- Tabular data where each row represents an independent record.
tf.train.SequenceExample:
- Designed for sequential or time-dependent data.
- Contains two main parts: context and feature_lists.
  - context: Fixed, per-example features (the same tf.train.Features structure that tf.train.Example uses).
  - feature_lists: Named lists of features, where each list holds one tf.train.Feature per step of the sequence.
Examples:
- Text data where each example is a sequence of words.
- Time series data where each example is a sequence of measurements over time.
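To make the context/feature_lists split concrete, here's a rough sketch that builds two sentences of different lengths. The helper make_sentence and the keys 'label' and 'tokens' are hypothetical names chosen for illustration, not anything defined by TensorFlow.

```python
import tensorflow as tf

def make_sentence(token_ids, label):
    """Hypothetical helper: one context label plus one Feature per token."""
    # Context: a single per-example value, independent of sequence length.
    context = tf.train.Features(feature={
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    })
    # Feature list: one Feature per step, so the length can vary per example.
    tokens = tf.train.FeatureList(feature=[
        tf.train.Feature(int64_list=tf.train.Int64List(value=[t]))
        for t in token_ids
    ])
    return tf.train.SequenceExample(
        context=context,
        feature_lists=tf.train.FeatureLists(feature_list={'tokens': tokens}),
    )

short = make_sentence([7, 8], label=0)
long = make_sentence([7, 8, 9, 10, 11], label=1)

# The number of steps is simply the number of Features in the list.
short_len = len(short.feature_lists.feature_list['tokens'].feature)
long_len = len(long.feature_lists.feature_list['tokens'].feature)
print(short_len, long_len)
```

Both protos share the same schema even though their sequences differ in length, which is the situation tf.train.SequenceExample is meant for.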
Example use cases:
tf.train.Example: Use for an image dataset where each image has a label and maybe additional fixed attributes.
tf.train.SequenceExample: Use for a dataset of videos where each example is a sequence of frames, or a dataset of sentences where each example is a sequence of words.
Understanding these differences will help you choose the appropriate class based on the structure and nature of your data.
Here's a simple example to illustrate the use of tf.train.Example:
import tensorflow as tf

def create_example():
    # Each key maps to a Feature that wraps exactly one list type:
    # float_list, int64_list, or bytes_list.
    feature = {
        'feature1': tf.train.Feature(float_list=tf.train.FloatList(value=[1.0, 2.0, 3.0])),
        'feature2': tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3])),
        'feature3': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'a', b'b', b'c'])),
    }
    example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
    return example_proto

example = create_example()
print(example)
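Building the proto is only half of the story; you usually serialize it and parse it back later. Here's a minimal sketch of that round trip, assuming TensorFlow 2.x with eager execution. The feature_spec dictionary is my assumption about a schema that matches the data, so adjust the shapes and dtypes to your own features.

```python
import tensorflow as tf

# Build a small Example with two fixed-length features.
example = tf.train.Example(features=tf.train.Features(feature={
    'feature1': tf.train.Feature(float_list=tf.train.FloatList(value=[1.0, 2.0, 3.0])),
    'feature2': tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3])),
}))

# Serialize to a byte string (this is what you would write to a TFRecord file).
serialized = example.SerializeToString()

# Assumed schema: each feature holds exactly three values.
feature_spec = {
    'feature1': tf.io.FixedLenFeature([3], tf.float32),
    'feature2': tf.io.FixedLenFeature([3], tf.int64),
}

# Parse the bytes back into a dict of dense tensors.
parsed = tf.io.parse_single_example(serialized, feature_spec)
print(parsed['feature1'].numpy(), parsed['feature2'].numpy())
```

Because every feature has a known, fixed shape, tf.io.FixedLenFeature is enough here; this is the "constant structure" case that tf.train.Example fits well.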
Here's a simple example to illustrate the use of tf.train.SequenceExample:
import tensorflow as tf

def create_sequence_example():
    # Context features: one value per example, independent of sequence length
    context = tf.train.Features(feature={
        'feature1': tf.train.Feature(float_list=tf.train.FloatList(value=[1.0])),
        'feature2': tf.train.Feature(int64_list=tf.train.Int64List(value=[2])),
    })
    # Sequence features: each FeatureList holds one Feature per step (two steps here)
    feature_lists = tf.train.FeatureLists(feature_list={
        'sequence_feature1': tf.train.FeatureList(feature=[
            tf.train.Feature(float_list=tf.train.FloatList(value=[1.1])),
            tf.train.Feature(float_list=tf.train.FloatList(value=[1.2])),
        ]),
        'sequence_feature2': tf.train.FeatureList(feature=[
            tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
            tf.train.Feature(int64_list=tf.train.Int64List(value=[4])),
        ]),
    })
    sequence_example_proto = tf.train.SequenceExample(
        context=context, feature_lists=feature_lists)
    return sequence_example_proto

sequence_example = create_sequence_example()
print(sequence_example)
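To read a SequenceExample back, the context and the sequence parts are described by separate schemas. Here's a minimal sketch using tf.io.parse_single_sequence_example, assuming TensorFlow 2.x with eager execution. The two spec dictionaries are my assumption about a schema matching the proto above, with one scalar per step.

```python
import tensorflow as tf

# Rebuild a SequenceExample with one context value and two steps.
sequence_example = tf.train.SequenceExample(
    context=tf.train.Features(feature={
        'feature2': tf.train.Feature(int64_list=tf.train.Int64List(value=[2])),
    }),
    feature_lists=tf.train.FeatureLists(feature_list={
        'sequence_feature1': tf.train.FeatureList(feature=[
            tf.train.Feature(float_list=tf.train.FloatList(value=[1.1])),
            tf.train.Feature(float_list=tf.train.FloatList(value=[1.2])),
        ]),
        'sequence_feature2': tf.train.FeatureList(feature=[
            tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
            tf.train.Feature(int64_list=tf.train.Int64List(value=[4])),
        ]),
    }),
)
serialized = sequence_example.SerializeToString()

# Assumed schemas: scalar context value, scalar value at every step.
context_spec = {'feature2': tf.io.FixedLenFeature([], tf.int64)}
sequence_spec = {
    'sequence_feature1': tf.io.FixedLenSequenceFeature([], tf.float32),
    'sequence_feature2': tf.io.FixedLenSequenceFeature([], tf.int64),
}

# Returns two dicts: parsed context tensors and parsed sequence tensors.
context, sequence = tf.io.parse_single_sequence_example(
    serialized,
    context_features=context_spec,
    sequence_features=sequence_spec,
)
print(context['feature2'].numpy(), sequence['sequence_feature2'].numpy())
```

Each parsed sequence tensor gets a leading dimension equal to the number of steps, so examples with different sequence lengths can share the same spec.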