How to use the TFRecord format?

When working with the TFRecord format through TensorFlow's API, you run into several classes. There are enough of them that their relationships can be hard to follow, so here they are laid out as a hierarchy.

  • tf.train.Example
    • tf.train.Features
      • tf.train.Feature
        • tf.train.BytesList
        • tf.train.FloatList
        • tf.train.Int64List
  • Fundamentally, a tf.train.Example is a {"string": tf.train.Feature} mapping. A tf.train.Feature message holds exactly one of three list types (tf.train.BytesList, tf.train.FloatList, or tf.train.Int64List). The tutorial linked below defines helper functions that each take a scalar input value and return a tf.train.Feature containing one of the three list types.
    See https://www.tensorflow.org/tutorials/load_data/tfrecord#creating_a_tftrainexample_message
    import tensorflow as tf
    int64_list = tf.train.Int64List(value=[1, 2, 3, 4])
    # A plain dict can be passed in place of the Int64List message:
    # int64_list = dict(value=[1, 2, 3, 4])
    int64_feature = tf.train.Feature(int64_list=int64_list)
    int64_feature
    # int64_list {
    #   value: 1
    #   value: 2
    #   value: 3
    #   value: 4
    # }
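
The example above covers only tf.train.Int64List. As a minimal sketch of how each of the three list types can wrap a single scalar, the helpers below follow the pattern described in the linked tutorial. The names bytes_feature, float_feature, and int64_feature are illustrative and not part of the TensorFlow API.

```python
import tensorflow as tf


def bytes_feature(value: bytes) -> tf.train.Feature:
    """Wrap a single bytes value in a BytesList-backed Feature."""
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def float_feature(value: float) -> tf.train.Feature:
    """Wrap a single float in a FloatList-backed Feature."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))


def int64_feature(value: int) -> tf.train.Feature:
    """Wrap a single integer in an Int64List-backed Feature."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


# Hypothetical sample values, one per list type.
name = bytes_feature(b"cat")
score = float_feature(0.5)
label = int64_feature(3)

# A Feature is a protobuf "oneof" named "kind"; WhichOneof reports
# which of the three list fields is actually set.
print(name.WhichOneof("kind"), score.WhichOneof("kind"), label.WhichOneof("kind"))
```

Strings must be encoded to bytes before they go into a BytesList, which is why the sample value is written as b"cat".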
    
    Create a dictionary that maps each feature name to a tf.train.Feature, and wrap it in tf.train.Features.
    int64_feature0 = tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3, 4]))
    int64_feature1 = tf.train.Feature(int64_list=tf.train.Int64List(value=[5, 6, 7, 8]))
    int64_feature2 = tf.train.Feature(int64_list=tf.train.Int64List(value=[9, 10, 11, 12]))
    int64_feature3 = tf.train.Feature(int64_list=tf.train.Int64List(value=[13, 14, 15, 16]))
    
    int64_features = tf.train.Features(feature={
        'feature0': int64_feature0,
        'feature1': int64_feature1,
        'feature2': int64_feature2,
        'feature3': int64_feature3,
    })
    
    Then pass it to tf.train.Example.
    tf.train.Example(features=int64_features)
    
    features {
      feature {
        key: "feature0"
        value {
          int64_list {
            value: 1
            value: 2
            value: 3
            value: 4
          }
        }
      }
      feature {
        key: "feature1"
        value {
          int64_list {
            value: 5
            value: 6
            value: 7
            value: 8
          }
        }
      }
      feature {
        key: "feature2"
        value {
          int64_list {
            value: 9
            value: 10
            value: 11
            value: 12
          }
        }
      }
      feature {
        key: "feature3"
        value {
          int64_list {
            value: 13
            value: 14
            value: 15
            value: 16
          }
        }
      }
    }
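
A tf.train.Example only becomes a TFRecord once it is serialized and written to a file. The following is a minimal sketch of that round trip, assuming TensorFlow 2.x with eager execution. The file name example.tfrecord, the temporary directory, and the two-feature example are illustrative choices, not part of the original walkthrough.

```python
import os
import tempfile

import tensorflow as tf

# Build a small Example with two of the integer features used above.
example = tf.train.Example(features=tf.train.Features(feature={
    "feature0": tf.train.Feature(int64_list=tf.train.Int64List(value=[1, 2, 3, 4])),
    "feature1": tf.train.Feature(int64_list=tf.train.Int64List(value=[5, 6, 7, 8])),
}))

# Hypothetical output location in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.tfrecord")

# Each record in a TFRecord file is one serialized message.
with tf.io.TFRecordWriter(path) as writer:
    writer.write(example.SerializeToString())

# Parsing needs a spec that states the shape and dtype of every feature.
feature_spec = {
    "feature0": tf.io.FixedLenFeature([4], tf.int64),
    "feature1": tf.io.FixedLenFeature([4], tf.int64),
}

dataset = tf.data.TFRecordDataset(path).map(
    lambda record: tf.io.parse_single_example(record, feature_spec)
)

# Read the single record back and convert the tensors to Python lists.
parsed = next(iter(dataset))
feature0 = parsed["feature0"].numpy().tolist()
feature1 = parsed["feature1"].numpy().tolist()
print(feature0, feature1)
```

tf.io.FixedLenFeature fits here because every feature has exactly four values; features whose length varies between records would need tf.io.VarLenFeature instead.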
    
