huggingface/tflite-android-transformers

How to do FP16 quantization on GPT-2 XL?

Archelunch opened this issue · 4 comments

How could I fix this error?
```
ValueError: Message tensorflow.GraphDef exceeds maximum protobuf size of 2GB: 6234365906
```

Hi @Archelunch, which script are you using? At which step do you get this error?

output_graph = "frozen_graph.pb"`
output_graph_def = tf.graph_util.convert_variables_to_constants(
                        sess,
                        tf.get_default_graph().as_graph_def(),
                        ["sample_sequence_2/while/Exit_3"]
                    ) 

with tf.gfile.GFile(output_graph, "wb") as f:
    f.write(output_graph_def.SerializeToString())
```
ValueError                                Traceback (most recent call last)
<ipython-input-69-6930509752c3> in <module>
      8     # serialize and dump the output graph to the filesystem
      9 with tf.gfile.GFile(output_graph, "wb") as f:
---> 10     f.write(output_graph_def.SerializeToString())

ValueError: Message tensorflow.GraphDef exceeds maximum protobuf size of 2GB: 6233583551
```

Are you working with a big model? It looks like you're hitting the 2 GB size limit of protobuf, which TensorFlow uses internally to serialize the graph. This has nothing to do with quantization; it's a more general TF limitation.
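One way to sidestep the frozen-graph step entirely is to export a SavedModel, which stores weights in separate variable shards instead of inside a single GraphDef message, and then run the TFLite converter with float16 post-training quantization on that. A minimal sketch (not this repo's script; the export directory, the `context:0` input name, and `sess` are assumptions carried over from the snippet above):

```python
import tensorflow as tf  # TF 1.x, matching the snippet above

# Hypothetical tensor names -- substitute the real input/output tensors
# of your graph. `sess` is the session used in the freezing snippet.
graph = tf.get_default_graph()
context = graph.get_tensor_by_name("context:0")  # assumed input tensor
output = graph.get_tensor_by_name("sample_sequence_2/while/Exit_3:0")

# SavedModel keeps weights in variables/ shards, so no single protobuf
# message has to hold all the constants.
tf.saved_model.simple_save(
    sess, "gpt2_saved_model",
    inputs={"context": context},
    outputs={"output": output},
)

# Post-training float16 quantization during TFLite conversion.
converter = tf.lite.TFLiteConverter.from_saved_model("gpt2_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("gpt2_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```

Caveat: the converter may still freeze the graph internally, so a model this size could hit the same ceiling at conversion time; the SavedModel export itself, though, should serialize without tripping the 2 GB limit.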

Is there any way to loop through the .pb file in chunks, similar to this Stack Overflow question/answer, to cut down on how much is loaded into memory at once?
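For context (a sketch, not an answer from the thread): that style of streaming works when a file contains many independent length-delimited messages, so you can read one record at a time. A frozen GraphDef is a single protobuf message, and the 2 GB cap is enforced on the whole message at `SerializeToString()` time, before any bytes reach disk, so chunked file I/O can't help here. The record framing below is hypothetical, purely to illustrate what the streaming approach relies on:

```python
import struct

def iter_messages(path):
    """Yield one serialized message at a time from a file laid out as
    repeated [uint32 little-endian length][payload] records.
    (Hypothetical framing; real formats such as TFRecord add CRCs.)"""
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                return  # end of file
            (size,) = struct.unpack("<I", header)
            yield f.read(size)  # only one record in memory at a time
```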