akanyaani/gpt-2-tensorflow2.0

sg.sample_sequence returns only the context after pre-training the model

bytes-commerce opened this issue · 3 comments

First of all, thanks for providing this amazing repository, which makes GPT-2 possible in TF2!
Secondly, I followed the README to pre-train my model and then used sequence_generator.py to pass some context to the model.

However, the response is always identical to the context, except that capital letters are replaced with ?s. So what am I doing wrong? Have I maybe forgotten something? Is there an edge case leading to this that could be prevented?

Please let me know any additional information you might need! Thanks a lot!

same problem

also getting weird output like this.

First of all, thank you for sharing your code! It helped me a lot in getting started with GPT-2.
I really do not know if this is relevant, but I just debugged sample.py.

The output only ever appends zeros:
tf.Tensor([[ 3 13727 5825 0 0 0 0 0 ...]], shape=(1, 515), dtype=int32)

If my sequence length is 512, I get 512 zeros (plus the 3 non-zero tokens from my context above).
My output is just the words I provided as context, because the rest is 0.
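For reference, this is a minimal sketch of the check I did; the values are copied from the debug print above, and `generated` stands for the tensor sample_sequence returns:

```python
import tensorflow as tf

# Values copied from the debug print above; `generated` stands for
# the (1, 515) int32 tensor that sample_sequence returns.
generated = tf.constant([[3, 13727, 5825] + [0] * 512], dtype=tf.int32)

nonzero = int(tf.math.count_nonzero(generated))
print("non-zero tokens:", nonzero)                   # 3 -> only the context
print("zero tokens:", generated.shape[1] - nonzero)  # 512 -> nothing was sampled
```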

edit 1:
In my case, logits is always NaN, which results in token 0 being sampled.
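A cheap way to catch this at the source, rather than noticing it only in the sampled ids, is to assert the logits are finite right after the forward pass. A minimal sketch; the `assert_finite` helper and the broken example tensor are mine, not from the repo:

```python
import tensorflow as tf

def assert_finite(t, label):
    # Raises tf.errors.InvalidArgumentError naming `label` as soon as a
    # NaN or Inf shows up, instead of silently sampling token 0 forever.
    tf.debugging.check_numerics(t, message=label)
    return t

# Deliberately broken logits to demonstrate the failure mode:
logits = tf.constant([[0.1, float("nan"), 0.2]])
try:
    assert_finite(logits, "logits after final layer")
except tf.errors.InvalidArgumentError as e:
    print(e.message)
```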

edit 2:
self.embedding_weights is NaN. Maybe something's wrong with the initializer?
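In case it helps anyone else, here is a sketch for auditing all weights after restoring a checkpoint; `find_nan_weights` is a hypothetical helper and `model` stands for the restored GPT-2 instance. If the embedding matrix is already NaN right after the restore, the problem is the training run (e.g. an exploding loss) or the checkpoint itself, not the sampler:

```python
import tensorflow as tf

def find_nan_weights(model):
    # Returns the names of all trainable variables containing at least one NaN.
    return [
        v.name
        for v in model.trainable_variables
        if bool(tf.reduce_any(tf.math.is_nan(v)))
    ]

# Usage (after building the model and restoring the checkpoint):
#   print(find_nan_weights(model))   # e.g. ['embedding_weights:0', ...]
```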