Couple of queries: 1) Fine-tuned GPT-2 2) BPE encoding
sb1992 opened this issue · 2 comments
Hi,
I had a couple of queries.
- I was wondering if you could point me to the relevant part of the code, and recommend what changes to make, so that I can also calculate this score with my own fine-tuned GPT-2 model (which is saved at its own path).
- I was also thinking about the fact that GPT-2 uses BPE encoding, yet when you return a probability score it always returns the probability for the complete word (not the sub-units). As far as I understand BPE, it divides a word into sub-pieces and assigns corresponding ids to those sub-pieces. Do you know how this works internally, i.e. how it is able to assign a probability to the complete word? (A small illustration of what I mean is below.)
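For example (a minimal sketch using the Hugging Face `transformers` tokenizer, just to illustrate; the word chosen is arbitrary):

```python
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# A rarer word is typically split into several BPE sub-pieces,
# each mapped to its own id in the vocabulary.
print(tokenizer.tokenize("unbelievably"))  # sub-pieces, not the whole word
print(tokenizer.encode("unbelievably"))    # the corresponding sub-piece ids
```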
Thanks
- If you pass the path to your model as `model_name` to the `GPT2LMScorer` class, it should work (a sketch is below).
- Right now we already return the probability of each sub-unit; the second sketch below shows how to inspect them.
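For instance, something like this should work (a minimal sketch; `./my-finetuned-gpt2` is a placeholder for wherever you saved your model, and the import assumes the package layout with `GPT2LMScorer` under `lm_scorer.models.gpt2`):

```python
import torch
from lm_scorer.models.gpt2 import GPT2LMScorer

# "./my-finetuned-gpt2" is a placeholder: a directory produced by
# model.save_pretrained(...) and tokenizer.save_pretrained(...).
device = "cuda" if torch.cuda.is_available() else "cpu"
scorer = GPT2LMScorer("./my-finetuned-gpt2", device=device, batch_size=1)

print(scorer.sentence_score("I like this package.", reduce="mean"))
```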
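And to inspect the per-sub-unit probabilities directly (again a sketch, assuming the `tokens_score` method, which returns a `(scores, ids, tokens)` tuple):

```python
# Each BPE sub-piece gets its own probability; a word split into
# several pieces therefore yields several scores, not one.
scores, ids, tokens = scorer.tokens_score("I like this package.")
for token, prob in zip(tokens, scores):
    print(f"{token!r}: {prob:.6f}")
```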
Thank you