lopuhin/transformer-lm

Transformer language model (GPT-2) with sentencepiece tokenizer

Python

Issues

No license file
#32 opened 2 years ago by Maniues
0
I would like a longer text result
#31 opened 4 years ago by r23
2
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 47
#30 opened 4 years ago by c6s0
0
Finetuning
#19 opened 5 years ago by Stamenov
42
How to generate or convert vocab.json, merges.txt, and config.json to match huggingface/transformers requirements?
#29 opened 4 years ago by ycat3
4
Silent failure when training on GPU
#14 opened 5 years ago by vilhub
5
Possible gpt-2-gen bug: assertion error in inference.py
#10 opened 5 years ago by strumke
10
PyTorch: Speed up get_log_probs function
#5 opened 6 years ago by binhvq
2
Question about train dataset format
#17 opened 4 years ago by choomz
3
"state_dict" Mismatch
#20 opened 4 years ago by nitinnairk
1
Select GPU of choice
#25 opened 4 years ago by Meghana-Meghana
1
No speed up when using multi-GPU training
#24 opened 5 years ago by zaidalyafeai
6
Training GPT-2 on very large corpus
#23 opened 5 years ago by simonefrancia
6
Fails to resume on multiple GPUs
#21 opened 5 years ago by knok
7
What's the 'max_sentence_length' when training sentencepiece model?
#18 opened 5 years ago by ty5491003
2
Validation loss not computed
#16 opened 5 years ago by nitinnairk
5
Unigram algorithm instead of BPE
#15 opened 5 years ago by nitinnairk
2
Plans for transformer-xl?
#13 opened 5 years ago by gooofy
2
Plans to add gradient checkpointing?
#11 opened 5 years ago by gooofy
2
Stand-alone text generation and scoring scripts
#2 opened 6 years ago by binhvq
7
Training from scratch - how many epochs?
#8 opened 5 years ago by gooofy
7
Predict with GPU
#7 opened 5 years ago by binhvq
0
Error on validate, batch is empty
#6 opened 6 years ago by binhvq
1
Train on a large dataset
#3 opened 6 years ago by binhvq
10
How to prepare the data for text generation task. Thank you very much.
#1 opened 6 years ago by guotong1988
6