huggingface/pytorch-openai-transformer-lm
🐥 A PyTorch implementation of OpenAI's finetuned transformer language model, with a script to import the weights pre-trained by OpenAI
Python · MIT license
Issues
- Avoid model overfitting (#31, opened by BangLiu, 1 comment)
- Problems with the DoubleHeadModel implementation (#50, opened by eveliao, 1 comment)
- Instructions for encoding own sentences (#58, opened by izaskr, 0 comments)
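For anyone arriving at #58 with the same question, a minimal sketch of encoding custom sentences with TextEncoder from text_utils.py; the file paths follow the released model/ directory and the verbose keyword is an assumption, so check your checkout:

```python
# Sketch only: paths and the verbose keyword are assumptions based on the
# released model/ directory; verify against text_utils.py in your checkout.
from text_utils import TextEncoder

text_encoder = TextEncoder('model/encoder_bpe_40000.json', 'model/vocab_40000.bpe')
n_vocab = len(text_encoder.encoder)

# encode() returns one list of BPE token ids per input string.
sentences = ["The quick brown fox jumps over the lazy dog."]
token_ids = text_encoder.encode(sentences, verbose=False)
print(n_vocab, token_ids[0])
```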
- Training from scratch: Repeated and mangled words (#59, opened by maruker, 0 comments)
- How to create transforms for the entailment task? (#40, opened by lordzuko, 1 comment)
- What does vocab = n_vocab + n_special + n_ctx mean? (#54, opened by JiahangOK, 0 comments)
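The gist behind #54: the model embeds positions through the same lookup table as tokens, so the embedding matrix needs one row per BPE token (n_vocab), per task-specific special token (n_special), and per position (n_ctx). A sketch with the sizes of the released model (n_special is task-dependent):

```python
n_vocab = 40478    # BPE vocabulary of the released model
n_special = 3      # e.g. start / delimiter / classify tokens added for fine-tuning
n_ctx = 512        # maximum sequence length

# One shared embedding matrix covers all three ranges; position i is looked up
# at row n_vocab + n_special + i.
vocab = n_vocab + n_special + n_ctx
print(vocab)       # 40993 with these example sizes
```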
- Why do we need to apply the mask while fine-tuning? (#53, opened by pranoy-k, 3 comments)
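One common answer to #53: the pre-trained network is a left-to-right language model, so the attention mask is kept during fine-tuning to stop tokens attending to future positions. A generic causal-mask sketch in plain PyTorch, not the repo's exact code:

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    # Lower-triangular matrix: position i may attend to positions <= i.
    return torch.tril(torch.ones(seq_len, seq_len))

seq_len = 5
scores = torch.randn(seq_len, seq_len)                # raw attention scores
scores = scores.masked_fill(causal_mask(seq_len) == 0, float('-inf'))
weights = torch.softmax(scores, dim=-1)               # each row attends only to the past
```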
- Implementation of Seq2Seq with Transformer (#52, opened by bhedayat, 1 comment)
- Potentially incorrect regex in text_utils.py (#48, opened by schmmd, 2 comments)
- Do encoder paddings influence the results? (#45, opened by OanaMariaCamburu, 0 comments)
- Retrain the LM on a new dataset? (#46, opened by fabrahman, 2 comments)
- Help understanding the BPE logic (#42, opened by BogdanDidenko, 10 comments)
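For orientation on #42, a toy version of the BPE training loop: repeatedly merge the most frequent adjacent symbol pair. The real text_utils.py applies pre-learned merge ranks at encode time rather than learning them, so this is conceptual only:

```python
from collections import Counter

def bpe_train(words, num_merges):
    """Toy BPE: learn merges on whitespace-split words, starting from characters."""
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)      # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])   # apply the merge
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

print(bpe_train(["low", "lower", "lowest"], num_merges=3))
```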
- How does the position embedding implementation work? (#44, opened by bcserna, 3 comments)
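Related to #44 (and the vocab question above): positions are embedded through the same nn.Embedding as tokens, using indices offset past the token range. A minimal sketch with illustrative sizes:

```python
import torch
import torch.nn as nn

n_vocab, n_special, n_ctx, n_embd = 40478, 3, 512, 768
embed = nn.Embedding(n_vocab + n_special + n_ctx, n_embd)

tokens = torch.randint(0, n_vocab, (1, 16))   # (batch, seq) of BPE ids
offset = n_vocab + n_special                  # position rows live past the token rows
positions = torch.arange(offset, offset + tokens.size(1)).unsqueeze(0)

# Token and position embeddings come from the same table and are summed.
h = embed(tokens) + embed(positions)
print(h.shape)                                # torch.Size([1, 16, 768])
```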
- How is the file "cloze_test_test__spring2016 - cloze_test_ALL_test.csv" created? (#41, opened by luffycodes, 2 comments)
- Redundant decoder (#32, opened by joshim5, 2 comments)
- Fine-tuning the model for NLI, but using sentence embeddings instead of word embeddings (#30, opened by saurabhvyas, 4 comments)
- How to run inference? (#33, opened by masati91, 1 comment)
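As a starting point for #33, a generic greedy decoding loop; step_fn is a hypothetical callable standing in for whatever model you load, mapping (1, seq) token ids to (1, seq, vocab) logits:

```python
import torch

@torch.no_grad()
def greedy_generate(step_fn, prompt_ids, max_new_tokens=20):
    """step_fn: hypothetical callable, (1, seq) LongTensor -> (1, seq, vocab) logits."""
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        logits = step_fn(ids)                        # re-run the model on the prefix
        next_id = logits[0, -1].argmax().view(1, 1)  # most likely next token
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```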
- How does Dropout2d help in the cloze task? (#11, opened by sai-prasanna, 3 comments)
- Loading the pretrained OpenAI model (#28, opened by mehdimashayekhi, 4 comments)
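The README's recipe for #28, lightly abridged; the n_special/n_ctx keyword arguments are an assumption for a fine-tuning setup and may not match your task:

```python
# Based on the repository README; verify the signature in model_pytorch.py.
from model_pytorch import TransformerModel, load_openai_pretrained_model, DEFAULT_CONFIG

args = DEFAULT_CONFIG
model = TransformerModel(args)

# Copies OpenAI's released pre-trained weights into the PyTorch modules.
load_openai_pretrained_model(model, n_special=3, n_ctx=512)
model.eval()
```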
- Can someone explain this line? (#21, opened by teucer, 7 comments)
- Having various network heads (#24, opened by rodgzilla, 2 comments)
- What is the use of dropout in the Transformer? (#19, opened by teucer, 4 comments)
- Object is not specified (#13, opened by Oktai15, 1 comment)
- Pre-trained LMHead (#7, opened by rodgzilla, 4 comments)
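Context for #7 (and the "Redundant decoder" discussion in #32): the LM head reuses the input embedding matrix as its output projection, so no separate decoder weights exist. A minimal weight-tying sketch with illustrative sizes:

```python
import torch.nn as nn

n_embd, vocab = 768, 40993   # illustrative sizes, matching the example above
embed = nn.Embedding(vocab, n_embd)

# Tie the output projection to the embedding matrix (shared storage).
lm_head = nn.Linear(n_embd, vocab, bias=False)
lm_head.weight = embed.weight
```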
- Dimensioning bug? (#2, opened by jtatusko, 1 comment)