EleutherAI/gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Python · MIT
Issues
- 1
Cannot Connect To Local TPU-VM
#323 opened by nikhilanayak - 5
Generation should allow user to specify max length of generated portion, rather than total
#268 opened by monsieurpooh - 5
The temperature at 0.0001 (or other arbitrarily small float) is still too high
#270 opened by monsieurpooh - 3
GPT-neo 350M weights?
#264 opened by gangiswag - 1
FYI: Japanese pre-trained gpt-neo implementation showcase by using PyTorch, Transformers, and Rust
#322 opened by ycat3 - 1
IndexError: index out of range in self
#273 opened by dzlab - 0
TPU device does not support heartbeats.
#272 opened by iliemihai - 2
The model should return just the generated text, not the prompt text + generated text.
#271 opened by monsieurpooh - 0
Not able to generate predicted text after `Done with copy master to slices.` with 1.3B pre-trained model
#269 opened by SanchiMittal - 2
Predict runtime error
#251 opened by Borshig - 1
Can't load GPT3_XL
#226 opened by MK096 - 0
Argument not a list with same length as devices
#266 opened by monsieurpooh - 1
Links in the readme to the-eye.eu don't work
#265 opened by monsieurpooh - 2
Inferencing
#258 opened by BakingBrains - 6
the-eye.eu is down again, is there a mirror?
#263 opened by nepeee - 3
GPT3_1_3B configuration for a v3-32 TPU
#262 opened by iliemihai - 1
Colab: Download of pre-trained dataset not possible. the-eye.eu is offline
#261 opened by JonasPertschy - 3
How to fine-tune gptneo for QA dataset?
#250 opened by shankyemcee - 1
Inconsistent inference TPU vs GPU (huggingface)
#260 opened by vvv-tech - 2
Finetuning doesn't run
#232 opened by SamyakDhole - 8
Tokenizing error when training on Colab
#233 opened by Marcus-Arcadius - 2
Mesh TensorFlow CPU Inference
#216 opened by pablogranolabar - 4
Performance issue in tasks.py
#240 opened by DLPerf - 0
Have any GPTNeoForCausalLM training example in pytorch with hardware acceleration?
#235 opened by Pwang001 - 3
The_Eye hosted models won't download
#231 opened by diskreet90 - 2
[colab notebooks] Can't restore pretrained weights
#241 opened by sky1ove - 1
Dataset preparation
#257 opened by BakingBrains - 1
Freeze Transformer Weight
#259 opened by ivokun - 2
Performance issues in the program
#239 opened by DLPerf - 0
Exception: stream did not contain valid UTF-8
#256 opened by BakingBrains - 1
a gui or command line for training it?
#253 opened by lootnath - 1
Error when connecting to Google Cloud Storage
#247 opened by goaaats - 0
Bug in Google Colab
#243 opened by vnitu02 - 2
Where to download 125M model? thanks.
#238 opened by tamal777 - 0
Issue loading gpt-neo-125M from checkpoint
#236 opened by ixn872 - 4
Is It Possible To Continue Finetuning From Checkpoints on Any GPT-Neo Model?
#234 opened by nikhilanayak - 1
How can I get train data?
#229 opened by thefreeman007 - 2
Can't sample from GPT3 2.7B model
#224 opened by texturejc - 9
Using gpt neo checkpoint
#227 opened by MK096 - 1
Killed Message
#222 opened by adnan-fakahr-pk-90 - 1
any model for paraphrase generation?
#215 opened by SeekPoint - 1
Question) Time schedule for release
#220 opened by Kyubyong