EleutherAI/gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
Python, MIT
Issues
- 3
How do we decrease inference speed?
#189 opened by aalayrot - 2
Intermediate checkpoints?
#198 opened by matthiasgalle - 0
Add performance metrics for 125M and 350M models
#206 opened by paulbricman - 9
- 3
Public API
#214 opened by fire17 - 3
ValueError: Unrecognized configuration class <class 'transformers.models.gpt_neo.configuration_gpt_neo.GPTNeoConfig'> for this kind of AutoModel: TFAutoModelForCausalLM. Model type should be one of BertConfig, OpenAIGPTConfig, GPT2Config, TransfoXLConfig, XLNetConfig, XLMConfig, CTRLConfig.
#194 opened by D3MZ - 12
Can't generate samples from pre-trained GPT3_XL using main.py without errors
#163 opened by texturejc - 7
- 2
Documentation request: How to convert a GPTNeo TF checkpoint into a huggingface supported model
#210 opened by Norod - 2
- 3
Missing __init__.py
#177 opened by dakami - 3
ValueError: Unrecognized configuration class <class 'transformers.models.gpt_neo.configuration_gpt_neo.GPTNeoConfig'>
#204 opened by dPacc - 2
Requirements to implement GPTNeo on GPUs
#178 opened by bpm246 - 2
Support conversion to pytorch for GPT Neo
#186 opened by sink-chan - 1
Stop Sequences
#195 opened by jaehyunshinML - 1
- 2
Is there a playground for neo? or slack Channel?
#203 opened by superjayman - 6
Argument not a list with same length as devices
#193 opened by HughPH - 3
dummy tfrecord file could not be downloaded.
#199 opened by zshyang - 22
- 13
ValueError when predicting with pretrained models
#150 opened by iocuydi - 8
Failed to install requirements
#142 opened by notooth1 - 2
huggingface instructions broken
#185 opened by p-christ - 8
- 2
The README is a lie
#188 opened by shawwn - 1
Provide more options for inference
#157 opened by JanPokorny - 2
training on TPU v2.8-512
#184 opened by riccardo247 - 0
GPT-3 configuration for a v3-32 TPU
#183 opened by stefan-it - 2
The conversion script doesn’t work
#174 opened by StellaAthena - 3
Fine-tuning stuck in endless no-op loop at the end
#156 opened by JanPokorny - 1
Making predictions deterministic
#164 opened by zilunpeng - 3
Tips & tricks to speed up inference
#179 opened by danielpatrickhug - 0
- 2
Questions about the hyperparameters in 'config'
#166 opened by ihatedebug - 4
- 1
Any way to run model without payment method on gcloud?
#172 opened by qpwo - 1
connect timeout
#155 opened by wingdi - 4
Inference Fails on a single RTX 3090
#158 opened by afiaka87 - 1
- 1
https://the-eye.eu/public/AI/gptneo-release/GPT3_2-7B/ cannot access and download
#161 opened by JoeshpCheung - 1
`--check-dataset` fails `tf.enable_eager_execution must be called at program startup`
#159 opened by afiaka87 - 2
Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object.
#160 opened by c-box - 5
Can't infer on the provided Colab
#148 opened by JanPokorny - 2
- 1
While loop in predict mode
#152 opened by ArturTan - 1
- 1
Does this support distributed training?
#141 opened by CrazyPython - 1
Link to dataset is not available
#140 opened by zilunpeng