PiotrNawrot/nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
Python · Apache-2.0
Issues
- 0
How to change training objective from next token prediction to Masked Language Modeling?
#45 opened by HaninZeyad - 1
A possible bug in the generate method
#44 opened by SiyuanHuangSJTU - 1
The weird curve
#42 opened by nguyenvannghiem0312 - 2
How to replace the tokenizer with another one?
#41 opened by hifarer - 2
nanoT5 for different embeddings
#40 opened by victoriazinkovich - 1
- 1
About Pre-training objectives
#38 opened by SoshyHayami - 1
pre-training on local C4 dataset?
#37 opened by TTTTCoding - 5
Just a quick question to pretrain Flan-T5
#35 opened by hohoCode - 1
Continued pretraining from official models.
#36 opened by IdeaKing - 19
- 3
Learning rate for multi-GPUs training
#34 opened by phucdoitoan - 2
Beginner Question: Would it be wise to use this as a backbone for custom seq2seq modeling fMRI data and custom encoder?
#33 opened by dyhan316 - 1
- 3
- 1
How to create pytorch_model.bin file?
#30 opened by mayanks43 - 2
Flash attention
#28 opened by Taytay - 5
Larger models and training on the Pile
#29 opened by Taytay - 15
RMS scaling issues
#15 opened by SmerkyG - 1
Pre-train on different Dataset than C4
#27 opened by nikifori - 0
Transformation to HF model
#26 opened - 7
About pre-training on another dataset
#21 opened by tarudesu - 4
self-defined loss function failed to work (torch._dynamo.exc.InternalTorchDynamoError: ln_encoder)
#24 opened by QinengWang-Aiden - 7
- 4
Citing Repo
#1 opened by dhairyadalal - 2
- 9
Query regarding multi-GPU
#12 opened by trinanjan12 - 1
AttributeError: Can't pickle local object 'IterableDataset.map.<locals>.<lambda>'
#20 opened by turian - 2
- 1
pre-train on long context.
#16 opened by enpassanty - 1
How to run on CPU
#18 opened by ratan-prasad - 1
Shape mismatch warning
#14 opened by TuTruongVian - 1
Pre training on my own dataset
#11 opened by trinanjan12 - 1
Why isn't the lr warm up from 0?
#9 opened by jzhang38 - 5
Pre-trained nanoT5 model on C4 corpus
#6 opened by SungHo3268 - 5
Resume the pre-training process
#7 opened by QizhiPei - 2
Computing Rouge score during training
#3 opened by sjelassi - 2
- 1