How to fine-tune T5 with a Causal Language Modeling objective?
nanbeitk opened this issue · 0 comments
Dear all,
I am new to NLP and have some strange questions; I will try to explain them clearly.
My goal is to fine-tune the t5-base model on a specific corpus with a causal language modeling (CLM) objective. I found this document, which uses `AutoModelForCausalLM`, but that auto class simply does not include the T5 series of models.
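For reference, this is roughly the failure I ran into (the exact error text may vary across `transformers` versions):

```python
from transformers import AutoModelForCausalLM

# T5 is an encoder-decoder model, so its config is not registered in the
# causal-LM auto mapping; this raises roughly:
#   ValueError: Unrecognized configuration class ... T5Config ...
#   for this kind of AutoModel: AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained("t5-base")
```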
So my questions are:
- How should I fine-tune the T5 model with a CLM objective? In my understanding, CLM is the process of predicting `token_2` from `token_1`, then `token_3` from `token_1, token_2`, and so on until the end of the input sequence, so I am confused about how to carry out this process myself.
- I tried to split each of my training examples into pairs like the following (`ti` == `token_i`, `1` == `eos_token`); a code sketch of this expansion follows the list:
  | input_ids | labels |
  | --- | --- |
  | `[t1, 1, 1, 1, 1, 1, ...]` | `[t1, t2, 1, 1, 1, 1, ...]` |
  | `[t1, t2, 1, 1, 1, 1, ...]` | `[t1, t2, t3, 1, 1, 1, ...]` |
  | `[t1, t2, t3, 1, 1, 1, ...]` | `[t1, t2, t3, t4, 1, 1, ...]` |
  | `[t1, t2, t3, t4, 1, 1, ...]` | `[t1, t2, t3, t4, t5, 1, ...]` |
  The first problem is obvious: the expanded dataset is much larger and takes far more time to fine-tune. The second problem is that this construction seems strange, and I don't know whether it actually fulfills the CLM objective. This is the only idea I have come up with so far; does it work?
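To make the second question concrete, here is a rough sketch of how I build these pairs. It assumes a `transformers` tokenizer is available; the helper name `make_clm_pairs` and the fixed `max_len` are just for illustration, not from any library:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")

def make_clm_pairs(token_ids, fill_id, max_len):
    # Expand one sequence into (input_ids, labels) pairs as in the table
    # above: the k-th pair feeds the first k tokens as input and asks the
    # model to produce the first k + 1 tokens, padded out with fill_id.
    pairs = []
    for k in range(1, min(len(token_ids), max_len)):
        prefix = token_ids[:k]
        target = token_ids[:k + 1]
        pairs.append({
            "input_ids": prefix + [fill_id] * (max_len - len(prefix)),
            "labels": target + [fill_id] * (max_len - len(target)),
        })
    return pairs

ids = tokenizer("some training sentence", add_special_tokens=False).input_ids
# For t5-base, tokenizer.eos_token_id == 1, matching the `1`s in the table.
pairs = make_clm_pairs(ids, tokenizer.eos_token_id, max_len=16)
```

Each sequence of n tokens expands into n - 1 training examples, which is exactly why the dataset blows up as described above.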
Thanks!!