How to fine-tune T5 with a Causal Language Modeling objective?
nanbeitk opened this issue · 0 comments
Dear all,
I am new to NLP and have some strange questions; I will try to explain them clearly.
My goal is to fine-tune the t5-base model on a specific corpus with a causal language modeling (CLM) objective. I found this document, which uses `AutoModelForCausalLM`, but that auto class simply does not include the T5 series of models.
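For reference, this is roughly the failure I ran into (the exact error text may vary across `transformers` versions):

```python
from transformers import AutoModelForCausalLM

# T5 is an encoder-decoder model, so its config is not registered in the
# causal-LM auto mapping; this raises roughly:
#   ValueError: Unrecognized configuration class ... T5Config ...
#   for this kind of AutoModel: AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained("t5-base")
```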
So my questions are:
- How should I fine-tune the T5 model with a CLM objective? In my understanding, CLM is the process of predicting `token_2` from `token_1`, then `token_3` from `token_1, token_2`, and so on until the end of the input sequence, so I am confused about how to carry out this process myself.
- I tried to split each of my training examples into pairs like the following (`ti` == `token_i`, `1` == `eos_token`); a code sketch of this expansion follows the list:
  | input_ids | labels |
  | --- | --- |
  | `[t1, 1, 1, 1, 1, 1, ...]` | `[t1, t2, 1, 1, 1, 1, ...]` |
  | `[t1, t2, 1, 1, 1, 1, ...]` | `[t1, t2, t3, 1, 1, 1, ...]` |
  | `[t1, t2, t3, 1, 1, 1, ...]` | `[t1, t2, t3, t4, 1, 1, ...]` |
  | `[t1, t2, t3, t4, 1, 1, ...]` | `[t1, t2, t3, t4, t5, 1, ...]` |
  The first problem is obvious: the expanded dataset is much larger and takes far more time to fine-tune. The second problem is that this construction seems strange, and I don't know whether it actually fulfills the CLM objective. This is the only idea I have come up with so far; does it work?
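To make the second question concrete, here is a rough sketch of how I build these pairs. It assumes a `transformers` tokenizer is available; the helper name `make_clm_pairs` and the fixed `max_len` are just for illustration, not from any library:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-base")

def make_clm_pairs(token_ids, fill_id, max_len):
    # Expand one sequence into (input_ids, labels) pairs as in the table
    # above: the k-th pair feeds the first k tokens as input and asks the
    # model to produce the first k + 1 tokens, padded out with fill_id.
    pairs = []
    for k in range(1, min(len(token_ids), max_len)):
        prefix = token_ids[:k]
        target = token_ids[:k + 1]
        pairs.append({
            "input_ids": prefix + [fill_id] * (max_len - len(prefix)),
            "labels": target + [fill_id] * (max_len - len(target)),
        })
    return pairs

ids = tokenizer("some training sentence", add_special_tokens=False).input_ids
# For t5-base, tokenizer.eos_token_id == 1, matching the `1`s in the table.
pairs = make_clm_pairs(ids, tokenizer.eos_token_id, max_len=16)
```

Each sequence of n tokens expands into n - 1 training examples, which is exactly why the dataset blows up as described above.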
Thanks!!