1-line programs for fine-tuning, inference and more
- Videos 📽️
- Installation
- Documentation
- ACL-2022 Tutorial
gft contains 4 main functions:
- gft_fit: fit a pretrained model to data (aka fine-tuning)
- gft_predict: apply a model to inputs (aka inference)
- gft_eval: score a model on a split of a dataset
- gft_summary: find popular models and datasets, and explain what's in those models and datasets
These gft functions take 4 main arguments (most arguments supported by the underlying hubs are also accepted):
- data: standard datasets hosted on hubs such as HuggingFace, PaddleNLP, or custom datasets hosted on the local filesystem
- model: standard models hosted on hubs such as HuggingFace, PaddleNLP, or custom models hosted on the local filesystem
- equation: string such as "classify: label ~ text", where classify is a task, and label and text refer to columns in a dataset (passed as --eqn in the examples below)
- task: classify, classify_tokens, classify_spans, classify_audio, classify_images, regress, text-generation, translation, ASR, fill-mask
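To make the shape of an equation string concrete, here is a minimal shell sketch that splits "classify: label ~ text" into its three parts. It is only an illustration of the notation, not how gft parses equations internally; the `trim` helper and the variable names `task`, `lhs`, and `rhs` are hypothetical and not part of gft.

```sh
# Illustration only: split an equation string into task and column names.
# None of these names come from gft itself.
eqn='classify: label ~ text'

# Hypothetical helper: strip leading and trailing whitespace.
trim() { printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'; }

task=$(trim "${eqn%%:*}")    # text before the first ':'
rest=${eqn#*:}               # text after the first ':'
lhs=$(trim "${rest%%~*}")    # column named left of '~'
rhs=$(trim "${rest#*~}")     # column named right of '~'

printf 'task=%s\nleft column=%s\nright column=%s\n' "$task" "$lhs" "$rhs"
```

For the equation above, this yields the task classify and the dataset columns label and text.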
Here are some simple examples:
emodel=H:bhadresh-savani/roberta-base-emotion
# Summarize a dataset and/or model
gft_summary --data H:emotion
gft_summary --model $emodel
gft_summary --data H:emotion --model $emodel
# find some popular datasets and models that contain "emotion"
gft_summary --data H:__contains__emotion --topn 5
gft_summary --model H:__contains__emotion --topn 5
# make predictions on inputs from stdin
echo 'I love you.' | gft_predict --task classify
# The default model (for the classification task) performs sentiment analysis
# The model, $emodel, outputs emotion classes (as opposed to POSITIVE/NEGATIVE)
echo 'I love you.' | gft_predict --task classify --model $emodel
# some other tasks (beyond classification)
echo 'I love New York.' | gft_predict --task H:token-classification
echo 'I <mask> you.' | gft_predict --task H:fill-mask
# make predictions on inputs from a split of a standard dataset
gft_predict --eqn 'classify: label ~ text' --model $emodel --data H:emotion --split test
# return a single score (as opposed to a prediction for each input)
gft_eval --eqn 'classify: label ~ text' --model $emodel --data H:emotion --split test
# Fine-tune a pre-trained model (bert) and write the resulting model to $outdir
# ($outdir is assumed to be set to a writable directory beforehand)
gft_fit --eqn 'classify: label ~ text' \
    --model H:bert-base-cased \
    --data H:emotion \
    --output_dir $outdir
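The --data, --model, and --task values in these examples share a pattern: a short prefix before the first colon (H: throughout), followed by a name, or by `__contains__` plus a search term. The sketch below splits such a value into those pieces. It is a reading inferred from the examples and their comments, not gft's actual argument handling, and the function `split_hub_arg` and the labels `exact` and `search` are hypothetical.

```sh
# Illustration only: split a prefixed argument such as 'H:emotion'.
# split_hub_arg is a hypothetical helper, not a gft command.
split_hub_arg() {
  arg=$1
  prefix=${arg%%:*}    # text before the first ':' (e.g. H)
  name=${arg#*:}       # text after the first ':'
  case $name in
    __contains__*) kind=search; name=${name#__contains__} ;;
    *)             kind=exact ;;
  esac
  printf '%s %s %s\n' "$prefix" "$kind" "$name"
}

split_hub_arg 'H:emotion'
split_hub_arg 'H:bhadresh-savani/roberta-base-emotion'
split_hub_arg 'H:__contains__emotion'
```

Each call prints the prefix, whether the value names one item or a search, and the remaining name or search term.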
The table below shows a 3-step recipe, which has become standard in the literature on deep nets.
| Step | gft Support | Description | Time | Hardware |
|---|---|---|---|---|
| 1 | | Pre-Training | Days/Weeks | Large GPU Cluster |
| 2 | gft_fit | Fine-Tuning | Hours/Days | 1+ GPUs |
| 3 | gft_predict | Inference | Seconds/Minutes | 0+ GPUs |
This repo provides support for step 2 (gft_fit) and step 3 (gft_predict). Most gft_fit and gft_predict programs are short (1-line), much shorter than conventional example scripts for these steps, which are typically a few hundred lines of Python. With gft, users should not need to read or modify any Python code for steps 2 and 3 in the table above.
Step 1, pre-training, is beyond the scope of this work. We recommend starting with models from HuggingFace and PaddleHub/PaddleNLP hubs, as illustrated in the examples below.
Paper (draft) is here.