RUCAIBox/TextBox

how to train a translation task by using TextBox

moseshu opened this issue · 1 comments

Hello,

First, download the dataset and put it in the dataset/YOUR_DATASET/ folder. The data should be split into a training set, a development (validation) set, and a test set, with the source text and target text stored in *.src and *.tgt files respectively. You can find an example in dataset/samsum/.
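
As a rough illustration of this first step, the sketch below shuffles a list of parallel sentence pairs and writes them out as line-aligned .src/.tgt files. The split names (train/valid/test), the 80/10/10 ratio, and the helper name write_parallel_splits are assumptions made for illustration, not something confirmed in this thread; compare against dataset/samsum/ for the file names TextBox actually expects.

```python
import random
from pathlib import Path


def write_parallel_splits(pairs, out_dir, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle (source, target) pairs and write line-aligned .src/.tgt files.

    Hypothetical helper: the split names train/valid/test are an assumption.
    Returns the number of examples written per split.
    """
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)

    n_valid = int(len(pairs) * valid_frac)
    n_test = int(len(pairs) * test_frac)
    splits = {
        "valid": pairs[:n_valid],
        "test": pairs[n_valid:n_valid + n_test],
        "train": pairs[n_valid + n_test:],
    }

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in splits.items():
        # One example per line; line i of the .src file pairs with line i of the .tgt file.
        with open(out_dir / f"{name}.src", "w", encoding="utf-8") as src, \
             open(out_dir / f"{name}.tgt", "w", encoding="utf-8") as tgt:
            for source, target in rows:
                # Collapse embedded newlines so the two files stay aligned.
                src.write(source.replace("\n", " ").strip() + "\n")
                tgt.write(target.replace("\n", " ").strip() + "\n")
    return {name: len(rows) for name, rows in splits.items()}
```

For a translation task, each pair would be a sentence in the source language and its translation, for example write_parallel_splits(pairs, "dataset/YOUR_DATASET").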

Second, set the dataset configuration in TextBox/textbox/properties/dataset/YOUR_DATASET.yaml. Note that the name of the YAML file must match the folder name of your dataset. You can find an example in TextBox/textbox/properties/dataset/wmt16-en-ro.yaml.
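
Because a mismatch between the folder name and the YAML filename is easy to miss, a small pre-flight check can help. The sketch below is not part of TextBox; check_dataset_layout is a hypothetical helper, and it assumes that both dataset/ and textbox/properties/dataset/ live under the repository root, as the paths in this thread suggest.

```python
from pathlib import Path


def check_dataset_layout(repo_root, dataset_name):
    """Return a list of problems; an empty list means the layout looks consistent.

    Hypothetical helper, not a TextBox API. Assumes the paths mentioned in
    this thread are relative to the repository root.
    """
    repo_root = Path(repo_root)
    data_dir = repo_root / "dataset" / dataset_name
    config = repo_root / "textbox" / "properties" / "dataset" / f"{dataset_name}.yaml"

    problems = []
    if not data_dir.is_dir():
        problems.append(f"missing dataset folder: {data_dir}")
    else:
        # The data folder should contain source and target text files.
        if not any(data_dir.glob("*.src")):
            problems.append(f"no *.src files in {data_dir}")
        if not any(data_dir.glob("*.tgt")):
            problems.append(f"no *.tgt files in {data_dir}")
    # The YAML filename must match the dataset folder name.
    if not config.is_file():
        problems.append(f"missing config file: {config}")
    return problems
```

Calling check_dataset_layout(".", "YOUR_DATASET") from the repository root and printing the result before training gives a quick indication of whether the names line up.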

Third, run the following command:

python run_textbox.py \
    --model=MODEL_NAME \
    --model_path=MODEL_PATH \
    --dataset=YOUR_DATASET

Hope this helps.