What data format is required to fine-tune the PhoGPT model?

Question

dangyuuki123 opened this issue 10 months ago · 1 comments

I want to fine-tune the PhoGPT model so that it answers questions using information from a source text. What format should the training data have for this?

Answer 1 · 2023-11-30T16:25:17.000Z

See: https://github.com/mosaicml/llm-foundry/blob/main/scripts/train/README.md#llmfinetuning

Here is the formatted example I used for fine-tuning the base PhoGPT with context-based QA.

formatted_example = {'prompt': "### Câu hỏi:\nDựa vào văn bản sau đây:\n{text}\nHãy trả lời câu hỏi: {question}\n\n### Trả lời:", 'response': "{response_text}"}
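
To make the template concrete, here is a minimal sketch of how (text, question, answer) triples might be turned into records of that shape and saved one JSON object per line. The helper names `build_example` and `write_jsonl`, the sample data, and the choice of a JSONL file are my own assumptions for illustration, not part of PhoGPT or llm-foundry; check the linked README for how your training config should point at the resulting file.

```python
import json

# Template copied from the answer above. The Vietnamese reads roughly:
# "### Question:\nBased on the following text:\n{text}\n
#  Please answer the question: {question}\n\n### Answer:"
PROMPT_TEMPLATE = (
    "### Câu hỏi:\nDựa vào văn bản sau đây:\n{text}\n"
    "Hãy trả lời câu hỏi: {question}\n\n### Trả lời:"
)


def build_example(text, question, answer):
    """Return one record with 'prompt' and 'response' keys (hypothetical helper)."""
    return {
        "prompt": PROMPT_TEMPLATE.format(text=text, question=question),
        "response": answer,
    }


def write_jsonl(records, path):
    """Write records to `path`, one JSON object per line (assumed storage format)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps Vietnamese characters readable in the file
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


# Illustrative, made-up sample; replace with your own context/question/answer data.
sample = build_example(
    text="Hà Nội là thủ đô của Việt Nam.",
    question="Thủ đô của Việt Nam là gì?",
    answer="Hà Nội",
)
print(json.dumps(sample, ensure_ascii=False))
```

Calling `write_jsonl([sample], "train.jsonl")` would then produce a file with one such record per line. Keeping the template in a single constant means the same prompt wording can be reused at inference time, where the model is expected to continue after "### Trả lời:".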