VinAIResearch/PhoGPT

What data format is required to fine-tune the PhoGPT model?

dangyuuki123 opened this issue · 1 comment

I want to fine-tune the PhoGPT model so that it answers questions using information from a source text. What format should the training data have for this?

See: https://github.com/mosaicml/llm-foundry/blob/main/scripts/train/README.md#llmfinetuning

Here is the formatted example I used for fine-tuning the base PhoGPT with context-based QA.

formatted_example = {
    'prompt': "### Câu hỏi:\nDựa vào văn bản sau đây:\n{text}\nHãy trả lời câu hỏi: {question}\n\n### Trả lời:",
    'response': "{response_text}",
}
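
To make the template above concrete, here is a minimal sketch of filling it in and serializing the records as JSON Lines. It assumes that the llm-foundry fine-tuning setup linked above reads records with `prompt` and `response` keys from a JSONL file. The helper names (`format_example`, `to_jsonl_line`), the output file name `train.jsonl`, and the sample record are all hypothetical placeholders, not part of PhoGPT or llm-foundry.

```python
import json

# Template copied from the example above. In English it reads roughly:
# "### Question:\nBased on the following text:\n{text}\n
#  Answer the question: {question}\n\n### Answer:"
PROMPT_TEMPLATE = (
    "### Câu hỏi:\nDựa vào văn bản sau đây:\n{text}\n"
    "Hãy trả lời câu hỏi: {question}\n\n### Trả lời:"
)


def format_example(text: str, question: str, response_text: str) -> dict:
    """Fill the template with one (text, question, answer) triple."""
    return {
        "prompt": PROMPT_TEMPLATE.format(text=text, question=question),
        "response": response_text,
    }


def to_jsonl_line(example: dict) -> str:
    """Serialize one example as a single JSON line, keeping Vietnamese characters readable."""
    return json.dumps(example, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder record for illustration only; replace with your own data.
    records = [
        {
            "text": "<context passage>",
            "question": "<question about the passage>",
            "response_text": "<reference answer>",
        },
    ]
    # Hypothetical output path; one JSON object per line.
    with open("train.jsonl", "w", encoding="utf-8") as f:
        for record in records:
            f.write(to_jsonl_line(format_example(**record)) + "\n")
```

Each line of the resulting file is one training example. Whether the response should begin with a leading space or end with an end-of-sequence marker depends on the tokenizer and training configuration, so check that against the linked README.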