How to generate a valid dataset from seed tasks?
handhand123 opened this issue · 3 comments
Hi,
Thank you very much for such an interesting project. I want to build some tools of my own and train with LoRA. How should I build the seed tasks and use GPT-4 to generate the dataset?
Looking forward to your reply.
Hi, you can follow Table 7 of the tech report to generate the tool-related instructions. Before that, you need to define your tools according to Table 6.
@Yangr116 Thank you for your reply. I already know how to generate simple instructions, but I still have a question: how do I generate complex, multi-round, step-by-step thoughts in the CoT style, as in the sample below? This has troubled me for a long time.
"""
New input: Generate a real image with parked cars and moving cars from the edge image\nGPT4Tools needs to use tools to observe images, not directly imagine them. Thoughts and observations in the conversation are only visible to GPT4Tools. When answering human questions, repeat important information. Let's think step by step.\nThought: Do I need to use a tool? Yes\nAction: Edge Detection On Image\nAction Input: image/alzkcuai.png\nObservation: Result saved as qjyececq.png\nThought: Do I need to use a tool? Yes\nAction: Generate Image Condition On Canny Image\nAction Input: qjyececq.png, real image with parked and moving cars\nObservation: Result saved as image/xzytzuqp.png\n", "input": "", "output": "Thought: Do I need to use a tool? No\nAI: Result saved as image/xzytzuqp.png\n\n"}
"""
You can write code to assemble the related tools and their prompts.
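For illustration, here is a minimal sketch of what such an assembly script might look like. It is not the project's actual generation code. It only reproduces the layout of the sample quoted above: every tool call is written as a Thought/Action/Action Input/Observation block, and the final "No" thought becomes the output. The function name `build_sample`, the `prefix` parameter (for whatever tool descriptions precede "New input:"), and the `"instruction"` key are assumptions, since the quoted sample is cut off before that key.

```python
import json

# Fixed text copied from the sample quoted in this thread.
SUFFIX = (
    "GPT4Tools needs to use tools to observe images, not directly imagine them. "
    "Thoughts and observations in the conversation are only visible to GPT4Tools. "
    "When answering human questions, repeat important information. "
    "Let's think step by step.\n"
)


def build_sample(prefix, user_input, steps, final_answer):
    """Assemble one multi-step sample (hypothetical helper).

    steps is a list of (action, action_input, observation) tuples,
    one per tool call, in the order the tools are used.
    """
    text = f"{prefix}New input: {user_input}\n{SUFFIX}"
    for action, action_input, observation in steps:
        text += (
            "Thought: Do I need to use a tool? Yes\n"
            f"Action: {action}\n"
            f"Action Input: {action_input}\n"
            f"Observation: {observation}\n"
        )
    output = f"Thought: Do I need to use a tool? No\nAI: {final_answer}\n\n"
    # "instruction" is an assumed key name; "input" and "output" appear in the sample.
    return {"instruction": text, "input": "", "output": output}


# Rebuild the two-step example from the sample above.
steps = [
    ("Edge Detection On Image", "image/alzkcuai.png",
     "Result saved as qjyececq.png"),
    ("Generate Image Condition On Canny Image",
     "qjyececq.png, real image with parked and moving cars",
     "Result saved as image/xzytzuqp.png"),
]
sample = build_sample(
    prefix="",  # tool descriptions would go here; left empty in this sketch
    user_input="Generate a real image with parked cars and moving cars from the edge image",
    steps=steps,
    final_answer="Result saved as image/xzytzuqp.png",
)
print(json.dumps(sample, indent=2))
```

To get more samples, you would vary `steps` by chaining tools whose outputs feed the next tool's input, for example an edge-detection result passed to a Canny-conditioned generator as above.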