How to use custom dataset for training?

Question

yongshuo-Z opened this issue 4 years ago · 1 comment

Hi,

Thanks for the super nice work!! What modifications should I make if I want to use my own dataset? Suppose I have a source domain dataset A with corresponding images/masks, and a target domain dataset B with corresponding images/masks. Which folder should I put them in? Any advice will be appreciated! Thanks.

By the way, have you tried training the model on a relatively small dataset, e.g. 10K images? Would the model still achieve such good performance?

Answer 1 · 2021-05-07T09:45:19.000Z

Hi @yongshuo-Z,

What modifications should I make if I want to use my own dataset?

In short, you need to define a new task using cfg.TRAIN.TASK. This requires modifying the code slightly: add new cases that specify your training and validation data. Search for the occurrences of cfg.TRAIN.TASK in the code to see where. Then place your data in data/<your dataset>. Each line of the train/val file lists should have the form <your dataset>/<path to the image>[space]<your dataset>/<path to ground truth>.
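A file list in that format could be generated with a small script. The following is only a sketch under assumptions: the subfolder names `images` and `masks`, the file extensions, and the function name `build_file_list` are hypothetical and not part of the repository. It also assumes that each ground-truth file shares its image's relative path and base name.

```python
from pathlib import Path


def build_file_list(data_root, dataset, image_dir="images", mask_dir="masks",
                    image_ext=".jpg", mask_ext=".png"):
    """Return lines of the form '<dataset>/<image path> <dataset>/<mask path>'.

    Hypothetical layout (an assumption, adjust to your data):
        <data_root>/<dataset>/<image_dir>/...  input images
        <data_root>/<dataset>/<mask_dir>/...   ground-truth masks
    Images without a matching mask are skipped.
    """
    data_root = Path(data_root)
    images_root = data_root / dataset / image_dir
    masks_root = data_root / dataset / mask_dir

    lines = []
    for image in sorted(images_root.rglob(f"*{image_ext}")):
        # Mirror the image's relative path inside the mask folder.
        rel = image.relative_to(images_root)
        mask = (masks_root / rel).with_suffix(mask_ext)
        if not mask.is_file():
            continue
        # Paths are written relative to the data root, separated by one space.
        # Note: file names containing spaces would break this format.
        image_entry = image.relative_to(data_root).as_posix()
        mask_entry = mask.relative_to(data_root).as_posix()
        lines.append(f"{image_entry} {mask_entry}")
    return lines


def write_file_list(lines, output_path):
    """Write one image/ground-truth pair per line."""
    Path(output_path).write_text("\n".join(lines) + "\n")
```

For example, `write_file_list(build_file_list("data", "my_dataset"), "train_my_dataset.txt")` would write one line per image/mask pair found under `data/my_dataset`. The list file name and location are placeholders here; use whatever your new cfg.TRAIN.TASK case points to.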

By the way, have you tried training the model on a relatively small dataset, e.g. 10K images? Would the model still achieve such good performance?

A dataset of this size is comparable to those in the standard UDA setup we used. So yes, in terms of dataset size this should not be a problem.

Hope this helps.
Nikita