VDIGPKU/CBNetV2

How to train on a custom dataset

aliceinland opened this issue · 10 comments

Hi! I've edited the coco.py file inside dataset with my custom dataset, which is in COCO format. I have not understood how to train the network (in my case, Mask R-CNN) on my custom dataset.
Is there a config file to complete/edit? Or do I need to pass all the information via the command line?

Hi! The MMDetection document Train with customized datasets may be helpful to you.

Hi! Thank you!
I have followed that process; here is my config:

_base_ = [
    '../swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py'
]

model = dict(
    backbone=dict(
        type='CBSwinTransformer',
    ),
    neck=dict(
        type='CBFPN',
    ),
)

# Modify dataset related settings
dataset_type = 'COCODataset'
classes = ('short sleeve top', 'long sleeve top', 'short sleeve outwear',
           'long sleeve outwear', 'vest', 'sling', 'shorts', 'trousers',
           'skirt', 'short sleeve dress', 'long sleeve dress', 'vest dress',
           'sling dress')
data = dict(
    train=dict(
        img_prefix='/workspace/CBNetV2/dataset/train/image/',
        classes=classes,
        ann_file='/workspace/CBNetV2/dataset/train/train.json'),
    val=dict(
        img_prefix='/workspace/CBNetV2/dataset/validation/image/',
        classes=classes,
        ann_file='/workspace/CBNetV2/dataset/validation.json'))
# test=dict(
#     img_prefix='balloon/val/',
#     classes=classes,
#     ann_file='balloon/val/annotation_coco.json')

# We can use the pre-trained Mask RCNN model to obtain higher performance
#load_from = './checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
load_from = '/workspace/CBNetV2/checkpoint/swin_tiny_patch4_window7_224.pth' 

but it does not work. Do you see any errors?

The command that I am using for training is: tools/dist_train.sh configs/cbnet/mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_480-800_adamw_3x_coco.py 8
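
For reference, the documented general form of the launcher is tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}. A sketch, assuming the custom config above was saved under a hypothetical name:

# config path first, then the number of GPUs
# (my_custom_mask_rcnn_cbv2.py is a hypothetical filename for the config above)
tools/dist_train.sh configs/cbnet/my_custom_mask_rcnn_cbv2.py 8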

swin_tiny_patch4_window7_224.pth is the checkpoint of the backbone (not the detector), pretrained on ImageNet.
The detector pretrained on COCO (Dual-Swin-T with Mask R-CNN) can be downloaded as mask_rcnn_cbv2_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.pth.

It seems to me that replacing the load_from path with the detector checkpoint should work fine.
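
Concretely, that means pointing load_from at the downloaded COCO checkpoint, e.g. (the local path below is an assumption about where the file was saved):

# use the COCO-pretrained detector (Dual-Swin-T + Mask R-CNN) instead of the ImageNet backbone weights
# (path is an assumption; point it at wherever the downloaded checkpoint lives)
load_from = '/workspace/CBNetV2/checkpoint/mask_rcnn_cbv2_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.pth'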

I have another error:

    module.num_classes == len(dataset.CLASSES),
    AssertionError: The num_classes (80) in Shared2FCBBoxHead of MMDistributedDataParallel does not matches the length of CLASSES 13) in CocoDataset

I changed the number of classes in the config file (the one reported above), but I could not find anywhere else to change it.
Could you please tell me where this information needs to be passed? Thank you.

@chuxiaojie Where is the batch size set? I want to set a new batch size. Thanks.

@aliceinland
You need to change num_classes in configs/_base_/models/mask_rcnn_swin_fpn.py.
More information about configs can be found in the config documentation.
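
As a hedged alternative to editing the base file: standard MMDetection config inheritance also allows overriding those fields from the custom config itself (13 matches the class list above, and both heads must agree):

model = dict(
    roi_head=dict(
        bbox_head=dict(num_classes=13),   # Shared2FCBBoxHead: defaults to 80 for COCO
        mask_head=dict(num_classes=13)))  # FCNMaskHead: must match the bbox head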

@larsoncs
Hi, the batch size on each GPU is determined by samples_per_gpu in the config files.
Tutorial 1: Learn about Configs may be helpful to you.
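
A minimal sketch of that setting in a config (the values are examples):

data = dict(
    samples_per_gpu=2,   # images per GPU, i.e. the per-GPU batch size
    workers_per_gpu=2)   # dataloader worker processes per GPU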

RuntimeError: CUDA out of memory. Tried to allocate 120.00 MiB (GPU 0; 15.90 GiB total capacity; 14.74 GiB already allocated; 37.88 MiB free; 15.09 GiB reserved in total by PyTorch)

@chuxiaojie How can I reduce the memory usage?

You should reduce the batch size.

How can I set which GPUs to use? I have 8 GPUs but I need to use the last four, and I have not found a way to do it with the .sh script.

The batch size is 1; I want to use single-scale training. dist_train.sh can set the number of GPUs.
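
For picking specific GPUs with dist_train.sh, restricting device visibility is the usual approach (per MMDetection's docs); e.g., to train on the last four of eight GPUs:

# expose only GPUs 4-7 to the job, then launch on 4 GPUs
CUDA_VISIBLE_DEVICES=4,5,6,7 tools/dist_train.sh configs/cbnet/mask_rcnn_cbv2_swin_small_patch4_window7_mstrain_480-800_adamw_3x_coco.py 4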