KITTI dataset
thatnn opened this issue · 7 comments
First, thank you for your amazing work!
I want to train and test with KITTI-format data, so I modified some parameters, but it doesn't work. The error is below:
Traceback (most recent call last):
  File "train.py", line 228, in <module>
    main()
  File "train.py", line 172, in main
    train_model(
  File "/home/user/DSVT/tools/train_utils/train_utils.py", line 224, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/user/DSVT/tools/train_utils/train_utils.py", line 75, in train_one_epoch
    loss, tb_dict, disp_dict = model_func(model, batch)
  File "/home/user/DSVT/tools/../pcdet/models/__init__.py", line 42, in model_func
    ret_dict, tb_dict, disp_dict = model(batch_dict)
  File "/home/user/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/DSVT/tools/../pcdet/models/detectors/centerpoint.py", line 14, in forward
    loss, tb_dict, disp_dict = self.get_training_loss()
  File "/home/user/DSVT/tools/../pcdet/models/detectors/centerpoint.py", line 27, in get_training_loss
    loss_rpn, tb_dict = self.dense_head.get_loss()
  File "/home/user/DSVT/tools/../pcdet/models/dense_heads/center_head.py", line 258, in get_loss
    hm_loss = self.hm_loss_func(pred_dict['hm'], target_dicts['heatmaps'][idx])
  File "/home/user/anaconda3/envs/openpcdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/user/DSVT/tools/../pcdet/utils/loss_utils.py", line 312, in forward
    return self.neg_loss(out, target, mask=mask)
  File "/home/user/DSVT/tools/../pcdet/utils/loss_utils.py", line 282, in neg_loss_cornernet
    pos_loss = torch.log(pred) * torch.pow(1 - pred, 2) * pos_inds
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 0
Here is my YAML file:
CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/kitti_dataset.yaml
    POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: gt_sampling
              USE_ROAD_PLANE: False
              DB_INFO_PATH:
                  - kitti_dbinfos_train.pkl
              PREPARE: {
                  filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                  filter_by_difficulty: [-1],
              }
              SAMPLE_GROUPS: ['Car:15', 'Pedestrian:15', 'Cyclist:15']
              NUM_POINT_FEATURES: 4
              REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
              LIMIT_WHOLE_SCENE: True
            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x', 'y']
            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]
            - NAME: random_world_translation
              NOISE_TRANSLATE_STD: [0.5, 0.5, 0.5]

    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True
        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
              'train': True,
              'test': False
          }
        - NAME: transform_points_to_voxels_placeholder
          VOXEL_SIZE: [0.4, 0.4, 0.1875]

MODEL:
    NAME: CenterPoint

    VFE:
        NAME: DynPillarVFE3D
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [192, 192]

    BACKBONE_3D:
        NAME: DSVT
        INPUT_LAYER:
            sparse_shape: [400, 300, 32]
            downsample_stride: [[1, 1, 4], [1, 1, 4], [1, 1, 2]]
            d_model: [192, 192, 192, 192]
            set_info: [[48, 1], [48, 1], [48, 1], [48, 1]]
            window_shape: [[12, 12, 32], [12, 12, 8], [12, 12, 2], [12, 12, 1]]
            hybrid_factor: [2, 2, 1] # x, y, z
            shifts_list: [[[0, 0, 0], [6, 6, 0]], [[0, 0, 0], [6, 6, 0]], [[0, 0, 0], [6, 6, 0]], [[0, 0, 0], [6, 6, 0]]]
            normalize_pos: False

        block_name: ['DSVTBlock', 'DSVTBlock', 'DSVTBlock', 'DSVTBlock']
        set_info: [[48, 1], [48, 1], [48, 1], [48, 1]]
        d_model: [192, 192, 192, 192]
        nhead: [8, 8, 8, 8]
        dim_feedforward: [384, 384, 384, 384]
        dropout: 0.0
        activation: gelu
        reduction_type: 'attention'
        output_shape: [468, 468]
        conv_out_channel: 192
        # ues_checkpoint: True

    MAP_TO_BEV:
        NAME: PointPillarScatter3d
        INPUT_SHAPE: [468, 468, 1]
        NUM_BEV_FEATURES: 192

    BACKBONE_2D:
        NAME: BaseBEVResBackbone
        LAYER_NUMS: [1, 2, 2]
        LAYER_STRIDES: [1, 2, 2]
        NUM_FILTERS: [128, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False
        CLASS_NAMES_EACH_HEAD: [
            ['Car', 'Pedestrian', 'Cyclist']
        ]
        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: False
        NUM_HM_CONV: 2
        BN_EPS: 0.001
        BN_MOM: 0.01
        SEPARATE_HEAD_CFG:
            HEAD_ORDER: ['center', 'center_z', 'dim', 'rot']
            HEAD_DICT: {
                'center': {'out_channels': 2, 'num_conv': 2},
                'center_z': {'out_channels': 1, 'num_conv': 2},
                'dim': {'out_channels': 3, 'num_conv': 2},
                'rot': {'out_channels': 2, 'num_conv': 2},
                'iou': {'out_channels': 1, 'num_conv': 2},
            }

        TARGET_ASSIGNER_CONFIG:
            FEATURE_MAP_STRIDE: 1
            NUM_MAX_OBJS: 500
            GAUSSIAN_OVERLAP: 0.1
            MIN_RADIUS: 2
            IOU_REG_LOSS: True

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

        POST_PROCESSING:
            SCORE_THRESH: 0.5
            POST_CENTER_LIMIT_RANGE: [-80, -80, -10.0, 80, 80, 10.0]
            MAX_OBJ_PER_SAMPLE: 500
            USE_IOU_TO_RECTIFY_SCORE: True
            IOU_RECTIFIER: [0.68, 0.71, 0.65]
            NMS_CONFIG:
                NMS_TYPE: multi_class_nms # only for centerhead, use mmdet3d version nms
                NMS_THRESH: [0.7, 0.6, 0.55]
                NMS_PRE_MAXSIZE: [4096, 4096, 4096]
                NMS_POST_MAXSIZE: [500, 500, 500]

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        EVAL_METRIC: kitti

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 30
    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9
    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001
    LR_WARMUP: False
    WARMUP_EPOCH: 1
    GRAD_NORM_CLIP: 10
    LOSS_SCALE_FP16: 32.0

HOOK:
    DisableAugmentationHook:
        DISABLE_AUG_LIST: ['gt_sampling', 'random_world_flip', 'random_world_rotation', 'random_world_scaling', 'random_world_translation']
        NUM_LAST_EPOCHS: 1
Can you help me, or provide a YAML file for training on the KITTI dataset?
I'm waiting for your reply.
Thank you!!
According to your config, you are using the wrong sparse_shape: [400, 300, 32] and INPUT_SHAPE: [468, 468, 1].
Thank you for your reply.
What values fit sparse_shape and INPUT_SHAPE? Are they related to the point cloud range?
Thank you
Yes, they are determined by POINT_CLOUD_RANGE, VOXEL_SIZE and downsample_stride.
Can you suggest some values?
Thanks
The downsample_stride should be chosen carefully; it is tied to INPUT_SHAPE in MAP_TO_BEV. I recommend you read the original code.
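As a rough illustration (my own sketch, not code from the DSVT repo), you can check whether sparse_shape, downsample_stride and INPUT_SHAPE are mutually consistent, using the values from the config above:

# Sanity check (sketch): apply each stage's downsample_stride to sparse_shape
# and compare the result against MAP_TO_BEV.INPUT_SHAPE.
sparse_shape = [400, 300, 32]                          # BACKBONE_3D.INPUT_LAYER.sparse_shape (x, y, z)
downsample_stride = [[1, 1, 4], [1, 1, 4], [1, 1, 2]]  # per-stage (x, y, z) strides
input_shape = [468, 468, 1]                            # MAP_TO_BEV.INPUT_SHAPE

shape = list(sparse_shape)
for stride in downsample_stride:
    # each stage divides the grid, so every dimension must divide evenly
    assert all(s % st == 0 for s, st in zip(shape, stride)), (shape, stride)
    shape = [s // st for s, st in zip(shape, stride)]

print(shape)                 # [400, 300, 1]
print(shape == input_shape)  # False -> the two settings in the config above are inconsistent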
By the way, if you use a VOXEL_SIZE of [0.4, 0.4, 0.1875] and a POINT_CLOUD_RANGE of [0, -39.68, -3, 69.12, 39.68, 1], the sparse_shape should be [173, 199, 22].
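For example, a minimal sketch of that computation (the ceil rounding convention is my assumption, chosen because it reproduces the [173, 199, 22] above):

import math

# Derive sparse_shape from POINT_CLOUD_RANGE and VOXEL_SIZE (sketch).
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]  # [x_min, y_min, z_min, x_max, y_max, z_max]
voxel_size = [0.4, 0.4, 0.1875]                        # [vx, vy, vz]

# grid extent along each axis, in voxels
sparse_shape = [
    math.ceil((point_cloud_range[i + 3] - point_cloud_range[i]) / voxel_size[i])
    for i in range(3)
]
print(sparse_shape)  # [173, 199, 22]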
@thatnn Have you trained successfully after using the above parameters?