top_down_layer_inputs = torch.cat([upsample_feat, feat_low], 1) RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 16 but got size 15 for tensor number 1 in the list.
99HU opened this issue · 6 comments
Prerequisite
- I have searched the existing and past issues but cannot get the expected help.
- I have read the FAQ documentation but cannot get the expected help.
- The bug has not been fixed in the latest version.
🐞 Describe the bug
I am trying to train YOLOv8 on my own dataset, which has a single class. I am a beginner with MMYOLO. Is there a mistake in my config? It is based on the sample config shipped with MMYOLO. Any help would be appreciated.
The config:
`
_base_ = ['../configs/_base_/default_runtime.py', '../configs/_base_/det_p5_tta.py']
# ========================Frequently modified parameters======================
# -----data related-----
data_root = 'PaperDataSet'  # Root path of data
# Path of train annotation file
train_ann_file = 'Train.json'
train_data_prefix = ''  # Prefix of train image path
# Path of val annotation file
val_ann_file = 'Test.json'
val_data_prefix = '/'  # Prefix of val image path
num_classes = 1  # Number of classes for classification
# Batch size of a single GPU during training
train_batch_size_per_gpu = 16
# Worker to pre-fetch data for each single GPU during training
train_num_workers = 2
# persistent_workers must be False if num_workers is 0
persistent_workers = True
# -----train val related-----
# Base learning rate for optim_wrapper. Corresponding to 8xb16=64 bs
base_lr = 0.01
max_epochs = 200  # Maximum training epochs
# Disable mosaic augmentation for final 10 epochs (stage 2)
close_mosaic_epochs = 10
model_test_cfg = dict(
# The config of multi-label for multi-class prediction.
multi_label=True,
# The number of boxes before NMS
nms_pre=30000,
score_thr=0.001,  # Threshold to filter out boxes.
nms=dict(type='nms', iou_threshold=0.7),  # NMS type and threshold
max_per_img=300)  # Max number of detections of each image
# ========================Possible modified parameters========================
# -----data related-----
img_scale = (256, 236)  # width, height
# Dataset type, this will be used to define the dataset
dataset_type = 'YOLOv5CocoDataset'
# Batch size of a single GPU during validation
val_batch_size_per_gpu = 1
# Worker to pre-fetch data for each single GPU during validation
val_num_workers = 2
# Config of batch shapes. Only on val.
# We tested YOLOv8-m will get 0.02 higher than not using it.
batch_shapes_cfg = None
# You can turn on batch_shapes_cfg
# by uncommenting the following lines.
# batch_shapes_cfg = dict(
#     type='BatchShapePolicy',
#     batch_size=val_batch_size_per_gpu,
#     img_size=img_scale[0],
#     # The image scale of padding should be divided by pad_size_divisor
#     size_divisor=32,
#     # Additional paddings for pixel scale
#     extra_pad_ratio=0.5)
# -----model related-----
# The scaling factor that controls the depth of the network structure
deepen_factor = 0.33
# The scaling factor that controls the width of the network structure
widen_factor = 0.5
# Strides of multi-scale prior box
strides = [8, 16, 32]
# The output channel of the last stage
last_stage_out_channels = 1024
num_det_layers = 3  # The number of model output scales
norm_cfg = dict(type='BN', momentum=0.03, eps=0.001)  # Normalization config
# -----train val related-----
affine_scale = 0.5  # YOLOv5RandomAffine scaling ratio
# YOLOv5RandomAffine aspect ratio of width and height thres to filter bboxes
max_aspect_ratio = 100
tal_topk = 10  # Number of bbox selected in each level
tal_alpha = 0.5  # A Hyper-parameter related to alignment_metrics
tal_beta = 6.0  # A Hyper-parameter related to alignment_metrics
# TODO: Automatically scale loss_weight based on number of detection layers
loss_cls_weight = 0.5
loss_bbox_weight = 7.5
# Since the dfloss is implemented differently in the official
# and mmdet, we're going to divide loss_weight by 4.
loss_dfl_weight = 1.5 / 4
lr_factor = 0.01  # Learning rate scaling factor
weight_decay = 0.0005
# Save model checkpoint and validation intervals in stage 1
save_epoch_intervals = 10
# validation intervals in stage 2
val_interval_stage2 = 1
# The maximum checkpoints to keep.
max_keep_ckpts = 2
# Single-scale training is recommended to
# be turned on, which can speed up training.
env_cfg = dict(cudnn_benchmark=True)
# ===============================Unmodified in most cases====================
model = dict(
type='YOLODetector',
data_preprocessor=dict(
type='YOLOv5DetDataPreprocessor',
mean=[0., 0., 0.],
std=[255., 255., 255.],
bgr_to_rgb=True),
backbone=dict(
type='YOLOv8CSPDarknet',
arch='P5',
last_stage_out_channels=last_stage_out_channels,
deepen_factor=deepen_factor,
widen_factor=widen_factor,
norm_cfg=norm_cfg,
act_cfg=dict(type='SiLU', inplace=True)),
neck=dict(
type='YOLOv8PAFPN',
deepen_factor=deepen_factor,
widen_factor=widen_factor,
in_channels=[256, 512, last_stage_out_channels],
out_channels=[256, 512, last_stage_out_channels],
num_csp_blocks=3,
norm_cfg=norm_cfg,
act_cfg=dict(type='SiLU', inplace=True)),
bbox_head=dict(
type='YOLOv8Head',
head_module=dict(
type='YOLOv8HeadModule',
num_classes=num_classes,
in_channels=[256, 512, last_stage_out_channels],
widen_factor=widen_factor,
reg_max=16,
norm_cfg=norm_cfg,
act_cfg=dict(type='SiLU', inplace=True),
featmap_strides=strides),
prior_generator=dict(
type='mmdet.MlvlPointGenerator', offset=0.5, strides=strides),
bbox_coder=dict(type='DistancePointBBoxCoder'),
# scaled based on number of detection layers
loss_cls=dict(
type='mmdet.CrossEntropyLoss',
use_sigmoid=True,
reduction='none',
loss_weight=loss_cls_weight),
loss_bbox=dict(
type='IoULoss',
iou_mode='ciou',
bbox_format='xyxy',
reduction='sum',
loss_weight=loss_bbox_weight,
return_iou=False),
loss_dfl=dict(
type='mmdet.DistributionFocalLoss',
reduction='mean',
loss_weight=loss_dfl_weight)),
train_cfg=dict(
assigner=dict(
type='BatchTaskAlignedAssigner',
num_classes=num_classes,
use_ciou=True,
topk=tal_topk,
alpha=tal_alpha,
beta=tal_beta,
eps=1e-9)),
test_cfg=model_test_cfg)
albu_train_transforms = [
dict(type='Blur', p=0.01),
dict(type='MedianBlur', p=0.01),
dict(type='ToGray', p=0.01),
dict(type='CLAHE', p=0.01)
]
pre_transform = [
dict(type='LoadImageFromFile', backend_args=_base_.backend_args),
dict(type='LoadAnnotations', with_bbox=True)
]
last_transform = [
dict(
type='mmdet.Albu',
transforms=albu_train_transforms,
bbox_params=dict(
type='BboxParams',
format='pascal_voc',
label_fields=['gt_bboxes_labels', 'gt_ignore_flags']),
keymap={
'img': 'image',
'gt_bboxes': 'bboxes'
}),
dict(type='YOLOv5HSVRandomAug'),
dict(type='mmdet.RandomFlip', prob=0.5),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape', 'flip',
'flip_direction'))
]
train_pipeline = [
*pre_transform,
dict(
type='Mosaic',
img_scale=img_scale,
pad_val=114.0,
pre_transform=pre_transform),
dict(
type='YOLOv5RandomAffine',
max_rotate_degree=0.0,
max_shear_degree=0.0,
scaling_ratio_range=(1 - affine_scale, 1 + affine_scale),
max_aspect_ratio=max_aspect_ratio,
# img_scale is (width, height)
border=(-img_scale[0] // 2, -img_scale[1] // 2),
border_val=(114, 114, 114)),
*last_transform
]
train_pipeline_stage2 = [
*pre_transform,
dict(type='YOLOv5KeepRatioResize', scale=img_scale),
dict(
type='LetterResize',
scale=img_scale,
allow_scale_up=True,
pad_val=dict(img=114.0)),
dict(
type='YOLOv5RandomAffine',
max_rotate_degree=0.0,
max_shear_degree=0.0,
scaling_ratio_range=(1 - affine_scale, 1 + affine_scale),
max_aspect_ratio=max_aspect_ratio,
border_val=(114, 114, 114)), *last_transform
]
train_dataloader = dict(
batch_size=train_batch_size_per_gpu,
num_workers=train_num_workers,
persistent_workers=persistent_workers,
pin_memory=True,
sampler=dict(type='DefaultSampler', shuffle=True),
collate_fn=dict(type='yolov5_collate'),
dataset=dict(
type=dataset_type,
data_root=data_root,
ann_file=train_ann_file,
data_prefix=dict(img=train_data_prefix),
filter_cfg=dict(filter_empty_gt=False, min_size=32),
pipeline=train_pipeline))
test_pipeline = [
dict(type='LoadImageFromFile', backend_args=_base_.backend_args),
dict(type='YOLOv5KeepRatioResize', scale=img_scale),
dict(
type='LetterResize',
scale=img_scale,
allow_scale_up=False,
pad_val=dict(img=114)),
dict(type='LoadAnnotations', with_bbox=True, scope='mmdet'),
dict(
type='mmdet.PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor', 'pad_param'))
]
val_dataloader = dict(
batch_size=val_batch_size_per_gpu,
num_workers=val_num_workers,
persistent_workers=persistent_workers,
pin_memory=True,
drop_last=False,
sampler=dict(type='DefaultSampler', shuffle=False),
dataset=dict(
type=dataset_type,
data_root=data_root,
test_mode=True,
data_prefix=dict(img=val_data_prefix),
ann_file=val_ann_file,
pipeline=test_pipeline,
batch_shapes_cfg=batch_shapes_cfg))
test_dataloader = val_dataloader
param_scheduler = None
optim_wrapper = dict(
type='OptimWrapper',
clip_grad=dict(max_norm=10.0),
optimizer=dict(
type='SGD',
lr=base_lr,
momentum=0.937,
weight_decay=weight_decay,
nesterov=True,
batch_size_per_gpu=train_batch_size_per_gpu),
constructor='YOLOv5OptimizerConstructor')
default_hooks = dict(
param_scheduler=dict(
type='YOLOv5ParamSchedulerHook',
scheduler_type='linear',
lr_factor=lr_factor,
max_epochs=max_epochs),
checkpoint=dict(
type='CheckpointHook',
interval=save_epoch_intervals,
save_best='auto',
max_keep_ckpts=max_keep_ckpts))
custom_hooks = [
dict(
type='EMAHook',
ema_type='ExpMomentumEMA',
momentum=0.0001,
update_buffers=True,
strict_load=False,
priority=49),
dict(
type='mmdet.PipelineSwitchHook',
switch_epoch=max_epochs - close_mosaic_epochs,
switch_pipeline=train_pipeline_stage2)
]
val_evaluator = dict(
type='mmdet.CocoMetric',
proposal_nums=(100, 1, 10),
ann_file=data_root + "\\" + val_ann_file,
metric='bbox')
test_evaluator = val_evaluator
train_cfg = dict(
type='EpochBasedTrainLoop',
max_epochs=max_epochs,
val_interval=save_epoch_intervals,
dynamic_intervals=[((max_epochs - close_mosaic_epochs),
val_interval_stage2)])
val_cfg = dict(type='ValLoop')
test_cfg = dict(type='TestLoop')
`
The command
python tools\train.py moudle\yolov8.py
The error message
11/17 14:24:07 - mmengine - INFO - Checkpoints will be saved to E:\HWS\mmlab\mmyolo-main\work_dirs\yolov8.
Traceback (most recent call last):
  File "tools\train.py", line 123, in <module>
    main()
  File "tools\train.py", line 119, in main
    runner.train()
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmengine\runner\runner.py", line 1777, in train
    model = self.train_loop.run()  # type: ignore
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmengine\runner\loops.py", line 96, in run
    self.run_epoch()
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmengine\runner\loops.py", line 112, in run_epoch
    self.run_iter(idx, data_batch)
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmengine\runner\loops.py", line 128, in run_iter
    outputs = self.runner.model.train_step(
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmengine\model\base_model\base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmengine\model\base_model\base_model.py", line 346, in _run_forward
    results = self(**data, mode=mode)
  File "C:\Users\A\AppData\Roaming\Python\Python38\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmdet\models\detectors\base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmdet\models\detectors\single_stage.py", line 77, in loss
    x = self.extract_feat(batch_inputs)
  File "D:\ProgramData\anaconda3\envs\PaperLocation\lib\site-packages\mmdet\models\detectors\single_stage.py", line 148, in extract_feat
    x = self.neck(x)
  File "C:\Users\A\AppData\Roaming\Python\Python38\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "e:\hws\mmlab\mmyolo-main\mmyolo\models\necks\base_yolo_neck.py", line 239, in forward
    top_down_layer_inputs = torch.cat([upsample_feat, feat_low], 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 16 but got size 15 for tensor number 1 in the list.
Environment
mmcv 2.0.1
mmdet 3.2.0
mmengine 0.9.1
mmyolo 0.6.0
Additional information
I am using my own dataset, which is in MS COCO format. I have trained on it with other OpenMMLab projects before, so the dataset itself is probably not the cause. My config is copied from the sample configs, and I only modified the "Frequently modified parameters" and "Possible modified parameters" sections. Is there anything else I need to change that I missed? I look forward to your reply. Thanks very much.
I solved this issue by modifying the img_scale.
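For anyone landing here with the same error, a likely explanation is that one side of `img_scale = (256, 236)` is not a multiple of the largest stride (32). The sketch below is only an illustration under stated assumptions: it assumes each of the five downsampling stages behaves like a 3x3, stride-2 convolution with padding 1, and that the neck upsamples the coarser map by exactly 2 before `torch.cat`. The function names are hypothetical and are not part of MMYOLO.

```python
import math


def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of one convolution along a single spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pyramid_sizes(size, num_downsamples=5):
    """Sizes after each assumed stride-2 stage.

    The last three entries correspond to the stride-8, stride-16 and
    stride-32 feature maps that the neck receives.
    """
    sizes = []
    for _ in range(num_downsamples):
        size = conv_out(size)
        sizes.append(size)
    return sizes


def neck_is_consistent(size):
    """True if 2x upsampling of each coarser map matches the next finer one."""
    p3, p4, p5 = pyramid_sizes(size)[-3:]
    return p5 * 2 == p4 and p4 * 2 == p3


def round_up(size, divisor=32):
    """Smallest multiple of divisor that is >= size."""
    return math.ceil(size / divisor) * divisor


if __name__ == "__main__":
    for side in (256, 236):
        print(side, pyramid_sizes(side), neck_is_consistent(side))
    # Suggested replacement for the side that is not a multiple of 32.
    print("round 236 up to:", round_up(236))
```

Under these assumptions, a side of 236 gives a stride-32 map of 8 and a stride-16 map of 15. Upsampling 8 by 2 yields 16, which cannot be concatenated with 15, and that matches the "Expected size 16 but got size 15" message. Choosing both sides of `img_scale` as multiples of 32 (for example 256 instead of 236) avoids the mismatch.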
Thank you very much. I made a mistake. My problem has been solved. : )
Did you change img_scale everywhere it appears? Also, what is your input size, 640x640?
Hi @99HU, did you run into this error?
KeyError: 'YOLOv5CocoDataset is not in the mmengine::dataset registry. Please check whether the value of YOLOv5CocoDataset is correct or it was registered as expected. More details can be found at https://mmengine.readthedocs.io/en/latest/advanced_tutorials/config.html#import-the-custom-module'