LDConv: Linear deformable convoluton for improving convolutioanl neural networks (Image and Vision Computing)

This repository is a PyTorch implementation of our paper: LDConv: Linear deformable convoluton for improving convolutioanl neural networks.

If you are interested in our other work, you can find information on https://github.com/Liuchen1997/RFAConv.

The relevant interpolation codes and resampling codes are referenced at https://github.com/dontLoveBugs/Deformable_ConvNet_pytorch.

The code has been opened, thank you for your support.

LDConv provides kernels of different sizes for efficient extraction of features.

Object detection based on COCO2017 and YOLOv5

Models	LDConv	AP50	AP75	AP	APS	APM	APL	GFLOPS	Params (M)
YOLOv5n (Baseline)	-	45.6	28.9	27.5	13.5	31.5	35.9	4.5	1.87
	3	47.8	31	29.8	14.5	33.2	41	3.8	1.51
YOLOv5n	5	48.8	32.6	31	14.6	34.1	43.2	4.1	1.65
	9	50.5	33.9	32.3	14.9	36.1	44.1	4.8	1.94
	13	51.2	34.5	33	15.7	36.3	45.6	5.5	2.23
YOLOv5s (Baseline)	-	57	39.9	37.1	20.9	42.4	47.8	16.4	7.23
	4	58.2	41.9	39.2	21.4	43.2	53.4	14.1	6.01
YOLOv5s	6	59.2	42.6	39.9	21.5	44.2	54.7	15.3	6.55
	7	59.4	43.2	40.4	21.5	44.6	55.1	15.9	6.82

Object detection based on VOC 7+12 and YOLOv7

Models	LDConv	Precision	Recall	mAP50	mAP	FLOPS	Params
YOLOv7-tiny (Baseline)	-	77.3	69.8	76.4	50.2	13.2	6.06
	3	80.1	68.4	76.1	50.3	12.1	5.56
	4	78.2	70.3	76.2	50.7	12.4	5.66
YOLOv7-tiny	5	77	71.1	76.5	50.8	12.6	5.75
	6	79.6	69.9	76.9	51	12.9	5.85
	8	78.6	70.1	76.7	51.2	13.4	6.04
	9	81	69.3	76.7	51.3	13.7	6.14

Object detection based on VisDrone-DET2021 and YOLOv5

Models	LDConv	Precision	Recall	mAP50	mAP	FLOPS	Params (M)
YOLOv5n (Baseline)	-	38.5	28	26.4	13.4	4.2	1.77
	3	37.9	27.4	25.9	13.2	3.5	1.41
	5	40	28	26.9	13.7	3.8	1.56
	6	38.1	28.1	26.8	13.6	4	1.63
YOLOv5n	7	39.8	28.2	27.5	14.2	4.2	1.7
	9	39.7	28.9	27.7	14.3	4.5	1.84
	11	40.4	28.8	27.7	14.2	4.8	1.99
	14	40	28.8	27.9	14.3	5.3	2.2

Comparison experiments

Models	AP50	AP75	AP	APS	APM	APL	GFLOPS	Params (M)
YOLOv5s	54.8	37.5	35	19.2	40	45.2	16.4	7.23
YOLOv5s (DSConv =5)	43.2	23.5	23.9	13.0	27.6	30.5	14.8	6.45
YOLOv5s (LDConv=5)	56.6	40.7	38	20.8	41.8	52	14.8	6.54
YOLOv5s (LDConv=9)	57.8	41.4	38.7	20.8	42.8	52.3	17.1	7.37
YOLOv5s (LDConv=9, padding)	58.3	41.9	39.2	21.6	43.2	53.5	17.1	7.37
YOLOv5s (Deformable Conv = 3)	58.5	41.8	39.1	20.8	43.4	53.6	17.1	7.37
YOLOv5s (LDConv=11)	58.5	42.1	39.3	21.9	43.3	53.8	18.3	7.91
YOLOv5s (LDConv=11, padding)	58.6	42.1	39.5	21.3	43.7	53.2	18.3	7.91

Comparison experiments

Models	Precision	Recall	mAP50	mAP	GFLOPS	Params (M)
YOLOv5n	73.8	62.2	68.1	41.5	4.2	1.77
YOLOv5n (DSConv=4)	63	50.4	54.2	26.1	3.7	1.55
YOLOv5n (LDConv=4)	76.5	63.6	70.8	46.5	3.7	1.55
YOLOv5n (DSConv=9)	60.6	50.8	53.4	25.3	4.8	1.9
YOLOv5n (LDConv=9)	76.7	65.2	71.8	48.4	4.8	1.9

Exploring experiments

Models	AP50	AP75	AP	APS	APM	APL	GFLOPS	Params (M)
YOLOv8n	49.0	37.1	34.2	16.9	37.1	49.1	8.7	3.15
YOLOv8n-5 (Sampled Shape 1)	49.5	37.6	34.9	16.8	38.2	50.2	8.4	2.94
YOLOv8n-5 (Sampled Shape 2)	49.6	37.8	34.9	15.9	38.4	50.1	8.4	2.94
YOLOv8n-5 (Sampled Shape 3)	49.6	38.1	35	16.6	38.2	50.9	8.4	2.94
YOLOv8n-6 (Sampled Shape 1)	50.1	38.3	35.3	16.6	38.6	51.1	8.6	3.01
YOLOv8n-6 (Sampled Shape 2)	50.2	38.2	35.4	16.6	38.3	51.3	8.6	3.01

Models	Initial Shape	Precision	Recall	mAP50	mAP
YOLOv5n	a	39.5	27.9	26.9	13.7
YOLOv5n	b	39.4	28.2	26.8	13.6
YOLOv5n	c	37.4	27.8	26.1	13.4
YOLOv5n	d	37.5	27	25.5	12.9
YOLOv5n	e	38.4	27.6	26.4	13.4

Citation

You may want to cite:

@inproceedings{dai2017deformable,
  title={Deformable convolutional networks},
  author={Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={764--773},
  year={2017}
}

@article{zhang2024ldconv,
  title={LDConv: Linear deformable convolution for improving convolutional neural networks},
  author={Zhang, Xin and Song, Yingze and Song, Tingting and Yang, Degang and Ye, Yichen and Zhou, Jie and Zhang, Liming},
  journal={Image and Vision Computing},
  pages={105190},
  year={2024},
  publisher={Elsevier}
}

CV-ZhangXin/LDConv

LDConv: Linear deformable convoluton for improving convolutioanl neural networks (Image and Vision Computing)

If you are interested in our other work, you can find information on https://github.com/Liuchen1997/RFAConv.

LDConv provides kernels of different sizes for efficient extraction of features.

Object detection based on COCO2017 and YOLOv5

Object detection based on VOC 7+12 and YOLOv7

Object detection based on VisDrone-DET2021 and YOLOv5

Comparison experiments

Comparison experiments

Exploring experiments

Citation

You may want to cite: