rishikksh20/convolution-vision-transformers

Implementation of convolutional projection

leoxiaobin opened this issue · 7 comments

Hi @rishikksh20,

Thank you for your quick implementation!

I noticed that your depth-wise separable convolution is implemented as depth-wise conv --> point-wise conv. In the paper, CvT's depth-wise separable convolution is implemented as depth-wise conv --> BN --> point-wise conv.
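For readers following along, the ordering described above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the code from this repo or from the official implementation: the class name `DepthwiseSeparableConv2d`, its default arguments, and the tensor sizes are assumptions made for the example.

```python
import torch
from torch import nn


class DepthwiseSeparableConv2d(nn.Module):
    """Depth-wise conv -> BatchNorm -> point-wise conv.

    Hypothetical class for illustration; names and defaults are assumptions.
    """

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        # groups=in_channels makes the convolution depth-wise:
        # each input channel is filtered by its own kernel.
        self.depthwise = nn.Conv2d(
            in_channels,
            in_channels,
            kernel_size,
            stride=stride,
            padding=padding,
            groups=in_channels,
            bias=False,  # assumption: bias dropped because BatchNorm follows
        )
        # BatchNorm sits between the two convolutions, as described in the issue.
        self.bn = nn.BatchNorm2d(in_channels)
        # A 1x1 convolution mixes information across channels (point-wise).
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.bn(self.depthwise(x)))


# Example input: batch of 2, 64 channels, 14x14 token map (sizes are arbitrary).
x = torch.randn(2, 64, 14, 14)
proj = DepthwiseSeparableConv2d(64, 64)
out = proj(x)
print(out.shape)  # torch.Size([2, 64, 14, 14])
```

With `kernel_size=3`, `stride=1`, and `padding=1` the spatial size is preserved; a stride of 2 would halve it, which is one way a projection like this can reduce the number of tokens.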

@leoxiaobin Oh, my bad. I just reused my older SepCon2d class in this repo. I will update that class.

Done. Do let me know if you find any other issues in the code.

@leoxiaobin Hi! When will the official implementation be open source?

@leoxiaobin
The official code URL given in the paper is broken. Do you plan to release the code and model weights?

Hi @youngwanLEE, you can get the pre-trained model at https://github.com/microsoft/CvT

@leoxiaobin Hi! When will the official implementation be open source?

Please find it at https://github.com/microsoft/CvT

Thanks @leoxiaobin, I am adding the repo to the README.