DingXiaoH/RepLKNet-pytorch

Installing large_depthwise_conv2d_torch_extension failed on Windows 10

newcomertzc opened this issue · 7 comments

My Python version: 3.8.12 / 3.8.13
My PyTorch version: 1.8.2 / 1.10.1

I tried to install the extension from https://github.com/MegEngine/cutlass/tree/master/examples/19_large_depthwise_conv2d_torch_extension on my Windows 10 machine, but the build fails with the following errors:

D:\*****\lib\site-packages\torch\include\pybind11\cast.h(1429): error: too few arguments for template template parameter "Tuple"
          detected during instantiation of class "pybind11::detail::tuple_caster<Tuple, Ts...> [with Tuple=std::pair, Ts=<T1, T2>]"
(1507): here

D:\*****\lib\site-packages\torch\include\pybind11\cast.h(1503): error: too few arguments for template template parameter "Tuple"
          detected during instantiation of class "pybind11::detail::tuple_caster<Tuple, Ts...> [with Tuple=std::pair, Ts=<T1, T2>]"
(1507): here

Hi, we have not tried Windows, but this looks like a CUDA-version-related error. We used both CUDA 10.2 and 11.2, and both worked fine.
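Since the reply above points at the CUDA version, it may help to post the exact versions involved. The following is a minimal sketch, not part of the extension: `describe_environment` is a hypothetical helper name, and the `cast.h` location assumes the same `torch\include\pybind11` layout that appears in the error messages.

```python
import platform
from pathlib import Path


def describe_environment():
    """Collect the version details that matter when building the extension."""
    info = {
        "python": platform.python_version(),
        "os": f"{platform.system()} {platform.release()}",
    }
    try:
        import torch  # optional: only reported when PyTorch is installed
    except ImportError:
        info["torch"] = None
        return info

    info["torch"] = torch.__version__
    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    # Assumed location of the bundled pybind11 header named in the errors.
    cast_h = Path(torch.__file__).parent / "include" / "pybind11" / "cast.h"
    info["cast_h"] = str(cast_h) if cast_h.exists() else None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Note that `torch.version.cuda` reports the CUDA version PyTorch was built with, which can differ from the toolkit whose `nvcc` compiles the extension, so both are worth comparing.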

Hey, could you add me on QQ so we can talk about this? I'm on Windows too. 1183654643

Did you manage to solve it? On Windows 10 my installation fails with error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\link.exe' failed with exit status 1181

Solved it by installing Linux. Training is now about 4x faster, but still fairly slow.

How did you solve it? Is your contact information correct? I can't find you when I search on QQ.

I deleted my QQ account. WeChat: liu_yi_fei_lao_gong

To install 19_large_depthwise_conv2d_torch_extension on Windows 10, you can try the following steps.

  1. Modify the cast.h file as described in BachiLi/diffvg#26 (comment). The file is in the pybind11 include directory bundled with PyTorch, for example C:\Users\****\AppData\Roaming\Python\Python39\site-packages\torch\include\pybind11\cast.h.

Specifically, change this part of the code:
template <typename T1, typename T2> class type_caster<std::pair<T1, T2>> : public tuple_caster<std::pair, T1, T2> {};
to:

template <typename T1, typename T2> class type_caster<std::pair<T1, T2>> {
    typedef std::pair<T1, T2> type;
public:
    bool load(handle src, bool convert) {
        if (!isinstance<sequence>(src))
            return false;
        const auto seq = reinterpret_borrow<sequence>(src);
        if (seq.size() != 2)
            return false;
        return first.load(seq[0], convert) && second.load(seq[1], convert);
    }

    static handle cast(const type &src, return_value_policy policy, handle parent) {
        auto o1 = reinterpret_steal<object>(make_caster<T1>::cast(src.first, policy, parent));
        auto o2 = reinterpret_steal<object>(make_caster<T2>::cast(src.second, policy, parent));
        if (!o1 || !o2)
            return handle();
        tuple result(2);
        PyTuple_SET_ITEM(result.ptr(), 0, o1.release().ptr());
        PyTuple_SET_ITEM(result.ptr(), 1, o2.release().ptr());
        return result.release();
    }

    static constexpr auto name = _("Pair");

    template <typename T> using cast_op_type = type;

    operator type() & { return type(cast_op<T1>(first), cast_op<T2>(second)); }
    operator type() && { return type(cast_op<T1>(std::move(first)), cast_op<T2>(std::move(second))); }
protected:
    make_caster<T1> first;
    make_caster<T2> second;
};
  2. Modify the CUDA files cutlass\19_large_depthwise_conv2d_torch_extension\*.cu:
  • 19_large_depthwise_conv2d_torch_extension\backward_data_fp16.cu
  • 19_large_depthwise_conv2d_torch_extension\backward_data_fp32.cu
  • 19_large_depthwise_conv2d_torch_extension\backward_filter_fp16.cu
  • 19_large_depthwise_conv2d_torch_extension\backward_filter_fp32.cu
  • 19_large_depthwise_conv2d_torch_extension\forward_fp16.cu
  • 19_large_depthwise_conv2d_torch_extension\forward_fp32.cu

In each file, change

options.update({input.size(0), input.size(2), input.size(3), input.size(1)},
                   {weight.size(0), weight.size(2), weight.size(3), 1});

to

options.update({(int)input.size(0), (int)input.size(2), (int)input.size(3), (int)input.size(1)},
                   {(int)weight.size(0), (int)weight.size(2), (int)weight.size(3), 1});
  3. Run `python setup.py install --user`.
  4. Done.

Tested on Windows 10, Python 3.9, Visual Studio 2022.
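The `(int)` cast edit to the six .cu files can also be scripted instead of done by hand. This is only a sketch under assumptions: `add_int_casts` and `patch_directory` are hypothetical helper names, and the regular expressions assume every `options.update(...)` call has the `name.size(N)` shape shown above (only the `input`/`weight` form appears in this thread). Each changed file gets a `.bak` backup first.

```python
import re
from pathlib import Path

# Matches one whole `options.update(...);` statement, including line breaks.
UPDATE_CALL = re.compile(r"options\.update\(.*?\);", re.DOTALL)
# Matches `name.size(N)` unless an `(int)` cast already precedes it.
SIZE_CALL = re.compile(r"(?<!\(int\))\b(\w+\.size\(\d+\))")


def add_int_casts(source: str) -> str:
    """Prefix every `x.size(N)` inside options.update(...) with `(int)`."""
    def patch(match):
        return SIZE_CALL.sub(r"(int)\1", match.group(0))
    return UPDATE_CALL.sub(patch, source)


def patch_directory(ext_dir: Path) -> list:
    """Patch all .cu files in ext_dir; return the names of changed files."""
    changed = []
    for cu_file in sorted(ext_dir.glob("*.cu")):
        original = cu_file.read_text(encoding="utf-8")
        patched = add_int_casts(original)
        if patched != original:
            # Keep a backup next to the file before overwriting it.
            backup = cu_file.with_name(cu_file.name + ".bak")
            backup.write_text(original, encoding="utf-8")
            cu_file.write_text(patched, encoding="utf-8")
            changed.append(cu_file.name)
    return changed
```

To use it, call `patch_directory` with the `Path` of the directory that holds the six .cu files. The function leaves already-cast calls alone, so running it twice is harmless. The casts presumably matter because `Tensor::size()` returns a 64-bit integer and MSVC rejects the implicit narrowing to `int` inside a braced initializer list.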