PaddlePaddle/PaddleCustomDevice

[mlu] Training a transformer hangs the whole system with a CPU soft lockup

robvoid opened this issue · 13 comments

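Hello, could you share your CPU model and MLU model? You can check them with the lscpu and cnmon commands, respectively.

Also, please run top sorted by CPU usage and check whether any process other than the Python training job is consuming a lot of CPU and causing the problem. Thanks!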

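The CPU is an Intel Xeon Gold 6330 and the MLU is an mlu-370X4, in an 8-card machine. The driver version is v5.10.22 and the operating system is CentOS 7. CNN networks work fine for both training and inference, but no transformer model does: I have tested ViT, SWIN, CSWIN, and MobileViT, and each one triggers a CPU soft lockup as soon as training starts. Could this be caused by the driver version or by the base operating system? If it is related to the operating system, is there a base image built on CentOS 7? Or could you provide the steps for building the base image, so that I can try with both the host system and the Docker image on CentOS 7?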

@robvoid Is the development image you are using the latest version listed in https://github.com/PaddlePaddle/PaddleCustomDevice/blob/develop/backends/mlu/README_cn.md? Following the README, please run the command below and check which MLU SDK versions it prints.

# 2) Check the currently installed version
python -c "import paddle_custom_device; paddle_custom_device.mlu.version()"
# Expected output
version: 0.0.0
commit: 5c29d8a4bfd742081ec3b457e02e276f738ef786
cntoolkit: 3.8.4
cnnl: 1.23.2
cnnlextra: 1.6.1
cncl: 1.14.0
mluops: 0.11.0

The MLU SDK driver version compatible with the versions above is v5.10.26, which does differ from the driver version you have.

The base image is registry.baidubce.com/device/paddle-mlu:ubuntu20-x86_64-gcc84-py310
The output of python -c "import paddle_custom_device; paddle_custom_device.mlu.version()" is:
version: 0.0.0
commit: dc78966
cntoolkit: 3.8.2
cnnl: 1.23.2
cncl: 1.14.0
mluops: 0.11.0
I'll try upgrading the driver.
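
The expected and reported version listings in this thread can be compared mechanically rather than by eye. The following is a minimal sketch, not part of paddle_custom_device: parse_versions and diff_versions are hypothetical helper names, and the two text blocks are copied from the outputs quoted above.

```python
# Hypothetical helpers for comparing two "name: value" version listings.
# They only parse text; they do not call paddle_custom_device.

EXPECTED = """\
version: 0.0.0
commit: 5c29d8a4bfd742081ec3b457e02e276f738ef786
cntoolkit: 3.8.4
cnnl: 1.23.2
cnnlextra: 1.6.1
cncl: 1.14.0
mluops: 0.11.0
"""

REPORTED = """\
version: 0.0.0
commit: dc78966
cntoolkit: 3.8.2
cnnl: 1.23.2
cncl: 1.14.0
mluops: 0.11.0
"""


def parse_versions(text):
    """Turn 'name: value' lines into a dict, skipping lines without a value."""
    versions = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() and value.strip():
            versions[name.strip()] = value.strip()
    return versions


def diff_versions(expected, actual):
    """Return {name: (expected, actual)} for entries that differ or are missing."""
    diffs = {}
    for name in sorted(set(expected) | set(actual)):
        want, got = expected.get(name), actual.get(name)
        if want != got:
            diffs[name] = (want, got)
    return diffs


if __name__ == "__main__":
    diffs = diff_versions(parse_versions(EXPECTED), parse_versions(REPORTED))
    for name, (want, got) in diffs.items():
        print(f"{name}: expected {want}, reported {got}")
```

On the values quoted in this thread, the differing entries are commit, cntoolkit (3.8.4 expected, 3.8.2 reported), and cnnlextra, which is absent from the reported output.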

Hello, after re-pulling the image and the PaddleCustomDevice code, the MLU backend fails to compile with this error:
/usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory

After manually building and installing paddle/third_party/eigen3 and adding /usr/local/include/eigen3 to CPATH, the PaddleCustomDevice build can continue.
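
Before rebuilding, it can help to confirm that the header named in the error is actually reachable from one of the directories in CPATH. The following is a rough sketch under stated assumptions: cpath_dirs and find_eigen_root are hypothetical helper names, the only fallback directory is the /usr/local/include/eigen3 path mentioned above, and empty CPATH entries are ignored (a simplification of how GCC treats them).

```python
import os

# Relative path of the header that the compiler reports as missing.
EIGEN_TENSOR_HEADER = os.path.join("unsupported", "Eigen", "CXX11", "Tensor")


def cpath_dirs(environ=None):
    """Return the non-empty directories listed in the CPATH variable."""
    env = os.environ if environ is None else environ
    return [d for d in env.get("CPATH", "").split(os.pathsep) if d]


def find_eigen_root(candidates):
    """Return the first directory that contains the Eigen Tensor header, else None."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, EIGEN_TENSOR_HEADER)):
            return directory
    return None


if __name__ == "__main__":
    # /usr/local/include/eigen3 is the install location reported in this thread.
    candidates = cpath_dirs() + ["/usr/local/include/eigen3"]
    root = find_eigen_root(candidates)
    if root is None:
        print("Eigen Tensor header not found in:", candidates)
    elif root in cpath_dirs():
        print("Eigen Tensor header is visible through CPATH at", root)
    else:
        print("Header exists at", root, "but that directory is not in CPATH")
```

If the last message is printed, exporting CPATH with that directory included before running the compile script should match the workaround described above.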

Also, in this version of the image the cntoolkit version is still 3.8.2.
The build output is as follows:
+++ dirname tools/compile.sh
++ cd tools/../
++ pwd

+ SOURCE_ROOT=/paddle/git/PaddleCustomDevice/backends/mlu
+ mkdir -p /paddle/git/PaddleCustomDevice/backends/mlu/build
+ cd /paddle/git/PaddleCustomDevice/backends/mlu/build
++ uname -i
+ arch=x86_64
+ '[' x86_64 == x86_64 ']'
+ WITH_MKLDNN=ON
+ WITH_ARM=OFF
+ cat
    ========================================
    Configuring cmake in build ...
    -DCMAKE_BUILD_TYPE=Release
    -DWITH_KERNELS=ON
    -DWITH_TESTING=ON
    -DWITH_MKLDNN=ON
    -DWITH_ARM=OFF
    -DON_INFER=OFF
    ========================================
+ set +e
+ cmake .. -DCMAKE_BUILD_TYPE=Release -DWITH_KERNELS=ON -DWITH_TESTING=ON -DWITH_MKLDNN=ON -DWITH_ARM=OFF -DON_INFER=OFF -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
    I0422 11:09:13.208012 69 init.cc:234] ENV [CUSTOM_DEVICE_ROOT]=""
    I0422 11:09:13.208043 69 init.cc:143] Try loading custom device libs from: [""]
    I0422 11:09:13.208073 69 custom_kernel.cc:39] No custom kernel info found in loaded lib(s).
    I0422 11:09:13.208077 69 init.cc:155] Finished in LoadCustomDevice with libs_path: [""]
    -- PADDLE_CORE_LIB: /usr/local/lib/python3.10/dist-packages/paddle/base/libpaddle.so
    -- Run 'git submodule update --init Paddle' in /paddle/git/PaddleCustomDevice
    -- PADDLE_SOURCE_DIR=/paddle/git/PaddleCustomDevice/Paddle
    -- Paddle version is 0.0.0
    -- NEUWARE_HOME: /usr/local/neuware
    -- cntoolkit version is 3.8.2
    -- cnnl version is 1.23.2
    -- cnnl_extra version is 1.6.1
    -- cncl version is 1.14.0
    -- mluops version is 0.11.0
    -- CXX compiler: /usr/bin/c++, version: GNU 8.4.0
    -- C compiler: /usr/bin/cc, version: GNU 8.4.0
    -- AR tools: /usr/bin/ar
    -- Run 'git submodule update --init gflags' in /paddle/git/PaddleCustomDevice/Paddle/third_party
    -- Run 'git submodule update --init glog' in /paddle/git/PaddleCustomDevice/Paddle/third_party
    -- Run 'git submodule update --init pybind' in /paddle/git/PaddleCustomDevice/Paddle/third_party
    -- Run 'git submodule update --init gtest' in /paddle/git/PaddleCustomDevice/Paddle/third_party
    -- Run 'git submodule update --init mkldnn' in /paddle/git/PaddleCustomDevice/Paddle/third_party
    -- Set /paddle/git/PaddleCustomDevice/backends/mlu/build/third_party/install/mkldnn/lib to runtime path
    -- MKLDNN library: /paddle/git/PaddleCustomDevice/backends/mlu/build/third_party/install/mkldnn/lib/libdnnl.so
    -- CONCURRENTQUEUE_VERSION: v1.0.3, CONCURRENTQUEUE_URL: https://github.com/cameron314/concurrentqueue/archive/refs/tags/v1.0.3.tar.gz
    CMake Warning (dev) at /opt/cmake-3.27.7/share/cmake-3.27/Modules/ExternalProject.cmake:3136 (message):
    The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
    not set. The policy's OLD behavior will be used. When using a URL
    download, the timestamps of extracted files should preferably be that of
    the time of extraction, otherwise code that depends on the extracted
    contents might not be rebuilt if the URL changes. The OLD behavior
    preserves the timestamps from the archive instead, but this is usually not
    what you want. Update your project to the NEW behavior or specify the
    DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
    robustness issue.
    Call Stack (most recent call first):
    /opt/cmake-3.27.7/share/cmake-3.27/Modules/ExternalProject.cmake:4345 (_ep_add_download_command)
    cmake/external/concurrentqueue.cmake:35 (ExternalProject_Add)
    CMakeLists.txt:73 (include)
    This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) at CMakeLists.txt:81 (find_package):
Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules
are removed. Run "cmake --help-policy CMP0148" for policy details. Use
the cmake_policy command to set the policy and suppress this warning.

This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) at CMakeLists.txt:82 (find_package):
Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules
are removed. Run "cmake --help-policy CMP0148" for policy details. Use
the cmake_policy command to set the policy and suppress this warning.

This warning is for project developers. Use -Wno-dev to suppress it.

-- Git commit id is: 7c4db6e
-- Configuring done (2.2s)
-- Generating done (0.0s)
-- Build files have been written to: /paddle/git/PaddleCustomDevice/backends/mlu/build

+ cmake_error=0
+ '[' 0 '!=' 0 ']'
+ '[' x86_64 == x86_64 ']'
+ make -j30
    [ 6%] Built target extern_pybind
    [ 11%] Built target extern_gflags
    [ 19%] Built target extern_mkldnn
    [ 22%] Built target extern_concurrentqueue
    [ 23%] Built target mkldnn_cmd
    [ 28%] Built target extern_glog
    [ 28%] Built target third_party
    [ 28%] Building C object CMakeFiles/mkldnn.dir/mkldnn_dummy.c.o
    [ 29%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/abs_kernel.cc.o
    [ 29%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/arg_max_kernel.cc.o
    [ 30%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/batch_norm_kernel.cc.o
    [ 30%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/accuracy_kernel.cc.o
    [ 31%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/add_n_kernel.cc.o
    [ 32%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/assign_kernel.cc.o
    [ 32%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/activation_kernel.cc.o
    [ 34%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/check_finite_and_unscale_kernel.cc.o
    [ 34%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/bce_loss_kernel.cc.o
    [ 34%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/cast_kernel.cc.o
    [ 35%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/bitwise_kernel.cc.o
    [ 35%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/runtime/runtime.cc.o
    [ 36%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/compare_kernel.cc.o
    [ 38%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/argsort_kernel.cc.o
    [ 38%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/adam_kernel.cc.o
    [ 39%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/clip_kernel.cc.o
    [ 40%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/conv_transpose_kernel.cc.o
    [ 42%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/cumsum_kernel.cc.o
    [ 41%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/cross_entropy_kernel.cc.o
    [ 41%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/coalesce_tensor_kernel.cc.o
    [ 42%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/conv_kernel.cc.o
    [ 43%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/concat_kernel.cc.o
    [ 43%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/contiguous_kernel.cc.o
    [ 44%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/data_kernel.cc.o
    [ 45%] Linking C static library libmkldnn.a
    [ 45%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/deformable_conv_kernel.cc.o
    [ 46%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/dropout_kernel.cc.o
    [ 47%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_add_kernel.cc.o
    [ 47%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_div_kernel.cc.o
    [ 48%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_max_kernel.cc.o
    [ 48%] Built target mkldnn
    [ 49%] Building CXX object CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_min_kernel.cc.o
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_funcs.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/assign_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/contiguous_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:188: CMakeFiles/paddle-custom-mlu.dir/kernels/assign_kernel.cc.o] Error 1
    make[2]: *** Waiting for unfinished jobs....
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:328: CMakeFiles/paddle-custom-mlu.dir/kernels/contiguous_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/argsort_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:174: CMakeFiles/paddle-custom-mlu.dir/kernels/argsort_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/clip_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:272: CMakeFiles/paddle-custom-mlu.dir/kernels/clip_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/data_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/cross_entropy_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:370: CMakeFiles/paddle-custom-mlu.dir/kernels/cross_entropy_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:398: CMakeFiles/paddle-custom-mlu.dir/kernels/data_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/cumsum_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/elementwise_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/elementwise_min_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:384: CMakeFiles/paddle-custom-mlu.dir/kernels/cumsum_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:482: CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_min_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/add_n_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_funcs.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/cast_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/coalesce_tensor_kernel.cc:18:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/batch_norm_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/concat_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:146: CMakeFiles/paddle-custom-mlu.dir/kernels/add_n_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:244: CMakeFiles/paddle-custom-mlu.dir/kernels/cast_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:286: CMakeFiles/paddle-custom-mlu.dir/kernels/coalesce_tensor_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:202: CMakeFiles/paddle-custom-mlu.dir/kernels/batch_norm_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:314: CMakeFiles/paddle-custom-mlu.dir/kernels/concat_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/elementwise_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/elementwise_div_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:454: CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_div_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/elementwise_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/elementwise_max_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/logic_op.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/compare_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/deformable_conv_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:468: CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_max_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:300: CMakeFiles/paddle-custom-mlu.dir/kernels/compare_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:412: CMakeFiles/paddle-custom-mlu.dir/kernels/deformable_conv_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/elementwise_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/dropout_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:426: CMakeFiles/paddle-custom-mlu.dir/kernels/dropout_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/bitwise_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/arg_max_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:230: CMakeFiles/paddle-custom-mlu.dir/kernels/bitwise_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/elementwise_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/activation_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:160: CMakeFiles/paddle-custom-mlu.dir/kernels/arg_max_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:118: CMakeFiles/paddle-custom-mlu.dir/kernels/activation_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/bce_loss_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:216: CMakeFiles/paddle-custom-mlu.dir/kernels/bce_loss_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/adam_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/check_finite_and_unscale_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:132: CMakeFiles/paddle-custom-mlu.dir/kernels/adam_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/accuracy_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/elementwise_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/elementwise_add_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/abs_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:258: CMakeFiles/paddle-custom-mlu.dir/kernels/check_finite_and_unscale_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/conv_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/conv_transpose_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:104: CMakeFiles/paddle-custom-mlu.dir/kernels/accuracy_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:90: CMakeFiles/paddle-custom-mlu.dir/kernels/abs_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:440: CMakeFiles/paddle-custom-mlu.dir/kernels/elementwise_add_kernel.cc.o] Error 1
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:356: CMakeFiles/paddle-custom-mlu.dir/kernels/conv_transpose_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/mlu_baseop.h:23,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/funcs/conv_utils.h:17,
    from /paddle/git/PaddleCustomDevice/backends/mlu/kernels/conv_kernel.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:342: CMakeFiles/paddle-custom-mlu.dir/kernels/conv_kernel.cc.o] Error 1
    In file included from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/common.h:20,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/common_shape.h:18,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/reduce_as_kernel.h:19,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/include/kernels.h:241,
    from /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/extension.h:11,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.h:26,
    from /paddle/git/PaddleCustomDevice/backends/mlu/runtime/runtime.cc:15:
    /usr/local/lib/python3.10/dist-packages/paddle/include/paddle/phi/kernels/funcs/eigen/extensions.h:23:10: fatal error: unsupported/Eigen/CXX11/Tensor: No such file or directory
    #include "unsupported/Eigen/CXX11/Tensor"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    compilation terminated.
    make[2]: *** [CMakeFiles/paddle-custom-mlu.dir/build.make:76: CMakeFiles/paddle-custom-mlu.dir/runtime/runtime.cc.o] Error 1
    make[1]: *** [CMakeFiles/Makefile2:105: CMakeFiles/paddle-custom-mlu.dir/all] Error 2
    make: *** [Makefile:91: all] Error 2
  • make_error=2
  • '[' 2 '!=' 0 ']'
  • echo 'Make Error Found !!!'
    Make Error Found !!!
  • exit 7

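After upgrading the driver to 5.10.26, the error still appears: kernel:NMI watchdog: BUG: soft lockup - CPU#8 stuck for 23s! [kworker/u224:5:53381]
Comparing versions, the main difference is cntoolkit 3.8.2 vs 3.8.4. cnnlextra is not listed in setup.py.in in the current code; once added, it matches at 1.6.1. Could the base image and the MLU part of the code be updated?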

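Judging from the symptoms, it looks like the transformer-related operators cannot run on the MLU and fall back to the CPU. Could this be related to the cntoolkit version?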
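The version comparison described above can be done mechanically. The following is a minimal sketch, not part of PaddleCustomDevice: `parse_versions` and `diff_versions` are hypothetical helper names, and the two text blocks are the outputs quoted earlier in this thread (the README's expected output and the output reported from the `ubuntu20-x86_64-gcc84-py310` image). It only compares what the plugin prints, so it inherits any inaccuracy in that printout.

```python
def parse_versions(text):
    """Parse 'name: value' lines into a dict, normalizing key spelling."""
    versions = {}
    for line in text.strip().splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'name: value'
        # The thread shows both 'cnnlextra' and 'cnnl_extra'; treat them alike.
        versions[name.strip().replace("_", "")] = value.strip()
    return versions


def diff_versions(expected, actual, ignore=("version", "commit")):
    """Return {name: (expected, actual)} for components that differ or are missing."""
    diffs = {}
    for name in sorted(set(expected) | set(actual)):
        if name in ignore:
            continue
        if expected.get(name) != actual.get(name):
            diffs[name] = (expected.get(name), actual.get(name))
    return diffs


# Expected output quoted from the README earlier in this thread.
README_OUTPUT = """
version: 0.0.0
commit: 5c29d8a4bfd742081ec3b457e02e276f738ef786
cntoolkit: 3.8.4
cnnl: 1.23.2
cnnlextra: 1.6.1
cncl: 1.14.0
mluops: 0.11.0
"""

# Output reported from the development image earlier in this thread.
REPORTED_OUTPUT = """
version: 0.0.0
commit: dc78966
cntoolkit: 3.8.2
cnnl: 1.23.2
cncl: 1.14.0
mluops: 0.11.0
"""

if __name__ == "__main__":
    differences = diff_versions(
        parse_versions(README_OUTPUT), parse_versions(REPORTED_OUTPUT)
    )
    for name, (want, got) in differences.items():
        print(f"{name}: expected {want}, got {got}")
```

With these two inputs, the differing entries are cntoolkit (3.8.4 vs 3.8.2) and cnnlextra (present in the README output, absent from the reported one), which matches the manual comparison.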

Hello, for the Eigen build problem, please see the reply in #1177.

The MLU driver issue may need to be confirmed by the Cambricon team. @ShawnNew, could you help confirm the cntoolkit version issue?

python -c "import paddle_custom_device; paddle_custom_device.mlu.version()"

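Hello, the cntoolkit version number is currently misread in neuware.cmake (it should not read the cndev version). The correct way to check it is described in the cntoolkit docs. We will fix the printed version information in a later update.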

Regarding transformer training, could you share your paddlepaddle and paddlenlp versions and the training command so that we can align our setups?

Hello, has this issue been resolved? Thanks!

Not resolved yet.

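Could you provide some more environment information?

  1. Version information for the MLU driver and related packages
rpm -qa | grep mlu
rpm -qa | grep cntoolkit
rpm -qa | grep cnnl
rpm -qa | grep cncl
rpm -qa | grep mluops
  2. Follow this document to reinstall and rebuild the Paddle MLU package; the earlier Eigen build error has been fixed
    https://github.com/PaddlePaddle/PaddleCustomDevice/blob/develop/backends/mlu/README_cn.md

Run the following commands and share the output, which shows the Paddle version information

python -c "import paddle; paddle.version.show()"
python -c "import paddle_custom_device; paddle_custom_device.mlu.version()"
  3. Share the PaddleNLP code information you used when training the transformer model

This includes the PaddleNLP branch and commit ID, as well as the command used to launch training

With this information we will try again to locate the problem. Thanks!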
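The checks requested above can be gathered in one pass. This is a rough sketch under stated assumptions, not an official tool: the helper names (`run`, `filter_packages`, `collect`) are made up for illustration, and it assumes `rpm` is available on the machine being inspected. On a system without `rpm` (for example a Debian-based container), the package listing is skipped rather than guessed at.

```python
import shutil
import subprocess
import sys

# Keywords taken from the `rpm -qa | grep ...` commands requested above.
PACKAGES = ["mlu", "cntoolkit", "cnnl", "cncl", "mluops"]

# The two Python one-liners requested above.
PYTHON_CHECKS = [
    "import paddle; paddle.version.show()",
    "import paddle_custom_device; paddle_custom_device.mlu.version()",
]


def run(cmd):
    """Run a command and return combined stdout/stderr without raising."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"<failed: {exc}>"
    return (proc.stdout + proc.stderr).strip()


def filter_packages(listing, keyword):
    """Keep lines containing keyword, like `| grep keyword`."""
    return [line for line in listing.splitlines() if keyword in line]


def collect():
    """Build a plain-text report that can be pasted into the issue."""
    report = []
    if shutil.which("rpm"):
        listing = run(["rpm", "-qa"])
        for pkg in PACKAGES:
            report.append(f"$ rpm -qa | grep {pkg}")
            report.extend(filter_packages(listing, pkg) or ["<no match>"])
    else:
        report.append("rpm not found; skipping package listing")
    for code in PYTHON_CHECKS:
        report.append(f'$ python -c "{code}"')
        report.append(run([sys.executable, "-c", code]))
    return "\n".join(report)


if __name__ == "__main__":
    print(collect())
```

Running the script inside the same environment used for training (the same container and virtualenv) matters, since the Python checks use the interpreter that launched the script.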

  1. Version information for the MLU driver and related packages
    MLU driver: 5.10.26
    cn-series package information inside the docker container:
    cntoolkit:
    Installed: 3.8.4-1.ubuntu20.04
    Candidate: 3.8.4-1.ubuntu20.04
    Version table:
    *** 3.8.4-1.ubuntu20.04 100
    100 /var/lib/dpkg/status
    cnnl:
    Installed: 1.23.2-1.ubuntu20.04
    Candidate: 1.23.2-1.ubuntu20.04
    Version table:
    *** 1.23.2-1.ubuntu20.04 100
    100 /var/lib/dpkg/status
    cncl:
    Installed: 1.14.0-1.ubuntu20.04
    Candidate: 1.14.0-1.ubuntu20.04
    Version table:
    *** 1.14.0-1.ubuntu20.04 100
    100 /var/lib/dpkg/status
    mluops:
    Installed: 0.11.0-1.ubuntu20.04
    Candidate: 0.11.0-1.ubuntu20.04
    Version table:
    *** 0.11.0-1.ubuntu20.04 100
    100 /var/lib/dpkg/status

  2. Paddle version information
    python -c "import paddle; paddle.version.show()"
    I0510 15:24:01.231060 15923 init.cc:236] ENV [CUSTOM_DEVICE_ROOT]=/paddle/VirutalEnv/PaddleNLP/lib/python3.10/site-packages/paddle_custom_device
    I0510 15:24:01.231109 15923 init.cc:145] Try loading custom device libs from: [/paddle/VirutalEnv/PaddleNLP/lib/python3.10/site-packages/paddle_custom_device]
    I0510 15:24:01.322983 15923 custom_device.cc:1099] Succeed in loading custom runtime in lib: /paddle/VirutalEnv/PaddleNLP/lib/python3.10/site-packages/paddle_custom_device/libpaddle-custom-mlu.so
    I0510 15:24:01.324756 15923 custom_kernel.cc:63] Succeed in loading 262 custom kernel(s) from loaded lib(s), will be used like native ones.
    I0510 15:24:01.324846 15923 init.cc:157] Finished in LoadCustomDevice with libs_path: [/paddle/VirutalEnv/PaddleNLP/lib/python3.10/site-packages/paddle_custom_device]
    I0510 15:24:01.324887 15923 init.cc:242] CustomDevice: mlu, visible devices count: 8
    commit: df85e2d414a0fd0cac792edb8fcc39f18549caf4
    cuda: False
    cudnn: False
    nccl: 0
    xpu: False
    xpu_xccl: False
    xpu_xhpc: False
    cinn: False

python -c "import paddle_custom_device; paddle_custom_device.mlu.version()"
version: 0.0.0
commit: 024f62a
cntoolkit: 3.8.2
cnnl: 1.23.2
cnnl_extra: 1.6.1
cncl: 1.14.0
mluops: 0.11.0

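  3. PaddleNLP information
    Branch: Develop
    commit d6ac1bd5daf59c85e25635c9efc6835019c2901a
    For the training command and data, see model_zoo/uie
    Change "--device gpu" to "--device mlu" and remove '--gpus "0,1,2,3,4,5,6,7"'. Single-card training does not trigger the CPU soft lockup, but multi-card training does
    [Screenshot: 微信截图_20240510153859]
    (If '--gpus "0,1,2,3,4,5,6,7"' is not removed, it fails with an error saying the device cannot be found)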

@qili93 @ShawnNew After upgrading to commit 3631514, the problem is resolved.