NVIDIA/DALI

docker compile failed about pybind11

Closed this issue · 4 comments

Version

release_v1.38

Describe the bug.

The Docker build fails with compile errors in pybind11.

Minimum reproducible example

git clone -b release_v1.38 https://github.com/NVIDIA/DALI.git

git submodule sync --recursive
git submodule update --init --recursive

sudo REBUILD_BUILDERS=NO CUDA_VERSION=11.8 PYVER=3.8 ./build.sh

Relevant log output

+ cmake ../ -DCMAKE_INSTALL_PREFIX=. -DARCH=x86_64 -DCUDA_TARGET_ARCHS= -DCMAKE_BUILD_TYPE=Release -DBUILD_TEST=ON -DBUILD_BENCHMARK=ON -DBUILD_NVTX= -DBUILD_PYTHON=ON -DBUILD_LMDB=ON -DBUILD_JPEG_TURBO=ON -DBUILD_OPENCV=ON -DBUILD_PROTOBUF=ON -DBUILD_NVJPEG=ON -DBUILD_NVJPEG2K=ON -DBUILD_LIBTIFF=ON -DBUILD_NVOF=ON -DBUILD_NVDEC=ON -DBUILD_LIBSND=ON -DBUILD_NVML=ON -DBUILD_FFTS=ON -DBUILD_CFITSIO=ON -DBUILD_CUFILE=OFF -DBUILD_NVCOMP=OFF -DBUILD_CVCUDA=ON -DLINK_LIBCUDA=OFF -DWITH_DYNAMIC_CUDA_TOOLKIT=OFF -DWITH_DYNAMIC_NVJPEG=ON -DWITH_DYNAMIC_CUFFT=ON -DWITH_DYNAMIC_NPP=ON -DWITH_DYNAMIC_NVIMGCODEC=ON -DVERBOSE_LOGS=OFF -DWERROR=ON -DBUILD_WITH_ASAN=OFF -DBUILD_WITH_LSAN=OFF -DBUILD_WITH_UBSAN=OFF -DPYTHON_VERSIONS= -DDALI_BUILD_FLAVOR= -DTIMESTAMP=20240615 -DGIT_SHA=8f2a43f3436cafeafa4774513f7daf68ebbffad8
CUDA_TARGET_ARCHS cannot be empty, setting to the default
-- CUDA version: 11.8.89, major: 11, minor: 8, patch: , short: , digit-only: 
-- Compatible CUDA version: major: 11, minor: 0, patch: 0, short: , digit-only: 
-- DALI_CLANG_ONLY -- OFF
-- Building DALI for Python versions: 3.8;3.9;3.10;3.11;3.12
-- Generating python stubs using interpreter: 
-- Building shared-object libraries
-- Add to rpath: $ORIGIN
-- Add to rpath: $ORIGIN/../cufft/lib
-- Add to rpath: $ORIGIN/../npp/lib
-- Add to rpath: $ORIGIN/../nvjpeg/lib
-- Add to rpath: $ORIGIN/../nvimgcodec
-- Add to rpath: /opt/nvidia/nvimgcodec_cuda11/lib64
-- DALI version: 1.38.0
-- DALI_extra version: 4d95e862cc8aa6495707a3a6d84cbf75dff812ef
-- Build configuration: Release
-- CUDA .cu files compiler: /usr/local/cuda/bin/nvcc
-- CUDA supported archs: 35;50;60;70;80;90
-- CUDA targeted archs: 35;50;60;70;80;90
-- Generated CMAKE_CUDA_ARCHITECTURES: 35-real;50-real;60-real;70-real;80-real;90-real;90-virtual
nvJPEG found in /usr/local/cuda/targets/x86_64-linux/include
nvJPEG is using new API
nvJPEG lossless NOT supported
OpenCV libraries: opencv_core;opencv_imgproc;opencv_imgcodecs
-- Found OpenCV: /usr/local/include/opencv4 (found suitable version "4.9.0", minimum required is "3.0")
-- Failed to find LLVM FileCheck
-- git version: v1.8.3 normalized to 1.8.3
-- Google Benchmark version: 1.8.3
-- Performing Test HAVE_STD_REGEX -- success
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX -- success
-- Performing Test HAVE_STEADY_CLOCK -- success
-- Performing Test HAVE_PTHREAD_AFFINITY -- failed to compile
Using libjpeg-turbo at /usr/local/lib/libjpeg.so
Using libtiff at /usr/local/lib/libtiff.so
-- pybind11 v2.13.0 dev1
-- Found libsnd: /usr/local/lib/libsndfile.so
-- Found libtar: /usr/local/lib/libtar.a
-- Found cfitsio: /usr/local/lib/libcfitsio.so
-- Found CUDAToolkit: /usr/local/cuda/include (found suitable version "11.8.89", minimum required is "11.8") 
-- Found CUDAToolkit: /usr/local/cuda/include (found version "11.8.89") 
-- 
-- General configuration for CVCUDA-0.7.0-beta
-- 
-- Build options
--     CMAKE_INSTALL_PREFIX     : /opt/dali/build-docker-Release-118_x86_64
--     WARNINGS_AS_ERRORS       : off
--     ENABLE_COMPAT_OLD_GLIBC  : ON
--     BUILD_TESTS              : off
--     BUILD_PYTHON             : off
--     ENABLE_SANITIZER         : off
--     BUILD_BENCH              : off
--     ENABLE_TEGRA             : off
--     Compilers used in public API header compatibility tests:
--         (none)
-- 
-- Platform
--     Host             : Linux 5.15.146.1-microsoft-standard-WSL2 x86_64
--     Target           : Linux 5.15.146.1-microsoft-standard-WSL2 x86_64
--     CMake            : 3.20.1
--     CMake generator  : Unix Makefiles
--     CMake build tool : /usr/bin/gmake
--     Configuration    : Release
--     ccache           : CCACHE_EXEC-NOTFOUND
--     ccache stats log : 
-- 
-- Default compiler/linker config
--     C++ Compiler : /opt/rh/devtoolset-10/root/usr/bin/c++ (10.2.1)
--     C++ Standard : 17
--     C++ Flags    :   -Wall -Wno-unknown-pragmas -Wpointer-arith -Wmissing-declarations -Wredundant-decls -Wmultichar -Wno-unused-local-typedefs -Wunused -Wsuggest-override -O3 -DNDEBUG
-- 
--     C Compiler   : /opt/rh/devtoolset-10/root/usr/bin/cc (10.2.1)
--     C Flags      :   -Wall -Wno-unknown-pragmas -Wpointer-arith -Wmissing-declarations -Wredundant-decls -Wmultichar -Wno-unused-local-typedefs -Wunused -O3 -DNDEBUG
-- 
--     CUDA Compiler : /usr/local/cuda/bin/nvcc (11.8.89)
--     CUDA Arch     : 35-real;50-real;60-real;70-real;80-real;90-real;90-virtual
--     CUDA flags    :  --compiler-options "-fvisibility=hidden -Wno-free-nonheap-object" --Wno-deprecated-gpu-targets -Xfatbin -compress-all  -Wall -Wno-unknown-pragmas -Wpointer-arith -Wmissing-declarations -Wredundant-decls -Wmultichar -Wno-unused-local-typedefs -Wunused -Wsuggest-override -Wno-tautological-compare -Xfatbin=--compress-all -O3 -DNDEBUG
--     CUDA toolkit target dir : /usr/local/cuda
-- 
--     Compiler Options    : 
--     Definitions         :  -DDALI_USE_NVJPEG -DNVJPEG_LIBRARY_0_2_0 -DNVJPEG_PREALLOCATE_API -DDALI_USE_JPEG_TURBO
-- 
--     Linker flags (exec) :  
--     Linker flags (lib)  :  
-- 
--     Link-time optim.: supported YES, enabled ON
-- 
-- nvImageCodec - dynamic load
-- Using nvimgcodec_INCLUDE_DIR=/opt/dali/build-docker-Release-118_x86_64/_deps/nvimgcodec_headers-src/11/include
-- NVIMGCODEC_DEFAULT_INSTALL_PATH=/opt/nvidia/nvimgcodec_cuda11
-- AWSSDK_INCLUDE_DIR=/usr/local/include
-- AWSSDK_LIBRARIES=/usr/local/lib/libaws-cpp-sdk-s3.so;/usr/local/lib/libaws-cpp-sdk-core.so
-- Enabling TensorFlow TFRecord file format support
-- BUILD_NVTX -- ON
-- BUILD_PYTHON -- ON
-- BUILD_SHM_WRAPPER -- ON
-- BUILD_LMDB -- ON
-- BUILD_JPEG_TURBO -- ON
-- BUILD_LIBTIFF -- ON
-- BUILD_LIBSND -- ON
-- BUILD_LIBTAR -- ON
-- BUILD_FFTS -- ON
-- BUILD_CFITSIO -- ON
-- BUILD_CVCUDA -- ON
-- BUILD_NVJPEG -- ON
-- BUILD_NVJPEG2K -- ON
-- BUILD_NVOF -- ON
-- BUILD_NVDEC -- ON
-- BUILD_FFMPEG -- ON
-- BUILD_NVCOMP -- OFF
-- BUILD_NVML -- ON
-- BUILD_CUFILE -- OFF
-- BUILD_NVIMAGECODEC -- ON
-- BUILD_AWSSDK -- ON
-- LINK_DRIVER -- OFF
-- WITH_DYNAMIC_NVJPEG -- OFF
-- WITH_DYNAMIC_CUFFT -- OFF
-- WITH_DYNAMIC_NPP -- OFF
-- WITH_DYNAMIC_NVIMGCODEC -- ON
-- CUDA Compiler: /usr/local/cuda/bin/nvcc

Include directories = /usr/local/include/opencv4;/opt/dali/third_party/googletest/googletest/include;/opt/dali/third_party/benchmark/include/benchmark;/usr/local/include;/usr/local/include;/usr/local/include;/usr/local/include;/opt/dali/third_party/boost/preprocessor/include;/opt/dali/third_party/rapidjson/include;/opt/dali/third_party/ffts/include;/opt/dali/third_party/cutlass/include;/opt/dali/third_party/cutlass/tools/util/include;/opt/dali/build-docker-Release-118_x86_64/_deps/nvimgcodec_headers-src/11/include;/usr/local/include;/opt/dali/third_party/turing_of;/opt/dali;/opt/dali/include;/opt/dali/build-docker-Release-118_x86_64;/usr/local/cuda/targets/x86_64-linux/include

-- Adding dependencies to target `dali`: '/usr/local/cuda/targets/x86_64-linux/lib/libnvjpeg_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libnvjpeg2k_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libnppicc_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libnppig_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libnppc_static.a;/usr/local/cuda/targets/x86_64-linux/lib/libculibos.a;opencv_core;opencv_imgproc;opencv_imgcodecs;/usr/local/lib/libjpeg.so;/usr/local/lib/libtiff.so;/usr/local/lib/liblmdb.a;/usr/local/lib/libsndfile.so;/usr/local/lib/libtar.a;avformat;avformat;avcodec;avfilter;avutil;swscale;ffts;cocoapi;/usr/local/lib/libcfitsio.so;protobuf::libprotobuf;/usr/local/cuda/targets/x86_64-linux/lib/libcudart_static.a;rt;pthread;m;dl'
-- Configuring done
-- Generating done

/opt/dali/version:1:1: error: too many decimal points in number
1 | 1.38.0

/opt/dali/version:1:1: error: expected unqualified-id before numeric constant

/opt/dali/dali/operators/python_function/dltensor_function.h:175:8: required from here
/opt/dali/dali/operators/python_function/dltensor_function.h:188:18: error: no matching function for call to ‘pybind11::list::list()’

/opt/dali/third_party/pybind11/include/pybind11/detail/../detail/../pytypes.h:1210:17: error: inline function ‘bool pybind11::detail::operator!=(const It&, const It&)’ used but never defined [-Werror]

[ 87%] Linking CXX shared library ../python/nvidia/dali/libdali_operators.so
[ 87%] Built target dali_operators
make: *** [all] Error 2

Other/Misc.

Environment: Windows, WSL2, Ubuntu 20.04

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report

Further details:

The same problem also occurs on Windows WSL2 with Ubuntu 22.04 and DALI 1.39.0:

/opt/dali/version:1:1: error: too many decimal points in number
1 | 1.39.0

/opt/dali/third_party/pybind11/include/pybind11/detail/../detail/descr.h:47:52: error: ‘index_sequence’ has not been declared
47 | index_sequence<Is1...>,
| ^~~~~~~~~~~~~~

Consolidate compiler generated dependencies of target dali_operators
[ 77%] Built target dali_operators
[ 87%] Built target dali_kernel_test
make: *** [all] Error 2

Hi @baotou5937,

Thank you for reaching out. Let me try to build DALI using the provided command and see what happens.
We don't build DALI on WSL; still, I don't see any reason why it should not work.

Hi @baotou5937,

The issue is caused by the case insensitivity of the Windows file system. Please check this guide to make it case-sensitive for the DALI repository.
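
A likely mechanism, inferred from the log above rather than stated in this thread: /opt/dali is on the include path, so on a case-insensitive file system an #include <version> can resolve to the repository's version file (the log shows the compiler parsing /opt/dali/version, whose content is 1.38.0) instead of the standard library header. The sketch below is one way to check from inside WSL whether a directory behaves case-insensitively. The function name is_case_insensitive and the probe file name are hypothetical helpers, not part of DALI.

```shell
#!/bin/sh
# Hypothetical helper (not part of DALI): probe whether the file system
# holding directory $1 resolves names case-insensitively.
# Returns 0 if case-insensitive, 1 if case-sensitive, 2 if the probe failed.
is_case_insensitive() {
    dir="${1:-.}"
    # Create a probe file whose name contains lowercase letters.
    probe=$(mktemp "$dir/case_probe_XXXXXX") || return 2
    name=$(basename "$probe")
    # Build the same name in uppercase; it exists only if lookups ignore case.
    upper="$dir/$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')"
    if [ -e "$upper" ]; then result=0; else result=1; fi
    rm -f "$probe"
    return "$result"
}

# Usage: pass the DALI checkout directory (defaults to the current directory).
if is_case_insensitive "${1:-.}"; then
    echo "case-insensitive: a lookup for 'version' can match a differently cased file"
else
    echo "case-sensitive"
fi
```

If the checkout directory reports as case-insensitive, enabling case sensitivity for it before building is the fix suggested in this thread.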

Thank you, I have completed the compilation. I ran the following command in Windows PowerShell (as Administrator):

cd <DALI repository path>; (Get-ChildItem -Recurse -Directory).FullName | ForEach-Object { fsutil.exe file setCaseSensitiveInfo $_ enable }