Unclear buildbot failure email from clang-cuda-l4
jayfoad opened this issue · 3 comments
jayfoad commented
I got this buildbot failure email:
The Buildbot has detected a new failure on builder clang-cuda-l4 while building llvm.
Full details are available at:
https://lab.llvm.org/buildbot/#/builders/101/builds/364
Worker for this Build: cuda-l4-0
Blamelist:
Jay Foad <jay.foad@amd.com>,
Shengchen Kan <shengchen.kan@intel.com>,
Stephen Tozer <stephen.tozer@sony.com>
BUILD FAILED: failed '/buildbot/cuda-build --jobs=' (failure)
Step 3 (annotate) failure: '/buildbot/cuda-build --jobs=' (failure)
...
NV_LIBCUBLAS_PACKAGE_NAME=libcublas-12-2
NV_LIBCUBLAS_VERSION=12.2.5.6-1
NV_LIBCUSPARSE_VERSION=12.1.2.141-1
NV_LIBNCCL_PACKAGE=libnccl2=2.18.5-1+cuda12.2
NV_LIBNCCL_PACKAGE_NAME=libnccl2
NV_LIBNCCL_PACKAGE_VERSION=2.18.5-1
NV_LIBNPP_PACKAGE=libnpp-12-2=12.2.1.4-1
NV_LIBNPP_VERSION=12.2.1.4-1
NV_NVTX_VERSION=12.2.140-1
PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/buildbot
PWD=/buildbot/cuda-l4-0/work/cuda-l4-0/clang-cuda-l4/build
SHLVL=1
TERM=dumb
WORK_DIR=/buildbot/cuda-l4-0/work
_=/usr/local/bin/buildbot-worker
using PTY: False
++ echo @@@HALT_ON_FAILURE@@@
++ readlink -f ..
+ buildbot_dir=/buildbot/cuda-l4-0/work/cuda-l4-0/clang-cuda-l4
+ revision=919c547130cfd1cd75ccf148cbf2334b27b2f37f
+ GPU_ARCH=sm_89
+ CUDA_TEST_JOBS=1
+ build_base=/buildbot/cuda-l4-0/work/clang-cuda-l4
+ mkdir -p /buildbot/cuda-l4-0/work/clang-cuda-l4
+ build_dir=/buildbot/cuda-l4-0/work/clang-cuda-l4/build
+ libc_build_dir=/buildbot/cuda-l4-0/work/clang-cuda-l4/build-libc
+ clang_dir=/buildbot/cuda-l4-0/work/clang-cuda-l4/clang
+ testsuite_dir=/buildbot/cuda-l4-0/work/clang-cuda-l4/llvm-test-suite
+ llvm_src_dir=/buildbot/cuda-l4-0/work/clang-cuda-l4/llvm
+ ext_dir=/buildbot/cuda-l4-0/work/clang-cuda-l4/external
+ inner_pid=342838
+ do_build_and_test
+ trap 'handle_termination $inner_pid' TERM
+ wait 342838
+ fetch_prebuilt_clang 919c547130cfd1cd75ccf148cbf2334b27b2f37f /buildbot/cuda-l4-0/work/clang-cuda-l4/clang
+ local revision=919c547130cfd1cd75ccf148cbf2334b27b2f37f
+ local destdir=/buildbot/cuda-l4-0/work/clang-cuda-l4/clang
+ local 'timeout=10 minutes'
++ date -ud '10 minutes' +%s
+ local endtime=1718876716
++ storage_location llvm-919c547130cfd1cd75ccf148cbf2334b27b2f37f
++ local file=llvm-919c547130cfd1cd75ccf148cbf2334b27b2f37f
++ local default_storage_prefix=gs://cudabot-gce-artifacts/
++ echo gs://cudabot-gce-artifacts/llvm-919c547130cfd1cd75ccf148cbf2334b27b2f37f
+ local snapshot=gs://cudabot-gce-artifacts/llvm-919c547130cfd1cd75ccf148cbf2334b27b2f37f
+ step 'Waiting for LLVM & Clang snapshot to be built. '
+ local 'name=Waiting for LLVM & Clang snapshot to be built. '
+ local summary=
+ echo '@@@BUILD_STEP Waiting for LLVM & Clang snapshot to be built. @@@'
+ step_summary_clear
Sincerely,
LLVM Buildbot
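The email does not explain why the build failed: the log excerpt it quotes stops at the "Waiting for LLVM & Clang snapshot to be built" step, before any test output. Looking at the full logs, I see failures like: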
FAIL: test-suite :: External/CUDA/cmath-cuda-11.8-c++11-libc++.test (5 of 12)
******************** TEST 'test-suite :: External/CUDA/cmath-cuda-11.8-c++11-libc++.test' FAILED ********************
/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/timeit-target --timeout 7200 --limit-core 0 --limit-cpu 7200 --limit-file-size 209715200 --limit-rss-size 838860800 --append-exitstatus --redirect-output /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/cmath-cuda-11.8-c++11-libc++.test.out --redirect-input /dev/null --summary /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/cmath-cuda-11.8-c++11-libc++.test.time /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/cmath-cuda-11.8-c++11-libc++
cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA ; /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/cmath-cuda-11.8-c++11-libc++.test.out cmath.reference_output-cuda-11.8-c++11-libc++
+ cd /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA
+ /buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target /buildbot/cuda-l4-0/work/clang-cuda-l4/build/External/CUDA/Output/cmath-cuda-11.8-c++11-libc++.test.out cmath.reference_output-cuda-11.8-c++11-libc++
/buildbot/cuda-l4-0/work/clang-cuda-l4/build/tools/fpcmp-target: Comparison failed, textual difference between 'C' and 'S'
and
Failed Tests (8):
test-suite :: External/CUDA/algorithm-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/assert-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/axpy-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/cmath-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/complex-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/math_h-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/new-cuda-11.8-c++11-libc++.test
test-suite :: External/CUDA/printf-cuda-11.8-c++11-libc++.test
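For illustration, here is a minimal sketch of how a failure email could surface the list above instead of the start of the log. It assumes the log contains a lit-style "Failed Tests (N):" summary like the one quoted here; the function name `failed_tests` is hypothetical and is not part of the buildbot or the cuda-build script.

```python
import re


def failed_tests(log_text):
    """Return the test names listed under a lit 'Failed Tests (N):' summary.

    Assumption: each entry has the form '<suite> :: <path>', as in the log
    quoted in this issue. Any line without ' :: ' ends the summary block.
    """
    tests = []
    in_summary = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if re.match(r"^Failed Tests \(\d+\):$", stripped):
            in_summary = True
            continue
        if in_summary:
            if " :: " in stripped:
                tests.append(stripped)
            else:
                in_summary = False
    return tests


if __name__ == "__main__":
    sample = (
        "Failed Tests (2):\n"
        "  test-suite :: External/CUDA/algorithm-cuda-11.8-c++11-libc++.test\n"
        "  test-suite :: External/CUDA/assert-cuda-11.8-c++11-libc++.test\n"
    )
    for name in failed_tests(sample):
        print(name)
```

A notification built this way would lead with the failing test names, which is the information the current email leaves out.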
gkistanova commented
Thanks for reporting this, Jay!
This has been fixed. Feel free to reopen if you still see this.