Unable to compile the test example
dborowiec10 opened this issue · 17 comments
I have been trying to follow the instructions on Ubuntu 19.10 with CUDA 10.1 and LLVM 10.
This is the output that I get when I compile using clang_cf++:
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 90c78073f73eac58f4f8b4772a896dc8aac023bc)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm-10.0/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Candidate multilib: .;@m64
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda, version 10.1
"/opt/llvm-10.0/bin/clang-10" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-linux-gnu -S -disable-free -main-file-name saxpy.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -no-integrated-as -fcuda-is-device -mlink-builtin-bitcode /usr/local/cuda/nvvm/libdevice/libdevice.10.bc -target-feature +ptx64 -target-sdk-version=10.1 -target-cpu sm_30 -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /opt/llvm-10.0/lib/clang/10.0.0 -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /usr/local/cuda/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -std=c++11 -fdeprecated-macro -fno-dwarf-directory-asm -fno-autolink -fdebug-compilation-dir /home/clusteradmin/cuda-flux -ferror-limit 19 -fmessage-length 0 -fgnuc-version=4.2.1 
-finline-functions -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -load /opt/cuda-flux/lib/libcuda_flux_pass.so -o /tmp/saxpy-cfdccc.s -x cuda test/saxpy.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward"
ignoring duplicate directory "/usr/local/include"
ignoring duplicate directory "/opt/llvm-10.0/lib/clang/10.0.0/include"
ignoring duplicate directory "/usr/include/x86_64-linux-gnu"
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
/opt/llvm-10.0/lib/clang/10.0.0/include/cuda_wrappers
/usr/local/cuda/include
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward
/usr/local/include
/opt/llvm-10.0/lib/clang/10.0.0/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
CUDA Flux: Instrumenting device code...
CUDA Flux: Module prefix: saxpy.cu_6bc7afe4
sh: 1: llc: not found
CUDA Flux: Working on kernel: _Z5saxpyifPfS_
CUDA Flux: BlockCount: 3
"/usr/local/cuda/bin/ptxas" -m64 -O0 -v --gpu-name sm_30 --output-file /tmp/saxpy-f2566b.o /tmp/saxpy-cfdccc.s
ptxas /tmp/saxpy-cfdccc.s, line 221; warning : Instruction 'vote' without '.sync' is deprecated since PTX ISA version 6.0 and will be discontinued in a future PTX ISA version
ptxas info : 3 bytes gmem
ptxas info : Compiling entry function '_Z5saxpyifPfS__clone' for 'sm_30'
ptxas info : Function properties for _Z5saxpyifPfS__clone
32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 23 registers, 360 bytes cmem[0]
ptxas info : Function properties for incBlockCounter_mt
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Compiling entry function '_Z5saxpyifPfS_' for 'sm_30'
ptxas info : Function properties for _Z5saxpyifPfS_
32 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 11 registers, 344 bytes cmem[0]
"/usr/local/cuda/bin/fatbinary" -64 --create /tmp/saxpy-4b34f0.fatbin --image=profile=sm_30,file=/tmp/saxpy-f2566b.o --image=profile=compute_30,file=/tmp/saxpy-cfdccc.s
"/opt/llvm-10.0/bin/clang-10" -cc1 -triple x86_64-unknown-linux-gnu -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name saxpy.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fmath-errno -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /opt/llvm-10.0/lib/clang/10.0.0 -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /usr/local/cuda/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -std=c++11 -fdeprecated-macro -fdebug-compilation-dir /home/clusteradmin/cuda-flux -ferror-limit 19 -fmessage-length 0 -fgnuc-version=4.2.1 -finline-functions -fobjc-runtime=gcc -fcxx-exceptions -fexceptions 
-fdiagnostics-show-option -fcolor-diagnostics -load /opt/cuda-flux/lib/libcuda_flux_pass.so -fcuda-include-gpubinary /tmp/saxpy-4b34f0.fatbin -faddrsig -o /tmp/saxpy-498574.o -x cuda test/saxpy.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring nonexistent directory "/include"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8"
ignoring duplicate directory "/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward"
ignoring duplicate directory "/usr/local/include"
ignoring duplicate directory "/opt/llvm-10.0/lib/clang/10.0.0/include"
ignoring duplicate directory "/usr/include/x86_64-linux-gnu"
ignoring duplicate directory "/usr/include"
#include "..." search starts here:
#include <...> search starts here:
/opt/llvm-10.0/lib/clang/10.0.0/include/cuda_wrappers
/usr/local/cuda/include
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8
/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward
/usr/local/include
/opt/llvm-10.0/lib/clang/10.0.0/include
/usr/include/x86_64-linux-gnu
/usr/include
End of search list.
CUDA Flux: instrumenting host code...
CUDA Flux: CUDA Version 10.1
clang-10: /home/clusteradmin/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:819: void clang::BackendConsumer::DiagnosticHandlerImpl(const llvm::DiagnosticInfo&): Assertion `CurLinkModule' failed.
Stack dump:
0. Program arguments: /opt/llvm-10.0/bin/clang-10 -cc1 -triple x86_64-unknown-linux-gnu -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name saxpy.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fmath-errno -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /opt/llvm-10.0/lib/clang/10.0.0 -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /usr/local/cuda/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/x86_64-linux-gnu/c++/8 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/backward -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -std=c++11 -fdeprecated-macro -fdebug-compilation-dir /home/clusteradmin/cuda-flux -ferror-limit 19 -fmessage-length 0 -fgnuc-version=4.2.1 -finline-functions -fobjc-runtime=gcc -fcxx-exceptions -fexceptions 
-fdiagnostics-show-option -fcolor-diagnostics -load /opt/cuda-flux/lib/libcuda_flux_pass.so -fcuda-include-gpubinary /tmp/saxpy-4b34f0.fatbin -faddrsig -o /tmp/saxpy-498574.o -x cuda test/saxpy.cu
1. <eof> parser at end of file
2. Per-module optimization passes
3. Running pass 'Instrument nvptx kernel launches for basic block profiling' on module 'test/saxpy.cu'.
#0 0x00007eff491f26de llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/llvm-10.0/bin/../lib/libLLVMSupport.so.10git+0x19d6de)
#1 0x00007eff491f0134 llvm::sys::RunSignalHandlers() (/opt/llvm-10.0/bin/../lib/libLLVMSupport.so.10git+0x19b134)
#2 0x00007eff491f0278 SignalHandler(int) (/opt/llvm-10.0/bin/../lib/libLLVMSupport.so.10git+0x19b278)
#3 0x00007eff4759d470 (/lib/x86_64-linux-gnu/libc.so.6+0x46470)
#4 0x00007eff4759d3eb raise /build/glibc-t7JzpG/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#5 0x00007eff4757c899 abort /build/glibc-t7JzpG/glibc-2.30/stdlib/abort.c:81:7
#6 0x00007eff4757c769 get_sysdep_segment_value /build/glibc-t7JzpG/glibc-2.30/intl/loadmsgcat.c:509:8
#7 0x00007eff4757c769 _nl_load_domain /build/glibc-t7JzpG/glibc-2.30/intl/loadmsgcat.c:970:34
#8 0x00007eff4758e006 (/lib/x86_64-linux-gnu/libc.so.6+0x37006)
#9 0x00007eff483714c7 clang::BackendConsumer::DiagnosticHandlerImpl(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0x3994c7)
#10 0x00007eff48371511 clang::ClangDiagnosticHandler::handleDiagnostics(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0x399511)
#11 0x00007eff4a06e40d llvm::LLVMContext::diagnose(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/bin/../lib/libLLVMCore.so.10git+0x1c040d)
#12 0x00007eff4696e2f0 llvm::IRMover::move(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, llvm::ArrayRef<llvm::GlobalValue*>, std::function<void (llvm::GlobalValue&, std::function<void (llvm::GlobalValue&)>)>, bool) (/opt/llvm-10.0/bin/../lib/../lib/libLLVMLinker.so.10git+0x152f0)
#13 0x00007eff46975c27 (anonymous namespace)::ModuleLinker::run() (/opt/llvm-10.0/bin/../lib/../lib/libLLVMLinker.so.10git+0x1cc27)
#14 0x00007eff46976656 llvm::Linker::linkInModule(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, unsigned int, std::function<void (llvm::Module&, llvm::StringSet<llvm::MallocAllocator> const&)>) (/opt/llvm-10.0/bin/../lib/../lib/libLLVMLinker.so.10git+0x1d656)
#15 0x00007eff43d294f2 mekong::linkIR(llvm::StringRef, llvm::Module&) (/opt/cuda-flux/lib/libcuda_flux_pass.so+0x874f2)
#16 0x00007eff43d0f2a8 FluxHostPass::runOnModule(llvm::Module&) (/opt/cuda-flux/lib/libcuda_flux_pass.so+0x6d2a8)
#17 0x00007eff4a08b63e llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/llvm-10.0/bin/../lib/libLLVMCore.so.10git+0x1dd63e)
#18 0x00007eff48091fe9 (anonymous namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0xb9fe9)
#19 0x00007eff48093862 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0xbb862)
#20 0x00007eff483778bb clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0x39f8bb)
#21 0x00007eff45cd7791 clang::ParseAST(clang::Sema&, bool, bool) (/opt/llvm-10.0/bin/../lib/../lib/libclangParse.so.10git+0x37791)
#22 0x00007eff48376580 clang::CodeGenAction::ExecuteAction() (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0x39e580)
#23 0x00007eff47c954e9 clang::FrontendAction::Execute() (/opt/llvm-10.0/bin/../lib/libclangFrontend.so.10git+0xf24e9)
#24 0x00007eff47c4dd6a clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/llvm-10.0/bin/../lib/libclangFrontend.so.10git+0xaad6a)
#25 0x00007eff47b9f5e0 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/llvm-10.0/bin/../lib/libclangFrontendTool.so.10git+0x55e0)
#26 0x0000557d3a99fd72 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/llvm-10.0/bin/clang-10+0x16d72)
#27 0x0000557d3a99bf97 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) (/opt/llvm-10.0/bin/clang-10+0x12f97)
#28 0x0000557d3a99a65b main (/opt/llvm-10.0/bin/clang-10+0x1165b)
#29 0x00007eff4757e1e3 __libc_start_main /build/glibc-t7JzpG/glibc-2.30/csu/../csu/libc-start.c:342:3
#30 0x0000557d3a99ba4e _start (/opt/llvm-10.0/bin/clang-10+0x12a4e)
clang-10: error: unable to execute command: Aborted (core dumped)
clang-10: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 90c78073f73eac58f4f8b4772a896dc8aac023bc)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm-10.0/bin
clang-10: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang-10: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-10: note: diagnostic msg: /tmp/saxpy-d43f11.cu
clang-10: note: diagnostic msg: /tmp/saxpy-5d03ac.cu
clang-10: note: diagnostic msg: /tmp/saxpy-d43f11.sh
clang-10: note: diagnostic msg:
********************
Hi,
The problem here is that llc is not found. This later leads to an error, because CUDA Flux requires the output of llc to work.
On my machine, llc is located at /opt/llvm-10.0/bin/llc. Is /opt/llvm-10.0/bin in your PATH?
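One way to check this independently of the interactive shell is a small lookup script. The following is a minimal sketch, not part of CUDA Flux: the helper names `find_tool` and `path_entries_containing` are made up for illustration, and it assumes a Unix-like system where `sh` resolves a bare command name by walking `PATH` in order.

```python
import os
import shutil


def find_tool(name, search_path=None):
    """Return the path `name` resolves to via a PATH lookup, or None."""
    # shutil.which walks the PATH entries in order and returns the first
    # executable match, similar to how `sh` resolves a bare command name.
    return shutil.which(name, path=search_path)


def path_entries_containing(name, search_path=None):
    """List every PATH directory that holds an executable file called `name`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if directory and os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Prints None and an empty list if llc cannot be found at all.
    print("llc resolves to:", find_tool("llc"))
    print("directories providing llc:", path_entries_containing("llc"))
```

Running it from the same environment that invokes clang_cf++ matters, since the `sh: 1: llc: not found` message comes from a child shell that inherits that environment.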
Yes, LLVM and CUDA should both be in my PATH. This is my .profile file, which I source before doing anything:
export CC=/opt/llvm-10.0/clang-10
export CXX=/opt/llvm-10.0/bin/clang++-10
#export CC=/usr/bin/gcc
#export CXX=/usr/bin/g++
export CPATH=/usr/local/cuda/include:$CPATH
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/llvm-10.0/lib:$LD_LIBRARY_PATH
export PATH=/opt/cuda-flux/bin:/opt/llvm-10.0/bin:$PATH
export CUDA_PATH=/usr/local/cuda
export LIBRARY_PATH=/usr/local/cuda/lib64:/opt/llvm-10.0/lib
export LLVM_DIR=/opt/llvm-10.0/
I have also inspected the environment variables right before running anything and the above paths are set.
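For double-checking a setup like this, a short script can confirm that each variable points at something that actually exists on disk. This is only a sketch under assumptions: `missing_targets` is a hypothetical helper, and it only makes sense for variables that hold a single path (such as CC, CXX, CUDA_PATH, or LLVM_DIR), not colon-separated lists like PATH.

```python
import os


def missing_targets(variables, environ=None):
    """Return {name: value} for variables that are set but whose path does not exist."""
    if environ is None:
        environ = os.environ
    missing = {}
    for name in variables:
        value = environ.get(name)
        # Unset or empty variables are skipped; only dangling paths are reported.
        if value and not os.path.exists(value):
            missing[name] = value
    return missing


if __name__ == "__main__":
    # Variable names taken from the .profile above; each holds a single path.
    print(missing_targets(["CC", "CXX", "CUDA_PATH", "LLVM_DIR"]))
```

An empty dictionary means every listed variable that is set refers to an existing file or directory.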
It looks like /opt/llvm-10.0/bin is in your PATH. Is the llc binary there? I would expect it to be, but that may depend on how LLVM was installed.
Did you compile llvm yourself?
I have checked and the binary is there. Yes, I compiled it myself from source using the CMake flags in your README.MD.
Interestingly, when I try to run llc, it basically hangs: there is no output to stdout, and the process stays open until I kill it.
Could this be the problem?
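A note on the apparent hang: as far as the llc documentation describes, llc reads its input from standard input when no input file is given, so a bare `llc` that sits silently may simply be waiting for input. The sketch below (the helper name `run_without_stdin` is made up for illustration) runs a command with stdin closed and a timeout, which helps tell "waiting for input" apart from a real hang.

```python
import subprocess


def run_without_stdin(command, timeout_seconds=10):
    """Run `command` with stdin closed; return (status, stdout, stderr).

    status is the integer exit code, or the string "timeout" if the
    process was still running after `timeout_seconds`.
    """
    try:
        completed = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,  # a process reading stdin sees EOF at once
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return "timeout", "", ""
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    try:
        # --version prints version information and exits without reading input.
        print(run_without_stdin(["llc", "--version"]))
    except FileNotFoundError:
        print("llc is not on PATH")
```

If `llc --version` returns promptly here, the binary itself is most likely fine and the silent wait was just llc reading from the terminal.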
Not sure to be honest. In your first post there is this line:
sh: 1: llc: not found
This means that the llc executable was not found, although according to your PATH it should be. I'll try to test it myself soon.
To make sure my environment is correct, I've recompiled LLVM and again made sure the binaries are in their respective directories.
I think I might have missed something last time as I am getting a different error now:
CUDA Flux: Instrumenting device code...
CUDA Flux: Module prefix: saxpy.cu_6bc7afe4
CUDA Flux: Working on kernel: _Z5saxpyifPfS_
CUDA Flux: BlockCount: 3
ptxas /tmp/saxpy-2abe21.s, line 221; warning : Instruction 'vote' without '.sync' is deprecated since PTX ISA version 6.0 and will be discontinued in a future PTX ISA version
CUDA Flux: instrumenting host code...
CUDA Flux: CUDA Version 10.1
clang-10: /home/clusteradmin/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:819: void clang::BackendConsumer::DiagnosticHandlerImpl(const llvm::DiagnosticInfo&): Assertion `CurLinkModule' failed.
Stack dump:
0. Program arguments: /opt/llvm-10.0/bin/clang-10 -cc1 -triple x86_64-unknown-linux-gnu -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name saxpy.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fmath-errno -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -resource-dir /opt/llvm-10.0/lib/clang/10.0.1 -internal-isystem /opt/llvm-10.0/lib/clang/10.0.1/include/cuda_wrappers -internal-isystem /usr/local/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I/usr/local/cuda/include -I. -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.1/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.1/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include 
-internal-externc-isystem /usr/include -std=c++11 -fdeprecated-macro -fdebug-compilation-dir /home/clusteradmin/cuda-flux -ferror-limit 19 -fmessage-length 0 -fgnuc-version=4.2.1 -finline-functions -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -load /opt/cuda-flux/lib/libcuda_flux_pass.so -fcuda-include-gpubinary /tmp/saxpy-b2709a.fatbin -faddrsig -o /tmp/saxpy-ec0387.o -x cuda test/saxpy.cu
1. <eof> parser at end of file
2. Per-module optimization passes
3. Running pass 'Instrument nvptx kernel launches for basic block profiling' on module 'test/saxpy.cu'.
#0 0x00007fdeece097ee llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/llvm-10.0/lib/libLLVMSupport.so.10+0x1a97ee)
#1 0x00007fdeece07454 llvm::sys::RunSignalHandlers() (/opt/llvm-10.0/lib/libLLVMSupport.so.10+0x1a7454)
#2 0x00007fdeece07598 SignalHandler(int) (/opt/llvm-10.0/lib/libLLVMSupport.so.10+0x1a7598)
#3 0x00007fdeeb1ee470 (/lib/x86_64-linux-gnu/libc.so.6+0x46470)
#4 0x00007fdeeb1ee3eb raise /build/glibc-t7JzpG/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#5 0x00007fdeeb1cd899 abort /build/glibc-t7JzpG/glibc-2.30/stdlib/abort.c:81:7
#6 0x00007fdeeb1cd769 get_sysdep_segment_value /build/glibc-t7JzpG/glibc-2.30/intl/loadmsgcat.c:509:8
#7 0x00007fdeeb1cd769 _nl_load_domain /build/glibc-t7JzpG/glibc-2.30/intl/loadmsgcat.c:970:34
#8 0x00007fdeeb1df006 (/lib/x86_64-linux-gnu/libc.so.6+0x37006)
#9 0x00007fdeec03b953 clang::BackendConsumer::DiagnosticHandlerImpl(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3e7953)
#10 0x00007fdeec03b981 clang::ClangDiagnosticHandler::handleDiagnostics(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3e7981)
#11 0x00007fdeedbda4e5 llvm::LLVMContext::diagnose(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/lib/libLLVMCore.so.10+0x1d64e5)
#12 0x00007fdeea5bb7b7 llvm::IRMover::move(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, llvm::ArrayRef<llvm::GlobalValue*>, std::function<void (llvm::GlobalValue&, std::function<void (llvm::GlobalValue&)>)>, bool) (/opt/llvm-10.0/lib/libLLVMLinker.so.10+0x167b7)
#13 0x00007fdeea5c3086 (anonymous namespace)::ModuleLinker::run() (/opt/llvm-10.0/lib/libLLVMLinker.so.10+0x1e086)
#14 0x00007fdeea5c3c44 llvm::Linker::linkInModule(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, unsigned int, std::function<void (llvm::Module&, llvm::StringSet<llvm::MallocAllocator> const&)>) (/opt/llvm-10.0/lib/libLLVMLinker.so.10+0x1ec44)
#15 0x00007fdee78f6152 mekong::linkIR(llvm::StringRef, llvm::Module&) (/opt/cuda-flux/lib/libcuda_flux_pass.so+0x85152)
#16 0x00007fdee78dc848 FluxHostPass::runOnModule(llvm::Module&) (/opt/cuda-flux/lib/libcuda_flux_pass.so+0x6b848)
#17 0x00007fdeedbfad39 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/llvm-10.0/lib/libLLVMCore.so.10+0x1f6d39)
#18 0x00007fdeebd12332 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0xbe332)
#19 0x00007fdeec0435f9 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3ef5f9)
#20 0x00007fdee9908781 clang::ParseAST(clang::Sema&, bool, bool) (/opt/llvm-10.0/lib/libclangParse.so.10+0x37781)
#21 0x00007fdeec042068 clang::CodeGenAction::ExecuteAction() (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3ee068)
#22 0x00007fdeeb910279 clang::FrontendAction::Execute() (/opt/llvm-10.0/lib/libclangFrontend.so.10+0x108279)
#23 0x00007fdeeb8c42ae clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/llvm-10.0/lib/libclangFrontend.so.10+0xbc2ae)
#24 0x00007fdeeb8045f4 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/llvm-10.0/lib/libclangFrontendTool.so.10+0x55f4)
#25 0x000055c8cc67ff37 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/llvm-10.0/bin/clang-10+0x16f37)
#26 0x000055c8cc67bf07 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) (/opt/llvm-10.0/bin/clang-10+0x12f07)
#27 0x000055c8cc679b8c main (/opt/llvm-10.0/bin/clang-10+0x10b8c)
#28 0x00007fdeeb1cf1e3 __libc_start_main /build/glibc-t7JzpG/glibc-2.30/csu/../csu/libc-start.c:342:3
#29 0x000055c8cc67baae _start (/opt/llvm-10.0/bin/clang-10+0x12aae)
clang-10: error: unable to execute command: Aborted (core dumped)
clang-10: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.1 (https://github.com/llvm/llvm-project.git 6196695ec5819c0df7efe3fecca5c4ef9ea80b1c)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm-10.0/bin
clang-10: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang-10: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-10: note: diagnostic msg: /tmp/saxpy-7834b8.cu
clang-10: note: diagnostic msg: /tmp/saxpy-6d21c5.cu
clang-10: note: diagnostic msg: /tmp/saxpy-7834b8.sh
clang-10: note: diagnostic msg:
********************
This looks better. At least llc is now being found. You could build a debug version of CUDA Flux; then we will know which line in mekong::linkIR leads to the error.
I've rebuilt CUDA Flux with -DCMAKE_BUILD_TYPE=Debug; however, I'm not getting any different output when I test the saxpy.cu example. Am I missing something?
The output should be quite similar, but the stack trace shows the source file names and line numbers.
It should look like this:
#15 0x00007fdee78f6152 mekong::linkIR(llvm::StringRef, llvm::Module&) (utils.cpp:123)
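To pick the symbolized frames out of a long trace, a small filter can help. This is a rough sketch, not an official tool: `source_frames` is a hypothetical helper, and it assumes symbolized frames end in `path:line:column`, which is the form the glibc frames in the traces above already use (for example `abort.c:81:7`). Frames that only show `(library+0xoffset)` are skipped.

```python
import re

# "#<index> <address> <function> <path>:<line>:<column>"
FRAME = re.compile(
    r"^#(?P<index>\d+)\s+(?P<address>0x[0-9a-fA-F]+)\s+"
    r"(?P<function>.*?)\s+(?P<file>\S+):(?P<line>\d+):(?P<column>\d+)$"
)


def source_frames(trace_text):
    """Return (index, function, file, line) for each frame carrying a source location."""
    frames = []
    for raw in trace_text.splitlines():
        match = FRAME.match(raw.strip())
        if match:
            frames.append(
                (
                    int(match["index"]),
                    match["function"],
                    match["file"],
                    int(match["line"]),
                )
            )
    return frames


if __name__ == "__main__":
    sample = "\n".join(
        [
            "#4 0x00007eff4759d3eb raise /build/glibc-t7JzpG/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:51:1",
            "#9 0x00007eff483714c7 clang::BackendConsumer::DiagnosticHandlerImpl(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/bin/../lib/libclangCodeGen.so.10git+0x3994c7)",
        ]
    )
    for frame in source_frames(sample):
        print(frame)
```

Feeding it the full compiler output leaves only the frames that name a source file, which makes it easier to spot the CUDA Flux frames once debug information is available.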
Output with Debug ON:
CUDA Flux: Instrumenting device code...
CUDA Flux: Module prefix: saxpy.cu_6bc7afe4
CUDA Flux: Working on kernel: _Z5saxpyifPfS_
CUDA Flux: BlockCount: 3
ptxas /tmp/saxpy-26074d.s, line 221; warning : Instruction 'vote' without '.sync' is deprecated since PTX ISA version 6.0 and will be discontinued in a future PTX ISA version
CUDA Flux: instrumenting host code...
CUDA Flux: CUDA Version 10.1
clang-10: /home/clusteradmin/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:819: void clang::BackendConsumer::DiagnosticHandlerImpl(const llvm::DiagnosticInfo&): Assertion `CurLinkModule' failed.
Stack dump:
0. Program arguments: /opt/llvm-10.0/bin/clang-10 -cc1 -triple x86_64-unknown-linux-gnu -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name saxpy.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fmath-errno -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -dwarf-column-info -fno-split-dwarf-inlining -debugger-tuning=gdb -resource-dir /opt/llvm-10.0/lib/clang/10.0.1 -internal-isystem /opt/llvm-10.0/lib/clang/10.0.1/include/cuda_wrappers -internal-isystem /usr/local/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -I/usr/local/cuda/include -I. -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.1/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /usr/local/include -internal-isystem /opt/llvm-10.0/lib/clang/10.0.1/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -std=c++11 -fdeprecated-macro -fdebug-compilation-dir /home/clusteradmin/work/tvm-profiling/cuda-flux -ferror-limit 19 -fmessage-length 0 -fgnuc-version=4.2.1 -finline-functions 
-fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -load /opt/cuda-flux/lib/libcuda_flux_pass.so -fcuda-include-gpubinary /tmp/saxpy-4354ae.fatbin -faddrsig -o /tmp/saxpy-30170d.o -x cuda test/saxpy.cu
1. <eof> parser at end of file
2. Per-module optimization passes
3. Running pass 'Instrument nvptx kernel launches for basic block profiling' on module 'test/saxpy.cu'.
#0 0x00007f1523f147ee llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/opt/llvm-10.0/lib/libLLVMSupport.so.10+0x1a97ee)
#1 0x00007f1523f12454 llvm::sys::RunSignalHandlers() (/opt/llvm-10.0/lib/libLLVMSupport.so.10+0x1a7454)
#2 0x00007f1523f12598 SignalHandler(int) (/opt/llvm-10.0/lib/libLLVMSupport.so.10+0x1a7598)
#3 0x00007f15222f9470 (/lib/x86_64-linux-gnu/libc.so.6+0x46470)
#4 0x00007f15222f93eb raise /build/glibc-t7JzpG/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#5 0x00007f15222d8899 abort /build/glibc-t7JzpG/glibc-2.30/stdlib/abort.c:81:7
#6 0x00007f15222d8769 get_sysdep_segment_value /build/glibc-t7JzpG/glibc-2.30/intl/loadmsgcat.c:509:8
#7 0x00007f15222d8769 _nl_load_domain /build/glibc-t7JzpG/glibc-2.30/intl/loadmsgcat.c:970:34
#8 0x00007f15222ea006 (/lib/x86_64-linux-gnu/libc.so.6+0x37006)
#9 0x00007f1523146953 clang::BackendConsumer::DiagnosticHandlerImpl(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3e7953)
#10 0x00007f1523146981 clang::ClangDiagnosticHandler::handleDiagnostics(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3e7981)
#11 0x00007f1524ce54e5 llvm::LLVMContext::diagnose(llvm::DiagnosticInfo const&) (/opt/llvm-10.0/lib/libLLVMCore.so.10+0x1d64e5)
#12 0x00007f15216c67b7 llvm::IRMover::move(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, llvm::ArrayRef<llvm::GlobalValue*>, std::function<void (llvm::GlobalValue&, std::function<void (llvm::GlobalValue&)>)>, bool) (/opt/llvm-10.0/lib/libLLVMLinker.so.10+0x167b7)
#13 0x00007f15216ce086 (anonymous namespace)::ModuleLinker::run() (/opt/llvm-10.0/lib/libLLVMLinker.so.10+0x1e086)
#14 0x00007f15216cec44 llvm::Linker::linkInModule(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >, unsigned int, std::function<void (llvm::Module&, llvm::StringSet<llvm::MallocAllocator> const&)>) (/opt/llvm-10.0/lib/libLLVMLinker.so.10+0x1ec44)
#15 0x00007f151ea01152 mekong::linkIR(llvm::StringRef, llvm::Module&) /home/clusteradmin/cuda-flux/mekong-utils/src/IRUtils.cpp:128:10
#16 0x00007f151e9e7848 FluxHostPass::runOnModule(llvm::Module&) /home/clusteradmin/cuda-flux/lib/fluxHostPass.cpp:40:3
#17 0x00007f1524d05d39 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/opt/llvm-10.0/lib/libLLVMCore.so.10+0x1f6d39)
#18 0x00007f1522e1d332 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0xbe332)
#19 0x00007f152314e5f9 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3ef5f9)
#20 0x00007f1520a13781 clang::ParseAST(clang::Sema&, bool, bool) (/opt/llvm-10.0/lib/libclangParse.so.10+0x37781)
#21 0x00007f152314d068 clang::CodeGenAction::ExecuteAction() (/opt/llvm-10.0/lib/libclangCodeGen.so.10+0x3ee068)
#22 0x00007f1522a1b279 clang::FrontendAction::Execute() (/opt/llvm-10.0/lib/libclangFrontend.so.10+0x108279)
#23 0x00007f15229cf2ae clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/llvm-10.0/lib/libclangFrontend.so.10+0xbc2ae)
#24 0x00007f152290f5f4 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/llvm-10.0/lib/libclangFrontendTool.so.10+0x55f4)
#25 0x000056249e093f37 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/opt/llvm-10.0/bin/clang-10+0x16f37)
#26 0x000056249e08ff07 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) (/opt/llvm-10.0/bin/clang-10+0x12f07)
#27 0x000056249e08db8c main (/opt/llvm-10.0/bin/clang-10+0x10b8c)
#28 0x00007f15222da1e3 __libc_start_main /build/glibc-t7JzpG/glibc-2.30/csu/../csu/libc-start.c:342:3
#29 0x000056249e08faae _start (/opt/llvm-10.0/bin/clang-10+0x12aae)
clang-10: error: unable to execute command: Aborted (core dumped)
clang-10: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 10.0.1 (https://github.com/llvm/llvm-project.git 6196695ec5819c0df7efe3fecca5c4ef9ea80b1c)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm-10.0/bin
clang-10: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
clang-10: note: diagnostic msg:
********************
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-10: note: diagnostic msg: /tmp/saxpy-e4d892.cu
clang-10: note: diagnostic msg: /tmp/saxpy-9c8a3c.cu
clang-10: note: diagnostic msg: /tmp/saxpy-e4d892.sh
clang-10: note: diagnostic msg:
********************
Thanks for the output. I will try to reproduce the error and fix it. If you don't want to wait, an earlier release candidate of LLVM 10.0 might work (at least it did on CentOS 7).
I'll give it a go, thanks for the help.
Also, not sure if this is helpful, but it seems like some of the tests actually pass.
Take a look at this output:
Start 1: Saxpy
1/6 Test #1: Saxpy ............................***Failed 2.09 sec
Start 2: BranchDivergence
2/6 Test #2: BranchDivergence .................***Failed 2.32 sec
Start 3: run_saxpy
3/6 Test #3: run_saxpy ........................***Failed 0.00 sec
Start 4: run_branch_divergence
4/6 Test #4: run_branch_divergence ............***Failed 0.04 sec
Start 5: rodinia_hotspot
5/6 Test #5: rodinia_hotspot .................. Passed 0.06 sec
Start 6: rodinia_nn
6/6 Test #6: rodinia_nn ....................... Passed 0.06 sec
33% tests passed, 4 tests failed out of 6
Total Test time (real) = 4.59 sec
The following tests FAILED:
1 - Saxpy (Failed)
2 - BranchDivergence (Failed)
3 - run_saxpy (Failed)
4 - run_branch_divergence (Failed)
Errors while running CTest
make: *** [Makefile:130: test] Error 8
The rodinia_nn executable was not built. However, I noticed the following line in the log for the rodinia test targets:
warning: Linking two modules of different target triples: ' is 'x86_64-pc-linux-gnu' whereas 'rodinia_hotspot_hostcode.bc' is 'x86_64-unknown-linux-gnu'
Could it be that my LLVM build has for some reason ended up with an "unknown" platform in its target triple, and that this confuses things?
Please find attached a full test log for all 6 test targets.
LastTest.log
Hi @dborowiec10, have you solved the issue?
@Dax009 No, unfortunately not.
Hi, you can try specifying your target triple explicitly (yours is reported as Target: x86_64-unknown-linux-gnu).
In my case, I had to pass the following option to the cmake command when building LLVM:
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-redhat-linux
I hope that this is helpful.
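For readers who want to try this, here is a minimal sketch of how the option could be passed when configuring LLVM. Only `-DLLVM_DEFAULT_TARGET_TRIPLE` comes from the comment above; the helper name `llvm_cmake_args`, the default triple, the install prefix, and the other options are assumptions for illustration and were not confirmed in this thread.

```shell
#!/bin/sh
# Sketch: print CMake arguments for an LLVM build with a pinned default triple.
# llvm_cmake_args is a hypothetical helper; the defaults below are placeholders.

llvm_cmake_args() {
  triple="${1:-x86_64-unknown-linux-gnu}"   # placeholder; use your distro's triple
  prefix="${2:-/opt/llvm}"                  # hypothetical install prefix
  printf '%s\n' \
    "-DCMAKE_BUILD_TYPE=Release" \
    "-DCMAKE_INSTALL_PREFIX=${prefix}" \
    "-DLLVM_ENABLE_PROJECTS=clang" \
    "-DLLVM_TARGETS_TO_BUILD=X86;NVPTX" \
    "-DLLVM_DEFAULT_TARGET_TRIPLE=${triple}"
}

# Print the arguments that would be handed to cmake, one per line.
llvm_cmake_args "x86_64-redhat-linux"
```

From a build directory inside an llvm-project checkout (assumed layout), the output could be used as `cmake $(llvm_cmake_args x86_64-redhat-linux) ../llvm`. Whether pinning the triple resolves the crash reported above is not established here.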
Hi there,
I have looked at the tests again and found out a few things. Tests 5 and 6 will most likely fail on most systems because the bitcode can have a different target triple and links to symbols which may not be available. So don't worry if those do not work.
Regarding the error in linkIR, I still don't know where it comes from, and since I am developing with LLVM 11.0 now, I will not investigate it further. Today I built CUDA Flux on Debian Buster with LLVM 11.0 and CUDA 11.0. It works fine, and CUDA 10 should also be no problem. I need to update the tests to build for sm_35 instead of sm_30 because CUDA 11 no longer supports sm_30.
Also, if you are building for sm_70 and higher, the instrumentation cannot be built because of deprecated instructions.
Hopefully I will be able to fix that this week. If you are willing to try again with LLVM 11, I am happy to help. I will close this issue because of its age and the old LLVM version, which I no longer develop for. Just open a new issue if there are any problems.
Best regards
Lorenz