[Bug] Failed to build task
jimgao1 opened this issue · 1 comments
jimgao1 commented
Describe the bug
Hidet crashes when building the task take. The model is an ONNX version of RoBERTa.
Detailed logs are attached below:
RuntimeError: Failed to build 1 tasks:
[cuda] take(data=float32(50265, 768), indices=float32(1, 256), output=float32(1, 256, 768))
Traceback (most recent call last):
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/drivers/build_task.py", line 303, in build_job
task.build(target, load=False)
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/ir/task.py", line 235, in build
return build_task(self, target=target, load=load)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/drivers/build_task.py", line 288, in build_task
build_task_module(task, candidates, task_dir, target)
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/drivers/build_task.py", line 164, in build_task_module
build_ir_module(ir_module=task_ir_module, output_dir=task_dir, output_kind='.so', target=target)
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/drivers/build_module.py", line 118, in build_ir_module
compile_source(
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/backend/build.py", line 301, in compile_source
compiler.compile(
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/backend/build.py", line 188, in compile
self.run_compile_command(" ".join(command), src_path, out_lib_path)
File "/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/backend/build.py", line 77, in run_compile_command
raise CompilationFailed(src_path, message)
hidet.backend.build.CompilationFailed: failed to compile file:///home/ybgao/A/.hidet_cache/ops/cuda_space_0/take/902658c2d53174dd/source.cu
Command: /opt/cuda/bin/nvcc -I/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/include -L/home/ybgao/A/B/venv2/lib/python3.11/site-packages/hidet/lib -O3 -Xcompiler -fPIC,-m64,-O3,-funroll-loops,-ffast-math -std=c++11 -gencode arch=compute_89,code=sm_89 --ptxas-options=-v -lineinfo -ftz=true -prec-div=false -lhidet_runtime --cudart shared --diag-suppress 177 --diag-suppress 179 --diag-suppress 39 --shared /home/ybgao/A/.hidet_cache/ops/cuda_space_0/take/902658c2d53174dd/source.cu -o /home/ybgao/A/.hidet_cache/ops/cuda_space_0/take/902658c2d53174dd/lib.so
/home/ybgao/A/.hidet_cache/ops/cuda_space_0/take/902658c2d53174dd/source.cu(11): error: expression must have integral or unscoped enum type
output[((((((int)blockIdx.x * 512) + (int)threadIdx.x) / 768) * 768) + ((((int)blockIdx.x * 512) + (int)threadIdx.x) % 768))] = data[((((indices[((((int)blockIdx.x * 512) + (int)threadIdx.x) / 768)] < 0.0f) ? (indices[((((int)blockIdx.x * 512) + (int)threadIdx.x) / 768)] + 50265.0f) : indices[((((int)blockIdx.x * 512) + (int)threadIdx.x) / 768)]) * 768.0f) + ((float)(((((int)blockIdx.x * 512) + (int)threadIdx.x) % 768))))];
^
1 error detected in the compilation of "/home/ybgao/A/.hidet_cache/ops/cuda_space_0/take/902658c2d53174dd/source.cu".
To Reproduce
The ONNX file can be found here.
Expected behavior
The build should succeed.
Environment
- OS: Arch Linux
- GPU: RTX 4090
- Others: Driver 535.113.01, Python 3.11.5, Hidet 0.3.0
Additional context
N/A
jimgao1 commented
Apologies. Turns out the error is caused by one of the inputs to take (indices) being float32 instead of int32. The compilation succeeds when this is changed.
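For anyone hitting the same error, here is a minimal pure-Python sketch of why the index dtype matters. It is not Hidet code: take_rows and the tiny table are hypothetical stand-ins for the take task and the (50265, 768) embedding table. It only mirrors the negative-index wrap visible in the generated source.cu and shows that the lookup works once the indices are cast to integers.

```python
def take_rows(data, indices):
    """Gather rows of `data` by index, wrapping negative indices as the kernel does."""
    num_rows = len(data)
    out = []
    for idx in indices:
        # Reject float (and bool) indices up front, before any lookup happens.
        if isinstance(idx, bool) or not isinstance(idx, int):
            raise TypeError(f"indices must be integers, got {type(idx).__name__}")
        # Same wrap-around as the generated code: idx < 0 ? idx + num_rows : idx
        row = idx + num_rows if idx < 0 else idx
        out.append(data[row])
    return out


# Hypothetical stand-in for the (50265, 768) embedding table.
table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
# Float-valued indices, analogous to the float32 indices input in the log.
float_ids = [0.0, 2.0, -1.0]
# Cast to integers before the lookup, analogous to switching the input to int32.
int_ids = [int(i) for i in float_ids]

print(take_rows(table, int_ids))
```

Calling take_rows(table, float_ids) raises a TypeError, which is roughly the Python-level analogue of nvcc rejecting a float expression used as an array subscript.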