CHIP-SPV/chipStar

PoCL - Device library link step failed

The run: https://github.com/CHIP-SPV/chipStar/actions/runs/8030196087

These tests don't fail for me locally:

list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rd_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rn_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_ru_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rz_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "fp16_math") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "fp16_half2_math") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_hipGraphAddMemcpyNodeToSymbol_MemcpyToSymbolNodeWithKernel") # Failed
CHIP error [TID 36443] [1708774709.417000043] : hipErrorNotInitialized (Device library link step failed.) in /home/runner/work/chipStar/chipStar/src/backend/OpenCL/CHIPBackendOpenCL.cc:840:compile

CHIP error [TID 36443] [1708774709.417225324] : Caught Error: hipErrorNotInitialized

Do you mean there could be a bug in PoCL or the CI?

From UnitTests.cmake:

# The following tests fail for LLVM 15 Debug & Release : Cannot find symbol _Z4sqrtDh in kernel library
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rd_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rn_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_ru_double") # Failed
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rz_double") # Failed

# Fails for LLVM 15 Debug: SPIR-V Parser: Failed to find size for type id 83
list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest_rnorm_double") # Failed

The PoCL CPU device has FP16 support only when PoCL is built with LLVM 16 or higher (and even then that support is quite incomplete).
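
If the exclusions should only apply to older toolchains, one option is to gate them on the LLVM version PoCL was built with. The snippet below is only a sketch: `POCL_LLVM_MAJOR` is a hypothetical variable that is not defined by chipStar or PoCL and would have to be set by the CI or detected separately. Since FP16 support is reported as incomplete even with LLVM 16+, some of these tests may still need to stay excluded there.

```cmake
# Sketch only. POCL_LLVM_MAJOR is a hypothetical variable (assumption):
# the major LLVM version PoCL was built against, e.g. passed by the CI
# with -DPOCL_LLVM_MAJOR=15. Default to 15 if nothing was provided.
if(NOT DEFINED POCL_LLVM_MAJOR)
  set(POCL_LLVM_MAJOR 15)
endif()

if(POCL_LLVM_MAJOR VERSION_LESS 16)
  # PoCL CPU device has no FP16 support below LLVM 16, so the tests that
  # need half-precision symbols (e.g. _Z4sqrtDh) are expected to fail.
  list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rd_double")
  list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rn_double")
  list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_ru_double")
  list(APPEND CPU_POCL_FAILED_TESTS "Unit_deviceFunctions_CompileTest___dsqrt_rz_double")
  list(APPEND CPU_POCL_FAILED_TESTS "fp16_math")
  list(APPEND CPU_POCL_FAILED_TESTS "fp16_half2_math")
endif()
```

Keeping the list version-gated would let the tests run again automatically once the CI moves to a newer LLVM, instead of staying disabled for every configuration.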

Do we need to support LLVM 15 still?

I don't think so

OK. So this issue is either invalid or should be a PR to PoCL to add the missing fp16 bits, right?