intel/mlir-extensions

TEST 'IMEX :: Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir' FAILED

Hi,

I built IMEX as described in the README, and some tests fail. Here's one of the errors:

FAIL: IMEX :: Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir (97 of 234)                                                                                                               
******************** TEST 'IMEX :: Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir' FAILED ********************                                                                         
Exit Code: 2                                                                                                                                                                                                       
                                                                                                                                                                                                                   
Command Output (stderr):                                                                                                                                                                                           
--                                                                                                                                                                                                                 
RUN: at line 1: IMEX_ENABLE_LARGE_REG_FILE=1 /home/sasank/miniconda3/bin/python3 /home/sasank/code/llvm-project/build/bin/imex-runner.py --requires=l0-runtime -i /home/sasank/code/mlir-extensions/test/Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir --pass-pipeline-file=/home/sasank/code/mlir-extensions/test/Integration/Dialect/XeGPU/xegpu-to-llvm.pp --runner imex-cpu-runner -e main --entry-point-result=void --shared-libs=/home/sasank/code/llvm-project/build/lib/libimex_runner_utils.so,/home/sasank/code/llvm-project/build/lib/libmlir_runner_utils.so,/home/sasank/code/llvm-project/build/lib/libmlir_c_runner_utils.so,/home/sasank/code/llvm-project/build/lib/liblevel-zero-runtime.so --filecheck
+ IMEX_ENABLE_LARGE_REG_FILE=1
+ /home/sasank/miniconda3/bin/python3 /home/sasank/code/llvm-project/build/bin/imex-runner.py --requires=l0-runtime -i /home/sasank/code/mlir-extensions/test/Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir --pass-pipeline-file=/home/sasank/code/mlir-extensions/test/Integration/Dialect/XeGPU/xegpu-to-llvm.pp --runner imex-cpu-runner -e main --entry-point-result=void --shared-libs=/home/sasank/code/llvm-project/build/lib/libimex_runner_utils.so,/home/sasank/code/llvm-project/build/lib/libmlir_runner_utils.so,/home/sasank/code/llvm-project/build/lib/libmlir_c_runner_utils.so,/home/sasank/code/llvm-project/build/lib/liblevel-zero-runtime.so --filecheck
error: LLVM ERROR: VISA builder API call failed: CisaBuilder->Compile( BC->isaDumpsEnabled() && BC->hasShaderDumper() ? BC->getShaderDumper().composeDumpPath("final.isaasm").c_str() : "", BC->emitVisaOnly())    
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.                                                                                                        
Stack dump:                                                                                                                                                                                                        
0.      Program arguments: /home/sasank/code/llvm-project/build/bin/imex-cpu-runner -e main --entry-point-result=void --shared-libs=/home/sasank/code/llvm-project/build/lib/libimex_runner_utils.so,/home/sasank/code/llvm-project/build/lib/libmlir_runner_utils.so,/home/sasank/code/llvm-project/build/lib/libmlir_c_runner_utils.so,/home/sasank/code/llvm-project/build/lib/liblevel-zero-runtime.so
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):                                                                   
0  imex-cpu-runner          0x0000564daf2765a0                                                                                                                                                                     
1  imex-cpu-runner          0x0000564daf27369f                                                                                                                                                                     
2  imex-cpu-runner          0x0000564daf2737f5                                                                                                                                                                     
3  libc.so.6                0x00007fb9361ef520                                                                                                                                                                     
4  libc.so.6                0x00007fb9362439fc pthread_kill + 300                                                                                                                                                  
5  libc.so.6                0x00007fb9361ef476 raise + 22                                                                                                                                                          
6  libc.so.6                0x00007fb9361d57f3 abort + 211                                                                                                                                                         
7  libigc.so.1              0x00007fb9141347e7 llvm::report_fatal_error(llvm::Twine const&, bool) + 151                                                                                                            
8  libigc.so.1              0x00007fb914134928                                                                                                                                                                     
9  libigc.so.1              0x00007fb9146b5014                                                                                                                                                                     
10 libigc.so.1              0x00007fb913c0d48f llvm::LLVMContext::diagnose(llvm::DiagnosticInfo const&) + 431
11 libigc.so.1              0x00007fb9146e5749                                                                                                                                                                     
12 libigc.so.1              0x00007fb91475ba3f                                                                                                                                                                     
13 libigc.so.1              0x00007fb9147715a9                                                                                                                                                                     
14 libigc.so.1              0x00007fb913c2895c llvm::legacy::PassManagerImpl::run(llvm::Module&) + 812                                                                                                             
15 libigc.so.1              0x00007fb914677aa0                                                                                                                                                                     
16 libigc.so.1              0x00007fb9137ab16d                                                                                                                                                                     
17 libigc.so.1              0x00007fb912e5a9c7                                                                                                                                                                     
18 libigc.so.1              0x00007fb912ec67f5                                                                                                                                                                     
19 libze_intel_gpu.so.1     0x00007fb91853cf27                                                                                                                                                                     
20 libze_intel_gpu.so.1     0x00007fb91821fe80                                                                                                                                                                     
21 libze_intel_gpu.so.1     0x00007fb918222433                                                                                                                                                                     
22 libze_intel_gpu.so.1     0x00007fb918227662                                                                                                                                                                     
23 libze_intel_gpu.so.1     0x00007fb918228d4e                                                                                                                                                                     
24 libze_intel_gpu.so.1     0x00007fb91818798e                                                                                                                                                                     
25 liblevel-zero-runtime.so 0x00007fb93618202a gpuModuleLoad + 266                                                                                                                                                 
26 liblevel-zero-runtime.so 0x00007fb93674b153 gpuModuleLoad + 6066739                                                                                                                                             
27 liblevel-zero-runtime.so 0x00007fb93674b632 gpuModuleLoad + 6067986                                                                                                                                             
28 liblevel-zero-runtime.so 0x00007fb93674bb51 gpuModuleLoad + 6069297                                                                                                                                             
29 imex-cpu-runner          0x0000564daf80c884                                                                                                                                                                     
30 imex-cpu-runner          0x0000564daf80cc7d                                                                                                                                                                     
31 imex-cpu-runner          0x0000564daf80fb35                                                                                                                                                                     
32 imex-cpu-runner          0x0000564daf1c9ecb                                                                                                                                                                     
33 libc.so.6                0x00007fb9361d6d90                                                                                                                                                                     
34 libc.so.6                0x00007fb9361d6e40 __libc_start_main + 128                                                                                                                                             
35 imex-cpu-runner          0x0000564daf25e1e5                                                                                                                                                                     
FileCheck error: '<stdin>' is empty.                                                                                                                                                                               
FileCheck command line:  /home/sasank/code/llvm-project/build/bin/FileCheck /home/sasank/code/mlir-extensions/test/Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir                                                                                                                                                                                                                                       
--

Other tests that failed for me:

  • IMEX :: Integration/Dialect/XeGPU/large_stores_8x32xf16_w_constant_vector_shuffle.vc.mlir
  • IMEX :: Integration/Dialect/XeGPU/large_stores_8x32xf16_w_2d_vector_shuffle.vc.mlir
  • IMEX :: Integration/Dialect/XeGPU/large_stores_8x32xf16_w_1d_vector_shuffle.vc.mlir
  • IMEX :: Integration/Dialect/XeGPU/load_with_block_array_8_16_2.vc.mlir

Can you tell me how to fix these?

The following steps do not produce that error for me:
Activate the imex conda env
Make sure llvm is updated and clean - git pull origin main
Make sure mlir-extensions is updated - git pull origin main
cd llvm-project
git checkout `cat ../mlir-extensions/build_tools/llvm_version.txt`
git apply ../mlir-extensions/build_tools/patches/*
rm -rf build/*;cmake -G Ninja -B build -S llvm -DLLVM_ENABLE_PROJECTS=mlir -DLLVM_BUILD_EXAMPLES=ON -DLLVM_TARGETS_TO_BUILD="X86" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_EXTERNAL_PROJECTS="Imex" -DLLVM_EXTERNAL_IMEX_SOURCE_DIR=../mlir-extensions/ -DIMEX_ENABLE_L0_RUNTIME=1 -DIMEX_ENABLE_SYCL_RUNTIME=1
cmake --build build --target check-imex | tee build/tests.txt
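
Since the last step saves the test output to build/tests.txt, the failing tests can be listed from that file afterwards. Below is a minimal Python sketch, assuming the file contains lit result lines in the same form as the log above ("FAIL: IMEX :: <test path> (N of M)"). The function name `failed_tests` is made up for illustration and is not part of IMEX or lit.

```python
import re

# Matches lit result lines such as:
#   FAIL: IMEX :: Integration/Dialect/XeGPU/foo.mlir (97 of 234)
# Trailing whitespace is tolerated because terminal captures often pad lines.
_FAIL_LINE = re.compile(r"^FAIL: \S+ :: (?P<test>\S+) \(\d+ of \d+\)\s*$")


def failed_tests(lit_output: str) -> list[str]:
    """Return the test path from every 'FAIL:' result line in lit output."""
    failures = []
    for line in lit_output.splitlines():
        match = _FAIL_LINE.match(line)
        if match:
            failures.append(match.group("test"))
    return failures
```

For example, `failed_tests(open("build/tests.txt").read())` would return the paths of all tests reported as FAIL, which makes it easier to compare the set of failures between two machines.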

My command line to run the test:
IMEX_ENABLE_LARGE_REG_FILE=1 python3.9 /mydir/llvm-project/build/bin/imex-runner.py --requires=l0-runtime -i /mydir/mlir-extensions/test/Integration/Dialect/XeGPU/gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir --pass-pipeline-file=/mydir/mlir-extensions/test/Integration/Dialect/XeGPU/xegpu-to-llvm.pp --runner imex-cpu-runner -e main --entry-point-result=void --shared-libs=/mydir/llvm-project/build/lib/libimex_runner_utils.so,/mydir/llvm-project/build/lib/libmlir_runner_utils.so,/mydir/llvm-project/build/lib/libmlir_c_runner_utils.so,/mydir/llvm-project/build/lib/liblevel-zero-runtime.so
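
To rerun a single XeGPU integration test without retyping the long command, the same invocation can be assembled programmatically. This is only a sketch: every flag, library name, and the IMEX_ENABLE_LARGE_REG_FILE variable are copied from the command above, while the helper names `runner_command` and `runner_env`, the use of plain `python3`, and the directory arguments are assumptions for illustration.

```python
import os
from pathlib import Path


def runner_command(build_dir: str, source_dir: str, test_name: str) -> list[str]:
    """Build the imex-runner.py argument list for one XeGPU integration test."""
    build = Path(build_dir)
    test_dir = Path(source_dir) / "test" / "Integration" / "Dialect" / "XeGPU"
    # Shared libraries are passed as one comma-separated list, as in the command above.
    shared_libs = ",".join(
        str(build / "lib" / name)
        for name in (
            "libimex_runner_utils.so",
            "libmlir_runner_utils.so",
            "libmlir_c_runner_utils.so",
            "liblevel-zero-runtime.so",
        )
    )
    return [
        "python3",  # assumption: any suitable interpreter on PATH
        str(build / "bin" / "imex-runner.py"),
        "--requires=l0-runtime",
        "-i", str(test_dir / test_name),
        f"--pass-pipeline-file={test_dir / 'xegpu-to-llvm.pp'}",
        "--runner", "imex-cpu-runner",
        "-e", "main",
        "--entry-point-result=void",
        f"--shared-libs={shared_libs}",
    ]


def runner_env() -> dict[str, str]:
    """Copy the current environment and set the variable used in the command above."""
    env = dict(os.environ)
    env["IMEX_ENABLE_LARGE_REG_FILE"] = "1"
    return env
```

The result can be passed to `subprocess.run(runner_command(build, src, test), env=runner_env())`, where `build`, `src`, and `test` stand for the LLVM build directory, the mlir-extensions checkout, and the test file name.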

I will retry. For reference, I installed the drivers, compiler, libraries, etc. from the Debian repos as described here: https://chsasank.com/intel-arc-gpu-driver-oneapi-installation.html

I have not used a conda environment to install dependencies, and I didn't run the pre-commit config either.

The command you shared - is that how I run these benchmarks on an Intel GPU?

Yes, I'm on a machine with an Intel GPU. Note that gemm_4kx4kx4k_f16_f16_f16_w_8x32xf16_stores.mlir is PVC-specific (PVC is Ponte Vecchio).