intel/neural-speed

Neural Speed compilation failing in ORT

OS:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 23.10
Release:        23.10
Codename:       mantic

GCC Version

$ gcc --version
gcc (Ubuntu 13.2.0-4ubuntu3) 13.2.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

CMake Version

$ cmake --version
cmake version 3.27.4

CMake suite maintained and supported by Kitware (kitware.com/cmake).

ONNX Runtime Tag

commit 8f5c79cb63f09ef1302e85081093a3fe4da1bc7d (HEAD -> v1p17p1, tag: v1.17.1, origin/rel-1.17.1)
Author: Rachel Guo <35738743+YUNQIUGUO@users.noreply.github.com>
Date:   Fri Feb 23 16:10:36 2024 -0800

    Update 1.17.1 patch release version (#19622)

    ### Description
    <!-- Describe your changes. -->

    Need to update patch release version.


    ### Motivation and Context
    <!-- - Why is this change required? What problem does it solve?
    - If it fixes an open issue, please link to the issue here. -->

    ---------

    Co-authored-by: rachguo <rachguo@rachguos-Mac-mini.local>
    Co-authored-by: rachguo <rachguo@rachguos-Mini.attlocal.net>

Command:

./build.sh --config RelWithDebInfo --parallel  --build_shared_lib --skip_tests

Error:

/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h: In instantiation of ‘class bestla::parallel::gemm::SchedulerBase<bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2> >’:
/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h:476:7:   required from ‘class bestla::parallel::gemm::SchedulerKBlockS<bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2> >’
/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h:657:14:   required from ‘void bestla::parallel::GemmRun(Launch_T&, const typename Launch_T::Param&, IThreading*) [with Parallel_T = gemm::SchedulerKBlockS<bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2> >; Launch_T = bestla::wrapper::gemm::LauncherIntKBlock<BTLA_ISA::AVX_VNNI, bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2>, bestla::prologue_a::gemm::ActivationF32KBlockQuantize, bestla::prologue_b::gemm::WeightKBlockNInteger, bestla::epilogue::gemm::AccumulatorWriteBackFp32>; typename Launch_T::Param = bestla::wrapper::gemm::LauncherIntKBlock<BTLA_ISA::AVX_VNNI, bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2>, bestla::prologue_a::gemm::ActivationF32KBlockQuantize, bestla::prologue_b::gemm::WeightKBlockNInteger, bestla::epilogue::gemm::AccumulatorWriteBackFp32>::Param]’
/onnxruntime/onnxruntime/contrib_ops/cpu/quantization/neural_speed_gemm.cc:98:30:   required from ‘void bestla::NSSQ4GemmCompInt8(size_t, size_t, size_t, const float*, size_t, storage::gemm::StorageWeightKBlockNInteger*, float*, size_t, int8_t*, parallel::IThreading*) [with GemmCore_T = gemm::ICoreRowNAvxvnniKBlock<24, 2>; size_t = long unsigned int; int8_t = signed char]’
/onnxruntime/onnxruntime/contrib_ops/cpu/quantization/neural_speed_gemm.cc:183:64:   required from here
/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h:49:16: error: ‘virtual void bestla::parallel::Scheduler2D::getIndex(ThreadProblem&) const’ was hidden [-Werror=overloaded-virtual=]
   49 |   virtual void getIndex(ThreadProblem& problem) const {
      |                ^~~~~~~~
/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h:142:16: note:   by ‘void bestla::parallel::gemm::SchedulerBase<_GemmCore_T>::getIndex(ThreadProblem&) [with _GemmCore_T = bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2>; ThreadProblem = bestla::parallel::gemm::ThreadProblemBase]’
  142 |   virtual void getIndex(ThreadProblem& problem) {
      |                ^~~~~~~~
/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h:66:16: error: ‘virtual void bestla::parallel::Scheduler2D::update(const bestla::parallel::Config2D&)’ was hidden [-Werror=overloaded-virtual=]
   66 |   virtual void update(const Config2D& config) {
      |                ^~~~~~
/onnxruntime/build/Linux/RelWithDebInfo/_deps/neural_speed-src/bestla/bestla_parallel.h:151:16: note:   by ‘void bestla::parallel::gemm::SchedulerBase<_GemmCore_T>::update(const bestla::parallel::gemm::Config&) [with _GemmCore_T = bestla::gemm::ICoreRowNAvxvnniKBlock<24, 2>]’
  151 |   virtual void update(const Config& config) {
      |                ^~~~~~
[ 78%] Built target onnx_test_runner_common
[ 78%] Built target onnxruntime_session
[ 78%] Linking CXX static library libonnxruntime_framework.a
[ 78%] Linking CXX shared module libtest_execution_provider.so
[ 78%] Built target test_execution_provider
[ 78%] Built target onnxruntime_framework
[ 78%] Linking CXX static library libonnxruntime_graph.a
[ 78%] Linking CXX static library libonnxruntime_util.a
[ 78%] Built target onnxruntime_graph
[ 78%] Built target onnxruntime_util
cc1plus: all warnings being treated as errors
gmake[2]: *** [CMakeFiles/onnxruntime_providers.dir/build.make:2498: CMakeFiles/onnxruntime_providers.dir/onnxruntime/onnxruntime/contrib_ops/cpu/quantization/neural_speed_gemm.cc.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:2010: CMakeFiles/onnxruntime_providers.dir/all] Error 2
gmake: *** [Makefile:166: all] Error 2
Traceback (most recent call last):
  File "/onnxruntime/tools/ci_build/build.py", line 2887, in <module>
    sys.exit(main())
             ^^^^^^
  File "/onnxruntime/tools/ci_build/build.py", line 2779, in main
    build_targets(args, cmake_path, build_dir, configs, num_parallel_jobs, args.target)
  File "/onnxruntime/tools/ci_build/build.py", line 1659, in build_targets
    run_subprocess(cmd_args, env=env)
  File "/onnxruntime/tools/ci_build/build.py", line 839, in run_subprocess
    return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/onnxruntime/tools/python/util/run.py", line 49, in run
    completed_process = subprocess.run(
                        ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/cmake', '--build', '/onnxruntime/build/Linux/RelWithDebInfo', '--config', 'RelWithDebInfo', '--', '-j344']' returned non-zero exit status 2.

This is a GCC 13 warning (-Woverloaded-virtual). It fails the build only because the compiler flags treat all warnings as errors (-Werror).
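
For readers unfamiliar with this diagnostic, here is a minimal sketch of the pattern behind the getIndex error in the log: a derived class declares a function with the same name as a base virtual but drops the `const`, so it hides the base function instead of overriding it. The class and function names below are illustrative stand-ins, not the real bestla code, and the return values exist only to make the behavior observable.

```cpp
// Minimal reproduction of the pattern that -Woverloaded-virtual reports.
// All types here are hypothetical stand-ins for the bestla classes.
#include <string>

struct ThreadProblem {
  int id = 0;
};

struct Scheduler2D {
  virtual ~Scheduler2D() = default;
  // const-qualified virtual, mirroring the base signature shown in the log.
  virtual std::string getIndex(ThreadProblem&) const { return "base"; }
};

// The pragma keeps this demo compiling even when warnings are errors;
// remove it to see the diagnostic with GCC 13 and -Wall.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverloaded-virtual"
// Same name and parameter, but no `const`: this does NOT override the base
// function, it hides it.
struct HidingScheduler : Scheduler2D {
  virtual std::string getIndex(ThreadProblem&) { return "derived"; }
};
#pragma GCC diagnostic pop

// One possible fix: match the base signature exactly and mark it `override`,
// so the compiler rejects any future mismatch.
struct OverridingScheduler : Scheduler2D {
  std::string getIndex(ThreadProblem&) const override { return "derived"; }
};

// Calls through the base interface to show which function actually runs.
inline std::string callThroughBase(const Scheduler2D& s) {
  ThreadProblem p;
  return s.getIndex(p);
}
```

Calling `callThroughBase` on a `HidingScheduler` runs the base implementation, because the non-const function never entered the virtual dispatch for the const one. That silent mismatch is what the warning is meant to catch.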

Adding --compile_no_warning_as_error to your build.sh options should stop this warning from failing the build.

Thanks for the reply. I am currently disabling the Neural Speed compilation with the following flag in the ORT build command:

--cmake_extra_defines onnxruntime_USE_NEURAL_SPEED=OFF

Let's keep this issue open in case there are plans to change Neural Speed so that it no longer triggers these errors/warnings.
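
As a rough illustration of what such a source-level change could look like for the `update` case in the log, where the derived overload takes a different parameter type, the sketch below uses a using-declaration to keep the base overload visible. This is only one option and an assumption on my part, not what Neural Speed does or plans to do. The type names (`GemmConfig`, `GemmScheduler`) and the integer return values are hypothetical.

```cpp
// Sketch: avoiding -Woverloaded-virtual when a derived class adds an overload
// with a different parameter type. All names are hypothetical.
struct Config2D {
  int rows = 0;
  int cols = 0;
};

struct GemmConfig {
  int kblock = 0;
};

struct Scheduler2D {
  virtual ~Scheduler2D() = default;
  virtual int update(const Config2D& c) { return c.rows * c.cols; }
};

struct GemmScheduler : Scheduler2D {
  // Re-exposes the base overload; without this line, update(const Config2D&)
  // would be hidden by the declaration below and GCC would warn.
  using Scheduler2D::update;
  virtual int update(const GemmConfig& c) { return c.kblock; }
};
```

With the using-declaration in place, both `update(const Config2D&)` and `update(const GemmConfig&)` can be called on a `GemmScheduler`. Without it, a call with a `Config2D` argument would not compile at all, since the base overload is hidden.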