google/XNNPACK

Failed to compile XNNPACK on WoA (Windows on ARM) device.

zhanweiw opened this issue

It seems part of the code hasn't been compiled. Any idea on how to fix it? Thanks in advance!

FAILED: subgraph-size-test.exe
C:\windows\system32\cmd.exe /C "cd . && C:\Programs\Python\Python311-arm64\Lib\site-packages\cmake\data\bin\cmake.exe -E vs_link_exe --intdir=CMakeFiles\subgraph-size-test.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\arm64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\arm64\mt.exe --manifests  -- C:\Programs\LLVM\bin\lld-link.exe /nologo CMakeFiles\subgraph-size-test.dir\test\subgraph-size.c.obj  /out:subgraph-size-test.exe /implib:subgraph-size-test.lib /pdb:subgraph-size-test.pdb /version:0.0 /machine:ARM64 /debug /INCREMENTAL /subsystem:console  XNNPACK.lib  cpuinfo\cpuinfo.lib  pthreadpool\pthreadpool.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
LINK Pass 1: command "C:\Programs\LLVM\bin\lld-link.exe /nologo CMakeFiles\subgraph-size-test.dir\test\subgraph-size.c.obj /out:subgraph-size-test.exe /implib:subgraph-size-test.lib /pdb:subgraph-size-test.pdb /version:0.0 /machine:ARM64 /debug /INCREMENTAL /subsystem:console XNNPACK.lib cpuinfo\cpuinfo.lib pthreadpool\pthreadpool.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:CMakeFiles\subgraph-size-test.dir/intermediate.manifest CMakeFiles\subgraph-size-test.dir/manifest.res" failed (exit code 1) with the following output:
lld-link: error: undefined symbol: xnn_f16_vabs_ukernel__neonfp16arith_u16
>>> referenced by C:\zhanweiw\tf_lite\XNNPACK\src\configs\unary-elementwise-config.c:181
>>>               XNNPACK.lib(unary-elementwise-config.c.obj):(init_f16_abs_config)
>>> referenced by C:\zhanweiw\tf_lite\XNNPACK\src\configs\unary-elementwise-config.c:181
>>>               XNNPACK.lib(unary-elementwise-config.c.obj):(init_f16_abs_config)

Hi, thanks for the report.

When I gave it a quick try with blaze (which is like bazel), I was able to build the abs bench:
blaze build --config=lexan_x86_64 -c opt //third_party/XNNPACK/bench:abs_bench

The microkernel is checked in, declared and used:
grep xnn_f16_vabs_ukernel__neonfp16arith_u16 . -r
./src/amalgam/gen/neonfp16arith.c:void xnn_f16_vabs_ukernel__neonfp16arith_u16(
./src/configs/unary-elementwise-config.c: f16_abs_config.ukernel = (xnn_vunary_ukernel_fn) xnn_f16_vabs_ukernel__neonfp16arith_u16;
./src/configs/unary-elementwise-config.c: f16_abs_config.ukernel = (xnn_vunary_ukernel_fn) xnn_f16_vabs_ukernel__neonfp16arith_u16;
./src/xnnpack/vunary.h:DECLARE_F16_VABS_UKERNEL_FUNCTION(xnn_f16_vabs_ukernel__neonfp16arith_u16)
./src/f16-vunary/gen/f16-vabs-neonfp16arith-u16.c:void xnn_f16_vabs_ukernel__neonfp16arith_u16(
./bench/f16-vabs.cc: xnn_f16_vabs_ukernel__neonfp16arith_u16,
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.cc: .TestAbs(xnn_f16_vabs_ukernel__neonfp16arith_u16);
./test/f16-vabs.yaml:- name: xnn_f16_vabs_ukernel__neonfp16arith_u16

The important one for linking is that the kernel is in
./src/amalgam/gen/neonfp16arith.c
which gets built and linked on Arm systems unless fp16 is disabled.

In CMakeLists.txt, around line 509,
IF(XNNPACK_ENABLE_ARM_FP16_VECTOR)
  LIST(APPEND PROD_MICROKERNEL_SRCS ${PROD_NEONFP16ARITH_MICROKERNEL_SRCS})
  LIST(APPEND PROD_MICROKERNEL_SRCS ${PROD_NEONFP16ARITH_AARCH64_MICROKERNEL_SRCS})
ENDIF()
appends the fp16 microkernels.
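
As a sanity check on a Windows on ARM configure (a sketch only; it assumes XNNPACK_ENABLE_ARM_FP16_VECTOR is a regular CMake option, and the generator mirrors the build script below rather than anything confirmed here), you can pass that option explicitly and confirm it stays ON in the CMake cache:

rem Sketch: explicitly keep the fp16 vector microkernels enabled so the
rem neonfp16arith amalgam is appended to PROD_MICROKERNEL_SRCS and linked in.
cmake ..\..\.. -G "Visual Studio 17 2022" -A ARM64 -DXNNPACK_LIBRARY_TYPE=static -DXNNPACK_ENABLE_ARM_FP16_VECTOR=ON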

It's possible our CMake is missing something for Windows. In scripts/build-windows-arm64.cmd
there are CMake parameters for a Visual Studio 2022 build:

mkdir build\windows
mkdir build\windows\arm64

set CMAKE_ARGS=-DXNNPACK_LIBRARY_TYPE=static -DXNNPACK_ENABLE_ASSEMBLY=OFF -DXNNPACK_ENABLE_ARM_FP16_SCALAR=OFF -DXNNPACK_ENABLE_ARM_BF16=OFF
set CMAKE_ARGS=%CMAKE_ARGS% -G="Visual Studio 17 2022" -A=ARM64

rem User-specified CMake arguments go last to allow overriding defaults
set CMAKE_ARGS=%CMAKE_ARGS% %*

echo %CMAKE_ARGS%

cd build\windows\arm64 && cmake ..\..\.. %CMAKE_ARGS%
cmake --build . -j %NUMBER_OF_PROCESSORS% --config Release
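
For example (an illustrative invocation, not from the original report), any extra flags passed to the script land after the defaults and override them, e.g. to turn the assembly microkernels off:

rem Run from the XNNPACK source root; trailing arguments override the defaults above
scripts\build-windows-arm64.cmd -DXNNPACK_ENABLE_ASSEMBLY=OFF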

Thanks for your support!
I've tried disabling 'XNNPACK_ENABLE_ASSEMBLY' and it works. But if we disable this feature, it will impact performance, right?
Is it possible to enable 'XNNPACK_ENABLE_ASSEMBLY' on ARM64 Windows?

The Arm assembly is in .S files meant to be compiled with gcc or clang.
As far as I know, there's no way to assemble them with Visual Studio.

The best solution is to compile with clang or clang-cl.
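
One way to do that (a sketch with assumed generator and paths; Ninja and the compiler variables are not taken from this thread) is to point CMake at clang-cl directly instead of the MSVC toolset:

rem Sketch: configure with clang-cl as the C/C++ compiler via the Ninja generator,
rem keeping the assembly microkernels enabled.
cmake ..\..\.. -G Ninja -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl -DXNNPACK_LIBRARY_TYPE=static -DXNNPACK_ENABLE_ASSEMBLY=ON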

I tried with ClangCL from Visual Studio 2022 using:
cmake -T"ClangCL"

But I got the same error as llvm/llvm-project#52964

The version of clang is:

clang --version
clang version 17.0.3
Target: aarch64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\Llvm\ARM64\bin

So either we need to wait for the intrinsic support to get into a released toolchain, or we need to find a workaround.

Looking at this function in particular, it's not actually using fp16 arithmetic:
xnn_f16_vabs_ukernel__neonfp16arith_u16

The type is f16, but the implementation only needs plain NEON; the file name says neon, which is inconsistent with the kernel name.
It's not clear if that explains your link error, but the ISA should be consistent, because it determines which library/amalgam the kernel goes into.
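
To narrow that down on the Windows side (an illustrative check, not a fix), you can list which sources under src actually mention the kernel and confirm that the amalgam defining it is part of your build:

rem Print only the names of .c files under src that mention the kernel
findstr /s /m "xnn_f16_vabs_ukernel__neonfp16arith_u16" src\*.c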