mlc-ai/tokenizers-cpp

SentencePiece Build Error - ld: error: undefined symbol: __android_log_write

zjc664656505 opened this issue · 6 comments

Dear mlc-ai developers,

I'm currently deploying the tokenizer to my Android environment. I have successfully built the Hugging Face tokenizer. However, when I try to build the SentencePiece tokenizer, I get this error:

: && /Users/junchenzhao/Library/Android/sdk/ndk/25.1.8937393/toolchains/llvm/prebuilt/darwin-x86_64/bin/clang++ --target=aarch64-none-linux-android24 --sysroot=/Users/junchenzhao/Library/Android/sdk/ndk/25.1.8937393/toolchains/llvm/prebuilt/darwin-x86_64/sysroot -O3 -Wall -fPIC -g -DANDROID -fdata-sections -ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security  -std=c++17 -fmacro-prefix-map=/Users/junchenzhao/Dist-CPU-Learn/android/distributed_inference_demo/test1/src/main/cpp/='' -fno-limit-debug-info -static-libstdc++ -Wl,--build-id=sha1 -Wl,--no-rosegment -Wl,--fatal-warnings -Wl,--gc-sections -Wl,--no-undefined -Qunused-arguments -Wl,--gc-sections tokenizers_cpp/sentencepiece/src/CMakeFiles/spm_decode.dir/spm_decode_main.cc.o -o /Users/junchenzhao/Dist-CPU-Learn/android/distributed_inference_demo/test1/build/intermediates/cxx/Debug/6m5u3o15/obj/arm64-v8a/spm_decode  tokenizers_cpp/sentencepiece/src/libsentencepiece.a  -pthread  -latomic -lm && :

ld: error: undefined symbol: __android_log_write
>>> referenced by common.cc:150 (/test1/src/main/cpp/tokenizers-cpp/sentencepiece/third_party/protobuf-lite/common.cc:150)
>>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)) in archive tokenizers_cpp/sentencepiece/src/libsentencepiece.a
>>> referenced by common.cc:158 (/test1/src/main/cpp/tokenizers-cpp/sentencepiece/third_party/protobuf-lite/common.cc:158)
>>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)) in archive tokenizers_cpp/sentencepiece/src/libsentencepiece.a

Here is my CMakeLists.txt under the src/main/cpp folder:

# Sets the minimum version of CMake required to build the native library.
cmake_minimum_required(VERSION 3.18.1)

project(distributed_inference_demo C CXX)

add_library(
        # Sets the name of the library.
        distributed_inference_demo
        # Sets the library as a shared library.
        SHARED
        # Provides a relative path to your source file(s).
        native-lib.cpp
        utils.cpp
        inference.cpp
)

set(TOKENIZER_CPP_PATH ${CMAKE_SOURCE_DIR}/tokenizers-cpp)
add_subdirectory(${TOKENIZER_CPP_PATH} tokenizers_cpp)

target_include_directories(distributed_inference_demo PRIVATE
        ${CMAKE_SOURCE_DIR}/include/
        ${TOKENIZER_CPP_PATH}/include/)

add_library(onnxruntime SHARED IMPORTED)
set_target_properties(onnxruntime PROPERTIES IMPORTED_LOCATION ${CMAKE_SOURCE_DIR}/lib/libonnxruntime.so)

# Searches for a specified prebuilt library and stores the path as a
# variable. Because CMake includes system libraries in the search path by
# default, you only need to specify the name of the public NDK library
# you want to add. CMake verifies that the library exists before
# completing its build.
find_library(
        # Sets the name of the path variable.
        log-lib
        # Specifies the name of the NDK library that
        # you want CMake to locate.
        log
)

# Specifies libraries CMake should link to your target library. You
# can link multiple libraries, such as libraries you define in this
# build script, prebuilt third-party libraries, or system libraries.
target_link_libraries(
        distributed_inference_demo
        sentencepiece-static
        tokenizers_c
        tokenizers_cpp
        ${log-lib}
        onnxruntime

)

I'm not sure why this error keeps coming up. I directly cloned this repo and the corresponding repo from sentencepiece and

Please let me know how to solve this issue.

Thanks a lot!

Hi Junru,

Thanks for your kind response.

I have checked this answer and don't think this approach works. My Android C++ backend already has a CMakeLists.txt and I have already included the log library in it, so configuring an Android.mk file will not solve my issue.

I have built sentencepiece again, but I still get the same error message under tokenizers-cpp in Android Studio.

tqchen commented

This is likely due to a missing dependency on the log library, so it comes down to getting CMake to link log on your end. I'm not sure how to debug further; maybe try different ways of linking log.

Thanks for your reply. I found a workaround. If we manually link the log library to sentencepiece in sentencepiece/src/CMakeLists.txt, the build works.

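Original:
[image: screenshot of sentencepiece/src/CMakeLists.txt before the change]

Modified:
[image: screenshot of sentencepiece/src/CMakeLists.txt after linking log]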
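
The exact lines changed in the screenshots are not recoverable from the text of this thread, so the following is only a sketch of what such a change might look like, not necessarily the edit that was made. It is based on the failing command above, which links spm_decode against libsentencepiece.a with only -pthread -latomic -lm. That suggests the `${log-lib}` added to distributed_inference_demo never reaches the sentencepiece tools. The target name sentencepiece-static is taken from the CMakeLists.txt in the first post. The placement in sentencepiece/src/CMakeLists.txt and the use of the INTERFACE keyword are assumptions.

```cmake
# Sketch only: assumed to go in sentencepiece/src/CMakeLists.txt, after the
# sentencepiece-static target has been defined.
# ANDROID is set by the NDK CMake toolchain file, so other platforms are unaffected.
if (ANDROID)
  # protobuf-lite's common.cc calls __android_log_write, which lives in liblog.
  # Declaring it as an INTERFACE dependency of the static library makes every
  # target that links libsentencepiece.a (spm_decode, spm_encode, ...) add -llog.
  # Assumption: the existing target_link_libraries() calls for this target use
  # the keyword form; if they use the plain form, drop INTERFACE here, because
  # CMake does not allow mixing the two forms on one target.
  target_link_libraries(sentencepiece-static INTERFACE log)
endif()
```

After a clean rebuild, the link command for spm_decode should include -llog alongside -latomic and -lm.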

I will close this issue since it's resolved.

Interesting! Thanks for sharing your workaround. It will definitely be useful if anyone encounters this issue in the future.