
TF RPI build fails with AWS lib related linker errors

sdeoras opened this issue · 7 comments

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Linux rpi3-0 4.14.79-v7+ #1159 SMP Sun Nov 4 17:50:20 GMT 2018 armv7l GNU/Linux

  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:

  • TensorFlow installed from (source or binary):

  • TensorFlow version:

  • Python version:
    Python 2.7.13

  • Installed using virtualenv? pip? conda?:

  • Bazel version (if compiling from source):
    $ bazel version
    WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
    INFO: Invocation ID: a32f1f82-b910-4312-b73c-335b3090412a
    Build label: 0.20.0- (@non-git)
    Build target: bazel-out/arm-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
    Build time: Thu Dec 13 03:56:33 2018 (1544673393)
    Build timestamp: 1544673393
    Build timestamp as int: 1544673393

  • GCC/Compiler version (if compiling from source):
    gcc (Raspbian 4.8.5-4) 4.8.5

  • CUDA/cuDNN version:

  • GPU model and memory:

Describe the problem
My use case is running a binary on a Raspberry Pi that serves a TF model and links dynamically against libtensorflow.so. Since I could not find a pre-built TF C library for the RPi, I tried to build it from source. After much pain and suffering I got the code to compile, but then hit linker errors related to the AWS library. It would be great if users could download a pre-built library, but in the meantime, how can I resolve these errors? I do not need cloud support in TF, so if it is easier, I would simply like to switch that functionality off.

Provide the exact sequence of commands / steps that you executed before running into the problem

  • Built proto (success)
  • Built Bazel (success)
  • Checked out v1.12.0 of the TF code

Manually edited files with these commands:

  • grep -Rl 'lib64' | xargs sed -i 's/lib64/lib/g'
  • sed -i "s|#define IS_MOBILE_PLATFORM|//#define IS_MOBILE_PLATFORM|g" tensorflow/core/platform/platform.h
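For anyone repeating these edits, here is a minimal sketch that wraps the two commands in one function so they can be tried on a scratch copy first. The function name patch_tree is hypothetical, and the --exclude-dir=.git, -Z/-0 and -r options are additions that were not in the original commands; GNU grep, sed and xargs are assumed.

```shell
# Apply the two manual source edits to the tree rooted at $1.
# patch_tree is a hypothetical helper name, not part of TensorFlow.
patch_tree() {
  root="$1"
  # Rewrite lib64 -> lib in every file that mentions it.
  # Additions to the original command: skip .git, keep odd file names
  # intact (-Z / -0), and do not run sed when nothing matches (-r).
  grep -RlZ --exclude-dir=.git 'lib64' "$root" \
    | xargs -0 -r sed -i 's/lib64/lib/g'
  # Comment out the IS_MOBILE_PLATFORM define in platform.h.
  sed -i 's|#define IS_MOBILE_PLATFORM|//#define IS_MOBILE_PLATFORM|g' \
    "$root/tensorflow/core/platform/platform.h"
}

# Usage, from the top of the TensorFlow checkout:
#   patch_tree .
```

The second sed is not idempotent: running it twice prefixes the line with another pair of slashes, which is harmless but worth knowing before re-running on an already patched tree.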

Installed a few packages after seeing compile issues:

  • sudo apt-get install gcc-4.8 g++-4.8
  • sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 100
  • sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.8 100
  • sudo apt-get install libc-ares-dev

At this point there were no more compile issues, once I added the following flags to the bazel command: --copt="-std=gnu99" --define=grpc_no_ares=true. So I think sudo apt-get install libc-ares-dev is probably not required.

I then ran the following command to build the TF C library:
bazel build -c opt \
  --copt="-mfpu=neon-vfpv4" \
  --copt="-funsafe-math-optimizations" \
  --copt="-ftree-vectorize" \
  --copt="-fomit-frame-pointer" \
  --jobs 1 \
  --local_resources 1024,1.0,1.0 \
  --verbose_failures \
  --genrule_strategy=standalone \
  --spawn_strategy=standalone \
  --incompatible_remove_native_http_archive=false \
  --copt="-std=gnu99" \
  --define=grpc_no_ares=true \
  //tensorflow:libtensorflow.so

Any other info / logs
It took 10 hours, but all the code compiled fine. However, I then ran into the following linker errors:

ERROR: /home/pi/gocode/tensorflow/tensorflow/BUILD:423:1: Linking of rule '//tensorflow:libtensorflow.so' failed (Exit 1): gcc failed: error executing command
  (cd /home/pi/.cache/bazel/_bazel_pi/e36eb5edf9626338e60e5f0e9f0b5230/execroot/org_tensorflow && \
  exec env - \
    PATH=/home/pi/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/usr/bin/python \
    PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
    TF_NEED_CUDA=0 \
    TF_NEED_ROCM=0 \
  /usr/bin/gcc -shared -o bazel-out/arm-opt/bin/tensorflow/libtensorflow.so -z defs -Wl,--version-script tensorflow/c/version_script.lds '-Wl,-rpath,$ORIGIN/' -Wl,-soname,libtensorflow.so -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread '-fuse-ld=gold' -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -pass-exit-codes -Wl,--gc-sections -Wl,@bazel-out/arm-opt/bin/tensorflow/libtensorflow.so-2.params)
Execution platform: @bazel_tools//platforms:host_platform
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetConfigProfileFilename(): error: undefined reference to 'Aws::FileSystem::GetHomeDirectory()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetCredentialsProfileFilename(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetCredentialsProfileFilename(): error: undefined reference to 'Aws::FileSystem::GetHomeDirectory()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/ClientConfiguration.pic.o:ClientConfiguration.cpp:function Aws::Client::ClientConfiguration::ClientConfiguration(): error: undefined reference to 'Aws::OSVersionInfo::ComputeOSVersionString()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/DateTimeCommon.pic.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampStringToTimePoint(char const*, Aws::Utils::DateFormat): error: undefined reference to 'Aws::Time::TimeGM(tm*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/DateTimeCommon.pic.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampToLocalTimeStruct() const: error: undefined reference to 'Aws::Time::LocalTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/DateTimeCommon.pic.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampToGmtStruct() const: error: undefined reference to 'Aws::Time::GMTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/TempFile.pic.o:TempFile.cpp:function .LTHUNK5: error: undefined reference to 'Aws::FileSystem::RemoveFileIfExists(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/TempFile.pic.o:TempFile.cpp:function Aws::Utils::ComputeTempFileName(char const*, char const*): error: undefined reference to 'Aws::FileSystem::CreateTempFilePath()'
collect2: error: ld returned 1 exit status
Target //tensorflow:libtensorflow.so failed to build
INFO: Elapsed time: 173.724s, Critical Path: 126.80s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
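To make the log easier to read, the distinct missing symbols can be pulled out of it. This is a small sketch assuming the log was saved to a file; the function name undefined_symbols and the file name build.log are hypothetical.

```shell
# Print each distinct symbol named in "undefined reference to '...'"
# lines of a linker log, one per line, sorted.
# undefined_symbols is a hypothetical helper name.
undefined_symbols() {
  sed -n "s/.*undefined reference to '\(.*\)'\$/\1/p" "$@" | sort -u
}

# Usage, assuming the build output was saved to build.log:
#   undefined_symbols build.log
```

For the errors above this leaves eight distinct symbols, all in the Aws::Environment, Aws::FileSystem, Aws::OSVersionInfo and Aws::Time namespaces. That pattern suggests one group of AWS SDK sources is missing from the link rather than a scattering of unrelated failures, although that is only an inference from the log.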

@petewarden Any thoughts on this ?

I also tried the instructions here, but ran into the following errors with TF v1.12.0:

Building for the Pi Two/Three, with NEON acceleration
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 537ebdec-4670-44b7-8f27-b9d61f978791
Loading: 0 packages loaded
Analyzing: 2 targets (1 packages loaded)
ERROR: cc_toolchain_suite '@local_config_arm_compiler//:toolchain' does not contain a toolchain for CPU 'armeabi', you may want to add an entry for 'armeabi|compiler' into toolchains and toolchain_identifier 'arm-linux-gnueabihf' into the corresponding cc_toolchain rule (see --incompatible_disable_cc_toolchain_label_from_crosstool_proto).
INFO: Elapsed time: 1.550s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (2 packages loaded)
FAILED: Build did NOT complete successfully (2 packages loaded)

The ability to disable AWS was removed in TF 1.12 but was recently re-added:

Apply those two patches: https://gist.github.com/fyhertz/4cef0b696b37d38964801d3ef21e8ce2 and restart your build with --config=noaws

edit: One of the patches also moves tools/bazel.rc. That has nothing to do with your issue; I needed it to build with Bazel 0.19.2 :)
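Since a full build on the Pi takes hours, it may be worth confirming that the patched tree actually defines the noaws config before restarting. A minimal sketch follows; the helper name has_bazel_config is hypothetical, and checking both .bazelrc and tools/bazel.rc is an assumption, since which rc file holds the entry depends on the TF revision and on the patch that moves it.

```shell
# Succeed if any of the given Bazel rc files contains a line that
# starts with "build:<name>", i.e. defines that --config for builds.
# has_bazel_config is a hypothetical helper name.
has_bazel_config() {
  name="$1"
  shift
  # -q: exit status only; -s: stay quiet about rc files that do not exist.
  grep -qsE "^[[:space:]]*build:${name}([[:space:]]|\$)" "$@"
}

# Usage, from the top of the patched TensorFlow checkout
# (append the remaining build flags from the original command):
#   has_bazel_config noaws .bazelrc tools/bazel.rc \
#     && bazel build -c opt --config=noaws //tensorflow:libtensorflow.so
```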

mixaz commented

Any ideas on how to solve that "missing toolchain for armeabi" issue?

Closing this issue since the RPi build is now working on master via the cross-compiling workflow. Thank you TF team!

I got the same problem when compiling 2.0.0-alpha on FreeBSD.

Is AWS support intended to be working at all?