tensorflow/tensorflow

TF RPI build fails with AWS lib related linker errors

sdeoras opened this issue · 7 comments


System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Linux rpi3-0 4.14.79-v7+ #1159 SMP Sun Nov 4 17:50:20 GMT 2018 armv7l GNU/Linux

  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
    n/a

  • TensorFlow installed from (source or binary):
    Source

  • TensorFlow version:
    v1.12.0

  • Python version:
    Python 2.7.13

  • Installed using virtualenv? pip? conda?:
    nope

  • Bazel version (if compiling from source):
    $ bazel version
    WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
    /home/pi/gocode/tensorflow/tools/bazel.rc
    INFO: Invocation ID: a32f1f82-b910-4312-b73c-335b3090412a
    Build label: 0.20.0- (@non-git)
    Build target: bazel-out/arm-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
    Build time: Thu Dec 13 03:56:33 2018 (1544673393)
    Build timestamp: 1544673393
    Build timestamp as int: 1544673393

  • GCC/Compiler version (if compiling from source):
    gcc (Raspbian 4.8.5-4) 4.8.5

  • CUDA/cuDNN version:
    n/a

  • GPU model and memory:
    n/a

Describe the problem
My use case is running a binary on a Raspberry Pi that serves a TF model and dynamically links against libtensorflow.so. Since I could not find a prebuilt TF C library for the RPI, I tried to build it from source. After much pain and suffering I got the code to compile, but then hit linker errors related to the AWS library. It would be great if users could download a prebuilt library, but in the meantime, how can I resolve these errors? I do not need cloud support in TF, so if it is easier, I would simply like to switch that functionality off.

Provide the exact sequence of commands / steps that you executed before running into the problem

  • Built proto (success)
  • Built bazel (success)
  • Checked out v1.12.0 of the TF code

Manually edited files with these commands:

  • grep -Rl 'lib64' | xargs sed -i 's/lib64/lib/g'
  • sed -i "s|#define IS_MOBILE_PLATFORM|//#define IS_MOBILE_PLATFORM|g" tensorflow/core/platform/platform.h
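The effect of the second edit can be previewed on a scratch file before touching the source tree. This is only a minimal sketch: the file contents below are made up for illustration and are not the real platform.h, and `-i` is dropped so that nothing is modified in place.

```shell
#!/bin/sh
# Scratch file standing in for tensorflow/core/platform/platform.h.
# Its contents are hypothetical, chosen only to contain the define.
tmp=$(mktemp)
printf '%s\n' '#if defined(__arm__)' '#define IS_MOBILE_PLATFORM' '#endif' > "$tmp"

# Same expression as in the issue, without -i: print the result instead of
# rewriting the file. The define ends up commented out with a leading //.
result=$(sed "s|#define IS_MOBILE_PLATFORM|//#define IS_MOBILE_PLATFORM|g" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Note that the expression also matches a line that is already commented out, so running the in-place version twice would stack another `//` in front of the define.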

Installed a few packages after seeing compile issues:

  • sudo apt-get install gcc-4.8 g++-4.8
  • sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 100
  • sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.8 100
  • sudo apt-get install libc-ares-dev

There were no compile issues at this point, once I added the following flags to the bazel command: --copt="-std=gnu99" --define=grpc_no_ares=true. So I think sudo apt-get install libc-ares-dev is probably not required.

I then ran the following command to build the TF C library:
bazel build -c opt --copt="-mfpu=neon-vfpv4" --copt="-funsafe-math-optimizations" --copt="-ftree-vectorize" --copt="-fomit-frame-pointer" --jobs 1 --local_resources 1024,1.0,1.0 --verbose_failures --genrule_strategy=standalone --spawn_strategy=standalone --incompatible_remove_native_http_archive=false --copt="-std=gnu99" --define=grpc_no_ares=true //tensorflow:libtensorflow.so

Any other info / logs
It took 10 hours, but all the code compiled fine. However, I then ran into the following linker errors:

ERROR: /home/pi/gocode/tensorflow/tensorflow/BUILD:423:1: Linking of rule '//tensorflow:libtensorflow.so' failed (Exit 1): gcc failed: error executing command
  (cd /home/pi/.cache/bazel/_bazel_pi/e36eb5edf9626338e60e5f0e9f0b5230/execroot/org_tensorflow && \
  exec env - \
    PATH=/home/pi/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/games:/usr/games \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/usr/bin/python \
    PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
    TF_DOWNLOAD_CLANG=0 \
    TF_NEED_CUDA=0 \
    TF_NEED_OPENCL_SYCL=0 \
    TF_NEED_ROCM=0 \
  /usr/bin/gcc -shared -o bazel-out/arm-opt/bin/tensorflow/libtensorflow.so -z defs -Wl,--version-script tensorflow/c/version_script.lds '-Wl,-rpath,$ORIGIN/' -Wl,-soname,libtensorflow.so -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread '-fuse-ld=gold' -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -pass-exit-codes -Wl,--gc-sections -Wl,@bazel-out/arm-opt/bin/tensorflow/libtensorflow.so-2.params)
Execution platform: @bazel_tools//platforms:host_platform
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetConfigProfileFilename(): error: undefined reference to 'Aws::FileSystem::GetHomeDirectory()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetCredentialsProfileFilename(): error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/AWSCredentialsProvider.pic.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetCredentialsProfileFilename(): error: undefined reference to 'Aws::FileSystem::GetHomeDirectory()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/ClientConfiguration.pic.o:ClientConfiguration.cpp:function Aws::Client::ClientConfiguration::ClientConfiguration(): error: undefined reference to 'Aws::OSVersionInfo::ComputeOSVersionString()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/DateTimeCommon.pic.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampStringToTimePoint(char const*, Aws::Utils::DateFormat): error: undefined reference to 'Aws::Time::TimeGM(tm*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/DateTimeCommon.pic.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampToLocalTimeStruct() const: error: undefined reference to 'Aws::Time::LocalTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/DateTimeCommon.pic.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampToGmtStruct() const: error: undefined reference to 'Aws::Time::GMTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/TempFile.pic.o:TempFile.cpp:function .LTHUNK5: error: undefined reference to 'Aws::FileSystem::RemoveFileIfExists(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/TempFile.pic.o:TempFile.cpp:function Aws::Utils::ComputeTempFileName(char const*, char const*): error: undefined reference to 'Aws::FileSystem::CreateTempFilePath()'
collect2: error: ld returned 1 exit status
Target //tensorflow:libtensorflow.so failed to build
INFO: Elapsed time: 173.724s, Critical Path: 126.80s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
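Because the same error repeats several times, it can help to reduce the log to its distinct missing symbols. The following is a rough sketch, not part of the original report: it works on a handful of lines copied from the output above (shortened to the part after "error:") and assumes the real log would be saved to a file in the same way.

```shell
#!/bin/sh
# Sample lines copied from the linker output above, shortened for readability.
log=$(mktemp)
cat > "$log" <<'EOF'
error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
error: undefined reference to 'Aws::Environment::GetEnv(char const*)'
error: undefined reference to 'Aws::FileSystem::GetHomeDirectory()'
error: undefined reference to 'Aws::OSVersionInfo::ComputeOSVersionString()'
error: undefined reference to 'Aws::Time::TimeGM(tm*)'
EOF

# Keep only the quoted symbol name from each line, then drop duplicates.
symbols=$(sed -n "s/.*undefined reference to '\(.*\)'.*/\1/p" "$log" | sort -u)
printf '%s\n' "$symbols"

rm -f "$log"
```

Applied to the full log above, the same filter leaves eight distinct symbols, all in the Aws::Environment, Aws::FileSystem, Aws::OSVersionInfo and Aws::Time namespaces.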


@petewarden Any thoughts on this?

I also tried the instructions here, but ran into the following errors with TF v1.12.0:

Building for the Pi Two/Three, with NEON acceleration
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/workspace/tools/bazel.rc
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 537ebdec-4670-44b7-8f27-b9d61f978791
Loading:
Loading: 0 packages loaded
Analyzing: 2 targets (1 packages loaded)
ERROR: cc_toolchain_suite '@local_config_arm_compiler//:toolchain' does not contain a toolchain for CPU 'armeabi', you may want to add an entry for 'armeabi|compiler' into toolchains and toolchain_identifier 'arm-linux-gnueabihf' into the corresponding cc_toolchain rule (see --incompatible_disable_cc_toolchain_label_from_crosstool_proto).
INFO: Elapsed time: 1.550s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (2 packages loaded)
FAILED: Build did NOT complete successfully (2 packages loaded)

The ability to disable AWS support was removed in TF 1.12 but was recently re-added in commit 3437098.

Apply these two patches: https://gist.github.com/fyhertz/4cef0b696b37d38964801d3ef21e8ce2 and then restart your build with --config=noaws.

Edit: One of the patches also moves tools/bazel.rc. That has nothing to do with your issue; I needed it to build with bazel 0.19.2 :)
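For reference, the rebuilt command might look roughly like the sketch below. It reuses the flags from the original report and only adds --config=noaws. That config is assumed to exist only once the patches above are applied, so this is a dry run that prints the command instead of invoking bazel.

```shell
#!/bin/sh
# Flags copied from the build command in the original report, plus
# --config=noaws, which is assumed to be defined by the patches above.
set -- build -c opt \
  --copt="-mfpu=neon-vfpv4" \
  --copt="-funsafe-math-optimizations" \
  --copt="-ftree-vectorize" \
  --copt="-fomit-frame-pointer" \
  --jobs 1 \
  --local_resources 1024,1.0,1.0 \
  --verbose_failures \
  --genrule_strategy=standalone \
  --spawn_strategy=standalone \
  --incompatible_remove_native_http_archive=false \
  --copt="-std=gnu99" \
  --define=grpc_no_ares=true \
  --config=noaws \
  //tensorflow:libtensorflow.so

# Dry run: show the assembled command rather than executing it.
cmd="bazel $*"
printf '%s\n' "$cmd"
```

To run the build for real, the last two lines would be replaced with `bazel "$@"`.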

mixaz commented

"I also tried the instructions here, but ran into the following errors with TF v1.12.0:"

I'm also facing this issue when cross-compiling for RPI, per docs. Seems to be another issue.

Any ideas how to solve that "missing toolchain for armeabi" issue?

Closing this issue since the RPI build is now working on master via the cross-compiling workflow. Thank you TF team!

I got the same problem when compiling 2.0.0-alpha on FreeBSD.

Is AWS support intended to be working at all?