tensorflow/serving

Issues Installing TF Serving on Jetson TX2

booglerz opened this issue · 7 comments

Hello,

Since our TF models rely heavily on unsupported TF layers, converting them to UFF for TensorRT does not seem feasible. Instead, we were thinking of trying to get TensorFlow Serving working on the Jetson, to act as a mini server for model inference.

Has anyone done this yet, or does anyone know of people who have? I've seen examples of installing TensorFlow on the Jetson, so I assumed it might be possible to install TensorFlow Serving as well.

However, I run into issues building TF Serving with Bazel, and have exhausted my ability to narrow down the problem.

So far I have:
- Installed all prerequisites
- Installed Bazel
- Cloned TF Serving and attempted to build it from source

The failure looks similar to the memory issues I've seen reported around the forums and GitHub pages (see below). I have tried limiting the resources used during the build, but nothing works (e.g., bazel build --jobs 1 --local_resources 1024,1.0,1.0 --verbose_failures tensorflow_serving/...).

The error I keep getting is:
Linking of rule '//tensorflow_serving/model_servers:tensorflow_model_server' failed (Exit 1).
bazel-out/local-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/client/ClientConfiguration.o:ClientConfiguration.cpp:function Aws::Client::ComputeUserAgentString(): error: undefined reference to 'Aws::OSVersionInfo::ComputeOSVersionString[abi:cxx11]()'

collect2: error: ld returned 1 exit status

Does anyone have experience attempting / successfully installing TensorFlow Serving on a Jetson?

Any clue why my build is failing?

Full error if it helps:

ERROR: /home/nvidia/serving/tensorflow_serving/model_servers/BUILD:205:1: Linking of rule '//tensorflow_serving/model_servers:tensorflow_model_server' failed (Exit 1): gcc failed: error executing command
(cd /home/nvidia/.cache/bazel/_bazel_root/e07dd11400dd0f5e80daa6d5086c0965/execroot/tf_serving &&
exec env -
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
PWD=/proc/self/cwd
PYTHON_BIN_PATH=/usr/bin/python
/usr/bin/gcc -o bazel-out/arm-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread '-fuse-ld=gold' -Wl,-no-as-needed -Wl,-z,relro,-z,now -B/usr/bin -B/usr/bin -pass-exit-codes -Wl,--gc-sections -Wl,@bazel-out/arm-opt/bin/tensorflow_serving/model_servers/tensorflow_model_server-2.params)
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/client/ClientConfiguration.o:ClientConfiguration.cpp:function Aws::Client::ComputeUserAgentString(): error: undefined reference to 'Aws::OSVersionInfo::ComputeOSVersionString[abi:cxx11]()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/utils/DateTimeCommon.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ToLocalTimeString[abi:cxx11](char const*) const: error: undefined reference to 'Aws::Time::LocalTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/utils/DateTimeCommon.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ToGmtString[abi:cxx11](char const*) const: error: undefined reference to 'Aws::Time::GMTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/utils/DateTimeCommon.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::CalculateCurrentHour(): error: undefined reference to 'Aws::Time::LocalTime(tm*, long)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/utils/DateTimeCommon.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ConvertTimestampStringToTimePoint(char const*, Aws::Utils::DateFormat): error: undefined reference to 'Aws::Time::TimeGM(tm*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/utils/TempFile.o:TempFile.cpp:function Aws::Utils::TempFile::~TempFile(): error: undefined reference to 'Aws::FileSystem::RemoveFileIfExists(char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/utils/TempFile.o:TempFile.cpp:function Aws::Utils::ComputeTempFileName(char const*, char const*): error: undefined reference to 'Aws::FileSystem::CreateTempFilePath[abi:cxx11]()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv[abi:cxx11](char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv[abi:cxx11](char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv[abi:cxx11](char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetConfigProfileFilename[abi:cxx11](): error: undefined reference to 'Aws::FileSystem::GetHomeDirectory[abi:cxx11]()'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetCredentialsProfileFilename[abi:cxx11](): error: undefined reference to 'Aws::Environment::GetEnv[abi:cxx11](char const*)'
bazel-out/arm-opt/bin/external/aws/_objs/aws/external/aws/aws-cpp-sdk-core/source/auth/AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::ProfileConfigFileAWSCredentialsProvider::GetCredentialsProfileFilename[abi:cxx11](): error: undefined reference to 'Aws::FileSystem::GetHomeDirectory[abi:cxx11]()'
collect2: error: ld returned 1 exit status
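
A log like the one above is easier to read once the undefined symbols are grouped by namespace. The sketch below is only an illustration: the helper names (`missing_symbols`, `missing_namespaces`) are hypothetical, and the sample lines are excerpts of the log above with the leading bazel-out path trimmed. If every missing symbol falls under a few scopes such as Aws::Time or Aws::FileSystem, that would suggest some source files were left out of the link, rather than a memory problem.

```python
import re
from collections import Counter

# Captures the symbol in: error: undefined reference to '<symbol>'
UNDEFINED_REF = re.compile(r"undefined reference to '([^']+)'")


def missing_symbols(log_text):
    """Count how often each undefined symbol appears in linker output."""
    return Counter(UNDEFINED_REF.findall(log_text))


def missing_namespaces(log_text):
    """Group undefined symbols by their first two C++ scope components."""
    groups = Counter()
    for symbol, count in missing_symbols(log_text).items():
        # Drop the argument list and any ABI tag, e.g. "[abi:cxx11](char const*)".
        name = symbol.split("(")[0].split("[")[0]
        scope = "::".join(name.split("::")[:2])
        groups[scope] += count
    return groups


# Excerpts from the log above (leading bazel-out path trimmed).
sample = """\
DateTimeCommon.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ToLocalTimeString[abi:cxx11](char const*) const: error: undefined reference to 'Aws::Time::LocalTime(tm*, long)'
DateTimeCommon.o:DateTimeCommon.cpp:function Aws::Utils::DateTime::ToGmtString[abi:cxx11](char const*) const: error: undefined reference to 'Aws::Time::GMTime(tm*, long)'
TempFile.o:TempFile.cpp:function Aws::Utils::TempFile::~TempFile(): error: undefined reference to 'Aws::FileSystem::RemoveFileIfExists(char const*)'
AWSCredentialsProvider.o:AWSCredentialsProvider.cpp:function Aws::Auth::EnvironmentAWSCredentialsProvider::GetAWSCredentials(): error: undefined reference to 'Aws::Environment::GetEnv[abi:cxx11](char const*)'
"""

for scope, count in sorted(missing_namespaces(sample).items()):
    print(scope, count)
```

To use it on a real build, save the bazel output to a file and pass the file's contents to `missing_namespaces`.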

My workaround:

In the file
/home/<user_name>/.cache/bazel/_bazel_<user_name>/<hash>/external/aws/BUILD.bazel
(where <user_name> is your current Linux user name and
<hash> is a hash like de4a7858eac0c7de37e543fdc903ef12),

in the cc_library section (line 27 in my case), replace:
"//conditions:default": [],
with
"//conditions:default": glob(["aws-cpp-sdk-core/source/platform/linux-shared/*.cpp",]),

After that, the build succeeds on the Jetson.
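
For anyone who would rather script this edit than do it by hand, here is a minimal sketch. It assumes the first empty `"//conditions:default": [],` branch in the file is the one in the cc_library sources, which may not hold for every version of the file, so check the result before rebuilding. The function names and the example fragment are hypothetical and much simpler than the real BUILD.bazel.

```python
from pathlib import Path

# Text to find and its replacement, as described in the workaround above.
OLD = '"//conditions:default": [],'
NEW = ('"//conditions:default": '
       'glob(["aws-cpp-sdk-core/source/platform/linux-shared/*.cpp"]),')


def patch_aws_build(text):
    """Replace the first empty default branch with the linux-shared glob.

    Assumption: the first match is the cc_library sources branch.
    Raises ValueError when no match exists, so nothing is edited blindly.
    """
    if OLD not in text:
        raise ValueError("no empty //conditions:default branch found")
    return text.replace(OLD, NEW, 1)


def patch_file(path):
    """Apply patch_aws_build to the file at `path`, rewriting it in place."""
    build = Path(path)
    build.write_text(patch_aws_build(build.read_text()))


# Hypothetical, heavily simplified fragment used only to show the effect.
example = '''cc_library(
    name = "aws",
    srcs = select({
        "//conditions:default": [],
    }),
)
'''
print(patch_aws_build(example))
```

Call `patch_file` with the BUILD.bazel path from the workaround, substituting your own user name and hash. Because the file lives in Bazel's cache, the edit may be lost if Bazel fetches the external repository again.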

@booglerz - Hi, is this still an issue? Did the workaround help you resolve it?

A workaround has been provided, and this issue has been awaiting a response for more than 7 days, so I am closing it.

lgeo3 commented

@booglerz - Hi, is this still an issue? Did the workaround help you resolve it?

The workaround helped me solve the issue. Thanks.
I think the issue should be reopened until a proper fix is provided in the upstream code.

Noting this as a duplicate of #1277 for prioritization

Hi @mrodozov, has this been pulled into the main TensorFlow repo?