JasonAtNvidia/JetsonTFBuild

error: patch failed

Closed this issue · 3 comments

Hello,

I tried to run BuildTensorflow.sh on my TX2 flashed with JetPack 3.2 using sudo bash BuildTensorflow.sh. I also changed PYTHON_BIN_PATH=$(which python3) to build for Python 3. The build completed, but installing the resulting wheel failed:

In file included from ./tensorflow/core/platform/default/logging.h:24:0,
from ./tensorflow/core/platform/logging.h:25,
from ./tensorflow/core/lib/core/status.h:25,
from ./tensorflow/core/lib/core/errors.h:21,
from ./tensorflow/core/framework/tensor_shape.h:23,
from ./tensorflow/core/framework/partial_tensor_shape.h:20,
from ./tensorflow/core/framework/attr_value_util.h:23,
from ./tensorflow/core/framework/function.h:21,
from ./tensorflow/core/graph/graph.h:43,
from ./tensorflow/core/graph/costmodel.h:25,
from ./tensorflow/core/common_runtime/costmodel_manager.h:22,
from ./tensorflow/core/distributed_runtime/graph_mgr.h:22,
from ./tensorflow/core/distributed_runtime/worker.h:21,
from tensorflow/core/distributed_runtime/worker.cc:16:
./tensorflow/core/util/tensor_format.h: In instantiation of 'T tensorflow::GetTensorDim(tensorflow::gtl::ArraySlice, tensorflow::TensorFormat, char) [with T = long long int]':
./tensorflow/core/util/tensor_format.h:409:47: required from here
./tensorflow/core/util/tensor_format.h:377:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
CHECK(index >= 0 && index < dimension_attributes.size())
^
./tensorflow/core/platform/macros.h:87:47: note: in definition of macro 'TF_PREDICT_FALSE'
#define TF_PREDICT_FALSE(x) (__builtin_expect(x, 0))
^
./tensorflow/core/util/tensor_format.h:377:3: note: in expansion of macro 'CHECK'
CHECK(index >= 0 && index < dimension_attributes.size())
^
./tensorflow/core/util/tensor_format.h: In instantiation of 'T tensorflow::GetFilterDim(tensorflow::gtl::ArraySlice, tensorflow::FilterTensorFormat, char) [with T = long long int]':
./tensorflow/core/util/tensor_format.h:418:54: required from here
./tensorflow/core/util/tensor_format.h:392:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
CHECK(index >= 0 && index < dimension_attribute.size())
^
./tensorflow/core/platform/macros.h:87:47: note: in definition of macro 'TF_PREDICT_FALSE'
#define TF_PREDICT_FALSE(x) (__builtin_expect(x, 0))
^
./tensorflow/core/util/tensor_format.h:392:3: note: in expansion of macro 'CHECK'
CHECK(index >= 0 && index < dimension_attribute.size())
^
Target //tensorflow/tools/pip_package:build_pip_package up-to-date:
bazel-bin/tensorflow/tools/pip_package/build_pip_package
INFO: Elapsed time: 12038.586s, Critical Path: 592.15s
INFO: 5663 processes, local.
INFO: Build completed successfully, 7591 total actions
Fri May 25 06:54:02 UTC 2018 : === Using tmpdir: /tmp/tmp.GAyvX1ei0o
~/JetsonTFBuild/TensorFlow_Install/tensorflow/bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles ~/JetsonTFBuild/TensorFlow_Install/tensorflow
~/JetsonTFBuild/TensorFlow_Install/tensorflow
/tmp/tmp.GAyvX1ei0o ~/JetsonTFBuild/TensorFlow_Install/tensorflow
Fri May 25 06:54:12 UTC 2018 : === Building wheel
warning: no files found matching '*.dll' under directory '*'
warning: no files found matching '*.lib' under directory '*'
warning: no files found matching '*' under directory 'tensorflow/aux-bin'
warning: no files found matching '*.h' under directory 'tensorflow/include/tensorflow'
warning: no files found matching '*' under directory 'tensorflow/include/Eigen'
warning: no files found matching '*' under directory 'tensorflow/include/external'
warning: no files found matching '*.h' under directory 'tensorflow/include/google'
warning: no files found matching '*' under directory 'tensorflow/include/third_party'
warning: no files found matching '*' under directory 'tensorflow/include/unsupported'
~/JetsonTFBuild/TensorFlow_Install/tensorflow
Fri May 25 06:54:55 UTC 2018 : === Output wheel file is in: /home/nvidia/JetsonTFBuild/TensorFlow_Install/tensorflow_pkg
The directory '/home/nvidia/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/nvidia/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
tensorflow-1.8.0-cp35-cp35m-linux_aarch64.whl is not a supported wheel on this platform.
You are using pip version 8.1.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
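
A note on the "is not a supported wheel on this platform" message: pip prints it when the tags in the wheel filename do not match the interpreter that pip belongs to. One possible cause here, not confirmed in this thread, is that the wheel was built for Python 3.5 (cp35) while the pip that installed it belongs to a different Python. The sketch below splits the filename into its tags and compares them with the running interpreter. The helper names are hypothetical, and the comparison is deliberately simplified (it ignores tags such as py2.py3 and any).

```python
import platform
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into name, version and its three tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform.
    # The last three fields are always the tags.
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: %r" % filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def interpreter_tag():
    """CPython-style tag of the running interpreter, e.g. 'cp35'."""
    return "cp%d%d" % (sys.version_info[0], sys.version_info[1])


def explain_mismatch(filename):
    """List simplified reasons this interpreter might reject the wheel."""
    tags = parse_wheel_tags(filename)
    reasons = []
    if tags["python"] != interpreter_tag():
        reasons.append("wheel targets %s, interpreter is %s"
                       % (tags["python"], interpreter_tag()))
    machine = platform.machine()
    if not tags["platform"].endswith(machine):
        reasons.append("wheel targets %s, machine is %s"
                       % (tags["platform"], machine))
    return reasons


if __name__ == "__main__":
    for reason in explain_mismatch(
            "tensorflow-1.8.0-cp35-cp35m-linux_aarch64.whl"):
        print(reason)
```

Running this with the same interpreter that pip uses shows whether the Python version or the architecture is the mismatch. If the Python tag differs, installing with the matching pip (for example via python3 -m pip) may be worth trying.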

So I upgraded pip and ran the script again with sudo -H, and also as root, but I still get an error:

root@tegra-ubuntu:~# cd '/home/nvidia/JetsonTFBuild' 
root@tegra-ubuntu:/home/nvidia/JetsonTFBuild# bash BuildTensorflow.sh 

This bash script will install TensorFlow
branch on a Jetson system that has been setup
by Jetpack with CUDA and cuDNN already installed.

If this is not the case then this script will
likely fail

Expect this script to take up to 6+ hours

Writen by: Jason Tichy < jtichy@nvidia.com >
Version 1.0: Jan 3rd, 2018
Version 1.1: Mar 30, 2018 Added TensorRT support

Note: TF v 1.7.0 release contains a bug for arm
because of a hardcoded x86 path in the TensorRT
Bazel script, you will need to use master to
build with TensorRT support
Get:1 file:/var/cuda-repo-9-0-local InRelease
Ign:1 file:/var/cuda-repo-9-0-local InRelease
Get:2 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 InRelease
Ign:2 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 InRelease
Get:3 file:/var/visionworks-repo InRelease
Ign:3 file:/var/visionworks-repo InRelease
Get:4 file:/var/visionworks-sfm-repo InRelease
Ign:4 file:/var/visionworks-sfm-repo InRelease
Get:5 file:/var/visionworks-tracking-repo InRelease
Ign:5 file:/var/visionworks-tracking-repo InRelease
Get:6 file:/var/cuda-repo-9-0-local Release [574 B]
Get:7 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release [574 B]
Get:8 file:/var/visionworks-repo Release [1,999 B]
Get:9 file:/var/visionworks-sfm-repo Release [2,003 B]
Get:6 file:/var/cuda-repo-9-0-local Release [574 B]
Get:10 file:/var/visionworks-tracking-repo Release [2,008 B]
Get:7 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release [574 B]
Get:11 file:/var/cuda-repo-9-0-local Release.gpg [819 B]
Get:8 file:/var/visionworks-repo Release [1,999 B]
Get:12 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release.gpg [819 B]
Get:9 file:/var/visionworks-sfm-repo Release [2,003 B]
Get:10 file:/var/visionworks-tracking-repo Release [2,008 B]
Get:11 file:/var/cuda-repo-9-0-local Release.gpg [819 B]
Get:12 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release.gpg [819 B]
Ign:11 file:/var/cuda-repo-9-0-local Release.gpg
Ign:12 file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release.gpg
Hit:16 http://ports.ubuntu.com/ubuntu-ports xenial InRelease
Hit:17 http://ports.ubuntu.com/ubuntu-ports xenial-updates InRelease
Hit:18 http://ports.ubuntu.com/ubuntu-ports xenial-security InRelease
Reading package lists... Done
W: GPG error: file:/var/cuda-repo-9-0-local Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F60F4B3D7FA2AF80
W: The repository 'file:/var/cuda-repo-9-0-local Release' is not signed.
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
N: See apt-secure(8) manpage for repository creation and user configuration details.
W: GPG error: file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY F60F4B3D7FA2AF80
W: The repository 'file:/var/nv-tensorrt-repo-ga-cuda9.0-trt3.0.4-20180208 Release' is not signed.
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
N: See apt-secure(8) manpage for repository creation and user configuration details.
Reading package lists... Done
Building dependency tree
Reading state information... Done
mlocate is already the newest version (0.26-1ubuntu2).
ncdu is already the newest version (1.11-1build1).
htop is already the newest version (2.0.1-1ubuntu1).
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.
Build label: 0.13.0- (@non-git)
Build target: bazel-out/arm-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri May 25 03:18:20 2018 (1527218300)
Build timestamp: 1527218300
Build timestamp as int: 1527218300
M tensorflow/contrib/lite/kernels/internal/BUILD
M third_party/png.BUILD
Already on 'master'
Your branch is up-to-date with 'origin/master'.
error: patch failed: tensorflow/contrib/lite/kernels/internal/BUILD:20
error: tensorflow/contrib/lite/kernels/internal/BUILD: patch does not apply
error: patch failed: third_party/png.BUILD:35
error: third_party/png.BUILD: patch does not apply

@JasonAtNvidia in the section "To do" you mention:

Include a command line option to build for Python 3

Did I forget to do something?
Thanks for your help

I am experiencing the same problem with the 1.8 wheel file not being supported.
However, I can help with the patch error and get you back on track to figuring out the wheel file issue. Follow the steps below.

From the terminal, cd to the TensorFlow_Install directory, then enter

sudo swapoff swapfile.swap

That should take a few moments to turn off the swap file. Then enter

sudo swapoff -a
sudo rm -rf swapfile.swap

Finally, cd back to the JetsonTFBuild directory and enter

sudo rm -rf TensorFlow_Install

After you do all of that, you can attempt to install TensorFlow again.
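
A likely reason the cleanup helps, inferred from the log rather than stated in the thread: the second run reported both BUILD files as already modified ("M ...") before the patch step, which suggests the patch had been applied by the first run and could not be applied again. Deleting TensorFlow_Install forces a fresh clone. The sketch below cross-checks the two kinds of log lines; the function names are hypothetical.

```python
import re


def already_modified(log_lines):
    """Paths that git reported as locally modified ('M path')."""
    paths = set()
    for line in log_lines:
        match = re.match(r"^M\s+(\S+)$", line.strip())
        if match:
            paths.add(match.group(1))
    return paths


def failed_patch_targets(log_lines):
    """Paths named in 'error: patch failed: path:line' messages."""
    paths = set()
    for line in log_lines:
        match = re.match(r"^error: patch failed: (\S+):\d+$", line.strip())
        if match:
            paths.add(match.group(1))
    return paths


def probably_already_patched(log_lines):
    """Files that were modified before the patch ran and then failed."""
    return sorted(already_modified(log_lines)
                  & failed_patch_targets(log_lines))


if __name__ == "__main__":
    log = [
        "M tensorflow/contrib/lite/kernels/internal/BUILD",
        "M third_party/png.BUILD",
        "Already on 'master'",
        "error: patch failed: tensorflow/contrib/lite/kernels/internal/BUILD:20",
        "error: patch failed: third_party/png.BUILD:35",
    ]
    for path in probably_already_patched(log):
        print(path)
```

If every failed target also appears in the modified list, a leftover patched checkout is the probable cause, and removing the directory as described above is a reasonable fix.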

Thank you @EmpireofKings I installed tensorflow again as root and it was successful. However I had an error when I tried to run:

import tensorflow as tf 
a = tf.constant(10) 
b = tf.constant(32) 
sess = tf.Session() 
sess.run(a + b) 

"CUDA_ERROR_OUT_OF_MEMORY"

The only way I found to avoid this error was to manually set the memory used by tensorflow.

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7) 
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

Do you know a way so I do not have to allocate the memory manually?

@valentindbdg The memory issue comes from the Jetson design more than from TensorFlow. Jetsons are designed so that the GPU and CPU share the same memory. When TensorFlow queries the available memory, it can only find what the CPU hasn't already taken. Manually setting the fraction actually helps because it forces the system to give that memory to the GPU, and the CPU goes and finds memory somewhere else.
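
Since the CPU and GPU draw from the same pool, one way to avoid hard-coding 0.7 is to derive the fraction from what the system currently reports as free. The sketch below is a heuristic of my own, not something proposed in the thread. It reads MemTotal and MemAvailable in the /proc/meminfo format and leaves some headroom for the CPU. The function names and the headroom and ceiling defaults are assumptions.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def suggest_gpu_fraction(meminfo, headroom=0.1, ceiling=0.9):
    """Fraction of total memory that is currently available, minus headroom.

    headroom and ceiling are arbitrary illustrative defaults.
    """
    total = float(meminfo["MemTotal"])
    available = float(meminfo["MemAvailable"])
    fraction = available / total - headroom
    # Clamp so the result is always a usable value between 0 and ceiling.
    return max(0.0, min(ceiling, fraction))


# Example use on the device (not executed here):
#   with open("/proc/meminfo") as handle:
#       fraction = suggest_gpu_fraction(parse_meminfo(handle.read()))
#   gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=fraction)
```

The result can be passed as per_process_gpu_memory_fraction in the tf.GPUOptions call shown earlier. TensorFlow 1.x also has an allow_growth field on GPUOptions that allocates GPU memory on demand; whether it avoids the out-of-memory error on a Jetson is not established in this thread.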