larq/compute-engine

manylinux2010 building fails

bferrarini opened this issue · 3 comments

Hi,

I need to build the python wrapper for LCE.
I tried the procedure described here, without success.

I set the Python version (3.7 in my case) and started the script .github/tools/release_linux.sh. After a while I get the following errors:

ERROR: /root/.cache/bazel/_bazel_root/aa829f8eaa4f35b2fc6d736b3640d755/external/org_tensorflow/tensorflow/core/BUILD:1650:1: C++ compilation of rule '@org_tensorflow//tensorflow/core:framework_internal_impl' failed (Exit 4)
In file included from external/org_tensorflow/third_party/eigen3/unsupported/Eigen/CXX11/FixedPoint:46:0,
from external/org_tensorflow/tensorflow/core/framework/numeric_types.h:24,
from external/org_tensorflow/tensorflow/core/framework/allocator.h:26,
from external/org_tensorflow/tensorflow/core/framework/tensor.h:23,
from external/org_tensorflow/tensorflow/core/util/batch_util.h:18,
from external/org_tensorflow/tensorflow/core/util/batch_util.cc:16:

The process ends up like this:
ERROR: /tmp/lce-volume/larq_compute_engine/mlir/BUILD:263:1 C++ compilation of rule '@org_tensorflow//tensorflow/core:framework_internal_impl' failed (Exit 4)
INFO: Elapsed time: 645.690s, Critical Path: 285.63s
INFO: 800 processes: 800 local.
FAILED: Build did NOT complete successfully

Here is some data on my environment:

  • Debian 10 64 bit (VMWare virtual machine)
  • python 3.7
  • Docker installed and working (I successfully built lce_benchmark_model)
  • LCE source version is the latest (master branch)

Regards,

Bruno

Hi Bruno,

Thanks for your interest in LCE and for trying to build it. I just did the same steps on my machine in the same Docker container (although with python 3.8) on the master branch, but I didn't get any error like this.

It might be possible that the compiler runs out of memory, because depending on the gcc version it can use more than 16 GB of RAM. Are there any other errors shown near this error that you got?

ERROR: /root/.cache/bazel/_bazel_root/aa829f8eaa4f35b2fc6d736b3640d755/external/org_tensorflow/tensorflow/core/BUILD:1650:1: C++ compilation of rule '@org_tensorflow//tensorflow/core:framework_internal_impl' failed (Exit 4)

If it is a memory issue, you could try creating a file .bazelrc.user in the compute-engine main directory and adding something like this to it:

build --local_ram_resources=HOST_RAM*.20
build --local_cpu_resources=2
startup --host_jvm_args=-Xmx1g

Let me know if that works!

Best regards,
Tom

Hi Tom,

Lack of memory was indeed the problem.
I experimented with .bazelrc.user and the VM settings and found a setup that worked for me.

I increased the VM memory from 4GB to 8GB and set the Bazel parameters as follows.

build --local_ram_resources=HOST_RAM*.80 
build --local_cpu_resources=2
startup --host_jvm_args=-Xmx2g
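
To put the two configurations in numbers, here is a rough sketch of the RAM budget each `--local_ram_resources=HOST_RAM*<fraction>` setting gives Bazel. The helper name `bazel_ram_budget_mb` is made up for illustration, and the sketch assumes the 20% setting would have been applied to the original 4 GB VM. Note that this value is the amount of RAM Bazel assumes it may use when scheduling local actions, not a hard limit on the compiler, and the JVM heap set with -Xmx is configured separately.

```python
def bazel_ram_budget_mb(host_ram_gb: float, fraction: float) -> float:
    """Approximate RAM in MB that Bazel budgets for local actions
    when given --local_ram_resources=HOST_RAM*<fraction>.

    Hypothetical helper for illustration only; it just does the arithmetic.
    """
    return host_ram_gb * 1024 * fraction


# Suggested setting, assumed to be applied to the original 4 GB VM.
suggested_mb = bazel_ram_budget_mb(4, 0.20)

# Setting that worked, applied to the enlarged 8 GB VM.
working_mb = bazel_ram_budget_mb(8, 0.80)

print(f"HOST_RAM*.20 on 4 GB: about {suggested_mb:.0f} MB")
print(f"HOST_RAM*.80 on 8 GB: about {working_mb:.0f} MB")
```

With --local_cpu_resources=2 in both cases, at most two compile actions run in parallel, so the larger VM leaves considerably more headroom per compiler process.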

Many thanks for your help.

Kind Regards,

Bruno

Great to hear that! If you have any other issues, feel free to open another GitHub issue.