Ekumen-OS/beluga

Beluga debians fail to build for Noetic on `armhf`

Opened this issue · 5 comments

Bug description

Platform (please complete the following information):

  • OS: Ubuntu Focal
  • Beluga version: 2.0.1

How to reproduce

See https://build.ros.org/job/Nbin_ufhf_uFhf__beluga__ubuntu_focal_armhf__binary/9/consoleFull.

Expected behavior
Build completes successfully.

Actual behavior
Build fails with tons of compiler errors.

Error from console log:

cc1plus: error: unrecognized command line option ‘-Wno-gnu-zero-variadic-macro-arguments’ [-Werror]

Seems like we can remove that argument. Unfortunately, I wasn't able to reproduce the failure using the buildfarm scripts for this platform. Something fails before reaching the build step.

There is another spammy message that is safe to ignore:

note: parameter passing for argument of type ‘...’ changed in GCC 7.1

Ah, there is also another easy-to-miss error showing in the logs:

/tmp/binarydeb/ros-noetic-beluga-2.0.1/include/beluga/sensor/ndt_sensor_model.hpp:265:43: error: conversion from ‘std::array<long long unsigned int, 2>::value_type’ {aka ‘long long unsigned int’} to ‘std::vector<Eigen::Array<double, 2, 2> >::size_type’ {aka ‘unsigned int’} may change value [-Werror=conversion]
11:13:13   265 |   std::vector<Eigen::Array<double, 2, 2>> covariances(dims_out[0]);
11:13:13       |                                           ^~~~~~~~~~~

Re-opening this one as we got new results from the Buildfarm with version 2.0.2.

https://build.ros.org/job/Nbin_ufhf_uFhf__beluga__ubuntu_focal_armhf__binary/35/consoleFull

03:10:37 /tmp/binarydeb/ros-noetic-beluga-2.0.2/include/beluga/views/take_evenly.hpp:67:60: error: conversion from ‘long long unsigned int’ to ‘std::size_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
03:10:37    67 |       const std::size_t m0 = (index - 1UL) * (count - 1UL) / (size - 1UL);
03:10:37       |                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
03:10:37 /tmp/binarydeb/ros-noetic-beluga-2.0.2/include/beluga/views/take_evenly.hpp:68:52: error: conversion from ‘long long unsigned int’ to ‘std::size_t’ {aka ‘unsigned int’} may change value [-Werror=conversion]
03:10:37    68 |       const std::size_t m1 = index * (count - 1UL) / (size - 1UL);
03:10:37       |     

A couple more errors related to size_t being unsigned int in this platform.

The fix is easy but it would be wise to make sure that the build can be created locally before submitting a new release.

Unfortunately, I'm hitting this bug when trying to reproduce locally using QEMU: https://gitlab.kitware.com/cmake/cmake/-/issues/20568

My plan is to bump CMake from 3.16 to 3.19 which seems like it will fix the CMake bug and then fix the remaining size_t compatibility bugs. We should probably bump the priority of #382 if we want to prevent regressions going forward.

FYI @hidmic @glpuga

@nahueespinosa +1 to addressing #382, but do note that bumping to CMake 3.19 means we'll no longer be compatible with Ubuntu Focal nor ROS Noetic, by extension (see packages.ubuntu.com and REP-3). IIUC there are no armhf builds in the ROS 2 buildfarm because it's a Tier 3 supported architecture (and thus it is only available when building from source, see REP-2000).

All in all, I wouldn't go to great lengths to support armhf. To the best of my knowledge, 32 bit processor architectures have been confined to embedded systems (though I'm sure @glpuga will know better). Until we get serious about supporting embedded applications, I wouldn't spend too many cycles on it.

Good point, I guess I can try to reproduce issues locally with CMake 3.19 and submit the fixes without actually bumping the version in the repository.

Also we can keep 3.16 as the minimum required version, but use 3.19 to test armhf on CI using QEMU.