pypa/manylinux

DISCUSS: manylinux_2_24 EOL?

h-vetinari opened this issue · 29 comments

manylinux_2_24 is based on Debian 9, which reaches the end of its long-term-support at the end of this month, see e.g. here.

This is obviously out-of-sync with the much longer support of the CentOS-based images, e.g. manylinux2010 being deprecated this year (#1281), and the EOL of manylinux2014 currently noted for 2024-06-30 (see link above).

manylinux_2_24 is noted as supported in the README, but is not listed among the "Acceptable distros to build wheels" in the pep-compliance repo.

Finally, the lack of gcc-toolset (see #1012) makes 2_24 seriously unappealing; cf. the linked issues in that thread for people staying on manylinux2014 because it has newer compilers. See the outlier in the table below.

manylinux vs GCC version

| manylinux version | GCC version |
|-------------------|-------------|
| 2010              | 8.3         |
| 2014              | 10.2        |
| 2_24              | 6.3         |
| 2_28              | 11.1        |

All things considered, should manylinux_2_24 support expire at the end of the month? Has there been deployment in any appreciable numbers?

What do people think?

The precarious dip in GCC version for 2_24 means that until this image is EOL (or #1012 solved, which doesn't look likely), scipy cannot move beyond GCC 6. From that POV, I wouldn't shed many tears if 2_24 already went the way of the dodo.

Agreed. It can be deprecated.

As a user, I think 2_24 is useless: I can always use 2014 in any situation where I would want to use 2_24, but not vice versa.

I'm okay to start a deprecation process for this one given all its shortcomings.

Using https://github.com/sethmlarson/pypi-data:
There are 275 packages out there that used or are still using manylinux_2_24.

sqlite3 'pypi.db' 'SELECT name FROM wheels WHERE platform LIKE "%manylinux_2_24%" GROUP BY name;' | wc -l
275
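For anyone wanting to repeat this count from Python, here is a minimal sketch. It assumes only what the query above shows (a `wheels` table with `name` and `platform` columns); the rows inserted below are invented toy data, not real PyPI contents. One detail worth knowing: `_` is a single-character wildcard in SQL `LIKE`, so the sketch escapes it to match the tag literally.

```python
import sqlite3


def count_packages(conn, platform_tag):
    """Count distinct package names with a wheel whose platform contains the tag."""
    # Escape LIKE wildcards so that "_" in the tag only matches a literal "_".
    escaped = (
        platform_tag.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    row = conn.execute(
        "SELECT COUNT(DISTINCT name) FROM wheels "
        "WHERE platform LIKE ? ESCAPE '\\'",
        (f"%{escaped}%",),
    ).fetchone()
    return row[0]


# Toy in-memory database standing in for pypi.db (all rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wheels (name TEXT, platform TEXT)")
conn.executemany(
    "INSERT INTO wheels VALUES (?, ?)",
    [
        ("pkg-a", "manylinux_2_24_x86_64"),
        ("pkg-a", "manylinux_2_24_aarch64"),
        ("pkg-b", "manylinux2014_x86_64"),
        ("pkg-c", "manylinux_2_24_x86_64.manylinux_2_28_x86_64"),
        ("pkg-d", "manylinux_2x24_x86_64"),  # would match an unescaped "_"
    ],
)

print(count_packages(conn, "manylinux_2_24"))
```

Pointing `sqlite3.connect` at a downloaded `pypi.db` instead of `:memory:` should give the real count, provided the schema still matches.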

Let's ask the maintainers of projects in the top 5000 packages that use manylinux_2_24:

psycopg-binary / psycopg2-binary: @dvarrazzo

Hi there,

For psycopg, 2_24 is the preferred build platform, because it ships with libssl 1.1.x, as opposed to CentOS 7, which ships with 0.9.x; that version is extremely buggy and segfaults if psycopg is used together with the Python ssl module (psycopg/psycopg2#543). It is the reason we stopped shipping wheel packages with the psycopg2 distribution and came out with the psycopg2-binary distribution hack.

Because of the above, when building on manylinux2014, we need to build from scratch several libraries (libpq, libssl, ldap, sasl).

psycopg2-binary is built with manylinux2014 on i686 and x86_64, with 2_24 on aarch64 and ppc64le platforms. Building on 2014 predates 2_24, otherwise I'd be happy to ditch 2014 altogether and just build the packages using the Debian system packages. That would make me happy for 5 minutes, after which a mob with pitchforks will start knocking at my door because of the loss of backward compatibility, of course, putting a quick end to my happiness.

psycopg-binary (which is a C speedup module for Psycopg 3, not a complete distribution as in psycopg2) is only built with 2_24. This is not immune to problems (psycopg/psycopg#124), but it provides a much simpler build chain. Build time is already pretty long as it is on aarch64/ppc64le (around 20 mins), and if we had to build the libraries it depends on, it would likely go up into the hours.

Dropping 2_24 will hence cause me a few days of work to port the library builds to CentOS, which I'm not 100% sure will work on aarch/ppc. I will be grumpy for a few days, then probably I'll get over it.

Should it happen, I would like to know ASAP, because I would like to release psycopg 3.1 in a few weeks and it would be wise to choose which build platform to use before that rather than in a bugfix release, because the libpq has several build-time parameters and subtle incompatibilities between the packaged versions and the own-built versions are likely (see psycopg/psycopg2#1365 for example).

Thank you for the heads up!

cryptography builds on manylinux2014 images in addition to 2_24 (on both x86_64 and aarch64) partially to see what percentages of our user base can adopt newer versions. Currently over 60% of our downloads come from the 2_24 wheels, but any platform that understands 2_24 will work with 2014 so from an operational perspective dropping 2_24 isn't a big deal to pyca projects.

That said, is the intent to have only 2014 and a new 2_28? In the past we've found that as the images age they become increasingly difficult to use sanely even if the distribution is still technically supported, and making manylinux2014 be the only option for broad compatibility across the Python user base is a bit scary. I know that volunteer resources are highly limited though, so that may be the only feasible path.

Thanks for tagging us 😄

alex commented

Yes, to me the question is what we are deprecating it in favor of: something newer or something older? I think it'd be unfortunate to remove the newest manylinux image, but it would be perfectly fine to replace it with something newer.

That said, is the intent to have only 2014 and a new 2_28?

I think so (the roll-out of 2_28 is almost complete, see #1282). This would also continue the pattern of the previous images being based (respectively) on RHEL 5, 6, 7, and 2_28 being based on RHEL 8 (through AlmaLinux as CentOS is now Stream).

Looking at https://mayeut.github.io/manylinux-timeline/, the "glibc readiness" per python version looked as follows on the 1st of June:

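| python version | 2.24+ | 2.26+ | 2.27+ | 2.28+ |
|----------------|-------|-------|-------|-------|
| 3.7            | 95.7% | 93.4% | 62.8% | 30.4% |
| 3.8            | 98.9% | 98.6% | 95.7% | 79.7% |
| 3.9            | 99.3% | 99.2% | 98.4% | 97.3% |
| 3.10           | 99.6% | 99.6% | 99.4% | 95.9% |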

Given that python 3.7 is already in security-only release mode (and was already dropped at the end of '21 by libraries following NEP29), it IMHO looks like jumping straight to 2_28 should be mostly feasible.**

** note, I took glibc readiness because that is harder to change. Using manylinux_2_x also needs a new enough pip which is not installed everywhere out of the box (see "policy readiness" as opposed to "glibc readiness"), but at least pip can be updated.
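"glibc readiness" boils down to one comparison: a system can load a manylinux wheel only if its glibc is at least as new as the one named in the tag. Below is a minimal sketch of that check. The legacy aliases (manylinux1 = glibc 2.5, manylinux2010 = 2.12, manylinux2014 = 2.17) follow PEP 600; the function names are made up for illustration, and the sketch deliberately ignores the pip version ("policy readiness") and the architecture part of the tag.

```python
import re

# Legacy tag names and the glibc version each one stands for (per PEP 600).
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def required_glibc(policy):
    """Return the (major, minor) glibc version a manylinux policy requires."""
    if policy in LEGACY_ALIASES:
        return LEGACY_ALIASES[policy]
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)", policy)
    if match is None:
        raise ValueError(f"not a manylinux policy name: {policy!r}")
    return int(match.group(1)), int(match.group(2))


def glibc_ready(system_glibc, policy):
    """True if a system with this glibc version can use wheels for the policy."""
    return system_glibc >= required_glibc(policy)


# Example: a system with glibc 2.27 (e.g. Ubuntu 18.04).
for policy in ("manylinux2014", "manylinux_2_24", "manylinux_2_28"):
    print(policy, glibc_ready((2, 27), policy))
```

On a glibc-based system, `platform.libc_ver()` from the standard library reports the local version as a string, which could be parsed into the tuple used here.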

For pyjson5 I simply let cibuildwheel build all the wheels for "completeness". It does not need any feature not found in manylinux2014 or 2010. (Actually I did not realize that the "newer" platform tag uses an older GCC version.)

ijl commented

orjson now publishes manylinux_2_28 and manylinux2014, dropping manylinux_2_24. I initially had feedback from python3.10 users on Amazon Linux 2 (glibc 2.26?) and Ubuntu 18.04 (glibc 2.27?) that CPython wheels weren't available, so I've added manylinux2014 wheels on even the most recent Python.

I didn't know the approach was that manylinux2014 would be supported longer than manylinux_2_24. It's probably been mentioned elsewhere but I only read the README here--maybe adding it here would help others.

httpstan is now on manylinux_2_28. Thanks for the heads up.

pmdarima is now on manylinux_2_28 as well. Thank you!

Next bottleneck release will be on manylinux2014.

So far, it seems that there was no huge outcry about keeping the announced 2_24 EOL (not that removing 2_24 is something fun for anyone, but it's very heartening to see how quickly people responded to this discussion - thanks and 👍).

Note that I don't make any decisions/policy here, I just wanted to start the discussion based on the impending EOL. Perhaps some pypa-folks should chime in too (@mayeut, care to tag anyone in particular?)...

The next CVXPY and ecos-python releases will be on manylinux2014.

Thanks for all the quick feedback.

Perhaps some pypa-folks should chime in too (@mayeut, care to tag anyone in particular?)...

@henryiii, @lkollar might have some insight on this, @messense too

I'll try to address some of the comments made so far.

cryptography builds on manylinux2014 images in addition to 2_24 (on both x86_64 and aarch64) partially to see what percentages of our user base can adopt newer versions. Currently over 60% of our downloads come from the 2_24 wheels, but any platform that understands 2_24 will work with 2014 so from an operational perspective dropping 2_24 isn't a big deal to pyca projects.

@reaperhulk, out of curiosity, what are you using to get those stats? pypinfo, a custom BigQuery request, another tool? There might be other ways to get those (if they're not too inconvenient of course, and depending on the other part of "partially").

That said, is the intent to have only 2014 and a new 2_28? In the past we've found that as the images age they become increasingly difficult to use sanely even if the distribution is still technically supported, and making manylinux2014 be the only option for broad compatibility across the Python user base is a bit scary. I know that volunteer resources are highly limited though, so that may be the only feasible path.

If 2_24 gets deprecated then yes, we would only have 2014 and 2_28 (1 & 2010 are also still there as long as no volunteer resources are required and there's a demand for them; their base distros were EOL on 2017-03-31 & 2020-11-30).

Keeping 2_24 now requires almost no volunteer resources for the manylinux project itself. The only thing that's becoming a concern is CI time for arm64/ppc64le/s390x.

If the gap between 2014 and 2_28 is deemed too large and too risky, I'd rather we keep 2_24 for now than being asked to re-add it later on. However, when looking at "major" distros, there are only 3 in the gap 2_24/2_28:

  • debian 9 (i.e. 2_24 itself): it's worth mentioning that it will transition to ELTS from 2022-07-01 to 2027-06-30.
  • amazonlinux2 (glibc 2.26): EOL 2023-06
  • ubuntu18.04 (glibc 2.27): EOL 2023-04-30 / ELTS 2028-04-30
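To make the gap concrete, here is a small sketch of which wheel each of these three distros would end up with, depending on which policies a project publishes. The glibc versions are the ones listed above; the selection rule (take the newest policy the system's glibc satisfies) is a simplification of what an installer does, and `best_policy` is a made-up helper name.

```python
# glibc version required by each policy discussed in this thread.
POLICY_GLIBC = {
    "manylinux2014": (2, 17),
    "manylinux_2_24": (2, 24),
    "manylinux_2_28": (2, 28),
}

# The "gap" distros and their glibc versions, as listed above.
GAP_DISTROS = {
    "debian 9": (2, 24),
    "amazonlinux2": (2, 26),
    "ubuntu18.04": (2, 27),
}


def best_policy(system_glibc, published):
    """Newest published policy usable on this glibc, or None (source build)."""
    usable = [p for p in published if POLICY_GLIBC[p] <= system_glibc]
    return max(usable, key=POLICY_GLIBC.get, default=None)


scenarios = {
    "2014 + 2_24 + 2_28": ["manylinux2014", "manylinux_2_24", "manylinux_2_28"],
    "2014 + 2_28": ["manylinux2014", "manylinux_2_28"],
    "2_28 only": ["manylinux_2_28"],
}

for label, published in scenarios.items():
    for distro, glibc in GAP_DISTROS.items():
        print(f"{label:>20} | {distro:<12} -> {best_policy(glibc, published)}")
```

Under these assumptions, dropping 2_24 only hurts users on the gap distros when a project also stops publishing manylinux2014 wheels; otherwise they fall back to the 2014 wheel.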

@dvarrazzo, there's too much to just quote one thing and it seems you've already tried a great deal to work around every issue. There are too many specifics and I can't possibly give good advice on all the issues you're seeing. The only recommendation I dare to make, and it's only my 2 cents, is that if building dependencies from source is an option - and it seems you're using that for manylinux2014 - I'd probably go that way, always. It comes with its own caveats, as you know and mentioned. For build times, caching the built dependencies might help.
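One possible way to approach the caching idea, sketched here under assumptions: derive a cache key from everything that affects the built libraries (image policy, architecture, dependency versions, the build script itself), and rebuild only when the key changes. The function name, the dependency names and the version strings below are placeholders, not taken from any project's real build setup.

```python
import hashlib
import json


def dependency_cache_key(policy, arch, dep_versions, build_script):
    """Stable key that changes whenever an input to the dependency build changes."""
    payload = json.dumps(
        {
            "policy": policy,
            "arch": arch,
            "deps": dep_versions,
            # Hash the script so that edits to build flags invalidate the cache.
            "script_sha256": hashlib.sha256(build_script).hexdigest(),
        },
        sort_keys=True,  # make the key independent of dict ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Placeholder inputs for illustration only.
script = b"#!/bin/sh\n./configure && make && make install\n"
key = dependency_cache_key(
    "manylinux2014", "aarch64", {"libpq": "X.Y", "openssl": "A.B"}, script
)
print(key)
```

The key can then name a CI cache entry or a tarball of the install prefix, so the long builds on aarch64/ppc64le only happen when something relevant actually changes.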

Maybe a decent plan for deprecation would be to wait for amazonlinux2 EOL / ubuntu 18.04 EOL (not ELTS) and re-assess the situation around that time?

Hi @mayeut

at the moment it seems that our project is going in two different directions:

  • I have worked to move psycopg 3 from 2_24 to 2014 just in the last couple of days (psycopg/psycopg#124 (comment)). I needed to work a bit on the existing scripts, but the heavy lifting was done in psycopg 2, so the experience wasn't overly bad.
  • @bryanculver has, in the meantime, moved the psycopg 2 images using 2_24 to 2_28: psycopg/psycopg2#1459. It's work I appreciate and, because it targets platforms less conservative than x86_64, it might be worth pursuing that route.

I assume that both directions are valuable. I was personally happy to see the Debian build chain being introduced, mostly because I'm a day-to-day Ubuntu user and I find that environment more familiar. However, it seems clear that the CentOS-based build chains enjoy a longer lifetime and, in this context, that seems a very valuable characteristic. It was a bit of a disappointment to see 2_24 being sunsetted so early, but I understand the reasons behind it.

I agree with your observation that the gap between 2014 and 2_28 is so wide that removing 2_24 altogether would leave a scary chasm to straddle. If it's not a huge burden, it might be useful to leave it there while people decide where to go, one direction or the other, as the maintainers mentioned in this thread and I seem to be doing.

Thank you all for your great work!

@mayeut

For build times, caching the built dependencies might help.

Also yes, it is actually the last thing left to test for me. Building the package and its dependencies as 2014 now works on all the supported platforms but, as expected, build time is close to 2h on certain platforms.

I will probably finish with this last step and release 3.0.15 with 2014-only packages.

Just a dump of a few thoughts:

One of the main reasons to use 2_24 was _GLIBCXX_USE_CXX11_ABI. Since manylinux2014 is GCC 4.8 based, it has the old ABI; if you compile a package you either need the old ABI or you won't be compatible with other extensions (unless they also use the non-default ABI), whereas 2_24 always used the new ABI.

We have supported past-EOL before (manylinux1 was 4 or more years past EOL). Moving from 2_24 to 2_28 removes support for some distros - though that's mostly Ubuntu 18.04 AFAICT. RHEL 8+ is supported by both, and 7 is not supported by either. I don't think that's a big problem to lose 18.04 when we already had to give up CentOS 7.

I'm not sure how great this is long-term; having a newer GCC on an older system was really nice - this means we are now stuck with compilers that are as old as the system, while before we could cheat a bit. Besides CentOS, which is now useless for us, there are other RHEL OSS builds like Rocky Linux and Alma Linux. If they can use the RHEL dev toolset and come in all important arches (don't know if they do), might that be more useful for building? (It looks like 2_28 would be the name of it, though, which would be problematic since the Debian one is already out.) These also seem like rather short EoLs.

I'm not sure how great this is long-term; having a newer GCC on an older system was really nice - this means we are now stuck with compilers that are as old as the system, while before we could cheat a bit. Besides CentOS, which is now useless for us, there are other RHEL OSS builds like Rocky Linux and Alma Linux. If they can use the RHEL dev toolset and come in all important arches (don't know if they do), might that be more useful for building? (It looks like 2_28 would be the name of it, though, which would be problematic since the Debian one is already out.) These also seem like rather short EoLs.

@henryiii, a lot of this was discussed in #1282. Basically, we can get ABI-compatibility, a current devtoolset, long support & arch coverage as before, and this discussion has already yielded the availability of manylinux_2_28 images, which can presumably be extended to RHEL 9 (2_34) in the future.

manylinux_2_28 is Debian. Same issues that 2_24 has, just a bit newer and EoL is a bit further in the future. RHEL 8 is also 2_28, and if Alma or Rocky included all the arches we do, then that would also be 2_28. Though I guess it's fine if we did introduce a RHEL 8 based image with the same target, it would just need a different image name.

manylinux_2_28 is Debian.

No it isn't.

Ahhh, I was looking at

# default to latest supported policy, x86_64
ARG BASEIMAGE=amd64/debian:9
ARG POLICY=manylinux_2_24

(which now that I look at the third line, is still 2_24) Great!

Okay, I'll change

I don't think that's a big problem to lose 18.04 when we already had to give up CentOS 7.

to "I think it's a good idea". 2_28 is the natural "next" step after manylinux2014, and 2_24 was the odd-one-out on Debian. I know NumPy failed to use it for what they wanted to use it for (better alternate arch support) since the compilers were too old to support the alternate arches well. So +1 from me.

I'm fine with manylinux_2_24 EOL.

I mostly work on manylinux cross-compiling docker images. They are stuck on an old GCC toolchain because there are no devtoolset-based cross compilers, but since the docker images were built for building Python extensions from Rust crates, it's not a big issue: most dependencies are pure Rust.

Ahh, cross compiling was a reason to use 2_24 over manylinux2014, forgot about that (since you are generally stuck with the system compilers when cross compiling). But 2_28 is better than 2_24 for that anyway, and probably not much more restrictive at runtime.

@mayeut
The EOL for manylinux_2_24 as shown on https://github.com/mayeut/pep600_compliance has come and gone, and it seems (to me at least) that there is consensus in the discussion here. Would it be time to open an announcement issue like #1281 and close this one?

#1369 has been created.