Package `larynx-tts_0.5.0_amd64.deb` installs but fails to run on older systems
follower opened this issue · 6 comments
Problem
The package `larynx-tts_0.5.0_amd64.deb` installs on Elementary OS 5.1 (which is based on Ubuntu 18.04 LTS, which is based on Debian ~buster/sid*) but the supplied `python3` binary/`larynx` script fails to run due to an issue related to `libc` versioning.
$ larynx --help
python3: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by python3)
Workaround
I'd recently encountered this issue with another project so was able to work around the issue in the interim by extracting a package with a later version of `libc` and helping things find what they were looking for. *waves hands here*
Cause
Anyway, as far as I'm aware, this issue occurs because the Larynx package is built on a machine with a more recent libc version than the one installed locally.
Which I think is confirmed by this line in the docker config:
Line 5 in 91ea42c
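A quick way to confirm this on a given machine is to compare the installed glibc version with the one the binary asks for. The following is a minimal Python sketch, not part of Larynx: `parse_version`, `installed_glibc` and `glibc_is_new_enough` are hypothetical helper names, and `2.28` is taken from the error message above.

```python
import os


def parse_version(text):
    """Turn a dotted version such as '2.28' into a tuple of ints: (2, 28)."""
    return tuple(int(part) for part in text.split("."))


def installed_glibc():
    """Return the running glibc version as a string such as '2.27', or None."""
    try:
        # On glibc systems this returns a string like 'glibc 2.27'.
        value = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        return None  # not glibc, or the query is unsupported here
    if not value:
        return None
    name, _, version = value.partition(" ")
    return version if name == "glibc" and version else None


def glibc_is_new_enough(installed, required="2.28"):
    """Return True if the installed glibc version meets the required one."""
    # Compare numerically: as strings, '2.9' would wrongly sort after '2.28'.
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    version = installed_glibc()
    if version is None:
        print("Could not determine the glibc version on this system")
    else:
        verdict = "new enough" if glibc_is_new_enough(version) else "too old"
        print(f"glibc {version} is {verdict} for a binary needing GLIBC_2.28")
```

On the Ubuntu 18.04 based system described above, which ships glibc 2.27, the comparison would come out as too old.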
Options for resolving issue
In terms of "resolving" the issue:
- Ideally the package could be built on an older base system docker image so older machines could still run it successfully. (As I understand it, I think the only libc version changes are related to some optimisations but I don't know if they impact Larynx's performance.)
- Alternatively the package could be configured with version information that would prevent installation on older, incompatible systems, unless manually overridden.
I'll admit I didn't really expect the Larynx package to ship its own Python binary instead of depending on system packages but I assume that's to ensure compatibility with compiled extensions?
Appreciation
Despite this issue I was able to get up and running with Larynx after applying the workaround and overall am very happy with the initial resulting output.
Thanks for all the work you've put into the project, I'm really excited about the potential that high quality, free & open source offline text to speech technology brings with it!
Thanks!
Hi @follower, thanks for the detailed feedback!
I had issues getting updates for some of the older releases at one point, so I switched to buster. Ultimately, the problem comes down to the onnxruntime dependency. It looks like bullseye finally has a python3-onnx package, so I can depend on that going forward at least. For now, though, I need the compiled extension which is always bound to a particular version of Python.
To make things harder, the official onnxruntime wheels don't support 32-bit ARM. So even if I could do a pip install during the Debian package installation, it won't work unless I maintain builds for all Python versions (3.6-3.9+). I had to build my version for Python 3.7 on an actual Pi, and it took a day or two!
If you have any suggestions for getting around these problems, I'd love to hear them. These same issues with compiled Python extensions crop up in most of my projects, so any help would be much appreciated 🙂
Thanks for taking the time to read & reply. :)
I'm back with an update after further research & testing, first up...
TL;DR:
On Ubuntu 18.04-LTS derived systems it seems that it is sufficient to first install Python 3.7[0] (e.g. via `apt` with):
sudo apt install python3.7
And then (with the Larynx `.deb` installed), it should now be possible to run Larynx successfully with:
PYTHONPATH=/usr/lib/larynx-tts:/usr/lib/larynx-tts/usr/local/lib/python3.7/site-packages/ python3.7 -m larynx -v en "Hello."
Downsides to this workaround
Unfortunately it's still not possible to run `/usr/bin/larynx` directly as it (intentionally) alters the value of `PATH` so that `/usr/lib/larynx-tts/usr/local/bin/python3.7` is found & used before any other installed version.
The `/usr/lib/larynx-tts` entry is required in `PYTHONPATH` in order for the `larynx` module to be found & the `/usr/lib/larynx-tts/usr/local/lib/python3.7/site-packages/` entry is required so that `gruut` & other packages are found.
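The same invocation could be wrapped so the two `PYTHONPATH` entries don't have to be typed each time. This is a minimal sketch, assuming the paths from the command above and a system `python3.7`; `larynx_invocation` is a hypothetical helper, not something shipped in the `.deb`.

```python
import os

# Paths taken from the command above; adjust them if the package layout differs.
LARYNX_ROOT = "/usr/lib/larynx-tts"
SITE_PACKAGES = LARYNX_ROOT + "/usr/local/lib/python3.7/site-packages/"


def larynx_invocation(args, interpreter="python3.7"):
    """Build the (argv, env) pair for running Larynx with a system interpreter."""
    env = dict(os.environ)
    # LARYNX_ROOT makes the `larynx` module importable; SITE_PACKAGES provides
    # gruut and the other bundled dependencies.
    env["PYTHONPATH"] = os.pathsep.join([LARYNX_ROOT, SITE_PACKAGES])
    argv = [interpreter, "-m", "larynx", *args]
    return argv, env


# Example use (hypothetical wrapper script):
#   import subprocess
#   argv, env = larynx_invocation(["-v", "en", "Hello."])
#   subprocess.run(argv, env=env, check=True)
```

Like the one-line command, this replaces any existing `PYTHONPATH` rather than extending it.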
Possible `.deb` changes for a fix
This suggests to me that perhaps the `.deb` could depend on a system `python3.7` and the precompiled library binaries will still be compatible.
If the pre-compiled `python3.7` binary was removed from the `.deb` then I think the `larynx`/`larynx-server` scripts might be able to be used unchanged except for replacing `python3` with `python3.7` on the last line. (Although maybe we'd still have to handle `/usr/lib/larynx-tts/usr/local/lib/python3.7/site-packages/` specifically--not sure whether it'll just get found automatically as a result of the other changes to various path configs in the script.)
So, while a bit verbose, and not ideal, this is a straightforward enough workaround to get things working for me.
[0] For me, `apt show python3.7` currently displays:
$ apt show python3.7
Package: python3.7
Version: 3.7.5-2~18.04.4
Priority: optional
Section: universe/python
Origin: Ubuntu
[...]
Will follow-up with another comment providing a bit more background...
[Other than for someone with an idle interest in this issue the following probably isn't particularly necessary to read/write but, what can I say, I'm a completionist. :D ]
The underlying problem
As reported originally, the error message displayed is:
python3: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by python3)
While there's a number of tools we could use to help investigate this further I tend toward `readelf` these days, so when using that to look for symbols that mention `GLIBC_2.28` we get:
$ readelf --all /usr/lib/larynx-tts/usr/local/bin/python3 | grep "2\.28"
0000002a95d8 00bd00000007 R_X86_64_JUMP_SLO 0000000000000000 fcntl64@GLIBC_2.28 + 0
189: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fcntl64@GLIBC_2.28 (15)
7672: 0000000000000000 0 FUNC GLOBAL DEFAULT UND fcntl64@@GLIBC_2.28
0bc: 2 (GLIBC_2.2.5) f (GLIBC_2.28) 10 (GLIBC_2.6) 2 (GLIBC_2.2.5)
0x00d0: Name: GLIBC_2.28 Flags: none Version: 15
So, in this case it seems there's only one symbol affected: `fcntl64` (which is a plus, because when I've previously encountered this issue there were math-related symbols affected too).
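To see at a glance which glibc version a binary needs, the version tags can be pulled out of the `readelf` output and sorted. This is a minimal Python sketch, assuming `readelf` (from binutils) is on the `PATH`; `glibc_versions` and `required_glibc` are hypothetical helper names, not existing tools.

```python
import re
import subprocess


def glibc_versions(readelf_text):
    """Return the distinct GLIBC_x.y version tags in the text, oldest first."""
    tags = set(re.findall(r"GLIBC_(\d+(?:\.\d+)+)", readelf_text))
    # Sort numerically so that 2.28 ranks above 2.6.
    return sorted(tags, key=lambda v: tuple(int(p) for p in v.split(".")))


def required_glibc(binary_path):
    """Run `readelf --version-info` on a binary and return the newest GLIBC tag."""
    result = subprocess.run(
        ["readelf", "--version-info", binary_path],
        capture_output=True, text=True, check=True,
    )
    versions = glibc_versions(result.stdout)
    return versions[-1] if versions else None


# Example use, with the path from the readelf command above:
#   required_glibc("/usr/lib/larynx-tts/usr/local/bin/python3")
```

The newest tag is the one that matters: the binary can only start on systems whose glibc provides at least that version.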
What even is a `fcntl64` or a `fcntl`?
Why, it's a function to "manipulate file descriptor" (`fcntl64` / `fcntl`), also known as a "kitchen sink". :)
The associated man pages go on to note:
The original Linux fcntl() system call was not designed to handle large file offsets (in the flock structure). Consequently, an fcntl64() system call was added in Linux 2.4. The newer system call employs a different structure for file locking, flock64, and corresponding commands, F_GETLK64, F_SETLK64, and F_SETLKW64. However, these details can be ignored by applications using glibc, whose fcntl() wrapper function transparently employs the more recent system call where it is available. [Emphasis mine.]
However it seems like the concluding comment is in some ways overly optimistic, because (as I understand it) in `libc` `2.28` the `fcntl()` function definition was changed to be a preprocessor macro that simply defined `fcntl` to be `fcntl64`--a change that is apparently not backwardly(?) compatible when the compiled binary is run on older systems.
This is apparently a "known issue" and allegedly intentional.
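One way to observe the difference between systems is to ask the running C library whether it exports a `fcntl64` symbol at all. This is a small sketch using Python's `ctypes`, assuming a Linux system; `libc_exports` is a hypothetical helper name. On a glibc older than 2.28, `fcntl64` would be expected to show up as missing, though that has not been checked across versions here.

```python
import ctypes


def libc_exports(symbol):
    """Return True if the C library loaded into this process exports `symbol`."""
    # CDLL(None) opens the running process itself, which includes its libc.
    process = ctypes.CDLL(None)
    # ctypes resolves attributes with dlsym(); a missing symbol raises
    # AttributeError, which hasattr() turns into False.
    return hasattr(process, symbol)


if __name__ == "__main__":
    for name in ("fcntl", "fcntl64"):
        state = "exported" if libc_exports(name) else "missing"
        print(f"{name}: {state}")
```

This only checks for an unversioned symbol name; it says nothing about which symbol version a particular binary was linked against.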
Whatever can we do?
Well, in our case, install a version of `python3.7` built on the older system. (See above. :) )
But, were that not an option, apparently it is/may be possible to write a wrapper function that would enable the binary to run on older systems--but with various levels of cautions against reliability/undefined behaviour.
Given that the underlying "issue" originates in a different project (i.e. Python) I decided there probably wasn't much point pursuing a wrapper-based workaround in this case.
Related links
For completeness, here's some of the references I used/encountered while researching this:
- https://stackoverflow.com/questions/58472958/how-to-force-linkage-to-older-libc-fcntl-instead-of-fcntl64/58472959 -- This is probably the best overview of the situation and options/trade-offs for workarounds.
- "How to deal with fcntl64?" -- includes mention of creating a symbol alias to work around the issue.
- The source code line that has the `fcntl` define: https://sourceware.org/git/?p=glibc.git;a=blob;f=io/fcntl.h;h=6b0e9fa1fa2a2d8872b1adb5c0861410756a9e52;hb=3c03baca37fdcb52c3881e653ca392bba7a99c2b#l180
  - Based on `blame` this appears to be the commit that added the redefine: https://sourceware.org/git/?p=glibc.git;a=commit;f=io/fcntl.h;h=06ab719d30b01da401150068054d3b8ea93dd12f
    - This appears to be the associated bug/issue: "Bug 20251 - 32bit programs pass garbage in struct flock for OFD locks"
    - Thread associated with the commit: https://sourceware.org/legacy-ml/libc-alpha/2018-04/msg00124.html
      - Includes a couple of posts related to a question about symbol versioning: https://sourceware.org/legacy-ml/libc-alpha/2018-06/msg00709.html
- A very old (~2010) email that appears to be about a different `fcntl`/`fcntl64` issue: https://marc.info/?l=uclibc&m=135873980749195 (Oh, this appears to be in `uclibc` not `libc`.)
- Discussion about this issue in another project: https://forum.rebol.info/t/dynamic-linking-static-linking-and-the-march-of-progress/1231 that also discusses workarounds.
- This project apparently uses the wrapper workaround approach: https://reviews.bitcoinabc.org/D5902
- Related issue, seems to use alias/mapping as workaround: ziglang/zig#5882 (comment) (Also https://giters.com/ziglang/zig/issues/9485.)
- https://code.woboq.org/userspace/glibc/sysdeps/unix/sysv/linux/fcntl64.c.html
- https://stackoverflow.com/questions/8823267/linking-against-older-symbol-version-in-a-so-file/8862631#8862631
- https://stackoverflow.com/questions/34480469/why-glibc-fcntl-is-implemented-as-this
- https://stackoverflow.com/questions/66976993/how-to-maintain-compatibility-between-gcc-7-and-gcc-9-with-ofast
- https://stackoverflow.com/questions/59391695/cross-compiling-python-for-arm-from-source
- bincrafters/community#1058
- "[PATCH 7/9] ARM: oabi-compat: rework fcntl64() emulation"
- "linux-user: protect fcntl64 with an #ifdef"
- This commit about a semi-related aspect/issue mentions:
With dyanmical linking, libc is backwards compatible and works but with static linking it does not.
This appears to be the most discussion I've seen about (semi-)related issues on the `libc` side, and seems to provide useful context about symbol versioning: "Evolution of ELF symbol management"
Addendum
Part of the reason why I've included the link dump above (outside of completionism :) ) is that the more I looked into it, the less convinced I am that this breakage is intentional.
Now, it may be that `libc` just treats such breakage as "expected" & intentional when such a change occurs and so doesn't explicitly mention it because it should be "obvious" based on project norms--or I might not be reading between the lines to see where such breakage is implied as intentional.
However, there are aspects that stand out:
- The code in question appears to have some code that is explicitly related to compatibility.
  - The source includes references to `__REDIRECT`, `__USE_FILE_OFFSET64`, `__USE_LARGEFILE64` which seems to suggest in some situations an alias is intended to be created?
    - Some discussion of `__REDIRECT` for context: https://stackoverflow.com/questions/37276120/how-does-my-compiler-find-the-stat-file-status-function#37385712
- There is no explicit discussion/notification about the impact of compiling with e.g. code that uses `fcntl()` silently using `fcntl64()` instead & thus not working on older systems.
  - The commit in question mentions "for architectures which defines __USE_FILE_OFFSET64, fcntl64 will aliased to fcntl and no adjustment would be required.", "A new LFS fcntl64 is added on default ABI [maybe this is the "implying compatibility issues" part] with the usual macros to select it for FILE_OFFSET_BITS=64." and "Keep a compat symbol with old broken semantic". It also mentions "The idea follows other LFS interfaces that provide two symbols".
  - This is also the case in the "News" file for the 2.28 release which mentions:
    The fcntl function now have a Long File Support variant named fcntl64. It is added to fix some Linux Open File Description (OFD) locks usage on non LFS mode. As for others *64 functions, fcntl64 semantics are analogous with fcntl and LFS support is handled transparently.
    But in other sections goes into details about other functions being deprecated/removed etc.
  - It should also be noted that the `fcntl64` functionality isn't new, just that it was transparently invoked via `fcntl` previously--AIUI.
- There seems to be no related bug in https://sourceware.org/bugzilla/query.cgi despite multiple complaints about the changes in a number of places online.
  - Semi-related bugs: https://sourceware.org/bugzilla/show_bug.cgi?id=28182 & https://sourceware.org/bugzilla/show_bug.cgi?id=23822
  - Note: There are a lot of spurious results returned for a search for e.g. `fcntl` due to the (dubious) decision to include/duplicate release notes/messages in almost all the bug reports which means you'll get hits for `fcntl` on issues that have nothing to do with it. :/
Anyway, I'm curious about what the reality of the situation is and--given I've run into similar issues twice in the past couple of months--I suspect this is probably not the last time I'll encounter it. So hopefully these notes will prove helpful if that happens.
[Feel free to close this issue when we've handled the Larynx-specific aspect to your satisfaction.]
Edited to add:
- This links to fairly comprehensive looking information about ELF symbol versioning: https://stackoverflow.com/questions/2856438/how-can-i-link-to-a-specific-glibc-version#2858996
- https://stackoverflow.com/questions/62815622/replace-all-calls-to-a-function-with-symbol-version (`fcntl`-specific question)
For (further :D ) completeness here is the workaround referred to in the original issue comment:
- Download e.g. `libc6_2.31-0ubuntu9.2_amd64.deb` from http://archive.ubuntu.com/ubuntu/pool/main/g/glibc/.
- Extract (not install) the contents of the package into a new sub-directory with, e.g.:
  dpkg-deb --vextract libc6_2.31-0ubuntu9.2_amd64.deb try__libc6_2_31/
- (Larynx-specific step.) Configure the environment with:
  source /usr/bin/larynx
  (Ignore the `version GLIBC_2.28 not found` error.)
- Hopefully, running this will now succeed:
./try__libc6_2_31/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 --library-path ./try__libc6_2_31/lib/x86_64-linux-gnu /usr/lib/larynx-tts/usr/local/bin/python3 -m larynx --voice en "Hello! I lib c you!"
What you're doing (AIUI) is executing the later version of the dynamic linker/loader (a.k.a. interpreter) directly (rather than the path stored inside the binary) and telling it (a) where to find the library files it expects to find and (b) what the name of the executable it should run is.
I assume there's reasons why this won't work in all cases but it's worked for both of the ones I've run into recently.
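The shape of that command can be sketched as a small helper that assembles the argument list. This is only an illustration, assuming the x86-64 loader name and the extraction directory used in the steps above; `loader_command` is a hypothetical name.

```python
import os


def loader_command(libc_dir, program, program_args):
    """Build an argv that runs `program` under the dynamic loader in libc_dir.

    libc_dir is the directory holding the extracted loader and libc, e.g.
    ./try__libc6_2_31/lib/x86_64-linux-gnu from the steps above.
    """
    loader = os.path.join(libc_dir, "ld-linux-x86-64.so.2")
    return [
        loader,
        "--library-path", libc_dir,  # (a) where the newer libraries live
        program,                     # (b) the executable the loader should run
        *program_args,
    ]


# Example use (paths from the steps above):
#   import subprocess
#   subprocess.run(loader_command(
#       "./try__libc6_2_31/lib/x86_64-linux-gnu",
#       "/usr/lib/larynx-tts/usr/local/bin/python3",
#       ["-m", "larynx", "--voice", "en", "Hello!"],
#   ), check=True)
```

Because the loader is named explicitly, the interpreter path stored inside the `python3` binary is never consulted.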
Related links
As far as I can tell none of the easily found "answers" to the question of how to handle this style of error suggest just extracting the files from the `.deb` (rather than installing them [inadvisable!], compiling from source [slow] or copying an arbitrary number of files from another machine [prone to error]), so hopefully this is useful to someone.
- This one probably gets closest & also notes a couple of "wrinkles" that might affect whether or not direct use of `ld-linux` might work: https://stackoverflow.com/questions/8657908/deploying-yesod-to-heroku-cant-build-statically/#8658468
- Many inadvisable approaches: https://askubuntu.com/questions/1143268/how-to-install-a-libc6-version-2-29 (e.g. see https://askubuntu.com/questions/1314766/whole-system-is-broken-after-failed-libc6-upgrade-attempt)
My recent Debian packages are built for Debian bullseye without including an internal Python interpreter. Do you think it's worth building packages for older systems, or is installing from source easy enough?
Somewhat side-related, but:
- `larynx -h`
  ModuleNotFoundError: No module named 'gruut'
- `PYTHON_PATH=/usr/lib/larynx-tts/lib/python3.9/site-packages larynx -h`
  ModuleNotFoundError: No module named 'regex'
- `PYTHONPATH=/usr/lib/larynx-tts python3 -m larynx -h`
  ModuleNotFoundError: No module named 'onnxruntime'
- `PYTHONPATH=/usr/lib/larynx-tts:/usr/lib/larynx-tts/lib/python3.9/site-packages python3 -m larynx -h`
  ModuleNotFoundError: No module named 'pycrfsuite._pycrfsuite'
- `PYTHON_PATH=$HOME/.local/lib/python3.8/site-packages:/usr/lib/larynx-tts/lib/python3.9/site-packages:/usr/lib/larynx-tts/lib/python3.9/site-packages:/usr/lib/python3/dist-packages larynx -h`
  ModuleNotFoundError: No module named 'onnxruntime.capi.onnxruntime_pybind11_state'
With a package (`larynx-tts 1.1.0` on `Ubuntu 20.04` and Python 3.8) I would expect the installation to be more... fluent.
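When working through errors like these, it can help to ask the interpreter directly which of the modules it can locate. This is a minimal sketch: `missing_modules` is a hypothetical helper and the module names are taken from the error messages above. Note that Python reads `PYTHONPATH` (no underscore), so the `PYTHON_PATH=...` attempts would not have changed the module search path.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the ModuleNotFoundError messages above.
    wanted = ["larynx", "gruut", "regex", "onnxruntime", "pycrfsuite"]
    print("missing:", missing_modules(wanted) or "none")
```

Run it with the same interpreter and `PYTHONPATH` used for Larynx. It only checks that a module can be located; a compiled extension built for a different Python version may still fail at import time, which might explain the `pycrfsuite._pycrfsuite` error with Python 3.8 and a `python3.9` site-packages directory.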