[BUG] NDK r28 beta1: issues with existing polyfills
dg0yt opened this issue ยท 36 comments
Description
This is in response to the r28 beta1 release notes.
If your project contains polyfills for any of those APIs, this change may
break your build due to the conflicting declarations. The simplest fix is to
rename your polyfill to not collide with libc. For example, rename
conflicting_api to conflicting_api_fallback and call that instead. Use
#define conflicting_api() conflicting_api_fallback() if you want to avoid
rewriting callsites.Please open a bug if you run into issues with existing polyfills. We may be
able to add the polyfill directly to the NDK.
A test build in vcpkg CI (which uses android 21) shows a lot of breakage related to __INTRODUCED_IN()
with polyfills from gnulib. There is no full picture yet because the build error cascade starts early. The first hit is mempcpy
in GNU libiconv. I added a patch to rename the polyfill, but the next failures is already gettext-libintl.
In the non-gnulib group, openssl is affected, removing many ports from the build. For poppler, I assume that GNU libiconv used to be the polyfill, but I didn't verify that. "libiconv as a polyfill" might be a broader pattern.
I grepped the arm64 logs for __INTRODUCED_IN
. Transformed and compressed:
gettext-libintl 'feof_unlocked' is unavailable: introduced in Android 23
gettext-libintl 'fgets_unlocked' is unavailable: introduced in Android 28
gettext-libintl 'mblen' is unavailable: introduced in Android 26
gettext-libintl 'mempcpy' is unavailable: introduced in Android 23
gettext-libintl 'nl_langinfo' is unavailable: introduced in Android 26
gsasl 'error_at_line' is unavailable: introduced in Android 23
gsasl 'error' is unavailable: introduced in Android 23
gsasl 'error_message_count' is unavailable: introduced in Android 23
gsasl 'error_print_progname' is unavailable: introduced in Android 23
gsasl 'fflush_unlocked' is unavailable: introduced in Android 28
gsasl 'fputs_unlocked' is unavailable: introduced in Android 28
gsasl '__freading' is unavailable: introduced in Android 28
gsasl '__fsetlocking' is unavailable: introduced in Android 23
gsasl 'getrandom' is unavailable: introduced in Android 28
gsasl 'iconv_close' is unavailable: introduced in Android 28
gsasl 'iconv' is unavailable: introduced in Android 28
gsasl 'iconv_open' is unavailable: introduced in Android 28
gsasl 'mblen' is unavailable: introduced in Android 26
gsasl 'nl_langinfo' is unavailable: introduced in Android 26
libconfuse 'fmemopen' is unavailable: introduced in Android 23
libidn2 'error_at_line' is unavailable: introduced in Android 23
libidn2 'error' is unavailable: introduced in Android 23
libidn2 'error_message_count' is unavailable: introduced in Android 23
libidn2 'error_print_progname' is unavailable: introduced in Android 23
libidn2 'nl_langinfo' is unavailable: introduced in Android 26
libidn2 'strchrnul' is unavailable: introduced in Android 24
liblsl 'pthread_getname_np' is unavailable: introduced in Android 26
libuv 'freeifaddrs' is unavailable: introduced in Android 24
libuv 'getifaddrs' is unavailable: introduced in Android 24
libuv 'preadv' is unavailable: introduced in Android 24
libuv 'pthread_barrier_destroy' is unavailable: introduced in Android 24
libuv 'pthread_barrier_init' is unavailable: introduced in Android 24
libuv 'pthread_barrier_wait' is unavailable: introduced in Android 24
libuv 'pwritev' is unavailable: introduced in Android 24
openssl 'getentropy' is unavailable: introduced in Android 28
poppler 'iconv_close' is unavailable: introduced in Android 28
poppler 'iconv' is unavailable: introduced in Android 28
poppler 'iconv_open' is unavailable: introduced in Android 28
Patching gnutext-libintl would probably add many more.
(All those packages build with NDK r26. NDK r27 is still blocked by its bugs.)
Affected versions
r28
Canary version
beta1
Host OS
Linux
Host OS version
vcpkg CI image
Affected ABIs
arm64-v8a
Build system
Other (specify below)
Other build system
No response
minSdkVersion
21
Device API level
No response
In gettext libintl, the mempcpy related errors also include:
[..]gettext-runtime/intl/dcigettext.c:1723:1: error: static declaration of 'mempcpy' follows non-static declaration
Ooh, if vcpkg's CI has issues that gives us a great place to go look at the scope of the problem. Thanks! I should remember to try that whenever I'm making ambitious changes in the future.
We'll have a look and see what can be done. My gut says that we'll re-hide the stuff that's causing problems for r28 and work on upstreaming fixes. If it were a shorter list of projects I'd say we should just fix those projects and ship r28 as-is, but if it's hit a non-trivial number of projects just in vcpkg, that's a lot of patches to upstream, and a lot of package updates that developers would need to take to get those fixes.
A complete rollback is an option, but it's sort of the last resort. For the reasons mentioned in the changelog, this is a step in the right direction that we want to take, so I'd like to see us make progress wherever we can do so safely. I'm okay with gradual progress though, so if we need to re-hide the stuff that's broadly polyfilled already we can do that to avoid breaks while we fix the various open source projects to be compatible with the change.
A test build in vcpkg CI (which uses android 21) shows a lot of breakage related to __INTRODUCED_IN() with polyfills from gnulib.
How did you do this? I was expecting to find a GitHub workflow for this, or instructions on how to build all the ports in their docs, but I'm not finding either.
I eventually figured this out: ANDROID_NDK_HOME=/path/to/ndk vcpkg ci --triplet=arm64-android
, which seems to do what I need.
This is very slow though so if there's a smarter way to do this please lmk.
i agree with danalbert's plan to quickly fix all this by just adding #if
version checks...
...but for the longer-term solution, here's a quick brain dump in case i go under a bus:
gettext-libintl 'feof_unlocked' is unavailable: introduced in Android 23
gettext-libintl 'fgets_unlocked' is unavailable: introduced in Android 28
gsasl 'fflush_unlocked' is unavailable: introduced in Android 28
gsasl 'fputs_unlocked' is unavailable: introduced in Android 28
gsasl '__freading' is unavailable: introduced in Android 28
gsasl '__fsetlocking' is unavailable: introduced in Android 23
i don't really see any way to polyfill any of the unlocked stuff unless we re-expose the FILE implementation details. which maybe we should, but (a) that's a big decision and (b) we should definitely see what the wasm plan looks like in that area first. (since they might be the first actual beneficiaries of the increased opacity.)
libconfuse 'fmemopen' is unavailable: introduced in Android 23
large but doable, modulo similar FILE-internals leakage.
gettext-libintl 'mempcpy' is unavailable: introduced in Android 23
trivial one-liner. we should probably have done this as a polyfill in the first place and stopped there given that memcpy() can even be optimized to direct loads/stores, but mempcpy() is definitely going to cost you two function calls. where it should really just cost you an add.
gettext-libintl 'mblen' is unavailable: introduced in Android 26
so believe it or not, mblen() is a trivial wrapper for mbrlen() which is freely available. so surprisingly, this should be a trivial polyfill too.
gettext-libintl 'nl_langinfo' is unavailable: introduced in Android 26
this might seem like an obvious "no", but it's one where i wonder whether this is actually a great candidate for "let's just polyfill all the time". why? because Android has hard-coded answers to all the questions. and because every call i've ever seen asks a hard-coded question. so i'd expect that clang would be able to turn nl_langinfo("AM_STR")
into "AM" or whatever. (of course, this probably isn't as exciting as it sounds, because in practice nothing is using this for anything that actually matters.)
gsasl 'error_at_line' is unavailable: introduced in Android 23
gsasl 'error' is unavailable: introduced in Android 23
gsasl 'error_message_count' is unavailable: introduced in Android 23
gsasl 'error_print_progname' is unavailable: introduced in Android 23
the reliance on globals here makes this a "no".
gsasl 'iconv_close' is unavailable: introduced in Android 28
gsasl 'iconv' is unavailable: introduced in Android 28
gsasl 'iconv_open' is unavailable: introduced in Android 28
even our limited set of supported encodings is probably too big for this to make sense.
libidn2 'strchrnul' is unavailable: introduced in Android 24
as long as we don't care about not having the optimized arm64 implementation (which is kind of a moot point if your alternative is "nothing", as here), there's a trivial implementation of this.
liblsl 'pthread_getname_np' is unavailable: introduced in Android 26
certainly doable, though it's a weak symbol so that native bridge can override it, so doing so would likely cause a bit of trouble there.
libuv 'freeifaddrs' is unavailable: introduced in Android 24
libuv 'getifaddrs' is unavailable: introduced in Android 24
nightmare.
libuv 'pthread_barrier_destroy' is unavailable: introduced in Android 24
libuv 'pthread_barrier_init' is unavailable: introduced in Android 24
libuv 'pthread_barrier_wait' is unavailable: introduced in Android 24
this doesn't seem to touch any internal pthread state, and it's all relatively self-contained and based on <stdatomic.h>, so i think this is something we could do ... it would leak the currently-private internal implementation of pthread_barrier_t, though i suspect we could always cheat there by giving the polyfill struct and functions slightly different names?
libuv 'preadv' is unavailable: introduced in Android 24
libuv 'pwritev' is unavailable: introduced in Android 24
conceptually easy, but really annoying as long as we're still supporting ILP32/32-bit off_t. which, realistically, is something we'll be doing for longer than we're still supporting API 24. certainly doable though. just gross.
gsasl 'getrandom' is unavailable: introduced in Android 28
openssl 'getentropy' is unavailable: introduced in Android 28
getentropy() relies on getrandom() under the covers, and although that's been around since 3.17, it was only added to bionic at the end of 2017, so only 2018+ devices will have a seccomp policy that will let you use these. TL;DR: not likely.
A test build in vcpkg CI (which uses android 21) shows a lot of breakage related to __INTRODUCED_IN() with polyfills from gnulib.
How did you do this? I was expecting to find a GitHub workflow for this, or instructions on how to build all the ports in their docs, but I'm not finding either.
It a draft PR, microsoft/vcpkg#41293. It is in the timeline here. The failure logs are available as "assets" from Azure Pipelines.
I eventually figured this out:
ANDROID_NDK_HOME=/path/to/ndk vcpkg ci --triplet=arm64-android
, which seems to do what I need.
Basically yes, but vcpkg CI also skips known failures with --ci-baseline=scripts/ci.baseline.txt
.
This is very slow though so if there's a smarter way to do this please lmk.
Of course ... vcpkg builds debug+release, and you start with empty caches for assets and artifacts.
You can look at individual ports with vcpkg install <port> --triplet=arm64-android
.
Maybe you can limit the build to the release config with --cmake-args=-DVCPKG_BUILD_TYPE=release
.
I definitely looked for PRs... I must have accidentally done that in the wrong tab with a different repo. Thanks.
This is very slow though so if there's a smarter way to do this please lmk.
Of course
Ah, okay. If this is just a thing that's slow, alright.
Note that when the Microsoft team added Android, they didn't fix all the existing ports. scripts/ci.baseline.txt really must be applied. Otherwise you will meet ports which will make you cry.
Oh, I'd misunderstood you before and thought that was applied automatically. That explains why I've found mostly stuff that didn't work with r27 either.
gettext-libintl 'feof_unlocked' is unavailable: introduced in Android 23
gettext-libintl 'fgets_unlocked' is unavailable: introduced in Android 28
gettext-libintl 'mblen' is unavailable: introduced in Android 26
gettext-libintl 'mempcpy' is unavailable: introduced in Android 23
gettext-libintl 'nl_langinfo' is unavailable: introduced in Android 26
I see (and have fixed) the mempcpy
one, but the others don't repro for me (using vcpkg install --triplet=arm64-android gettext-libintl
).
gsasl 'error_at_line' is unavailable: introduced in Android 23 gsasl 'error' is unavailable: introduced in Android 23 gsasl 'error_message_count' is unavailable: introduced in Android 23 gsasl 'error_print_progname' is unavailable: introduced in Android 23
the reliance on globals here makes this a "no".
No, this is more broken than you'd think. The gl/
subdirectory of the gsasl source tarball is a copy of the glibc sysroot (the files all say This file is part of the GNU C Library.
), and it's using that and bionic. I tried with an older NDK I had installed (r25b) and it surprisingly does build. I'm guessing it builds just because the bionic headers and glibc headers are compatible enough then, but this is fundamentally a broken (mixing and matching C libraries has been a source of bugs for Android apps in the past; it's the reason that the NDK's libc++ wasn't reliable for the first few years of its existence) port and honestly a bit alarming. It's not just glibc headers in there, there are implementations, and quite a lot of them!
I'm honestly not sure if I'm doing anyone any favors by "fixing" the NDK to make this port compatible again. I'm not convinced that the previous port which built would actually work.
nothing says "Simple Authentication and Security Layer" like "we copied and merged multiple libcs" :-)
The
gl/
subdirectory of the gsasl source tarball is a copy of the glibc sysroot (the files all sayThis file is part of the GNU C Library.
),
nothing says "Simple Authentication and Security Layer" like "we copied and merged multiple libcs" :-)
I think this interpretation needs some adjustment, even if I didn't check gsasl in particular.
gl
probably is not a naive copy of glibc but actually Gnulib, a software collection meant to help with portability issues. And it is the package mentioned in my initial post.
Gnulib is not a standalone lib, but vendored into the configuration and build of software packages. It is organized into "modules", some meant to add, rectify or enhance implementations of ISO C or POSIX functtions.
https://www.gnu.org/software/gnulib/manual/gnulib.html#Support-for-ISO-C-or-POSIX-functions_002e
Polyfill. Subject to check results during configure
, gnulib modules will be added to a build. They may provide implementation under the same name, or replacements with a different identifier but mapped using macros.
I don't consider myself an expert for autotools or gnulib. I honestly also don't know what is the best way to deal with this, and I don't ask for a particular solution. Gnulib providing an implementation under the same name may be hard to align with the Android ecosystem's variation over API level and actual runtime.
The vcpkg problem is that the modules are littered into all those packages based on autotools. They all need to be patched individually until Gnulib improvements are integrated by each single package.
gl probably is not a naive copy of glibc but actually Gnulib, a software collection meant to help with portability issues. And it is the package mentioned in my initial post.
Gnulib is not a standalone lib, but vendored into the configuration and build of software packages. It is organized into "modules", some meant to add, rectify or enhance implementations of ISO C or POSIX functtions.
https://www.gnu.org/software/gnulib/manual/gnulib.html#Support-for-ISO-C-or-POSIX-functions_002e
Polyfill. Subject to check results during configure, gnulib modules will be added to a build. They may provide implementation under the same name, or replacements with a different identifier but mapped using macros.
Ah, that comment in the headers is pretty misleading if so.
Odd. https://www.gnu.org/software/gnulib/manual/gnulib.html#Support-for-ISO-C-or-POSIX-functions_002e says that gnulib's polyfills are supposed to be prefixed to avoid exactly this kind of name conflict.
The vcpkg problem is that the modules are littered into all those packages based on autotools. They all need to be patched individually until Gnulib improvements are integrated by each single package.
Yep, we do understand that. That's why we're partially reverting bionic's header changes for now. I am going to look through the existing ports while I do that though, because some of them may be problems with the ports. 7zip wasn't in your list, but it's the first one I looked at and it's just wrong, so soon 7zip should build for android: microsoft/vcpkg#41721. For the rest where it's not a port problem, I'll revert the bionic header change that's causing the problem. Like I said that's already been done for mempcpy
.
Odd. https://www.gnu.org/software/gnulib/manual/gnulib.html#Support-for-ISO-C-or-POSIX-functions_002e says that gnulib's polyfills are supposed to be prefixed to avoid exactly this kind of name conflict.
Ah, because I'd come to this bug prepared to find issues with polyfills, but that's not what this is. This looks to be an autoconf detection issue. It's trying to use error_message_count
because it's declared, even though it's declared unavailable. I don't know why it's considering that as available.
I happened to be watching the log re-populate when rebuilding libuv after hiding the pthread_barrier_*
stuff, and it doesn't actually protect against calls if it's not available:
/usr/local/google/home/danalbert/src/vcpkg/buildtrees/libuv/src/v1.49.2-5cffa54d5b.clean/src/thread-common.c:154:18: warning: implicit declaration of function 'pthread_barrier_init' [-Wimplicit-function-declaration]
154 | return UV__ERR(pthread_barrier_init(barrier, NULL, count));
| ^
/usr/local/google/home/danalbert/src/vcpkg/buildtrees/libuv/src/v1.49.2-5cffa54d5b.clean/src/thread-common.c:161:8: warning: implicit declaration of function 'pthread_barrier_wait' [-Wimplicit-function-declaration]
161 | rc = pthread_barrier_wait(barrier);
| ^
/usr/local/google/home/danalbert/src/vcpkg/buildtrees/libuv/src/v1.49.2-5cffa54d5b.clean/src/thread-common.c:171:7: warning: implicit declaration of function 'pthread_barrier_destroy' [-Wimplicit-function-declaration]
171 | if (pthread_barrier_destroy(barrier))
| ^
3 warnings generated.
It links though, so I guess those calls are probably culled by dead code elimination?
Once this stack of commits is merged, the list you gave above all builds, aside from poppler, which is broken for other reasons. I think upstream forgot an #include
for something in the stdlib? Thanks for your help with this.
It's going to be a bit before I can actually pull this update into the NDK. The OS repo is not in a state where I can pull a new sysroot, so I have to wait for that before I can update the NDK. We won't ship r28 without that, it's just going to be a while before we can get even a canary build to recheck vcpkg :(
I happened to be watching the log re-populate when rebuilding libuv after hiding the
pthread_barrier_*
stuff, and it doesn't actually protect against calls if it's not available:/usr/local/google/home/danalbert/src/vcpkg/buildtrees/libuv/src/v1.49.2-5cffa54d5b.clean/src/thread-common.c:154:18: warning: implicit declaration of function 'pthread_barrier_init' [-Wimplicit-function-declaration] 154 | return UV__ERR(pthread_barrier_init(barrier, NULL, count)); | ^ /usr/local/google/home/danalbert/src/vcpkg/buildtrees/libuv/src/v1.49.2-5cffa54d5b.clean/src/thread-common.c:161:8: warning: implicit declaration of function 'pthread_barrier_wait' [-Wimplicit-function-declaration] 161 | rc = pthread_barrier_wait(barrier); | ^ /usr/local/google/home/danalbert/src/vcpkg/buildtrees/libuv/src/v1.49.2-5cffa54d5b.clean/src/thread-common.c:171:7: warning: implicit declaration of function 'pthread_barrier_destroy' [-Wimplicit-function-declaration] 171 | if (pthread_barrier_destroy(barrier)) | ^ 3 warnings generated.
It links though, so I guess those calls are probably culled by dead code elimination?
ah, funnily enough i saw your code change before i saw this, and my review comment there might be the answer --- i think they're actually checking for the corresponding constant #define instead. my guess is that if you undo the change you did make, and #if around the #define instead, you'll get the result you're looking for.
(this was phrased more as a question on the code review because i wasn't sure whether i just didn't understand the cmake bumf...)
i think they're actually checking for the corresponding constant #define instead.
I'd have to hide the #define
and the function decls. If I only hide the #define
then it'll use its polyfill path and then fail for "redefined with different attributes" error or however it called it:
#if defined(_AIX) || \
defined(__OpenBSD__) || \
!defined(PTHREAD_BARRIER_SERIAL_THREAD)
int uv_barrier_init(uv_barrier_t* barrier, unsigned int count) {
If I hide only the function decls, anyone that needs to call these via dlsym
can still use PTHREAD_BARRIER_SERIAL_THREAD
.