patchelf-0.17.2 seems to corrupt emacs on `staging-next`
trofi opened this issue · 24 comments
On the current staging-next iteration quite a few emacs-dependent packages are failing. The failures seem to stem from the fact that emacs is incorrectly modified by patchelf-0.17.2 (0.15.0 works; bisected in nixpkgs by @mweinelt).
$ nix run https://github.com/NixOS/nixpkgs/archive/staging-next.tar.gz#emacs
Segmentation fault (core dumped)
$ patchelf --version
patchelf 0.17.2
It seems to have something to do with the modified library list:
$ nix shell https://github.com/NixOS/nixpkgs/archive/staging-next.tar.gz#emacs
$ gdb emacs
Reading symbols from emacs...
warning: Loadable section ".dynstr" outside of ELF segments
in /nix/store/lzahvwakhghr8b3ri40s935bwhn7nf0x-emacs-28.2/bin/emacs-28.2
warning: Loadable section ".dynamic" outside of ELF segments
in /nix/store/lzahvwakhghr8b3ri40s935bwhn7nf0x-emacs-28.2/bin/emacs-28.2
(No debugging symbols found in emacs)
(gdb) run
Starting program: /nix/store/lzahvwakhghr8b3ri40s935bwhn7nf0x-emacs-28.2/bin/emacs
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fe6597 in dl_main ()
from /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/ld-linux-x86-64.so.2
(gdb) bt
#0 0x00007ffff7fe6597 in dl_main ()
from /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/ld-linux-x86-64.so.2
#1 0x00007ffff7fe2a06 in _dl_sysdep_start ()
from /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/ld-linux-x86-64.so.2
#2 0x00007ffff7fe45ad in _dl_start ()
from /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/ld-linux-x86-64.so.2
#3 0x00007ffff7fe33a8 in _start ()
from /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/ld-linux-x86-64.so.2
#4 0x0000000000000001 in ?? ()
#5 0x00007fffffffd1f3 in ?? ()
#6 0x0000000000000000 in ?? ()
If gdb is to be believed, the loadable program headers that should contain ".dynstr" and ".dynamic" are not what they should be.
LD_DEBUG also suggests that very little could be loaded by ld.so:
$ LD_DEBUG=all emacs
2608983: symbol=__vdso_clock_gettime; lookup in file=linux-vdso.so.1 [0]
2608983: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_clock_gettime' [LINUX_2.6]
2608983: symbol=__vdso_gettimeofday; lookup in file=linux-vdso.so.1 [0]
2608983: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_gettimeofday' [LINUX_2.6]
2608983: symbol=__vdso_time; lookup in file=linux-vdso.so.1 [0]
2608983: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_time' [LINUX_2.6]
2608983: symbol=__vdso_getcpu; lookup in file=linux-vdso.so.1 [0]
2608983: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_getcpu' [LINUX_2.6]
2608983: symbol=__vdso_clock_getres; lookup in file=linux-vdso.so.1 [0]
2608983: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_clock_getres' [LINUX_2.6]
Segmentation fault (core dumped)
eu-elflint is also unhappy:
$ eu-elflint /nix/store/lzahvwakhghr8b3ri40s935bwhn7nf0x-emacs-28.2/bin/emacs
section [ 7] '.dynstr' not fully contained in segment of program header entry 2
section [ 8] '.dynamic': alloc flag set but section not in any loaded segment
section [29] '.symtab': symbol 1 (__abi_tag): st_value out of bounds
section [29] '.symtab': _GLOBAL_OFFSET_TABLE_ symbol size 0 does not match .got section size 9736
section [29] '.symtab': symbol 5599 (_DYNAMIC): st_value out of bounds
section [29] '.symtab': _DYNAMIC_ symbol value 0x6ab560 does not match dynamic segment address 0x40ee60
section [29] '.symtab': _DYNAMIC symbol size 0 does not match dynamic segment size 1184
section [29] '.symtab': symbol 6650 (__bss_start): st_value out of bounds
loadable segment [2] is writable but contains no writable sections
Thanks for reporting the issue.
I'll investigate when I get home.
I think I will be able to reproduce the issue easily using nix, but if you could attach the result of readelf -a -W for the binary, it always helps.
Using emacs as an example, I bisected patchelf down to 42394e8 ("write out replace sections in original order").
Its gist is a change of traversal from
for (auto & i : replacedSections) {
const std::string & sectionName = i.first;
auto & shdr = findSectionHeader(sectionName);
to
/* We iterate over the sorted section headers here, so that the relative
position between replaced sections stays the same. */
for (auto & shdr : shdrs) {
std::string sectionName = getSectionName(shdr);
auto i = replacedSections.find(sectionName);
if (i == replacedSections.end())
continue;
I suspect it has a chance to miss newly added sections, if patchelf ever adds any. But maybe not in the emacs case: readelf (attached below, "before" and "after") says both have 31 sections. But I'm not sure I believe it.
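To make the suspicion above concrete, here is a minimal Python sketch of the two traversal styles. It is not patchelf's code: it assumes `replacedSections` behaves like a map iterated in key order, and the section names, including `.new_section`, are hypothetical. What the old loop's `findSectionHeader` does for a name without a header is not shown in the thread, so the sketch only compares which entries each loop visits.

```python
def old_order(replaced):
    # Old loop: iterate the map itself, i.e. every entry, in key order.
    return sorted(replaced)

def new_order(section_headers, replaced):
    # New loop: walk the section headers in file order and keep only
    # those names that have an entry in the map.
    return [name for name in section_headers if name in replaced]

# Hypothetical data: ".new_section" has no section header yet.
headers = [".interp", ".dynsym", ".dynstr", ".rodata", ".dynamic"]
replaced = {".dynamic": b"", ".dynstr": b"", ".new_section": b""}

visited_old = old_order(replaced)
visited_new = new_order(headers, replaced)
missed = sorted(set(replaced) - set(visited_new))

print("old loop visits:", visited_old)
print("new loop visits:", visited_new)
print("never visited by new loop:", missed)
```

If patchelf never puts a name into the map without a matching header, the two loops visit the same set and only the order differs.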
Attaching readelf -a -W:
- readelf-aw-bad.txt: at this commit
- readelf-aw-good.txt: just before this commit
Looks like one of the program headers got lost (or merged into an existing one):
diff -u readelf-aw-good.txt readelf-aw-bad.txt | cat
--- readelf-aw-good.txt 2023-03-18 18:30:42.604009844 +0000
+++ readelf-aw-bad.txt 2023-03-18 18:31:55.812435119 +0000
@@ -10,11 +10,11 @@
Version: 0x1
Entry point address: 0x427ea0
Start of program headers: 64 (bytes into file)
- Start of section headers: 6757680 (bytes into file)
+ Start of section headers: 6753584 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
- Number of program headers: 15
+ Number of program headers: 14
I debugged something like this. It had to do with rounding of load segments overlapping.
Thanks for this info!
I think this is the same scenario I saw in: #446
We can see the following LOAD segments with different read/write permissions:
LOAD 0x000000 0x00000000003ff000 0x00000000003ff000 0x01057c 0x01057c RW 0x1000
LOAD 0x01057c 0x000000000040f57c 0x000000000040f57c 0x0085cc 0x0085cc R 0x1000
The first one goes from 0x3ff000 to 0x40f57c, and the second goes from 0x40f57c to 0x417b48.
Notice that the alignment is 0x1000, so the OS will have to map the pages:
0x3ff000 to 0x410000
0x40f000 to 0x418000
There is an overlap between the two. So the fix #469 should hopefully fix this issue.
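The page arithmetic can be checked with a short Python sketch. It uses only the two LOAD rows quoted above; the helper names are mine. Rounding the second start address 0x40f57c down to a 0x1000 boundary gives 0x40f000, so the shared page is 0x40f000 to 0x410000.

```python
PAGE = 0x1000  # p_align of both LOAD segments quoted above

def page_span(vaddr, memsz, align=PAGE):
    """Return the [start, end) address range the kernel maps for a segment."""
    start = vaddr & ~(align - 1)                       # round down
    end = (vaddr + memsz + align - 1) & ~(align - 1)   # round up
    return start, end

def overlap(a, b):
    """Return the shared [lo, hi) range of two spans, or None."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

first = page_span(0x3ff000, 0x01057c)   # RW segment
second = page_span(0x40f57c, 0x0085cc)  # R segment
shared = overlap(first, second)

print("first :", [hex(x) for x in first])
print("second:", [hex(x) for x in second])
print("shared:", [hex(x) for x in shared] if shared else None)
```

The two segments want different permissions (RW and R) for that one shared page.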
I'll try the "master" commit tomorrow and see if it fixes the issue.
I tried master as well and it did not fix the issue for me. Tested as:
--- a/pkgs/applications/editors/emacs/generic.nix
+++ b/pkgs/applications/editors/emacs/generic.nix
@@ -46,6 +46,7 @@
else "lucid")
, withSystemd ? lib.meta.availableOn stdenv.hostPlatform systemd, systemd
, withTreeSitter ? lib.versionAtLeast version "29", tree-sitter ? null
+, patchelfUnstable
}:
assert (libXft != null) -> libpng != null; # probably a bug
@@ -135,7 +136,7 @@ assert withTreeSitter -> tree-sitter != null;
""
];
- nativeBuildInputs = [ pkg-config makeWrapper ]
+ nativeBuildInputs = [ pkg-config makeWrapper patchelfUnstable ]
++ lib.optionals (srcRepo || withMacport) [ texinfo ]
++ lib.optionals srcRepo [ autoreconfHook ]
++ lib.optional (withX && (withGTK3 || withXwidgets)) wrapGAppsHook;
diff --git a/pkgs/development/tools/misc/patchelf/unstable.nix b/pkgs/development/tools/misc/patchelf/unstable.nix
index 66c14bd07e0..3f20cb7834f 100644
--- a/pkgs/development/tools/misc/patchelf/unstable.nix
+++ b/pkgs/development/tools/misc/patchelf/unstable.nix
@@ -2,13 +2,13 @@
stdenv.mkDerivation rec {
pname = "patchelf";
- version = "unstable-2023-03-07";
+ version = "unstable-2023-03-18";
src = fetchFromGitHub {
owner = "NixOS";
repo = "patchelf";
- rev = "ea2fca765c440fff1ff74e1463444dea7b819db2";
- sha256 = "sha256-IH80NcLhwjGpIXEjHuV+NgaSC+Y/PXquxZ/C8Bl+CLk=";
+ rev = "265b31ae22c6e1d20b01295aaa7bcf28fd31a5cf";
+ sha256 = "sha256-+iGvdjXvhk5mN8jp3u+M9fICKFqbtyZCx+WjQszaB1o=";
};
# Drop test that fails on musl (?)
NixOS/nixpkgs#221900 was merged into staging-next to fix emacs. You might need to revert the change locally to reproduce it on staging-next.
Weird, I can't reproduce the crash using the commit before the merge.
# before merge
nix run https://github.com/NixOS/nixpkgs/archive/6c70dbc.gz#emacs
# after merge
nix run https://github.com/NixOS/nixpkgs/archive/ce7e136.gz#emacs
(I'm new to Nix, so I might be doing something wrong)
I can reproduce the messages from gdb and eu-elflint in both hashes.
I reverted it locally and now I can reproduce it. Not sure what is the difference, but let me get to it.
This commit fails for me (it's the one directly preceding the patchelf-0.15.0 pin):
$ nix run https://github.com/NixOS/nixpkgs/archive/403b148aa51073bc343febbbfd041ecd495dbe3e.tar.gz#emacs
Segmentation fault (core dumped)
This should allow extracting the exact binary:
$ nix build https://github.com/NixOS/nixpkgs/archive/403b148aa51073bc343febbbfd041ecd495dbe3e.tar.gz#emacs
$ result/bin/emacs
Segmentation fault (core dumped)
Notes so far:
- I generated an unpatched emacs by removing the patchelf invocation in emacs/generic.nix, and I'm running the patchelf steps manually. This seems to be a good way to debug these things.
- Patching emacs with the new layout engine from #477 generates good binaries.
- emacs is patched to change the rpath and then add a needed .so. The issue happens only after the "add needed" step.
Still investigating. Looks like the strtab is falling out of the LOAD segment that is supposed to map it into memory.
I'm quite sure the missing thing is an else to this if:
Line 1015 in 265b31a
When we enter that if, we split a LOAD segment in two: the one that loads the replaced sections and the one that keeps loading the sections that stay in place.
However, for the "else", when there is enough space before the first non-replaced section, we don't check if the LOAD is large enough to map all the new rewritten sections. Then some of the sections may be dangling out of the load.
It's a bit hard, though, to nail the exact fix.
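The kind of check the missing "else" would need can be sketched in Python. This is not the patchelf fix: the `Load` and `Section` tuples and all offsets are hypothetical. It only shows the containment test that gdb and eu-elflint appear to be complaining about.

```python
from typing import NamedTuple

class Load(NamedTuple):
    offset: int   # p_offset
    filesz: int   # p_filesz

class Section(NamedTuple):
    name: str
    offset: int   # sh_offset
    size: int     # sh_size

def dangling_sections(loads, sections):
    """Names of sections that are not fully inside any single LOAD."""
    dangling = []
    for s in sections:
        covered = any(
            l.offset <= s.offset and s.offset + s.size <= l.offset + l.filesz
            for l in loads
        )
        if not covered:
            dangling.append(s.name)
    return dangling

# Hypothetical layout: the rewritten sections grew, the LOAD did not.
sections = [
    Section(".dynsym", 0x200, 0x300),
    Section(".dynstr", 0x800, 0x900),   # ends at 0x1100, past the LOAD
    Section(".dynamic", 0x1100, 0x200), # starts past the LOAD
]
before = dangling_sections([Load(0x0, 0x1000)], sections)
after = dangling_sections([Load(0x0, 0x1300)], sections)  # LOAD extended

print("before extending the LOAD:", before)
print("after extending the LOAD: ", after)
```

Extending the LOAD is just one way to make the check pass here. Whether that is safe in patchelf depends on what follows the segment in the file, which the sketch does not model.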
I think we should revert the patchelf default on staging-next for now, or switch to any other reliable version. I confirmed that ldc is broken by that as well, and I've seen some other build regressions that look caused by that. Spraying weird failures all over nixpkgs is just bad, and I fear not all will be even shown on Hydra (and I'm not counting out-of-official-repo use cases).
I wonder if you'd want to get a jobset on Hydra to verify a full nixpkgs rebuild before a (stable?) patchelf release is made. (or possibly even on a PR/branch if it's considered risky) This certainly isn't the first time we had to revert the default, e.g. I found NixOS/nixpkgs#69213
That would be lovely. I can see that Patchelf has accumulated fixes without an associated test. It's a bit painful to recover from that and being able to verify all nix packages will certainly help a lot. Perhaps we should add "ldd" test after patching anything because it catches several issues.
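A post-patch "ldd" check along those lines could look roughly like the Python sketch below. The function name and the injectable `run` parameter are my own. It assumes `ldd` exits non-zero when the loader crashes on the file and prints "not found" for unresolved libraries, which is how glibc's ldd usually behaves but is worth verifying on the target platform.

```python
import subprocess

def ldd_smoke_test(path, run=subprocess.run):
    """Return (ok, message) for a patched binary or library.

    ok is False when ldd fails outright (e.g. the loader segfaults on
    the file) or when it reports an unresolved dependency.
    """
    result = run(["ldd", path], capture_output=True, text=True)
    if result.returncode != 0:
        return False, f"ldd exited with {result.returncode}: {result.stderr.strip()}"
    missing = [line.strip() for line in result.stdout.splitlines()
               if "not found" in line]
    if missing:
        return False, "unresolved: " + "; ".join(missing)
    return True, "ok"
```

Calling `ldd_smoke_test("result/bin/emacs")` after each patchelf step would flag the step that breaks the file. Since ldd may invoke the program's loader, it should only be run on binaries you would also be willing to execute.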
I have now bumped patchelf to 0.18.0 in a branch based on nixpkgs staging and my emacs seems to be no longer corrupted.
pcloud seems to be affected by this as well - what's curious, when compiled with patchelfUnstable, it crashes inside ld-linux-x86-64.so.2!
i.e. doing:
diff --git a/pkgs/applications/networking/pcloud/default.nix b/pkgs/applications/networking/pcloud/default.nix
index 403d1e0cf34..93e9eb9b1d1 100644
--- a/pkgs/applications/networking/pcloud/default.nix
+++ b/pkgs/applications/networking/pcloud/default.nix
@@ -34,6 +34,7 @@
, libXdamage
, nss
, udev
+, patchelfUnstable
}:
let
@@ -62,6 +63,7 @@ stdenv.mkDerivation {
nativeBuildInputs = [
autoPatchelfHook
+ patchelfUnstable
];
buildInputs = [
... and then:
$ NIXPKGS_ALLOW_UNFREE=1 nix build --impure .#pcloud
$ gdb --args bash ./result/bin/pcloud
(gdb) b main
Breakpoint 1 at 0x31340
(gdb) r
Breakpoint 1, 0x0000555555585340 in main ()
(gdb) c
... results in:
warning: Loadable section ".interp" outside of ELF segments
in /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/pcloud
warning: Loadable section ".note.ABI-tag" outside of ELF segments
in /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/pcloud
warning: Loadable section ".dynstr" outside of ELF segments
in /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/pcloud
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fece58 in strcmp () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
(gdb) bt
#0 0x00007ffff7fece58 in strcmp () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#1 0x00007ffff7fdcaa4 in _dl_check_map_versions () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#2 0x00007ffff7fdd0f0 in _dl_check_all_versions () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#3 0x00007ffff7fe54bc in version_check_doit () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#4 0x00007ffff7fcb3fb in _dl_receive_error () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#5 0x00007ffff7fe7ae9 in dl_main () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#6 0x00007ffff7fe4483 in _dl_sysdep_start () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#7 0x00007ffff7fe5bac in _dl_start () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#8 0x00007ffff7fe4a58 in _start () from /nix/store/yaz7pyf0ah88g2v505l38n0f3wg2vzdj-glibc-2.37-8/lib/ld-linux-x86-64.so.2
#9 0x0000000000000001 in ?? ()
#10 0x00007fffffffe184 in ?? ()
#11 0x0000000000000000 in ?? ()
eu-elflint also seems to have an issue with one of the files there:
$ eu-elflint /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/libnode.so
section [ 2] '.dynsym': symbol 594 (_ZTSN6icu_6025CollationFastLatinBuilderE): st_value out of bounds
section [ 2] '.dynsym': symbol 603 (_ZTSN6icu_608CalendarE): st_value out of bounds
section [ 2] '.dynsym': symbol 608 (_ZTSN6icu_609LocaleKeyE): st_value out of bounds
section [ 2] '.dynsym': symbol 694 (_ZN6icu_6011StringPiece4nposE): st_value out of bounds
section [ 2] '.dynsym': symbol 704 (_ZN6icu_6016CollationBuilder11HAS_BEFORE2E): st_value out of bounds
section [ 2] '.dynsym': symbol 705 (_ZN6icu_609Collation20LEVEL_SEPARATOR_BYTEE): st_value out of bounds
section [ 2] '.dynsym': symbol 714 (_ZTSN6icu_6013ResourceValueE): st_value out of bounds
section [ 2] '.dynsym': symbol 734 (_ZN2v88internal17GCIdleTimeHandler22kConservativeTimeRatioE): st_value out of bounds
section [ 2] '.dynsym': symbol 751 (_ZTSN6icu_6010GenderInfoE): st_value out of bounds
section [ 2] '.dynsym': symbol 769 (_ZTSN6icu_6022UIterCollationIteratorE): st_value out of bounds
section [ 2] '.dynsym': symbol 828 (_ZTSN6icu_6014HebrewCalendarE): st_value out of bounds
section [ 2] '.dynsym': symbol 840 (_ZN6icu_6016CollationBuilder11HAS_BEFORE3E): st_value out of bounds
section [ 2] '.dynsym': symbol 935 (_ZTSN6icu_6014SimpleTimeZoneE): st_value out of bounds
section [ 2] '.dynsym': symbol 959 (_ZN6icu_6018CalendarAstronomer2PIE): st_value out of bounds
section [ 2] '.dynsym': symbol 989 (_ZN2v88internal11interpreter20ConstantArrayBuilder14k16BitCapacityE): st_value out of bounds
(+ like 100 more of those)
This issue seems to exist on all patchelf versions available in nixpkgs now, i.e. patchelf 0.13, 0.15 and unstable-2023-04-25 all generate invalid libnode.so's. It seems to have been somewhat exacerbated by NixOS/nixpkgs#209870, since the library now additionally links libgcc_s.so.1.
fwiw, it looks like pcloud (x86-64_linux) got (more) broken by #469 - i.e. doing:
diff --git a/pkgs/applications/networking/pcloud/default.nix b/pkgs/applications/networking/pcloud/default.nix
index 403d1e0cf34..93e9eb9b1d1 100644
--- a/pkgs/applications/networking/pcloud/default.nix
+++ b/pkgs/applications/networking/pcloud/default.nix
@@ -34,6 +34,7 @@
, libXdamage
, nss
, udev
+, patchelfUnstable
}:
let
@@ -62,6 +63,7 @@ stdenv.mkDerivation {
nativeBuildInputs = [
autoPatchelfHook
+ patchelfUnstable
];
buildInputs = [
diff --git a/pkgs/development/tools/misc/patchelf/unstable.nix b/pkgs/development/tools/misc/patchelf/unstable.nix
index 7d340cf547b..987f6bb8860 100644
--- a/pkgs/development/tools/misc/patchelf/unstable.nix
+++ b/pkgs/development/tools/misc/patchelf/unstable.nix
@@ -2,13 +2,13 @@
stdenv.mkDerivation rec {
pname = "patchelf";
- version = "unstable-2023-04-25";
+ version = "unstable";
src = fetchFromGitHub {
owner = "NixOS";
repo = "patchelf";
- rev = "008a582741617e2d7d5aa4aab1e8ddfdec0067d9";
- sha256 = "sha256-SC9zZbHN1p5BD6YHr+/ZNelmmZDozEO/vDwuCdJJCcs=";
+ rev = "27cbc89d4830d5ae1fe3a2396f2a6042266895bc";
+ sha256 = "sha256-FxwKznM/xcYZAmeKMAKYA2qkED4Zfayr62R7cg8AORA=";
};
# Drop test that fails on musl (?)
... generates a file that crashes inside ld-linux-x86-64.so.2 (like I mentioned above), but going a single commit before:
diff --git a/pkgs/applications/networking/pcloud/default.nix b/pkgs/applications/networking/pcloud/default.nix
index 403d1e0cf34..93e9eb9b1d1 100644
--- a/pkgs/applications/networking/pcloud/default.nix
+++ b/pkgs/applications/networking/pcloud/default.nix
@@ -34,6 +34,7 @@
, libXdamage
, nss
, udev
+, patchelfUnstable
}:
let
@@ -62,6 +63,7 @@ stdenv.mkDerivation {
nativeBuildInputs = [
autoPatchelfHook
+ patchelfUnstable
];
buildInputs = [
diff --git a/pkgs/development/tools/misc/patchelf/unstable.nix b/pkgs/development/tools/misc/patchelf/unstable.nix
index 7d340cf547b..cd986b539a4 100644
--- a/pkgs/development/tools/misc/patchelf/unstable.nix
+++ b/pkgs/development/tools/misc/patchelf/unstable.nix
@@ -2,13 +2,13 @@
stdenv.mkDerivation rec {
pname = "patchelf";
- version = "unstable-2023-04-25";
+ version = "unstable";
src = fetchFromGitHub {
owner = "NixOS";
repo = "patchelf";
- rev = "008a582741617e2d7d5aa4aab1e8ddfdec0067d9";
- sha256 = "sha256-SC9zZbHN1p5BD6YHr+/ZNelmmZDozEO/vDwuCdJJCcs=";
+ rev = "ac212d0e6fb8b741e5a5e9ea61091149103f401c";
+ sha256 = "sha256-JtobCiZEl3KeXT5CAhXTRhjAPgTVx2upVAUTJNCb/a0=";
};
# Drop test that fails on musl (?)
... yields a binary/library that at least can be loaded. Some symbols in libnode.so still seem to be linked incorrectly, but at least it doesn't cause ld to crash (NixOS/nixpkgs#226339 (comment)).
Edit: also, in all the invalid cases libnode.so has a funky procmap:
0x7ffff6600000 0x7ffff6dc8000 0x7c8000 0x0 r--p /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/libnode.so
0x7ffff6dc8000 0x7ffff7a7d000 0xcb5000 0x7c8000 r-xp /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/libnode.so
0x7ffff7a7d000 0x7ffff7b2c000 0xaf000 0x147d000 rw-p /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/libnode.so
0x7ffff7b2c000 0x7ffff7b43000 0x17000 0x0 rw-p
0x7ffff7b43000 0x7ffff7fbd000 0x47a000 0x152d000 rw-p /nix/store/5iarv8nx5i0q30d879i9v6wkbapxjvqb-pcloud-1.12.0/app/libnode.so
... where the fourth, uhm, part (segment? not sure on the terminology) has an offset of zero; I don't know much about elf files or how the stuff gets mapped into RAM, but it feels sus.
For comparison, here's a correct libnode.so (taken from pcloud built from nixpkgs:fdd49f1bcd8a7f0b5e29f550d698b2abe5c540cd):
0x7ffff6a00000 0x7ffff71c8000 0x7c8000 0x0 r--p /nix/store/dp16s7cfwslam9rd5l0mkj9skrvy49aq-pcloud-1.12.0/app/libnode.so
0x7ffff71c8000 0x7ffff7e7d000 0xcb5000 0x7c8000 r-xp /nix/store/dp16s7cfwslam9rd5l0mkj9skrvy49aq-pcloud-1.12.0/app/libnode.so
0x7ffff7e7d000 0x7ffff7f2c000 0xaf000 0x147d000 rw-p /nix/store/dp16s7cfwslam9rd5l0mkj9skrvy49aq-pcloud-1.12.0/app/libnode.so
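To compare the two maps mechanically, here is a small Python sketch that parses the rows above (store paths shortened to `libnode.so`; the helper name is mine). It normalises each start address against the first mapping and lists what the invalid map has on top of the correct one. The row with offset 0x0 has no file name, which is how anonymous memory typically shows up, so the zero offset may be benign by itself. The extra file-backed mapping is the more notable difference.

```python
# Rows copied from the thread; only the store path is shortened.
BAD = """
0x7ffff6600000 0x7ffff6dc8000 0x7c8000 0x0 r--p libnode.so
0x7ffff6dc8000 0x7ffff7a7d000 0xcb5000 0x7c8000 r-xp libnode.so
0x7ffff7a7d000 0x7ffff7b2c000 0xaf000 0x147d000 rw-p libnode.so
0x7ffff7b2c000 0x7ffff7b43000 0x17000 0x0 rw-p
0x7ffff7b43000 0x7ffff7fbd000 0x47a000 0x152d000 rw-p libnode.so
"""
GOOD = """
0x7ffff6a00000 0x7ffff71c8000 0x7c8000 0x0 r--p libnode.so
0x7ffff71c8000 0x7ffff7e7d000 0xcb5000 0x7c8000 r-xp libnode.so
0x7ffff7e7d000 0x7ffff7f2c000 0xaf000 0x147d000 rw-p libnode.so
"""

def parse(rows):
    """Return (start relative to first mapping, size, offset, perms, file_backed)."""
    parsed, base = [], None
    for line in rows.strip().splitlines():
        fields = line.split()
        start, end, size, offset = (int(x, 16) for x in fields[:4])
        if end - start != size:
            raise ValueError(f"inconsistent row: {line}")
        if base is None:
            base = start
        parsed.append((start - base, size, offset, fields[4], len(fields) > 5))
    return parsed

bad, good = parse(BAD), parse(GOOD)
extra = [m for m in bad if m not in good]
for rel, size, offset, perms, backed in extra:
    kind = "file-backed" if backed else "anonymous"
    print(f"extra: +{rel:#x} size {size:#x} offset {offset:#x} {perms} {kind}")
```

The first three mappings are identical once normalised. The invalid library differs only by an anonymous region and a further rw-p mapping of the file starting at offset 0x152d000.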
Edit 2: here's patchelf's log when building an invalid libnode.so:
searching for dependencies of /nix/store/rjw66ywwkd9r85559fnsiyw37idjhbxh-pcloud-1.12.0/app/libnode.so
libgcc_s.so.1 -> found: /nix/store/5gk8zqasr9hdhm9nhl0y7g0g7bf5lvbc-gcc-12.2.0-libgcc/lib
setting RPATH to: /nix/store/5gk8zqasr9hdhm9nhl0y7g0g7bf5lvbc-gcc-12.2.0-libgcc/lib
patching ELF file '/nix/store/rjw66ywwkd9r85559fnsiyw37idjhbxh-pcloud-1.12.0/app/libnode.so'
new rpath is '/nix/store/5gk8zqasr9hdhm9nhl0y7g0g7bf5lvbc-gcc-12.2.0-libgcc/lib'
rpath is too long or shared, resizing...
DT_NULL index is 30
replacing section '.dynamic' with size 512
replacing section '.dynstr' with size 1046753
this is a dynamic library
last page is 0x1543000
first page is 0x0
needed space is 4690320
rewriting section '.rodata' from offset 0x240 (size 3643048) to offset 0x152d000 (size 3643048)
rewriting section '.dynstr' from offset 0x40ee54 (size 1046687) to offset 0x18a66a8 (size 1046753)
rewriting section '.dynamic' from offset 0x1528250 (size 496) to offset 0x19a5f90 (size 512)
rewriting symbol table section 2
writing /nix/store/rjw66ywwkd9r85559fnsiyw37idjhbxh-pcloud-1.12.0/app/libnode.so
The needed space here is kinda suspicious as well, considering how large it is 🤔
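For what it's worth, the logged number can be reproduced from the log itself. The Python sketch below assumes each replaced section's size is rounded up to 8 bytes and that the sections are laid out back to back from offset 0x152d000. Both assumptions are inferred from the numbers, not taken from patchelf's source.

```python
def round_up(n, m):
    return (n + m - 1) // m * m

ALIGN = 8          # assumed per-section alignment (inferred, see above)
START = 0x152d000  # new offset of '.rodata' in the log

# (name, new size) in the order of the "rewriting section" lines
sections = [(".rodata", 3643048), (".dynstr", 1046753), (".dynamic", 512)]

offset = START
layout = []
for name, size in sections:
    layout.append((name, offset))
    offset += round_up(size, ALIGN)
needed = offset - START

for name, off in layout:
    print(f"{name:9s} -> {off:#x}")
print("needed space:", needed)
print("rounded to pages:", hex(round_up(needed, 0x1000)))
```

Under those assumptions the result matches the log: 4690320 bytes, with '.dynstr' at 0x18a66a8 and '.dynamic' at 0x19a5f90. The total is dominated by the 3643048-byte '.rodata' being moved along with the two dynamic sections. Rounded up to pages it is 0x47a000, the size of the last mapping in the procmap above.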
i think i'm seeing the same, or very similar, on a recent guix (i.e. patchelf 0.18.0).
i'm downloading the go-ethereum binary release and patching it to run on guix using patchelf (package source is available here).
if i read the git log correctly, then what broke it for me was a patchelf update from 0.11 to 0.18.0.
the moving parts are:
- the patchelf update
- i think go-ethereum also switched to golang 1.23.0 at the release that broke.
$ gdb ./geth
GNU gdb (GDB) 14.2
Reading symbols from ./geth...
(No debugging symbols found in ./geth)
(gdb) r
Starting program: /tmp/guix-build-geth-binary-1.14.10.drv-0/geth
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fe7c0a in dl_main () from /gnu/store/zvlp3n8iwa1svxmwv4q22pv1pb1c9pjq-glibc-2.39/lib/ld-linux-x86-64.so.2
(gdb) back
#0 0x00007ffff7fe7c0a in dl_main () from /gnu/store/zvlp3n8iwa1svxmwv4q22pv1pb1c9pjq-glibc-2.39/lib/ld-linux-x86-64.so.2
#1 0x00007ffff7fe45ef in _dl_sysdep_start () from /gnu/store/zvlp3n8iwa1svxmwv4q22pv1pb1c9pjq-glibc-2.39/lib/ld-linux-x86-64.so.2
#2 0x00007ffff7fe5d9c in _dl_start () from /gnu/store/zvlp3n8iwa1svxmwv4q22pv1pb1c9pjq-glibc-2.39/lib/ld-linux-x86-64.so.2
#3 0x00007ffff7fe4ba8 in _start () from /gnu/store/zvlp3n8iwa1svxmwv4q22pv1pb1c9pjq-glibc-2.39/lib/ld-linux-x86-64.so.2
#4 0x0000000000000001 in ?? ()
#5 0x00007fffffffc2fb in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb)
$ LD_DEBUG=all ./geth
10265: symbol=__vdso_clock_gettime; lookup in file=linux-vdso.so.1 [0]
10265: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_clock_gettime' [LINUX_2.6]
10265: symbol=__vdso_gettimeofday; lookup in file=linux-vdso.so.1 [0]
10265: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_gettimeofday' [LINUX_2.6]
10265: symbol=__vdso_time; lookup in file=linux-vdso.so.1 [0]
10265: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_time' [LINUX_2.6]
10265: symbol=__vdso_getcpu; lookup in file=linux-vdso.so.1 [0]
10265: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_getcpu' [LINUX_2.6]
10265: symbol=__vdso_clock_getres; lookup in file=linux-vdso.so.1 [0]
10265: binding file linux-vdso.so.1 [0] to linux-vdso.so.1 [0]: normal symbol `__vdso_clock_getres' [LINUX_2.6]
Segmentation fault
$
a nonguix issue that is probably related: https://gitlab.com/nonguix/nonguix/-/issues/350
nvidia-smi is also written in golang.
@attila-lendvai could you try using #544?
sadly, i do not notice any difference. it looks like it produces the same output with my binary.
but i've set up myself to relatively easily experiment with various different patchelf versions, so let me know if i should try anything else!
someone at @nonguix identified v0.16.1 as the latest that still works.
i tried go-ethereum with that version, and i can confirm that it works with v0.16.1.