RISC-V linker relaxation breaks booting with LTO
Opened this issue · 7 comments
I forgot to boot test v3 of RISC-V LTO enablement and to my surprise, kernels built with LTO and LLVM 15.0.0 and newer were broken (do not get past OpenSBI). I did a bisect of LLVM and landed on the commit that added support for RISC-V linker relaxation to ld.lld
: llvm/llvm-project@6611d58
# Apply https://lore.kernel.org/20231003-riscv-lto-v3-1-8aca61a4ecb4@kernel.org/ first then turn on CONFIG_LTO_CLANG_THIN in menuconfig
$ make -skj"$(nproc)" ARCH=riscv LLVM=1 mrproper defconfig menuconfig Image
$ boot-qemu.py -k . -t 20s
...
OpenSBI v1.3.1
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|___/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform Features : medeleg
...
Boot HART ID : 0
Boot HART Domain : root
Boot HART Priv Version : v1.12
Boot HART Base ISA : rv64imafdch
Boot HART ISA Extensions : time,sstc
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 16
Boot HART MIDELEG : 0x0000000000001666
Boot HART MEDELEG : 0x0000000000f0b509
qemu-system-riscv64: terminating on signal 15 from pid 2261906 (timeout)
...
With this diff applied on top:
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 1329e060c548..c65e9cff59f4 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -45,7 +45,6 @@ else
endif
ifeq ($(CONFIG_LD_IS_LLD),y)
-ifeq ($(call test-lt, $(CONFIG_LLD_VERSION), 150000),y)
KBUILD_CFLAGS += -mno-relax
KBUILD_AFLAGS += -mno-relax
ifndef CONFIG_AS_IS_LLVM
@@ -53,7 +52,6 @@ ifndef CONFIG_AS_IS_LLVM
KBUILD_AFLAGS += -Wa,-mno-relax
endif
endif
-endif
# ISA string setting
riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
The machine boots again:
$ make -skj"$(nproc)" ARCH=riscv LLVM=1 clean Image
$ boot-qemu.py -k . -t 20s
...
OpenSBI v1.3.1
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|___/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform Features : medeleg
...
Boot HART ID : 0
Boot HART Domain : root
Boot HART Priv Version : v1.12
Boot HART Base ISA : rv64imafdch
Boot HART ISA Extensions : time,sstc
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 16
Boot HART MIDELEG : 0x0000000000001666
Boot HART MEDELEG : 0x0000000000f0b509
[ 0.000000] Linux version 6.6.0-rc4-00002-g81137efe9431-dirty (nathan@dev-arch.thelio-3990X) (ClangBuiltLinux clang version 18.0.0 (https://github.com/llvm/llvm-project 9a954c693573281407f6ee3f4eb1b16cc545033d), ClangBuiltLinux LLD 18.0.0) #4 SMP Thu Oct 5 13:39:05 MST 2023
[ 0.000000] random: crng init done
[ 0.000000] Machine model: riscv-virtio,qemu
[ 0.000000] SBI specification v1.0 detected
[ 0.000000] SBI implementation ID=0x1 Version=0x10003
[ 0.000000] SBI TIME extension detected
[ 0.000000] SBI IPI extension detected
[ 0.000000] SBI RFENCE extension detected
[ 0.000000] SBI SRST extension detected
...
I will see if I can narrow down if this happens in a particular translation unit. In the meantime, if we wanted to workaround this in the kernel, we could add something like this to v4:
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 1329e060c548..ecaa784e8198 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -45,7 +45,7 @@ else
endif
ifeq ($(CONFIG_LD_IS_LLD),y)
-ifeq ($(call test-lt, $(CONFIG_LLD_VERSION), 150000),y)
+ifeq ($(filter y, $(call test-lt, $(CONFIG_LLD_VERSION), 150000) $(CONFIG_LTO_CLANG)),y)
KBUILD_CFLAGS += -mno-relax
KBUILD_AFLAGS += -mno-relax
ifndef CONFIG_AS_IS_LLVM
I can reproduce this issue with Linux 6.5 and LLVM commit 0ce6255a5058.
After some debugging using QEMU, I found the return value of ss->css_alloc(NULL)
(where it points to cpuset_css_alloc()
) in cgroup_init_subsys()
was NULL
and later triggered an unexpected page fault.
The source code of cpuset_css_alloc()
is:
static struct cgroup_subsys_state *
cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
{
struct cpuset *cs;
if (!parent_css)
return &top_cpuset.css;
// ...
Disassembling cpuset_css_alloc()
in vmlinux.o
shows:
0000000000000000 <cpuset_css_alloc>:
0: 1101 addi sp,sp,-32
2: ec06 sd ra,24(sp)
4: e822 sd s0,16(sp)
6: e426 sd s1,8(sp)
8: 1000 addi s0,sp,32
000000000000000a <.Lpcrel_hi14724>:
a: 00000597 auipc a1,0x0
a: R_RISCV_PCREL_HI20 top_cpuset
e: 00058593 mv a1,a1
e: R_RISCV_PCREL_LO12_I .Lpcrel_hi14724
12: c935 beqz a0,86 <.LBB4160_4>
0000000000000014 <.Lpcrel_hi14725>:
...
0000000000000086 <.LBB4160_4>:
86: 852e mv a0,a1
88: 60e2 ld ra,24(sp)
8a: 6442 ld s0,16(sp)
8c: 64a2 ld s1,8(sp)
8e: 6105 addi sp,sp,32
90: 8082 ret
Note that there is no relocation entry (like llvm/llvm-project#59350) for the beqz
, which makes the target of the beqz
wrong after linker relaxation.
And disassembling cpuset_css_alloc()
in vmlinux
shows:
ffffffff800e84fc <cpuset_css_alloc>:
ffffffff800e84fc: 1101 addi sp,sp,-32
ffffffff800e84fe: ec06 sd ra,24(sp)
ffffffff800e8500: e822 sd s0,16(sp)
ffffffff800e8502: e426 sd s1,8(sp)
ffffffff800e8504: 1000 addi s0,sp,32
ffffffff800e8506 <.Lpcrel_hi14724>:
ffffffff800e8506: 013a3597 auipc a1,0x13a3
ffffffff800e850a: ae258593 addi a1,a1,-1310 # ffffffff8148afe8 <top_cpuset>
ffffffff800e850e: c935 beqz a0,ffffffff800e8582 <.LBB4160_4+0x2>
ffffffff800e8510 <.Lpcrel_hi14725>:
...
ffffffff800e8580 <.LBB4160_4>:
ffffffff800e8580: 852e mv a0,a1
ffffffff800e8582: 60e2 ld ra,24(sp)
ffffffff800e8584: 6442 ld s0,16(sp)
ffffffff800e8586: 64a2 ld s1,8(sp)
ffffffff800e8588: 6105 addi sp,sp,32
ffffffff800e858a: 8082 ret
Note that the target of the beqz
is wrong and mv a0,a1
is skipped, which is exactly the same issue as llvm/llvm-project#65090.
According to @MaskRay 's comment in llvm/llvm-project#59350, I tried the following diff:
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 6ec6d52a4180..c8efe12b9238 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -55,6 +55,10 @@ endif
endif
endif
+ifdef CONFIG_LTO_CLANG
+ KBUILD_LDFLAGS += -mllvm -mattr=+c -mllvm -mattr=+relax
+endif
+
(something like llvm/llvm-project#50505 ?)
And the kernel can boot now.
Oh! Nice work!
(something like llvm/llvm-project#50505 ?)
Exactly!
KBUILD_LDFLAGS += -mllvm -mattr=+c -mllvm -mattr=+relax
I thought llvm/llvm-project#50505 addressed -mattr=+c
? Perhaps only relax is necessary?
I'm going to reopen llvm/llvm-project#50505. The above diff should be folded into the LTO series with a comment that it's working around that bug in LLVM.
@twd2 Would you like to send v4 with that diff added or would you like me to? I do not mind either way.
Oh! Nice work!
(something like llvm/llvm-project#50505 ?)
Exactly!
KBUILD_LDFLAGS += -mllvm -mattr=+c -mllvm -mattr=+relax
I thought llvm/llvm-project#50505 addressed
-mattr=+c
? Perhaps only relax is necessary?
I checked the bitcode and IR and found that +c
and +relax
were indeed encoded in the IR's target-features
:
attributes #2 = { ... "target-features"="+64bit,+a,+c,+m,+relax,...
But ld.lld
somehow ignored these attributes when doing LTO. (?)
I thought the diff I referenced in llvm/llvm-project#50505 (llvm/llvm-project@e63455d) only solved the issue when writing nops. And, if we don't pass +c
, e_flags
of the ELF header will not contain EF_RISCV_RVC
(MaskRay said in llvm/llvm-project#59350).
I'm going to reopen llvm/llvm-project#50505. The above diff should be folded into the LTO series with a comment that it's working around that bug in LLVM.
So, can you please change the title of llvm/llvm-project#50505 to something like "LLD ignores target-features
when doing LTO"? I think this is the root cause of 50505 and this issue.
Thanks.
@twd2 Would you like to send v4 with that diff added or would you like me to? I do not mind either way.
Please do that, thanks!
I thought there needed some if-conditions to check whether we should add +c
or +relax
, but I have not come up with a good solution.
v4 with this issue worked around: https://lore.kernel.org/20231017-riscv-lto-v4-1-e7810b24e805@kernel.org/
@hiraditya has been tracking this kind of issue with LTO and riscv for Android. He pointed me to the discussion in:
https://discourse.llvm.org/t/myterious-soft-float-output-in-lto-cache/70753