lkrg-org/lkrg

"FATAL: Can't initialize sysctl" on post-6.10 kernels

Closed this issue · 4 comments

As seen e.g. in https://github.com/lkrg-org/lkrg/actions/runs/10074867728/job/27851932799

[   68.232329] LKRG: ALIVE: Loading LKRG
[   68.263420] Freezing user space processes
[   68.268790] Freezing user space processes completed (elapsed 0.005 seconds)
[   68.269608] OOM killer disabled.
[   69.094324] LKRG: ISSUE: [kretprobe] register_kretprobe() for <ovl_dentry_is_whiteout> failed! [err=-2]
[   69.095058] LKRG: ISSUE: Can't hook 'ovl_dentry_is_whiteout'. This is expected when OverlayFS is not used.
[   69.462918] sysctl table check failed: lkrg/(null) procname is null
[   69.463519] sysctl table check failed: lkrg/(null) No proc_handler
[   69.463995] LKRG: FATAL: Can't initialize sysctl
[   69.464337] LKRG: DYING: Not loading LKRG (initialization failed)
[   71.960075] WARNING: Flushing system-wide workqueues will be prohibited in near future.
[   71.960925] CPU: 0 UID: 0 PID: 506 Comm: modprobe Tainted: G           OE      6.10.0-061000daily20240722-generic #202407212203
[   71.961707] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[   71.961984] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[   71.962476] Call Trace:
[   71.963030]  <TASK>
[   71.963271]  dump_stack_lvl+0x76/0xa0
[   71.963726]  dump_stack+0x10/0x20
[   71.963927]  __warn_flushing_systemwide_wq+0x1a/0x30
[   71.964232]  p_exploit_detection_exit+0x33/0xc0 [lkrg]
[   71.964622]  p_lkrg_register+0x2b0/0xff0 [lkrg]
[   71.964946]  ? __pfx_p_lkrg_register+0x10/0x10 [lkrg]
[   71.965281]  do_one_initcall+0x5e/0x340
[   71.965599]  do_init_module+0x97/0x2a0

I think the most relevant commit is this (merged along with several others very recently):

commit d7a76ec87195ced6910b0ca10ca133bb316c90f5
Author: Joel Granados <j.granados@samsung.com>
Date:   Tue Jun 4 08:29:21 2024 +0200

    sysctl: Remove check for sentinel element in ctl_table arrays
    
    Use ARRAY_SIZE exclusively by removing the check to ->procname in the
    stopping criteria of the loops traversing ctl_table arrays. This commit
    finalizes the removal of the sentinel elements at the end of ctl_table
    arrays which reduces the build time size and run time memory bloat by
    ~64 bytes per sentinel (further information Link :
    https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)
    
    Remove the entry->procname evaluation from the for loop stopping
    criteria in sysctl and sysctl_net.
    
    Signed-off-by: Joel Granados <j.granados@samsung.com>

Also important is:

commit 9edbfe92a0a1355bae1e47c8f542ac0d39f19f8c
Author: Joel Granados <joel.granados@gmail.com>
Date:   Wed Aug 9 12:49:58 2023 +0200

    sysctl: Add size to register_sysctl
    
    This commit adds table_size to register_sysctl in preparation for the
    removal of the sentinel elements in the ctl_table arrays (last empty
    markers). And though we do *not* remove any sentinels in this commit, we
    set things up by either passing the table_size explicitly or using
    ARRAY_SIZE on the ctl_table arrays.
    
    We replace the register_syctl function with a macro that will add the
    ARRAY_SIZE to the new register_sysctl_sz function. In this way the
    callers that are already using an array of ctl_table structs do not
    change. For the callers that pass a ctl_table array pointer, we pass the
    table_size to register_sysctl_sz instead of the macro.
    
    Signed-off-by: Joel Granados <j.granados@samsung.com>
    Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

My summary is that the trailing zero-filled table entries were no longer required for a year, and are now disallowed. Since we support kernels older than a year as well, we should conditionally exclude these elements at least on kernels that disallow them and maybe also on kernels that don't require them. An easy way to implement the latter appears to be #ifdef register_sysctl, which would detect "replace the register_syctl function with a macro" in the commit above.

An easy way to implement the latter appears to be #ifdef register_sysctl, which would detect "replace the register_syctl function with a macro" in the commit above.

I've just pushed a fix along these lines, but:

  1. We'll only know whether it works once our CI setup is fixed, #343.
  2. I think ideally upstream should either revert the recent requirement of there being no sentinel element, or it should add a feature macro that out-of-tree modules could check for. We could want to suggest that in this thread: https://lists.openwall.net/linux-kernel/2024/07/16/687

Thanks for that work @solardiz. Regarding point 2, we can try but taking into account my last experience with Greg I doubt it will be taken into account since it's not an internal use-case, per reference:
https://lkml.iu.edu/hypermail/linux/kernel/2101.1/04227.html

Although, maybe it is worth to try.

we can try but taking into account my last experience with Greg I doubt it will be taken into account since it's not an internal use-case

Right. I've just tried and got a similar response from Linus:

On Tue, 6 Aug 2024 at 11:57, Solar Designer <solar@openwall.com> wrote:
>
> As (I assume it was) expected, these changes broke out-of-tree modules.

It was presumably not expected - but for the simple reason that no
kernel developer should spend one single second worrying about
out-of-tree modules.

It's simply not a concern - never has been, and never will be.

Now, if some out-of-tree module is on the cusp of being integrated,
and is out-of-tree just because it's not quite ready yet, that would
maybe be then a case of "hey, wait a second".

But no. We are not going to start any kind of feature test macros for
external modules in general.

                 Linus