[RFC] Using Clang (ThinLTO) for the default kernel in the cachyos repository
Hi,
I'm considering using ThinLTO for "linux-cachyos" by default in the cachyos repository.
The Clang-built kernel has become quite stable over the years and is also what Google uses by default in Android, together with ThinLTO.
A Clang-built kernel provides a number of benefits, such as:
- Rust support (could be interesting for using https://www.phoronix.com/news/Linux-6.12-DRM-Panic-QR-Code)
- Better Performance --> https://www.phoronix.com/review/clang-lto-kernel
There are some outstanding issues that need to be discussed or fixed:
- nvidia-open-dkms with ThinLTO does not boot; fix can be found here: dell/dkms#417, CachyOS/CachyOS-PKGBUILDS@f402040
- Find other dkms modules which are not compatible
- The maintenance of the archlinux clang toolchain
- Test kCFI/FineIBT for the kernel and its impact on performance and modules
  - Breaks NVIDIA modules
This issue is for general discussion and for working out the pending issues that need to be resolved.
Also, we should consider still providing the default kernel built with GCC, as a "linux-cachyos-gcc" variant or similar, but this would only be offered in the cachyos repository.
Known problematic DKMS modules:
- vmware-workstation (maybe we can build this at compilation time, like we do with zfs/nvidia) - https://paste.soulharsh007.dev/p/5tY8t6.log - workaround available here: CachyOS/CachyOS-PKGBUILDS#320 - cannot be distributed due to its license
- nvidia-390xx-dkms - Fixed by: CachyOS/CachyOS-PKGBUILDS#318
- nvidia-470xx-dkms - Fixed by: CachyOS/CachyOS-PKGBUILDS#319
- 8192cu-dkms - https://paste.cachyos.org/p/6059efb.c
- rtl8812au-dkms - https://paste.cachyos.org/p/385e3f2.txt
- rtl8821cu-morrownr-dkms-git - https://paste.cachyos.org/p/7545b73.txt
- rtl88xxau-aircrack-dkms-git - https://paste.cachyos.org/p/2fe7c0c.log
  - mainly ARM project, ignoring.
Benchmarking
We should benchmark the ThinLTO kernel extensively against the GCC-built kernel.
There will be a number of cachyos-benchmarker runs, as well as a larger Phoronix Test Suite run.
Additionally, we should test with different -march targets, e.g. v3/znver4, to see whether any regressions show up.
DKMS probably will not be such a priority issue since we already mostly rely on NVIDIA/v4l2loopback modules that are already pre-built for our kernel.
vmware-workstation has problems with Clang-built kernels.
According to the log, it fails because of -Wstrict-prototypes. Passing -Wno-strict-prototypes in its dkms.conf should fix it.
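A minimal sketch of what such a dkms.conf could look like, assuming the module is built through kbuild and that the flag is handed over via kbuild's KCFLAGS variable (documented as additional options for the C compiler). The package name, version, and module name below are placeholders, not taken from the actual vmware-workstation package:

```shell
# Hypothetical dkms.conf sketch; package name, version, and module name are placeholders.
PACKAGE_NAME="example-module"
PACKAGE_VERSION="1.0"
BUILT_MODULE_NAME[0]="example"
DEST_MODULE_LOCATION[0]="/kernel/drivers/misc"

# Assumption: the module builds via kbuild, so extra compiler flags can be passed
# through KCFLAGS. kernel_source_dir and dkms_tree are filled in by dkms at build time.
MAKE[0]="make -C ${kernel_source_dir} M=${dkms_tree}/${PACKAGE_NAME}/${PACKAGE_VERSION}/build KCFLAGS=-Wno-strict-prototypes modules"
AUTOINSTALL="yes"
```

Whether this is sufficient depends on how the module's own Makefile assembles its flags, so it would need to be verified against the failing build log.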
Done with fixes for issues with NVIDIA 390xx modules: CachyOS/CachyOS-PKGBUILDS#318
We could also look into enabling kCFI and FineIBT for the kernel by default when switching to LLVM-built kernels.
This needs to be tested as well as benchmarked.
kCFI should have a smaller performance hit than CFI, but some overhead should still be present.
This would increase the security of the kernel.
kCFI appears to break the NVIDIA module. So, this can not be used yet. Same for FineIBT.
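When testing modules against different builds, it can help to confirm which of these options a given kernel actually enables. A small sketch, assuming the Kconfig symbol names CONFIG_CC_IS_CLANG, CONFIG_LTO_CLANG_THIN, CONFIG_CFI_CLANG and CONFIG_FINEIBT as used by kernels around 6.11 (they may differ in other versions); the helper name `check_kernel_opts` is made up for illustration:

```shell
# Print which of the discussed options are enabled in a kernel config file.
# Usage: check_kernel_opts <config-file>   (plain text or gzip-compressed)
check_kernel_opts() {
    # gzip -cdf decompresses gzip input and passes plain text through unchanged
    gzip -cdf -- "$1" | grep -E '^CONFIG_(CC_IS_CLANG|LTO_CLANG_THIN|CFI_CLANG|FINEIBT)=y$'
}

# Example: inspect the running kernel if it exposes its config
# (only available when the kernel was built with CONFIG_IKCONFIG_PROC).
if [ -r /proc/config.gz ]; then
    check_kernel_opts /proc/config.gz || echo "none of the checked options are enabled"
fi
```

The function returns a non-zero status when none of the options are set, so it can also be used in a script that decides whether a DKMS workaround is needed.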
> vmware-workstation has problems with Clang-built kernels. According to the log, it fails because of -Wstrict-prototypes. Passing -Wno-strict-prototypes in its dkms.conf should fix it.
I usually patch the -Werror line away when I compile my own kernels. The error comes from /lib/modules/version/build/scripts/Makefile.extrawarn:
KBUILD_CFLAGS += -Werror=strict-prototypes
Removing it doesn't seem to cause any issues.
But for the cachyos repo kernels it needs some workaround.
EDIT:
Entire post. :)
diff --git a/scripts/Makefile.clang b/scripts/Makefile.clang
index 6c23c6af797f..59b2e19416af 100644
--- a/scripts/Makefile.clang
+++ b/scripts/Makefile.clang
@@ -31,9 +31,7 @@ endif
# certain optimization flags it knows it has not implemented.
# Make it behave more like gcc by erroring when these flags are encountered
# so they can be implemented or wrapped in cc-option.
-CLANG_FLAGS += -Werror=unknown-warning-option
CLANG_FLAGS += -Werror=ignored-optimization-argument
CLANG_FLAGS += -Werror=option-ignored
-CLANG_FLAGS += -Werror=unused-command-line-argument
KBUILD_CPPFLAGS += $(CLANG_FLAGS)
export CLANG_FLAGS
diff --git a/scripts/Makefile.extrawarn b/scripts/Makefile.extrawarn
index 1d13cecc7cc7..ec97275c8852 100644
--- a/scripts/Makefile.extrawarn
+++ b/scripts/Makefile.extrawarn
@@ -12,7 +12,6 @@ KBUILD_CFLAGS += -Wundef
KBUILD_CFLAGS += -Werror=implicit-function-declaration
KBUILD_CFLAGS += -Werror=implicit-int
KBUILD_CFLAGS += -Werror=return-type
-KBUILD_CFLAGS += -Werror=strict-prototypes
KBUILD_CFLAGS += -Wno-format-security
KBUILD_CFLAGS += -Wno-trigraphs
KBUILD_CFLAGS += $(call cc-disable-warning,frame-address,)
@@ -68,9 +67,6 @@ KBUILD_CFLAGS += $(KBUILD_CFLAGS-y) $(CONFIG_CC_IMPLICIT_FALLTHROUGH)
# Prohibit date/time macros, which would make the build non-deterministic
KBUILD_CFLAGS += -Werror=date-time
-# enforce correct pointer usage
-KBUILD_CFLAGS += $(call cc-option,-Werror=incompatible-pointer-types)
-
# Require designated initializers for all marked structures
KBUILD_CFLAGS += $(call cc-option,-Werror=designated-init)
With this diff applied to the kernel, all of the incompatible dkms modules found so far, except rtl88xxau-aircrack-dkms-git, build and install successfully.
Looking at https://github.com/aircrack-ng/rtl8812au, it seems that it is mainly targeting ARM/ARM64 devices. Therefore it is out of scope for us.
@1Naim
We can think about applying this patchset conditionally: if thinlto, then source + ...
This would avoid also patching the GCC kernels.
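A rough sketch of that idea in PKGBUILD-style bash. The option name `_use_llvm_lto`, the placeholder source entries, and the patch file name `dkms-clang.patch` are assumptions for illustration, not the actual package contents:

```shell
# PKGBUILD-style sketch (bash). Variable and file names are assumptions.
: "${_use_llvm_lto:=thin}"          # hypothetical build option: thin, full, or none

source=("linux.tar.xz" "config")    # placeholder entries

# Only ThinLTO (Clang) builds pick up the relaxed-warning patch;
# GCC builds keep the unpatched kernel Makefiles.
if [ "$_use_llvm_lto" = "thin" ]; then
    source+=("dkms-clang.patch")    # hypothetical name for the diff above
fi
```

In a real PKGBUILD the checksum array would need a matching extra entry whenever the patch is added to `source`.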
We can do a rollout first to the LTO kernels with a 2-4 week grace period before we make it the default.
Please don't forget the problems with the winesync-dkms module on newer LTO-Kernels (see: dell/dkms#439). This is a winesync problem though but shows when a modern LTO-Kernel is used. The following changes to winesync.c and winesync.h are needed to get it compiled properly with newer LTO-Kernels, see the following with more details: ms178/archpkgbuilds@ee01c7f
That's not reproducible on llvm18. Also, we have dropped winesync support, so it is not an issue for us.
As mentioned in the dkms issue, it is also relevant for NTSync although with a different modpost error. Please keep it in mind as sooner or later it might become a problem with LLVM-19+.
That is not an issue either, since NTSync is shipped inside the kernel.
> Please don't forget the problems with the winesync-dkms module on newer LTO-Kernels
Well, the winesync module hasn't been updated for 2 years now. It is very much deprecated and abandoned at this point. We also do not ship it anymore as mentioned by @ptr1337.
> it is also relevant for NTSync although with a different modpost error
We will tackle it once Arch has updated their LLVM toolchain. I'm sure it will cause build problems when that happens; if there are no updates from upstream, we will need to patch it.
Why not go all the way with full lto?
From my side, I think we are mostly ready for this. We could start rolling this out on the Zen4 repository first, and then proceed with the v4/v3 repositories after around one week (maybe with 6.11.2).
IMO, we should target 6.11.2 for all repos. Either that, or wait a few days after the ISO release and then push to znver4.