Enable CONFIG_DYNAMIC_DEBUG on the default raspbian build
Closed this issue ยท 19 comments
I am quite familiar with x86/x86_64 kernel debugging (being one of the in-tree driver maintainers...), and recently needed to look into some raspberrypi driver-related issues. I found that in raspbian buster (at least) , CONFIG_DYNAMIC_DEBUG is off. This is on by default on most x86/x86_64 linux desktop distros - at least those I regularly use. And the overhead is very small if you don't do additional steps to use it.
Could we switch this on in the default raspbian kernels?
+1 I'm currently trying to debug some issues with a V4L2 app. It would have been nice to have this on by default.
I agree that it's pretty important for this feature to be enabled by default--I hate the fact that I have to manually recompile the kernel in order to view Wireguard logs.
We usually expect requesters of a feature like this (which could have non-trivial cost on kernel size, free memory and performance) to test it and report any changes.
For a start build the kernel with this option disabled and enabled and report size of kernel.img (+modules) and output of 'free -h' straight after boot.
okay, I'll provide those info when I next need to do a kernel build. Others, please feel free to beat me to it!
I will get the results posted in the next day or two. Thanks.
@popcornmix @HinTak -- ok so I cross-compiled the 32-bit Pi 4 armv7l kernel both with and without CONFIG_DYNAMIC_DEBUG=y
. The resulting kernel size is ~135k bigger with dynamic debug enabled:
With dynamic debug
-rwxrwxr-x 1 user user 6911760 Jun 16 14:45 linux/arch/arm/boot/zImage
Without dynamic debug
-rwxrwxr-x 1 user user 6776744 Jun 18 11:55 linux-no-dd/arch/arm/boot/zImage
I then proceeded to compare the compiled module size between the two kernels and found absolutely no difference in size (see attached files):
dd_kernel_modules_sorted.csv
no_dd_kernel_modules_sorted.csv
Note: The following is not a fresh install as I have a few things installed, @HinTak -- if you have the ability to perform a fresh install the results may be better. Otherwise give me a few more days and I will be able to check the kernel against a fresh install of Raspberry Pi OS.
Here's also the lsmod from both kernels (excuse the fact I may have some extraneous modules running, such as "wireguard"):
With dynamic debug
Module Size Used by
xt_recent 24576 0
nf_log_ipv6 16384 5
ip6t_REJECT 16384 1
nf_reject_ipv6 20480 1 ip6t_REJECT
xt_hl 16384 22
ip6_tables 28672 52
ip6t_rt 20480 3
nf_log_ipv4 16384 5
nf_log_common 16384 2 nf_log_ipv6,nf_log_ipv4
ipt_REJECT 16384 1
nf_reject_ipv4 16384 1 ipt_REJECT
xt_LOG 20480 10
nft_limit 16384 13
xt_limit 16384 0
nf_conntrack_netbios_ns 16384 0
nf_conntrack_broadcast 16384 1 nf_conntrack_netbios_ns
nf_nat_ftp 20480 0
nf_conntrack_ftp 20480 1 nf_nat_ftp
cmac 16384 1
rfcomm 77824 4
fuse 131072 3
xt_nat 16384 6
bnep 32768 2
xt_tcpudp 20480 33
veth 28672 0
hci_uart 57344 1
btbcm 16384 1 hci_uart
bluetooth 573440 29 hci_uart,bnep,btbcm,rfcomm
ecdh_generic 16384 2 bluetooth
ecc 40960 1 ecdh_generic
xt_conntrack 16384 18
nf_conntrack_netlink 49152 0
xt_addrtype 16384 6
br_netfilter 32768 0
bridge 163840 1 br_netfilter
nft_chain_nat 16384 8
xt_MASQUERADE 20480 5
nf_nat 49152 4 xt_nat,xt_MASQUERADE,nft_chain_nat,nf_nat_ftp
nf_conntrack 155648 9 xt_nat,nf_conntrack_ftp,xt_MASQUERADE,nf_conntrack_broadcast,nf_nat_ftp,nf_conntrack_netlink,nf_conntrack_netbios_ns,xt_conntrack,nf_nat
nf_defrag_ipv6 24576 1 nf_conntrack
nf_defrag_ipv4 16384 1 nf_conntrack
nft_compat 20480 165
nft_counter 16384 212
nf_tables 196608 556 nft_limit,nft_compat,nft_chain_nat,nft_counter
nfnetlink 16384 4 nft_compat,nf_conntrack_netlink,nf_tables
overlay 122880 2
8021q 32768 0
garp 16384 1 8021q
stp 16384 2 garp,bridge
llc 16384 3 garp,bridge,stp
asix 49152 0
binfmt_misc 24576 1
cdc_acm 40960 2
rpivid_mem 16384 0
brcmfmac 331776 0
brcmutil 24576 1 brcmfmac
sha256_generic 16384 0
cfg80211 778240 1 brcmfmac
rfkill 32768 6 bluetooth,cfg80211
bcm2835_v4l2 40960 0
bcm2835_codec 40960 0
bcm2835_isp 32768 0
raspberrypi_hwmon 16384 0
bcm2835_mmal_vchiq 40960 3 bcm2835_isp,bcm2835_codec,bcm2835_v4l2
v4l2_mem2mem 36864 1 bcm2835_codec
videobuf2_vmalloc 20480 1 bcm2835_v4l2
videobuf2_dma_contig 24576 2 bcm2835_isp,bcm2835_codec
videobuf2_memops 20480 2 videobuf2_dma_contig,videobuf2_vmalloc
videobuf2_v4l2 32768 4 bcm2835_isp,bcm2835_codec,bcm2835_v4l2,v4l2_mem2mem
videobuf2_common 61440 5 bcm2835_isp,bcm2835_codec,bcm2835_v4l2,v4l2_mem2mem,videobuf2_v4l2
snd_bcm2835 32768 4
v3d 73728 2
videodev 258048 6 bcm2835_isp,bcm2835_codec,videobuf2_common,bcm2835_v4l2,v4l2_mem2mem,videobuf2_v4l2
gpu_sched 45056 1 v3d
vc_sm_cma 45056 2 bcm2835_isp,bcm2835_mmal_vchiq
mc 57344 6 bcm2835_isp,bcm2835_codec,videobuf2_common,videodev,v4l2_mem2mem,videobuf2_v4l2
joydev 24576 0
vc4 258048 2
cec 49152 1 vc4
drm_kms_helper 241664 3 vc4
drm 528384 8 v3d,vc4,gpu_sched,drm_kms_helper
drm_panel_orientation_quirks 16384 1 drm
snd_soc_core 245760 1 vc4
snd_compress 28672 1 snd_soc_core
snd_pcm_dmaengine 16384 1 snd_soc_core
snd_pcm 118784 4 snd_compress,snd_pcm_dmaengine,snd_bcm2835,snd_soc_core
snd_timer 36864 1 snd_pcm
snd 81920 13 snd_compress,snd_timer,snd_bcm2835,snd_soc_core,snd_pcm
syscopyarea 16384 1 drm_kms_helper
sysfillrect 16384 1 drm_kms_helper
sysimgblt 16384 1 drm_kms_helper
fb_sys_fops 16384 1 drm_kms_helper
backlight 24576 1 drm
uio_pdrv_genirq 20480 0
nvmem_rmem 16384 0
uio 20480 1 uio_pdrv_genirq
wireguard 81920 0
curve25519_neon 28672 1 wireguard
libcurve25519_generic 24576 2 curve25519_neon,wireguard
libchacha20poly1305 16384 1 wireguard
chacha_neon 32768 1 libchacha20poly1305
poly1305_arm 24576 1 libchacha20poly1305
ip6_udp_tunnel 16384 1 wireguard
udp_tunnel 28672 1 wireguard
libblake2s 16384 1 wireguard
libblake2s_generic 20480 1 libblake2s
i2c_dev 24576 0
ip_tables 28672 8
x_tables 36864 15 xt_LOG,ipt_REJECT,xt_nat,ip_tables,nft_compat,xt_tcpudp,xt_MASQUERADE,xt_limit,ip6t_REJECT,xt_recent,xt_addrtype,ip6t_rt,xt_conntrack,ip6_tables,xt_hl
ipv6 512000 78 nf_reject_ipv6,br_netfilter,wireguard,bridge
Without dynamic debug
Module Size Used by
cmac 16384 1
rfcomm 49152 4
fuse 126976 3
xt_nat 16384 6
veth 28672 0
bnep 20480 2
hci_uart 40960 1
btbcm 16384 1 hci_uart
bluetooth 393216 29 hci_uart,bnep,btbcm,rfcomm
ecdh_generic 16384 2 bluetooth
ecc 40960 1 ecdh_generic
nf_conntrack_netlink 49152 0
br_netfilter 32768 0
bridge 163840 1 br_netfilter
nft_chain_nat 16384 8
xt_MASQUERADE 16384 5
overlay 106496 2
8021q 32768 0
garp 16384 1 8021q
stp 16384 2 garp,bridge
llc 16384 3 garp,bridge,stp
cdc_acm 24576 2
binfmt_misc 20480 1
rpivid_mem 16384 0
brcmfmac 327680 0
brcmutil 20480 1 brcmfmac
sha256_generic 16384 0
cfg80211 761856 1 brcmfmac
v3d 73728 2
gpu_sched 45056 1 v3d
rfkill 32768 6 bluetooth,cfg80211
raspberrypi_hwmon 16384 0
bcm2835_v4l2 40960 0
bcm2835_codec 36864 0
v4l2_mem2mem 36864 1 bcm2835_codec
bcm2835_isp 32768 0
bcm2835_mmal_vchiq 28672 3 bcm2835_isp,bcm2835_codec,bcm2835_v4l2
videobuf2_dma_contig 20480 2 bcm2835_isp,bcm2835_codec
videobuf2_vmalloc 16384 1 bcm2835_v4l2
videobuf2_memops 16384 2 videobuf2_dma_contig,videobuf2_vmalloc
videobuf2_v4l2 32768 4 bcm2835_isp,bcm2835_codec,bcm2835_v4l2,v4l2_mem2mem
videobuf2_common 61440 5 bcm2835_isp,bcm2835_codec,bcm2835_v4l2,v4l2_mem2mem,videobuf2_v4l2
snd_bcm2835 28672 4
vc4 253952 2
videodev 249856 6 bcm2835_isp,bcm2835_codec,videobuf2_common,bcm2835_v4l2,v4l2_mem2mem,videobuf2_v4l2
cec 49152 1 vc4
mc 45056 6 bcm2835_isp,bcm2835_codec,videobuf2_common,videodev,v4l2_mem2mem,videobuf2_v4l2
vc_sm_cma 32768 2 bcm2835_isp,bcm2835_mmal_vchiq
drm_kms_helper 241664 3 vc4
drm 520192 8 v3d,vc4,gpu_sched,drm_kms_helper
drm_panel_orientation_quirks 16384 1 drm
snd_soc_core 225280 1 vc4
snd_compress 20480 1 snd_soc_core
snd_pcm_dmaengine 16384 1 snd_soc_core
snd_pcm 106496 5 snd_compress,vc4,snd_pcm_dmaengine,snd_bcm2835,snd_soc_core
snd_timer 32768 1 snd_pcm
snd 77824 13 snd_compress,snd_timer,snd_bcm2835,snd_soc_core,snd_pcm
syscopyarea 16384 1 drm_kms_helper
sysfillrect 16384 1 drm_kms_helper
sysimgblt 16384 1 drm_kms_helper
fb_sys_fops 16384 1 drm_kms_helper
backlight 20480 1 drm
nvmem_rmem 16384 0
uio_pdrv_genirq 16384 0
uio 20480 1 uio_pdrv_genirq
nf_log_ipv6 16384 5
joydev 20480 0
ip6t_REJECT 16384 1
nf_reject_ipv6 16384 1 ip6t_REJECT
xt_hl 16384 22
ip6_tables 28672 52
ip6t_rt 16384 3
nf_log_ipv4 16384 5
nf_log_common 16384 2 nf_log_ipv6,nf_log_ipv4
ipt_REJECT 16384 1
nf_reject_ipv4 16384 1 ipt_REJECT
xt_LOG 16384 10
nft_limit 16384 13
xt_limit 16384 0
xt_addrtype 16384 6
xt_tcpudp 16384 33
xt_conntrack 16384 18
nft_compat 20480 165
nft_counter 16384 212
nf_conntrack_netbios_ns 16384 0
nf_conntrack_broadcast 16384 1 nf_conntrack_netbios_ns
nf_nat_ftp 16384 0
nf_nat 49152 4 xt_nat,xt_MASQUERADE,nft_chain_nat,nf_nat_ftp
nf_conntrack_ftp 16384 1 nf_nat_ftp
nf_conntrack 139264 9 xt_nat,nf_conntrack_ftp,xt_MASQUERADE,nf_conntrack_broadcast,nf_nat_ftp,nf_conntrack_netlink,nf_conntrack_netbios_ns,xt_conntrack,nf_nat
nf_defrag_ipv6 20480 1 nf_conntrack
nf_defrag_ipv4 16384 1 nf_conntrack
nf_tables 196608 556 nft_limit,nft_compat,nft_chain_nat,nft_counter
nfnetlink 16384 4 nft_compat,nf_conntrack_netlink,nf_tables
wireguard 73728 0
curve25519_neon 28672 1 wireguard
libcurve25519_generic 24576 2 curve25519_neon,wireguard
libchacha20poly1305 16384 1 wireguard
chacha_neon 32768 1 libchacha20poly1305
poly1305_arm 24576 1 libchacha20poly1305
ip6_udp_tunnel 16384 1 wireguard
udp_tunnel 24576 1 wireguard
libblake2s 16384 1 wireguard
libblake2s_generic 20480 1 libblake2s
i2c_dev 20480 0
ip_tables 28672 8
x_tables 32768 14 xt_LOG,ipt_REJECT,xt_nat,ip_tables,nft_compat,xt_tcpudp,xt_MASQUERADE,xt_limit,ip6t_REJECT,xt_addrtype,ip6t_rt,xt_conntrack,ip6_tables,xt_hl
ipv6 495616 78 nf_reject_ipv6,br_netfilter,wireguard,bridge
And finally, the free -h
with and without DD:
With dynamic debug
total used free shared buff/cache available
Mem: 7.7Gi 275Mi 7.0Gi 40Mi 478Mi 7.2Gi
Swap: 99Mi 0B 99Mi
Without dynamic debug
total used free shared buff/cache available
Mem: 7.7Gi 264Mi 7.0Gi 51Mi 458Mi 7.2Gi
Swap: 99Mi 0B 99Mi
@cjones26 Thanks for the effort! I see you have a whole bunch of firewall / wireguard related modules. That 11MB transfer from shared to used in output of free -h
is a bit curious. I seem to have a much lighter system than yours:
$ free -h
total used free shared buff/cache available
Mem: 3.8Gi 71Mi 1.1Gi 16Mi 2.7Gi 3.6Gi
Swap: 99Mi 48Mi 51Mi
This is a fairly minimal headless setup I used for debugging a out-of-tree driver a month or two ago. Maybe I'll go ahead and build a kernel too...
I am not surprised that individual modules are not affected - the overhead is likely in the kernel core and with maintaining the extra debugfs string tables.
@popcornmix @cjones26 okay, I have made two clones and can freely swap between the two, so let me know if you want to know anything else.
git clone --depth=1 -b rpi-5.10.y git://github.com/raspberrypi/linux linux-5.10-a
cp -ipr linux-5.10-a linux-5.10-b
It is at 0088fda . For linux-5.10-a, I just did make bcm2711_defconfig
before the usual build instructons. For linux-5.10-b, I did 'make menuconfig' in addition to enable dynamic debugging. In addition, I edited the top-level Makefile of both to EXTRAVERSION = s
or b
. (sorry, it was s
for stock
, instead of a
).
$ diff -u linux-5.10-a/.config linux-5.10-b/.config
--- linux-5.10-a/.config 2021-06-18 23:16:10.315335671 +0100
+++ linux-5.10-b/.config 2021-06-19 00:37:32.368208391 +0100
@@ -1,6 +1,6 @@
#
# Automatically generated file; DO NOT EDIT.
-# Linux/arm 5.10.44s Kernel Configuration
+# Linux/arm 5.10.44b Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="gcc (Raspbian 8.3.0-6+rpi1) 8.3.0"
CONFIG_CC_IS_GCC=y
@@ -1585,7 +1585,6 @@
# CONFIG_BT_MSFTEXT is not set
CONFIG_BT_DEBUGFS=y
# CONFIG_BT_SELFTEST is not set
-# CONFIG_BT_FEATURE_DEBUG is not set
#
# Bluetooth device drivers
@@ -7750,8 +7749,8 @@
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
CONFIG_BOOT_PRINTK_DELAY=y
-# CONFIG_DYNAMIC_DEBUG is not set
-# CONFIG_DYNAMIC_DEBUG_CORE is not set
+CONFIG_DYNAMIC_DEBUG=y
+CONFIG_DYNAMIC_DEBUG_CORE=y
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options
The results:
$ ls -l linux-5.10-?/arch/arm/boot/zImage
-rwxr-xr-x 1 pi pi 6774112 Jun 19 00:24 linux-5.10-a/arch/arm/boot/zImage
-rwxr-xr-x 1 pi pi 6906424 Jun 19 01:50 linux-5.10-b/arch/arm/boot/zImage
$ ls -l linux-5.10-?/vmlinux
-rwxr-xr-x 1 pi pi 20810344 Jun 19 00:24 linux-5.10-a/vmlinux
-rwxr-xr-x 1 pi pi 21265176 Jun 19 01:49 linux-5.10-b/vmlinux
$ du -s /lib/modules/5.10.44?-v7l+
61732 /lib/modules/5.10.44s-v7l+
65368 /lib/modules/5.10.44b-v7l+
Here are the sorted listings under /lib/modules/...
, done with
find . -type f -printf '%s\t%p\n' | sort -k 2
You can look at the details yourself, but for the 1677 files under there, 1129 are the same size, 548 are large for the dynamic debug version.
So in terms of static sizes, the kernel is increased by about 132k compressed, 455k decompressed; the sum total size of the kernel modules increases by 3.6MB out of ~62MB, or about 5%; but 2/3 of kernel modules built for the pi 4 defconfig actually stays the same size. I think that's because dynamic debugging have no effect on stable kernel modules which does not contain debug statements, which is 2/3 of all the modules in the pi 4 defconfig.
for free -h
, I rebooted 4 times, actuall, with, without, with, without. The result is really inconclusive:
total used free shared buff/cache available
Mem: 3.7Gi 47Mi 3.6Gi 16Mi 89Mi 3.6Gi
Swap: 99Mi 0B 99Mi
total used free shared buff/cache available
Mem: 3.7Gi 47Mi 3.6Gi 16Mi 88Mi 3.6Gi
Swap: 99Mi 0B 99Mi
total used free shared buff/cache available
Mem: 3.7Gi 47Mi 3.6Gi 8.0Mi 80Mi 3.6Gi
Swap: 99Mi 0B 99Mi
total used free shared buff/cache available
Mem: 3.7Gi 46Mi 3.6Gi 16Mi 88Mi 3.6Gi
Swap: 99Mi 0B 99Mi
The other interesting thing is the Memory line in dmesg
:
# without
Memory: 3666800K/4050944K available (10240K kernel code, 1364K rwdata, 3188K rodata, 2048K init, 891K bss, 122000K reserved, 262144K cma-reserved, 3264512K highmem)
# with
Memory: 3666744K/4050944K available (10240K kernel code, 1420K rwdata, 3352K rodata, 2048K init, 891K bss, 122056K reserved, 262144K cma-reserved, 3264512K highmem)
From further above, the uncompressed kernel image file size is increased by 455k; and resulting runtime change is in 56k more rwdata and 164k more rodata, 56k more reseved. I don't know if 455k is supposed to related to that 56k + 164k + 56k?
Anyway, how do you decide whether these numbers are large? as you see this is a pi 4 with 4GB ram; 0.5MB for increase in uncompressed kernel image file size, plus 4MB of kernel modules (2/3 of them don't change, and many of the rest are not used) increase seem small.
As I have two separate builds with different extraversions, I can switch between them just by copy /boot/kernel7l.img
alone and reboot. So if you have any questions about different between them, I can go look. At some point I'll go back to a distribution kernel, but since distribution kernels don't have the same extraversion, it is fairly easy to switch back to one of these.
Mine is a native build, BTW.
@cjones26 I am surprised that you found no changes in kernel modules sizes - I found about 2/3 are the same, so I looked at your csv file - you are looking at *.mod
instead of *.ko
! *.mod
is just one of generated object code files! I think they are kernel module entry point stubs, so natually don't have debug code in them.
Welp, apologies I am a bit of a newbie when it comes to the Linux kernel. Thanks for posting more detailed information and giving me a heads up!
@cjones26 no need to apologize - I appreciate the effort, and it is always good to have another person checking. I think the uncompressed size is supposed to be related to the runtime memory footprint line, but maybe @popcornmix knows better? I am actually quite surprised that it is such a high proportion (0.5M / 20M, and 4M / 65M, say 3-5%), but maybe because the pi 4 defconfig is already quite minimal, unlike the typical x86 scenario.
To my mind that is too large an overhead to inflict on all users, given that the kind of users who are likely to want to enable debug are also the ones who would end up compiling a kernel. Kernel debug facilities are not something that would normally be used frequently. Does WireGuard really not provide other logging options?
To my mind that is too large an overhead to inflict on all users, given that the kind of users who are likely to want to enable debug are also the ones who would end up compiling a kernel. Kernel debug facilities are not something that would normally be used frequently. Does WireGuard really not provide other logging options?
Got it. Strangely, and unfortunately, they do not provide other logging options.
I am surprised about the high proportion (however one sees it, it is no less than ~2%). I wonder if the pi set of modules and pi core code paths have a higher debug print counts than the rest (and more than the x86 ones). One interesting finding is that the effects on kernel modules are not uniform, and affects 1/3 of them, and some worse than others. Maybe that's a useful classification to spend time on improve the individual modules on.
That wireguard uses a debugging functionality as a routine day-to-day logging operation is certainly a strange decision.
The original motivation for this filing was for out-of-tree driver debugging; meanwhile wireguard has changed from being out of tree to in tree, I think? Anyway, seeing what I find now, I agree that it is not a good idea to enable it by default. It would be interesting to know if the overhead would get better in a year or two's time on the pi though.
It would be interesting to know if the overhead would get better in a year or two's time on the pi though.
I agree. In the meantime I'm going to close this issue.
FWIW I ran into this Wireguard logging peculiarity.
It would be nice if I could see Wireguard log messages. But I understand there are performance trade-offs.
I opened a discussion about it at MichaIng/DietPi#5847
given that the kind of users who are likely to want to enable debug are also the ones who would end up compiling a kernel.
I'm the kind of user that wants to enable (WireGuard) debug
but definitely not one who would end up compiling a kernel haha.
Just wanted to post here for future reference - are there by any
chance somewhat-official pre-built kernels with dynamic debug
enabled? Something that can be optionally installed and booted
into?
Alternatively, would there be a guide on how to recompile the Pi
kernel with dynamic debug enabled?
@mck182 recompile pi kernel with dynamic debug enabled is not hard. There is already an official "bulld your own pi kernel" guide from the pi people on the official pi website. The small additional step(change one kernel build option) you follow any of the standard or hobbyist Linux kernel page on such, the short version is my comment #3486 (comment) above. Perhaps you go and read thr official pi kernel build page first, then read that comment of mine above, then ask questions of things not clear?