Not working with IPQ807x, all of the interrupts classified as class0
Closed this issue · 13 comments
Hi! The IPQ807x is now supported on OpenWrt. I have a Xiaomi AX3600, but this also happens with other routers, as seen in the official OpenWrt forum. I'm using irqbalance there with the latest OpenWrt master, but all of the interrupts land on CPU0 and all of them are classified as class 0.
This is the data you need I think:
Version: 1.9.2-2 (as defined by OpenWrt)
#cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
9: 0 0 0 0 GIC-0 39 Level arch_mem_timer
13: 1474020 1488109 1147576 1127555 GIC-0 20 Level arch_timer
16: 2 0 0 0 GIC-0 354 Edge smp2p
17: 0 0 0 0 GIC-0 216 Level 4a9000.thermal-sensor
18: 0 0 0 0 GIC-0 239 Level bam_dma
21: 0 0 0 0 GIC-0 270 Level bam_dma
22: 5 0 0 0 GIC-0 340 Level msm_serial0
23: 64315 0 0 0 GIC-0 178 Level bam_dma
24: 0 0 0 0 GIC-0 35 Edge wdt_bark
25: 0 0 0 0 GIC-0 357 Edge q6v5 wdog
29: 5 0 0 0 GIC-0 348 Edge ce0
30: 17601658 0 0 0 GIC-0 347 Edge ce1
31: 8673711 0 0 0 GIC-0 346 Edge ce2
32: 213398 0 0 0 GIC-0 343 Edge ce3
34: 31 0 0 0 GIC-0 443 Edge ce5
36: 190410 0 0 0 GIC-0 72 Edge ce7
38: 0 0 0 0 GIC-0 334 Edge ce9
39: 1 0 0 0 GIC-0 333 Edge ce10
40: 0 0 0 0 GIC-0 69 Edge ce11
47: 0 0 0 0 GIC-0 323 Edge reo2ost-exception
48: 1646168 0 0 0 GIC-0 322 Edge wbm2host-rx-release
49: 260 0 0 0 GIC-0 321 Edge reo2host-status
50: 801400 0 0 0 GIC-0 320 Edge reo2host-destination-ring4
51: 1376010 0 0 0 GIC-0 271 Edge reo2host-destination-ring3
52: 1702495 0 0 0 GIC-0 268 Edge reo2host-destination-ring2
53: 232679 0 0 0 GIC-0 267 Edge reo2host-destination-ring1
57: 5752486 0 0 0 GIC-0 263 Edge ppdu-end-interrupts-mac3
58: 0 0 0 0 GIC-0 262 Edge ppdu-end-interrupts-mac2
59: 10211035 0 0 0 GIC-0 261 Edge ppdu-end-interrupts-mac1
60: 1 0 0 0 GIC-0 260 Edge rxdma2host-monitor-status-ring-mac3
61: 0 0 0 0 GIC-0 256 Edge rxdma2host-monitor-status-ring-mac2
62: 1 0 0 0 GIC-0 255 Edge rxdma2host-monitor-status-ring-mac1
63: 1 0 0 0 GIC-0 235 Edge host2rxdma-host-buf-ring-mac3
64: 0 0 0 0 GIC-0 215 Edge host2rxdma-host-buf-ring-mac2
65: 1 0 0 0 GIC-0 212 Edge host2rxdma-host-buf-ring-mac1
66: 0 0 0 0 GIC-0 211 Edge rxdma2host-destination-ring-mac3
67: 0 0 0 0 GIC-0 210 Edge rxdma2host-destination-ring-mac2
68: 0 0 0 0 GIC-0 209 Edge rxdma2host-destination-ring-mac1
73: 3373884 0 0 0 GIC-0 191 Edge wbm2host-tx-completions-ring3
74: 525666 0 0 0 GIC-0 190 Edge wbm2host-tx-completions-ring2
75: 897200 0 0 0 GIC-0 189 Edge wbm2host-tx-completions-ring1
77: 3 0 0 0 GIC-0 47 Edge cpr3
78: 836656 0 0 0 GIC-0 377 Level edma_txcmpl
79: 0 0 0 0 GIC-0 385 Level edma_rxfill
80: 42784 0 0 0 GIC-0 393 Level edma_rxdesc
81: 0 0 0 0 GIC-0 376 Level edma_misc
82: 0 0 0 0 MSI 0 Edge PCIe PME, aerdrv
83: 0 0 0 0 pmic_arb 51380237 Edge pm-adc5
84: 0 0 0 0 smp2p 0 Edge q6v5 fatal
85: 1 0 0 0 smp2p 1 Edge q6v5 ready
86: 0 0 0 0 smp2p 2 Edge q6v5 handover
87: 0 0 0 0 smp2p 3 Edge q6v5 stop
88: 0 0 0 0 msmgpio 34 Edge keys
89: 31 0 0 0 MSI 524288 Edge ath10k_pci
90: 62 0 0 0 GIC-0 353 Edge glink-native
IPI0: 3904 7179 5469 6472 Rescheduling interrupts
IPI1: 819483 13763972 13900799 10592802 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 706 830 804 766 IRQ work interrupts
IPI6: 0 0 0 0 CPU wake-up interrupts
Err: 0
#irqbalance --debug
This machine seems not NUMA capable.
Prevent irq assignment to these isolated CPUs: 00000000
Prevent irq assignment to these adaptive-ticks CPUs: 00000000
Banned CPUs: 00000000
Package 0: numa_node -1 cpu mask is 0000000f (load 0)
Cache domain 0: numa_node is -1 cpu mask is 0000000f (load 0)
CPU number 3 numa_node is -1 (load 0)
CPU number 1 numa_node is -1 (load 0)
CPU number 2 numa_node is -1 (load 0)
CPU number 0 numa_node is -1 (load 0)
IRQ arch_mem_timer(9) guessed as class 0
IRQ arch_timer(13) guessed as class 0
IRQ smp2p(16) guessed as class 0
IRQ 4a9000.thermal-sensor(17) guessed as class 0
IRQ bam_dma(18) guessed as class 0
IRQ bam_dma(21) guessed as class 0
IRQ msm_serial0(22) guessed as class 0
IRQ bam_dma(23) guessed as class 0
IRQ wdt_bark(24) guessed as class 0
IRQ q6v5 wdog(25) guessed as class 0
IRQ ce0(29) guessed as class 0
IRQ ce1(30) guessed as class 0
IRQ ce2(31) guessed as class 0
IRQ ce3(32) guessed as class 0
IRQ ce5(34) guessed as class 0
IRQ ce7(36) guessed as class 0
IRQ ce9(38) guessed as class 0
IRQ ce10(39) guessed as class 0
IRQ ce11(40) guessed as class 0
IRQ reo2ost-exception(47) guessed as class 0
IRQ wbm2host-rx-release(48) guessed as class 0
IRQ reo2host-status(49) guessed as class 0
IRQ reo2host-destination-ring4(50) guessed as class 0
IRQ reo2host-destination-ring3(51) guessed as class 0
IRQ reo2host-destination-ring2(52) guessed as class 0
IRQ reo2host-destination-ring1(53) guessed as class 0
IRQ ppdu-end-interrupts-mac3(57) guessed as class 0
IRQ ppdu-end-interrupts-mac2(58) guessed as class 0
IRQ ppdu-end-interrupts-mac1(59) guessed as class 0
IRQ rxdma2host-monitor-status-ring-mac3(60) guessed as class 0
IRQ rxdma2host-monitor-status-ring-mac2(61) guessed as class 0
IRQ rxdma2host-monitor-status-ring-mac1(62) guessed as class 0
IRQ host2rxdma-host-buf-ring-mac3(63) guessed as class 0
IRQ host2rxdma-host-buf-ring-mac2(64) guessed as class 0
IRQ host2rxdma-host-buf-ring-mac1(65) guessed as class 0
IRQ rxdma2host-destination-ring-mac3(66) guessed as class 0
IRQ rxdma2host-destination-ring-mac2(67) guessed as class 0
IRQ rxdma2host-destination-ring-mac1(68) guessed as class 0
IRQ wbm2host-tx-completions-ring3(73) guessed as class 0
IRQ wbm2host-tx-completions-ring2(74) guessed as class 0
IRQ wbm2host-tx-completions-ring1(75) guessed as class 0
IRQ cpr3(77) guessed as class 0
IRQ edma_txcmpl(78) guessed as class 0
IRQ edma_rxfill(79) guessed as class 0
IRQ edma_rxdesc(80) guessed as class 0
IRQ edma_misc(81) guessed as class 0
IRQ PCIe PME, aerdrv(82) guessed as class 0
IRQ pm-adc5(83) guessed as class 0
IRQ q6v5 fatal(84) guessed as class 0
IRQ q6v5 ready(85) guessed as class 0
IRQ q6v5 handover(86) guessed as class 0
IRQ q6v5 stop(87) guessed as class 0
IRQ keys(88) guessed as class 0
IRQ ath10k_pci(89) guessed as class 0
IRQ glink-native(90) guessed as class 0
Adding IRQ 89 to database
Adding IRQ 82 to database
Adding IRQ 9 to database
Adding IRQ 13 to database
Adding IRQ 16 to database
Adding IRQ 17 to database
Adding IRQ 18 to database
Adding IRQ 21 to database
Adding IRQ 22 to database
Adding IRQ 23 to database
Adding IRQ 24 to database
Adding IRQ 25 to database
Adding IRQ 29 to database
Adding IRQ 30 to database
Adding IRQ 31 to database
Adding IRQ 32 to database
Adding IRQ 34 to database
Adding IRQ 36 to database
Adding IRQ 38 to database
Adding IRQ 39 to database
Adding IRQ 40 to database
Adding IRQ 47 to database
Adding IRQ 48 to database
Adding IRQ 49 to database
Adding IRQ 50 to database
Adding IRQ 51 to database
Adding IRQ 52 to database
Adding IRQ 53 to database
Adding IRQ 57 to database
Adding IRQ 58 to database
Adding IRQ 59 to database
Adding IRQ 60 to database
Adding IRQ 61 to database
Adding IRQ 62 to database
Adding IRQ 63 to database
Adding IRQ 64 to database
Adding IRQ 65 to database
Adding IRQ 66 to database
Adding IRQ 67 to database
Adding IRQ 68 to database
Adding IRQ 73 to database
Adding IRQ 74 to database
Adding IRQ 75 to database
Adding IRQ 77 to database
Adding IRQ 78 to database
Adding IRQ 79 to database
Adding IRQ 80 to database
Adding IRQ 81 to database
Adding IRQ 83 to database
Adding IRQ 84 to database
Adding IRQ 85 to database
Adding IRQ 86 to database
Adding IRQ 87 to database
Adding IRQ 88 to database
Adding IRQ 90 to database
NUMA NODE NUMBER: -1
LOCAL CPU MASK: ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
# for i in $(seq 0 300); do grep . /proc/irq/$i/smp_affinity /dev/null 2>/dev/null; done
/proc/irq/1/smp_affinity:f
/proc/irq/2/smp_affinity:f
/proc/irq/3/smp_affinity:f
/proc/irq/4/smp_affinity:f
/proc/irq/5/smp_affinity:f
/proc/irq/6/smp_affinity:f
/proc/irq/7/smp_affinity:f
/proc/irq/8/smp_affinity:f
/proc/irq/9/smp_affinity:f
/proc/irq/10/smp_affinity:f
/proc/irq/11/smp_affinity:f
/proc/irq/12/smp_affinity:f
/proc/irq/13/smp_affinity:f
/proc/irq/14/smp_affinity:f
/proc/irq/16/smp_affinity:f
/proc/irq/17/smp_affinity:f
/proc/irq/18/smp_affinity:f
/proc/irq/21/smp_affinity:f
/proc/irq/22/smp_affinity:f
/proc/irq/23/smp_affinity:f
/proc/irq/24/smp_affinity:f
/proc/irq/25/smp_affinity:f
/proc/irq/29/smp_affinity:f
/proc/irq/30/smp_affinity:f
/proc/irq/31/smp_affinity:f
/proc/irq/32/smp_affinity:f
/proc/irq/34/smp_affinity:f
/proc/irq/36/smp_affinity:f
/proc/irq/38/smp_affinity:f
/proc/irq/39/smp_affinity:f
/proc/irq/40/smp_affinity:f
/proc/irq/47/smp_affinity:f
/proc/irq/48/smp_affinity:f
/proc/irq/49/smp_affinity:f
/proc/irq/50/smp_affinity:f
/proc/irq/51/smp_affinity:f
/proc/irq/52/smp_affinity:f
/proc/irq/53/smp_affinity:f
/proc/irq/57/smp_affinity:f
/proc/irq/58/smp_affinity:f
/proc/irq/59/smp_affinity:f
/proc/irq/60/smp_affinity:f
/proc/irq/61/smp_affinity:f
/proc/irq/62/smp_affinity:f
/proc/irq/63/smp_affinity:f
/proc/irq/64/smp_affinity:f
/proc/irq/65/smp_affinity:f
/proc/irq/66/smp_affinity:f
/proc/irq/67/smp_affinity:f
/proc/irq/68/smp_affinity:f
/proc/irq/73/smp_affinity:f
/proc/irq/74/smp_affinity:f
/proc/irq/75/smp_affinity:f
/proc/irq/77/smp_affinity:f
/proc/irq/78/smp_affinity:f
/proc/irq/79/smp_affinity:f
/proc/irq/80/smp_affinity:f
/proc/irq/81/smp_affinity:f
/proc/irq/82/smp_affinity:f
/proc/irq/83/smp_affinity:f
/proc/irq/84/smp_affinity:f
/proc/irq/85/smp_affinity:f
/proc/irq/86/smp_affinity:f
/proc/irq/87/smp_affinity:f
/proc/irq/88/smp_affinity:f
/proc/irq/89/smp_affinity:8
/proc/irq/90/smp_affinity:f
The lstopo-no-graphics command is not found. I don't know if I can install it with some package.
Thanks in advance!
Adding information related to OpenWrt:
I added commit bbcd9a4 as a local patch (as 1.9.2-2), and that helped e.g. the RT3200, a MediaTek MT7622-based aarch64 router, to recognize some interrupts.
But the Qualcomm ipq807x-based aarch64 routers still classify everything as class 0.
I have a Dynalink DL-WRX36 and it shows a similar classification to the one above.
It looks like OpenWrt has drivers that give their interrupts names that irqbalance does not recognize (arm/arm64 doesn't report interrupt classes like x86 does, unfortunately). I need you to build a map for me. For every log entry that you attached above, I need a table with one column for the interrupt name and a second column for the interrupt class it should be. I'll use that information to augment the guessing code in irqbalance. Wildcards are OK if they save you some time.
I don't know much about interrupts, so I don't know what an interrupt class is or how to get it.
If someone can point me to a way to get them, I can try to gather this information.
Look in the irqbalance code in classes.c, you'll see this array, which lists the available classes:
char *classes[] = {
	"other",
	"legacy",
	"storage",
	"video",
	"ethernet",
	"gbit-ethernet",
	"10gbit-ethernet",
	"virt-event",
	0
};
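To make the link between this array and the `guessed as class N` lines in the debug log concrete, here is a minimal sketch. It assumes the printed number is simply an index into the array above; `class_name` is a hypothetical helper for illustration, not a function from irqbalance.

```c
#include <stddef.h>

/* Class names in the same order as the array quoted from classes.c. */
static const char *const class_names[] = {
    "other", "legacy", "storage", "video",
    "ethernet", "gbit-ethernet", "10gbit-ethernet", "virt-event",
};

/*
 * Hypothetical helper (not part of irqbalance): translate the number
 * printed after "guessed as class" into a name, assuming that number
 * is an index into the array above. Returns NULL when out of range.
 */
const char *class_name(int class_id)
{
    size_t count = sizeof(class_names) / sizeof(class_names[0]);

    if (class_id < 0 || (size_t)class_id >= count)
        return NULL;
    return class_names[class_id];
}
```

Under that assumption, `class_name(0)` is `other`, which is what every interrupt in the debug log above was guessed as, and index 5 would be `gbit-ethernet`.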
As far as I know, ath10k is an old wifi chip, so I doubt it can do 10gbit.
I'm not an embedded developer, more of a Java developer, and I don't understand the meaning of each interrupt.
I've been looking, and I've found interrupts with these names in the Linux kernel: https://github.com/torvalds/linux/blob/master/drivers/net/wireless/ath/ath11k/ahb.c and other places, but I have not found out what each one does. I will keep searching.
I made a classification patch for ethernet and ath11k in DL-WRX36, based on the interrupt data below (two flent speed tests with wired, and two with wireless):
root@router5:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
9: 0 0 0 0 GIC-0 39 Level arch_mem_timer
13: 4797513 486501 483777 475057 GIC-0 20 Level arch_timer
16: 2 0 0 0 GIC-0 354 Edge smp2p
17: 0 0 0 0 GIC-0 216 Level 4a9000.thermal-sensor
18: 0 0 0 0 GIC-0 239 Level bam_dma
21: 0 0 0 0 GIC-0 270 Level bam_dma
22: 5 0 0 0 GIC-0 340 Level msm_serial0
23: 86416 0 0 0 GIC-0 178 Level bam_dma
24: 0 0 0 0 GIC-0 35 Edge wdt_bark
25: 0 0 0 0 GIC-0 357 Edge q6v5 wdog
29: 5 0 0 0 GIC-0 348 Edge ce0
30: 1163748 0 0 0 GIC-0 347 Edge ce1
31: 33650 0 0 0 GIC-0 346 Edge ce2
32: 4515 0 0 0 GIC-0 343 Edge ce3
34: 0 0 0 0 GIC-0 443 Edge ce5
36: 3408 0 0 0 GIC-0 72 Edge ce7
38: 0 0 0 0 GIC-0 334 Edge ce9
39: 0 0 0 0 GIC-0 333 Edge ce10
40: 0 0 0 0 GIC-0 69 Edge ce11
47: 0 0 0 0 GIC-0 323 Edge reo2ost-exception
48: 30 0 0 0 GIC-0 322 Edge wbm2host-rx-release
49: 30 0 0 0 GIC-0 321 Edge reo2host-status
50: 173515 0 0 0 GIC-0 320 Edge reo2host-destination-ring4
51: 183588 0 0 0 GIC-0 271 Edge reo2host-destination-ring3
52: 29378 0 0 0 GIC-0 268 Edge reo2host-destination-ring2
53: 143897 0 0 0 GIC-0 267 Edge reo2host-destination-ring1
57: 32213 0 0 0 GIC-0 263 Edge ppdu-end-interrupts-mac3
58: 0 0 0 0 GIC-0 262 Edge ppdu-end-interrupts-mac2
59: 238340 0 0 0 GIC-0 261 Edge ppdu-end-interrupts-mac1
60: 1 0 0 0 GIC-0 260 Edge rxdma2host-monitor-status-ring-mac3
61: 0 0 0 0 GIC-0 256 Edge rxdma2host-monitor-status-ring-mac2
62: 1 0 0 0 GIC-0 255 Edge rxdma2host-monitor-status-ring-mac1
63: 1 0 0 0 GIC-0 235 Edge host2rxdma-host-buf-ring-mac3
64: 0 0 0 0 GIC-0 215 Edge host2rxdma-host-buf-ring-mac2
65: 1 0 0 0 GIC-0 212 Edge host2rxdma-host-buf-ring-mac1
66: 0 0 0 0 GIC-0 211 Edge rxdma2host-destination-ring-mac3
67: 0 0 0 0 GIC-0 210 Edge rxdma2host-destination-ring-mac2
68: 0 0 0 0 GIC-0 209 Edge rxdma2host-destination-ring-mac1
73: 682 0 0 0 GIC-0 191 Edge wbm2host-tx-completions-ring3
74: 813 0 0 0 GIC-0 190 Edge wbm2host-tx-completions-ring2
75: 196030 0 0 0 GIC-0 189 Edge wbm2host-tx-completions-ring1
77: 19 0 0 0 GIC-0 47 Edge cpr3
78: 1630192 0 0 0 GIC-0 377 Level edma_txcmpl
79: 0 0 0 0 GIC-0 385 Level edma_rxfill
80: 3099397 0 0 0 GIC-0 393 Level edma_rxdesc
81: 0 0 0 0 GIC-0 376 Level edma_misc
82: 0 0 0 0 pmic_arb 51380237 Edge pm-adc5
83: 0 0 0 0 smp2p 0 Edge q6v5 fatal
84: 1 0 0 0 smp2p 1 Edge q6v5 ready
85: 0 0 0 0 smp2p 2 Edge q6v5 handover
86: 0 0 0 0 smp2p 3 Edge q6v5 stop
87: 0 0 0 0 msmgpio 34 Edge keys
88: 0 0 0 0 msmgpio 63 Edge keys
89: 0 0 0 0 GIC-0 172 Level xhci-hcd:usb1
90: 64 0 0 0 GIC-0 353 Edge glink-native
IPI0: 3057 3478 3356 7411 Rescheduling interrupts
IPI1: 35659 355804 329030 334136 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 736 474 418 448 IRQ work interrupts
IPI6: 0 0 0 0 CPU wake-up interrupts
Err: 0
wan/lan are edma_txcmpl, edma_rxdesc
Regex:
edma_[rt]x.*
All interrupts 29-75 are wireless ath11k related (as also documented in the ahb.c file linked in the GitHub issue).
The following regex might be ok for wifi:
ce[0-9][0-9]*
host2rxdma-host-buf-ring-mac[0-9]
ppdu-end-interrupts-mac[0-9]
reo2host-destination-ring[0-9]
rxdma2host-.*-ring-mac[0-9]
wbm2host-tx-completions-ring[0-9]
Wildcarding could reduce this to four regex rules:
ce[0-9][0-9]*
edma_[rt]x.*
ppdu-end-interrupts-mac[0-9][0-9]*
.*2.*host-.*-ring.*[0-9]
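The four wildcard rules above can be checked against the interrupt names from /proc/interrupts before touching irqbalance itself. This is a minimal sketch using the POSIX regex API. It assumes extended syntax and an unanchored search, which may or may not match how irqbalance compiles its patterns; `first_matching_rule` is a hypothetical helper name.

```c
#include <regex.h>
#include <stddef.h>

/* The four wildcard rules proposed above, in the same order. */
static const char *const rules[] = {
    "ce[0-9][0-9]*",
    "edma_[rt]x.*",
    "ppdu-end-interrupts-mac[0-9][0-9]*",
    ".*2.*host-.*-ring.*[0-9]",
};

/*
 * Hypothetical helper: return the index of the first rule that matches
 * the interrupt name, or -1 if none matches or a pattern fails to compile.
 * Assumption: POSIX extended syntax, unanchored search.
 */
int first_matching_rule(const char *irq_name)
{
    size_t count = sizeof(rules) / sizeof(rules[0]);

    for (size_t i = 0; i < count; i++) {
        regex_t re;
        int matched;

        if (regcomp(&re, rules[i], REG_EXTENDED | REG_NOSUB) != 0)
            return -1;
        matched = (regexec(&re, irq_name, 0, NULL, 0) == 0);
        regfree(&re);
        if (matched)
            return (int)i;
    }
    return -1;
}
```

With these rules, names such as `edma_misc`, `wbm2host-rx-release` and `reo2host-status` match nothing and return -1. Because the search is unanchored under this assumption, `ce[0-9][0-9]*` would also match any other name that merely contains `ce` followed by a digit, which is worth keeping in mind on other platforms.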
As a patch for irqbalance that is:
perus@ub2210:/Openwrt/e8450/feeds/packages/utils/irqbalance$ cat patches/120-add-ipq807x-ath11k-ints.patch
--- a/procinterrupts.c
+++ b/procinterrupts.c
@@ -108,6 +108,10 @@ static void guess_arm_irq_hints(char *na
/* Note: Last entry is a catchall */
static struct irq_match matches[] = {
{ "eth.*" ,{NULL} ,NULL, IRQ_TYPE_LEGACY, IRQ_GBETH },
+ { "ce[0-9][0-9]*" ,{NULL} ,NULL, IRQ_TYPE_LEGACY, IRQ_GBETH },
+ { "edma_[rt]x.*" ,{NULL} ,NULL, IRQ_TYPE_LEGACY, IRQ_GBETH },
+ { "ppdu-end-interrupts-mac[0-9][0-9]*" ,{NULL} ,NULL, IRQ_TYPE_LEGACY, IRQ_GBETH },
+ { ".*2.*host-.*-ring.*[0-9]" ,{NULL} ,NULL, IRQ_TYPE_LEGACY, IRQ_GBETH },
{ "[A-Z0-9]{4}[0-9a-f]{4}", {NULL} ,check_platform_device, IRQ_TYPE_LEGACY, IRQ_OTHER},
{ "PNP[0-9a-f]{4}", {NULL} ,check_platform_device, IRQ_TYPE_LEGACY, IRQ_OTHER},
{ ".*", {NULL}, NULL, IRQ_TYPE_LEGACY, IRQ_OTHER},
Irqbalance nicely classifies them:
IRQ bam_dma(23) guessed as class 0
IRQ wdt_bark(24) guessed as class 0
IRQ q6v5 wdog(25) guessed as class 0
IRQ ce0(29) guessed as class 5
IRQ ce1(30) guessed as class 5
IRQ ce2(31) guessed as class 5
IRQ ce3(32) guessed as class 5
IRQ ce5(34) guessed as class 5
IRQ ce7(36) guessed as class 5
IRQ ce9(38) guessed as class 5
IRQ ce10(39) guessed as class 5
IRQ ce11(40) guessed as class 5
IRQ reo2ost-exception(47) guessed as class 0
IRQ wbm2host-rx-release(48) guessed as class 0
IRQ reo2host-status(49) guessed as class 0
IRQ reo2host-destination-ring4(50) guessed as class 5
IRQ reo2host-destination-ring3(51) guessed as class 5
IRQ reo2host-destination-ring2(52) guessed as class 5
IRQ reo2host-destination-ring1(53) guessed as class 5
IRQ ppdu-end-interrupts-mac3(57) guessed as class 5
IRQ ppdu-end-interrupts-mac2(58) guessed as class 5
IRQ ppdu-end-interrupts-mac1(59) guessed as class 5
IRQ rxdma2host-monitor-status-ring-mac3(60) guessed as class 5
IRQ rxdma2host-monitor-status-ring-mac2(61) guessed as class 5
IRQ rxdma2host-monitor-status-ring-mac1(62) guessed as class 5
IRQ host2rxdma-host-buf-ring-mac3(63) guessed as class 5
IRQ host2rxdma-host-buf-ring-mac2(64) guessed as class 5
IRQ host2rxdma-host-buf-ring-mac1(65) guessed as class 5
IRQ rxdma2host-destination-ring-mac3(66) guessed as class 5
IRQ rxdma2host-destination-ring-mac2(67) guessed as class 5
IRQ rxdma2host-destination-ring-mac1(68) guessed as class 5
IRQ wbm2host-tx-completions-ring3(73) guessed as class 5
IRQ wbm2host-tx-completions-ring2(74) guessed as class 5
IRQ wbm2host-tx-completions-ring1(75) guessed as class 5
IRQ cpr3(77) guessed as class 0
IRQ edma_txcmpl(78) guessed as class 5
IRQ edma_rxfill(79) guessed as class 5
IRQ edma_rxdesc(80) guessed as class 5
IRQ edma_misc(81) guessed as class 0
IRQ pm-adc5(82) guessed as class 0
IRQ q6v5 fatal(83) guessed as class 0
And irqbalance seems to balance them ok.
But the big BUT is that ath11k crashes pretty soon. It starts with a channel survey error, and wifi remains unusable until reboot:
[ 432.235328] ath11k c000000.wifi: bss channel survey timed out
[ 435.275349] ath11k c000000.wifi: bss channel survey timed out
[ 438.315363] ath11k c000000.wifi: bss channel survey timed out
[ 440.365426] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: fatal error received:
[ 440.365426] QC Image Version: QC_IMAGE_VERSION_STRING=WLAN.HK.2.5.0.1-01208-QCAHKSWPL_SILICONZ-1
[ 440.365426] Image Variant : IMAGE_VARIANT_STRING=8074.wlanfw.eval_v2Q
[ 440.365426]
[ 440.365426] hif_ce.c:641 Assertion 0 failedparam0 :zero, param1 :zero, param2 :zero.
[ 440.365426] Thread ID : 0x00000067 Thread name : WLAN_HIF Process ID : 0
[ 440.365426] Register:
[ 440.365426] SP : 0x4c11fc90
[ 440.365426] FP : 0x4c11fc98
[ 440.365426] PC : 0x4b195a10
[ 440.365426] SSR : 0x00000008
[ 440.365426] BADVA : 0x00020000
[ 440.365426] LR : 0x4b1951ac
[ 440.365426]
[ 440.365426] Stack Dump
[ 440.365426] from : 0x4c11fc90
[ 440.365426] to : 0x4c11ff20
[ 440.365426]
[ 440.411613] remoteproc remoteproc0: crash detected in cd00000.q6v5_wcss: type fatal error
[ 440.433869] remoteproc remoteproc0: handling crash #1 in cd00000.q6v5_wcss
[ 440.442178] remoteproc remoteproc0: recovering cd00000.q6v5_wcss
[ 440.474826] remoteproc remoteproc0: stopped remote processor cd00000.q6v5_wcss
[ 440.495880] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
[ 440.495933] ath11k c000000.wifi: failed to send pdev bss chan info request
[ 440.755647] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
[ 440.755689] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 1
[ 440.761089] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
[ 440.767607] ath11k c000000.wifi: failed to submit beacon template command: -108
[ 440.773748] ath11k c000000.wifi: failed to update bcn template: -108
[ 440.780966] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
[ 440.787563] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 1
[ 444.315989] ath11k_warn: 39 callbacks suppressed
Apparently some of those interrupts should not be dynamically changed during heavy wifi operation.
It would be nice if others could also try to figure out whether there are certain interrupts that should not be dynamically manipulated by irqbalance.
I don't remember exactly what it was, but when I was playing with assigning interrupts to CPUs, something failed after I modified some of them. So yes, it seems some interrupts must be left alone and not changed, but I don't know which interrupts were the culprit.
If interrupts shouldn't have their affinity set from user space, it's the responsibility of the device driver registering those interrupts to forcibly set the affinity and mark it as non-changeable.
I would suggest that you disable irqbalance and manually try adjusting affinity for the candidate interrupts to observe when the problem occurs, and then open a bug against the kernel driver.
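The manual experiment suggested above boils down to writing a hexadecimal CPU bitmask to `/proc/irq/<n>/smp_affinity`, the same file dumped earlier in the thread, where `f` covers CPUs 0-3 and `8` is CPU3 only. Here is a minimal sketch with irqbalance stopped. The helper names are hypothetical, it needs root, and it assumes at most 32 CPUs.

```c
#include <stdio.h>

/* Bitmask with only the given CPU set, e.g. CPU 3 -> 0x8. Caller keeps cpu < 32. */
unsigned int cpu_to_mask(unsigned int cpu)
{
    return 1u << cpu;
}

/* Build "/proc/irq/<irq>/smp_affinity" into buf; returns snprintf's result. */
int affinity_path(char *buf, size_t len, int irq)
{
    return snprintf(buf, len, "/proc/irq/%d/smp_affinity", irq);
}

/*
 * Hypothetical helper: pin one IRQ to one CPU by writing a hex mask.
 * Returns 0 on success, -1 on failure (no permission, IRQ missing,
 * or the kernel rejecting the mask).
 */
int pin_irq_to_cpu(int irq, unsigned int cpu)
{
    char path[64];
    FILE *f;
    int ok;

    if (cpu >= 32)
        return -1;
    affinity_path(path, sizeof(path), irq);
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    ok = fprintf(f, "%x\n", cpu_to_mask(cpu)) > 0;
    /* stdio buffers the write, so a kernel-side error may only show up here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Moving one candidate interrupt at a time, for example `pin_irq_to_cpu(78, 1)` for `edma_txcmpl`, and then watching the kernel log should help narrow down which interrupt triggers the firmware crash.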
ping, any update here?
Not really,
I have been experimenting with the IRQs, and ipq807x (and ath11k WiFi) seems to be a bit crash-happy if IRQs are dynamically changed. I tried testing with manually set IRQ affinities, but got a crash today with that, too.
Right now it looks like a static assignment might be better, but even that is risky.
You need to open a bug with the respective driver maintainers in the kernel. If they don't do anything to disable IRQ affinity changes, the drivers need to be prepared for the affinity to change. It's a serious kernel bug that they need to be made aware of.