pcengines/apu2-documentation

pfSense crashing - wle200nx used at 5 GHz (Fatal trap 12: page fault while in kernel mode)

Closed this issue · 6 comments

PCEngines firmware version
v4.0.9 or 4.14.0.6

APU variant
APU2

OS and OS Version
Affects pfSense 2.6.0/2.5.2 - FreeBSD 12.3
pfSense 2.4.5-p1 and earlier are not affected.

Affected component(s), peripheral(s) or functionality
The wle200nx wireless card does not work at 5 GHz and crashes the OS.

Brief summary
When multiple wireless virtual interfaces are configured to use 5 GHz, pfSense crashes in a continuous loop with: "Fatal trap 12: page fault while in kernel mode"
With 2.4 GHz everything seems fine.

How reproducible
Always.

How to reproduce

Steps to reproduce the behavior:

  1. Create 2 virtual wireless interfaces
  2. On the first interface, change the wireless Standard to: "802.11na"
  3. Change the channel to: "11 a/n - 36"
  4. Press "Apply"
  5. Do the same for the other virtual interface
  6. Press "Apply"
  7. The crash happens at that moment or after a reboot

Expected behavior
I should be able to have multiple wireless SSIDs at 5 GHz without crashing pfSense.

Actual behavior
It crashes, and I have to reinstall because it keeps crashing in a loop.

Additional context
I have raised this with pfSense, however they closed the ticket, saying it was not likely a problem on their end.
pfSense 2.4.2 works without any issues, which may help to trace the problem.
I tried all permutations of the BIOS configuration options; none of them fixed the issue.

More info about the crash:

Welcome to pfSense 2.6.0-RELEASE...

savecore 104 - - reboot after panic: page fault
savecore 104 - - writing core to /var/crash/textdump.tar.13
...ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib /usr/local/lib/compat/pkg /usr/local/lib/compat/pkg /usr/local/lib/ipsec /usr/local/lib/perl5/5.32/mach/CORE
32-bit compatibility ldconfig path:
done.
External config loader 1.0 is now starting... ada0p1 ada0p3
Launching the init system...Updating CPU Microcode...
CPU: AMD GX-412TC SOC                                (998.15-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x730f01  Family=0x16  Model=0x30  Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x3ed8220b<SSE3,PCLMULQDQ,MON,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C>
  AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
  AMD Features2=0x1d4037ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT,Topology,PNXC,DBE,PTSC,PL2I>
  Structured Extended Features=0x8<BMI1>
  XSAVE Features=0x1<XSAVEOPT>
  SVM: NP,NRIP,AFlush,DAssist,NAsids=8
  TSC: P-state invariant, performance statistics
Done.
.... done.
Initializing.................. done.
Starting device manager (devd)...done.
Loading configuration......done.
Updating configuration...done.
Checking config backups consistency.......................done.
Setting up extended sysctls...done.
Setting timezone...done.
Configuring loopback interface...done.
Starting syslog...done.
Setting up interfaces microcode...done.
Configuring loopback interface...done.
Creating wireless clone interfaces...ath0: no beacon buffer available
ath0: no beacon buffer available
done.
Configuring VLAN interfaces...done.
Configuring WIR_NORMAL interface...[ar5210] loaded
[ar5211] loaded
[ar5212] loaded
[ar5416] loaded
[ar9300] loaded
[ath_rate] loaded
[ath_dfs] loaded
[ath] loaded
done.
Configuring WIR_GUEST interface...

Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 03
fault virtual address	= 0x0
fault code		= supervisor write data, page not present
instruction pointer	= 0x20:0xffffffff80f1078c
stack pointer	        = 0x28:0xfffffe0000561a30
frame pointer	        = 0x28:0xfffffe0000561a80
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 12 (irq24: ath0)
trap number		= 12
panic: page fault
cpuid = 3
time = 1645373507
KDB: enter: panic
[ thread pid 12 tid 100063 ]
Stopped at      kdb_enter+0x37: movq    $0,0x28f4676(%rip)
db:0:kdb.enter.default> textdump set
textdump set
db:0:kdb.enter.default>  capture on
db:0:kdb.enter.default>  run lockinfo
db:1:lockinfo> show locks
No such command; use "help" to list available commands
db:1:lockinfo>  show alllocks
No such command; use "help" to list available commands
db:1:lockinfo>  show lockedvnods
Locked vnodes
db:0:kdb.enter.default>  show pcpu
cpuid        = 3
dynamic pcpu = 0xfffffe007e322200
curthread    = 0xfffff8000539f000: pid 12 tid 100063 "irq24: ath0"
curpcb       = 0xfffff8000539f5a0
fpcurthread  = none
idlethread   = 0xfffff80005245740: tid 100006 "idle: cpu3"
curpmap      = 0xffffffff8368f6e8
tssp         = 0xffffffff837198d8
commontssp   = 0xffffffff837198d8
rsp0         = 0xfffffe0000561cc0
kcr3         = 0xffffffffffffffff
ucr3         = 0xffffffffffffffff
scr3         = 0x0
gs32p        = 0xffffffff837200f0
ldt          = 0xffffffff83720130
tss          = 0xffffffff83720120
tlb gen      = 1619
curvnet      = 0
db:0:kdb.enter.default>  bt
Tracing pid 12 tid 100063 td 0xfffff8000539f000
kdb_enter() at kdb_enter+0x37/frame 0xfffffe00005616f0
vpanic() at vpanic+0x197/frame 0xfffffe0000561740
panic() at panic+0x43/frame 0xfffffe00005617a0
trap_fatal() at trap_fatal+0x391/frame 0xfffffe0000561800
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0000561850
trap() at trap+0x286/frame 0xfffffe0000561960
calltrap() at calltrap+0x8/frame 0xfffffe0000561960
--- trap 0xc, rip = 0xffffffff80f1078c, rsp = 0xfffffe0000561a30, rbp = 0xfffffe0000561a80 ---
ieee80211_beacon_update() at ieee80211_beacon_update+0x7ac/frame 0xfffffe0000561a80
ath_beacon_generate() at ath_beacon_generate+0x46/frame 0xfffffe0000561ad0
ath_beacon_proc() at ath_beacon_proc+0x241/frame 0xfffffe0000561b20
ath_intr() at ath_intr+0x4b8/frame 0xfffffe0000561b50
ithread_loop() at ithread_loop+0x23c/frame 0xfffffe0000561bb0
fork_exit() at fork_exit+0x7e/frame 0xfffffe0000561bf0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000561bf0

...

Tracing command hostapd pid 14252 tid 100527 td 0xfffff80029507000
sched_switch() at sched_switch+0x630/frame 0xfffffe0033c35750
mi_switch() at mi_switch+0xd4/frame 0xfffffe0033c35780
sleepq_catch_signals() at sleepq_catch_signals+0x403/frame 0xfffffe0033c357d0
sleepq_timedwait_sig() at sleepq_timedwait_sig+0x14/frame 0xfffffe0033c35810
_cv_timedwait_sig_sbt() at _cv_timedwait_sig_sbt+0x11f/frame 0xfffffe0033c35870
seltdwait() at seltdwait+0x71/frame 0xfffffe0033c358a0
kern_select() at kern_select+0x91a/frame 0xfffffe0033c35a80
sys_select() at sys_select+0x56/frame 0xfffffe0033c35ac0
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe0033c35bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0033c35bf0
--- syscall (93, FreeBSD ELF64, sys_select), rip = 0x800888c8a, rsp = 0x7fffffffea38, rbp = 0x7fffffffea70 ---

Tracing command syslogd pid 7653 tid 100504 td 0xfffff80008fe4000
sched_switch() at sched_switch+0x630/frame 0xfffffe002d9a7770
mi_switch() at mi_switch+0xd4/frame 0xfffffe002d9a77a0
sleepq_catch_signals() at sleepq_catch_signals+0x403/frame 0xfffffe002d9a77f0
sleepq_wait_sig() at sleepq_wait_sig+0xf/frame 0xfffffe002d9a7820
_cv_wait_sig() at _cv_wait_sig+0xf7/frame 0xfffffe002d9a7870
seltdwait() at seltdwait+0xb3/frame 0xfffffe002d9a78a0
kern_select() at kern_select+0x91a/frame 0xfffffe002d9a7a80
sys_select() at sys_select+0x56/frame 0xfffffe002d9a7ac0
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe002d9a7bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe002d9a7bf0
--- syscall (93, FreeBSD ELF64, sys_select), rip = 0x800415c8a, rsp = 0x7fffffffe5c8, rbp = 0x7fffffffebe0 ---

Tracing command devd pid 606 tid 100513 td 0xfffff8000f7e0000
sched_switch() at sched_switch+0x630/frame 0xfffffe002d9d4750
mi_switch() at mi_switch+0xd4/frame 0xfffffe002d9d4780
sleepq_catch_signals() at sleepq_catch_signals+0x403/frame 0xfffffe002d9d47d0
sleepq_timedwait_sig() at sleepq_timedwait_sig+0x14/frame 0xfffffe002d9d4810
_cv_timedwait_sig_sbt() at _cv_timedwait_sig_sbt+0x11f/frame 0xfffffe002d9d4870
seltdwait() at seltdwait+0x71/frame 0xfffffe002d9d48a0
kern_select() at kern_select+0x91a/frame 0xfffffe002d9d4a80
sys_select() at sys_select+0x56/frame 0xfffffe002d9d4ac0
amd64_syscall() at amd64_syscall+0x387/frame 0xfffffe002d9d4bf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe002d9d4bf0
--- syscall (93, FreeBSD ELF64, sys_select), rip = 0x2dd47a, rsp = 0x7fffffffcac8, rbp = 0x7fffffffec60 ---
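When reading a DDB backtrace like the one for pid 12 above, the frames printed right after the `--- trap` marker are the code that was running when the fault hit. Here the fault address of 0x0 with a supervisor write points towards a NULL pointer write somewhere in that call chain. The following is a minimal Python sketch for pulling those frames out of a saved textdump. It assumes frame lines have exactly the shape shown in this log; the helper name `faulting_frames` and the regular expression are illustrative only and are not part of any FreeBSD or pfSense tooling.

```python
import re

# Matches DDB frame lines of the form seen in this report, e.g.
#   ath_intr() at ath_intr+0x4b8/frame 0xfffffe0000561b50
# (assumed format; other kernels or architectures may print frames differently)
FRAME_RE = re.compile(r"^(\w+)\(\) at \w+\+0x[0-9a-f]+/frame 0x[0-9a-f]+$")


def faulting_frames(trace_text, depth=3):
    """Return up to `depth` function names that follow the '--- trap' marker."""
    names = []
    after_trap = False
    for line in trace_text.splitlines():
        line = line.strip()
        if line.startswith("--- trap"):
            # Everything above this marker is the panic/trap handling path.
            after_trap = True
            continue
        if not after_trap:
            continue
        match = FRAME_RE.match(line)
        if match is None:
            # Stop at the first line that is not a frame (blank line, next trace).
            break
        names.append(match.group(1))
        if len(names) == depth:
            break
    return names


# Excerpt copied from the backtrace in this report.
SAMPLE = """\
trap() at trap+0x286/frame 0xfffffe0000561960
calltrap() at calltrap+0x8/frame 0xfffffe0000561960
--- trap 0xc, rip = 0xffffffff80f1078c, rsp = 0xfffffe0000561a30, rbp = 0xfffffe0000561a80 ---
ieee80211_beacon_update() at ieee80211_beacon_update+0x7ac/frame 0xfffffe0000561a80
ath_beacon_generate() at ath_beacon_generate+0x46/frame 0xfffffe0000561ad0
ath_beacon_proc() at ath_beacon_proc+0x241/frame 0xfffffe0000561b20
ath_intr() at ath_intr+0x4b8/frame 0xfffffe0000561b50
"""

print(faulting_frames(SAMPLE))
# prints ['ieee80211_beacon_update', 'ath_beacon_generate', 'ath_beacon_proc']
```

For this crash the innermost frame is `ieee80211_beacon_update()`, called from the `ath` beacon path inside the `ath0` interrupt thread, which is the detail worth quoting in any upstream report.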

@adamast0r this looks like a problem with the IRQ from the WLE200 card. Does disabling IOMMU help with the v4.14.0.6 version? I recall issues when the WLE200NX is used with IOMMU, due to the legacy INTx interrupt signalling present on this card.

> I have raised this with PFSense however they closed saying it was not likely a problem in their end.

Did they at least explain what happens in these crash logs? Could you point to the pfSense ticket you created? I would have hoped for at least some explanation or assistance from them, even if they are not taking responsibility for it.

Also, could you please step down through the pfSense releases one by one, from 2.5.2 to 2.4.2, and see which is the first version that fails?

@miczyg1 I edited the first post: the last working version is 2.4.5-p1 rather than 2.4.2. The problem definitely starts with 2.5.X and 2.6.X. As far as I know, 2.4.X uses FreeBSD 11.3 and 2.5.X/2.6.X use FreeBSD 12.

The ticket was raised with pfSense: https://redmine.pfsense.org/issues/12788#change-58922 but it was closed. I have also noticed that sometimes I can change the config in pfSense from 2.4 GHz to 5 GHz and it apparently works, but after a reboot, when the configuration is read back, it keeps crashing in a loop.

I changed the IOMMU settings because I saw older posts and tickets mentioning them, but unfortunately that made no difference.

Basically I am stuck on pfSense 2.4.5-p1, because I can't get multiple SSIDs working at 5 GHz in the newer versions.

So it seems the problems started with the migration from FreeBSD 11.3 to FreeBSD 12.2 and newer. It is true that pfSense has no control over the OS drivers, as they come from FreeBSD. Have you tried reporting the issue to FreeBSD in that case?

As this looks more like an OS problem (the issues started after updating the FreeBSD/pfSense version), we do not plan to look into it. Closing the issue now, unless it is proven that this is a firmware issue.