Kernel panics on I/O to spun down disks
Activity
Alexander Motin April 29, 2022 at 4:56 PM
Let's reclassify this issue and consider it fixed for now. If testing shows that some issue still remains, create a new ticket for 13.0-U1.
Alexander Motin April 28, 2022 at 2:24 AM
For the record, so far while working on this ticket I have:
- Improved kernel dumping reliability by merging one commit and making another: https://cgit.freebsd.org/src/commit/?id=38f8addaab1aac0830948189826bd7c7931859fd
- Fixed an assertion and possible memory corruption when a spun-down disk receives two or more commands: https://cgit.freebsd.org/src/commit/?id=404f001161b975164d8b52d9f404d07ac7584027
I will see whether the second fix is related to the original problem.
Krautmaster April 19, 2022 at 8:28 PM (edited)
Attached the minidump grabbed from the following crash:
WARNING
freenas.local had an unscheduled system reboot. The operating system successfully came back online at Tue Apr 19 22:15:21 2022.
2022-04-19 22:15:21 (Europe/Berlin)
Alexander Motin April 18, 2022 at 7:28 PM (edited)
To get a minidump instead of a textdump, run the following before the crash:
sysctl debug.debugger_on_panic=0
sysctl debug.ddb.textdump.pending=0
dumpon off
dumpon /dev/daX
After the crash and reboot:
cd /mnt/tank
savecore . /dev/daX
As a result, a couple of files representing the dump should be stored in the specified directory. This works for me.
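The two phases above can be wrapped in a small script. This is only a sketch under stated assumptions: the function names `prepare_minidump` and `collect_minidump`, the `run` helper and the `DRY_RUN` switch are hypothetical and not part of TrueNAS, and `/dev/daX` remains a placeholder for the real dump device. With `DRY_RUN=1` (the default here) the commands are only printed, so the steps can be reviewed before touching the system.

```shell
#!/bin/sh
# Sketch only: wraps the manual steps from the comment above.
# DUMPDEV and DUMPDIR are placeholders; replace them with the real
# dump device and a directory on a pool with enough free space.
DUMPDEV="${DUMPDEV:-/dev/daX}"
DUMPDIR="${DUMPDIR:-/mnt/tank}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before the crash: disable the debugger prompt and the textdump,
# then point the kernel dump at DUMPDEV.
prepare_minidump() {
    run sysctl debug.debugger_on_panic=0
    run sysctl debug.ddb.textdump.pending=0
    run dumpon off
    run dumpon "$DUMPDEV"
}

# After the crash and reboot: copy the dump from DUMPDEV into DUMPDIR.
# Passing the directory as an argument is equivalent to "cd" + "savecore .".
collect_minidump() {
    run savecore "$DUMPDIR" "$DUMPDEV"
}
```

Call `prepare_minidump` before reproducing the panic and `collect_minidump` after the reboot; set `DRY_RUN=0` once the printed commands look right.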
Debug symbols for the specific TrueNAS build can be found at http://download.freenas.org/ . Look for the TrueNAS-13.0-MASTER-*.debug.txz that matches the exact version you are running (see `cat /etc/version` on the NAS). Inside usr/lib/debug/boot there are symbols for the normal and debug kernels, which can be unpacked into the same path on TrueNAS to run kgdb on the core.
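Once the symbols are unpacked, opening the core could look roughly like the sketch below. Both default paths are assumptions rather than values from this ticket: `kernel/kernel.debug` is the usual FreeBSD layout under usr/lib/debug/boot (the debug kernel has its own subdirectory), and `vmcore.0` assumes this is the first dump savecore stored in /mnt/tank. The `open_core` function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: SYMBOLS and CORE are assumed paths, adjust them to what
# was actually unpacked and to the vmcore.N file that savecore wrote.
SYMBOLS="${SYMBOLS:-/usr/lib/debug/boot/kernel/kernel.debug}"
CORE="${CORE:-/mnt/tank/vmcore.0}"

# Start kgdb on the core if everything needed is present;
# otherwise just print the command that would be run.
open_core() {
    if [ -r "$SYMBOLS" ] && [ -r "$CORE" ] && command -v kgdb >/dev/null 2>&1; then
        kgdb "$SYMBOLS" "$CORE"
    else
        echo "would run: kgdb $SYMBOLS $CORE"
    fi
}
```

Run `open_core` on the NAS after savecore has finished. The symbols must come from exactly the same build as the kernel that crashed, otherwise the backtrace will not resolve correctly.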
Krautmaster April 11, 2022 at 5:10 AM
Thanks, will try. This morning I noticed a pretty odd panic. I was watching the "VM monitor" while checking whether the system was up normally or had rebooted again, and everything looked normal until I tried to access the web GUI. Then a panic was thrown.
Will try this now.
Alexander Motin -> as discussed:
Corresponding FreeBSD ticket:
262894 – Kernel Panic (page fault) with 13.1-BETA2 in g_eli & httpd (freebsd.org)
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0xfffff80e00000004
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80f1c50d
stack pointer = 0x28:0xfffffe0144d3cc00
frame pointer = 0x28:0xfffffe0144d3cca0
code segment = base r x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 2653 (g_eli[2] gptid/e2d4)
trap number = 12
panic: page fault
cpuid = 2
time = 1648393340
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0144d3c9c0
vpanic() at vpanic+0x17f/frame 0xfffffe0144d3ca10
panic() at panic+0x43/frame 0xfffffe0144d3ca70
trap_fatal() at trap_fatal+0x385/frame 0xfffffe0144d3cad0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe0144d3cb30
calltrap() at calltrap+0x8/frame 0xfffffe0144d3cb30
--- trap 0xc, rip = 0xffffffff80f1c50d, rsp = 0xfffffe0144d3cc00, rbp = 0xfffffe0144d3cca0 ---
aesni_crypt_xts() at aesni_crypt_xts+0x17d/frame 0xfffffe0144d3cca0
aesni_decrypt_xts() at aesni_decrypt_xts+0xe/frame 0xfffffe0144d3ccc0
aesni_cipher_crypt() at aesni_cipher_crypt+0x2f1/frame 0xfffffe0144d3cd70
aesni_process() at aesni_process+0x159/frame 0xfffffe0144d3cdc0
crypto_dispatch() at crypto_dispatch+0x118/frame 0xfffffe0144d3cdf0
g_eli_crypto_run() at g_eli_crypto_run+0x178/frame 0xfffffe0144d3ce90
g_eli_worker() at g_eli_worker+0x328/frame 0xfffffe0144d3cef0
fork_exit() at fork_exit+0x7e/frame 0xfffffe0144d3cf30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0144d3cf30
--- trap 0x80af5f94, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
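A note on reading the panic message above: the `time = 1648393340` line is in seconds since the Unix epoch, so it can be converted to a wall-clock time and matched against reboot alerts or other logs. The sketch below is one way to do that. The `epoch_to_utc` helper is hypothetical; it tries the BSD form of `date` first and falls back to the GNU form.

```shell
#!/bin/sh
# Convert an epoch timestamp to a UTC date string.
# BSD date (FreeBSD/TrueNAS) takes the seconds via -r; GNU date treats
# -r as a file name and fails, so we fall back to -d @SECONDS there.
epoch_to_utc() {
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
        date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1648393340   # prints 2022-03-27 15:02:20
```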