Repeated Kernel Panics in TrueNAS 12 RC1 - System was stable under 11.3

Description

I am experiencing repeated kernel panics related to a 'spin lock' after migrating a system from FreeNAS 11.3 U5 to TrueNAS 12 RC1. The system was previously stable under 11.3. The panics started several hours after the upgrade to 12 RC1 and have recurred roughly every 24-48 hours since.

Basic Error Details (full crash dump attached):

Dump header from device: /dev/da5p1
Architecture: amd64
Architecture Version: 4
Dump Length: 1611776
Blocksize: 512
Compression: none
Dumptime: Thu Oct 8 11:45:25 2020
Hostname: nas.lan
Magic: FreeBSD Text Dump
Version String: FreeBSD 12.2-PRERELEASE 4912790fb32(HEAD) TRUENAS
Panic String: spin lock held too long
Dump Parity: 2374960721
Bounds: 0
Dump Status: good
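
For anyone comparing several dumps, the header fields above can be read into a dictionary. This is only a sketch: the `parse_dump_header` name is made up for illustration, and it assumes every header line has the `Key: value` shape shown above. The sample text is a subset of the header copied from this report.

```python
# Subset of the text dump header from this report, copied verbatim.
HEADER = """\
Architecture: amd64
Blocksize: 512
Dumptime: Thu Oct 8 11:45:25 2020
Version String: FreeBSD 12.2-PRERELEASE 4912790fb32(HEAD) TRUENAS
Panic String: spin lock held too long
Dump Status: good
"""


def parse_dump_header(text):
    """Split each 'Key: value' line on its first colon only,
    so values that contain colons (such as Dumptime) stay intact."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon at all
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    header = parse_dump_header(HEADER)
    print(header["Panic String"], "at", header["Dumptime"])
```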

System Config:

Board: Supermicro X10SDV-4C-TLN2F

CPU: Intel Xeon D-1521 (4-core)

RAM: Kingston 64GB ECC

HBA: LSI 9211-8i (Firmware 20.00.07.00-IT)

Boot Pool: 2x Kingston A400 120GB SSD

System/Jails/VM Pool: 2x Samsung 250GB SSD

Pool 1 (Mirror): 2x 6TB HGST Ultrastar 7k6 (6TB usable)

Pool 2 (Mirror): 2x 4TB Ironwolf NAS (4TB usable)

Pool 3 (Mirror): 2x 2TB WD RED (2TB usable)

Each occurrence is captured in my board's IPMI event log. Note that the watchdog feature is DISABLED in the BIOS, so I am unsure what exactly is being recorded here (the power cycles are manual, not automatic):

ID  Timestamp            Sensor  Sensor Type  Event
6   2020/10/08 10:45:38  #0xca   Watchdog 2   Timer Interrupt - Assertion
7   2020/10/08 10:49:37  #0xca   Watchdog 2   Timer Interrupt - Assertion
8   2020/10/08 10:51:40  #0xca   Watchdog 2   Power Cycle - Assertion
9   2020/10/09 10:02:49  #0xca   Watchdog 2   Timer Interrupt - Assertion
10  2020/10/09 10:06:57  #0xca   Watchdog 2   Timer Interrupt - Assertion
11  2020/10/09 10:08:58  #0xca   Watchdog 2   Power Cycle - Assertion
12  2020/10/12 01:07:31  #0xca   Watchdog 2   Timer Interrupt - Assertion
13  2020/10/12 01:11:30  #0xca   Watchdog 2   Timer Interrupt - Assertion
14  2020/10/12 01:13:31  #0xca   Watchdog 2   Power Cycle - Assertion
15  2020/10/12 21:56:06  #0xca   Watchdog 2   Timer Interrupt - Assertion
16  2020/10/12 21:59:56  #0xca   Watchdog 2   Timer Interrupt - Assertion
17  2020/10/12 22:01:57  #0xca   Watchdog 2   Power Cycle - Assertion
18  2020/10/13 23:55:40  #0xca   Watchdog 2   Timer Interrupt - Assertion
19  2020/10/13 23:59:44  #0xca   Watchdog 2   Timer Interrupt - Assertion
20  2020/10/14 00:01:45  #0xca   Watchdog 2   Power Cycle - Assertion
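
To put a number on how often this recurs, the spacing between the logged Power Cycle events can be computed directly. The following is a minimal sketch, not part of the original report: the `gaps_in_hours` helper is a hypothetical name, and the timestamps are copied by hand from events 8, 11, 14, 17 and 20 above.

```python
from datetime import datetime

# Power Cycle timestamps copied from the IPMI event log (events 8, 11, 14, 17, 20).
POWER_CYCLES = [
    "2020/10/08 10:51:40",
    "2020/10/09 10:08:58",
    "2020/10/12 01:13:31",
    "2020/10/12 22:01:57",
    "2020/10/14 00:01:45",
]


def gaps_in_hours(stamps):
    """Return the hours elapsed between each pair of consecutive timestamps,
    rounded to one decimal place."""
    times = [datetime.strptime(s, "%Y/%m/%d %H:%M:%S") for s in stamps]
    return [
        round((later - earlier).total_seconds() / 3600, 1)
        for earlier, later in zip(times, times[1:])
    ]


if __name__ == "__main__":
    for hours in gaps_in_hours(POWER_CYCLES):
        print(f"{hours} h since the previous power cycle")
```

On this data the gaps come out at about 23.3, 63.1, 20.8 and 26.0 hours, so the interval is irregular rather than a fixed period.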

Problem/Justification

None

Impact

None

Activity

Alexander Motin 
April 6, 2022 at 5:14 PM

Unfortunately not. I see you have watchdog resets almost daily, but no kernel dumps since the one in December. Do you get any more feedback on the console or in the debug (similar to what was posted before) if you disable the watchdog with `service watchdogd stop`? If you do, please open a separate ticket.

Taney 
April 6, 2022 at 2:40 AM

Does this provide anything useful?

Alexander Motin 
February 14, 2022 at 3:10 PM

This ticket is more than a year old. If you still have the issue (assuming it is the same one) on the latest TrueNAS 12.0-U8, please create a new ticket and attach the debug information. You may also try TrueNAS 13.0-BETA, which was published recently.

Taney 
February 11, 2022 at 4:31 PM

Has this all been resolved in the latest version? It's still happening on my setup, and I have the same motherboard.

Alexander Motin 
November 23, 2020 at 5:59 PM

Closing this, since the fix provided by upstream looks very promising.

Complete

Created October 17, 2020 at 10:40 AM
Updated July 1, 2022 at 4:54 PM
Resolved November 23, 2020 at 5:59 PM