TrueNAS Upgrade Constant Rebooting

Description

I am having a constant reboot issue, which I described here.

Below, I have attached the debug files for both of the machines that are rebooting.

Thanks again in advance!

Problem/Justification

None

Impact

None

Activity

Alexander Motin 
August 23, 2021 at 9:32 PM

I can't say anything with the information available.

Alexander Motin 
August 23, 2021 at 9:31 PM

Please create your own ticket if you are still experiencing the issue after updating to the latest 12.0-U5.  Your issue may be completely unrelated.

Quinten 
May 17, 2021 at 9:27 AM

While looking through the Open Issues I came across this ticket, which describes the same issue I have.
My /var/log/messages showed these two errors after two different unscheduled reboots:

TrueNAS-12.0-U2-U3 (maybe also from 12.0-U1)
Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x30
fault code = supervisor read data, page not present
Instruction pointer = 0x20:0xffffffff80fd0af0
stack pointer = 0x28:0xfffffe00f2d4f4b0
frame pointer = 0x28:0xfffffe00f2d4f5a0
code segment = base 0x0, limit 0xfffff, type 0x1b
<hostname> = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 92487 (python3.8)
trap number = 12
panic: page fault

TrueNAS-12.0-U3.1
Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 02
instruction pointer = 0x20:0xffffffff80fd0cfa
stack pointer = 0x28:0xfffffe00eeaa64b0
frame pointer = 0x28:0xfffffe00eeaa65a0
code segment = base 0x0, limit 0xfffff, type 0x1b
<hostname> = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 45783 (python3.8)
trap number = 9
panic: general protection fault
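
One quick way to compare the two reports above is to parse their "key = value" lines and subtract the instruction pointers. The following is a minimal sketch only: the helper names `parse_trap` and `instruction_address` are hypothetical, and the sample strings contain just the two lines needed from each report. A small distance between the two addresses is consistent with both faults occurring in nearby kernel code, but it does not prove that; resolving the addresses to a function would need the matching kernel symbols.

```python
def parse_trap(report: str) -> dict:
    """Parse the 'key = value' lines of a trap report into a dict.

    Keys are lower-cased so 'Instruction pointer' and
    'instruction pointer' are treated the same.
    """
    fields = {}
    for line in report.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


def instruction_address(fields: dict) -> int:
    """Return the offset part of 'instruction pointer = selector:offset'."""
    _selector, _, offset = fields["instruction pointer"].partition(":")
    return int(offset, 16)


# Lines copied from the two reports above (12.0-U2/U3 and 12.0-U3.1).
first = """Instruction pointer = 0x20:0xffffffff80fd0af0
trap number = 12"""
second = """instruction pointer = 0x20:0xffffffff80fd0cfa
trap number = 9"""

delta = instruction_address(parse_trap(second)) - instruction_address(parse_trap(first))
print(hex(delta))  # distance in bytes between the two faulting instructions
```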

Does this provide useful information for you?

Alexander Motin 
May 13, 2021 at 1:50 PM

I can't say anything without kernel dumps in the debug or anything in the logs.  I see no recent records in the BMC logs, so it is likely not a watchdog reset, and so there should be something printed on the console.  You could try to set up and record serial console (physical or LAN) output to get some idea about what is going on.
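
For the recording part, a minimal sketch, assuming the console output is already available as a byte stream (for example a serial device opened for reading, or the output of a serial-over-LAN session piped in). The function name `log_console` and the sample bytes are hypothetical; how the stream is obtained depends on the hardware and is not shown.

```python
import io
import time


def log_console(stream, out, clock=time.time):
    """Copy lines from a byte stream to a text log, one timestamp per line.

    Returns the number of lines written.
    """
    count = 0
    for raw in stream:
        # Console output may contain stray bytes; never fail on decoding.
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        out.write(f"{stamp} {text}\n")
        out.flush()  # keep the log current in case the host resets mid-panic
        count += 1
    return count


# Demonstration with in-memory streams standing in for the real console.
sample = io.BytesIO(b"Fatal trap 12: page fault while in kernel mode\r\npanic: page fault\r\n")
log = io.StringIO()
written = log_console(sample, log)
lines = log.getvalue().splitlines()
```

Flushing after every line matters here because the interesting output is whatever is printed just before the reset.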

I see that both systems are running BIOS from 2013.  Is that the latest version for that hardware?  TrueNAS 12 started updating CPU microcode to cover vulnerabilities, but I always prefer it to be done by BIOS.  Though I really doubt Dell would release even security patches 10 years later.

William Gryzbowski 
May 13, 2021 at 1:12 PM

None of the attached debugs contains a kernel crash dump, which is odd.

Can you attach a newer one, please?

Cannot Reproduce

Created February 6, 2021 at 4:55 AM
Updated July 1, 2022 at 5:14 PM
Resolved August 23, 2021 at 9:32 PM