Boot Issues with TrueNAS-SCALE-22.02.0.1

Description

I am having two boot issues with a fresh install of TrueNAS SCALE on my Lenovo ThinkServer TD340.

System info:
Lenovo ThinkServer TD340 bios: A3TSF5A
MoBo: L32TT2 (Lenovo)
CPU: Intel Xeon E5-2420 v2
Memory: 48G DDR3 1333 ECC
Boot Drive: Crucial CT256M4SSD2
NIC: AOC-STGN-I2S (Intel 82599)

1. Booting with SR-IOV enabled: With SR-IOV enabled in the BIOS, the TrueNAS system never fully boots. It reaches GRUB and starts the TrueNAS boot process, but then goes into what appears to be some sort of loop. I left it running for about an hour and it never finished booting. It constantly showed the same error messages:

May 1 19:18:33 truenas kernel: Sending NMI from CPU 6 to CPUs 2:
May 1 19:18:33 truenas kernel: NMI backtrace for cpu 2
May 1 19:18:33 truenas kernel: CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.93+truenas #1
May 1 19:18:33 truenas kernel: Hardware name: LENOVO ThinkServer TD340 /ThinkServer TD340, BIOS A3TSF5A 09/11/2020
May 1 19:18:33 truenas kernel: RIP: 0010:io_serial_in+0x14/0x20
May 1 19:18:33 truenas kernel: Code: 00 00 d3 e6 48 63 f6 48 03 77 10 8b 06 c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 0f b6 8f b9 00 00 00 8b 57 08 d3 e6 01 f2 ec <0f> b6 c0 c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 0f b6 8f b9 00
May 1 19:18:33 truenas kernel: RSP: 0000:ffffb467400239d8 EFLAGS: 00000002
May 1 19:18:33 truenas kernel: RAX: ffffffffb7be5700 RBX: ffffffffb95f5220 RCX: 0000000000000000
May 1 19:18:33 truenas kernel: RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffffb95f5220
May 1 19:18:33 truenas kernel: RBP: 0000000000001eec R08: 0000000000000002 R09: 0000000000000882
May 1 19:18:33 truenas kernel: R10: 000000000000075d R11: 207375625f696370 R12: 0000000000000020
May 1 19:18:33 truenas kernel: R13: ffffffffb94ea3ee R14: 0000000000000001 R15: 0000000000000000
May 1 19:18:33 truenas kernel: FS: 0000000000000000(0000) GS:ffff9fbcffa80000(0000) knlGS:0000000000000000
May 1 19:18:33 truenas kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 1 19:18:33 truenas kernel: CR2: 0000000000000000 CR3: 000000072520a001 CR4: 00000000001706e0
May 1 19:18:33 truenas kernel: Call Trace:
May 1 19:18:33 truenas kernel: wait_for_xmitr+0x40/0xb0
May 1 19:18:33 truenas kernel: serial8250_console_putchar+0x18/0x30
May 1 19:18:33 truenas kernel: ? wait_for_xmitr+0xb0/0xb0
May 1 19:18:33 truenas kernel: uart_console_write+0x43/0x50
May 1 19:18:33 truenas kernel: serial8250_console_write+0x300/0x380
May 1 19:18:33 truenas kernel: ? vt_console_print+0x2bb/0x3f0
May 1 19:18:33 truenas kernel: console_unlock+0x3c6/0x530
May 1 19:18:33 truenas kernel: vprintk_emit+0x208/0x250
May 1 19:18:33 truenas kernel: dev_vprintk_emit+0x12c/0x150
May 1 19:18:33 truenas kernel: dev_printk_emit+0x4e/0x65
May 1 19:18:33 truenas kernel: ? __dev_printk+0x2d/0x69
May 1 19:18:33 truenas kernel: _dev_info+0x6c/0x83
May 1 19:18:33 truenas kernel: pci_bus_dump_resources.cold+0x14/0x19
May 1 19:18:33 truenas kernel: pci_bus_dump_resources+0x5b/0x70
May 1 19:18:33 truenas kernel: pci_assign_unassigned_root_bus_resources+0xfd/0x1c0
May 1 19:18:33 truenas kernel: pci_assign_unassigned_resources+0x1f/0x7c
May 1 19:18:33 truenas kernel: pcibios_assign_resources+0x1b/0xcd
May 1 19:18:33 truenas kernel: ? xsk_init+0xbe/0xbe
May 1 19:18:33 truenas kernel: do_one_initcall+0x44/0x1d0
May 1 19:18:33 truenas kernel: kernel_init_freeable+0x21e/0x280
May 1 19:18:33 truenas kernel: ? rest_init+0xb4/0xb4
May 1 19:18:33 truenas kernel: kernel_init+0xa/0x10c
May 1 19:18:33 truenas kernel: ret_from_fork+0x22/0x30
May 1 19:18:33 truenas kernel: pci_bus 0000:0c: resource 10 [mem 0x80000000-0xfbffffff window]
May 1 19:18:33 truenas kernel: rcu: rcu_sched kthread starved for 706 jiffies! g-1119 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=1
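For anyone comparing this backtrace against others in the attached debug files, the following is a minimal sketch (not part of TrueNAS or any official tooling) that pulls the reliable frames out of a syslog-style kernel call trace. It assumes the line format shown in the log above; the function name `call_trace_symbols` and the `SAMPLE` list are illustrative only, and the sample lines are copied from the log above. It only summarizes the trace and does not by itself identify a root cause.

```python
import re

# Matches a stack frame in a syslog-style kernel line, for example:
#   "May 1 19:18:33 truenas kernel: wait_for_xmitr+0x40/0xb0"
# Group 1 is the optional "? " marker (an unreliable frame), group 2 the symbol.
FRAME_RE = re.compile(
    r"kernel:\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def call_trace_symbols(lines):
    """Return the non-'?' symbols that follow a 'Call Trace:' line, in order."""
    symbols = []
    in_trace = False
    for line in lines:
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            in_trace = False  # the first non-frame line ends the trace
            continue
        if match.group(1) is None:  # skip frames marked with "?"
            symbols.append(match.group(2))
    return symbols


# A few lines copied from the log above, used only to exercise the function.
SAMPLE = [
    "May 1 19:18:33 truenas kernel: RIP: 0010:io_serial_in+0x14/0x20",
    "May 1 19:18:33 truenas kernel: Call Trace:",
    "May 1 19:18:33 truenas kernel: wait_for_xmitr+0x40/0xb0",
    "May 1 19:18:33 truenas kernel: serial8250_console_putchar+0x18/0x30",
    "May 1 19:18:33 truenas kernel: ? wait_for_xmitr+0xb0/0xb0",
    "May 1 19:18:33 truenas kernel: pci_bus_dump_resources.cold+0x14/0x19",
    "May 1 19:18:33 truenas kernel: pci_bus 0000:0c: resource 10 [mem 0x80000000-0xfbffffff window]",
]

if __name__ == "__main__":
    for symbol in call_trace_symbols(SAMPLE):
        print(symbol)
```

Run against the full trace above, the frames read from `wait_for_xmitr` (innermost) down to `ret_from_fork`, which shows the CPU busy in the serial console write path while PCI bus resources were being printed.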

2. Long boot times with SR-IOV disabled: The other issue is what I think would be considered long boot times. Timed from the moment GRUB starts to when the web GUI is available, a boot usually takes 20-25 minutes.
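To narrow down where a boot of this length spends its time, one option is to look for the largest pauses between consecutive timestamped log lines. The sketch below is only an illustration under stated assumptions: it expects syslog-style lines in the same `May 1 19:18:33 host ...` format as the kernel log in this ticket, an English month abbreviation, and a year supplied by the caller (syslog timestamps carry none; 2022 is assumed here). The function name `largest_gaps` and the `SAMPLE` lines are hypothetical and are not taken from the attached debug files.

```python
from datetime import datetime


def largest_gaps(lines, top=3, year=2022):
    """Return the `top` largest pauses as (seconds, line_before, line_after)."""
    stamped = []
    for line in lines:
        parts = line.split(None, 3)  # month, day, time, rest of the line
        if len(parts) < 4:
            continue
        try:
            # The year is prepended because syslog timestamps omit it (assumption).
            ts = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # not a timestamped line
        stamped.append((ts, line))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


# Hypothetical lines for illustration only; these are not real log output.
SAMPLE = [
    "May 1 19:00:00 truenas kernel: example message A",
    "May 1 19:00:02 truenas kernel: example message B",
    "May 1 19:12:02 truenas kernel: example message C",
    "May 1 19:12:05 truenas kernel: example message D",
]

if __name__ == "__main__":
    for seconds, before, after in largest_gaps(SAMPLE, top=1):
        print(f"{seconds:.0f}s pause between:\n  {before}\n  {after}")
```

The lines on either side of the largest gaps are a reasonable place to start reading the log, though a quiet stretch in the log does not necessarily mean the system was idle.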

If these two issues need to be split into separate Jira tickets, let me know.

I also tested TrueNAS Bluefin version 22.12-master-20220429-150401 and had the same results.

As a sanity check, I also tested some other boot environments, and they all seem to work as expected.

1. Debian 11.3 with SR-IOV enabled: It was just a base system, so not many packages had to load, but at least I know Linux can boot correctly with SR-IOV enabled.
2. TrueNAS CORE 12.0-U8 with SR-IOV enabled: The system booted successfully and took well under 2 minutes to boot fully.

I will attach debug files from both TrueNAS installs I did, 22.02.0.1 and 22.12. The system is not currently in production, so I can perform additional tests if required.

Problem/Justification

None

Impact

None

Attachments

5

Created May 3, 2022 at 9:45 PM
Updated December 15, 2022 at 6:05 PM
Resolved August 23, 2022 at 6:20 PM