gptzfsboot takes 45+ minutes to boot after upgrade from 11.2U6 to 11.2U7.

Description

Upgraded from 11.2U6 to 11.2U7. The reboot into U7 took 45+ minutes. All this time was spent with gptzfsboot. It was spamming the console with errors like this:

gptzfsboot: error 128 lba [some number]

45 minutes later, it quickly printed out:

BTX loader 1.00 BTX version is 1.02

Consoles: internal video/keyboard

BIOS drive C: is disk0

BIOS drive D: is disk1

... and so on until

BIOS drive P: is disk13

... and from here on out it booted normally. The system runs perfectly once booted. From what I can tell, gptzfsboot is scanning every single drive in the system. I've done some reading on both the iXsystems forum and the FreeBSD forums, and people say it is scanning each drive looking for bootable partitions, and that the error is printed when it can't find a proper partition table. I don't care what it's doing, but 45 minutes is way too long to be doing it. People who have tried reverting to an older boot environment say that it still takes the same amount of time; this makes sense, since the boot loader code runs before any boot environment is selected, so once it's updated it applies to all of them. I was looking in Jira, and it looks like you were doing enhancements and work around gptzfsboot and zfsboot. Maybe something unexpected has happened as a result?
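For anyone who can capture the boot output over a serial console, a small script along these lines could help quantify the problem when reporting it. This is only a sketch: it assumes the log contains lines in the two formats quoted above, that the LBA is printed as a decimal number, and the function name `summarize_boot_log` and the LBA values in the sample are made up for illustration.

```python
import re
from collections import Counter

# Matches the error line quoted in this report.
# Assumption: the LBA is printed as a decimal number.
ERROR_RE = re.compile(r"gptzfsboot: error (\d+) lba (\d+)")
# Matches the loader's drive enumeration, e.g. "BIOS drive C: is disk0".
DRIVE_RE = re.compile(r"BIOS drive ([A-Z]): is (disk\d+)")


def summarize_boot_log(text):
    """Count gptzfsboot errors per error code and list the BIOS drives seen."""
    error_counts = Counter()
    lbas = set()
    drives = []
    for line in text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            error_counts[int(match.group(1))] += 1
            lbas.add(int(match.group(2)))
            continue
        match = DRIVE_RE.search(line)
        if match:
            drives.append(match.group(2))
    return {
        "error_counts": dict(error_counts),
        "distinct_lbas": len(lbas),
        "drives": drives,
    }


if __name__ == "__main__":
    # Hypothetical sample; the LBA values are invented for illustration.
    sample = "\n".join([
        "gptzfsboot: error 128 lba 1000",
        "gptzfsboot: error 128 lba 2000",
        "gptzfsboot: error 128 lba 1000",
        "BTX loader 1.00 BTX version is 1.02",
        "BIOS drive C: is disk0",
        "BIOS drive D: is disk1",
    ])
    print(summarize_boot_log(sample))
```

Comparing the error count and the number of drives listed before and after a change (for example, after hiding drives from the BIOS) would show whether the change actually reduced the scanning.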

Some of the forum links:

https://forums.freebsd.org/threads/is-there-a-way-to-tell-gptzfsboot-to-skip-probing-other-hdds.73805/

https://forums.freebsd.org/threads/gptzfsboot-error-128-after-adding-new-disks.65677/#post-450688

https://www.ixsystems.com/community/threads/gptzfsboot-error-128-lba-some-block-prints-these-errors-on-boot-but-eventually-after-30-mins-boots.81439/#post-569818

Problem/Justification

None

Impact

None

Activity

Peter Rapcan 
March 4, 2020 at 10:25 PM
(edited)

I am booting from a PATA SSD (actually two of them, mirrored). My data disks are SATA and there is no need to scan those, so in my case an option to suppress scanning of SATA drives (or, for other people, some defined set of drives) would do. In general, why would I want even the second mirrored drive to be scanned when gptzfsboot is already running from one of the drives, and hence can boot from that drive without scanning any other?

Alexander Motin 
March 4, 2020 at 10:08 PM

How would we identify which disks are "other" and should not be scanned if we allow mirrored boot pools? What device are you booting from?

Peter Rapcan 
March 4, 2020 at 9:54 PM
(edited)

The problem persists for me on 11.3 and 11.3-U1 as well: booting takes more than 30 minutes with 4 data drives!!! This issue renders FreeNAS unusable for me.

As for "I don't see what practically we can do here", a simple option for gptzfsboot to skip scanning other disks would do the job. 

Alexander Motin 
February 21, 2020 at 1:28 AM

Good that it helped you. I don't see what we can practically do here. Obviously the boot loader could be improved, but it is a very complicated area, considering the number of random BIOS issues. Hopefully the switch to UEFI will finally solve them.

Edmond Shwayri 
February 20, 2020 at 6:57 PM

I expelled both Samsung SSDs. As I said, I've always wanted to do it, but never took the time; the Intel SSDs I use for the ZIL remain. I plan to install a 16-drive SSD data volume at some point in the future. When I do that, I will likely also install a PCIe Optane drive to act as ZIL for the slow hard drives. I may just skip it if I can get everything I need to use sync writes on the SSD storage.

I updated the firmware on all three LSI HBAs in the system. Most importantly, I erased the BIOS on all three cards so the drives wouldn't be seen at boot time. This doesn't fix the issue, but it works around it. Boot time is now back to normal. This is only possible for me because my data pool isn't on the on-board SATA ports, so it probably won't work for others. I don't think this is a FreeNAS bug; rather, it is an issue with how FreeBSD's gptzfsboot handles legacy boot when there are many visible drives. I don't know if it would do the same in EFI mode. One day maybe I will re-install in EFI mode.

I also went ahead and upgraded to U8. There is no way I will install 11.3 until U4 or U5 is released. The issues being reported in the forums are alarming, to say the least. I will allow more time for these issues to be ironed out. I usually run a FreeNAS install in a VM to test new versions until I feel comfortable upgrading.

User Configuration Error

Created February 16, 2020 at 8:36 PM
Updated July 1, 2022 at 4:50 PM
Resolved February 21, 2020 at 1:28 AM