gptzfsboot takes 45+ minutes to boot after upgrade from 11.2U6 to 11.2U7.
Description
Problem/Justification
Impact
Activity
Peter Rapcan March 4, 2020 at 10:25 PM(edited)
I am booting from a PATA SSD (actually 2 of those are mirrored). My data disks are SATA & there is no need to scan those (so in my case an option to suppress scanning of SATA drives [for other people some defined set of drives] would do). In general, why would I want even the second mirrored drive to be scanned in a situation when gptzfsboot is running from one of the drives (hence it can boot from that drive, without scanning any other drive).
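The behaviour requested above can be sketched roughly as follows. This is only an illustrative model in Python, not gptzfsboot code: the function name `drives_to_probe` and the `skip_others` switch are hypothetical, and no such option exists in the loader as far as this thread shows. Drives are identified by BIOS drive numbers, with hard disks starting at 0x80.

```python
def drives_to_probe(all_drives, boot_drive, skip_others=True):
    """Return the BIOS drives a loader would probe, in probe order.

    Hypothetical model: with skip_others=True only the drive the loader
    was started from is probed; otherwise the boot drive is probed first
    and every other BIOS-visible drive follows.
    """
    if boot_drive not in all_drives:
        raise ValueError("boot drive is not among the BIOS-visible drives")
    if skip_others:
        return [boot_drive]
    return [boot_drive] + [d for d in all_drives if d != boot_drive]


# Example: 14 BIOS-visible drives, loader started from the first one.
all_drives = list(range(0x80, 0x80 + 14))
print([hex(d) for d in drives_to_probe(all_drives, 0x80)])
print(len(drives_to_probe(all_drives, 0x80, skip_others=False)))
```

The sketch does not answer the question of how the second member of a mirrored boot pool would be identified; it only shows the difference in how many drives get touched.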
Alexander Motin March 4, 2020 at 10:08 PM
How would we identify which disks are "other" and should not be scanned if we allow mirrored boot pools? What device are you booting from?
Peter Rapcan March 4, 2020 at 9:54 PM(edited)
The problem persists for me on 11.3 and 11.3-U1 as well: booting takes more than 30 minutes with 4 data drives!!! This issue renders FreeNAS unusable for me.
As for "I don't see what practically we can do here", a simple option for gptzfsboot to skip scanning other disks would do the job.
Alexander Motin February 21, 2020 at 1:28 AM
Good that it helped you. I don't see what we can practically do here. Obviously the boot loader could be improved, but it is a very complicated area, considering the number of random BIOS issues. Hopefully the switch to UEFI will finally solve them.
Edmond Shwayri February 20, 2020 at 6:57 PM
I expelled both Samsung SSDs. As I said, I've always wanted to do it, but never took the time; the Intel SSDs I use for the ZIL remain. I plan to install a 16-drive SSD data volume at some point in the future. When I do that, I will likely also install a PCIe Optane drive to act as ZIL for the slow hard drives. I may just skip it if I can get everything I need to use sync writes on the SSD storage.
I updated the firmware on all three LSI HBAs in the system. Most importantly, I erased the BIOS on all three cards so the drives wouldn't be seen at boot time. This doesn't fix the issue, but it works around it. Boot time is now back to normal. This is only possible for me since my data pool isn't on the on-board SATA ports, so it probably won't work for others. I don't think this is a FreeNAS bug; rather, it is an issue with how FreeBSD's gptzfsboot handles legacy boot when there are many visible drives. I don't know if it would do the same in EFI mode. One day maybe I will re-install in EFI mode.
I also went ahead and upgraded to U8. There is no way I will install 11.3 until U4 or U5 is released. The issues being reported in the forums are alarming to say the least. I will allow more time for these issues to all iron out. I usually run a FreeNAS install in a VM to test new versions till I feel comfortable upgrading.
Upgraded from 11.2U6 to 11.2U7. The reboot into U7 took 45+ minutes. All this time was spent with gptzfsboot. It was spamming the console with errors like this:
gptzfsboot: error 128 lba [some number]
45 minutes later, it quickly printed out:
BTX loader 1.00 BTX version is 1.02
Consoles: internal video/keyboard
BIOS drive C: is disk0
BIOS drive D: is disk1
... and so on until
BIOS drive P: is disk13
... and from here on out it booted normally. System runs perfectly once booted. From what I can tell gptzfsboot is scanning every single drive in the system. I've done some reading on both the iXsystems forum and the FreeBSD forums, and people say it is scanning each drive looking for bootable partitions, and the error is printed out when it can't find a proper partition table. I don't care what it's doing, but 45 minutes is way too long to be doing it. People who have tried reverting back to an older boot environment say that it still takes the same amount of time; this makes sense since the boot loader code comes before boot environments, so once it's updated it's there for all of them. I was looking in Jira, and it looks like you were doing enhancements and work around gptzfsboot and zfsboot. Maybe something unexpected has happened as a result?
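For anyone comparing reports, the number of drives the BIOS exposes to the loader can be read off the "BIOS drive X: is diskN" lines. Below is a minimal Python sketch that parses saved console text (for example from a serial or IPMI capture); the function name `parse_bios_drives` is just illustrative, and the sample contains only the lines quoted above, not the ones elided with "... and so on".

```python
import re

# Matches loader lines such as "BIOS drive C: is disk0".
BIOS_DRIVE_RE = re.compile(r"^BIOS drive ([A-Z]): is disk(\d+)$")


def parse_bios_drives(console_text):
    """Return {drive_letter: disk_index} for each 'BIOS drive' line found."""
    drives = {}
    for line in console_text.splitlines():
        match = BIOS_DRIVE_RE.match(line.strip())
        if match:
            drives[match.group(1)] = int(match.group(2))
    return drives


sample = """\
BTX loader 1.00 BTX version is 1.02
Consoles: internal video/keyboard
BIOS drive C: is disk0
BIOS drive D: is disk1
BIOS drive P: is disk13
"""

drives = parse_bios_drives(sample)
highest = max(drives.values())
# Indices start at 0, so the highest index plus one is a lower bound
# on the number of BIOS-visible drives.
print(f"highest index: disk{highest}; at least {highest + 1} BIOS-visible drives")
```

Since disk indices start at 0, a highest index of disk13 means the BIOS is presenting at least 14 drives to the loader.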
Some of the forum links:
https://forums.freebsd.org/threads/is-there-a-way-to-tell-gptzfsboot-to-skip-probing-other-hdds.73805/
https://forums.freebsd.org/threads/gptzfsboot-error-128-after-adding-new-disks.65677/#post-450688
https://www.ixsystems.com/community/threads/gptzfsboot-error-128-lba-some-block-prints-these-errors-on-boot-but-eventually-after-30-mins-boots.81439/#post-569818