142 HDDs in 71 mirror VDEVs
4x NVMe L2ARC
A long running, read-heavy workload had loaded over 4 TiB of data into L2ARC. Upon failover, the following message appeared on the serial console and the newly elected active controller would not display the UI. During the L2ARC rebuild, one CPU core per NVMe device was 100% pegged so the rebuild seems to be CPU bound.
ctl_ha_role_sysctl: CTL_LUNREQ_MODIFY returned 3 'no file argument specified'
Once the rebooted controller fully booted, we saw the following message:
carp: 20@ntb0: MASTER -> BACKUP (more frequent advertisement received)
In the end, after 33 minutes, the UI came up and all was well, the import succeeded. However, ZFS is ready to service IO long before the L2ARC rebuild is complete, so we believe middleware should not block on the complete rebuild (given how massive L2ARC has a potential to be).