TrueNAS Scale: Stuck on boot for 15min exception in import_on_boot

Description

Setup: TrueNAS Scale 22.02-RC.2

QNAP ts-453D

DOM: EFI-BOOT, mounts boot-pool, starts truenas debian kernel
2 PCIE SATA Disks (mirrored pools: boot-pool, system)
2 SATA HDD (mirrored pool: backup)
2 SATA SSD (mirrored pool: data)

Boot proceeds up to the pool import stage. The import of the backup pool fails (error: not found, in middlewared/pool.py), and boot stalls for 15 minutes before continuing. During the stall TrueNAS is not accessible, neither through the WebGUI nor over ssh. Afterwards the system boots into an OK state (all pools imported and fine). In some cases boot stalls indefinitely: this happens when middlewared shows "no limit" instead of the 15-minute limit.

Logs attached:

middlewared.log, kern.log, dmesg.out, disks, syslog, zpool.list, zpool.status

Problem/Justification

None

Impact

None

Activity

Ryan Moeller 
June 17, 2022 at 3:03 PM

We cannot reproduce this and cannot sink more time into investigating what may possibly be related to this unsupported configuration you have created.

Ameer 
June 16, 2022 at 6:56 PM
(edited)

Hello,

There seems to be a misconfiguration in the boot device setup. By default, the system dataset is located in the boot pool, but it can be moved to a storage pool (on a different disk) if configured. The attached logs show that the system dataset is on a separate pool that nevertheless resides on the same disk as the boot pool, which indicates that SCALE is installed on a partitioned disk or that there is some custom configuration. SCALE needs to own the whole boot disk to operate correctly.

Moreover, I tested a scenario in which I created a docker application from scratch on one storage pool and replicated it to another storage pool; the reboot afterwards worked just fine.

Kenan Isgör 
March 22, 2022 at 12:41 PM
(edited)

Hi there,

I just reproduced the problem again (timeout during ZFS import at boot).

System: 22.02-RELEASE

Changed and tested:

  • Unset pool for ix-applications

  • Completely deleted the ix-applications dataset incl. snapshots

  • Replication of system dataset completely removed from backup dataset

  • Set the pool for ix-applications to system again and installed the docker applications from scratch (Observation: before, ix-applications had a sub-dataset for each subdirectory except the "backup" directory; now there is only the one dataset "ix-applications". After replication of the system dataset to the backup dataset, however, there are separate snapshots per subdirectory again, e.g. "backup/snapshots/system/ix-applications/docker@auto-xx", "...ix-applications/release@auto-xx", etc.)

  • Replicated the system dataset incl. the ix-applications dataset to the backup dataset again

  • Reboot timed out at import of ZFS pools

  • Debug saved and attached in private section

No timeout occurs once the replicated copy of the system dataset is destroyed in the backup dataset (as before).
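To check which replicated entries are involved before destroying anything, the dataset and snapshot names can be filtered for the ix-applications subtree under the backup target. The following is only a rough sketch: the helper name `replicated_app_entries` and the sample list are made up for illustration, and the default root `backup/snapshots/system` is taken from the example path in the list above. On a real system the names would presumably come from something like `zfs list -H -o name -t all -r backup`.

```python
def replicated_app_entries(names, backup_root="backup/snapshots/system"):
    """Return the dataset and snapshot names that belong to the
    replicated ix-applications subtree under backup_root.

    names: iterable of ZFS dataset/snapshot names, one per entry.
    backup_root: assumed replication target, taken from the example path.
    """
    prefix = f"{backup_root}/ix-applications"
    return [
        name
        for name in names
        # Match the dataset itself, its children, and its own snapshots,
        # but not siblings that merely share the prefix.
        if name == prefix
        or name.startswith(prefix + "/")
        or name.startswith(prefix + "@")
    ]


# Hypothetical sample input; "auto-xx" is the placeholder used in this ticket.
sample = [
    "backup",
    "backup/snapshots/system",
    "backup/snapshots/system/ix-applications",
    "backup/snapshots/system/ix-applications/docker",
    "backup/snapshots/system/ix-applications/docker@auto-xx",
    "backup/snapshots/system/ix-applications-old",
    "data",
]

for entry in replicated_app_entries(sample):
    print(entry)
```

The function only reports names and does not modify any pool, so it is safe to run against a listing before deciding what to remove.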

Please advise if you need another session. Thanks!

Kenan Isgör 
March 14, 2022 at 2:19 PM

... and the replication of system/ix-applications/docker is also part of the problem. If it is deleted, the boot runs through fine.

Kenan Isgör 
March 14, 2022 at 11:40 AM

The issue also persists in the RELEASE version.

Cannot Reproduce

Created January 3, 2022 at 9:50 PM
Updated October 31, 2022 at 9:50 AM
Resolved June 17, 2022 at 3:03 PM