TrueNAS / NAS-109396

12.0-U2 - TrueNAS VM crash if using more than 64GB RAM

    Details

    • Impact:
      Medium

      Description

      The system was upgraded from U1.1, where it had been stable for 26 days with no issues. On this build the VM reboots unexpectedly after roughly 1hr 10m to 1hr 15m of uptime. After a crash the entire ESXi host has to be restarted; otherwise, attempts to restart the VM hang at mps0 (LSI2008), failing with "Doorbell handshake failed". On the initial crash, vmware.log reports the following before the VM is shut down:

      vmx| E105: PANIC: PCI passthru device 0000:02:00.0 (LSI2008) caused an IOMMU fault type 6 at address 0xc0000000.
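
For anyone triaging a similar crash, a minimal sketch of pulling these entries out of vmware.log might look like the following. The regular expression is modelled only on the single line quoted above, so other ESXi builds may word the message differently; the function name `find_iommu_faults` and the dictionary keys are hypothetical, chosen for illustration.

```python
import re

# Pattern derived from the one log line in this report (assumption: the
# message format is the same elsewhere in the log and on other builds).
IOMMU_FAULT = re.compile(
    r"PCI passthru device "
    r"(?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"
    r" \((?P<name>[^)]*)\) caused an IOMMU fault type (?P<type>\d+)"
    r" at address (?P<addr>0x[0-9a-fA-F]+)"
)


def find_iommu_faults(lines):
    """Return one dict per line that matches the IOMMU fault pattern."""
    faults = []
    for line in lines:
        match = IOMMU_FAULT.search(line)
        if match:
            faults.append({
                "device": match.group("bdf"),
                "name": match.group("name"),
                "fault_type": int(match.group("type")),
                # int() with base 16 accepts the leading "0x"
                "address": int(match.group("addr"), 16),
            })
    return faults


# The line quoted in this report, used as sample input.
sample = (
    "vmx| E105: PANIC: PCI passthru device 0000:02:00.0 (LSI2008) "
    "caused an IOMMU fault type 6 at address 0xc0000000."
)
faults = find_iommu_faults([sample])
```

Because the function accepts any iterable of lines, an open file handle for vmware.log can be passed in directly. The faulting address 0xc0000000 parses to 3221225472, which is exactly 3 GiB.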

      Reverted to U1.1 w/ 96GB RAM (old config) and stress tested for a few hours with no issues.

      Upgraded back to U2 and set the VM to 96GB RAM; was able to crash the system within 10 minutes of stress testing (Plex + torrents).

      Rebooted the ESXi host to recover the HBA and set the VM to 64GB RAM. It has been stable for 10 hours (torrents only, brief Plex streaming), where it previously crashed at 1hr 10-15m.


      Hardware:
      R720xd
      12x1TB drives, 2xRAID-Z2 VDEVs
      192GB RAM (tests clean on memtest, no errors from iDRAC)
      2xIntel Xeon E5-2630v2 CPUs
      PERC H710P serving RAID0 of 2xIntel DC S3500 600GB SSDs for VM datastore
      ESXi-6.7.0-20201104001-standard (Dell, latest)
      VM is allocated 8 CPUs, 32GB disk, and 96GB RAM. RAM is reserved due to LSI2008 passthrough.

                People

                Assignee:
                releng Triage Team
                Reporter:
                freph91 freph
                Votes:
                0
                Watchers:
                3
