Cannot Reproduce
Assignee: Triage Team
Reporter: freph
Impact: Medium
Created February 13, 2021 at 5:49 PM
Updated July 1, 2022 at 5:13 PM
Resolved August 23, 2021 at 9:37 PM
Upgraded from U1.1, which had been stable for 26 days with 0 issues. On this build I get unscheduled reboots after 1hr 10m to 1hr 15m of uptime. The entire ESXi host has to be restarted afterwards; otherwise attempts to restart the VM hang at mps0 (LSI2008), failing with "Doorbell handshake failed". On the initial crash, vmware.log reports the following before shutting down the VM:
vmx| E105: PANIC: PCI passthru device 0000:02:00.0 (LSI2008) caused an IOMMU fault type 6 at address 0xc0000000.
Reverted to U1.1 w/ 96GB RAM (old config) and stress tested for a few hours with no issues.
Upgraded back to U2 with the VM still set to 96GB RAM and was able to crash the system within 10 minutes of stress testing (Plex + torrents).
Rebooted the ESXi host to recover the HBA and set the VM to 64GB RAM. It has now been stable for 10 hours (torrents only, brief Plex streaming), whereas it previously crashed at 1hr 10-15m.
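For anyone trying to check whether they hit the same fault, the PANIC line from vmware.log quoted earlier can be picked out with a small script. This is only a sketch: the regular expression is modeled on that single quoted line, so other ESXi builds may word the message differently, and the names `FAULT_RE` and `find_iommu_faults` are made up for illustration.

```python
import re

# Pattern modeled only on the one log line quoted in this report;
# it is an assumption that other occurrences use the same wording.
FAULT_RE = re.compile(
    r"PANIC: PCI passthru device "
    r"(?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"\((?P<name>[^)]*)\) caused an IOMMU fault type (?P<type>\d+) "
    r"at address (?P<addr>0x[0-9a-fA-F]+)"
)


def find_iommu_faults(lines):
    """Return one dict per line that matches the passthrough IOMMU fault message."""
    faults = []
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            faults.append({
                "bdf": m.group("bdf"),              # PCI address of the device
                "name": m.group("name"),            # device label from the log
                "fault_type": int(m.group("type")),
                "address": int(m.group("addr"), 16),
            })
    return faults


# The line quoted in this report, used as sample input.
sample = (
    "vmx| E105: PANIC: PCI passthru device 0000:02:00.0 (LSI2008) "
    "caused an IOMMU fault type 6 at address 0xc0000000."
)
faults = find_iommu_faults([sample])
```

To scan a real log, pass an open file object instead of the sample list, for example `find_iommu_faults(open(path))`, where `path` points at the VM's vmware.log.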
Hardware:
R720xd
12x1TB drives, 2xRAID-Z2 VDEVs
192GB RAM (tests clean on memtest, no errors from iDRAC)
2xIntel Xeon E5-2630v2 CPUs
PERC H710P serving RAID0 of 2xIntel DC S3500 600GB SSDs for VM datastore
ESXi-6.7.0-20201104001-standard (Dell, latest)
VM is allocated 8 CPUs, 32GB disk, and 96GB RAM. RAM is reserved due to LSI2008 passthrough.