Failed to check for alert IPMISEL (and IPMISELSpaceLeft) alert mail

Description

I have been getting these alerts sporadically since upgrading to 11.3.

I'm able to run these commands from a login shell without errors.

I've tried lowering the severity of IPMI alerts as well as reducing the frequency to never, but it doesn't seem to make a difference.

Problem/Justification

None

Impact

None

Activity

Bug Clerk 
April 29, 2020 at 5:55 PM

Bug Clerk 
April 29, 2020 at 5:39 PM

Kai Groner 
April 20, 2020 at 6:49 PM
(edited)

Former user I ran

and reset the BMC via the web UI. During the reset, ipmitool produced a few different errors. Sometimes I would see a non-zero exit status, but usually not. Here is the output from one reset sequence.

Here's a different one:

And another:

I think I saw a reset sequence without a non-zero status, but it's out of my scrollback buffer now.

I'll note that several "check failed" alerts were generated while running these tests, and the kernel error sequences also appeared (though they were considerably shorter).

Vladimir Vinogradenko 
April 20, 2020 at 5:52 PM

Can you please try running these IPMI commands in an infinite loop to see how they crash? (ipmitool -c sel elist or ipmitool sel info)
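One way to run such a loop is sketched below. It repeats a command, captures stderr, and records every run that exits non-zero or prints an error. The two ipmitool invocations are the ones quoted above; the helper names `run_once` and `run_loop`, the timeout, and the delay are hypothetical choices for illustration, not anything FreeNAS ships.

```python
import subprocess
import sys
import time

# Commands quoted in the comment above; pick whichever one is under test.
COMMANDS = [
    ["ipmitool", "-c", "sel", "elist"],
    ["ipmitool", "sel", "info"],
]


def run_once(cmd, timeout=30):
    """Run cmd once and return (exit_status, stderr_text).

    A timeout or a missing binary is reported as exit status None.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError) as exc:
        return None, str(exc)
    return proc.returncode, proc.stderr.strip()


def run_loop(cmd, iterations=None, delay=1.0):
    """Run cmd repeatedly and collect every run that failed or wrote to stderr.

    iterations=None loops until interrupted with Ctrl-C.
    """
    failures = []
    count = 0
    try:
        while iterations is None or count < iterations:
            status, err = run_once(cmd)
            if status != 0 or err:
                # Record errors even when the exit status is zero, since
                # some failures may only show up as stderr output.
                failures.append((count, status, err))
                print(f"run {count}: status={status} stderr={err!r}", file=sys.stderr)
            count += 1
            time.sleep(delay)
    except KeyboardInterrupt:
        pass
    return failures
```

Calling `run_loop(COMMANDS[0])` loops until interrupted and returns the list of failing runs. Passing `iterations=100` bounds the test instead.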

Kai Groner 
April 20, 2020 at 5:23 PM

I've updated the IPMI firmware on this machine to see if the KCS channel errors go away.

If those errors are the root cause of my issue, I guess there are two things that FreeNAS could be doing that would make this better:

  1. Report the error text from ipmitool somewhere. I'm guessing there's an I/O error, which would have been helpful to see.

  2. If BMC flakiness is just a fact of life, maybe it should be possible to ignore it outright or require multiple consecutive failures before sending alerts. (I tried setting the IPMISELSpaceLeft alerting frequency to never, but it doesn't seem to affect these "check failed" alerts.)
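Both suggestions can be illustrated with a short sketch: keep the error text from ipmitool, and alert only after several consecutive failures. This is not the FreeNAS alert code. The class `ConsecutiveFailureGate`, the function `check_sel`, and the threshold of 3 are hypothetical names and values chosen for the example.

```python
import subprocess


class ConsecutiveFailureGate:
    """Signal an alert only after `threshold` failures in a row."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0
        self.last_error = ""

    def record(self, ok, error_text=""):
        """Record one check result; return True when an alert should be raised."""
        if ok:
            # Any success resets the streak, so isolated BMC hiccups are ignored.
            self.failures = 0
            self.last_error = ""
            return False
        self.failures += 1
        self.last_error = error_text
        return self.failures >= self.threshold


def check_sel(cmd=("ipmitool", "sel", "info"), timeout=30):
    """Run one check and return (ok, error_text), preserving the command's stderr."""
    try:
        proc = subprocess.run(list(cmd), capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError) as exc:
        return False, str(exc)
    return proc.returncode == 0, proc.stderr.strip()
```

A periodic checker could call `gate.record(*check_sel())` on each run and include `gate.last_error` in the alert body once `record` returns True, so the message carries the underlying error text rather than a bare "check failed".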

Complete

Details

Created April 18, 2020 at 6:20 PM
Updated May 4, 2020 at 6:16 PM
Resolved April 29, 2020 at 5:55 PM