repeated POLLHUP on zfsd

Description

Like the title says, I'm getting continual "POLLHUP detected" messages spamming the console.

If it helps, the only task running is a zfs send -R | zfs recv.
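
Roughly this shape of pipeline (pool, dataset and snapshot names here are only placeholders, not the actual ones):

    # recursive replication stream piped into a receive (names are placeholders)
    zfs send -R tank@migrate | zfs recv -F backup/tank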

Debug dump/screenshot to follow.

Problem/Justification

None

Impact

None


Activity


Ryan Moeller 
May 10, 2021 at 6:21 PM

Rate limiting for deadman events has been merged in https://github.com/truenas/ports/pull/1015

Ryan Moeller 
March 11, 2021 at 5:38 PM
(edited)

I think that the slow_io_ms tuning explains it. Only delay and checksum events are rate limited, not the deadman events. That lines up with the number of delay reports being much lower than the deadman reports. I'll see if we can get the deadman events rate limited as well. In the meantime, you should avoid tuning the threshold so low.

Oh, correction: the deadman events use a different threshold, vfs.zfs.deadman.ziotime_ms: 300000
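
For reference, all three tunables involved can be read back in one go with sysctl (a quick check, assuming the stock FreeBSD OIDs are in place):

    # slow-IO threshold, slow-IO event rate limit, and deadman threshold
    sysctl vfs.zfs.zio.slow_io_ms vfs.zfs.slow_io_events_per_second vfs.zfs.deadman.ziotime_ms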

I'll have a look in the debug and see if I can find some further clues as to what's happening.

Stilez 
March 10, 2021 at 1:31 PM
(edited)

Debug file added.

No obvious reason comes to mind, except one possibility.

The sysctl vfs.zfs.zio.slow_io_ms is set to 200 ms. The pool is built from fast devices with not-so-long queues even under load, according to gstat - for example, an Optane mirror metadata vdev - and I can't think of a reason why an IO should take more than a few tens of milliseconds once issued to the disk, so I use the "slow IO" column to watch for device latency issues. Slow IO reporting is also "muted" to a maximum of 20 events a second (vfs.zfs.slow_io_events_per_second).

But it could be that, for reasons I don't have the knowledge to appreciate, the pool is sometimes experiencing large numbers of slow IO events, and that the limit of 20 doesn't mute the internal raising of events but only their external reporting. Could that be a possibility?
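
A sketch of what that tuning and monitoring look like, assuming the thresholds are applied as plain sysctls and the "slow IO" column comes from zpool status -s (the exact mechanism used on this box may differ):

    # thresholds as described above (assumed applied via sysctl)
    sysctl vfs.zfs.zio.slow_io_ms=200
    sysctl vfs.zfs.slow_io_events_per_second=20
    # per-vdev slow IO counts - the "slow IO" column referred to above
    zpool status -s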

Ryan Moeller 
March 9, 2021 at 2:38 PM

OK, the text file was probably converted to UTF-16 by something later in the chain, I would guess. That seems to be a red herring.

The deadman failmode is relevant because, as you can see in my previous comment, of the devd events you captured in the most recent text file, 90% are deadman error reports (type=ereport.fs.zfs.deadman) and the other 10% are IO delay error reports (type=ereport.fs.zfs.delay). All of the events captured in that file are for delays exceeding the 30 second threshold (that is the default; I cannot confirm it for your system as you have not provided a full debug archive).

Here is the summary of all your text files now (number of events, followed by the type of event):

I assume POLLHUP is occurring because the buffer on the socket that passes the events is full, due to the sheer volume of these events.
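
For anyone wanting to see the flood directly, the same ereports can be watched from the kernel's ZFS event queue with zpool events (just one way to observe them; it bypasses the devd socket that appears to be overflowing):

    # stream events as they are posted, one line per event
    zpool events -f
    # rough count of deadman ereports currently held in the event log
    zpool events | grep -c ereport.fs.zfs.deadman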

Do you have any ideas why these vdevs might be experiencing delays? A full debug would help answer some questions of mine, if you would be so kind as to attach one please.

Stilez 
March 9, 2021 at 1:18 AM
(edited)

Neither of those seems unexpected.

Locale output is "en_US.UTF-8" for all items except LC_ALL, which is empty/unset. I didn't set that, or any locale info except the UK keyboard, so I'm guessing that's the system default.

Failmode=continue is indeed locally set on the pools, I believe by middleware, but it should have no effect except "in the event of a catastrophic pool failure" according to "man zpoolprops". It's the default for TrueNAS data pools according to the GitHub repo (see pool.py line 679). This post also seems to suggest the default is "wait" for the boot pool but "continue" for data pools, which would fit.

Whether manually or automatically set, there has been no such failure, and zero checksum errors etc. - the POLLHUP behaviour occurs during routine everyday activity, including during an ongoing 40-hour zfs send/recv, and there are no warnings of a pool or even single-device failure, catastrophic or otherwise. The property's only effect is that if a pool "catastrophically fails", it becomes read-only rather than blocking entirely. If that were the circumstance, I would surely also have noticed writes aborting and the send/recv failing, but they didn't, then or at any time since, despite continued use.
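
For completeness, both points are easy to confirm from the shell (pool name below is a placeholder):

    # per-pool failmode property - "continue" only matters on catastrophic failure
    zpool get failmode tank
    # shows only pools with errors or unavailable devices, i.e. none here
    zpool status -x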

Complete

Details


Created December 30, 2020 at 12:10 PM
Updated July 1, 2022 at 3:30 PM
Resolved May 10, 2021 at 6:21 PM