
Transfers fail with krb5i

Description

I have a TrueNAS 12 Core installation joined to Active Directory and serving NFS with krb5i (authentication plus integrity). The clients run Linux with kernel 5.8. Directory listing works, but transfers of all but the tiniest files fail with input/output errors.

Some observations:

1) Transfers start at around 70 MB/s, then the client hangs and network traffic stops, but the gssd service on the TrueNAS side keeps using a lot of CPU for a while. Remounting on the client side restores directory listing, but further file transfers hang too.

2) Switching to krb5 (auth only) fixes the transfers.

3) Switching to krb5p (auth+integrity+privacy) postpones the hang somewhat: network traffic continues for longer before it stops.

4) Hangs also happen with SMB when mounted with krb5i.

5) I cannot reproduce this with TrueNAS 12 running in a KVM virtual machine on an Intel i7 6850K CPU.

6) I can reproduce this with TrueNAS 12 running on a bare-metal Intel Atom C3558 or in a bhyve VM on the same CPU.

7) Tested with both aes128-cts-hmac-sha1-96 and aes256-cts-hmac-sha1-96

8) Same setup with FreeBSD 11.3 works fine on Intel Atom C3558.
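
For anyone who wants to repeat the comparison between the three security flavors from observations 2 and 3, here is a minimal sketch that only builds the Linux client mount commands. The server name, export path and mount point are hypothetical placeholders, and NFSv4 is assumed; nothing here is taken from the actual setup.

```python
import shlex

# The three Kerberos security flavors compared in the observations above.
SEC_FLAVORS = ("krb5", "krb5i", "krb5p")


def mount_command(server, export, mountpoint, sec):
    """Build a Linux NFSv4 mount command line for one security flavor."""
    if sec not in SEC_FLAVORS:
        raise ValueError(f"unknown security flavor: {sec}")
    argv = ["mount", "-t", "nfs4", "-o", f"sec={sec}",
            f"{server}:{export}", mountpoint]
    return shlex.join(argv)


if __name__ == "__main__":
    # Hypothetical server, export and mount point; substitute real values.
    for sec in SEC_FLAVORS:
        print(mount_command("truenas.example.com", "/mnt/tank/share",
                            "/mnt/test", sec))
```

The script prints the commands instead of running them, so each flavor can be mounted and tested by hand, one at a time.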

I suspect this is related to the CPU. The Intel Atom C3558 has not only AES-NI but also SHA extensions for hardware-accelerated computation of SHA hashes. FreeBSD 11 does not use the SHA extensions and falls back to a software implementation, but FreeBSD 12 does use them if the CPU supports them. This might explain why FreeNAS 11.3 works.
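
A quick way to check whether a given server falls into the suspected category is to look at the CPU feature lines FreeBSD prints at boot. The sketch below assumes those lines follow the usual `Name=0x...<A,B,C>` layout and that the boot messages are in /var/run/dmesg.boot. The helper names are made up for illustration.

```python
import os
import re

# Matches boot-message lines such as "  Features2=0x...<SSE3,...,AESNI,...>"
# and captures the comma-separated names between the angle brackets.
FEATURE_LINE = re.compile(
    r"^\s*[\w ]*Features\d*=0x[0-9a-fA-F]+<([^>]*)>", re.MULTILINE)


def cpu_features(dmesg_text):
    """Return the set of CPU feature names found in FreeBSD boot messages."""
    names = set()
    for match in FEATURE_LINE.finditer(dmesg_text):
        names.update(n.strip() for n in match.group(1).split(",") if n.strip())
    return names


def sha_offload_possible(dmesg_text):
    """True if the CPU reports both AES-NI and the SHA extensions."""
    features = cpu_features(dmesg_text)
    return "AESNI" in features and "SHA" in features


if __name__ == "__main__":
    path = "/var/run/dmesg.boot"  # assumed location of the boot messages
    if os.path.exists(path):
        with open(path, errors="replace") as f:
            print("AES-NI and SHA extensions present:",
                  sha_offload_possible(f.read()))
```

This only reports what the CPU advertises. It does not tell whether the kernel actually uses the SHA instructions for a particular session.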

iXsystems sells the TrueNAS Mini X, which also has an Intel Atom C3558 CPU. If you try to reproduce this issue, be sure to test on that CPU.


Activity

Alexander Motin 
April 28, 2021 at 2:12 PM

Thank you for pushing it through.  I saw the commit yesterday.  Merged it in for TrueNAS 12.0-U4: https://github.com/truenas/os/commit/1b64a7ced53b5bc6b413a6345c2234243b0642ef

Žilvinas Žaltiena 
April 28, 2021 at 6:30 AM

Heads-up on this issue: it was just fixed upstream in FreeBSD stable/12. The relevant commit: https://github.com/freebsd/freebsd-src/commit/62e32cf9140e6c13663dcd69ec3b3c7ca4579782

Žilvinas Žaltiena 
November 29, 2020 at 5:53 PM
(edited)

I decided to take another angle by testing a similar NFS server setup on the upstream OS (FreeBSD) with an Intel Atom C3558.

All FreeBSD releases from 12.0 to 12.2 failed large NFSv4 transfers with krb5i the same way TrueNAS 12 did when the aesni module was loaded. They all worked correctly when the aesni module was not loaded.

I also patched the aesni module on FreeBSD 12.2 so that detection of SHA support on the CPU always fails, and recompiled the kernel. Large transfers then succeeded even with the aesni module loaded.

This looks like an upstream issue, so I will also file a bug report there.

EDIT: upstream bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=251462

Žilvinas Žaltiena 
November 26, 2020 at 6:54 PM
(edited)

Added a debug dump taken after the time sync (clock skew fix).

Žilvinas Žaltiena 
November 26, 2020 at 6:52 PM

I made that error go away, but I am not sure why it was there to begin with, as the "date" command showed the same date and time as on the DC. I re-entered the same date/time and the error went away. In any case, nothing has changed regarding the failing transfers.

I want to emphasize that Kerberos has worked and still works (at least as far as tickets go), because I can browse NFS shares configured for Kerberos and can transfer smallish files. It only fails or hangs with larger ones (e.g. a 200 MB file always fails).
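
To make the "small files work, large files fail" behaviour easier to reproduce, here is a rough sketch that copies test files of increasing size onto the share and compares checksums. The mount point /mnt/test and the size steps are assumptions for illustration, not values from my setup.

```python
import hashlib
import os
import shutil
import tempfile

CHUNK = 1024 * 1024  # 1 MiB


def write_random_file(path, size_bytes):
    """Create a test file of the requested size filled with random data."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n))
            remaining -= n


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst and compare checksums; return (ok, detail).

    An OSError (for example an input/output error) is reported, not raised.
    """
    try:
        shutil.copyfile(src, dst)
        if sha256_of(src) != sha256_of(dst):
            return False, "checksum mismatch"
    except OSError as exc:
        return False, f"I/O error: {exc}"
    return True, "ok"


if __name__ == "__main__":
    share = "/mnt/test"  # hypothetical mount point of the krb5i NFS share
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.bin")
        for size_mib in (1, 10, 50, 200):  # arbitrary size steps
            write_random_file(src, size_mib * CHUNK)
            result = copy_and_verify(src, os.path.join(share, "probe.bin"))
            print(f"{size_mib} MiB:", result)
```

Note that this only catches transfers that end with an error. A transfer that hangs will simply block the script, so it is worth watching network traffic alongside it.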

Complete

Created November 22, 2020 at 6:36 PM
Updated July 1, 2022 at 5:00 PM
Resolved April 28, 2021 at 2:12 PM