Replication Between Pools Causes Corruption

Description

Hi iX!

I had a problem with TN SCALE replication causing corruption. After working with a few folks from the community, we determined the problem is that lz4-compressed data is not decompressed before presentation to the user.

I have attached datasets that can be used to reproduce the issue in the forum post.

I no longer have the original zpool that contained these datasets, as I only encountered this ZFS bug when trying to replicate the contents away in order to refactor the zpool. I have since accomplished this. But the datasets linked in comment #50 can repro the problem 100%.
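For anyone checking whether a replicated dataset matches its source, corruption of this kind can be confirmed by comparing file contents on both sides. The following is a minimal Python sketch, assuming both datasets are mounted locally. The function names `file_digest` and `find_mismatches` are hypothetical and are not part of TrueNAS or ZFS tooling.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root, dest_root):
    """Return relative paths of files that are missing or differ under dest_root."""
    source_root, dest_root = Path(source_root), Path(dest_root)
    mismatches = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = dest_root / rel
        # A file counts as a mismatch if it is absent or its content hash differs.
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            mismatches.append(rel.as_posix())
    return mismatches
```

Calling `find_mismatches("/mnt/source/dataset", "/mnt/dest/dataset")` (both paths are placeholders) returns the files whose contents differ after replication. It only reads the files as presented to the user, so it detects the symptom, not the cause.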

Hopefully you can work with the OpenZFS team to get to the bottom of this before anyone else runs into it without realizing it! I expect this is some decompression bug that could be patched to allow the data to be recovered, too.

Thanks again for an awesome product, and please do reach out if I can be of further assistance!

Problem/Justification

None

Impact

None

Activity

William Gryzbowski 
September 8, 2022 at 6:30 PM

Unfortunately, we don't have the resources to try this ourselves.

It seems the related panics that were reported to us were fixed in 22.02.3.

If anyone still wants to investigate this, please reopen the ticket; otherwise I am closing it until we see another occurrence.

Thanks.

Jacob McDonald 
August 31, 2022 at 8:26 PM

Sorry, I neglected to include the link in the original post, or maybe the link was somehow cleared. I no longer seem to have access to the account I used to post this bug… I can’t log in with any of the email addresses I would have used. Anyway, new account here, same person.

Please see the forums post, and comment #45 has links to the affected datasets.

I no longer have the affected zpool, as commented here earlier, so I’m unsure if 22.02.3 would have fixed anything. Hopefully you can replicate (pun intended?) the bug from the datasets I attached in the forum post.

William Gryzbowski 
August 31, 2022 at 7:30 PM

Which forum link? I don't see any.

Also wondering if this could be replicated with the latest release (22.02.3).

Jacob McDonald 
February 28, 2022 at 3:08 PM

Hi, I was running RC2 at the time, and am now running RELEASE.

I have added the debug attachment as requested, but please note, as I said in the original comment, that I no longer have the troublesome zpool or datasets. The forum comments I linked have download links to the original datasets, which can be used to reproduce the problem. This seems to be a ZFS problem, not a TrueNAS one.

Thanks!

Bonnie Follweiler 
February 28, 2022 at 2:29 PM

Thank you for the report.

What version of SCALE are you running on the systems you are using for replication?

Can you please attach a debug file to the "Private Attachments" section of this ticket? To generate a debug file on TrueNAS SCALE, log in to the TrueNAS web interface, go to System Settings > Advanced, then click SAVE DEBUG and wait for the file to download to your local system.

Need additional information

Details

Created February 27, 2022 at 12:11 AM
Updated September 8, 2022 at 6:30 PM
Resolved September 8, 2022 at 6:30 PM