Replication Between Pools Causes Corruption
Description
Activity
William Gryzbowski September 8, 2022 at 6:30 PM
Unfortunately, we don't have the resources to try this ourselves.
It seems we had reports of related panics that were fixed in 22.02.3.
If you still want to investigate this, please reopen; otherwise I am closing this until we see another occurrence.
Thanks.
Jacob McDonald August 31, 2022 at 8:26 PM
Sorry, I neglected to include the link in the original post, or maybe the link was somehow cleared. I no longer seem to have access to the account I used to post this bug… I can’t log in with any of the email addresses I would have used. Anyway, new account here, same person.
Please see the forums post, and comment #45 has links to the affected datasets.
I no longer have the affected zpool, as commented here earlier, so I’m unsure if 22.02.3 would have fixed anything. Hopefully you can replicate (pun intended?) the bug from the datasets I attached in the forum post.
William Gryzbowski August 31, 2022 at 7:30 PM
Which forum link? I don't see any.
I am also wondering whether this can be reproduced with the latest release (22.02.3).
Jacob McDonald February 28, 2022 at 3:08 PM
Hi, I was running RC2 at the time, and am now running RELEASE.
I have added the debug attachment as requested, but please note, as I said in the original comment, that I no longer have the troublesome zpool or datasets. However, the forum comments I linked have download links to the original dataset, which can be used to reproduce the problem. This seems to be a ZFS problem, not a TrueNAS one.
Thanks!
Bonnie Follweiler February 28, 2022 at 2:29 PM
Thank you for the report.
What version of SCALE are you running on the systems you are using for replication?
Can you please attach a debug file to the "Private Attachments" section of this ticket? To generate a debug file on TrueNAS SCALE, log in to the TrueNAS web interface, go to System Settings > Advanced, then click SAVE DEBUG and wait for the file to download to your local system.
Hi iX!
I had a problem with TN SCALE replication causing corruption. After working with a few folks from the community, we determined that the problem is that lz4-compressed data is not decompressed before it is presented to the user.
I have attached datasets that can be used for reproducing the issue in the post.
I no longer have the original zpool that contained these datasets, as I only encountered this ZFS bug when trying to replicate the contents away in order to refactor the zpool. I have since accomplished this. But the datasets linked in comment #50 can repro the problem 100%.
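For anyone attempting the repro, here is a minimal sketch of one way to spot corrupted files after a replication run. It is not part of the original report. It assumes the source and the replicated dataset are both mounted as ordinary directories and that the source still reads back correctly. The function names and the example paths are hypothetical.

```python
import hashlib
import sys
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(source: Path, replica: Path) -> list[str]:
    """List files under `source` that are missing or differ in `replica`."""
    problems = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = replica / rel
        if not dst.is_file():
            problems.append(f"missing: {rel}")
        elif file_digest(src) != file_digest(dst):
            problems.append(f"differs: {rel}")
    return problems


if __name__ == "__main__":
    # Hypothetical usage: python compare.py /mnt/tank/data /mnt/backup/data
    for line in compare_trees(Path(sys.argv[1]), Path(sys.argv[2])):
        print(line)
```

A file reported as "differs" only shows that the two copies read back differently. It does not by itself confirm the lz4 decompression cause described above.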
Hopefully you can work with the OpenZFS team to get to the bottom of this before anyone else runs into it without realizing it! I expect this is some decompression bug that could be patched to allow the data to be recovered, too.
Thanks again for an awesome product, and please do reach out if I can be of further assistance!