Bug 917224 - sys-fs/zfs: further CoW bugs, some installed files are corrupted (stripped, replaced with chunks of zeroes) in e.g. dev-lang/go-1.21.4
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal blocker with 1 vote
Assignee: Georgy Yakovlev
URL: https://www.theregister.com/2023/11/2...
Whiteboard:
Keywords: PMASKED
Depends on: 919746
Blocks: 635020 CVE-2023-49298
Reported: 2023-11-12 02:42 UTC by terinjokes@gmail.com
Modified: 2024-01-07 07:07 UTC



Attachments
failing emerge with normal settings (dev-lang:go-1.21.4:normal.log,5.17 KB, text/plain)
2023-11-12 21:22 UTC, terinjokes@gmail.com
failing emerge without native-extensions (dev-lang:go-1.21.4:without-native-extensions.log,5.17 KB, text/plain)
2023-11-12 21:23 UTC, terinjokes@gmail.com

Description terinjokes@gmail.com 2023-11-12 02:42:12 UTC
After emerging dev-lang/go I'm unable to compile any Go programs, as the internal compiler tools have been stripped to the point where they are no longer executable programs.

Reproducible: Always

Steps to Reproduce:
1. emerge -1 dev-lang/go
2. file /usr/lib/go/pkg/tool/linux_amd64/* | grep data
Actual Results:  
$  file /usr/lib/go/pkg/tool/linux_amd64/* | grep data
/usr/lib/go/pkg/tool/linux_amd64/asm:       data
/usr/lib/go/pkg/tool/linux_amd64/cgo:       data
/usr/lib/go/pkg/tool/linux_amd64/compile:   data
/usr/lib/go/pkg/tool/linux_amd64/covdata:   ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=xHCzRQtrkEP-Bbxql0SF/zxsofCJFlBoPlUclgwBG/TrsgK6SKiY4q6TIhyBjU/UwcISvZgqfQaEf3Kr_Tq, not stripped
/usr/lib/go/pkg/tool/linux_amd64/cover:     data
/usr/lib/go/pkg/tool/linux_amd64/link:      data
/usr/lib/go/pkg/tool/linux_amd64/vet:       data

Expected Results:  
All programs in the linux_amd64 directory should be identified as executable programs.

The files seem to have the correct size:

$ du -b /usr/lib/go/pkg/tool/linux_amd64/compile
18060169        /usr/lib/go/pkg/tool/linux_amd64/compile

But the file is mostly empty, excepting only a section that repeats once (a build ID?).

$ hexdump /usr/lib/go/pkg/tool/linux_amd64/compile
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0000fa0 0000 0000 0000 0000 0000 0000 5a41 3447
0000fb0 336a 3933 5a49 4f2d 6641 6342 7a6d 3646
0000fc0 582f 5930 5a4d 6761 5659 6f34 6d39 4130
0000fd0 4957 6555 2f67 686d 6a63 6675 5976 4e6a
0000fe0 346c 3070 5157 494e 5f41 5a2f 336d 6342
0000ff0 4e6d 4a4f 306c 4277 4a72 774d 4d41 006c
0001000 0000 0000 0000 0000 0000 0000 0000 0000
*
0ac9280 5a41 3447 336a 3933 5a49 4f2d 6641 6342
0ac9290 7a6d 3646 582f 5930 5a4d 6761 5659 6f34
0ac92a0 6d39 4130 4957 6555 2f67 686d 6a63 6675
0ac92b0 5976 4e6a 346c 3070 5157 494e 5f41 5a2f
0ac92c0 336d 6342 4e6d 4a4f 306c 4277 4a72 774d
0ac92d0 4d41 006c 0000 0000 0000 0000 0000 0000
0ac92e0 0000 0000 0000 0000 0000 0000 0000 0000
*
1139380 0000 0000 0000 0000 0000
1139389
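
The two symptoms shown above (no ELF header, contents almost entirely zero bytes) can be checked mechanically. The following is only a minimal sketch: the function names and the 0.99 threshold are hypothetical choices for illustration, and it assumes every regular file in the scanned directory is supposed to be an ELF executable, as is the case for the Go tool directory.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF file


def inspect_file(path, chunk_size=1 << 20):
    """Return (has_elf_magic, zero_fraction) for a single file."""
    total = 0
    zeros = 0
    with open(path, "rb") as f:
        magic = f.read(4)
        f.seek(0)
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
            zeros += chunk.count(0)  # count NUL bytes in this chunk
    fraction = zeros / total if total else 0.0
    return magic == ELF_MAGIC, fraction


def scan_tool_dir(directory, threshold=0.99):
    """List files that lack the ELF magic or are almost entirely zero bytes.

    The threshold is an arbitrary illustrative value, not a tested cutoff.
    """
    suspects = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        has_magic, fraction = inspect_file(path)
        if not has_magic or fraction >= threshold:
            suspects.append((name, has_magic, round(fraction, 4)))
    return suspects
```

Calling `scan_tool_dir("/usr/lib/go/pkg/tool/linux_amd64")` on an affected system would be expected to list the same tools that `file` reports as `data`.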

I've had the same result upgrading from 1.21.3, and from starting over again with dev-lang/go-bootstrap-1.18.6.

$ emerge --info
Portage 3.0.51 (python 3.11.5-final-0, default/linux/amd64/17.1/desktop/plasma/systemd/merged-usr, gcc-13, glibc-2.37-r7, 6.5.11-gentoo-dist x86_64)
=================================================================
System uname: Linux-6.5.11-gentoo-dist-x86_64-AMD_Ryzen_7_7840U_w-_Radeon_780M_Graphics-with-glibc2.37
KiB Mem:    32014740 total,  24107492 free
KiB Swap:          0 total,         0 free
Timestamp of repository gentoo: Sun, 12 Nov 2023 00:30:01 +0000
Head commit of repository gentoo: e604d48d9516a499627e7e774e0efea3648c27f6
sh bash 5.1_p16-r6
ld GNU ld (Gentoo 2.40 p5) 2.40.0
app-misc/pax-utils:        1.3.5::gentoo
app-shells/bash:           5.1_p16-r6::gentoo
dev-lang/perl:             5.38.0-r1::gentoo
dev-lang/python:           3.11.5::gentoo
dev-lang/rust:             1.71.1::gentoo
dev-util/cmake:            3.26.5-r2::gentoo
dev-util/meson:            1.2.1-r1::gentoo
sys-apps/baselayout:       2.14::gentoo
sys-apps/sandbox:          2.38::gentoo
sys-apps/systemd:          254.5-r1::gentoo
sys-devel/autoconf:        2.13-r7::gentoo, 2.71-r6::gentoo
sys-devel/automake:        1.16.5-r1::gentoo
sys-devel/binutils:        2.40-r5::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/clang:           16.0.6::gentoo
sys-devel/gcc:             13.2.1_p20230826::gentoo
sys-devel/gcc-config:      2.11::gentoo
sys-devel/libtool:         2.4.7-r1::gentoo
sys-devel/lld:             16.0.6::gentoo
sys-devel/llvm:            16.0.6::gentoo
sys-devel/make:            4.4.1-r1::gentoo
sys-kernel/linux-headers:  6.1::gentoo (virtual/os-headers)
sys-libs/glibc:            2.37-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://100.112.16.18/gentoo-portage
    priority: -1000
    volatile: False
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-max-age: 24
    sync-rsync-extra-opts:

local
    location: /var/db/repos/local
    masters: gentoo
    volatile: False

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE @BINARY-REDISTRIBUTABLE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-march=native -O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O2 -pipe"
GENTOO_MIRRORS="[SNIP]"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LEX="flex"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/zsh"
USE="X a52 aac acl acpi activities alsa amd64 bluetooth branding bzip2 cairo cdda cdr cli crypt cups dbus declarative dri dts dvd dvdr encode exif flac fortran gdbm gif gpm gtk gui iconv icu ipv6 jpeg kde kwallet lcms libnotify libtirpc mad mng mp3 mp4 mpeg multilib ncurses networkmanager nls nptl ogg opengl openmp pam pango pcre pdf pipewire plasma png policykit ppds pulseaudio qml qt5 readline screencast sdl seccomp semantic-desktop sound spell ssl startup-notification svg systemd test-rust tiff truetype udev udisks unicode upower usb vorbis vulkan wayland widgets wxwidgets x264 xattr xcb xft xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 aes avx avx2 avx512bw avx512cd avx512dq avx512f avx512vbmi avx512vl f16c fma3 pclmul popcnt rdrand sha sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-1" POSTGRES_TARGETS="postgres15" 
PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11" RUBY_TARGETS="ruby31" VIDEO_CARDS="amdgpu radeonsi radeon" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, MAKEOPTS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-12 02:59:58 UTC
Reporter mentioned on IRC they're using ZFS.

Few immediate questions:
* What version? zfs -V output please

* Did you enable the new block cloning pool feature in ZFS 2.2?

* What version of sys-apps/coreutils? (cp --version as well please)

* Please try to grab the build.log by running e.g. PORTAGE_LOGDIR="/var/log/portage" emerge -v1 dev-lang/go. It should put a log in /var/log/portage/build or so when it's done. (You have to do this for Portage to save a "successful" build log.)

Workaround:
* It's likely that setting USE="-native-extensions" on sys-apps/portage will work.

Notes:
* If this is what I think it is, this is _not_ a Portage or Go bug, but is instead another version of an insidious ZFS bug which pops up every so often.

* See also https://wiki.gentoo.org/wiki/User:Sam/Memorable_bugs_I_like_to_reference#Bugs_found_by_Portage.27s_native_file_copying.

* You can see https://github.com/openzfs/zfs/issues/11900#issuecomment-927568640 onwards for one previous incarnation where it manifested *extremely* similarly (with chunks of Go being replaced by zeroes).
Comment 2 terinjokes@gmail.com 2023-11-12 21:22:51 UTC
Created attachment 874611 [details]
failing emerge with normal settings
Comment 3 terinjokes@gmail.com 2023-11-12 21:23:11 UTC
Created attachment 874612 [details]
failing emerge without native-extensions
Comment 4 terinjokes@gmail.com 2023-11-12 21:29:17 UTC
Thanks for taking a look. Your theory seems like it might be correct.

* $ zfs --version
zfs-2.2.0-r0-gentoo
zfs-kmod-2.2.0-r0-gentoo

* Yes, the zpool has been upgraded.

* sys-apps/coreutils-9.3-r3

* $ cp --version
cp (GNU coreutils) 9.3
Packaged by Gentoo (9.3-r3 (p0))

* Rebuilding portage with USE="-native-extensions" still results in corrupted files. Portage log attached.

* Building with Portage's TMPDIR on tmpfs does not exhibit corruption across multiple test runs.
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-12 22:19:55 UTC
OK, that's consistent then (there are _two_ points of failure: 1) when the Go build system runs `cp` within PORTAGE_TMPDIR, and 2) when Portage itself merges from PORTAGE_TMPDIR to the live filesystem (affected by native-extensions)).
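
One way to tell the two failure points apart would be to compare file contents between stages: if the staged copy already differs from the build output, the `cp` step is implicated; if only the installed copy differs, the merge step is. The sketch below is an assumption about how such a comparison could be done, not something taken from this report; `mismatched_files` and both directory arguments are hypothetical names.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_files(src_root, dst_root):
    """Return relative paths of regular files under src_root whose
    counterpart under dst_root is missing or has different content."""
    bad = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.isfile(dst) or sha256_of(src) != sha256_of(dst):
                bad.append(rel)
    return sorted(bad)
```

Because the underlying problem is a race, a clean comparison on one run does not prove that a stage is safe; only a mismatch is conclusive.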

Given you can consistently hit this, and the machine I previously used to consistently hit the previous problem is not running ZFS right now, would you mind reporting it upstream?

I'm happy to help with grabbing needed info and such but it's important we get it addressed, especially with someone who can easily reproduce it involved.
Comment 6 terinjokes@gmail.com 2023-11-13 00:00:29 UTC
Sure. Is there a good way to narrow it down to one of those two possibilities before I do so?
Comment 7 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-15 04:04:06 UTC
(In reply to terinjokes@gmail.com from comment #6)
> Sure. Is there a good way to narrow it down to one of those two
> possibilities before I do so?

(We discussed it on IRC after, ftr.)

Already a bunch of people seeing the same behaviour on your bug as well at https://github.com/openzfs/zfs/issues/15526...
Comment 8 Francesco Riosa 2023-11-16 09:47:36 UTC
just a +1

> * What version? zfs -V output please
zfs-2.2.0-rc3
zfs-kmod-2.2.0-rc3
# uname -r
6.1.42-serv

> * Did you enable the new block cloning pool feature in ZFS 2.2?

# zpool get  feature@block_cloning
NAME  PROPERTY               VALUE                  SOURCE
B100  feature@block_cloning  active                 local <- we are here
B102  feature@block_cloning  enabled                local
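
In the output above, `enabled` means the feature is available on the pool but has not yet changed the on-disk format, while `active` means it is in use. A small sketch for picking out the pools where it is active follows; the function name is hypothetical, and it assumes the whitespace-separated column layout shown above.

```python
def pools_with_active_block_cloning(zpool_output):
    """Return pool names whose feature@block_cloning value is 'active'.

    Expects the text printed by `zpool get feature@block_cloning`.
    """
    active = []
    for line in zpool_output.splitlines():
        fields = line.split()
        if (
            len(fields) >= 3
            and fields[1] == "feature@block_cloning"
            and fields[2] == "active"
        ):
            active.append(fields[0])
    return active


if __name__ == "__main__":
    sample = """\
NAME  PROPERTY               VALUE                  SOURCE
B100  feature@block_cloning  active                 local
B102  feature@block_cloning  enabled                local
"""
    print(pools_with_active_block_cloning(sample))  # prints ['B100']
```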



> * What version of sys-apps/coreutils? (cp --version as well please)
sys-apps/coreutils-9.4::gentoo  USE="acl caps openssl xattr -gmp -hostname -kill -multicall (-nls) (-selinux) (-split-usr) -static -test -vanilla -verify-sig"
cp (GNU coreutils) 9.4
Packaged by Gentoo (9.4 (p0))

> * Please try to grab the build.log by running e.g. PORTAGE_LOGDIR="/var/log/portage" emerge -v1 dev-lang/go. It should put a log in /var/log/portage/build or so when it's done. (You have to do this for Portage to save a "successful" build log.)


> Workaround:
> * It's likely that setting USE="-native-extensions" on sys-apps/portage will work.

sys-apps/portage-3.0.55::gentoo  USE="(ipc) xattr -apidoc -build -doc -gentoo-dev -native-extensions -rsync-verify (-selinux) -test" PYTHON_TARGETS="python3_11 -pypy3 -python3_10 -python3_12"

# file /usr/lib/go/pkg/tool/linux_amd64/*  | grep data
/usr/lib/go/pkg/tool/linux_amd64/asm:       data
/usr/lib/go/pkg/tool/linux_amd64/cgo:       data
/usr/lib/go/pkg/tool/linux_amd64/compile:   data
/usr/lib/go/pkg/tool/linux_amd64/covdata:   ELF 64-bit LSB executable, x86-64, version [...] not stripped
/usr/lib/go/pkg/tool/linux_amd64/cover:     data
/usr/lib/go/pkg/tool/linux_amd64/link:      data
/usr/lib/go/pkg/tool/linux_amd64/vet:       data
Comment 9 Joshua Kinard gentoo-dev 2023-11-21 05:05:47 UTC
Pretty sure this is the third time that me using a tmpfs-backed PORTAGE_TMPDIR has saved me from a random silent-data-corruption bug in ZFS.

Is block_cloning a toggle somewhere?  I know there's the pool feature flag, but is there a way to turn it off so it's not used anymore?  I dug around in /proc and /sys, but I am not seeing a parameter file or anything that appears to control it.  My FreeBSD systems have a sysctl tunable that controls whether block_cloning is used or not, even if the feature flag is enabled on the pool.
Comment 10 thulle 2023-11-21 07:23:48 UTC
(In reply to Joshua Kinard from comment #9)
> Pretty sure this is the third time that me using a tmpfs-backed
> PORTAGE_TMPDIR has saved me from a random silent-data-corruption bug in ZFS.
> 
> Is block_cloning a toggle somewhere?  I know there's the pool feature flag,
> but is there a way to turn it off so it's not used anymore?  I dug around in
> /proc and /sys, but I am not seeing a parameter file or anything that
> appears to control it.  My FreeBSD systems have a sysctl tunable that
> controls whether block_cloning is used or not, even if the feature flag is
> enabled on the pool.

A bit too tired, so I can't find the issues right now. I'm pretty sure I read that a toggle was decided against on Linux for 2.2.0; it might be introduced in 2.2.1.
One workaround is downgrading to coreutils-8.32.
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-21 23:50:24 UTC
(In reply to Joshua Kinard from comment #9)
> Pretty sure this is the third time that me using a tmpfs-backed
> PORTAGE_TMPDIR has saved me from a random silent-data-corruption bug in ZFS.
> 

This won't fully help because c_f_r might be used when merging from PORTAGE_TMPDIR->live filesystem, but also, it could happen with anything else using c_f_r anyway.

> Is block_cloning a toggle somewhere?  I know there's the pool feature flag,
> but is there a way to turn it off so it's not used anymore?  I dug around in
> /proc and /sys, but I am not seeing a parameter file or anything that
> appears to control it.  My FreeBSD systems have a sysctl tunable that
> controls whether block_cloning is used or not, even if the feature flag is
> enabled on the pool.

There is a new toggle being added at https://github.com/openzfs/zfs/pull/15529 but note that this _isn't_ sufficient to prevent the corruption.

See https://github.com/openzfs/zfs/issues/15526#issuecomment-1815457739 but also note that both this and the previous bug are ultimately to do with freshness of state in memory vs on disk and race conditions. The code is broken anyway and users can and will hit it.
Comment 12 Larry the Git Cow gentoo-dev 2023-11-22 10:43:13 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a6c49ddd0067b6e4a272a9b9c1f9ade21da535d9

commit a6c49ddd0067b6e4a272a9b9c1f9ade21da535d9
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-11-22 10:42:26 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-11-22 10:43:00 +0000

    sys-fs/zfs: add 2.2.1
    
    Note that it may not fix the issues reported entirely as the race still exists.
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs/Manifest         |   2 +
 sys-fs/zfs/zfs-2.2.1.ebuild | 306 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 308 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e798aa5a89a092be0a82ed2302ada3d1b7951c21

commit e798aa5a89a092be0a82ed2302ada3d1b7951c21
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-11-22 10:41:57 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-11-22 10:43:00 +0000

    sys-fs/zfs-kmod: add 2.2.1
    
    Note that it may not fix the issues reported entirely as the race still exists.
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs-kmod/Manifest              |   2 +
 sys-fs/zfs-kmod/zfs-kmod-2.2.1.ebuild | 217 ++++++++++++++++++++++++++++++++++
 sys-fs/zfs-kmod/zfs-kmod-9999.ebuild  |   2 +-
 3 files changed, 220 insertions(+), 1 deletion(-)
Comment 14 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-22 19:18:20 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=d6a9c7f40ffb7f393a707b6d0face1c2f39d3901

commit d6a9c7f40ffb7f393a707b6d0face1c2f39d3901
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-11-22 19:12:13 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-11-22 19:15:00 +0000

    profiles: mask buggy zfs-2.2.0
    
    Further bugs with CoW via copy_file_range (bug #917224, https://github.com/openzfs/zfs/issues/15526).
    The issue is very similar to bug #815469.
    
    ZFS 2.2.1 has a workaround but if you haven't already upgraded your pool to
    use the new block cloning feature, consider using <zfs-2.2 for now.
    
    Bug: https://github.com/openzfs/zfs/issues/15526
    Bug: https://bugs.gentoo.org/815469
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 profiles/package.mask | 8 ++++++++
 1 file changed, 8 insertions(+)
Comment 15 Mike 2023-11-23 15:07:35 UTC
Is it possible to detect which files/folders/etc. are damaged by executing zdb?
Comment 16 Mike 2023-11-23 15:16:38 UTC
(In reply to Mike from comment #15)
> Is it possible to detect which files/folders/etc. are damaged by executing
> zdb?

Update: zdb pull request is still not merged:

https://github.com/openzfs/zfs/pull/15541

It may be very helpful.
Comment 17 Mike 2023-11-23 15:23:38 UTC
(In reply to Mike from comment #16)
> (In reply to Mike from comment #15)
> > Is it possible to detect which files/folders/etc. are damaged by executing
> > zdb?
> 
> Update: zdb pull request is still not merged:
> 
> https://github.com/openzfs/zfs/pull/15541
> 
> It may be very helpful.

Useful script:

https://github.com/0x5c/zfs-bclonecheck

I suggest publishing a Gentoo news item (eselect news list) once the zdb pull request is merged. All Gentoo users running zfs 2.2.* should receive it, then run zfs-bclonecheck and manually re-create (e.g., re-emerge) any corrupted files.
Comment 18 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-11-23 15:24:32 UTC
(In reply to Mike from comment #17)

Note that it's not comprehensive and there may be corrupted files not returned by it, as corruption can happen outside of cloning, it appears.
Comment 19 Larry the Git Cow gentoo-dev 2023-11-24 21:53:04 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=ea74809fc56791c2f45fc46815a7d5a8fd462961

commit ea74809fc56791c2f45fc46815a7d5a8fd462961
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-11-24 21:48:39 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-11-24 21:51:35 +0000

    sys-fs/zfs-kmod: disable zfs_dmu_offset_next_sync tunable by default
    
    As a mitigation until more is understood and fixes are tested & reviewed, change
    the default of zfs_dmu_offset_next_sync from 1 to 0, as it was before
    05b3eb6d232009db247882a39d518e7282630753 upstream.
    
    There are no reported cases of The Bug being hit with zfs_dmu_offset_next_sync=0:
    that does not mean this is a cure or a real fix, but it _appears_ to be at least
    effective in reducing the chances of it happening. By itself, it's a safe change
    anyway, so it feels worth us doing while we wait.
    
    Note that The Bug has been reproduced on 2.1.x as well, hence we do it for both
    2.1.13 and 2.2.1.
    
    Bug: https://github.com/openzfs/zfs/issues/11900
    Bug: https://github.com/openzfs/zfs/issues/15526
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 ...s_dmu_offset_next_sync-tunable-by-default.patch |  40 ++++
 ...s_dmu_offset_next_sync-tunable-by-default.patch |  43 ++++
 sys-fs/zfs-kmod/zfs-kmod-2.1.13-r1.ebuild          | 178 +++++++++++++++++
 sys-fs/zfs-kmod/zfs-kmod-2.2.1-r1.ebuild           | 218 +++++++++++++++++++++
 sys-fs/zfs-kmod/zfs-kmod-9999.ebuild               |   1 +
 5 files changed, 480 insertions(+)
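
To see which value of the tunable a running system is actually using, the loaded module's parameter can be read back. This is a minimal sketch that assumes the Linux module exposes its parameters under `/sys/module/zfs/parameters`; the helper name is hypothetical, and the directory is a parameter so the function can be pointed elsewhere.

```python
from pathlib import Path


def read_zfs_tunable(name="zfs_dmu_offset_next_sync",
                     params_dir="/sys/module/zfs/parameters"):
    """Return the integer value of a ZFS module parameter, or None if it
    cannot be read (module not loaded, parameter absent, no permission)."""
    path = Path(params_dir) / name
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_zfs_tunable()
    if value is None:
        print("zfs_dmu_offset_next_sync: not available")
    else:
        print(f"zfs_dmu_offset_next_sync = {value}")
```

A result of 0 corresponds to the mitigated default described in the commit message above. As that message says, this reduces the chance of hitting the bug and is not a fix.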
Comment 20 Larry the Git Cow gentoo-dev 2023-11-24 22:13:55 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=4301b22c2a2b3909bea574678b160ed4161c9009

commit 4301b22c2a2b3909bea574678b160ed4161c9009
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-11-24 22:13:18 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-11-24 22:13:18 +0000

    sys-fs/zfs-kmod: stabilize 2.1.13-r1 for amd64, arm64, ppc64
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs-kmod/zfs-kmod-2.1.13-r1.ebuild | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 21 Joshua Kinard gentoo-dev 2023-11-28 05:33:56 UTC
Adding a link to an article by The Register which gives a good overview of the bug, so I think it's quite suitable for the URL field.
Comment 22 Graham Perrin 2023-11-29 01:57:00 UTC
See also: 

zfs-2.2.2 patchset by tonyhutter · Pull Request #15602 · openzfs/zfs

(The URL, which I can not yet post, includes 15602.)
Comment 23 Adrien Dessemond 2023-11-29 18:41:42 UTC
Patchset in the oven for OpenZFS 2.2.2:

https://github.com/openzfs/zfs/pull/15602
Comment 24 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-12-01 02:58:36 UTC
2.1.14 and 2.2.2 are out now. They will be in tree shortly.
Comment 25 Larry the Git Cow gentoo-dev 2023-12-01 03:26:53 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=4caaee5dcb723d594ceae8fe4dc2f889ca13d0b0

commit 4caaee5dcb723d594ceae8fe4dc2f889ca13d0b0
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-12-01 03:25:33 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-12-01 03:25:33 +0000

    sys-fs/zfs-kmod: add 2.2.2
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs-kmod/Manifest              |   2 +
 sys-fs/zfs-kmod/zfs-kmod-2.2.2.ebuild | 217 ++++++++++++++++++++++++++++++++++
 2 files changed, 219 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f514cb6977d2532915365753e4be976b994acc4c

commit f514cb6977d2532915365753e4be976b994acc4c
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-12-01 03:24:49 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-12-01 03:24:49 +0000

    sys-fs/zfs: add 2.2.2
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs/Manifest         |   2 +
 sys-fs/zfs/zfs-2.2.2.ebuild | 306 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 308 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=44de969fbb5705ebb658700ee0d5cc2da361a107

commit 44de969fbb5705ebb658700ee0d5cc2da361a107
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-12-01 03:21:52 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-12-01 03:21:52 +0000

    sys-fs/zfs: add 2.1.14
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs/Manifest          |   2 +
 sys-fs/zfs/zfs-2.1.14.ebuild | 311 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 313 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e29451c2bdb20d489bff977e1892fdf4f0582c6b

commit e29451c2bdb20d489bff977e1892fdf4f0582c6b
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-12-01 03:21:34 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-12-01 03:21:44 +0000

    sys-fs/zfs-kmod: add 2.1.14
    
    Bug: https://bugs.gentoo.org/917224
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-fs/zfs-kmod/Manifest               |   2 +
 sys-fs/zfs-kmod/zfs-kmod-2.1.14.ebuild | 177 +++++++++++++++++++++++++++++++++
 2 files changed, 179 insertions(+)
Comment 26 terinjokes@gmail.com 2023-12-17 20:24:51 UTC
I've been unable to reproduce this after upgrading to zfs-2.2.2
Comment 27 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-01-07 07:07:51 UTC
I think we're all done here.