Summary: | [post-2.6.17 regression] NFSv3 - file corruption on non-simultaneous append | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | HarveyAPS <harvey> |
Component: | [OLD] Core system | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
Status: | VERIFIED UPSTREAM | ||
Severity: | major | ||
Priority: | High | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | linux-2.6.??-regression linux-bugzilla-pending | ||
Package list: | Runtime testing required: | --- | |
Attachments: |
Kernel Configuration used...
Example of flawed output |
Description
HarveyAPS
2007-09-04 20:51:50 UTC
Created attachment 130037 [details]
Kernel Configuration used...
Created attachment 130038 [details]
Example of flawed output
testlinux ~ # emerge --info
Portage 2.1.2.9 (default-linux/x86/2006.1, gcc-3.4.4, glibc-2.5-r4, 2.6.22-gentoo-r6-vmware i686)
=================================================================
System uname: 2.6.22-gentoo-r6-vmware i686 Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
Gentoo Base System release 1.12.9
Timestamp of tree: Tue, 04 Sep 2007 16:00:01 +0000
app-shells/bash: 3.2_p15-r1
dev-java/java-config: 1.2.11-r1
dev-lang/python: 2.4.4-r4
dev-python/pycrypto: 2.0.1-r5
sys-apps/baselayout: 1.12.9-r2
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.61
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1, 1.10
sys-devel/binutils: 2.17
sys-devel/gcc-config: 1.3.12-r6
sys-devel/libtool: 1.5.23b
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -mtune=i686 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib/X11/xkb"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c"
CXXFLAGS="-O2 -mtune=i686 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="berkdb bitmap-fonts cli cracklib crypt cups dri fortran gdbm gpm iconv ipv6 isdnlog midi mudflap ncurses nls nptl nptlonly openmp pam pcre perl ppds pppd python readline reflection session spl ssl tcpd truetype-fonts type1-fonts unicode x86 xorg zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol"
ELIBC="glibc"
INPUT_DEVICES="keyboard mouse evdev"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
USERLAND="GNU"
VIDEO_CARDS="apm ark chips cirrus cyrix dummy fbdev glint i128 i740 i810 imstt mach64 mga neomagic nsc nv r128 radeon rendition s3 s3virge savage siliconmotion sis sisusb tdfx tga trident tseng v4l vesa vga via vmware voodoo"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY

Can you please test this with the latest development kernel (2.6.23-rc5 as of this writing)? Also, can you include the dmesg output for both your test machines, including the logging during your test? Thanks.

I've tested 2.6.23-rc5 and 2.6.23-rc6. The problem was fixed in 2.6.23-rc6. I suggest backporting this; it is a serious bug that can lead to file corruption when multiple Gentoo boxes access the same files over NFS.

commit 1b3b4a1a2deb7d3e5d66063bd76304d840c966b3
Author: Trond Myklebust <Trond.Myklebust@netapp.com>
Date:   Tue Aug 28 10:29:36 2007 -0400

    NFS: Fix a write request leak in nfs_invalidate_page()

    Ryusuke Konishi says:

    The recent truncate_complete_page() clears the dirty flag from a page
    before calling a_ops->invalidatepage(),
    ^^^^^^
    static void
    truncate_complete_page(struct address_space *mapping, struct page *page)
    {
            ...
            cancel_dirty_page(page, PAGE_CACHE_SIZE);  <--- Inserted here at kernel 2.6.20
            if (PagePrivate(page))
                    do_invalidatepage(page, 0);   ---> will call a_ops->invalidatepage()
            ...
    }

    and this is disturbing nfs_wb_page_priority() from calling
    nfs_writepage_locked() that is expected to handle the pending
    request (=nfs_page) associated with the page.
    int nfs_wb_page_priority(struct inode *inode, struct page *page, int how)
    {
            ...
            if (clear_page_dirty_for_io(page)) {
                    ret = nfs_writepage_locked(page, &wbc);
                    if (ret < 0)
                            goto out;
            }
            ...
    }

    Since truncate_complete_page() will get rid of the page after
    a_ops->invalidatepage() returns, the request (=nfs_page) associated
    with the page becomes a garbage in nfs_inode->nfs_page_tree.
    ------------------------

    Fix this by ensuring that nfs_wb_page_priority() recognises that it may
    also need to clear out non-dirty pages that have an nfs_page associated
    with them.

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

Never mind, I was wrong, this is still broken in 2.6.23-rc6...

Thanks for testing. Please file an upstream bug report at http://bugzilla.kernel.org and post the new bug URL here.

Closing after no response. Harvey, if this is still an issue then please reopen this bug after you have tested the latest development kernel (currently v2.6.28-rc7) and filed a bug report upstream.
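
For anyone retesting on a newer kernel, here is a minimal sketch of an append-and-verify check. It is not the reporter's test case (the attachments are not reproduced in this report); the record format and the names `make_record`, `append_records`, and `verify` are assumptions made up for illustration. The idea is to append checksummed records to a file on the NFS mount from one client, then from the other, and afterwards scan the file for NUL-filled or damaged records.

```python
import os
import sys
import zlib

PAYLOAD_BYTES = 48  # arbitrary filler size per record (an assumption)


def make_record(writer_id, seq):
    """Build one newline-terminated record: '<id> <seq> <filler> <crc32>'."""
    body = f"{writer_id} {seq:08d} ".encode("ascii") + b"x" * PAYLOAD_BYTES
    return body + f" {zlib.crc32(body):08x}\n".encode("ascii")


def append_records(path, writer_id, count):
    """Append `count` records in O_APPEND mode and force them to the server."""
    with open(path, "ab") as f:
        for seq in range(count):
            f.write(make_record(writer_id, seq))
        f.flush()
        os.fsync(f.fileno())


def verify(path):
    """Return a list of (line_number, reason) for every damaged record."""
    problems = []
    with open(path, "rb") as f:
        lines = f.read().split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # file ended with a newline, as expected
    for lineno, line in enumerate(lines, start=1):
        if b"\x00" in line:
            problems.append((lineno, "contains NUL bytes"))
            continue
        body, sep, crc_hex = line.rpartition(b" ")
        try:
            stored = int(crc_hex.decode("ascii", "replace"), 16)
            ok = sep == b" " and stored == zlib.crc32(body)
        except ValueError:
            ok = False
        if not ok:
            problems.append((lineno, "checksum mismatch or malformed record"))
    return problems


# Hypothetical command line: script.py <file-on-nfs-mount> <writer-id> <count>
if __name__ == "__main__" and len(sys.argv) == 4:
    append_records(sys.argv[1], sys.argv[2], int(sys.argv[3]))
    for lineno, reason in verify(sys.argv[1]):
        print(f"record {lineno}: {reason}")
```

Run it on the first client, wait for it to finish, then run it on the second client against the same file with a different writer id. Any line it prints points at a damaged record. Whether this pattern actually triggers the bug described here is untested.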