When using NFS with DRBD, the network RAID daemon, the NFS data structures on disk in /var/lib/nfs/ must be mirrored. Generally, this is done by symlinking /var/lib/nfs to a separate mirrored filesystem. When executing a manual failover, this filesystem must be unmounted, but nfsd holds a lock on files there that is invisible (at least to lsof). As a result, the kernel refuses the unmount with "device is busy" messages. To avoid that, the nfs script must release nfsd resources with
    rpc.nfsd 0
    exportfs -f
See also http://marc.info/?l=linux-nfs&m=121340518108820&w=2
(I'm not certain whether rpc.nfsd 16 is required when the script restarts.)
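The release-then-unmount sequence described in this report can be sketched as a small shell function. This is only an illustration of the order of operations, not the init script itself: the function name `release_nfs_state`, the `run` wrapper, the `DRY_RUN` variable and the mount point in the usage note are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: release nfsd's hold on /var/lib/nfs, then unmount the
# mirrored filesystem. Names below are hypothetical, not from nfs-utils.

# Print the command instead of running it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: mount point of the filesystem that holds /var/lib/nfs (assumed).
release_nfs_state() {
    run rpc.nfsd 0 &&      # stop all nfsd kernel threads
    run exportfs -f &&     # flush the kernel export table
    run umount "$1"        # only now try to unmount
}
```

For example, `DRY_RUN=1 release_nfs_state /mnt/nfsstate` prints the three commands without touching the system; whether `exportfs -f` is actually needed is exactly what is in question here.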
The "rpc.nfsd 0" stop should already be handled by the ssd, and the exportfs should be handled by the -ua call. Do some testing on your side to see which combinations result in which behavior.
I'm not sure what you mean by the "ssd". I don't know that exportfs -f is truly necessary. When I ran rpc.nfsd 0, it worked and allowed me to unmount the filesystem holding /var/lib/nfs. exportfs -ua seems to be what the script is supposed to do and if that is indeed the case, it's not enough. rpc.nfsd 0 is not getting called.
exportfs and stopping nfsd are two separate issues. There is already a call right before the exportfs that kills the nfsd daemons. You need to try the different combinations in the init.d script and report the results. No one else is reporting these issues; you are.
I'm not sure what you mean by "try different combinations". Of what? From what I can tell, executing /etc/init.d/nfs stop doesn't do much of anything. After the call, the output of ps ax | grep nfs is
11843 ?        S<     0:00 [nfsd4]
11845 ?        S      0:00 [nfsd]
11846 ?        S      0:00 [nfsd]
11847 ?        S      0:00 [nfsd]
11848 ?        S      0:00 [nfsd]
11849 ?        S      0:00 [nfsd]
11850 ?        S      0:00 [nfsd]
11851 ?        S      0:00 [nfsd]
11852 ?        S      0:00 [nfsd]
25476 pts/0    S+     0:00 grep --colour=auto nfs
And of rpcinfo -p
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100021    1   udp  35762  nlockmgr
    100021    3   udp  35762  nlockmgr
    100021    4   udp  35762  nlockmgr
    100003    2   udp   2049  nfs
    100003    3   udp   2049  nfs
    100003    4   udp   2049  nfs
    100021    1   tcp  38659  nlockmgr
    100021    3   tcp  38659  nlockmgr
    100021    4   tcp  38659  nlockmgr
    100003    2   tcp   2049  nfs
    100003    3   tcp   2049  nfs
    100003    4   tcp   2049  nfs
    100024    1   udp  42399  status
    100024    1   tcp  52421  status
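When comparing different stop variants, it may be easier to count the remaining nfsd kernel threads than to read the listing each time. A minimal sketch, assuming `ps ax`-style output on standard input; `count_nfsd_threads` is a hypothetical helper name, not part of nfs-utils.

```shell
#!/bin/sh
# Sketch only: count lines of `ps ax` output whose command is exactly
# "[nfsd]". The "[nfsd4]" thread and the grep process are not counted.
count_nfsd_threads() {
    awk '$NF == "[nfsd]" { n++ } END { print n + 0 }'
}
```

It could be used as `ps ax | count_nfsd_threads` after `/etc/init.d/nfs stop`; a non-zero result would mean nfsd threads are still alive.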
Then the question is why ... it works just fine for me (and apparently for many other people). You didn't post `emerge --info` as the bug reporting page says you're supposed to. Please rectify this.
Portage 2.1.5_rc6 (default-linux/amd64/2007.0/server, gcc-4.2.3, glibc-2.7-r2, 2.6.25-gentoo-r1 x86_64)
=================================================================
System uname: 2.6.25-gentoo-r1 x86_64 AMD Opteron(tm) Processor 240 EE
Timestamp of tree: Fri, 25 Apr 2008 15:33:01 +0000
app-shells/bash: 3.2_p33
dev-java/java-config: 1.3.7, 2.1.5
dev-lang/python: 2.4.4-r9
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 1.12.12
sys-apps/sandbox: 1.2.18.1-r2
sys-devel/autoconf: 2.13, 2.61-r1
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1
sys-devel/binutils: 2.18-r1
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool: 1.5.26
virtual/os-headers: 2.6.25-r1
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=opteron -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-march=opteron -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks parallel-fetch sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="en_US.utf8"
LC_ALL="en_US.utf8"
LDFLAGS=""
LINGUAS="en"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/config/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="acl amd64 apache2 cli cracklib crypt ctype dri gdbm iconv innodb jpeg jpeg2k lm_sensors logrotate mailwrapper mmx mudflap mysql ncurses nls nolvm1 nptl nptlonly openmp pam pcre perl pppd readline reflection sensord session snmp spl sse sse2 ssl tcpd truetype unicode vhosts xml zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en" USERLAND="GNU" VIDEO_CARDS="apm ark chips cirrus cyrix dummy fbdev glint i128 i810 mach64 mga neomagic nv r128 radeon rendition s3 s3virge savage siliconmotion sis sisusb tdfx tga trident tseng v4l vesa vga via vmware voodoo"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
This is probably related to http://bugs.gentoo.org/show_bug.cgi?id=243088. Replacing the ssd --stop --name nfs line in /etc/init.d/nfs with the mentioned "rpc.nfsd 0" works for me. And according to the other bug, --name is kind of a hack anyway, so avoiding it could be a good idea.
Calling both is pretty easy: http://sources.gentoo.org/net-fs/nfs-utils/files/nfs.initd?r1=1.16&r2=1.17
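A stop sequence that calls both the start-stop-daemon stop and "rpc.nfsd 0" might look roughly like the sketch below. It is not the content of the linked revision: the function name `stop_nfsd`, the `run` wrapper, the `DRY_RUN` variable and the exact start-stop-daemon arguments are assumptions for illustration, and the real script may order or spell them differently.

```shell
#!/bin/sh
# Sketch only: stop nfsd both ways, then unexport everything.
# Names and arguments are hypothetical, not copied from nfs.initd.

# Print the command instead of running it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stop_nfsd() {
    rc=0
    # Signal any process named nfsd (assumed arguments).
    run start-stop-daemon --stop --quiet --oknodo --name nfsd || rc=1
    # Tell the kernel to run zero nfsd threads.
    run rpc.nfsd 0 || rc=1
    # Unexport all directories.
    run exportfs -ua || rc=1
    return "$rc"
}
```

Running `DRY_RUN=1 stop_nfsd` only prints the three commands, which makes it easy to try different combinations before editing /etc/init.d/nfs.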