Bug 318847 - net-fs/nfs-utils-1.2.2-r1: distfiles on NFS: "Cannot chown a lockfile" with nfsv4
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Network Filesystems
URL:
Whiteboard:
Keywords:
Duplicates: 326079
Depends on:
Blocks:
 
Reported: 2010-05-07 10:35 UTC by Nikolaj Šujskij
Modified: 2012-06-20 17:24 UTC
CC List: 9 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Full emerge fail log (nfs_distfiles.log, 2.76 KB, text/plain)
2010-05-07 10:40 UTC, Nikolaj Šujskij

Description Nikolaj Šujskij 2010-05-07 10:35:11 UTC
I've got an ~amd64-workstation with NFS-shared distfiles:

net-fs/nfs-utils-1.2.2-r1  USE="caps ipv6 nfsv3 nfsv4 tcpd -kerberos"

/etc/exports:
/usr/portage/distfiles	192.168.50.0/24(rw,subtree_check,all_squash,anonuid=250,anongid=250)

 % zgrep NFS /proc/config.gz    
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
# CONFIG_NFS_V3_ACL is not set
CONFIG_NFS_V4=y
# CONFIG_NFS_V4_1 is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V3_ACL is not set
CONFIG_NFSD_V4=y
CONFIG_NFS_COMMON=y

I mount the distfiles directory from an ~amd64 laptop with pretty much the same NFS setup. After upgrading to net-fs/nfs-utils-1.2.2-r1 (I can't really be sure that it's the reason, but the timing seems close), I can still mount the exported directory, but emerge fails:

Cannot chown a lockfile: '/usr/portage/distfiles/.nfs-utils-1.2.2.tar.bz2.portage_lockfile'
 * nfs-utils-1.2.2.tar.bz2 RMD160 SHA1 SHA256 size ;-) ...               [ ok ]
Traceback (most recent call last):
  File "/usr/lib64/portage/bin/ebuild", line 268, in <module>
    debug=debug, tree=mytree)
  File "/usr/lib64/portage/pym/portage/proxy/objectproxy.py", line 32, in __call__
    return result(*args, **kwargs)
  File "/usr/lib64/portage/pym/portage/package/ebuild/doebuild.py", line 838, in doebuild
    fetchonly=fetchonly):
  File "/usr/lib64/portage/pym/portage/proxy/objectproxy.py", line 32, in __call__
    return result(*args, **kwargs)
  File "/usr/lib64/portage/pym/portage/package/ebuild/fetch.py", line 612, in fetch
    stat_cached=mystat)
  File "/usr/lib64/portage/pym/portage/util/__init__.py", line 910, in apply_secpass_permissions
    stat_cached=stat_cached, follow_links=follow_links)
  File "/usr/lib64/portage/pym/portage/util/__init__.py", line 745, in apply_permissions
    os.chown(filename, uid, gid)
  File "/usr/lib64/portage/pym/portage/__init__.py", line 228, in __call__
    rval = self._func(*wrapped_args, **wrapped_kwargs)
OSError: [Errno 22] Invalid argument: '/usr/portage/distfiles/nfs-utils-1.2.2.tar.bz2'
 * Fetch failed for 'net-fs/nfs-utils-1.2.2-r1', Log file:
 *  '/var/tmp/portage/net-fs/nfs-utils-1.2.2-r1/temp/build.log'
Cannot chown a lockfile: '/usr/portage/distfiles/.nfs-utils-1.2.2.tar.bz2.portage_lockfile'
 * nfs-utils-1.2.2.tar.bz2 RMD160 SHA1 SHA256 size ;-) ...               [ ok ]
Traceback (most recent call last):
  File "/usr/lib64/portage/bin/ebuild", line 268, in <module>
    debug=debug, tree=mytree)
  File "/usr/lib64/portage/pym/portage/proxy/objectproxy.py", line 32, in __call__
    return result(*args, **kwargs)
  File "/usr/lib64/portage/pym/portage/package/ebuild/doebuild.py", line 838, in doebuild
    fetchonly=fetchonly):
  File "/usr/lib64/portage/pym/portage/proxy/objectproxy.py", line 32, in __call__
    return result(*args, **kwargs)
  File "/usr/lib64/portage/pym/portage/package/ebuild/fetch.py", line 612, in fetch
    stat_cached=mystat)
  File "/usr/lib64/portage/pym/portage/util/__init__.py", line 910, in apply_secpass_permissions
    stat_cached=stat_cached, follow_links=follow_links)
  File "/usr/lib64/portage/pym/portage/util/__init__.py", line 745, in apply_permissions
    os.chown(filename, uid, gid)

I thought it was somehow related to the new 'caps' USE flag, and re-emerged nfs-utils with it enabled and disabled (both on server and client), with no result. I downgraded to 1.2.2 on the client with no luck either. But downgrading to the stable nfs-utils (1.1.4-r1) worked.

I attach the full emerge log and would be glad to provide any additional information needed.

Reproducible: Always

Comment 1 Nikolaj Šujskij 2010-05-07 10:40:08 UTC
Created attachment 230661
Full emerge fail log

Full log of emerge failing on fetching files from mounted NFS directory
Comment 2 Michael Weber (RETIRED) gentoo-dev 2010-05-07 10:53:43 UTC
Hi, 
I'm not familiar with this version in detail, but can you try to reduce the problem to a
`cd /usr/portage/distfiles ; touch test.lock ; chown portage test.lock` for debugging?

Have you set no_root_squash on the server for the specific client? This was/is(?) needed in old nfs3 to handle remote roots as root with chown permission and not just as regular accounts.

Do you use nfsv3 or nfsv4 for the connection? nfsv4: have you configured /etc/idmapd.conf, /etc/conf.d/nfs and /etc/exports (fsid=0) properly?

What does the owner/group info look like on the clients? Correct, or obscure 4-billion numbers, or nogroup?
Do you have ldap to keep ids in sync?

(One of my sites has this version of nfs-utils up and running for /usr/portage/distfiles and packages.)
Comment 3 Nikolaj Šujskij 2010-05-07 12:41:07 UTC
Thanks for the answer, Michael.

> can you try to reduce the problem to a
> `cd /usr/portage/distfiles ; touch test.lock ; chown portage test.lock` for
> debugging?

chown: changing ownership of `test.lock': Invalid argument

> Have you set no_root_squash on the server for the specific client? This was/is(?)
> needed in old nfs3 to handle remote roots as root with chown permission and not
> just as regular accounts.

I set "all_squash,anonuid=250,anongid=250" to force permissions for all users to portage's. I also tried setting no_root_squash, with no luck.

> Do you use nfsv3 or nfsv4 for the connection?

 mount reports "vers=4"

> nfsv4: have you configured
> /etc/idmapd.conf, /etc/conf.d/nfs and /etc/exports (fsid=0) properly?

I did not touch those files. Should I?
The described setup worked without that for at least a few months. But if it must be done, I will.

> What does the owner/group info look like on the clients? Correct, or obscure
> 4-billion numbers, or nogroup?

 % ls -l test.lock 
-rw-r--r-- 1 4294967294 4294967294 0 May  7 16:25 test.lock

> Do you have ldap to keep ids in sync?

 No I haven't, but ids are the same (0 for root, 250 for portage and 1000 for my main user).

> (One of my sites has this version of nfs-utils up and running for
> /usr/portage/distfiles and packages.)

That depends on the client too, since the stable package works all right.
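
The owner and group shown by `ls -l` above, 4294967294, equal 2^32 - 2, which is what -2 looks like when read as an unsigned 32-bit number. That is commonly how an unmapped ("nobody") NFS id shows up on a client. A minimal Python sketch of the arithmetic follows; the function name is made up for illustration and is not part of any NFS tool.

```python
UNMAPPED_ID = 4294967294  # value copied from the `ls -l` output above


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - 2**32 if value >= 2**31 else value


print(as_signed32(UNMAPPED_ID))   # -2
print(UNMAPPED_ID == 2**32 - 2)   # True
```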
Comment 4 Michael Weber (RETIRED) gentoo-dev 2010-05-07 15:43:50 UTC
Ok, we had to set up idmapd (my colleague actually did this)

server+client ~ # grep -v ^# /etc/idmapd.conf | grep -v ^$
[General]
Domain = localdomain
[Mapping]
[Translation]
 
[Static]
[UMICH_SCHEMA]
LDAP_server = ldap-server.local.domain.edu
LDAP_base = dc=local,dc=domain,dc=edu

server+client ~ # grep -v ^# /etc/conf.d/nfs | grep -v ^$
NFS_NEEDED_SERVICES="rpc.idmapd"
OPTS_RPC_NFSD="8"
OPTS_RPC_MOUNTD=""
OPTS_RPC_STATD=""
OPTS_RPC_IDMAPD=""
OPTS_RPC_GSSD=""
OPTS_RPC_SVCGSSD=""
OPTS_RPC_RQUOTAD=""
EXPORTFS_TIMEOUT=30

server ~ # cat /etc/exports 
/self	<clients ipv4>(rw,no_root_squash,no_subtree_check,fsid=0)

clients ~ # cat /etc/fstab
server:/self         /server      nfs             noauto,user     0 0

-----

We had these 4 billion uids because of a missing fsid=0 in exports, AFAIK.

Test if idmapd is running: 
server+client ~ # ps ax | grep idmapd
 6347 ?        Ss     0:00 /usr/sbin/rpc.idmapd

But we have a sort of an n-to-n setup with 4 volumes on 4 machines accessible on every machine. /usr/portage/packages is linked to a volume on one machine, so every one of them can run emerge -bk.
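
The `ps ax | grep idmapd` check above can also be scripted. This is only a sketch, assuming a Linux-style /proc where each process directory contains a `comm` file with the process name; `pids_running` and its `proc_root` parameter are hypothetical names chosen here.

```python
import os


def pids_running(name, proc_root="/proc"):
    """Return the sorted PIDs whose process name (comm) equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were looking
    return sorted(pids)
```

Calling `pids_running("rpc.idmapd")` on both server and client returns an empty list on any machine where the daemon is not running.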
Comment 5 Matti Bickel (RETIRED) gentoo-dev 2010-05-18 09:53:11 UTC
forwarding to maintainer
Comment 6 Nikolaj Šujskij 2010-05-28 08:31:35 UTC
There was some trouble with Bugzilla and my mailbox, so I got notification very late, sorry, Michael.

I cannot see why I should need such a complex setup, since all uids are the same on my boxes and my overall setup is much less complex than yours. I think setting fsid explicitly may be of some help, but I'm reading `man exports` and for now I don't see whether it has anything to do with the issue in question.
Comment 7 Matt Turner gentoo-dev 2010-07-06 23:39:44 UTC
I'm seeing the same problem with nfs-utils-1.2.2-r1.

I can work around it by mounting as nfsv3 using

mount -t nfs -o vers=3 ...
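
To confirm which protocol version a mount actually uses (the reporter's `mount` output showed "vers=4"), the options column of /proc/mounts can be parsed. A rough sketch, assuming the usual whitespace-separated /proc/mounts layout; the function name and the sample line in the usage note are invented for illustration.

```python
def nfs_mount_versions(mounts_text):
    """Map mount point -> value of the 'vers=' option for each NFS entry."""
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = dict(opt.partition("=")[::2] for opt in fields[3].split(","))
        versions[fields[1]] = options.get("vers")
    return versions
```

For example, `nfs_mount_versions(open("/proc/mounts").read())` would give `{"/server": "4"}` for a line such as `server:/self /server nfs4 rw,vers=4 0 0`.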
Comment 8 Sven Müller 2010-08-12 18:14:06 UTC
Same here
Comment 9 Nikolaj Šujskij 2010-09-15 08:58:11 UTC
> I can work around it by mounting as nfsv3 using
> mount -t nfs -o vers=3 ...

 Confirmed. Why haven't I tried it until now? Thanks for workaround (-:E


There's something new in Portage error trace now:

Group IDs of current user: 0 1 2 3 4 6 10 11 20 26 27
Comment 10 SpanKY gentoo-dev 2010-10-08 19:50:51 UTC
nfs-utils-1.2.3 now in the tree if you want to retest.  probably also make sure your kernel is up-to-date.
Comment 11 Nikolaj Šujskij 2010-10-11 09:21:12 UTC
> nfs-utils-1.2.3 now in the tree if you want to retest.

 The same result, but thanks for concern.

>  probably also make sure your kernel is up-to-date.

 2.6.35-gentoo-r9 on server, 2.6.35-tuxonice-r4 on client
Comment 12 Rick Warner 2010-10-19 18:10:52 UTC
I had the problem with chown not working.  I was not seeing 4 billion uids though.

All I had to do to fix it was start /etc/init.d/rpc.idmapd on the server side.  It wasn't running despite /etc/init.d/nfs having been started.
Comment 13 Kacper Kowalik (Xarthisius) (RETIRED) gentoo-dev 2010-10-20 09:03:45 UTC
Correct me if I am wrong, but the setup presented in comment #0 is not suitable for NFS4, though perfectly sane for NFS3. Hence the hack with "-o vers=3" works just fine...

Could you try with having more "default" settings?
1) create virtual nfs4 root, then export what you want, e.g.:
/etc/exports:
/exports           192.168.16.0/24(ro,fsid=0,nohide,insecure,no_root_squash,no_subtree_check,sync)
/exports/portage   192.168.16.0/24(rw,no_subtree_check,insecure,no_root_squash,nohide,sync)
/exports/distfiles 192.168.16.0/24(rw,no_subtree_check,insecure,no_root_squash,nohide,sync)
2) Set proper domain in /etc/idmapd.conf and make sure you have idmap running

Having fulfilled 1-2), does the problem still persist?
Comment 14 Nikolaj Šujskij 2010-10-21 11:34:12 UTC
(In reply to comment #13)
> Correct me if I am wrong, but setup presented in comment #0 is not suitable for NFS4...

 Could be, I suppose. I never really bothered about the setup, to be honest. I thought that, given the lack of documentation about NFS4 in Gentoo, it should be quite straightforward. Well, it was before I hit that bug.
 Kacper, could you please point me to a good NFS4-HowTo? It seems to me, I don't know enough to understand your suggestion fully (for example, I can't remember anything about "virtual root") \-:E

Thanks for help, much appreciated.
Comment 15 Guy 2010-10-24 23:48:51 UTC
I've encountered the same problem. The following information may be but probably won't be helpful:

Error message------------------

# emerge eselect-mesa
[Errno 22] Invalid argument: '/usr/local/portage/packages/.Packages.portage_lockfile': chown('/usr/local/portage/packages/.Packages.portage_lockfile', -1, 250)
Cannot chown a lockfile: '/usr/local/portage/packages/.Packages.portage_lockfile'
Group IDs of current user: 0 1 2 3 4 6 10 11 20 26 27 447 449
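
In the `chown(..., -1, 250)` call reported above, the -1 means "leave the owner unchanged", so only the group is being set; Errno 22 is EINVAL on Linux. A small Python sketch of the same kind of call follows. The helper name is hypothetical and this is not Portage's own code.

```python
import errno
import os


def try_chgrp(path, gid):
    """Change only the group of `path`.

    Returns None on success, or the symbolic errno name on failure.
    """
    try:
        os.chown(path, -1, gid)  # -1 leaves the owner (uid) untouched
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

On an affected NFSv4 mount, `try_chgrp(lockfile, 250)` would be expected to return "EINVAL", matching the message in the log above.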

Server Information--------------

# emerge --info
Portage 2.2.0_alpha1 (default/linux/amd64/10.0, gcc-4.4.5, glibc-2.12.1-r1, 2.6.35.7 x86_64)
=================================================================
System uname: Linux-2.6.35.7-x86_64-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_5200+-with-gentoo-2.0.1
Timestamp of tree: Sun, 24 Oct 2010 01:30:01 +0000
ccache version 2.4 [disabled]
app-shells/bash:     4.1_p9
dev-java/java-config: 2.1.11
dev-lang/python:     2.5.4-r4, 2.6.6-r1, 3.1.2-r4
dev-util/ccache:     2.4-r8
dev-util/cmake:      2.8.1-r2
sys-apps/baselayout: 2.0.1
sys-apps/openrc:     0.6.3
sys-apps/sandbox:    2.3-r1
sys-devel/autoconf:  2.13, 2.68
sys-devel/automake:  1.8.5-r4, 1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.5
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.10
sys-devel/make:      3.82
virtual/os-headers:  2.6.35 (sys-kernel/linux-headers)
Repositories: gentoo zugaina x11 sping local
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA dlj-1.1 PUEL AdobeFlash-10 AdobeFlash-10.1"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=opteron -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=opteron -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="binpkg-logs buildpkg distlocks fixlafiles fixpackages news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-O1"
LINGUAS="en"
MAKEOPTS="-j5"
PKGDIR="/usr/local/portage/packages"


# cat /etc/exports
# /etc/exports: NFS file systems being exported.  See exports(5).

/home                           192.168.0.0/16(rw,no_root_squash,sync,no_subtree_check)
/public                         192.168.0.0/16(rw,no_root_squash,sync,no_subtree_check)
/pubroot                        192.168.0.0/16(rw,no_root_squash,sync,no_subtree_check)
/usr/portage/distfiles          192.168.0.0/16(rw,no_root_squash,sync,no_subtree_check)
/usr/local/portage/packages     192.168.0.0/16(rw,no_root_squash,sync,no_subtree_check)

*  net-fs/nfs-utils
      Latest version available: 1.2.3
      Latest version installed: 1.2.3
      Size of files: 656 kB
      Homepage:      http://linux-nfs.org/
      Description:   NFS client and server daemons
      License:       GPL-2

*  net-libs/libnfsidmap
      Latest version available: 0.23-r1
      Latest version installed: 0.23-r1
      Size of files: 350 kB
      Homepage:      http://www.citi.umich.edu/projects/nfsv4/linux/
      Description:   NFSv4 ID <-> name mapping library
      License:       BSD

Several comments:---------------------

1) I haven't changed any NFS settings for years. I use vanilla-sources and I've always used the same settings every time.

2) I didn't see this error message until I set up NFS shares /usr/portage/distfiles and /usr/local/portage/packages. i.e. On my systems, the error message only appears when I run emerge.

3) I did notice at one point that space was starting to be eaten up on the server's share. This may fit with the '4 billion uids' mentioned earlier. I put a stop to it by umounting and re-mounting the shares. I suspect a looping error condition.


I know less than the reporter. If there is information you'd like me to check, I'll be happy to do so but it would be best to suggest the commands you'd like me to run.

Hope this helps.
 
Comment 16 Guy 2010-10-24 23:52:47 UTC
I'd like to stress and clarify.

The /usr/portage/distfiles and /usr/local/portage/packages are **new** NFS shares for me. I've had no problems with any of my other shares. Note that I do not normally 'chown' my regular data files.
Comment 17 Guy 2010-10-25 00:44:03 UTC
I may have some partial answers:

This thread from gentoo forums appears relevant: http://forums.gentoo.org/viewtopic-p-6357635.html?sid=0265b31feff9fb92833a64ad15aba6ad

Apparently the Domain field in /etc/idmapd.conf needs to match between machines for nfs4.

The implication is that if this file gets changed between server and client systems, then you must manually set this field correctly.

One way to 'break' this would be to create binary packages without keeping the associated config files. Note following from my binary package server which also hosts the NFS shares:

pyrotekk ~ # cat /etc/idmapd.conf
[General]
#Verbosity = 0
# The following should be set to the local NFSv4 domain name
# The default is the host's DNS domain name.
#Domain = local.domain.edu

# The following is a comma-separated list of Kerberos realm
# names that should be considered to be equivalent to the
# local realm, such that <user>@REALM.A can be assumed to
# be the same user as <user>@REALM.B
# If not specified, the default local realm is the domain name,
# which defaults to the host's DNS domain name,
# translated to upper-case.
# Note that if this value is specified, the local realm name
# must be included in the list!
#Local-Realms = 

[Mapping]

#Nobody-User = nobody
#Nobody-Group = nobody

[Translation]

# Translation Method is an comma-separated, ordered list of
# translation methods that can be used.  Distributed methods
# include "nsswitch", "umich_ldap", and "static".  Each method
# is a dynamically loadable plugin library.
# New methods may be defined and inserted in the list.
# The default is "nsswitch".
#Method = nsswitch


Now note from the client system which is currently being updated via "emerge -guND @world":

-bash-4.1# cat /etc/idmapd.conf
# empty file because --include-config=n when `quickpkg` was used


I am using binary packages for the first time. I will have some cleanup to do ...
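
Comparing the Domain setting on server and client can be scripted. The sketch below assumes /etc/idmapd.conf is INI-like enough for Python's `configparser` (true of the two files quoted above, not verified for every possible file); `idmapd_domain` is a made-up helper name and the sample texts are illustrative.

```python
import configparser


def idmapd_domain(conf_text):
    """Return the Domain value from the [General] section, or None if unset."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Option names are matched case-insensitively; section names are not.
    return parser.get("General", "Domain", fallback=None)


server_conf = "[General]\nDomain = localdomain\n[Mapping]\n"
client_conf = "# empty file because --include-config=n when `quickpkg` was used\n"

print(idmapd_domain(server_conf))  # localdomain
print(idmapd_domain(client_conf))  # None
```

A result of None means the file does not set Domain, in which case idmapd falls back to the host's DNS domain name, as the comments in the file quoted above say.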
Comment 18 Nikolaj Šujskij 2010-10-29 10:01:17 UTC
Thanks to everyone here (and especially Kacper!), I resolved my issue. Here's what I did. 

 * set up "Domain" in /etc/idmapd.conf and made sure rpc.idmapd service is running
 * simplified options in /etc/exports to (rw,subtree_check,insecure,no_root_squash,sync)

Now everything seems to work all right.
Comment 19 Matt Turner gentoo-dev 2010-11-02 01:00:55 UTC
*** Bug 326079 has been marked as a duplicate of this bug. ***
Comment 20 SpanKY gentoo-dev 2010-11-14 00:41:21 UTC
i wonder if we should set the default Domain to something like nfsv4.ebuild.gentoo.org so that things work "out of the box"
Comment 21 Kacper Kowalik (Xarthisius) (RETIRED) gentoo-dev 2010-11-14 08:21:06 UTC
(In reply to comment #20)
> i wonder if we should set the default Domain to something like
> nfsv4.ebuild.gentoo.org so that things work "out of the box"
That could prevent duplicates :) On the other hand, it currently defaults to the DNS domain name, so provided that one has a sane net config it _should_ work "out of the box"...

Comment 22 Nikolaj Šujskij 2010-11-14 10:47:06 UTC
I wonder how this Domain thing is supposed to work on laptops, which are almost always used in networks with different DNS domains.
Comment 23 SpanKY gentoo-dev 2010-11-14 20:21:46 UTC
that would be a question to pose to the upstream lists i think
Comment 24 Nikolaj Šujskij 2010-11-14 20:45:48 UTC
(In reply to comment #23)
> that would be a question to pose to the upstream lists i think
> 

Of course, but it's such an obvious case that I thought I missed something and hoped someone more experienced would be able to explain.
Comment 25 SpanKY gentoo-dev 2010-11-14 20:50:31 UTC
ive personally never used NFSv4.  i tried years ago to get it working, but couldnt, and never bothered again.  NFSv3 works fine for all my needs.
Comment 26 Jeroen Roovers (RETIRED) gentoo-dev 2011-03-25 17:23:24 UTC
Setting Domain = in /etc/idmapd.conf to the same value (I don't think it matters whether it's actually a valid domain) and running it on both servers and clients fixed it for me.
Comment 27 Fab 2011-05-15 10:10:52 UTC
(In reply to comment #26)
> Setting Domain = in /etc/idmapd.conf to the same value (I don't think it
> matters whether it's actually a valid domain) and running it on both servers
> and clients fixed it for me.

Same here. I had to start the rpc.idmapd init script manually on the client.
Shouldn't it be added to the dependency list of the netmount init script ?
Comment 28 James Le Cuirot gentoo-dev 2012-04-15 12:09:10 UTC
I have seen this problem occur sporadically despite nfsv4 working fine for me most of the time. I couldn't seem to shift the problem today so I tried the new NFS idmapper (see kernel and nfs-utils) and the problem has gone away. Fingers crossed. Maybe give it a try if you're still facing this problem. Remember to create /etc/request-key.d/id_resolver.conf with the following...

create	id_resolver	*	*	/usr/sbin/nfsidmap %k %d 600
Comment 29 David Heidelberg (okias) 2012-06-20 17:24:27 UTC
running rpc.idmapd on both computers helped, but this should be fixed in a better way than by browsing Bugzilla :)

Thanks to all for help.