Bug 33474 - chmod on socket fails on NFS filesystem | /var/run should be tmpfs for nfsroot systems
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: x86-kernel@gentoo.org (DEPRECATED)
Reported: 2003-11-14 12:37 UTC by Gregory P. Smith
Modified: 2004-04-11 23:15 UTC (History)
CC List: 3 users


Attachments
kernel config on the nfsroot client (config-2.4.20-gentoo-r7-nfsroot, 31.01 KB, text/plain)
2003-11-15 14:14 UTC, Gregory P. Smith

Description Gregory P. Smith 2003-11-14 12:37:47 UTC
I have a network booting system that is entirely NFS based.

In order to run vmware I had to install nscd (otherwise I'm happy to live without nscd).

Doing so caused problems.  What I traced it down to was that the 
/var/run/.nscd_socket unix domain socket that nscd creates on startup was being
created with mode 0600 (or 0644?) followed by a chmod call to set it to the
appropriate 0666 mode.

With /var/run on an NFS (v3, tcp) file system, strace showed that the
chmod call was failing, so the mode was left at 0644.  This was the worst possible
outcome because glibc would see the .nscd_socket and try to use it, but only
root-owned processes would succeed and be able to do any name lookups.
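
To see which state a given socket is in, its mode bits can be read directly. The following is only a sketch and not part of nscd or glibc: the helper name socket_mode_report is made up for illustration, and it assumes the Linux rule that connecting to a Unix domain socket requires write permission on the socket file.

```python
import os
import stat


def socket_mode_report(path):
    """Return (permission_bits, world_writable) for the socket at `path`.

    world_writable is used here as a stand-in for "any user may connect",
    which assumes Linux semantics for Unix domain sockets.
    """
    st = os.stat(path)
    if not stat.S_ISSOCK(st.st_mode):
        raise ValueError("%s is not a socket" % path)
    mode = stat.S_IMODE(st.st_mode)
    return mode, bool(mode & stat.S_IWOTH)
```

Called on /var/run/.nscd_socket on an affected system, this would be expected to report a mode without the world-write bit.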

(system config: kernel 2.4.20-gentoo-r7 on this client; my server is an alpha running 
2.4.23-pre1 with 10 nfs patches from sourceforge to allow for 32k read/writes and 
some bug fixes)

I worked around it by mounting /var/run as tmpfs, reasoning that /var/run only ever contains transient files relevant to a single boot of the system.  All has been well since.

Is there any reason everyone's /var/run should not be tmpfs?

I don't see an NFS setattr call on the network when I do the chmod (though it does
do a lookup on the file).

Here's a simple way to test this yourself:

% python
>>> import socket
>>> s = socket.socket(socket.AF_UNIX)
>>> s.bind('/path/on/nfs/filesystem/socketfile')
>>> ^Z
[1]+  Stopped

% ls -al /path/on/nfs/filesystem/socketfile
srwxr-xr-x    1 greg     users           0 Nov 14 12:26 socketfile=
% chmod 0666 /path/on/nfs/filesystem/socketfile
% ls -al /path/on/nfs/filesystem/socketfile
srwxr-xr-x    1 greg     users           0 Nov 14 12:26 socketfile=

Do that on a non-NFS filesystem and it works as expected.
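
The same check can be scripted so it is easy to rerun against different kernels. This is a minimal sketch: the function name chmod_sticks is hypothetical, and the scratch directory in the example has to be replaced with a directory on the NFS mount under test.

```python
import os
import socket
import stat
import tempfile


def chmod_sticks(directory, mode=0o666):
    """Bind a Unix socket in `directory`, chmod it, and return
    (mode_before, mode_after, ok), where ok means the new mode took effect."""
    path = os.path.join(directory, "socketfile")
    s = socket.socket(socket.AF_UNIX)
    try:
        s.bind(path)  # creates the socket file on disk
        before = stat.S_IMODE(os.stat(path).st_mode)
        try:
            os.chmod(path, mode)
        except OSError:
            pass  # a failing chmod shows up as an unchanged mode below
        after = stat.S_IMODE(os.stat(path).st_mode)
        return before, after, after == mode
    finally:
        s.close()
        if os.path.exists(path):
            os.unlink(path)


if __name__ == "__main__":
    # Assumption: swap this scratch directory for a directory on the NFS mount.
    target = tempfile.mkdtemp()
    before, after, ok = chmod_sticks(target)
    print("before=%04o after=%04o ok=%s" % (before, after, ok))
```

On a local filesystem ok should come back True; on an affected NFS client the mode would be expected to stay unchanged.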

This sounds like a Linux kernel NFS client bug, or possibly a glibc bug?  (glibc 2.3.2 here)
Comment 1 Martin Schlemmer (RETIRED) gentoo-dev 2003-11-15 08:08:23 UTC
We do not want to depend on tmpfs.  If it's a kernel bug, then let it be fixed.
Comment 2 Tim Yamin (RETIRED) gentoo-dev 2003-11-15 08:32:03 UTC
Can we have some kernel info, please? Can you also attach your .config to this bug? Thanks.
Comment 3 Brian Jackson (RETIRED) gentoo-dev 2003-11-15 11:35:02 UTC
WRXsti src # cp -a /tmp/ksocket-root/kdeinit-\:0 /usr/portage/test.sock
WRXsti src # ls -lah /usr/portage/test.sock
srw-------    1 root     root            0 Nov 10 15:29 /usr/portage/test.sock
WRXsti src # chmod 777 /usr/portage/test.sock
WRXsti src # ls -lah /usr/portage/test.sock
srwxrwxrwx    1 root     root            0 Nov 10 15:29 /usr/portage/test.sock
WRXsti src # mount | grep portage
192.168.0.1:/usr/portage on /usr/portage type nfs (rw,noatime,addr=192.168.0.1)
WRXsti src # uname -a
Linux WRXsti 2.4.22-gentoo-test-r1 #2 Mon Nov 10 14:50:35 CST 2003 i686 AMD Athlon(tm) XP 2600+ AuthenticAMD GNU/Linux
WRXsti src #

Seems to work fine over here. Maybe you can try a different kernel and let us know whether it's a problem with gentoo-sources-2.4.20?
Comment 4 Gregory P. Smith 2003-11-15 14:13:39 UTC
Kernel on the gentoo client is 2.4.20-gentoo-r7.

Kernel parameters (via pxelinux):
root=/dev/nfs nfsroot=192.168.2.200:/home/nfsroot,v3,rw,posix,rsize=8192,wsize=8192,actimeo=300 ip=bootp

/etc/fstab:
192.168.2.200:/home/nfsroot   /  nfs    rw,nfsvers=3,tcp,lock,intr,posix,actimeo=300,rsize=32768,wsize=32768
192.168.2.200:/home/nfsroot/usr /usr    nfs      rw,nfsvers=3,tcp,lock,intr,posix,actimeo=300,rsize=32768,wsize=32768
192.168.2.200:/home/nfsroot/home /home     nfs   rw,nfsvers=3,tcp,lock,intr,posix,actimeo=300,rsize=32768,wsize=32768

#/dev/SWAP      none    swap   sw   0 0
/dev/cdroms/cdrom0    /mnt/cdrom iso9660        noauto,ro           0 0

# NOTE: The next line is critical for boot!
none    /proc         proc    defaults     0 0
none    /proc/bus/usb usbdevfs defaults    0 0

# glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
# POSIX shared memory (shm_open, shm_unlink). 
none    /dev/shm      tmpfs    defaults    0 0
none    /tmp          tmpfs    defaults    0 0
# /var/run contains state only useful to a booted system
# i was having trouble with permissions on /var/run/.nscd_socket not
# being set to 0666 as nscd chmod'ed it to.  hopefully this fixes that.
none    /var/run      tmpfs    defaults    0 0


I'll attach the kernel config file.
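
To confirm which filesystem actually backs /var/run after boot, the mount table can be inspected. A rough sketch, assuming the usual /proc/mounts layout of "device mountpoint fstype options ..."; the helper name fstype_for is hypothetical, and it does not resolve symlinks or decode escaped characters in mount points.

```python
import os


def fstype_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing `path`.

    `mounts_text` is the content of a mount table in /proc/mounts format.
    When the same mount point appears twice, the later line wins.
    """
    path = os.path.normpath(path)
    best_mountpoint, best_fstype = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mountpoint, fstype = fields[1], fields[2]
        inside = path == mountpoint or path.startswith(mountpoint.rstrip("/") + "/")
        if inside and len(mountpoint) >= len(best_mountpoint):
            best_mountpoint, best_fstype = mountpoint, fstype
    return best_fstype


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        print(fstype_for("/var/run/.nscd_socket", f.read()))
```

With the fstab above in effect, the expectation is tmpfs for anything under /var/run and nfs for the rest of /var.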
Comment 5 Gregory P. Smith 2003-11-15 14:14:47 UTC
Created attachment 20790
kernel config on the nfsroot client
Comment 6 Gregory P. Smith 2003-11-15 14:18:17 UTC
I'm compiling a 2.4.22-gentoo-sources kernel now and will let you know how it goes.
Comment 7 Gregory P. Smith 2004-02-14 20:53:36 UTC
2.4.22-gentoo-r5 does have this bug.
Comment 8 Gregory P. Smith 2004-02-14 20:59:41 UTC
But 2.4.22-gentoo (which no longer seems to be in portage) does not have this bug.
Comment 9 Jason Cox (RETIRED) gentoo-dev 2004-04-08 20:40:54 UTC
How about anything newer? Does the error still exist?
Comment 10 Gregory P. Smith 2004-04-11 23:15:48 UTC
Yay.  2.4.25-gentoo does not have this bug.  I haven't tried anything 2.6 on this machine; I'll reopen this bug or file a new one for any 2.6 NFS issues.