Bug 37545 - OpenAFS on Gentoo seems to be very broken
Summary: OpenAFS on Gentoo seems to be very broken
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: x86 Linux
: High critical (vote)
Assignee: Stefaan De Roeck (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-01-07 16:58 UTC by Brett I. Holcomb
Modified: 2005-08-16 06:26 UTC (History)
5 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
Diff file for changes to fix $PART variable check (afs.txt,376 bytes, text/plain)
2004-01-09 22:02 UTC, Brett I. Holcomb

Description Brett I. Holcomb 2004-01-07 16:58:29 UTC
I have been attempting to install and get OpenAFS 1.2.10 running for the past week.  I spent part of the previous week going through the Gentoo and OPENAFS docs  several times.  It appears either the Gentoo docs are very wrong or the ebuilds are very broken.  

I'm willing to help fix this by running things and giving feedback but I don't have the knowledge to fix the ebuilds and make it run.

1.  OpenAFS installs.  
2.  I am doing a server install and the docs seem to work until I have to set the acl.  Then I get an error "function /afs not implemented" (see the bug on that).  Looking at the Gentoo forums that error has been around for a long time.
3.  If I try and start openafs with /etc/init.d/afs start I get errors.
First try - nothing starts.

/sbin/runscript.sh: line 40: [: /dev/loop0: binary operator expected
expr: syntax error
expr: syntax error
 * Starting AFS services...
ParseCacheInfoFile: Format error in cache info file!
        2 out of 3 fields successfully parsed.
 * Error starting AFS                                                     [ !! ]

Next try:

root@gandalf etc # /etc/init.d/afs start 
/sbin/runscript.sh: line 40: [: /dev/loop0: binary operator expected
 * Starting AFS services...
Failed to load AFS client, not starting AFS services.
/sbin/runscript.sh: line 310: [: Error Starting AFS client: integer expression expected                                                                   [ !! ]

/sbin/runscript.sh: line 329: return: Error: numeric argument required
ParseCacheInfoFile: Format error in cache info file!
        2 out of 3 fields successfully parsed.
 * Error starting AFS                                                     [ !! ]
Comment 1 Steven Jenkins 2004-01-07 22:31:05 UTC
Please post the result of:

cat /proc/mounts
cat /usr/vice/etc/cacheinfo
Comment 2 Brett I. Holcomb 2004-01-08 17:06:09 UTC
Here they are. I had to modify the cacheinfo file to add the 100000 after the last : since it was blank; only the first two fields were there.  The text in /etc/afs/afs.conf says that the startup should figure that out automatically, but it evidently did not.  OpenAFS would not start until I fixed the cacheinfo.

afs:/usr/vice/cache:100000
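
The "2 out of 3 fields" error above is consistent with a cacheinfo line whose third field is empty. A minimal sketch of a check for that, assuming the usual three colon-separated fields (mount point, cache directory, cache size in 1K blocks); check_cacheinfo is a hypothetical helper name, not part of OpenAFS or the init script:

```shell
# check_cacheinfo LINE: succeed only if LINE has exactly three
# colon-separated fields and the third (cache size) is all digits.
# Hypothetical helper for illustration only.
check_cacheinfo() {
    printf '%s\n' "$1" |
        awk -F: 'NF != 3 || $3 !~ /^[0-9]+$/ { bad = 1 } END { exit bad }'
}

# A complete line passes; a line with a blank size field does not.
if check_cacheinfo '/afs:/usr/vice/cache:100000'; then
    echo "complete line: ok"
fi
if ! check_cacheinfo '/afs:/usr/vice/cache:'; then
    echo "blank size field: rejected"
fi
```

Against a real file this could be run as check_cacheinfo "$(cat /usr/vice/etc/cacheinfo)".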

root@gandalf etc # cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / xfs rw 0 0
none /dev devfs rw 0 0
none /proc proc rw 0 0
/dev/scsi/host0/bus0/target0/lun0/part3 /home xfs rw 0 0
/dev/scsi/host0/bus0/target0/lun0/part6 /usr xfs rw 0 0
/dev/scsi/host0/bus0/target0/lun0/part5 /files xfs rw 0 0
/dev/loop0 /usr/vice/cache ext2 rw 0 0
/dev/loop1 /vicepa ext2 rw 0 0
none /dev/shm tmpfs rw 0 0
strider:/home /mnt/strider/home nfs rw,nosuid,nodev,v2,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=strider 0 0
strider:/usr/local/portage /mnt/strider/lportage nfs rw,nosuid,nodev,v2,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=strider 0 0

Comment 3 Steven Jenkins 2004-01-08 19:45:20 UTC
This is almost certainly Bug 26213 again. Line 53 of /etc/init.d/afs is trying to determine the size of the partition mounted on /usr/vice/cache. The attempt fails if 'df' returns a multi-line answer, as it does when the device name is too long.

What does 'df' return?
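
For reference, one way to avoid the multi-line problem is the POSIX output format of df, which prints each filesystem on a single line regardless of device name length. The following is only a sketch, not the fix from Bug 26213; fs_size_kb is a hypothetical helper name:

```shell
# fs_size_kb PATH: print the total size, in 1K blocks, of the filesystem
# that holds PATH. -P selects the POSIX format (one line per filesystem),
# -k forces 1K units. Hypothetical helper for illustration only.
fs_size_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $2 }'
}

# Example: size of the filesystem holding the root directory.
fs_size_kb /
```

An init script could call fs_size_kb /usr/vice/cache in place of parsing plain df output.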
Comment 4 Brett I. Holcomb 2004-01-09 18:08:24 UTC
Here's df.  It is multiline (I redirected it to a file with df >tmp.txt and looked at it with a text editor).  The mounts for strider are NFS mounts.

root@gandalf root # df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/scsi/host0/bus0/target0/lun0/part2
                      10706684    635864  10070820   6% /
/dev/scsi/host0/bus0/target0/lun0/part3
                      10706684   3566032   7140652  34% /home
/dev/scsi/host0/bus0/target0/lun0/part6
                      10734620   3003984   7730636  28% /usr
/dev/scsi/host0/bus0/target0/lun0/part5
                     220178188  25459328 194718860  12% /files
/var/cache/openafs/openafs.file
                        298471       613    282447   1% /usr/vice/cache
/files/openafs/vicepapart
                          4838        72      4516   2% /vicepa
none                    516084         0    516084   0% /dev/shm
strider:/home         10706684   5414300   5292384  51% /mnt/strider/home
strider:/usr/local/portage
                      10706652   6243904   4462748  59% /mnt/strider/lportage

Comment 5 Brett I. Holcomb 2004-01-09 20:52:32 UTC
I've been looking at this and it may not be the multiline bug.  The code checks for the existence of /usr/vice/etc/cacheinfo.  If it does NOT exist it goes through the auto creation.   However, it does exist on my system.  I even modified /etc/init.d/afs to use the patch given in bug 26213 and it did not help.  The fix should be included in this init file though. It looks like OpenAFS did not include it.

However, if I execute PART=`cat /proc/mounts | grep vice | grep ext2 | awk '{print $1}'` $PART contains /dev/loop0 and /dev/loop1.  When I start /etc/init.d/afs I get this error.

/sbin/runscript.sh: line 40: [: /dev/loop0: binary operator expected

Any ideas on why this is showing up?  I looked at /sbin/runscript.sh but line 40 is a comment and I can't see why any of the surrounding lines should have a problem with this.
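
The pipeline above matches every ext2 mount whose line contains "vice", which is why it returns two devices here. A sketch of a narrower match, assuming the /proc/mounts field layout (device in field 1, mount point in field 2); cache_device is a hypothetical helper name and not what the init script does:

```shell
# cache_device DIR: read /proc/mounts-style lines on stdin and print the
# device whose mount point (field 2) is exactly DIR.
# Hypothetical helper for illustration only.
cache_device() {
    awk -v dir="$1" '$2 == dir { print $1 }'
}

# Sample input taken from the /proc/mounts output in comment 2.
printf '%s\n' \
    '/dev/loop0 /usr/vice/cache ext2 rw 0 0' \
    '/dev/loop1 /vicepa ext2 rw 0 0' |
    cache_device /usr/vice/cache
```

On a live system the input would be cache_device /usr/vice/cache </proc/mounts.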
Comment 6 Brett I. Holcomb 2004-01-09 21:48:31 UTC
I put a set -x in the start() function of /etc/init.d/afs and found that it's upset because $PART contains "/dev/loop0 /dev/loop1" and the -z test evidently doesn't like it (see below).  I found that if I changed the test to [ -z "$PART" ] then it worked.  This situation may not be normal but it needs to be accounted for.  I was out of partitions so I created two files, made ext2 filesystems on them and then loop mounted them.

With original:

+ PART=/dev/loop0
/dev/loop1
+ '[' -z /dev/loop0 /dev/loop1 ']'
/sbin/runscript.sh: line 40: [: /dev/loop0: binary operator expected

Fixed:

+ PART=/dev/loop0 
/dev/loop1
+ '[' -z '/dev/loop0
/dev/loop1' ']'
+ :
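
The difference between the two traces comes down to word splitting: unquoted, $PART expands to two words and [ receives "-z /dev/loop0 /dev/loop1"; quoted, it stays a single argument. A small standalone sketch, using a made-up PART value that mirrors the trace:

```shell
# Hypothetical value mirroring the trace: two devices, one per line.
PART='/dev/loop0
/dev/loop1'

# With the quotes, [ sees exactly one operand after -z, so the test is
# well-formed even though the value contains whitespace.
if [ -z "$PART" ]; then
    part_state=empty
else
    part_state=set
fi
echo "PART is $part_state"
```

Removing the quotes around $PART in the test reproduces the "binary operator expected" message in bash.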
Comment 7 Brett I. Holcomb 2004-01-09 22:02:13 UTC
Created attachment 23528
Diff file for changes to fix $PART variable check
Comment 8 Brett I. Holcomb 2004-01-09 22:04:17 UTC
Now that AFS does the check and tries to start I get to this point.  What next?

root@gandalf init.d # /etc/init.d/afs start
 * Starting AFS services...
afsd: All AFS daemons started.
afsd: Can't mount AFS on /afs(22)
 * Error starting AFS                                                     [ !! ]
Comment 9 Steven Jenkins 2004-01-10 08:12:28 UTC
That problem could be a lot of things, probably not related to the ebuild.

Make sure the kernel sources you compiled against match your running kernel.

Make sure the directory /afs exists.

Check /usr/vice/etc/ThisCell and /usr/vice/etc/CellServDB.
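
The last two checks can be scripted. This is only a sketch: afs_preflight is a hypothetical helper name, and it only confirms that the files exist and are non-empty, not that their contents are correct:

```shell
# afs_preflight MOUNTPOINT CONFDIR: report a missing mount point directory
# or missing/empty ThisCell and CellServDB files under CONFDIR.
# Returns non-zero if anything is missing. Hypothetical helper.
afs_preflight() {
    status=0
    [ -d "$1" ] || { echo "missing directory: $1"; status=1; }
    for f in ThisCell CellServDB; do
        [ -s "$2/$f" ] || { echo "missing or empty: $2/$f"; status=1; }
    done
    return "$status"
}

# Typical call on the machine from this report (not run here):
#   afs_preflight /afs /usr/vice/etc
```

For the kernel check, uname -r prints the running kernel release to compare against the sources the module was built from.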
Comment 10 Ryan Phillips (RETIRED) gentoo-dev 2004-01-21 09:49:56 UTC
Is this bug still valid?
Comment 11 Jon Nials 2004-02-14 09:41:48 UTC
AFS not mounting is probably due to following the gentoo AFS documentation.  It is unclear on a number of things.  Unless you have created your root.afs and root.cell entries properly /afs cannot mount.  /afs is a "magic" mount point.

To work around this try the following:

Assuming you have created your "admin" id:

klog admin
vos create root.afs <servername> <vice partition name>
vos create root.cell <servername> <vice partition name>

now restart your AFS process (you may have to reboot) and continue with the Gentoo documentation.
Comment 12 Brett I. Holcomb 2004-02-14 10:28:07 UTC
Okay.  I may try that.  I have had to drop working on getting OpenAFS running for a while due to other things taking my time.  I got OpenAFS to start/run - kind of - by dumping the Gentoo docs and going to the OpenAFS docs.  I got everything created and ready to run.  I also had to fix the /etc/init.d/afs file because a fix for multiline fstab output wasn't in there (it had gotten lost).  

I may just unmerge OpenAFS and then start over again.

Thanks.
Comment 13 Maurice van der Pot (RETIRED) gentoo-dev 2005-06-30 14:30:06 UTC
OpenAFS needs a developer to take up maintenance.
Comment 14 Martin Mokrejš 2005-07-02 03:35:59 UTC
Per comment #8 and #11: make sure bosserver is running when you attempt to start
afsd.

PATH=/usr/afs/bin:$PATH

bosserver -noauth &

bos adduser aquarius mmokrejs.admin -cell doma


# the next commands make bosserver immediately start server processes
bos create aquarius ptserver simple /usr/afs/bin/ptserver -cell doma
bos create aquarius vlserver  simple /usr/afs/bin/vlserver -cell doma
bos create aquarius fs fs /usr/afs/bin/fileserver /usr/afs/bin/volserver /usr/afs/bin/salvager -cell doma
# bos stop -instance fs -server aquarius
# bos delete -instance fs -server aquarius

pts createuser mmokrejs -cell doma
pts createuser mmokrejs.admin -cell doma
pts adduser mmokrejs.admin system:administrators -cell doma
pts membership mmokrejs.admin -cell doma

vos create aquarius /vicepa root.afs -cell doma

mkdir -p /afs
chmod a+rx /afs

mkdir -p /usr/vice/cache
/usr/vice/etc/afsd -nosettime -verbose &

If this works, reboot the machine, because bosserver is running in insecure mode and
you cannot easily get rid of the afsd kernel process. After the machine comes
up, start bosserver without the "-noauth" option, start afsd again, and continue:

for kth-krb4 do:
  kauth mmokrejs.admin

for heimdal do:
  kauth mmokrejs/admin

ls -la /afs
df
vos listvol aquarius
vos listvldb
fs sa /afs system:anyuser rl
fs sa /afs system:authuser rl
fs examine /afs
vos create aquarius /vicepa root.cell -cell doma
fs mkmount /afs/doma root.cell
fs setacl /afs/doma system:anyuser rl
fs setacl /afs/doma system:authuser rl
fs mkmount /afs/.doma root.cell -rw
fs setacl /afs/.doma system:anyuser rl
fs setacl /afs/.doma system:authuser rl
fs examine /afs
vos addsite aquarius /vicepa root.afs
vos addsite aquarius /vicepa root.cell
vos release root.afs
vos release root.cell
fs checkvolumes
ls -la /afs
vos release root.cell
fs checkvolumes
vos release root.afs
Comment 15 Stefaan De Roeck (RETIRED) gentoo-dev 2005-07-28 09:01:52 UTC
New ebuilds for openafs 1.2.13 (stable) and 1.3.85 (experimental) are available
for testing.  According to openafs-ml, 1.3.85 is currently undergoing testing so
it can become 1.4rc.

The new init script no longer creates /etc/openafs/cacheinfo by itself; the
administrator is supposed to create it.  The check that the cache location
contains a mounted ext2 partition has also been removed (the cache doesn't need
to be a separate partition, much less an ext2 one, though care with 1.2.x is
probably advisable).

It'd be great if the new proposed ebuilds could be tested to determine the
current degree of brokenness of "OpenAFS on Gentoo" as mentioned in the bug's
summary :)
Comment 16 Stefaan De Roeck (RETIRED) gentoo-dev 2005-08-16 06:26:36 UTC
Problems were most likely caused by misconfiguration by the user.