Bug 3582 - cp -pr fails on XFS filesystems (fileutils)
Summary: cp -pr fails on XFS filesystems (fileutils)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High major
Assignee: Brandon Low (RETIRED)
URL:
Whiteboard:
Keywords:
Duplicates: 4464 5525 5587 7423
Depends on:
Blocks: 7423
Reported: 2002-06-10 10:47 UTC by Sascha Silbe
Modified: 2003-02-04 19:42 UTC (History)
8 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
Typescript of a failed "emerge sys-libs/db" for db-3.2.9 (db-3.2.9.script.bz2,9.61 KB, application/x-bzip)
2002-07-07 11:36 UTC, Sascha Silbe
Typescript of a failed 'cp -pr' (cp.script.bz2,219 bytes, application/x-bzip)
2002-07-08 05:30 UTC, Sascha Silbe
Kernel config for 2.4.18-2-hybrid (sys-kernel/xfs-sources-2.4.18) (config-2.4.18-2-hybrid,25.98 KB, text/plain)
2002-07-09 07:13 UTC, Sascha Silbe
Patch to fs/xfs/xfs_acl.c from SGI cvs to fix the cp -pr issues (XFS.patch,6.87 KB, patch)
2002-08-01 10:43 UTC, Disconnect

Description Sascha Silbe 2002-06-10 10:47:32 UTC
sys-libs/db-3.2.3h-r4 fails to compile, see attachment. 
I fetched a clean copy from rsync.nl.gentoo.org (i.e. removed /usr/portage/sys-libs/db before syncing) to be sure it's not an rsync problem again, and also removed /var/cache/edb/dep/dep-db-3.2.3h-r4.ebuild prior to fetching clean archives (which was necessary to get openssh to build properly).
Comment 1 Sascha Silbe 2002-06-10 10:51:45 UTC
Since Bugzilla fails to attach a file to the bug (complains about not having specified a file), I've put it onto my homepage:
http://sascha.silbe.org/db.script.bz2
Comment 2 Jon Nelson (RETIRED) 2002-06-13 20:22:17 UTC
Try unmasking 3.2.9 and build that. 3.2.3h has some issues.
Comment 3 Jon Nelson (RETIRED) 2002-06-25 19:48:09 UTC
I am unable to reproduce the problem, and this is a singular report.
If it is reproducible with a more recent version of portage, please re-open the
bug report and provide as much detail as possible.

Thanks!
Comment 4 Sascha Silbe 2002-07-07 11:36:36 UTC
Created attachment 2002 [details]
Typescript of a failed "emerge sys-libs/db" for db-3.2.9
Comment 5 Sascha Silbe 2002-07-07 11:38:04 UTC
It still happens with sys-libs/db-3.2.9. But I found out that it works with an ext2 /var filesystem, so it probably has something to do with the XFS / (root) filesystem.
I've attached the typescript.
Comment 6 wiregauze 2002-07-08 01:26:19 UTC
I also had exactly the same problem with sys-libs/db. I first noticed the
problem when upgrading it to 3.2.9, but while unmerging/remerging it, I found
the same problem arises with 3.2.3h-r4, too.

My / filesystem is XFS.

As a temporary fix, I modified $(cp) -pr to $(cp) -r in the install_docs target in
/var/.../build_unix/Makefile. I don't know what side effects there would be from
omitting the "-p" option, but nothing noticeable so far.

I needed to manually run ebuild install and ebuild qmerge in order to keep
portage happy.
Comment 7 Sascha Silbe 2002-07-08 05:30:24 UTC
Created attachment 2030 [details]
Typescript of a failed 'cp -pr'
Comment 8 Sascha Silbe 2002-07-08 05:34:48 UTC
I've verified that the real problem is in sys-apps/fileutils, not sys-libs/db (see attached typescript).
You probably want to reassign the bug to the maintainer of sys-apps/fileutils.
Comment 9 Jon Nelson (RETIRED) 2002-07-08 07:34:18 UTC
Could you run that typescript again, and this time do 2 things different?

1. When you ls, use ls --color=off (the ansi codes make this unreadable)
2. Attach the typescript without bzipping it -- it makes it impossible to view
without downloading it.

Thanks!
Comment 10 Sascha Silbe 2002-07-08 17:01:09 UTC
This time as a screenshot:

hybrid root # mkdir x y 
hybrid root # cp -pr x y
cp: preserving permissions for `y/x': Invalid argument
hybrid root # mount
/dev/main_vg/gentoo-root on / type xfs (rw,noatime)
proc on /proc type proc (rw)
none on /dev type devfs (rw)
tmpfs on /mnt/.init.d type tmpfs (rw,mode=0644,size=1024k)
tmpfs on /dev/shm type tmpfs (rw)
sphere:/home on /sphere/home type nfs (rw,addr=192.168.1.3)
/dev/hda1 on /boot type ext2 (rw,noatime)
hybrid root # uname -a
Linux hybrid.sascha.silbe.org 2.4.19-gentoo-1 #2 SMP Thu Apr 4 00:10:50 CEST 2002 i586 AuthenticAMD
hybrid root #
Comment 11 Seemant Kulleen (RETIRED) gentoo-dev 2002-07-09 02:12:09 UTC
Sascha, does this happen if you run the vanilla-sources kernel as well? If you
could please try that with the vanilla-sources kernel, that would be fantastic.
Comment 12 Joachim Blaabjerg (RETIRED) gentoo-dev 2002-07-09 06:20:52 UTC
Hmm. Which kernel is this? gentoo-sources? It would be nice if you could test if
it is reproducible with mjc-sources as well, as I think it might be related to
the 54_xfs-2.4.18-split-xattr.bz2 patch in there... Not sure though, just a
theory. Anyway, I'm unfortunately working twelve hours a day for three weeks
now, so I've got ~1 hour a day for Gentoo, so... Anyone? mjc? =)
Comment 13 Joachim Blaabjerg (RETIRED) gentoo-dev 2002-07-09 06:30:44 UTC
jmorgan pointed out on IRC, perhaps this is a SMP problem? Would you mind
testing without SMP support, too? Thanks
Comment 14 Sascha Silbe 2002-07-09 07:12:38 UTC
I've just tried sys-kernel/xfs-sources-2.4.18. Same problem.
sys-kernel/vanilla-sources cannot work (no XFS support).
I'll attach the kernel config of 2.4.18-2-hybrid (xfs-sources-2.4.18).
Comment 15 Sascha Silbe 2002-07-09 07:13:38 UTC
Created attachment 2068 [details]
Kernel config for 2.4.18-2-hybrid (sys-kernel/xfs-sources-2.4.18)
Comment 16 Sascha Silbe 2002-07-09 08:45:41 UTC
Happens for sys-kernel/mjc-sources-2.4.19_pre10, too.
Comment 17 Ryan Phillips (RETIRED) gentoo-dev 2002-07-09 15:52:46 UTC
*** Bug 4464 has been marked as a duplicate of this bug. ***
Comment 18 Sascha Silbe 2002-07-10 19:06:15 UTC
From Bug #4464:

> This happens on my PII-266 with gcc 2.95.3, kernel 2.4.19-r7 and with XFS, but on my athlon 1.4, with the same packages, and with XFS it doesn't happen.
I'm running Gentoo on an AMD K6-2. I don't believe it's a processor issue, though.
Do you use exactly the same USE flags on both systems? If the answer is 'yes', please try different optimizations and post your results.
Thanks!
Comment 19 Sascha Silbe 2002-07-10 19:10:06 UTC
Just verified that it happens on any XFS filesystem, not only on /.
Comment 20 Sam Yates 2002-07-14 12:47:31 UTC
Just a note: I experience the same issue with XFS filesystems and cp
reporting Invalid argument. It -seems- to be an issue peculiar to the
added support in fileutils for acls.

For me the kernel is built from xfs-sources (synched July 10,
xfs-sources-2.4.18.ebuild).

The problem is exhibited on both a Via C3 machine
(CHOST="i586-pc-linux-gnu" CFLAGS="-march=pentium -O2 -pipe") and on
a Pentium4 machine (CHOST="i686-pc-linux-gnu" CFLAGS="-march=i686 -O2
-pipe"). The USE keyword "acl" is set.

I noticed on both machines that ls -l will list every file or directory
on the XFS volumes with a '+' sign indicating the presence of an ACL. It
seems odd that it does this even on files with just the default (normal
Unix) ACLs, but that might be by design.

For me, it does not occur with cp -p for files, even when the files have
extra acls. It always occurs with directories though:

host% mkdir a
host% cp -a a b
cp: preserving permissions for `b': Invalid argument

The directory 'b' however is still created, and ACLs are copied across
correctly.

If 'a' has default acls, then the copy produces no error messages:
host% mkdir a
host% setfacl -m d:dummy:rwx a
host% cp -a a b
host% getfacl b
# file: b
# owner: root
# group: root
user::rwx
group::rwx
other::r-x
default:user::rwx
default:user:ctdummy:rwx
default:group::rwx
default:mask::rwx
default:other::r-x


Running strace: (complete log!)

strace cp -a a b
execve("/bin/cp", ["cp", "-a", "a", "b"], [/* 30 vars */]) = 0
brk(0)                                  = 0x8054804
open("/etc/ld.so.preload", O_RDONLY)    = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
close(3)                                = 0
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=45824, ...}) = 0
old_mmap(NULL, 45824, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40016000
close(3)                                = 0
open("/lib/libacl.so.1", O_RDONLY)      = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\23"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0644, st_size=23930, ...}) = 0
old_mmap(NULL, 21924, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40022000
mprotect(0x40027000, 1444, PROT_NONE)   = 0
old_mmap(0x40027000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x4000) = 0x40027000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\205"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0755, st_size=1335898, ...}) = 0
old_mmap(NULL, 1188992, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40028000
mprotect(0x40141000, 38016, PROT_NONE)  = 0
old_mmap(0x40141000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x118000) = 0x40141000
old_mmap(0x40147000, 13440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40147000
close(3)                                = 0
open("/lib/libattr.so.1", O_RDONLY)     = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\220\n\0"..., 1024) = 1024
fstat64(3, {st_mode=S_IFREG|0644, st_size=9244, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4014b000
old_mmap(NULL, 10116, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4014c000
mprotect(0x4014e000, 1924, PROT_NONE)   = 0
old_mmap(0x4014e000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0x4014e000
close(3)                                = 0
munmap(0x40016000, 45824)               = 0
brk(0)                                  = 0x8054804
brk(0x805482c)                          = 0x805482c
brk(0x8055000)                          = 0x8055000
geteuid32()                             = 0
lstat64("b", 0xbffffa30)                = -1 ENOENT (No such file or directory)
lstat64("a", {st_mode=S_IFDIR|0775, st_size=6, ...}) = 0
mkdir("b", 040775)                      = 0
lstat64("b", {st_mode=S_IFDIR|0775, st_size=6, ...}) = 0
stat64("b", {st_mode=S_IFDIR|0775, st_size=6, ...}) = 0
open("/dev/null", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = -1 ENOTDIR (Not a directory)
open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0775, st_size=6, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
brk(0x8057000)                          = 0x8057000
getdents64(0x3, 0x8054d10, 0x1000, 0)   = 48
getdents64(0x3, 0x8054d10, 0x1000, 0)   = 0
close(3)                                = 0
utime("b", [2002/07/14-16:26:06, 2002/07/14-16:25:37]) = 0
chown32(0xbffffceb, 0, 0)               = 0
SYS_229(0xbffffce9, 0x400260c3, 0xbffff540, 0x84, 0x400274e4) = 44
SYS_226(0xbffffceb, 0x400260c3, 0x8054dc0, 0x2c, 0) = 0
SYS_229(0xbffffce9, 0x40026094, 0xbffff540, 0x84, 0x400274e4) = -1 ENODATA (No data available)
SYS_226(0xbffffceb, 0x40026094, 0x8054d10, 0x4, 0) = -1 EINVAL (Invalid argument)
write(2, "cp: ", 4cp: )                     = 4
write(2, "preserving permissions for `b\'", 30preserving permissions for `b') = 30
write(2, ": Invalid argument", 18: Invalid argument)      = 18
write(2, "\n", 1
)                       = 1
_exit(1)                                = ?


Following the code, in fileutils-4.1.8/lib/acl.c,

line 157: acl = acl_get_file (src_path, ACL_TYPE_DEFAULT);

returns an acl, but sets errno to ENODATA.
Then when setting it,

line 164:       if (acl_set_file (dst_path, ACL_TYPE_DEFAULT, acl))

errno is set to EINVAL.

The acl_get_file code has:
        retval = getxattr(path_p, name, ext_acl_p, size_guess);
        /* ... */
        else if (retval == 0 || errno == ENOATTR) {
                if (type == ACL_TYPE_ACCESS) {
                        struct stat st;

                        if (stat(path_p, &st) == 0)
                                return acl_from_mode(st.st_mode);
                        else
                                return NULL;
                } else
                        return acl_init(0);
        } /* ...*/
In this situation, getxattr fails with ENOATTR, and so acl_get_file
returns acl_init(0) while errno is still set to ENOATTR (== ENODATA).

This is not in accordance with the 1003.1e spec (withdrawn though it
may be) which says it should not return an error in this case, but
instead return an empty acl list (it does return an empty acl list,
but errno is set in this code.)
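
To illustrate the errno behaviour described above, here is a minimal sketch that does not use libacl at all. The names sim_acl, sim_get_default_acl and sim_count_default_entries are hypothetical stand-ins invented for this example; the point is only that a caller should judge success by the return value and treat a leftover errno as stale.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an ACL object; this is not the libacl type. */
struct sim_acl {
    int entry_count;
};

/* Simulates the behaviour described in the comment above: no default ACL is
 * found, a valid empty ACL is returned, but errno is left set by the failed
 * attribute read. ENODATA is used because the thread notes that
 * ENOATTR == ENODATA. */
struct sim_acl *sim_get_default_acl(struct sim_acl *storage)
{
    assert(storage != NULL);
    errno = ENODATA;            /* left over from the internal attribute read */
    storage->entry_count = 0;   /* an empty ACL, which counts as success */
    return storage;
}

/* Judges success by the return value only.
 * Returns the number of entries, or -1 if the lookup failed. */
int sim_count_default_entries(void)
{
    struct sim_acl storage;
    struct sim_acl *acl = sim_get_default_acl(&storage);

    if (acl == NULL)
        return -1;              /* only on this path is errno meaningful */

    errno = 0;                  /* discard the stale value before later calls */
    return acl->entry_count;
}
```

Calling sim_count_default_entries() in this sketch yields 0 entries rather than an error, which is the outcome the spec asks for when no default ACL exists.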

Moving on, acl_set_file(...) returns an error.
This is fair, because according to the spec, the acl passed to acl_set_file
must be valid - which implies it must contain the user, group and other
entries. Being an empty acl list, it is thus invalid.

Following the chain into the kernel (erk!) it does indeed check to see if
the acl has zero entries before writing it, and returns EINVAL if so
(linux/fs/xfs/xfs_acl.c:87, in acl_ext_attr_to_xfs()) (As far as I can
tell though, it doesn't check for general validity of the acl in the sense
of acl_valid().)
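
The difference between the two checks mentioned above can be sketched as follows. This is an illustration only: the enum sim_tag and both function names are hypothetical and are not the libacl or kernel definitions. One function mimics the "reject only an empty list" test, the other a stricter test that requires the owner, group and other entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag values for this sketch; not the libacl constants. */
enum sim_tag { SIM_USER_OBJ, SIM_GROUP_OBJ, SIM_OTHER, SIM_NAMED_USER };

/* Kernel-style check as described in the thread: only an ACL with zero
 * entries is rejected. Returns 1 if accepted, 0 if rejected. */
int sim_kernel_accepts(const enum sim_tag *tags, size_t count)
{
    (void)tags;                 /* the entry contents are not inspected */
    return count != 0;
}

/* Stricter check in the spirit of acl_valid(): the owner, group and other
 * entries must all be present. Returns 1 if they are, 0 otherwise. */
int sim_has_required_entries(const enum sim_tag *tags, size_t count)
{
    int user = 0, group = 0, other = 0;
    size_t i;

    assert(tags != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (tags[i] == SIM_USER_OBJ)
            user = 1;
        else if (tags[i] == SIM_GROUP_OBJ)
            group = 1;
        else if (tags[i] == SIM_OTHER)
            other = 1;
    }
    return user && group && other;
}
```

An empty list fails both checks, while a list holding only a named-user entry passes the first and fails the second, which is the gap noted above.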

The following patch to lib/acl.c in fileutils fixes this error,
as well as fixing another bug due to the same code, where if a directory
'a' had a default acl, and had a subdirectory 'c' which had no default acl,
the command cp -a c c2 within the directory 'a' would create a subdirectory
'c2' with inherited default acls.

START-OF-PATCH
*** lib/acl.c.orig      Mon Jul 15 02:24:56 2002
--- lib/acl.c   Mon Jul 15 02:25:12 2002
***************
*** 155,158 ****
--- 155,160 ----
    if (S_ISDIR (mode))
      {
+       acl_entry_t dummy;
+ 
        acl = acl_get_file (src_path, ACL_TYPE_DEFAULT);
        if (acl == NULL)
***************
*** 162,174 ****
        }
  
!       if (acl_set_file (dst_path, ACL_TYPE_DEFAULT, acl))
!       {
!         error (0, errno, _("preserving permissions for %s"),
!                quote (dst_path));
          acl_free(acl);
          return -1;
        }
!       else
!         acl_free(acl);
      }
    return 0;
--- 164,196 ----
        }
  
!       switch (acl_get_entry (acl,ACL_FIRST_ENTRY,&dummy))
!         {
!       case -1:
!         error (0, errno, "%s", quote (src_path));
          acl_free(acl);
          return -1;
+ 
+       case 0:
+         /* empty acl */
+         if (acl_delete_def_file (dst_path))
+           {
+             error (0, errno, _("preserving permissions for %s"),
+                    quote (dst_path));
+             acl_free(acl);
+             return -1;
+           }
+         break;
+             
+       default:
+           if (acl_set_file (dst_path, ACL_TYPE_DEFAULT, acl))
+           {
+             error (0, errno, _("preserving permissions for %s"),
+                    quote (dst_path));
+             acl_free(acl);
+             return -1;
+           }
        }
! 
!       acl_free(acl);
      }
    return 0;
END-OF-PATCH
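
For readers who find the context diff hard to follow, the decision the patch makes can be reduced to the sketch below. It is not the fileutils code: sim_action and sim_choose_default_acl_action are hypothetical names, and the input is assumed to follow the acl_get_entry() convention of -1 for an error, 0 for "no entry" and 1 for "entry found".

```c
#include <assert.h>

/* Hypothetical outcome labels for this sketch. */
enum sim_action { SIM_FAIL, SIM_DELETE_DEFAULT, SIM_SET_DEFAULT };

/* Maps the result of asking for the first entry of the source's default ACL
 * onto what should be done to the destination directory. */
enum sim_action sim_choose_default_acl_action(int first_entry_result)
{
    assert(first_entry_result >= -1 && first_entry_result <= 1);

    switch (first_entry_result) {
    case -1:
        return SIM_FAIL;            /* report the error and give up */
    case 0:
        return SIM_DELETE_DEFAULT;  /* empty ACL: remove any default ACL */
    default:
        return SIM_SET_DEFAULT;     /* non-empty ACL: copy it across */
    }
}
```

The middle case is what avoids handing an empty ACL to acl_set_file, and it also covers the inherited-default-ACL case mentioned before the patch.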

On a related note, there is an error in the acl-20020330 library in
acl_extended_file.c; in this version of xfs at least, xfs_acl_vget()
in linux/fs/xfs/xfs_acl.c returns a larger size than required to hold
the acl (it does this on purpose) and so the size-based checking in
acl_extended_file() does not work. This is the reason why ls -l marks
everything with a '+' on XFS file systems with the acl USE option.
Perhaps getxattr() should be returning ENOATTR instead if there's no acl,
but that's not what it does at the moment.

It *does* however return the actual size when given a non-null pointer
and size. In fact, it should return E2BIG if the offered size is
insufficient. If a non-zero size is supplied, getxattr() also returns ENOATTR
as it possibly should in the absence of an acl.

START-OF-PATCH
*** cmd/acl/libacl/acl_extended_file.c.orig     Fri Mar  1 10:08:36 2002
--- cmd/acl/libacl/acl_extended_file.c  Mon Jul 15 03:11:26 2002
***************
*** 31,47 ****
  acl_extended_file(const char *path_p)
  {
!       int base_size = sizeof(acl_ea_header) + 3 * sizeof(acl_ea_entry);
        int retval;
  
!       retval = getxattr(path_p, ACL_EA_ACCESS, NULL, 0);
!       if (retval < 0 && errno != ENOATTR)
!               return -1;
!       if (retval > base_size)
!               return 1;
!       retval = getxattr(path_p, ACL_EA_DEFAULT, NULL, 0);
!       if (retval < 0 && errno != ENOATTR)
!               return -1;
!       if (retval >= base_size)
!               return 1;
        return 0;
  }
--- 31,50 ----
  acl_extended_file(const char *path_p)
  {
!       char buf[sizeof(acl_ea_header) + 3 * sizeof(acl_ea_entry)];
        int retval;
  
!       retval = getxattr(path_p, ACL_EA_ACCESS, buf, sizeof buf);
!       if (retval < 0)
!               if (errno == E2BIG)
!                       return 1;
!               else if (errno != ENOATTR)
!                       return -1;
! 
!       retval = getxattr(path_p, ACL_EA_DEFAULT, buf, sizeof buf);
!       if (retval < 0)
!               if (errno == E2BIG)
!                       return 1;
!               else if (errno != ENOATTR)
!                       return -1;
        return 0;
  }
END-OF-PATCH

If such a patch is applied, the equivalent one with fgetxattr() should
probably be applied to acl_extended_fd.c
Comment 21 Sam Yates 2002-07-14 12:54:07 UTC
Just a note: current XFS code has getxattr() return E2BIG when size is too small
(but non-zero.) The manpage says it should return ERANGE. These clearly are
inconsistent. Also, the manpage is unclear as to what behaviour should happen
when a zero size is passed and there is no such attribute on the file: should it
return -1 and set ENOATTR, or should it return a size sufficient to hold the
attribute should it have existed? The latter is what XFS currently does for the
attribute holding acls, but it's not clear if this behaviour is correct. Haven't
checked to see if this is consistent with the absence of other sorts of
attribute on the file.
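
Given the E2BIG/ERANGE inconsistency, a caller probing with a fixed-size buffer probably wants to accept both. The sketch below shows that idea with a hypothetical callback (sim_getxattr_fn) standing in for getxattr(), so it does not depend on any particular filesystem; all sim_* names are invented for this example, and ENODATA stands in for ENOATTR as noted earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Loosely modelled on getxattr(): fills buf and returns the attribute size,
 * or returns -1 with errno set. A hypothetical stand-in for this sketch. */
typedef long (*sim_getxattr_fn)(void *buf, size_t size);

/* Returns 1 if the attribute does not fit into a buffer of minimal_size
 * bytes (so it holds more than a minimal ACL), 0 if it fits or is absent,
 * and -1 on any other error. */
int sim_is_extended(sim_getxattr_fn read_attr, size_t minimal_size)
{
    char buf[256];
    long ret;

    assert(read_attr != NULL && minimal_size <= sizeof buf);

    ret = read_attr(buf, minimal_size);
    if (ret >= 0)
        return 0;                        /* fits in the minimal buffer */
    if (errno == E2BIG || errno == ERANGE)
        return 1;                        /* too large: accept either errno */
    if (errno == ENODATA)
        return 0;                        /* no such attribute */
    return -1;
}

/* Example callback: an attribute of 40 bytes that reports ERANGE when the
 * offered buffer is too small. */
long sim_attr_of_40_bytes(void *buf, size_t size)
{
    (void)buf;
    if (size < 40) {
        errno = ERANGE;
        return -1;
    }
    return 40;
}

/* Example callback: the attribute does not exist. */
long sim_attr_missing(void *buf, size_t size)
{
    (void)buf;
    (void)size;
    errno = ENODATA;
    return -1;
}
```

With a real getxattr() the minimal size would be the header plus three entries, as in the patch above; the sketch leaves that value to the caller.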
Comment 22 Michael Cohen (RETIRED) gentoo-dev 2002-07-16 21:45:39 UTC
this is a bug in fileutils.
Comment 23 Sam Yates 2002-07-16 21:53:45 UTC
I've been running my system with the patches above now for a couple of days.
Both cp and ls seem to be doing the right thing now. Would it be possible to
examine those patches and, if they are fine, update the acl and fileutils packages?
Comment 24 Sam Yates 2002-07-16 21:57:11 UTC
Oh, actually the patch I'm using for libacl/acl_extended_file.c also checks for
ERANGE (as per the manpage) as well as E2BIG; I also applied the equivalent
patch to acl_extended_fd.c
Comment 25 Dennis Conrad 2002-07-24 18:44:44 UTC
*** Bug 5525 has been marked as a duplicate of this bug. ***
Comment 26 Chris Sorisio 2002-07-25 18:45:44 UTC
*** Bug 5587 has been marked as a duplicate of this bug. ***
Comment 27 Chris Sorisio 2002-07-25 18:49:16 UTC
Does an ebuild exist that would allow me to apply the patches you refer to so I 
may test your resolution on my system?
Comment 28 Nicholas Wourms 2002-07-30 11:49:57 UTC
I agree, this *needs* to be integrated ASAP.  Or at least provide a testing
version  of fileutils.  I think this is a *definite* showstopper for those of us
using XFS/ACL.
Comment 29 Disconnect 2002-08-01 10:43:45 UTC
Created attachment 2725 [details, diff]
Patch to fs/xfs/xfs_acl.c from SGI cvs to fix the cp -pr issues

After much searching I found info in the SGI CVS log about this problem; this
was the next cvs revision of xfs_acl.c that fixed it.  It has been running on
my laptop (used fairly constantly) since Jul 2 without any problems, as well as
a friend's "I want to play with Gentoo and I like XFS" workstation.

As I recall the root cause is a possible bug in libacl.  More info at 
http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.4-xfs/linux/fs/xfs/xfs_acl.c
(this patch is CVS revision 1.23)
Comment 30 Nicholas Wourms 2002-08-04 08:37:50 UTC
I can confirm that Disconnect's patch seems to have resolved the issue.  It did
require a little bit of hand patching for the mjc sources, but otherwise it was
fairly routine.
Comment 31 Roman Majer 2002-08-29 04:21:42 UTC
I report the same error with ghostscript ebuild (cp -a)... on fresh 1.4 install
with XFS and Posix ACL switched on in kernel...
Without ACL in kernel all is ok...
Comment 32 Jon Nelson (RETIRED) 2002-09-04 07:21:42 UTC
*** Bug 7423 has been marked as a duplicate of this bug. ***
Comment 33 Brandon Low (RETIRED) gentoo-dev 2002-09-06 17:49:22 UTC
So, this is a fileutils bug that needs a version update, and/or it is fixed in
recent xfs versions; am I on the right page? If this is still a problem with
xfs-sources-2.4.19-r1, please re-open the bug.
Comment 34 Sascha Silbe 2002-09-07 03:35:16 UTC
cube root # emerge =sys-kernel/xfs-sources-2.4.19-r1
Calculating dependencies   

emerge: all ebuilds that could satisfy "=sys-kernel/xfs-sources-2.4.19-r1" have been masked.
cube root # grep xfs-sources /usr/portage/profiles/package.mask 
# New xfs-sources, please test.
>=sys-kernel/xfs-sources-2.4.19

I cannot find where it is masked, so I cannot test it. :(

Comment 35 Joachim Blaabjerg (RETIRED) gentoo-dev 2002-09-07 05:24:02 UTC
Sascha: that _is_ the mask. >= means "higher than or equal to", i.e. everything newer
than or matching xfs-sources-2.4.19 will be masked.
Comment 36 Sascha Silbe 2002-09-07 06:35:02 UTC
Oops, of course. I was interpreting it like /etc/make.profile/packages, which is exactly the opposite. :)
Just tried xfs-sources-2.4.19-r1 and had serious trouble. Mount choked at boot time with a kernel error and hung on the manual invocation. When trying to sync filesystems with <SysRq>+<s>, even the kernel hung.
The kernel config is basically the same as for xfs-sources-2.4.18. I used "make oldconfig" and entered some safe values for the new options.
I'm using XFS over LVM for all filesystems except /boot (ext2 on a primary partition) and /usr/vice/cache (ext2 over LVM).
Comment 37 Sascha Silbe 2002-09-30 19:53:53 UTC
I'm still getting a kernel OOPS as soon as mount tries to mount the first XFS partition:

XFS mounting filesystem lvm(58,0)
Unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
c022c4bf
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c022c4bf>]    Not tainted
EFLAGS: 00010256
eax: 00000000   ebx: d6b96240   ecx: d6b96040   edx: 00000000
esi: d6af4017   edi: 00002a05   ebp: 00000000   esp: d6bf7d54
ds: 0018   es: 0018   ss: 0018
Process mount (pid: 2831, stackpage=d6bf7000)
Stack: 00000000 00000000 00000000 00000246 00000000 00000000 d6b79c20 00000000 
       00000000 d6b96240 d6af4017 00002a05 d6b79c00 c022c702 00000000 00000000 
       00000000 00000200 00002a05 d6b96240 00000000 00000000 d6b79c20 d6af4017 
Call Trace:    [<c022c702>] [<c021876d>] [<c0221cde>] [<c02220c9>] [<c0236fc3>]
  [<c0133e38>] [<c01410ba>] [<c0141a83>] [<c015309e>] [<c014102c>] [<c0141dfd>]
  [<c0154273>] [<c0154590>] [<c01543d9>] [<c01549c1>] [<c0108ec7>]

Code: 8b 45 08 0f b7 40 10 89 04 24 e8 b2 f7 ff ff 89 44 24 18 8d 


I've completely rebuilt the kernel after "make mrproper" to be sure there was nothing messed up during the compile.

Comment 38 Brandon Low (RETIRED) gentoo-dev 2002-10-01 08:15:21 UTC
wowzer, lemme get mjc back in on this one, apparently the issues with
xfs-sources-r1 are LVM related and I don't use it and know little about it.
Comment 39 Sascha Silbe 2002-10-08 05:51:22 UTC
I'm now running a vanilla 2.4.19 kernel patched with ftp://oss.sgi.com/projects/xfs/download/patches/2.4.19/xfs-2.4.19-all-i386.bz2 (+freeswan+loop-aes, but that does not matter). This one works fine, no kernel panics after all.
It seems like one of the additional patches from http://gentoo.lostlogicx.com/patches-2.4.19-xfs-r1.tar.bz2 is causing the problems. What's in there?
Comment 40 Brandon Low (RETIRED) gentoo-dev 2002-10-08 08:00:08 UTC
xfs-sources-2.4.19-r2 hit Rsync recently, can you test that (assuming you don't
use grsecurity as grsecurity is currently broken in that patch)
Comment 41 Brandon Low (RETIRED) gentoo-dev 2002-10-08 08:03:11 UTC
and sorry to request more testing... but it is a whole new version of xfs and
stuff... I'm currently working on resolving the grsecurity issue...
Comment 42 Sascha Silbe 2002-10-31 11:55:25 UTC
I've now tested xfs-sources-2.4.19-r2. It seems to work fine (i.e. I could boot it properly and use w3m to read some pages on the console). Because my current kernel includes FreeSWAN 1.98b + FreeSWAN-alg-0.8.0 and loop-AES, I'll keep it instead of changing back to xfs-sources, so I cannot say anything about its long-term reliability. When 2.4.20 is out, I'll come back to you. :)
Thanks!