Bug 188249 - sys-apps/findutils-4.3.8: Segmentation fault (SIGSEGV)
Summary: sys-apps/findutils-4.3.8: Segmentation fault (SIGSEGV)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Duplicates: 187241
Depends on:
Blocks:
 
Reported: 2007-08-09 18:01 UTC by Martin von Gagern
Modified: 2007-08-26 13:14 UTC (History)
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge.info,4.03 KB, text/plain)
2007-08-09 18:01 UTC, Martin von Gagern
findutils-4.3.8-overflow.patch (findutils-4.3.8-overflow.patch,271 bytes, patch)
2007-08-09 19:54 UTC, Harald van Dijk (RETIRED)

Description Martin von Gagern 2007-08-09 18:01:04 UTC
I've got here a strange case of find dying on me with SIGSEGV after printing one line of output. I've located the following factors that seem to play a role:

- LC_MESSAGES=C and LANG=de_DE.utf8
- Compiled with -O2
- --prefix=/usr so it can find installed files
- find called with arguments -type f -ls
- at least two files in current directory

The problem does not occur if I set either LC_MESSAGES or LC_CTYPE to de_DE, or when I compile with -O0, or when I configure without arguments and don't install that build, or when I drop either argument to find.

I've tried to debug the issue. I traced the actual page fault to a scenario where, due to a wrong address, a function pointer gets incremented instead of a simple integer counter. The address is four bytes off, suggesting another off-by-one error, but I couldn't trace that. It's very difficult to trace the actual source of the error. Might possibly have security implications.
Comment 1 Martin von Gagern 2007-08-09 18:01:51 UTC
Created attachment 127373 [details]
emerge --info
Comment 2 Martin von Gagern 2007-08-09 19:31:54 UTC
These might be steps to reproduce this issue:

tar xf /usr/portage/distfiles/findutils-4.3.8.tar.gz
cd findutils-4.3.8
./configure --prefix=/usr CFLAGS="-O2 -g"
make
cd ..
mkdir sandbox
cd sandbox
touch a b
env -i LANG=de_DE.utf8 LC_MESSAGES=C ../findutils-4.3.8/find/find -type f -ls

Please tell me whether these steps work for you as well. If your arch and gcc are the same, I'd expect they should, but I'm not sure.

The expression will be parsed to a binary tree that looks like this in prefix notation: "-a" ( "-a" ( NULL, "-type f" ( NULL, NULL ) ), "-ls" ( NULL, NULL ) )
You can get this from "find -D tree -type f -ls". My primary breakpoint while debugging this issue was apply_predicate. There you have a pointer p denoting the current predicate, and from its elements you can get the whole subtree.
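A minimal sketch of the prefix notation used above, assuming a simplified node with only a name and two children. The names `struct node`, `format_prefix` and `format_example_tree` are hypothetical and are not taken from findutils, whose real struct predicate carries many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a predicate tree node. */
struct node {
    const char *name;
    const struct node *left;
    const struct node *right;
};

/* Append s to the NUL-terminated string in out without exceeding cap bytes. */
static void append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out);
    if (used < cap)
        strncat(out, s, cap - used - 1);
}

/* Append the subtree at n in prefix notation: "name" ( left, right ). */
static void format_prefix(const struct node *n, char *out, size_t cap)
{
    if (n == NULL) {
        append(out, cap, "NULL");
        return;
    }
    append(out, cap, "\"");
    append(out, cap, n->name);
    append(out, cap, "\" ( ");
    format_prefix(n->left, out, cap);
    append(out, cap, ", ");
    format_prefix(n->right, out, cap);
    append(out, cap, " )");
}

/* Build the tree quoted in this comment and format it into out. */
static void format_example_tree(char *out, size_t cap)
{
    static const struct node type_f = { "-type f", NULL, NULL };
    static const struct node ls     = { "-ls", NULL, NULL };
    static const struct node inner  = { "-a", NULL, &type_f };
    static const struct node root   = { "-a", &inner, &ls };

    assert(cap > 0);
    out[0] = '\0';
    format_prefix(&root, out, cap);
}
```

Calling `format_example_tree` with a buffer of 128 bytes reproduces the string quoted above. Walking such a tree from the node p seen in apply_predicate is how the whole subtree can be inspected in a debugger.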

I found out that after printing ./a, the expression "++(p->perf.successes);" in apply_predicate used a wrong address and, instead of incrementing that counter, incremented pred_ls, which is the first element of struct predicate. The problem is that the pointer p used to address this counter is stored in ebx in apply_predicate, and somewhere in that register saving stuff I got lost.

Next I tried the other direction, to find out in what way the factors described above influence the problem. I ltraced two calls and replaced all hexadecimal numbers (which includes all those addresses that change with each run) so they match. Diffing the result showed a difference in the setlocale result string, in one bsearch call which I don't understand yet and then the SIGSEGV itself caused by an fprintf(NULL, ...). So I guess it's going on somewhere in the library itself, perhaps somewhere where the results from localeconv are used.
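The address masking step described above could be sketched roughly as follows, so that two traces differ only where the behaviour differs. This is an illustration only: `mask_hex` and the placeholder text `0xADDR` are hypothetical choices, not the tooling actually used for this report.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, replacing every "0x<hex digits>" token with "0xADDR".
   Writes at most cap bytes including the terminating NUL.
   Returns 0 on success, -1 if dst is too small. */
static int mask_hex(const char *src, char *dst, size_t cap)
{
    static const char mask[] = "0xADDR";
    const size_t mask_len = sizeof mask - 1;
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        if (src[0] == '0' && src[1] == 'x' &&
            isxdigit((unsigned char)src[2])) {
            /* Skip the whole hexadecimal token in the input. */
            src += 2;
            while (isxdigit((unsigned char)*src))
                src++;
            /* Keep one byte free for the terminating NUL. */
            if (out + mask_len >= cap)
                return -1;
            memcpy(dst + out, mask, mask_len);
            out += mask_len;
        } else {
            if (out + 1 >= cap)
                return -1;
            dst[out++] = *src++;
        }
    }
    if (out >= cap)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

Running every line of both traces through such a filter before diffing leaves only genuine differences, such as a changed return string or an extra call.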
Comment 3 Harald van Dijk (RETIRED) gentoo-dev 2007-08-09 19:52:42 UTC
Could you please try editing lib/listfile.c and in the function list_file change modebuf's size from 11 to 12? strmode as provided by gnulib initialises elements 0 through 11, so a buffer of size 11 is insufficient. I'm not sure this is what's causing your specific problem though.
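A minimal sketch of the size arithmetic behind this suggestion, assuming a routine that, like the strmode described here, writes indices 0 through 11: ten mode characters, one space, and the terminating NUL. `fill_mode_string`, `MODE_STRING_SIZE` and the sample mode text are hypothetical and are not gnulib code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes written by a strmode-style routine: indices 0 through 11. */
enum { MODE_STRING_SIZE = 12 };

/* Hypothetical stand-in for strmode that checks the buffer size instead of
   silently writing one byte past an 11-byte buffer.
   Returns 0 on success, -1 if the buffer is too small. */
static int fill_mode_string(char *buf, size_t bufsize)
{
    static const char sample[] = "-rw-r--r--";  /* made-up 10-character mode */

    assert(buf != NULL);
    if (bufsize < MODE_STRING_SIZE)
        return -1;              /* char modebuf[11] would end up here */

    memcpy(buf, sample, 10);    /* indices 0..9: type and permission bits */
    buf[10] = ' ';              /* index 10: trailing space */
    buf[11] = '\0';             /* index 11: the byte that needs size 12 */
    return 0;
}
```

With an 11-byte buffer the write to index 11 lands on whatever the compiler placed next on the stack, which is why the visible symptom can depend on optimisation level and environment.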
Comment 4 Harald van Dijk (RETIRED) gentoo-dev 2007-08-09 19:54:00 UTC
Created attachment 127395 [details, diff]
findutils-4.3.8-overflow.patch
Comment 5 Martin von Gagern 2007-08-09 20:07:41 UTC
Works here as well. Did you find the impact of my environment, or was I just unlucky enough to hit this, and there is no real connection?

How about this line "modebuf[10] = '\0';"? In the gnulib sources in that tarball that character will always be a space, corresponding to the space added by the fprintf in list_file. So right now it doesn't make a difference. I'm not sure about the relation between findutils and gnulib, whether they will stay in touch and, if not, which one is more likely to introduce a + for files with ACLs.

Should this issue be reported upstream as well? Will you do so?
Comment 6 Harald van Dijk (RETIRED) gentoo-dev 2007-08-09 20:27:16 UTC
(In reply to comment #5)
> Works here as well.

I'm assuming you mean it still doesn't work :)

> Did you find the impact of my environment, or was I just
> unlucky enough to hit this, and there is no real connection?

It's very likely that you were just unlucky enough to hit this.

> How about this line "modebuf[10] = '\0';"? In the gnulib sources in that
> tarball that character will always be a space, corresponding to the space added
> by the fprintf in list file. So right now it doesn't make a difference.

By calling strmode, an extra byte on the stack will be set to zero. With a specific compiler version, a specific optimisation level and specific versions of all used headers, that could be an important byte. The problem is often enough something as simple as that, and as hard to trace as your own problem.

> I'm not
> sure about the relation between findutils and gnulib, whether they will stay in
> touch and if not which one is more likely to introduce a + for files with ACLs.
> 
> Should this issue be reported upstream as well?

If it is a bug in findutils, then yes, but it's good to rule out a rare bug in the compiler and other parts of the environment first.

> Will you do so?

I'm not one of the maintainers of the Gentoo findutils package (I just have an interest in it); normally, for system packages, the issue will be reported upstream if/when the bug can be confirmed.
Comment 7 Martin von Gagern 2007-08-09 20:39:02 UTC
(In reply to comment #6)
> I'm assuming you mean it still doesn't work :)

???
That patch works as it fixes my local build, and with that fix find works as it doesn't crash any more. So why would I mean it doesn't work?

> If it is a bug in findutils, then yes, but it's good to rule out a rare bug in
> the compiler and other parts of the environment first.

I guess we can be pretty sure that's a bug in findutils. Once you've looked at it, it's pretty obvious that's going to break one day, and you only wonder why it didn't do so earlier.

> I'm not one of the maintainers of the Gentoo findutils package (I just have an
> interest in it); normally, for system packages, the issue will be reported
> upstream if/when the bug can be confirmed.

On the other hand, you found the solution (and pretty quick at that, considering how long I debugged that trying to write a better report. How did you do it?), so the honour of presenting that fix to the upstream devs should imho be yours. :-)
Comment 8 Harald van Dijk (RETIRED) gentoo-dev 2007-08-09 20:59:51 UTC
I thought you were saying your findutils still segfaulted in the rest of your message, but I must have been misreading. I'm glad it's working.

> How did you do it?

I did little more than check how it ran when compiled with -fmudflap.

@base-system: I'd really prefer it if one of you would contact upstream about this.
Comment 9 SpanKY gentoo-dev 2007-08-25 15:19:23 UTC
Upstream has this fixed, and I've merged that into 4.3.8-r1.
Comment 10 SpanKY gentoo-dev 2007-08-26 13:14:04 UTC
*** Bug 187241 has been marked as a duplicate of this bug. ***