Bug 475284

Summary: app-shells/bash-4.2_p39-r1: "exec 3>&2" causes illegal instruction (seen with app-admin/eselect-1.3.5)
Product: Gentoo Linux Reporter: MATSUI Tetsushi <VED03370>
Component: [OLD] Core system Assignee: Gentoo's Team for Core System packages <base-system>
Status: CONFIRMED ---    
Severity: normal CC: eselect, nojspam, prefix
Priority: Normal    
Version: unspecified   
Hardware: x86   
OS: OS X   
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 451150    
Attachments: emerge --info
env -i bash eselect
env -i bash -v eselect
Patch (reverts commit 12e3ecb)
Alternative patch for stderr redirection
Updated patch that tests for bash version
A workaround for emerging zsh-5.0.2-r3 on prefix OS X
diagnostic report

Description MATSUI Tetsushi 2013-06-30 05:50:14 UTC
app-admin/eselect-1.3.5 dies with the message "Illegal instruction" when it is run with an empty environment.

Reproducible: Always

Steps to Reproduce:
1. emerge =eselect-1.3.5
2. env -i /path/to/eselect

Actual Results:  
Illegal instruction

Expected Results:  
Show help message.

I have encountered this on Gentoo Prefix OS X through a script invoked by PORTAGE_ELOG_COMMAND, which does not provide any environment variables.  eselect-1.3.4 works without any environment variables; at least it can show help.
Comment 1 Ulrich Müller gentoo-dev 2013-06-30 06:56:00 UTC
I cannot reproduce this.

Could you please post:
- emerge --info
- output (stdout and stderr) of "env -i bash eselect"
Comment 2 MATSUI Tetsushi 2013-06-30 09:55:46 UTC
Created attachment 352294 [details]
emerge --info
Comment 3 MATSUI Tetsushi 2013-06-30 10:00:42 UTC
Created attachment 352300 [details]
env -i bash eselect
Comment 4 Ulrich Müller gentoo-dev 2013-06-30 10:55:56 UTC
(In reply to MATSUI Tetsushi from comment #3)
> Created attachment 352300 [details]
> env -i bash eselect

Sorry, I meant "env -i bash -v eselect".
Comment 5 MATSUI Tetsushi 2013-06-30 11:07:31 UTC
Created attachment 352306 [details]
env -i bash -v eselect
Comment 6 Ulrich Müller gentoo-dev 2013-06-30 14:11:30 UTC
So the problem disappears when you turn on debug output in bash.

Are core dumps enabled ("ulimit -c unlimited")? If yes, does the "illegal instruction" cause a core dump, and from what program?
Comment 7 MATSUI Tetsushi 2013-06-30 15:05:24 UTC
(In reply to Ulrich Müller from comment #6)
> So the problem disappears when you turn on debug output in bash.
> 
> Are core dumps enabled ("ulimit -c unlimited")? If yes, does the "illegal
> instruction" cause a core dump, and from what program?

With core dumps enabled, two core files are created, both from bash.

The first one (32321):
Core was generated by `/Users/tetsushi/Gentoo/bin/bash'.
Reading symbols for shared libraries . done
Reading symbols for shared libraries .............. done
#0  0x901ff862 in mbrlen ()
(gdb) bt
#0  0x901ff862 in mbrlen ()
#1  0x000058f4 in set_line_mbstate ()
#2  0x00007e12 in shell_getc ()
#3  0x0000a603 in read_token ()
#4  0x0000dcd8 in yyparse ()
#5  0x00004de7 in parse_command ()
#6  0x0005c0b0 in parse_and_execute ()
#7  0x0005b9c9 in _evalfile ()
#8  0x0005bbcd in source_file ()
#9  0x00064deb in source_builtin ()
#10 0x000178ba in execute_builtin ()
#11 0x0001b31f in execute_simple_command ()
#12 0x00019063 in execute_command_internal ()
#13 0x0001be44 in execute_command ()
#14 0x0001cc88 in execute_connection ()
#15 0x00018f50 in execute_command_internal ()
#16 0x0001be44 in execute_command ()
#17 0x0000509e in reader_loop ()
#18 0x00004b5c in main ()

The second one (32325):
Core was generated by `/Users/tetsushi/Gentoo/bin/bash'.
Reading symbols for shared libraries . done
Reading symbols for shared libraries ............. done
#0  0x00007dda in shell_getc ()
(gdb) bt
#0  0x00007dda in shell_getc ()
#1  0x0000a603 in read_token ()
#2  0x0000dcd8 in yyparse ()
#3  0x00004de7 in parse_command ()
#4  0x0005c0b0 in parse_and_execute ()
#5  0x0005b9c9 in _evalfile ()
#6  0x0005bbcd in source_file ()
#7  0x00064deb in source_builtin ()
#8  0x000178ba in execute_builtin ()
#9  0x0001b31f in execute_simple_command ()
#10 0x00019063 in execute_command_internal ()
#11 0x0001be44 in execute_command ()
#12 0x0001cc88 in execute_connection ()
#13 0x00018f50 in execute_command_internal ()
#14 0x0001be44 in execute_command ()
#15 0x0000509e in reader_loop ()
#16 0x00004b5c in main ()

Do these help? Or do you need other kinds of info?
Comment 8 Ulrich Müller gentoo-dev 2013-06-30 17:29:59 UTC
A bash script shouldn't be able to cause an illegal instruction in the shell, so my guess is that it's a bug in bash.

CCing Prefix team: Any ideas?
Comment 9 Fabian Groffen gentoo-dev 2013-07-01 08:34:58 UTC
I recall seeing those before, which was due to an upgraded system.  I'm wondering if this might be the case here also.
Comment 10 MATSUI Tetsushi 2013-07-03 15:21:02 UTC
(In reply to Fabian Groffen from comment #9)
> I recall seeing those before, which was due to an upgraded system.  I'm
> wondering if this might be the case here also.

I haven't upgraded my system recently. It's still 10.6.
Comment 11 Ulrich Müller gentoo-dev 2013-07-05 19:35:16 UTC
Created attachment 352690 [details, diff]
Patch (reverts commit 12e3ecb)

There are only a few changes between eselect 1.3.4 and 1.3.5. The only one that looks suspicious to me is this: <http://git.overlays.gentoo.org/gitweb/?p=proj/eselect.git;a=commit;h=12e3ecb19d311b888abc118d806fee635602e3ee>

Can you check whether the attached patch makes the problem disappear?
Comment 12 MATSUI Tetsushi 2013-07-06 00:58:08 UTC
Yes, the patch makes the problem disappear.
Thank you!

PS
I am now sure that this bug is OS X specific.
I ran valgrind to see what happens when bash dies, and
it always stops at the same address in libSystem.B.dylib.
Comment 13 Ulrich Müller gentoo-dev 2013-07-06 06:25:13 UTC
Created attachment 352704 [details, diff]
Alternative patch for stderr redirection

Could you do one more test for me, please, and check whether the attached alternative patch also fixes the problem?
Comment 14 Fabian Groffen gentoo-dev 2013-07-06 07:15:01 UTC
That patch looks sensible to me. We can't just assume fd 3 is not in use; letting bash assign a free one is much more portable and safe.
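
The concern about fd 3 possibly being in use can be checked from inside a script. The following is a minimal sketch, not taken from any of the attached patches; the helper name fd_is_open is hypothetical.

```shell
#!/usr/bin/env bash

# fd_is_open N: succeed if descriptor N is currently open in this shell.
# Duplicating a closed descriptor fails; the outer 2>/dev/null hides the
# resulting error message. Intended for descriptors other than 2.
fd_is_open() {
    { true >&"$1"; } 2>/dev/null
}

if fd_is_open 3; then
    echo "fd 3 is already in use"
else
    echo "fd 3 is free"
fi
```

Which branch is taken depends on how the script was started, so no particular output should be expected.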
Comment 15 Ulrich Müller gentoo-dev 2013-07-06 07:58:41 UTC
Created attachment 352706 [details, diff]
Updated patch that tests for bash version

This will work only for >=bash-4.1 though. Updated patch with bash version test is attached.

The crucial question is of course if it fixes (or rather, works around) the bug on OS X.
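
For readers without access to the attachment, the idea of a version-gated redirection can be sketched as follows. This is an illustration under assumptions, not the attached patch itself; the variable name log_fd is hypothetical.

```shell
#!/usr/bin/env bash

# Keep a copy of stderr on a spare descriptor.
# log_fd is a hypothetical variable name used only in this sketch.
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 1) )); then
    # bash >= 4.1: let the shell pick an unused descriptor and store it in log_fd
    exec {log_fd}>&2
else
    # older bash: fall back to a fixed descriptor number
    log_fd=3
    exec 3>&2
fi

# Write through the saved descriptor, wherever it ended up.
echo "diagnostic message" >&"${log_fd}"
```

Older bash versions parse the {log_fd} form without complaint but would treat it as a command word, which is why it sits in a branch they never execute.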
Comment 16 Fabian Groffen gentoo-dev 2013-07-06 08:28:50 UTC
Doesn't this show on *BSD?  I recall something about fd-3, or was it 7?

I'd like to know what FD is assigned in case bash does it though.  I agree bash shouldn't crash here.
Comment 17 Ulrich Müller gentoo-dev 2013-07-06 10:42:56 UTC
(In reply to Fabian Groffen from comment #16)
> I'd like to know what FD is assigned in case bash does it though.

With bash 4.2 I get this, both on Linux and FreeBSD:

   $ exec {fd}>&2
   $ echo ${fd}
   10
Comment 18 MATSUI Tetsushi 2013-07-06 11:57:51 UTC
(In reply to Ulrich Müller from comment #15)
> Created attachment 352706 [details, diff]
> Updated patch that tests for bash version
> 
> This will work only for >=bash-4.1 though. Updated patch with bash version
> test is attached.
> 
> The crucial question is of course if it fixes (or rather, works around) the
> bug on OS X.

Yes, it also works fine.
Comment 19 Ulrich Müller gentoo-dev 2013-07-06 14:32:27 UTC
I've added the workaround to eselect-1.3.6:
<http://git.overlays.gentoo.org/gitweb/?p=proj/eselect.git;a=commit;h=3a412426d924310abb59311dd3cc1133eb1c6849>

Reassigning to bash maintainers.
In a nutshell: "env -i /usr/bin/eselect" makes bash-4.2_p39-r1 crash with an illegal instruction on Mac OS X. The cause for the failure seems to be the line "exec 3>&2" (which is new in eselect 1.3.5).
Comment 20 John Gibson 2013-09-12 20:54:19 UTC
I've seen similar behavior with bash-4.2_p39-r1 on prefixed portage with OS X when trying to emerge zsh-5.0.2-r3.  In that case it was dying when executing "exec 3>&1 >$the_subdir/${the_makefile}.in".  Would you like me to open a separate bug for that issue?
Comment 21 Ulrich Müller gentoo-dev 2013-09-13 07:19:38 UTC
(In reply to John Gibson from comment #20)
> I've seen similar behavior with bash-4.2_p39-r1 on prefixed portage with OS
> X when trying to emerge zsh-5.0.2-r3.  In that case it was dying when
> executing "exec 3>&1 >$the_subdir/${the_makefile}.in".  Would you like me to
> open a separate bug for that issue?

Looks like the same issue, so no separate bug.

Setting status to CONFIRMED since it has been seen twice now.
Comment 22 John Gibson 2013-09-15 22:00:36 UTC
Created attachment 358742 [details, diff]
A workaround for emerging zsh-5.0.2-r3 on prefix OS X

I was able to get zsh to emerge under prefix portage on OS X by replacing the explicitly numbered file descriptor of 3 with a variable name.  Here's the patch that I used.  It doesn't check the version of bash and given earlier comments on the thread it probably won't work with bash 3.  However it may be useful as a quick workaround for other prefixers.
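
The kind of substitution described above can be sketched like this. It is an illustration only, not the attached patch; the names saved_stdout and outfile are hypothetical, and the {varname} syntax needs bash 4.1 or later.

```shell
#!/usr/bin/env bash
# Requires bash >= 4.1 for the {varname} redirection syntax.

outfile=$(mktemp)

# Save stdout in a descriptor chosen by bash, then point stdout at the file.
# This replaces the fixed-number form "exec 3>&1 >file".
exec {saved_stdout}>&1 >"$outfile"
echo "written to the file"

# Restore stdout from the saved copy, then close the copy.
exec >&"$saved_stdout" {saved_stdout}>&-
echo "back on the original stdout"

cat "$outfile"
```

Redirections are processed left to right, so the copy of stdout is taken before stdout is pointed at the file.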

Also, I have no idea if it has any bearing on the underlying bash bug, but /dev/fd/3 already exists on my computer, and ls reports it as a directory:
my-pc:files jgibson$ ls -l /dev/fd
total 0
crw--w----  1 jgibson  tty       16,   0 Sep 15 14:58 0
crw--w----  1 jgibson  tty       16,   0 Sep 15 14:58 1
crw--w----  1 jgibson  tty       16,   0 Sep 15 14:58 2
drw-r--r--  2 portage  portage       272 Sep 14 17:31 3
dr--r--r--  1 root     wheel           0 Sep 14 16:46 4

Inspecting either 3 or 4 more closely results in the following:
my-pc:files jgibson$ ls -l /dev/fd/3 /dev/fd/4
ls: /dev/fd/3: Bad file descriptor
ls: /dev/fd/4: Bad file descriptor

I've seen similar behavior on OS X 10.6, 10.7, and 10.8.
Comment 23 SpanKY gentoo-dev 2013-11-30 05:56:57 UTC
(In reply to Ulrich Müller from comment #19)

please post a reduced test case.  like a single shell script you can run `env -i` on and see the crash.  i can't take `eselect` upstream, and i don't have an OS X system to reduce on.
Comment 24 Ulrich Müller gentoo-dev 2013-11-30 09:18:19 UTC
(In reply to SpanKY from comment #23)
> please post a reduced test case.  like a single shell script you can run
> `env -i` on and see the crash.  i can't take `eselect` upstream, and i don't
> have an OS X system to reduce on.

Same problem here.

Matsui-san, John, or Prefix team, could you provide us with a minimal test case, please?
Comment 25 John Gibson 2013-12-24 06:25:30 UTC
I don't know if I can provide you with a minimal one.  I only really saw the issue when building zsh (and some other packages like tiff) with emerge.

What I can do is provide you with my prefix as an OS X disk image and then you can have the entire environment to experiment with.  The only downsides to this approach are that you'll need a Mac and the disk image is ~4 GB, which I'm guessing is a little large to upload to Bugzilla.  I have images available for 10.7 and 10.8 (and I could probably dig one up for 10.6 if necessary).
Comment 26 Ulrich Müller gentoo-dev 2013-12-24 07:17:20 UTC
I'd assume that this trivial script:

#!/bin/bash
exec 3>&2

should already be enough for reproducing the error, but I cannot verify this here. (John?)
Comment 27 Fabian Groffen gentoo-dev 2013-12-24 09:52:59 UTC
I cannot reproduce.  I tried emerging zsh as John reports, no problem whatsoever.

% uname -a
Darwin Phoebe.local 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386 i386 MacBook3,1 Darwin
% /usr/bin/sw_vers 
ProductName:    Mac OS X
ProductVersion: 10.6.8
BuildVersion:   10K549
% which bash
/Volumes/Scratch/Gentoo/bin/bash
% bash --version
GNU bash, version 4.2.39(1)-release (i386-apple-darwin10)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
% cat fd3.sh 
#!/usr/bin/env bash

cat ${BASH_SOURCE[0]}

exec 3>&2
% ./fd3.sh 
#!/usr/bin/env bash

cat ${BASH_SOURCE[0]}

exec 3>&2
Comment 28 Fabian Groffen gentoo-dev 2013-12-24 09:55:58 UTC
fwiw, the following bash version runs the exec fine too:

% /Volumes/Scratch/Gentoo64/bin/bash --version
GNU bash, version 4.2.36(1)-release (x86_64-apple-darwin9)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

(that is, 64-bits, Leopard -- Darwin 9, binary from before I upgraded my laptop)
Comment 29 Ulrich Müller gentoo-dev 2013-12-24 11:07:01 UTC
(In reply to Fabian Groffen from comment #27)
> % ./fd3.sh 

What happens if you start it with an "env -i" wrapper, as in the original report?
Comment 30 Fabian Groffen gentoo-dev 2013-12-24 14:06:42 UTC
(In reply to Ulrich Müller from comment #29)
> (In reply to Fabian Groffen from comment #27)
> > % ./fd3.sh 
> 
> What happens if you start it with an "env -i" wrapper, as in the original
> report?

the same, it just executes for me:

% env -i /Volumes/Scratch/Gentoo/bin/bash ./fd3.sh
#!/usr/bin/env bash

cat ${BASH_SOURCE[0]}

exec 3>&2
Comment 31 MATSUI Tetsushi 2013-12-28 05:22:11 UTC
My smallest case to reproduce the bug is the following:
#!/Users/tetsushi/Gentoo/bin/bash
exec 3>&2

die() {
	echo "foo"
}

Then it fails once per 100 executions or so with "Illegal instruction" like:
$ for i in `seq 100`; do env -i ./foo.sh; done
Illegal instruction
$ for i in `seq 100`; do env -i ./foo.sh; done
Illegal instruction
Illegal instruction
$ for i in `seq 100`; do env -i ./foo.sh; done
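
Because the failure is intermittent, counting crashes is easier than watching for messages. The sketch below relies on the shell reporting a child killed by SIGILL (signal 4) as exit status 132; the function name count_sigill and the use of /bin/sh are assumptions for illustration, and the prefix shell from this report would go in its place.

```shell
#!/usr/bin/env bash

# count_sigill RUNS CMD [ARGS...]
# Run CMD RUNS times and print how many runs ended with status 132,
# which is how the shell reports death by SIGILL (128 + signal number 4).
count_sigill() {
    local runs=$1 crashes=0 i status
    shift
    for (( i = 0; i < runs; i++ )); do
        status=0
        "$@" >/dev/null 2>&1 || status=$?
        if (( status == 132 )); then
            crashes=$(( crashes + 1 ))
        fi
    done
    echo "$crashes"
}

# Example run with an empty environment; substitute the shell under test.
count_sigill 100 env -i /bin/sh -c 'exec 3>&2'
```

The run count may need to be raised considerably before an intermittent crash shows up.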
Comment 32 MATSUI Tetsushi 2013-12-28 10:11:36 UTC
Created attachment 366358 [details]
diagnostic report

Update: 
The crash can be reproduced with just this one-liner:
$ for i in `seq 1000`; do env -i /Users/tetsushi/Gentoo/bin/sh -c 'exec 3>&2'; done

There are several (about 6 or 7) crashes in the 1000 executions.
OS X records these in logs under ~/Library/Logs/DiagnosticReports, and the attachment is one of them.
Comment 33 Fabian Groffen gentoo-dev 2013-12-28 10:28:42 UTC
Tried it, still cannot reproduce (32&&64).  Can you tell me what exact machine you're running this on?  In particular the cpu is important here.  Maybe the cflags don't match or something.
Comment 34 MATSUI Tetsushi 2013-12-29 14:47:45 UTC
I guess CFLAGS is irrelevant, but if you want to know, I have used -march=nocona for a Core 2 Duo CPU.

I'm trying to bootstrap another prefix, and bash in it doesn't show the symptom. hum..
Comment 35 John Gibson 2013-12-30 19:28:22 UTC
I can confirm Matsui Tetsushi's experiments.  I have two machines, one a 2.53 Core 2 Duo running 10.7.5 and the second a 2.4 Core i7 running 10.8.5.

On both machines running the foo.sh script directly or via /Library/Gentoo/bin/sh foo.sh almost always results in an illegal instruction.  Running via env -i does not produce an illegal instruction.  However, the later test that he tried:
for i in `seq 1000`; do env -i /Library/Gentoo/bin/sh -c 'exec 3>&2'; done
Does produce a handful of illegal instructions, but only on the C2D, not on the i7.  I'll see if I can try a fresh bootstrap this week.

I did notice on the C2D that some other packages like tiff would fail to emerge with illegal instruction as well.  I played around with the -j make option when emerging and it changed when I would see the illegal instruction, but it would still fail.  I did not see those errors on the i7.
Comment 36 John Gibson 2014-01-12 19:20:04 UTC
I did a fresh bootstrap on both machines (the 10.7 C2D and the 10.8 i7) and got almost the same results.  The only difference was that on the C2D I had to increase the iterations in this loop to 10000 to see any illegal instructions.

for i in `seq 10000`; do env -i /Library/Gentoo/bin/sh -c 'exec 3>&2'; done
Comment 37 Fabian Groffen gentoo-dev 2014-01-12 19:33:59 UTC
I can reproduce this on my i7 too, will have to check on the c2d if upping the iteration count works
Comment 38 Fabian Groffen gentoo-dev 2014-03-01 09:43:45 UTC
I actually find that comment in the crashlog sort of interesting:


Application Specific Information:
BUG IN CLIENT OF LIBDISPATCH: Do not close random Unix descriptors

Thread 0:  Dispatch queue: com.apple.main-thread
0   libSystem.B.dylib             	0x9a5a1dc6 dup2 + 10
1   sh                            	0x0006a94b do_redirection_internal + 3458
2   sh                            	0x00069268 do_redirections + 113
3   sh                            	0x00021c9b execute_builtin_or_function + 53
4   sh                            	0x00020b5e execute_simple_command + 2509
5   sh                            	0x0001ab70 execute_command_internal + 1907
6   sh                            	0x00074fef parse_and_execute + 1076
7   sh                            	0x000034e9 run_one_command + 271
8   sh                            	0x00002564 main + 2384
9   sh                            	0x00001c09 start + 53
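
The dup2 frame at the top of this trace is relevant because duplicating onto a descriptor that is already open closes the old target without any warning. Whether anything in the process actually held fd 3 at that point is not established in this thread. The sketch below only demonstrates the general behaviour, in a subshell so that the calling shell's descriptors are untouched; tmp and result are hypothetical names.

```shell
#!/usr/bin/env bash

tmp=$(mktemp)

(
    exec 3>"$tmp"    # fd 3 now refers to the temporary file
    exec 3>&2        # duplicate fd 2 onto fd 3; the file is closed silently
    echo "this line goes to stderr, not to the file" >&3
)

# The file never received the message, because fd 3 stopped pointing at it.
if [ -s "$tmp" ]; then
    result="file has content"
else
    result="file is empty"
fi
echo "$result"
rm -f "$tmp"
```

The second exec succeeds even though fd 3 was in use, which is the situation the libdispatch message warns about.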