Bug 131174 - gcc crashed unexpectedly when used with a cluster kernel "openmosix-sources-2.4.32" and with dynamic process migration
Summary: gcc crashed unexpectedly when used with a cluster kernel "openmosix-sources-2...
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system (show other bugs)
Hardware: x86 Linux
Importance: High major (vote)
Assignee: Gentoo Linux bug wranglers
URL:
Whiteboard:
Keywords:
Duplicates: 130460 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-04-24 20:38 UTC by kerzol
Modified: 2006-04-28 22:36 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Description kerzol 2006-04-24 20:38:32 UTC
+++ This bug was initially created as a clone of Bug #130460 +++

*******
kerzol@darkside ~ $ emerge info
Portage 2.0.51.22-r2 (default-linux/x86/no-nptl/2.4, gcc-3.3.6, glibc-2.3.5-r2, 2.4.32-openmosix-419 i686)
=================================================================
System uname: 2.4.32-openmosix-419 i686 Celeron (Coppermine)
Gentoo Base System version 1.6.12
dev-lang/python:     2.3.5
sys-apps/sandbox:    1.2.11
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -pipe -fomit-frame-pointer -mcpu=i686"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/eselect/compiler /etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -pipe -fomit-frame-pointer -mcpu=i686"
DISTDIR="/var/distfiles"
FEATURES="autoconfig buildpkg distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://oper.asu.ru/pub/Linux/Gentoo"
MAKEOPTS=""
PKGDIR="/var/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 X alsa apache2 apm arts audiofile avi bash-completion berkdb bitmap-fonts bzip2 cli crypt ctype cups dba dri eds emboss encode expat fastbuild foomaticdb force-cgi-redirect fortran ftp gd gdbm gif gnome gpm gstreamer gtk gtk2 imlib jpeg kde libg++ libwww mad memlimit mikmod mmx mp3 mpeg ncurses nls ogg opengl oss pam pcre pdflib perl png posix python qt quicktime readline sdl session simplexml soap sockets spell spl ssl tcpd tiff tokenizer truetype truetype-fonts type1-fonts unicode vorbis xml xmms xsl xv zlib userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS
************/var/log/kern.log (filtered)
Apr 15 14:44:16 darkside kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000004c
Apr 15 14:44:16 darkside kernel:  printing eip:
Apr 15 14:44:16 darkside kernel: c017141d
Apr 15 14:44:16 darkside kernel: *pde = 00000000
Apr 15 14:44:16 darkside kernel: Oops: 0002
Apr 15 14:44:16 darkside kernel: CPU:    0
Apr 15 14:44:16 darkside kernel: EIP:    0010:[set_brk+61/176]    Not tainted
Apr 15 14:44:16 darkside kernel: EIP:    0010:[<c017141d>]    Not tainted
Apr 15 14:44:16 darkside kernel: EFLAGS: 00010286
Apr 15 14:44:16 darkside kernel: eax: 00000000   ebx: 00000000   ecx: 0804ac58   edx: 00000000
Apr 15 14:44:16 darkside kernel: esi: 0804b000   edi: 0804b000   ebp: c76a9d34   esp: c76a9b24
Apr 15 14:44:16 darkside kernel: ds: 0018   es: 0018   ss: 0018
Apr 15 14:44:16 darkside kernel: Process gcc (pid: 3235, stackpage=c76a9000)
Apr 15 14:44:16 darkside kernel: Stack: c76a9b28 00000000 00000000 c76ab9c0 0804aaac c01734a9 0804ac50 0804ac58
Apr 15 14:44:16 darkside kernel:        00000c50 00000003 00001812 00000001 c76ca034 000005c8 00000001 00010ba0
Apr 15 14:44:16 darkside kernel:        00000020 00000296 c76d4ca0 cac67000 cac67000 00000000 00000002 000001a4
Apr 15 14:44:16 darkside kernel: Call Trace:    [load_elf_binary+1065/4064] [inet_recvmsg+80/112] [load_elf_binary+0/4064] [search_binary_handler+308/464] [do_execve+329/800]
Apr 15 14:44:16 darkside kernel: Call Trace:    [<c01734a9>] [<c023fa30>] [<c0173080>] [<c015a504>] [<c015a6e9>]
Apr 15 14:44:16 darkside kernel:   [sys_execve+66/128] [call_with_regs+75/148] [deputy_syscall+499/688] [sys_execve+0/128] [deputy_main_loop+897/1376] [mosix_pre_usermode_actions+138/176]
Apr 15 14:44:16 darkside kernel:   [<c01079b2>] [<c010ba3f>] [<c0196ce3>] [<c0107970>] [<c0195701>] [<c019bfba>]
Apr 15 14:44:16 darkside kernel:   [straight_to_mosix+5/13]
Apr 15 14:44:16 darkside kernel:   [<c010b91a>]
Apr 15 14:44:16 darkside kernel:
Apr 15 14:44:16 darkside kernel: Code: 89 78 4c 89 78 48 31 c0 8b 5c 24 08 8b 74 24 0c 8b 7c 24 10
**************/var/log/kern.log (filtered)
Apr 15 14:49:03 darkside kernel:  <1>Unable to handle kernel NULL pointer dereference at virtual address 0000004c
Apr 15 14:49:03 darkside kernel:  printing eip:
Apr 15 14:49:03 darkside kernel: c017141d
Apr 15 14:49:03 darkside kernel: *pde = 00000000
Apr 15 14:49:03 darkside kernel: Oops: 0002
Apr 15 14:49:03 darkside kernel: CPU:    0
Apr 15 14:49:03 darkside kernel: EIP:    0010:[set_brk+61/176]    Not tainted
Apr 15 14:49:03 darkside kernel: EIP:    0010:[<c017141d>]    Not tainted
Apr 15 14:49:03 darkside kernel: EFLAGS: 00010286
Apr 15 14:49:03 darkside kernel: eax: 00000000   ebx: 00000000   ecx: 0804ac58   edx: 00000000
Apr 15 14:49:03 darkside kernel: esi: 0804b000   edi: 0804b000   ebp: c9a6dd34   esp: c9a6db24
Apr 15 14:49:03 darkside kernel: ds: 0018   es: 0018   ss: 0018
Apr 15 14:49:03 darkside kernel: Process gcc (pid: 3523, stackpage=c9a6d000)
Apr 15 14:49:03 darkside kernel: Stack: c9a6db28 00000000 00000000 c91b1e80 0804aaac c01734a9 0804ac50 0804ac58
Apr 15 14:49:03 darkside kernel:        00000c50 00000003 00001812 00000001 c98aa2af c7f6c07c 00000001 0001974e
Apr 15 14:49:03 darkside kernel:        00000020 00000296 ca02eb20 c7d12400 c7d12400 00000000 00000002 000001a4
Apr 15 14:49:03 darkside kernel: Call Trace:    [load_elf_binary+1065/4064] [inet_recvmsg+80/112] [load_elf_binary+0/4064] [search_binary_handler+308/464] [do_execve+329/800]
Apr 15 14:49:03 darkside kernel: Call Trace:    [<c01734a9>] [<c023fa30>] [<c0173080>] [<c015a504>] [<c015a6e9>]
Apr 15 14:49:03 darkside kernel:   [sys_execve+66/128] [call_with_regs+75/148] [deputy_syscall+499/688] [sys_execve+0/128] [deputy_main_loop+897/1376] [mosix_pre_usermode_actions+138/176]
Apr 15 14:49:03 darkside kernel:   [<c01079b2>] [<c010ba3f>] [<c0196ce3>] [<c0107970>] [<c0195701>] [<c019bfba>]
Apr 15 14:49:03 darkside kernel:   [straight_to_mosix+5/13]
Apr 15 14:49:03 darkside kernel:   [<c010b91a>]
Apr 15 14:49:03 darkside kernel:
Apr 15 14:49:03 darkside kernel: Code: 89 78 4c 89 78 48 31 c0 8b 5c 24 08 8b 74 24 0c 8b 7c 24 10
****************
Comment 1 kerzol 2006-04-24 20:43:06 UTC
*** Bug 130460 has been marked as a duplicate of this bug. ***
Comment 2 Jakub Moc (RETIRED) gentoo-dev 2006-04-25 00:59:42 UTC
Not interested in gcc-3.3.6 bugs. Reopen if you can reproduce the problem w/ latest stable gcc.
Comment 3 kerzol 2006-04-27 04:31:45 UTC
Please note that this problem is not gcc-specific but kernel-specific.
I believe the kernel should never hit a "kernel NULL pointer dereference" when running unprivileged user programs, so I think the bug should be reopened.

Indeed, the very same thing happens with other processes as well, e.g.:

Apr 18 15:40:55 darkside kernel:  <1>Unable to handle kernel NULL pointer dereference at virtual address 0000004c
Apr 18 15:40:55 darkside kernel:  printing eip:
Apr 18 15:40:55 darkside kernel: c017146d
Apr 18 15:40:55 darkside kernel: *pde = 00000000
Apr 18 15:40:55 darkside kernel: Oops: 0002
Apr 18 15:40:55 darkside kernel: CPU:    0
Apr 18 15:40:55 darkside kernel: EIP:    0010:[set_brk+61/176]    Not tainted
Apr 18 15:40:55 darkside kernel: EIP:    0010:[<c017146d>]    Not tainted
Apr 18 15:40:55 darkside kernel: EFLAGS: 00010286
Apr 18 15:40:55 darkside kernel: eax: 00000000   ebx: 00000000   ecx: 0804e988   edx: 00000000
Apr 18 15:40:55 darkside kernel: esi: 0804f000   edi: 0804f000   ebp: cabb5d34   esp: cabb5b24
Apr 18 15:40:55 darkside kernel: ds: 0018   es: 0018   ss: 0018
Apr 18 15:40:55 darkside kernel: Process mpirun (pid: 8036, stackpage=cabb5000)
Apr 18 15:40:55 darkside kernel: Stack: cabb5b28 00000000 00000000 c76c84c0 0804e000 c01734f9 0804e5d0 0804e988
Apr 18 15:40:55 darkside kernel:        000005d0 00000003 00001812 00000006 c840a7bb c995107c 00000001 00004d27
Apr 18 15:40:55 darkside kernel:        00000020 00000296 c7905080 cae5e400 cae5e400 00000000 00000002 000005d0
Apr 18 15:40:55 darkside kernel: Call Trace:    [load_elf_binary+1065/4064] [inet_recvmsg+80/112] [load_elf_binary+0/4064] [search_binary_handler+308/464] [do_execve+329/800]
Apr 18 15:40:55 darkside kernel: Call Trace:    [<c01734f9>] [<c023faf0>] [<c01730d0>] [<c015a524>] [<c015a709>]
Apr 18 15:40:55 darkside kernel:   [sys_execve+66/128] [call_with_regs+75/148] [deputy_syscall+499/688] [sys_execve+0/128] [deputy_main_loop+897/1376] [mosix_pre_usermode_actions+138/176]
Apr 18 15:40:55 darkside kernel:   [<c01079f2>] [<c010ba7f>] [<c0196d33>] [<c01079b0>] [<c0195751>] [<c019c00a>]
Apr 18 15:40:55 darkside kernel:   [straight_to_mosix+5/13]
Apr 18 15:40:55 darkside kernel:   [<c010b95a>]
Apr 18 15:40:55 darkside kernel:
Apr 18 15:40:55 darkside kernel: Code: 89 78 4c 89 78 48 31 c0 8b 5c 24 08 8b 74 24 0c 8b 7c 24 10

However, I will upgrade gcc and report whether it still breaks or not.
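
For anyone reading the oops output above, here is a minimal sketch of how the "Oops: 0002" value and the first bytes of the "Code:" line can be decoded. It assumes the standard x86 page-fault error-code bits and that the "Code:" line starts at the faulting EIP (which is how 2.4 kernels usually print it). The helper names `decode_oops` and `decode_mov_store` are made up for illustration and only handle the simple `mov r32, disp8(r32)` form seen in this log.

```python
# Register names indexed by their 3-bit x86 encoding.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_oops(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def decode_mov_store(raw: bytes) -> str:
    """Decode a 3-byte 'mov r32, disp8(r32)' (opcode 0x89, ModRM mod=01).

    Hypothetical helper: anything else (SIB byte, other opcodes) is rejected.
    """
    opcode, modrm, disp = raw[0], raw[1], raw[2]
    if opcode != 0x89 or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("not a simple mov r32, disp8(r32)")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    src = REGS[(modrm >> 3) & 7]
    base = REGS[modrm & 7]
    return f"mov %{src},{disp:#x}(%{base})"


print(decode_oops(0x0002))
# {'cause': 'page not present', 'access': 'write', 'mode': 'kernel'}
print(decode_mov_store(bytes.fromhex("89784c")))
# mov %edi,0x4c(%eax)
```

With eax = 00000000 in the register dump, that store targets address 0x4c, which matches the reported fault address 0000004c. So the fault looks like a kernel-mode write through a NULL structure pointer inside set_brk, not something done by the user program itself.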
Comment 4 Jakub Moc (RETIRED) gentoo-dev 2006-04-27 04:34:40 UTC
Well, if your kernel randomly crashes when running random processes, then check your hardware first (RAM, overheating, etc.)
Comment 5 kerzol 2006-04-28 22:36:04 UTC
(In reply to comment #4)
> Well, if your kernel randomly crashes when running random processes,
> then check your hardware first (RAM, overheating, etc.)

The crashes aren't completely random, and neither are the affected processes (look at the call traces below).
Indeed, the system runs quite stably when process migration is disabled, and it looks like the processes crash right after a migration completes.

> Apr 15 14:49:03 darkside kernel: Process
> gcc (pid: 3523, stackpage=c9a6d000)
...
> Apr 15 14:49:03 darkside kernel: Call Trace:
> [load_elf_binary+1065/4064] [inet_recvmsg+80/112]
> [load_elf_binary+0/4064] [search_binary_handler+308/464]
> [do_execve+329/800]

> Apr 18 15:40:55 darkside kernel: Process
> mpirun (pid: 8036, stackpage=cabb5000)
...
> Apr 18 15:40:55 darkside kernel: Call Trace:
> [load_elf_binary+1065/4064] [inet_recvmsg+80/112]
> [load_elf_binary+0/4064] [search_binary_handler+308/464]
> [do_execve+329/800]
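
The comparison of the two traces quoted above can also be done mechanically. This is only a rough sketch: the regular expression assumes the `[symbol+offset/size]` format printed in these logs, and the `symbols` helper and variable names are made up for illustration.

```python
import re

# Matches symbolic trace entries such as "[load_elf_binary+1065/4064]".
SYMBOL = re.compile(r"\[([A-Za-z_]\w*)\+(\d+)/(\d+)\]")


def symbols(trace: str) -> list:
    """Return (function, offset) pairs in the order they appear in the trace."""
    return [(name, int(offset)) for name, offset, _size in SYMBOL.findall(trace)]


# Call traces copied from the gcc (Apr 15) and mpirun (Apr 18) oopses.
gcc_trace = (
    "[load_elf_binary+1065/4064] [inet_recvmsg+80/112] "
    "[load_elf_binary+0/4064] [search_binary_handler+308/464] "
    "[do_execve+329/800]"
)
mpirun_trace = (
    "[load_elf_binary+1065/4064] [inet_recvmsg+80/112] "
    "[load_elf_binary+0/4064] [search_binary_handler+308/464] "
    "[do_execve+329/800]"
)

same_path = symbols(gcc_trace) == symbols(mpirun_trace)
print(same_path)
```

Both processes fault at the same functions and offsets on the execve path, which is hard to explain by bad RAM or overheating alone.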