Bug 58702 - Coda 6.0.6 seems to exit shortly after startup without any error
Summary: Coda 6.0.6 seems to exit shortly after startup without any error
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Server
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Toolchain Maintainers
Reported: 2004-07-28 15:06 UTC by Maurice van der Pot (RETIRED)
Modified: 2005-12-05 11:10 UTC
Attachments
STDERR output of coda emerge (codacomperr.log,6.90 KB, text/plain)
2004-07-29 14:35 UTC, Mike Nerone
STDERR output of coda emerge (codacomperr.log,7.03 KB, text/plain)
2004-07-29 14:43 UTC, Mike Nerone

Description Maurice van der Pot (RETIRED) gentoo-dev 2004-07-28 15:06:15 UTC
This issue popped up after I moved to linux26-headers,
remerged glibc and remerged lwp/rvm/rpc2/coda.

I'm not sure what the actual cause is. It may not even
be coda's fault, but until I have done some debugging 
and found solid evidence that it isn't, I will keep 
this bug assigned to myself (as coda maintainer).

Anyone willing to help me track down the root cause of 
this problem can use this bug report to add comments.

To try out coda-6.0.6 with linux26-headers, replace 
the following dependency in the coda ebuild:

    >=sys-kernel/linux-headers-2.4

with this one:

    virtual/os-headers
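
The dependency swap above can also be scripted. This is only a minimal sketch: the helper name swap_headers_dep and the sample DEPEND line are hypothetical, and the real coda ebuild lists more dependencies than shown here.

```python
# Sketch of the dependency swap described above.
# OLD_DEP and NEW_DEP come from this report; everything else is illustrative.
OLD_DEP = ">=sys-kernel/linux-headers-2.4"
NEW_DEP = "virtual/os-headers"


def swap_headers_dep(ebuild_text):
    """Return ebuild_text with the kernel headers dependency replaced."""
    if OLD_DEP not in ebuild_text:
        raise ValueError("expected dependency not found: " + OLD_DEP)
    return ebuild_text.replace(OLD_DEP, NEW_DEP)


# Hypothetical one-line stand-in for the ebuild's DEPEND variable.
sample = 'DEPEND="' + OLD_DEP + '"\n'
patched = swap_headers_dep(sample)
print(patched, end="")
```

Raising an error when the old dependency is missing avoids silently leaving an ebuild unchanged.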
Comment 1 Mike Nerone 2004-07-29 02:28:51 UTC
Are you referring to venus or vice? And what kind of time frame are we talking about before the failure?

I've had venus running for about 10 mins (like you, linux26-headers and remerged glibc) and it seems ok.
Comment 2 Maurice van der Pot (RETIRED) gentoo-dev 2004-07-29 02:49:09 UTC
Time frame is maybe 2 seconds. I noticed a lot of warnings 
during compilation. Did you see any? I'm using gcc 3.4.
Tonight I will debug some more and hopefully figure out 
what's causing it.
Comment 3 Mike Nerone 2004-07-29 14:35:06 UTC
Created attachment 36422 [details]
STDERR output of coda emerge

I'm using gcc-3.3.3-r6. There are some warnings during the compile, but no more
than I see from 90% of compiles. :P I'm not a C programmer, though, so what do
I know? For your comparison pleasure, I've attached the STDERR of the emerge
(do glance at the odd mv error on the last line).

BTW, you didn't answer if you were referring to venus (client) or vice
(server), or both. I've had venus running for 12 hours now. Seems fine in my
environment (like I said in Bug #57996, though, I'm not actively using it...I'm
just keeping it merged for the time being to help you test).
Comment 4 Mike Nerone 2004-07-29 14:43:17 UTC
Created attachment 36425 [details]
STDERR output of coda emerge

Oops...mis-mimed it. *blush*
Comment 5 Maurice van der Pot (RETIRED) gentoo-dev 2004-07-29 14:55:06 UTC
I was referring to vice by the way.

This is what I normally see in the log:

[...]
22:18:12 Attached 1 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
22:18:12 LockQueue Manager just did a rvmlib_set_thread_data()
done
22:18:12 CallBackCheckLWP just did a rvmlib_set_thread_data()
22:18:12 CheckLWP just did a rvmlib_set_thread_data()
22:18:12 ServerLWP 0 just did a rvmlib_set_thread_data()
22:18:12 ServerLWP 1 just did a rvmlib_set_thread_data()
22:18:12 ServerLWP 2 just did a rvmlib_set_thread_data()
22:18:12 ServerLWP 3 just did a rvmlib_set_thread_data()
22:18:12 ServerLWP 4 just did a rvmlib_set_thread_data()
22:18:12 ServerLWP 5 just did a rvmlib_set_thread_data()
22:18:12 ResLWP-0 just did a rvmlib_set_thread_data()
22:18:12 ResLWP-1 just did a rvmlib_set_thread_data()
22:18:12 VolUtilLWP 0 just did a rvmlib_set_thread_data()
22:18:12 VolUtilLWP 1 just did a rvmlib_set_thread_data()
22:18:12 Starting SmonDaemon timer
22:18:12 File Server started Thu Jul 29 22:18:12 2004

And the last line I see in the log when it crashes:
22:40:56 Attached 1 volumes; 0 volumes not attached

There is nothing in the SrvErr log.


Some more info (what each package was compiled with is given in parentheses):

glibc (linux-headers + gcc3.4.1) + coda (linux-headers + gcc3.3.3) -> ok
glibc (linux-headers + gcc3.4.1) + coda (linux-headers + gcc3.4.1) -> ok
glibc (linux-headers + gcc3.4.1) + coda (linux26-headers + gcc3.4.1) -> ok
glibc (linux26-headers + gcc3.3.3) + coda (linux26-headers + gcc3.3.3) -> ok
glibc (linux26-headers + gcc3.3.3) + coda (linux26-headers + gcc3.4.1) -> ok
glibc (linux26-headers + gcc3.4.1) + coda (linux26-headers + gcc3.3.3) -> fail
glibc (linux26-headers + gcc3.4.1) + coda (linux26-headers + gcc3.4.1) -> fail

Apparently the problem is triggered when glibc is compiled with
linux26-headers and gcc 3.4.1, regardless of how coda itself is built.
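
As a cross-check of that conclusion, the seven combinations can be written down as data and grouped by how glibc was built. This is just a sketch; the tuple layout and the names RESULTS and glibc_configs are my own, while the values are transcribed from the list above.

```python
# Each row: (glibc headers, glibc gcc, coda headers, coda gcc, result),
# transcribed from the list of tested combinations.
RESULTS = [
    ("linux-headers",   "3.4.1", "linux-headers",   "3.3.3", "ok"),
    ("linux-headers",   "3.4.1", "linux-headers",   "3.4.1", "ok"),
    ("linux-headers",   "3.4.1", "linux26-headers", "3.4.1", "ok"),
    ("linux26-headers", "3.3.3", "linux26-headers", "3.3.3", "ok"),
    ("linux26-headers", "3.3.3", "linux26-headers", "3.4.1", "ok"),
    ("linux26-headers", "3.4.1", "linux26-headers", "3.3.3", "fail"),
    ("linux26-headers", "3.4.1", "linux26-headers", "3.4.1", "fail"),
]


def glibc_configs(outcome):
    """Set of (headers, gcc) pairs used to build glibc for a given outcome."""
    return {(hdr, gcc) for hdr, gcc, _, _, res in RESULTS if res == outcome}


failing = glibc_configs("fail")
passing = glibc_configs("ok")

# Compilers used for coda in the failing rows.
coda_gcc_in_failures = {row[3] for row in RESULTS if row[4] == "fail"}

print("glibc builds that fail:", sorted(failing))
print("overlap with passing glibc builds:", sorted(failing & passing))
print("coda compilers seen in failures:", sorted(coda_gcc_in_failures))
```

Only one glibc build appears in the failing rows and it never appears in a passing row, while both coda compilers show up among the failures.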

That's all I have time for today.
Comment 6 Maurice van der Pot (RETIRED) gentoo-dev 2004-07-30 19:28:35 UTC
The segfault occurs during initialisation of the LockQueue Manager.

When the stack for the LQM lwp is about to be mmapped, lwp_stackbase 
is set to 0x15027000. The stack size is 0x2000.

Here is the stack trace at the moment of the crash. I prefixed
each line with the value of the stack pointer:

    oops
      |
      V
0x15026874 #0  _IO_vfprintf (s=0x402b7de0, format=0x8110a08 "LockQueue                                                                      
               Manager starting .....\n", ap=0x15028f2c ",-./012345678
               9:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmn
               oªÓ\t@0\177\024\b") at vfprintf.c:185
0x15028f10 #1  0x401ecb14 in printf (format=0x402b7de0 "\200<­û") at printf.c:34
0x15028f28 #2  0x080c1a1d in lqman::func (this=0x816b008) at lockqueue.cc:92
0x15028fc8 #3  0x080c1845 in LQman_init (c=0x816b008) at lockqueue.cc:65
0x15028fd8 #4  0x4009cc64 in Create_Process_Part2 () at lwp.c:796
0x15028ff8 #5  0x4009dd1f in L1 () at process.S:455
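
To make the "oops" explicit, the stack pointers can be compared against the LWP stack bounds. A small sketch, assuming the mmapped stack covers the range from lwp_stackbase up to lwp_stackbase plus the stack size and grows downward (the usual x86 layout); the names below are mine, the numbers are from this comment.

```python
# Values quoted in this comment.
STACK_BASE = 0x15027000          # lwp_stackbase
STACK_SIZE = 0x2000              # 8 KiB
STACK_TOP = STACK_BASE + STACK_SIZE

# Stack pointer per frame, copied from the prefixed backtrace.
FRAME_SP = {
    0: 0x15026874,
    1: 0x15028F10,
    2: 0x15028F28,
    3: 0x15028FC8,
    4: 0x15028FD8,
    5: 0x15028FF8,
}


def in_stack(sp):
    """True if sp lies inside the assumed range [STACK_BASE, STACK_TOP)."""
    return STACK_BASE <= sp < STACK_TOP


for frame, sp in FRAME_SP.items():
    if in_stack(sp):
        status = "inside the stack"
    else:
        status = "%d bytes below the stack base" % (STACK_BASE - sp)
    print("#%d sp=%#x %s" % (frame, sp, status))

overflow_bytes = STACK_BASE - FRAME_SP[0]
```

Under that assumption frames #1 to #5 sit inside the 8 KiB stack, and only frame #0 has dropped below its base.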


Here is the code of vfprintf() when compiled with 2.6.7-rc4 kernel 
headers and gcc 3.4:

    0x401e3e82 <_IO_vfprintf+1>:    mov    %esp,%ebp
    0x401e3e84 <_IO_vfprintf+3>:    push   %edi
    0x401e3e85 <_IO_vfprintf+4>:    push   %esi
    0x401e3e86 <_IO_vfprintf+5>:    push   %ebx
    0x401e3e87 <_IO_vfprintf+6>:    call   0x401be346 <__i686.get_pc_thunk.bx>
    0x401e3e8c <_IO_vfprintf+11>:   add    $0xd623c,%ebx
    0x401e3e92 <_IO_vfprintf+17>:   sub    $0x2688,%esp
    0x401e3e98 <_IO_vfprintf+23>:   movl   $0x0,0xffffda98(%ebp)
    0x401e3ea2 <_IO_vfprintf+33>:   call   0x401be678 <*__GI___errno_location>
    0x401e3ea7 <_IO_vfprintf+38>:   mov    (%eax),%eax

Here is the same code when compiled with 2.6.7-rc4 headers and gcc 3.3:

    0x401e1a17 <vfprintf+0>:        push   %ebp
    0x401e1a18 <vfprintf+1>:        mov    %esp,%ebp
    0x401e1a1a <vfprintf+3>:        push   %edi
    0x401e1a1b <vfprintf+4>:        push   %esi
    0x401e1a1c <vfprintf+5>:        push   %ebx
    0x401e1a1d <vfprintf+6>:        call   0x401bd2b8 <__i686.get_pc_thunk.bx>
    0x401e1a22 <vfprintf+11>:       add    $0xcf666,%ebx
    0x401e1a28 <vfprintf+17>:       sub    $0x5c4,%esp
    0x401e1a2e <vfprintf+23>:       movl   $0x0,0xfffffb58(%ebp)
    0x401e1a38 <vfprintf+33>:       call   0x401bd5f4 <__errno_location>
    0x401e1a3d <vfprintf+38>:       mov    0xc(%ebp),%edi
    0x401e1a40 <vfprintf+41>:       mov    (%eax),%eax
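
The two prologues can be compared numerically. The sketch below decodes the "sub ...,%esp" operands and checks whether the gcc 3.4 prologue accounts for the stack pointer seen in frame #0. It rests on two assumptions of mine: that the instruction missing at +0 in the gcc 3.4 listing is "push %ebp", as in the gcc 3.3 listing, and that the stack pointer gdb shows for frame #1 is the value just before the call pushed its return address.

```python
STACK_SIZE = 0x2000        # LWP stack size from the earlier analysis
FRAME_GCC34 = 0x2688       # sub $0x2688,%esp (glibc built with gcc 3.4)
FRAME_GCC33 = 0x5C4        # sub $0x5c4,%esp  (glibc built with gcc 3.3)


def as_signed32(value):
    """Interpret a 32-bit displacement such as 0xffffda98 as a signed offset."""
    return value - (1 << 32) if value & 0x80000000 else value


print("gcc 3.4 locals:", FRAME_GCC34, "bytes")
print("gcc 3.3 locals:", FRAME_GCC33, "bytes")
print("stack size:    ", STACK_SIZE, "bytes")
print("movl offsets:  ", as_signed32(0xFFFFDA98), as_signed32(0xFFFFFB58))

# Reconstruct the stack pointer after the gcc 3.4 prologue.
CALLER_SP = 0x15028F10     # frame #1 (printf) in the backtrace
RETURN_ADDRESS = 4         # pushed by the call instruction
SAVED_REGISTERS = 4 * 4    # ebp, edi, esi, ebx (push %ebp is assumed)
predicted_sp = CALLER_SP - RETURN_ADDRESS - SAVED_REGISTERS - FRAME_GCC34
print("predicted sp:  ", hex(predicted_sp))
```

With those assumptions the gcc 3.4 frame alone (9864 bytes) is larger than the whole 8192-byte LWP stack, and the predicted stack pointer matches the 0x15026874 shown for frame #0.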
Comment 8 Maurice van der Pot (RETIRED) gentoo-dev 2004-08-01 05:45:17 UTC
I'm reassigning this to toolchain.

By the way, I only just noticed that in the two code listings
at the end of my previous comment, one shows _IO_vfprintf while
the other shows vfprintf. Looks like different functions are
called after all. 

In any case 9K+ of stack space for local vars alone seems a bit excessive ;)

glibc version: glibc-2.3.4.20040619

Guys, if you need any extra info, let me know.

Portage 2.0.51_pre13 (default-x86-1.4, gcc-3.4.1, glibc-2.3.4.20040619-r0, 
2.6.8-rc2 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz)
=================================================================
System uname: 2.6.8-rc2 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz
Gentoo Base System version 1.5.1
distcc 2.16 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.5-r1
Binutils: sys-devel/binutils-2.14.90.0.8-r1
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium4 -O0 -pipe -g3 -ggdb3"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.3/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium4 -O0 -pipe -g3 -ggdb3"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache cvs digest fixpackages noclean nostrip sandbox 
sign userpriv usersandbox"
GENTOO_MIRRORS="http://ftp.snt.utwente.nl/pub/os/linux/gentoo http://ftp.gentoo.skynet.be/pub/gentoo/ http://ftp.uni-erlangen.de/pub/mirrors/gentoo http://mirrors.sec.informatik.tu-darmstadt.de/gentoo http://ftp.easynet.nl/mirror/gentoo/"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/home/griffon26/cvs-wa/gentoo-x86"
SYNC="rsync://griffon26.kfk4ever.com/gentoo-portage"
USE="X alsa apache2 avi bonobo cdr crypt cscope dedicated dga dvd encode 
ethereal fastcgi fbcon freetds gd gdbm ggi gif gpm gstreamer gtk gtk2 imlib 
ipv6 jikes joystick jpeg libwww lirc mad mcal memlimit mikmod mmx motif mozilla 
mpeg mpi mysql ncurses nls nocd oggvorbis opengl pam pdflib perl plotutils png 
pnp qt quicktime readline samba sdl slang snmp sse ssl svga tcltk tcpd tiff 
truetype trusted usb wmf wxwindows x86 xml xml2 xmms xosd xv zlib"
Comment 9 SpanKY gentoo-dev 2005-12-02 16:45:05 UTC
Can you rebuild with gcc-3.4.4-r1 and see if it works? This smells like a bug
we fixed in 3.4.4-r1 ...
Comment 10 Maurice van der Pot (RETIRED) gentoo-dev 2005-12-05 11:10:36 UTC
Come to think of it, I haven't seen this in quite a while, even though my system
has been using 2.6 headers and gcc 3.4 for ages.

Must have been fixed.