Bug 46981 - media-gfx/gnuplot-3.8j segfaults when trying to do a fit
Bug#: 46981
Product: Gentoo Linux
Version: unspecified
Platform: All
OS/Version: Linux
Status: RESOLVED
Severity: normal
Priority: P2
Resolution: FIXED
Assigned To: g2boojum@gentoo.org
Reported By: armin@despammed.com
Component: Applications
Summary: media-gfx/gnuplot-3.8j segfaults when trying to do a fit
Opened: 2004-04-06 10:58 0000
gnuplot segfaults after attempting to do a linear fit. The crash occurs while
the final results are printed.
Reproducible: Always
Steps to Reproduce:
1. start gnuplot
2. try to do fit [range] expression "filename" via parameters
Actual Results:
The fit is successful; the program crashes when displaying the results as seen
below:
After 5 iterations the fit converged.
final sum of squares of residuals : 0.00181103
rel. change during last iteration : -9.97883e-07
degrees of freedom (ndf) : 10
rms of residuals (stdfit) = sqrt(WSSR/ndf) : 0.0134575
variance of residuals (reduced chisquare) = WSSR/ndf : 0.000181103
Final set of parameters Asymptotic Standard Error
======================= ==========================
a = -14.6228 +/- 0.673 (4.602%)
Segmentation fault
Expected Results:
successful display of the fitting parameters.
repeating with "strace gnuplot" produces the following famous last words:
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
the flags used when merging are:
CFLAGS="-pipe -O2 -finline -freorder-functions -freorder-blocks -ffast-math
-fomit-frame-pointer"
Could you provide specific data and fitting commands so that I can try it
out here using exactly the same input?
Created an attachment (id=28809) [details]
sample data
attached is a sample data file (sample.dat)
steps:
gnuplot
fit [0.1:0.12] a*x+b "sample.dat" using (1/($1)):(($3)**2) via a,b
At this point, the fit converges after 5 iterations and gnuplot segfaults
after displaying the first line of the result:
[...]
Final set of parameters Asymptotic Standard Error
======================= ==========================
a = -12.3508 +/- 0.4276 (3.462%)
Segmentation fault
Thank you very much for the data and example, I really appreciate it.
(Incidentally, I've been using gnuplot for ten years or so, and I had
no idea that it did fits. Cool!)
My guess is that it's your CFLAGS, but I can't reproduce it. Even w/
your CFLAGS gnuplot seems to work on my machine, so it might be a dependency.
Nonetheless, would you mind remerging w/
CFLAGS="-O2 -mcpu=i686 -fomit-frame-pointer"
and seeing if that works?
If that fails, the next step is to ask you to either strace gnuplot or
run it from gdb, either one of which should let us know where it's
segfaulting.
Thanks!
> gnuplot
G N U P L O T
Version 3.8j patchlevel 0
last modified Wed Nov 27 20:49:08 GMT 2002
System: Linux 2.6.3-gentoo-r1
Copyright(C) 1986 - 1993, 1999 - 2002
Thomas Williams, Colin Kelley and many others
This is a pre-version of gnuplot 4.0. Please refer to the documentation
for command syntax changes. The old syntax will be accepted throughout
the 4.0 series, but all save files use the new syntax.
Type `help` to access the on-line reference manual
The gnuplot FAQ is available from
http://www.gnuplot.info/faq/
Send comments and requests for help to <info-gnuplot-beta@dartmouth.edu>
Send bugs, suggestions and mods to <info-gnuplot-beta@dartmouth.edu>
Terminal type set to 'x11'
gnuplot> fit [0.1:0.12] a*x+b "sample.dat" u (1/($1)):(($3)**2) via a,b
Iteration 0
WSSR : 14.761 delta(WSSR)/WSSR : 0
delta(WSSR) : 0 limit for stopping : 1e-05
lambda : 0.711307
initial set of free parameter values
a = 1
b = 1
/
Iteration 1
WSSR : 0.114571 delta(WSSR)/WSSR : -127.837
delta(WSSR) : -14.6464 limit for stopping : 1e-05
lambda : 0.0711307
resultant parameter values
a = 0.887916
b = 0.110179
/
Iteration 2
WSSR : 0.0831861 delta(WSSR)/WSSR : -0.377286
delta(WSSR) : -0.0313849 limit for stopping : 1e-05
lambda : 0.00711307
resultant parameter values
a = -0.454838
b = 0.230027
/
Iteration 3
WSSR : 0.00212404 delta(WSSR)/WSSR : -38.1641
delta(WSSR) : -0.0810621 limit for stopping : 1e-05
lambda : 0.000711307
resultant parameter values
a = -11.3808
b = 1.42104
/
Iteration 4
WSSR : 0.00158143 delta(WSSR)/WSSR : -0.343113
delta(WSSR) : -0.00054261 limit for stopping : 1e-05
lambda : 7.11307e-05
resultant parameter values
a = -12.3499
b = 1.52669
/
Iteration 5
WSSR : 0.00158143 delta(WSSR)/WSSR : -2.69974e-07
delta(WSSR) : -4.26945e-10 limit for stopping : 1e-05
lambda : 7.11307e-06
resultant parameter values
a = -12.3508
b = 1.52679
After 5 iterations the fit converged.
final sum of squares of residuals : 0.00158143
rel. change during last iteration : -2.69974e-07
degrees of freedom (ndf) : 15
rms of residuals (stdfit) = sqrt(WSSR/ndf) : 0.0102678
variance of residuals (reduced chisquare) = WSSR/ndf : 0.000105429
Final set of parameters Asymptotic Standard Error
======================= ==========================
a = -12.3508 +/- 0.4276 (3.462%)
b = 1.52679 +/- 0.04668 (3.057%)
correlation matrix of the fit parameters:
a b
a 1.000
b -0.999 1.000
Can't remerge with -mcpu=i686 on amd64, sorry (the toolchain is 64bit only by
default and I don't have enough time to get a cross-compiler up and running).
Besides, using the same gnuplot version on 32bit x86 does not crash (at least,
it hasn't for me for anything).
The point is, it's marked as 'stable' on amd64. If this is indeed a 64bit
issue, is it possible to verify whether anything similar appears on ia64? (as
that is also marked stable).
a backtrace from gdb shows the following:
Program received signal SIGSEGV, Segmentation fault.
0x0000002a9663b936 in strnlen () from /lib/libc.so.6
(gdb) bt
#0 0x0000002a9663b936 in strnlen () from /lib/libc.so.6
#1 0x0000002a9661030f in vfprintf () from /lib/libc.so.6
#2 0x00000000004146c8 in init_color ()
#3 0x00000000004126d9 in init_color ()
#4 0x0000000000414566 in init_color ()
#5 0x000000000040ac6e in init_color ()
#6 0x000000000040a85f in init_color ()
#7 0x000000000040a74a in init_color ()
#8 0x0000000000435fff in matherr ()
#9 0x0000002a965e08b1 in __libc_start_main () from /lib/libc.so.6
#10 0x00000000004049aa in ?? ()
Oh! I'm sorry, I didn't realize you were using amd64. (In fact, reading back
through your bug report I don't see this information anywhere, although I
could have missed it.)
I can't test it here, so I'm going to reassign this bug to the amd64 folks and
I'll cc the ia64 team as well.
Thanks.
I can confirm this bug on my amd64 box. Here is the backtrace from a
non-stripped build (same command and sample data as Armin provided):
*cut off*
After 5 iterations the fit converged.
final sum of squares of residuals : 0.00158143
rel. change during last iteration : -2.69974e-07
degrees of freedom (ndf) : 15
rms of residuals (stdfit) = sqrt(WSSR/ndf) : 0.0102678
variance of residuals (reduced chisquare) = WSSR/ndf : 0.000105429
Final set of parameters Asymptotic Standard Error
======================= ==========================
a = -12.3508 +/- 0.4276 (3.462%)
Program received signal SIGSEGV, Segmentation fault.
0x0000002a9662bcb6 in strnlen () from /lib/libc.so.6
(gdb) bt
#0 0x0000002a9662bcb6 in strnlen () from /lib/libc.so.6
#1 0x0000002a966004aa in vfprintf () from /lib/libc.so.6
#2 0x0000000000414e4a in Dblfn ()
#3 0x0000000000412ba4 in regress ()
#4 0x00000000004145fb in fit_command ()
#5 0x000000000040af45 in command ()
#6 0x000000000040ab2c in do_line ()
#7 0x000000000040aa5d in com_line ()
#8 0x0000000000436e45 in main ()
(gdb)
I'll have a look at that later...
compiled the last available build on sf.net:
G N U P L O T
Version 3.8k patchlevel 3
last modified Mon Mar 29 15:17:53 CEST 2004
crashes in the same place. Here's a backtrace on the debug build:
(gdb) bt
#0 0x0000002a9663b936 in strnlen () from /lib/libc.so.6
#1 0x0000002a9661030f in vfprintf () from /lib/libc.so.6
#2 0x00000000004198f9 in Dblfn (fmt=0x4d3268 "%-15.15s = %-15g %-3.3s %-12.4g (%.4g%%)\n") at fit.c:1688
#3 0x00000000004173d8 in regress (a=0x648490) at fit.c:777
#4 0x0000000000419773 in fit_command () at fit.c:1638
#5 0x000000000040c7ec in command () at command.c:511
#6 0x000000000040c3ae in do_line () at command.c:368
#7 0x000000000040c28d in com_line () at command.c:327
#8 0x000000000044fcb7 in main (argc=1, argv=0x7fbffff398) at plot.c:626
It happens as it tries to write the last output line (a = [...]) to the log
file. Could this actually be a libc bug, given that vfprintf() fails to print
to a file a line that it successfully printed to stdout?
Nice backtrace.. :-) How about working on a patch?
OK, commenting out vfprintf(log_f, fmt, args); in src/fit.c:Dblfn() avoids the
segfault. Probably a sizeof() problem with log_f on 64-bit archs?
Works for me now. The problem was not so much using vfprintf() on a file as
calling it without re-initializing "args" via VA_START. I made a patch for
this, plus a patch to the ebuild.
It does not seem necessary to remove printing to the log. Indeed, doing
va_end(args);
VA_START(args, fmt);
between the two vfprintf() calls seems to take care of it.
After looking at the C99 standard, this seems to be a gnuplot bug rather than
an amd64 one. Specifically, 7.15 par. 3 says that once a va_list (args) has
been passed to a function that uses va_arg on it, as the first vfprintf() call
does, its value is indeterminate and it must be passed to va_end before any
further use. So maybe make the patch required for all archs until a fix is
available upstream?
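For reference, the pattern described above can be sketched roughly as follows.
This is only an illustration, not the actual gnuplot patch: print_to_both() is
a made-up stand-in for Dblfn(), it takes the two streams as parameters instead
of using gnuplot's log_f, and it uses plain va_start where gnuplot uses its
VA_START wrapper macro.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for Dblfn(): print one formatted message to two
 * streams. Name, signature and return value are illustrative only. */
int print_to_both(FILE *first, FILE *second, const char *fmt, ...)
{
    va_list args;
    int n1, n2;

    va_start(args, fmt);
    n1 = vfprintf(first, fmt, args);
    va_end(args);           /* args is indeterminate after vfprintf() */

    va_start(args, fmt);    /* begin a fresh traversal for the second call */
    n2 = vfprintf(second, fmt, args);
    va_end(args);

    /* Return the character count if both calls wrote the same amount,
     * -1 otherwise (including when either call failed). */
    return (n1 >= 0 && n1 == n2) ? n1 : -1;
}
```

Called as print_to_both(stderr, log_f, fmt, ...), this walks the argument list
once per stream. C99 also provides va_copy, which would be an alternative way
to get a second, independent va_list.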
also posted as a bug on the gnuplot sf.net bugtracker here:
http://sourceforge.net/tracker/index.php?func=detail&aid=932162&group_id=2055&atid=102055
I've committed an amd64-specific ebuild for this to CVS. if this fix needs to
exist on all archs, then I'll leave that decision up to somebody other than
myself.
I'm re-assigning this bug to the maintainer for this package, but it should be
fixed on amd64 now. give it a bit to reach rsync.
Closing, since there's been no recent input.
Really closing this time.