Bug 153939 - mail-filter/bogofilter - crashes on token ' id '
Bug#: 153939 Product:  Gentoo Linux Version: 2006.1 Platform: All
OS/Version: Linux Status: RESOLVED Severity: normal Priority: P2
Resolution: FIXED Assigned To: net-mail@gentoo.org Reported By: dang@gentoo.org
Component: Applications
URL: 
Summary: mail-filter/bogofilter - crashes on token ' id '
Keywords:  
Status Whiteboard: 
Opened: 2006-11-03 07:57 0000
Description:
While batch processing a huge junk file, I had numerous cores in bogofilter
that looked like this:

#0  0x0000000000413b7f in parse_new_token (token=0x7fff424891c0) at token.c:355
355                     if (text[leng-1] == '>') {
(gdb) bt
#0  0x0000000000413b7f in parse_new_token (token=0x7fff424891c0) at token.c:355
#1  0x00000000004135cc in get_token (token=0x7fff424891c0) at token.c:157
#2  0x0000000000406974 in collect_words (wh=0x575150) at collect.c:48
#3  0x0000000000402d6e in bogofilter (argc=0, argv=0x7fff424893e0) at bogofilter.c:97
#4  0x0000000000404d80 in bogomain (argc=1, argv=0x7fff424893d8) at bogomain.c:67
#5  0x0000000000402fea in main (argc=1, argv=0x7fff424893d8) at main.c:31
(gdb) p leng
$1 = 0
(gdb) p token->leng
$2 = 4
(gdb) p text[-4]
$4 = 32 ' '
(gdb) p text[-3]
$5 = 105 'i'
(gdb) p text[-2]
$6 = 100 'd'
(gdb) p text[-1]
$7 = 32 ' '


Naive fix attached; with it applied, all the cores went away for me.
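
To illustrate the failure mode in the gdb session above: with leng == 0, the
expression text[leng-1] on token.c:355 reads outside the token. The sketch
below is not the attached patch (its contents aren't shown in this report);
token_ends_with_gt is a hypothetical helper that shows one way to guard the
access, by checking the length before indexing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not bogofilter's actual code.
 * Returns 1 if the token text ends in '>', 0 otherwise.
 * The leng > 0 test is evaluated first, so text[leng - 1]
 * is never read for an empty token. */
static int token_ends_with_gt(const unsigned char *text, size_t leng)
{
    assert(text != NULL);
    return leng > 0 && text[leng - 1] == '>';
}
```

Called with an empty token (leng of 0) this returns 0 instead of touching
memory before or beyond the buffer.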

------- Comment #1 From Daniel Gryniewicz 2006-11-03 07:57:47 0000 -------
Created an attachment (id=101144) [details]
Proposed fix

------- Comment #2 From David Relson 2006-11-04 06:18:01 0000 -------
Created an attachment (id=101217) [details]
lexer change to fix QUEUE_ID problem

Torsten,

I've reproduced the problem by splitting "id <identification>" across two
lines.  The parser expects a single space in the middle of the QUEUE_ID. My fix
is to allow spaces and newlines.

David
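
For readers without the attachment, here is a rough sketch of the idea in
plain C. It is not the lexer change itself (that is a lexer rule, not shown
in this report); after_id_keyword is a hypothetical function that accepts
"id" followed by any run of spaces and newlines, rather than exactly one
space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lexer rule, not bogofilter's code.
 * If s starts with "id" followed by one or more spaces or newlines,
 * returns a pointer to the first character after that run.
 * Otherwise returns NULL. Tabs and carriage returns are deliberately
 * not handled; the comment above only mentions spaces and newlines. */
static const char *after_id_keyword(const char *s)
{
    assert(s != NULL);
    if (s[0] != 'i' || s[1] != 'd')
        return NULL;
    s += 2;
    if (*s != ' ' && *s != '\n')
        return NULL;            /* at least one separator is required */
    while (*s == ' ' || *s == '\n')
        s++;
    return s;
}
```

With this, both "id <identification>" on one line and the same text split
across two lines yield a pointer to the '<'.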

------- Comment #3 From Torsten Veller 2006-11-04 07:28:05 0000 -------
Thank you, David.

Daniel, can you please test it with your huge junk file?

------- Comment #4 From Daniel Gryniewicz 2006-11-04 13:21:23 0000 -------
It didn't fix it entirely. I still get the same core, but only 4 cores instead
of 12, so it helped. Maybe I can knock up a script to figure out which mails
cause the core.

------- Comment #5 From Daniel Gryniewicz 2006-11-04 13:39:12 0000 -------
Created an attachment (id=101243) [details]
Emails that cause bogofilter to crash

Okay, here's the set of emails that cause bogofilter to crash even with the
lexer change.  I've also sent this to the upstream maintainer in response to
the email thread.

------- Comment #6 From Daniel Gryniewicz 2006-11-04 14:14:50 0000 -------
Created an attachment (id=101246) [details]
bogofilter.cf

------- Comment #7 From Daniel Gryniewicz 2006-11-04 14:34:21 0000 -------
This appears to be related to being on amd64.
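
One possible explanation, offered as a guess: if leng is a 32-bit unsigned
int (an assumption; its declaration isn't shown in this report), then leng-1
wraps to 0xFFFFFFFF when leng is 0. With 32-bit pointers the address
arithmetic wraps around to text[-1], which is a valid byte here; with 64-bit
pointers the access lands about 4 GiB past text, which would likely fault.
The hypothetical helpers below only compute the two displacements.

```c
#include <stdint.h>

/* Hypothetical illustration, not bogofilter code.
 * Displacement from text that text[leng - 1] addresses when leng is a
 * 32-bit unsigned value and pointers are 64 bits wide. */
static int64_t displacement_64bit(uint32_t leng)
{
    uint32_t index = leng - 1u;   /* wraps to 0xFFFFFFFF when leng == 0 */
    return (int64_t)index;        /* zero-extended before the pointer add */
}

/* Same expression with 32-bit pointers: address arithmetic is modulo
 * 2^32, so adding 0xFFFFFFFF behaves like subtracting 1. The cast to
 * int32_t is implementation-defined before C23 but yields -1 on the
 * usual two's-complement compilers. */
static int64_t displacement_32bit(uint32_t leng)
{
    uint32_t index = leng - 1u;
    return (int64_t)(int32_t)index;
}
```

For leng == 0 the first helper gives 4294967295 and the second gives -1; for
any non-zero leng the two agree.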

Portage 2.1.2_rc1-r3 (default-linux/amd64/2006.1, gcc-4.1.1, glibc-2.5-r0,
2.6.18-ck1-dfg3 x86_64)
=================================================================
System uname: 2.6.18-ck1-dfg3 x86_64 AMD Turion(tm) 64 Mobile Technology ML-28
Gentoo Base System version 1.12.6
Last Sync: Unknown
distcc 2.18.3 x86_64-pc-linux-gnu (protocols 1 and 2) (default port 3632)
[enabled]
ccache version 2.4 [enabled]
app-admin/eselect-compiler: [Not Present]
dev-java/java-config: [Not Present]
dev-lang/python:     2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     2.4-r6
dev-util/confcache:  [Not Present]
sys-apps/sandbox:    1.2.18.1
sys-devel/autoconf:  2.13, 2.60
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.17
sys-devel/gcc-config: 1.3.14
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.17-r1
ACCEPT_KEYWORDS="amd64 ~amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=athlon64 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo
/etc/texmf/web2c"
CXXFLAGS="-march=athlon64 -O2 -pipe"
DISTDIR="/home/portage/distfiles"
FEATURES="autoconfig ccache collision-protect cvs distcc distlocks
metadata-transfer multilib-strict nostrip sandbox sfperms sign strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org
http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LC_ALL="en_US.utf8"
LDFLAGS="-Wl,--as-needed"
LINGUAS="en en_US de"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress
--force --whole-file --delete --delete-after --stats --timeout=180
--exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/home/portage"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/home/portage/overlays/gnome /home/portage/overlays/dang
/home/portage/overlays/uml-overlay /home/portage/overlays/dang.overlays/user
/home/portage/overlays/dang.overlays/maintainer /home/portage/overlays/sunrise"
SYNC="rsync://nobody.nowhere.foo/fuckoff"
USE="amd64 X aac aalib accessibility acpi alsa apache2 artworkextra avahi avi
bash-completion beagle berkdb bitmap-fonts bzip2 cairo cdr cli cracklib crypt
cscope cups db2 dbus dga directfb divx4linux dlloader dri dvd eds elibc_glibc
encode ethereal evo exif fam firefox flac font-server foomaticdb gd gdbm gif
gimpprint gnome gnutls gphoto2 gstreamer gtk gtk2 gtkhtml guile hal howl iconv
imagemagick imap imlib input_devices_evdev input_devices_keyboard
input_devices_mouse input_devices_synaptics ipv6 isdnlog jabber jpeg
kernel_linux ldap libg++ libnotify libwww linguas_de linguas_en linguas_en_US
logrotate lzo lzw mad matroska mikmod mime mmap mng mono motif mozilla mpeg mpi
ncurses nls nptl nptlonly offensive ogg opengl oscar oss pam pcmcia pcre pda
pdflib perl png ppds pppd python qemu-fast quicktime readline reflection rtc
samba sdl session sharedmem slang slp smime soap softmmu speex spell spl sqlite
sse3 ssl svg tcpd theora tiff truetype truetype-fonts type1-fonts udev unicode
usb userland_GNU video_cards_ati video_cards_radeon video_cards_vesa vorbis
wifi wmf xface xml xml2 xorg xsl xv xvid zlib"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG,
PORTAGE_RSYNC_EXTRA_OPTS


[17:29:56 athena] ~> bogofilter -V
bogofilter version 1.1.1
    Database: Sleepycat Software: Berkeley DB 4.3.29: (September  6, 2005) AUTO-XA
Copyright (C) 2002-2003 Eric S. Raymond, Adrian Otto, Gyepi Sam.
Copyright (C) 2002-2006 David Relson, Matthias Andree, Greg Louis

bogofilter comes with ABSOLUTELY NO WARRANTY.  This is free software, and
you are welcome to redistribute it under the General Public License.  See
the COPYING file with the source distribution for details.

------- Comment #8 From Daniel Gryniewicz 2006-11-04 15:41:02 0000 -------
Created an attachment (id=101249) [details]
Replacement patch

I had to change the definition of ID as well.  With this patch, I get output
from all those emails.

------- Comment #9 From Torsten Veller 2006-11-08 02:24:22 0000 -------
bogofilter-1.1.1-r1 committed with the patch.