Summary of Bug: With asterisk-1.2.4 using festival-1.4.3-r3, when asterisk is configured to cache festival speech files, festival goes into an infinite loop. During operation (i.e. a request to the server for text-to-speech translation), festival eventually uses 90%+ of system CPU and causes a system lockup.

How to create the problem:
1) Configure "festival.conf" (in the /etc/asterisk directory) and set "usecache=yes".
2) Set up an asterisk extension to use festival (extensions.conf):
exten => s,1, Festival('Testing Festival Speech')
exten => s,2, Festival('All work and no play make jack a dull boy')
exten => s,3, Festival('All of you be damned, we cant have heaven crammed')
exten => s,4, Festival('Rock and Roll will never die')
3) Activate the extension. Sometimes requires a repeat.

Reproducible: Yes - problem reproduced on two Gentoo systems (AMD Sempron 3100+ and Intel P4 2.8G). Deactivating the cache feature eliminates the problem.

emerge info:
Portage 2.0.54 (default-linux/x86/2005.1, gcc-3.4.4, glibc-2.3.5-r2, 2.6.15-gentoo-r1ast01 i686)
=================================================================
System uname: 2.6.15-gentoo-r1ast01 i686 AMD Sempron(tm) Processor 3100+
Gentoo Base System version 1.6.14
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
dev-lang/python: 2.3.5-r2, 2.4.2
sys-apps/sandbox: 1.2.12
sys-devel/autoconf: 2.13, 2.59-r6
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-xp -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/lib/mozilla/defaults/pref /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/
/usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/bind /var/qmail/control" CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d" CXXFLAGS="-O2 -march=athlon-xp -pipe" DISTDIR="/usr/portage/distfiles" FEATURES="autoconfig ccache distcc distlocks sandbox sfperms strict" GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo" MAKEOPTS="-j8" PKGDIR="/usr/portage/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/local/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="x86 16bit 3dnow 3dnowext 7zip X a52 aac aalib acpi aim alsa amd apache2 apm arts asf asterisk async audiofile automount avi berkdb bitmap-fonts bonobo bootsplash bzip2 ccache cdparanoia cdr clamav clamd cli command-args cpdflib cpudetection crypt css cups curl dba dbus dga divx4linux dts dv dvd dvdr dvdread edl eds emboss encode enscript esd exif expat extensions fam fame fastcgi fat ffmpeg flac font-server fontconfig foomaticdb fortran fping fpx ftp gcj gd gdbm geometry gif gimpprint glut glx gnome gnome-print gphoto2 gpm gs gstreamer gtk gtk2 gtkhtml guile hal icecast idn ieee1394 imagemagick imap imlib innodb inode iodbc jack jai java java-internal javamail javascript jbig jce jp2 jpeg jpeg2k junit kde kdexdeltas lame largeterminal lcms ldap libcaca libg++ libgd libvisual libwww live lm_sensors lzo lzw mad mbox mcal mhash mikmod milter mime ming mjpeg mmap mmx mmxext mng motif moznocompose moznoirc moznomail mp3 mp4live mpeg mpeg2 mpeg4 mpi mplayer mpm-worker mysql mysqli nagios-dns nagios-ntp nagios-ping nagios-ssh nas ncurses netpbm network nfs nls nptl ntfs objc odbc ofx ogg oggvorbis on-the-fly-crypt openal opengl oss pam pcre pdflib pear perl php png postgres ppds python qt quicktime rar readline real recode rtc samba sasl sdl session sharedext sharedmem silc slang sndfile sockets speex spell sql sqlite sse sse2 ssl subtitles subversion tcltk tcpd tetex 
threads tidy tiff tokenizer transcode truetype truetype-fonts type1-fonts udev unicode usb v4l v4l2 vcd vcdimager vidix vim vim-pager vim-with-x vmdbmysql vorbis win32codecs wmf xine xml xml2 xmlrpc xmms xosd xpm xsl xslt xv xvid xvmc yv12 zapras zip zlib zvbi userland_GNU kernel_linux elibc_glibc" Unset: ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS
trying to reproduce the problem
please attach your festival.conf
Created attachment 81569: Festival.conf
Setting "usecache=yes" creates the problem for me.
Additional configuration info:
asterisk use flags: +alsa -bri +curl -debug +doc +gtk -h323 -hardened -lowmem +mmx +mysql -nosamples +odbc -osp -postgres -pri +speex +sqlite +ssl -ukcid +zaptel
festival use flags: +asterisk -doc
does /tmp/asterisk/cache exist? what are the permissions on that directory?
/tmp:
drwxrwxrwt 31 root root 4096 Mar 6 15:50 /tmp
/tmp/asterisk:
drwxr-xr-x 3 asterisk asterisk 4096 Mar 4 23:36 /tmp/asterisk
/tmp/asterisk/cache:
drwxr-xr-x 2 asterisk asterisk 4096 Mar 4 23:53 /tmp/asterisk/cache
Created attachment 81571: patch for festival-1.4.3 to fix (possible) endless loop in case of a configuration error

The lockup is caused by missing error handling in the festival server code. The server process spawns a new child process for every request and uses waitpid (in a loop) to wait for child processes that have finished. In case of an error (e.g. if one of the children dies prematurely) waitpid returns -1, and the main process ends up stuck in the waitpid loop, consuming 100% of CPU time.

The attached patch adds some error handling to avoid that. However, one issue remains: dead children may end up as zombie processes that are not removed by the server process. I have no idea how to avoid this at the moment, but that situation is still better than before.
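For readers following along, here is a minimal sketch of the kind of error handling described above. It is not the attached patch and not festival's actual server code; the function name reap_children is hypothetical. It only illustrates a waitpid loop that stops when waitpid reports an error instead of retrying forever.

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper, not taken from festival: collect every child that
 * has already exited, without blocking. Returns the number reaped. */
int reap_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0) {      /* one finished child collected, look for more */
            reaped++;
            continue;
        }
        /* pid == 0: children exist but none has exited yet.
         * pid == -1: an error such as ECHILD (no children left).
         * In both cases leave the loop rather than spinning on it. */
        break;
    }
    return reaped;
}
```

Calling such a helper after each request would also collect exited children, so they do not linger as <defunct> entries until the server exits.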
is ;(voice_us1_mbrola) enabled in your /etc/festival/server.scm file? if yes, is mbrola installed?
mbrola is installed, but not used (commented out in the config file). I was using the default voice.

Interesting info on the patch. Running in non-cache mode, I am getting the following process hanging around:
root 21736 15323 0 16:42 ? 00:00:00 [festival] <defunct>
It didn't seem to be a problem, but odd nonetheless. The festival server process is:
root 15323 1 0 Mar05 ? 00:00:00 /usr/bin/festival --server -b /etc/festival/server.scm
Hmm ok, I think that's because the server process doesn't wait in the loop anymore, meaning those zombie processes will only be cleaned up after the next request. I guess the only real solution would be to rewrite the server loop that handles new incoming connections. Maybe I can get something working in the next couple of days.
Sorry, I should have clarified that the festival defunct process was with the unpatched release. I will make changes to the ebuild and patch it, and see if it helps. BTW - I saw some posts that you may be upgrading to the latest release (festival 1.9.5?). It looks like that has some very good new voices. Is that being posted to portage anytime soon, and should I wait for that to be released, along with the newest version of asterisk (1.2.5)? I had to unmask asterisk 1.2.4, but it seems to be working just fine.
Applied the patch to 1.4.3-r3, and set cache to yes. With asterisk/festival on the same host, I verified the cache is working - it is creating files in the cache directory, and festival CPU usage appears low on repeat phrases. No loops or lockups as yet in my limited testing. I will try the festival network server config, and repeat the test. The <defunct> process is still there, but appears not to be a problem.

The patch also solved another issue. When asterisk called festival, it would log a large number (50 - 100) of event messages with "utils.c negative timestamp error" for each festival playback. These messages have now stopped after applying the patch. Thanks!!
Created attachment 82032: reimplementation of the main loop based on select()

The new patch changes the main loop in the festival server to use a select call with a timeout. After each serviced request or timeout, waitpid is called to clean up child tasks.
Festival 1.95 is now in portage. Is this still happening?
All, is this still an issue with festival 1.95-beta?
I am closing this since I haven't heard whether this is continuing to be an issue with festival 1.95.