Bug 346253 - app-misc/screen hangs on reattach
Summary: app-misc/screen hangs on reattach
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal with 1 vote
Assignee: Sven Wegener
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-21 04:13 UTC by Timothy Miller
Modified: 2017-01-21 13:06 UTC
CC List: 9 users

See Also:
Package list:
Runtime testing required: ---


Description Timothy Miller 2010-11-21 04:13:26 UTC
I believe I've run into an intermittent bug that has been known about since 2005 but that the maintainers of GNU screen have ignored.  There's an article on it at "http://churchturing.org/w/screen/" (scroll down to the section entitled "Contribution").

The problem occurs when you're running GNU screen and the connection drops.  When you reconnect later and try to reattach, the reattach sometimes fails: "screen -r -d" just hangs.

I've attached a debugger, and this is what I see:

0x00007f59953073c3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
82	../sysdeps/unix/syscall-template.S: No such file or directory.
	in ../sysdeps/unix/syscall-template.S
(gdb) where
#0  0x00007f59953073c3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1  0x0000000000441ccc in sched () at sched.c:124
#2  0x00000000004062b0 in main (ac=<value optimized out>, av=<value optimized out>) at screen.c:1365

I've tried all sorts of things to get this unstuck.  Attach/detach a debugger, send a signal, various command-line options to screen.  Nothing helps.  I've also had this bug hit me several times.  I don't know what's making it happen for me more than for most people, but clearly, I'm not the only one that's experienced it.

Since there's a patch available, I wonder if I couldn't encourage the Gentoo devs to (a) apply the patch to the Gentoo source, and (b) put some pressure on the upstream to do the same.

Here's a post on the gnu screen mailing list, where they claim there's no fix, although Hindle seems to differ:  http://www.mail-archive.com/screen-users@gnu.org/msg02595.html

I'm going to leave screen in this stuck state for a while in case anyone wants me to do any probing with the debugger.

BTW, I contacted Hindle.  He says that although screen refuses to apply the patch, Debian has accepted it because it works.  I strongly urge the Gentoo devs to follow suit.

Reproducible: Always




$ emerge --info
Portage 2.1.9.24 (default/linux/amd64/10.0, gcc-4.4.5, glibc-2.12.1-r3, 2.6.35-gentoo-r5 x86_64)
=================================================================
System uname: Linux-2.6.35-gentoo-r5-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q9450_@_2.66GHz-with-gentoo-2.0.1
Timestamp of tree: Sat, 20 Nov 2010 08:00:01 +0000
app-shells/bash:     4.1_p9
dev-java/java-config: 2.1.11-r2
dev-lang/python:     2.6.6-r1, 3.1.2-r4
dev-util/cmake:      2.8.1-r2
sys-apps/baselayout: 2.0.1-r1
sys-apps/openrc:     0.6.4
sys-apps/sandbox:    2.3-r1
sys-devel/autoconf:  2.13, 2.68
sys-devel/automake:  1.8.5-r4, 1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.5
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.4
sys-devel/make:      3.82
virtual/os-headers:  2.6.35 (sys-kernel/linux-headers)
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=core2 -ggdb -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -march=core2 -ggdb -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--jobs=2"
FEATURES="assume-digests binpkg-logs distlocks fixlafiles fixpackages news parallel-fetch protect-owned sandbox sfperms splitdebug strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://gentoo.netnitco.net http://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ ftp://mirror.datapipe.net/gentoo ftp://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ http://gentoo.mirrors.easynews.com/linux/gentoo/ ftp://ftp.free.fr/mirrors/ftp.gentoo.org/ ftp://gentoo.imj.fr/pub/gentoo/ ftp://distro.ibiblio.org/pub/linux/distributions/gentoo/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en en_US"
MAKEOPTS="--jobs=3 --load-average=7"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac aalib acl acpi alsa amd64 apache2 aspell autotrace bash-completion berkdb bidi bonjour bzip2 cairo cdda cdio cdr cli composite cracklib crypt ctype cups curl cxx dbus device-mapper dri dts dvd dvdr encode exif extras fbcon ffmpeg fftw filter flac fontconfig fortran freetype gcj gd gdbm git glib gmm gnutls gpm graphviz gs httpd iconv imagemagick ipp ipv6 ithreads jadetex java jpeg jpeg2k kde kde4 kerberos kpathsea kvm lame lapack latex lcms ldap live lm_sensors lzma mad matroska mdnsresponder-compat mjpeg mkl mmx mng modules mp3 mpeg mudflap multilib mysql mysqli ncurses nls nptl nptlonly ogg oggvorbis openexr opengl openmp openssl pam pcre pdf perl php plasma plotutils png ppds pppd python qemu qt3support qt4 quicktime readline reports rss ruby samba sasl secure-delete semantic-desktop session smp spl sql sse sse2 sse3 ssl stream subversion svg sysfs tcl tcpd theora threads thumbnail tiff tk truetype unicode utempter vcd vlm vnc vorbis webkit wxwindows x264 xcomposite xml xorg xv xvid zeroconf zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="en en_US" PHP_TARGETS="php5-2" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" VIDEO_CARDS="radeon" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2010-11-24 00:18:23 UTC
screen has seen a lot of development lately, so maybe the patch is in already, or the bug has been fixed in a different way. Upstream has been more responsive to its bug tracker too.
Comment 2 gent_bz 2010-11-25 02:46:07 UTC
This may or may not be related to the reporter's problem:

I experienced this bug recently with 4.0.3-r3 when attempting to re-attach to a screen session started when running 4.0.3-r1.  I reinstalled 4.0.3-r1 and was able to re-attach to running sessions.  I have not upgraded and re-started my screen sessions to see if this is a bug in 4.0.3-r3.
Comment 3 Timothy Miller 2010-11-25 03:15:54 UTC
The reason upstream wouldn't accept Hindle's patch is that it uses a longjmp, and they object on philosophical grounds.  Debian, on the other hand, applies the patch on practical grounds: it actually fixes the problem.  The screen devs seem to want to blame the problem on the kernel and wash their hands of it.

That being said, Mr. Adamczewski has an interesting point.  In my case, the lockup didn't happen in write(), which is what the patch fixes.  Rather, the screen daemon was in select(), yet I couldn't attach.  I don't recall whether or not I had upgraded between the time I started screen and when I tried to reattach, but I can imagine how a client/daemon protocol change (if there was any) from one version to the next could be responsible for what I experienced.  If that is the case, then my problem is a non-bug, although nevertheless, Gentoo devs may want to consider applying the patch for the pre-existing hang problem.
Comment 4 Dirkjan Ochtman (RETIRED) gentoo-dev 2011-04-11 08:53:30 UTC
I'm seeing a similar problem, on two separate boxes (stable amd64 and stable x86). All three of those screens won't re-attach with screen-4.0.3-r4 (which has just gone stable), even though re-attaching with screen-4.0.3 (after a downgrade) works fine. One of them contains an irssi client, the others are both also doing fairly heavy networking jobs (through Python scripts, using zeromq). Please re-evaluate the merits of this bug, as this seems like a big regression.
Comment 5 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2011-04-14 20:47:13 UTC
Which patch should we be looking at? I gave up researching it.

http://patch-tracker.debian.org/package/screen/4.0.3-11+lenny1
Comment 6 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2011-04-15 19:10:35 UTC
(In reply to comment #3)

> That being said, Mr. Adamczewski has an interesting point.  In my case, the
> lockup didn't happen in write(), which is what the patch fixes.  Rather, the
> screen daemon was in select(), yet I couldn't attach.  I don't recall whether
> or not I had upgraded between the time I started screen and when I tried to
> reattach, but I can imagine how a client/daemon protocol change (if there was
> any) from one version to the next could be responsible for what I experienced. 
> If that is the case, then my problem is a non-bug, although nevertheless,
> Gentoo devs may want to consider applying the patch for the pre-existing hang
> problem.

Interesting, indeed. I've recreated the "hanging" issue quite often now, myself. However, I can easily "fix" that by killing the session and starting a new session for the task at hand. From there on out, no more hanging. While initially upset about this software issue, I have since calmed down and got my screen sessions back in order. Ergo, doubtful that I will be seeking out a patch to apply to Gentoo anymore.
Comment 7 Sven Wegener gentoo-dev 2011-04-16 11:10:28 UTC
Without having a deeper look into the issue, I suspect that the patch 4.0.3-extend-d_termname-ng2.patch changes structures that screen uses to communicate between the backend and the attaching process and this breaks 4.0.3-r4 attaching to 4.0.3.
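
To illustrate the kind of breakage suspected here, the following is a minimal Python sketch, not screen's actual code. It assumes, as the patch name suggests, that a fixed-size terminal-name field inside the attach message grew. The layouts, field names, and sizes are hypothetical.

```python
import struct

# Hypothetical message layouts. These are NOT screen's real structures;
# they only show what happens when a fixed-size field is extended.
OLD_LAYOUT = struct.Struct("<i20si")  # type, termname[20], pid
NEW_LAYOUT = struct.Struct("<i32si")  # type, termname[32], pid


def pack_attach(layout, msg_type, termname, pid):
    """Build an attach request using the given layout."""
    return layout.pack(msg_type, termname.encode(), pid)


def try_unpack(layout, payload):
    """Decode the payload, or return None if its size does not fit the layout."""
    if len(payload) != layout.size:
        return None
    msg_type, termname, pid = layout.unpack(payload)
    return msg_type, termname.rstrip(b"\0").decode(), pid


# A patched client builds its request with the new, larger layout.
request = pack_attach(NEW_LAYOUT, 1, "xterm-256color", 4242)

print(OLD_LAYOUT.size, NEW_LAYOUT.size)   # the two sides disagree on message size
print(try_unpack(NEW_LAYOUT, request))    # a patched backend decodes it
print(try_unpack(OLD_LAYOUT, request))    # an unpatched backend cannot
```

A backend started before the upgrade keeps the old layout in memory for as long as the session lives, which would explain why only sessions started with the older build are affected.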
Comment 8 Sven Wegener gentoo-dev 2011-04-16 11:14:30 UTC
Actually, that refers to the last user comments on the -r4 issue.
Comment 9 Dirkjan Ochtman (RETIRED) gentoo-dev 2011-04-18 11:00:26 UTC
Yeah, it seems I can reattach to my irssi screen fine when it's started by -r4. I guess this might not be an issue for me after all, thanks.
Comment 10 Dustin J. Mitchell 2011-05-02 00:24:11 UTC
I'm seeing this on upgrade from 4.0.3 to -r4, as well.  I can confirm that exiting my screen session and re-starting with -r4 fixes it.  A note about this would probably save users with long-running screen sessions a lot of searching about!
Comment 11 Timothy Miller 2011-05-02 01:44:24 UTC
I'm debating in my mind whether or not this repeated breakage is a sort of "bug" (in a work-flow sort of way).  It's like breaking an API, except that the API is entirely internal to the program.  Normally, no one sees these things, but in this case, there are user-visible effects.  It sure would be nice if the devs would attempt to stabilize the client-server protocol so that this would happen only on major version changes.  You know... a little thinking ahead, rather than lots of little tweaks.  How might we get a little note about this to the developers?
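
One common way to turn such a mismatch into a clear error rather than a silent hang is a version field that is checked at the start of every message. The sketch below is hypothetical Python. The magic value, version number, and function names are invented for illustration, and it makes no claim about how screen's own message format handles revisions.

```python
import struct

MAGIC = b"SCRN"            # hypothetical marker, not screen's real one
PROTOCOL_VERSION = 2       # hypothetical version of the attach protocol
HEADER = struct.Struct("<4sH")  # magic, protocol version


class ProtocolMismatch(Exception):
    """Raised when client and backend do not speak the same protocol."""


def make_header(version=PROTOCOL_VERSION):
    """Header that a client would send before the message body."""
    return HEADER.pack(MAGIC, version)


def check_header(data, supported=PROTOCOL_VERSION):
    """Validate a received header; return its version or raise ProtocolMismatch."""
    if len(data) < HEADER.size:
        raise ProtocolMismatch("short header")
    magic, version = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ProtocolMismatch("not a session message")
    if version != supported:
        raise ProtocolMismatch(
            f"client speaks v{version}, backend speaks v{supported}"
        )
    return version
```

With a check like this, an attach attempt from a mismatched build could fail immediately with a message telling the user to restart the session or use the older binary.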
Comment 12 Dustin J. Mitchell 2011-05-02 01:49:04 UTC
Timothy - sounds like something to post to screen's newsgroup or mailing list.  It also sounds like something of a religious issue, so it's probably better to handle this in portage in the interim.

It's worth noting that this particular breakage was caused by a patch added in portage - see comment 7.  So the screen devs aren't to blame here.
Comment 13 Chris Xe 2011-05-30 17:23:33 UTC
Just casting my vote that fixing this would be nice, and noting what worked for me to get unstuck. The problem started after I did a normal `emerge -uD world`, detached my screen, and went to bed. I woke up... but my screen session didn't. It was annoying not to see the emerge output. To fix the problem I did
   `sudo emerge -av  =app-misc/screen-4.0.3`
and that allowed me to reattach to the screen session which was waiting patiently. Apparently it was upgrading to screen-4.0.3-r4 that caused the problem. (Just like the other posters indicate.)
I don't know if this is practical, but it seems like screen should not upgrade if it can detect that screen is running. If it's not running, the upgrade should be fine. If the upgrade gets aborted, it could leave a message saying to shut down screen first. I could live with that; it's just the surprise of not having my session come back that was hard to take. (I had some kind of important stuff in other windows.) Thanks to all who posted and helped me solve my particular problem with this.
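
The "is screen running?" check suggested above could be sketched roughly as follows. This is an illustrative Python helper, not part of Portage or the screen ebuild. It assumes a Linux system where `/proc/<pid>/comm` is available, and the function name is made up. How such a check would be wired into an upgrade is left open.

```python
import os


def running_screen_pids(proc_root="/proc"):
    """Return sorted PIDs whose command name is 'screen' (Linux /proc assumed)."""
    pids = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return pids  # no /proc here, so nothing can be detected
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                name = fh.read().strip()
        except OSError:
            continue  # process exited or is not readable
        # Compare case-insensitively so both "screen" and "SCREEN" match.
        if name.lower() == "screen":
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    pids = running_screen_pids()
    if pids:
        print("screen sessions still running, PIDs:", pids)
    else:
        print("no running screen processes found")
```

A warning printed from such a check may be friendlier than aborting the upgrade outright, since users can then decide whether to restart their sessions first.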
Comment 14 Patrice Clement gentoo-dev 2015-10-10 16:30:39 UTC
This version of screen is no longer in Portage. Some comments indicate this issue is gone with more recent versions of screen. Feel free to reopen this bug if it isn't.
Comment 15 EoD 2017-01-21 13:06:47 UTC
(In reply to Patrice Clement from comment #14)
> This version of screen is no more in Portage. Some comments indicate this
> issue is gone with more recent versions of screen. Feel free to reopen this
> bug if it isn't.

I have the issue after upgrading from app-misc/screen-4.3.1-r1 to app-misc/screen-4.4.0.