Bug 219577 - sys-libs/gpm-1.20.3 doesn't accept connections to /dev/gpmctl while X session active
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Library
Hardware: All Linux
Importance: High normal with 1 vote
Assignee: Gentoo's Team for Core System packages
Duplicates: 268480
Reported: 2008-04-28 09:22 UTC by Martin von Gagern
Modified: 2010-12-24 20:32 UTC
CC List: 5 users

Description Martin von Gagern 2008-04-28 09:22:25 UTC
I found out I could only run a very limited number of curses programs at the same time in my X session before they started to hang. Stracing such hanging tasks showed them to be connecting to /dev/gpmctl. Switching to the text console and moving the mouse there seemed to help, as did restarting gpm.

Steps to reproduce:
1. Make sure ncurses was emerged with gpm support.
2. Make sure gpm is running.
3. Open urxvt from x11-terms/rxvt-unicode.
4. Run screen in that terminal window.
5. Inside screen run "dialog --yesno Test 10 40".
6. Create new screen window (Ctrl+A Ctrl+C) and repeat 5.
7. Keep creating windows and starting dialogs until a dialog fails to display.
   I could start about five or six dialogs before it started to fail.
8. Run "netstat -x -a -p | grep gpm"
   It will display several sockets in "CONNECTING" state.

It looks like gpm wouldn't accept any connections and some queue of pending connections filled up. Ltrace of the daemon shows a lot of sleep(2), which seem to come from daemon/wait_text.c:wait_text() where a loop waits for the console to enter text mode, not giving the select on the socket any chance.

Although one might argue that ncurses should not deal with gpm at all when run in an X terminal window, I would also assume that a service that provides a socket should also accept a reasonable number of connections on said socket. As this whole socket communication is encapsulated inside the gpm library, Gpm_Open should ensure it never hangs.

I'm a bit surprised that the connect syscall for a unix domain socket seems to return before the connection is completely established. I'm also surprised that such pending connections seem to outlive the process which initiated them, at least if the initiating process is aborted in some way.
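
The suspected mechanism (a listener that never calls accept() while its queue of pending connections fills up) can be reproduced in isolation. The following is a minimal C sketch, not gpm code: the function name count_queued_connects, the socket path passed to it, and the cap of 64 connections are all hypothetical. It binds a Unix domain socket, deliberately never accepts, and counts how many non-blocking connects succeed before one fails. It also shows one way a client library could avoid hanging: with O_NONBLOCK set, connect() on a full queue returns an error instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a sockaddr_un for the given filesystem path. */
static void make_addr(struct sockaddr_un *addr, const char *path)
{
    assert(strlen(path) < sizeof addr->sun_path);
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
}

/*
 * Listen on `path` with the given backlog and never call accept(),
 * then issue up to `max_tries` (capped at 64) non-blocking connects.
 * Returns how many connects succeeded before one failed,
 * or -1 if the listener could not be set up.
 */
int count_queued_connects(const char *path, int backlog, int max_tries)
{
    struct sockaddr_un addr;
    int clients[64];
    int queued = 0;

    if (max_tries > 64)
        max_tries = 64;
    make_addr(&addr, path);
    unlink(path); /* remove a stale socket file, if any */

    int srv = socket(AF_UNIX, SOCK_STREAM, 0);
    if (srv < 0)
        return -1;
    if (bind(srv, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(srv, backlog) < 0) {
        close(srv);
        unlink(path);
        return -1;
    }

    for (int i = 0; i < max_tries; i++) {
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            break;
        /* Non-blocking, so a full queue yields an error, not a hang. */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            close(fd);
            break;
        }
        clients[queued++] = fd; /* queued, but never accepted */
    }

    for (int i = 0; i < queued; i++)
        close(clients[i]);
    close(srv);
    unlink(path);
    return queued;
}
```

Called from a small main() as count_queued_connects("/tmp/demo.sock", 5, 64), this should return a small number close to the backlog rather than 64; the exact count depends on the kernel. The connects that succeed do so without any accept() on the other side, which would be consistent with connect returning early as described above, though that reading is an inference and has not been checked against gpm itself.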
Comment 1 Martin von Gagern 2008-04-28 10:14:20 UTC
Not an issue with sys-libs/gpm-1.20.1-r60 due to the following check in lib/liblow.c:Gpm_Open()

  if(strncmp(tty,option.consolename,strlen(option.consolename)-1)
     || !isdigit(tty[strlen(option.consolename)-1])) {
    /* gpm_report(GPM_PR_ERR,"strncmp/isdigit/option.consolename failed"); */
    goto err;
  }

The commented-out gpm_report call comes from a patch, 09_all_logfillup.patch; it was an active statement before. Together this might indicate that checking whether the application's console is indeed a text console is the appropriate solution to this problem, although I still consider a socket owner that does not accept connections a bad thing.
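
The check quoted above can be restated as a standalone helper to make its logic easier to follow. This is a sketch only: the function name is_text_console is hypothetical, and it assumes that option.consolename holds a path ending in a single digit, such as "/dev/tty0", so that everything but its last character serves as the required prefix. Unlike the original, it returns a boolean instead of jumping to an error label.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Returns 1 if `tty` looks like a virtual console derived from
 * `consolename`, 0 otherwise. `consolename` is assumed to end in a
 * single digit (for example "/dev/tty0"), so everything but its last
 * character is treated as the required prefix.
 */
int is_text_console(const char *tty, const char *consolename)
{
    assert(tty != NULL && consolename != NULL);

    size_t len = strlen(consolename);
    if (len == 0)
        return 0;

    size_t prefix = len - 1;
    if (strncmp(tty, consolename, prefix) != 0)
        return 0; /* different device family, e.g. a pseudo terminal */

    /* The character after the prefix must be a digit.
     * The cast avoids undefined behaviour for negative char values. */
    return isdigit((unsigned char)tty[prefix]) ? 1 : 0;
}
```

Under that assumption, "/dev/tty1" passes, while a pseudo terminal such as "/dev/pts/3" or a serial line such as "/dev/ttyS0" is rejected before any connection to the daemon is attempted.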
Comment 2 SpanKY gentoo-dev 2008-12-31 10:15:31 UTC
is this still a problem with gpm-1.20.5 ?  i built ncurses with USE=gpm and gpm-1.20.5, but i can't reproduce your issue

what i tried was:
$ sudo /etc/init.d/gpm start
$ for n in {1..50} ; do screen -dmS $n dialog --yesno Test 10 40 ; done
$ pgrep dialog | wc -l
50
$ killall dialog
Comment 3 Martin von Gagern 2008-12-31 11:34:18 UTC
(In reply to comment #2)
> is this still a problem with gpm-1.20.5 ?  i built ncurses with USE=gpm and
> gpm-1.20.5, but i can't reproduce your issue

I can't reproduce it with sys-libs/ncurses-5.7 but with ncurses-5.6-r2 and gpm-1.20.5 the issue still exists.

> what i tried was:
> $ sudo /etc/init.d/gpm start
> $ for n in {1..50} ; do screen -dmS $n dialog --yesno Test 10 40 ; done
> $ pgrep dialog | wc -l
> 50
> $ killall dialog

How would you hope to identify hanging processes this way, without attaching to the screens? You could either have them time out, and see if they terminate: "dialog --timeout 5 --yesno Test 10 40"
The bug causes dialog not to time out, so even 10 seconds later the process count would still be 50, not 0.

Or you could kill them and look for gpm connections in the CONNECTING state using "netstat --unix -a | grep gpmctl". With the bug, there are six connections CONNECTING.
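
For an automated check, the netstat inspection could be scripted. The following C sketch counts the lines of captured netstat output that mention both the CONNECTING state and a given socket path. The function name count_connecting and the 512-byte line limit are hypothetical, and the matching is a plain substring test rather than a real parser of netstat's columns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the lines in `text` (captured netstat output) that contain
 * both the word "CONNECTING" and the socket path `path`.
 * Lines longer than 511 characters are truncated before matching.
 */
int count_connecting(const char *text, const char *path)
{
    assert(text != NULL && path != NULL);

    int count = 0;
    while (*text != '\0') {
        size_t len = strcspn(text, "\n"); /* length of current line */
        char line[512];
        size_t n = len < sizeof line - 1 ? len : sizeof line - 1;

        memcpy(line, text, n);
        line[n] = '\0';

        if (strstr(line, "CONNECTING") != NULL &&
            strstr(line, path) != NULL)
            count++;

        text += len;
        if (*text == '\n')
            text++; /* skip the newline itself */
    }
    return count;
}
```

A non-zero result for "/dev/gpmctl" after all clients have been killed would point to connections that were queued but never accepted.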
Comment 4 SpanKY gentoo-dev 2008-12-31 12:31:20 UTC
i did check the network state and i did not see any connections pending.  the point of the code i posted was to have a case that could be run w/out having to type a ton of manual commands and to easily push the limit.

i guess we could move ncurses-5.7 to stable and just forget about the issue.
Comment 5 Jakub Zawadzki 2008-12-31 13:46:00 UTC
I don't know if we care, but it's still an issue for net-im/ekg2
Comment 6 Stefan Wimmer 2009-01-23 13:38:15 UTC
I can confirm that the same happens to me with ncurses-5.6-r2 & gpm-1.20.5 ... I'm starting to test now the combination ncurses-5.7 & gpm-1.20.5 and will report back to you ...
Comment 7 Stefan Wimmer 2009-01-23 14:32:17 UTC
(In reply to comment #6)
> I can confirm that the same happens to me with ncurses-5.6-r2 & gpm-1.20.5 ...
> I'm starting to test now the combination ncurses-5.7 & gpm-1.20.5 and will
> report back to you ...

It still happens :-/

Invoking elinks in xfce terminal inside a screen session hangs and CTRL-C evokes this error msg:

Jan 23 15:25:59 swimmer elinks: *** info 
Jan 23 15:25:59 swimmer elinks: /dev/gpmctl: Interrupted system call
Jan 23 15:25:59 swimmer elinks: *** err 
Jan 23 15:25:59 swimmer elinks: /dev/gpmctl: No such device or address
Jan 23 15:25:59 swimmer elinks: *** err 
Jan 23 15:25:59 swimmer elinks: Oh, oh, it's an error! possibly I die! 

 netstat -x -a -p | grep gpm
unix  2      [ ACC ]     STREAM     LISTENING     1351845  1032/gpm            /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTING    0        -                   /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTING    0        -                   /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTING    0        -                   /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTING    0        -                   /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTING    0        -                   /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTING    0        -                   /dev/gpmctl
unix  3      [ ]         STREAM     CONNECTED     1351661  1032/gpm          
Comment 8 Charles Stewart 2009-04-28 15:16:15 UTC
I have the same issue with sys-libs/gpm-1.20.5, sys-libs/ncurses-5.6-r2, and www-client/links-2.2

Triggering of this issue seems to be sporadic and non-deterministic.  The usual cause is connecting to a machine running X11 via ssh and then running links within screen.  It seems to happen most often when opening multiple instances of links within screen, and on rapid startup and shutdown of links.
Comment 9 Arthur D. 2009-09-28 16:59:33 UTC
I just reproduced that damn bug with gpm-1.20.5 and gpm-1.20.6

These steps should help you to reproduce it:
1) Go to console (press Ctrl-Alt-F1 if you are running X server).
2) Connect (using SSH) to a Gentoo machine which runs gpm and X server.
3) Launch mc (mc should be compiled with gpm USE flag); press F10 to exit mc.
4) Repeat step (3) six times in a row.
5) Now if you try running any application that uses gpm (mc or links for example), it will lock up. If you kill it and try running it again, it will lock up again.
6) To unlock the applications you should restart gpm daemon.

That bug forced me to reboot the server several times :-( because I didn't know what the problem was and how to fix it.

P.S. mc is midnight commander by the way.
Comment 10 SpanKY gentoo-dev 2010-12-24 20:32:59 UTC
*** Bug 268480 has been marked as a duplicate of this bug. ***