Bug 179978 - /etc/init.d/net.pppX fails to stop pppd daemon if authentication is failing
Summary: /etc/init.d/net.pppX fails to stop pppd daemon if authentication is failing
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] baselayout
Hardware: All Linux
Importance: High major
Assignee: Gentoo Dialup Developers
Reported: 2007-05-27 09:03 UTC by Jaco Kroon
Modified: 2007-05-28 21:51 UTC

Description Jaco Kroon 2007-05-27 09:03:34 UTC
If pppd fails to establish a connection (e.g. because of authentication failures), the rc system fails to kill it.

This seems to be because pppd doesn't create its pid file unless it's holding onto a ppp device, meaning that the file exists only for a very short period while pppd is attempting to establish a connection.

I don't see an easy fix, other than perhaps passing --background, --make-pidfile and --pidfile /var/run/ppp-rc-${IFACE}.pid to start-stop-daemon when starting, together with the nodetach option to pppd, so that start-stop-daemon records the PID itself.  When stopping, the kill script can then use that pid file instead.
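
A minimal sketch of that first option, assuming the start-stop-daemon flags named above behave as documented. The helper names (ppp_rc_pidfile, ppp_start_cmd, ppp_stop_cmd) are hypothetical and not part of baselayout; the functions only print the commands so they can be inspected without touching a running pppd.

```shell
#!/bin/sh
# Sketch only: let start-stop-daemon own the pid file, because pppd
# writes its own pid file only while it holds a ppp device.

# Hypothetical helper: path of the pid file managed by the rc script.
ppp_rc_pidfile() {
    echo "/var/run/ppp-rc-$1.pid"
}

# Print (rather than run) the start command.
# nodetach keeps pppd in the foreground, so the PID recorded by
# --make-pidfile is the PID of pppd itself.
ppp_start_cmd() {
    iface=$1
    shift
    echo "start-stop-daemon --start --background --make-pidfile" \
         "--pidfile $(ppp_rc_pidfile "$iface")" \
         "--exec /usr/sbin/pppd -- nodetach linkname $iface $*"
}

# Print the matching stop command, keyed on the same pid file.
ppp_stop_cmd() {
    echo "start-stop-daemon --stop --pidfile $(ppp_rc_pidfile "$1") --retry 30"
}
```

For example, ppp_start_cmd ppp1 maxfail 3 prints a start line whose --pidfile is /var/run/ppp-rc-ppp1.pid, and ppp_stop_cmd ppp1 prints a stop line that refers to the same file. Because the pid file no longer depends on pppd reaching the point where it allocates a device, it should exist even when authentication never succeeds.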

The other option that I can see is to modify pppd itself to create the linkname based pidfile (/var/run/ppp-${IFNAME}.pid since linkname gets set to ${IFNAME}) immediately upon startup instead of when the link comes up.

Note that this results in a bigger problem: more and more pppd processes get spawned while one is attempting to debug pppd issues.  And since I periodically need to restart my ppp connections to switch between different accounts when one or the other fails, this compounds pretty quickly for me.

Reproducible: Always

Steps to Reproduce:
1.  Place incorrect auth details in /etc/ppp/{chap,pap}-secrets and point to the wrong credentials from /etc/conf.d/net
2.  Start pppd
3.  Check the logs to make sure pppd fails
4.  Attempt to stop pppd using /etc/init.d/net.ppp0 stop

Actual Results:
The rc system reports:

 * Stopping ppp1
 *   Bringing down ppp1                 [ ok ]
 *   Running postdown function

In fact, it didn't bring down the pppd process for ppp1.  Note that I normally run multiple ppp connections, routing international and local traffic differently (ADSL cap issues).

Expected Results:  
The pppd process should be killed with signal 15 (SIGTERM), causing pppd to exit.
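
The expected behaviour can be sketched as a stop routine that sends signal 15 and reports failure instead of "ok" when it has nothing to kill. This is an illustration only: stop_by_pidfile is a hypothetical helper, not the code used by net.pppX, and the 30 second default simply mirrors a typical --retry value.

```shell
#!/bin/sh
# Sketch only: stop a daemon via its pid file and report the outcome.

# stop_by_pidfile PIDFILE [TIMEOUT_SECONDS]
# Returns 0 if the process is gone, 1 if the pid file is missing or
# empty, 2 if the process is still alive after the timeout.
stop_by_pidfile() {
    pidfile=$1
    timeout=${2:-30}

    # Without a pid file we cannot know what to kill; say so instead
    # of reporting success.
    [ -r "$pidfile" ] || return 1
    pid=$(head -n 1 "$pidfile")
    [ -n "$pid" ] || return 1

    # Signal 15; ignore the error if the process has already exited.
    kill -TERM "$pid" 2>/dev/null || true

    # Poll until the process is gone or the timeout expires.
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -0 "$pid" 2>/dev/null || return 0
    return 2
}
```

With a routine along these lines, a missing pid file yields a non-zero status that the init script could surface to the user rather than printing "[ ok ]".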

[I--] [  ] net-dialup/ppp-2.4.4-r4 (0)
[I--] [  ] net-dialup/rp-pppoe-3.8 (0)

xacatecas run # ip addr show
1: inteth: <BROADCAST,MULTICAST,PROMISC,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:a0:c9:96:93:95 brd ff:ff:ff:ff:ff:ff
2: exteth: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:a0:c9:96:94:d3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/24 brd 192.168.1.255 scope global exteth
3: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
4: tap0: <BROADCAST,MULTICAST,PROMISC,UP,10000> mtu 1500 qdisc noqueue qlen 100
    link/ether 6a:03:1c:ce:28:65 brd ff:ff:ff:ff:ff:ff
6: br0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc noqueue 
    link/ether 00:a0:c9:96:93:95 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global br0
490: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,10000> mtu 1492 qdisc pfifo_fast qlen 3
    link/ppp 
    inet 196.209.184.50 peer 196.209.184.1/32 scope global ppp0
xacatecas run # pidof pppd
9928
xacatecas run # /etc/init.d/net.ppp1 start
 * Starting ppp1
 *   Running preup function             [ ok ]
 *   Bringing up ppp1
 *     ppp
 *       Running pppd ...
 *       Backgrounding ...
xacatecas run # pidof pppd
16849 9928
xacatecas run # /etc/init.d/net.ppp1 stop 
 * Stopping ppp1
 *   Bringing down ppp1                 [ ok ]
 *   Running postdown function
xacatecas run # pidof pppd
16849 9928
xacatecas run #
Comment 1 Roy Marples (RETIRED) gentoo-dev 2007-05-27 13:32:50 UTC
Could you attach your emerge --info so we know what base system you have installed? Thanks
Comment 2 Jaco Kroon 2007-05-27 19:55:12 UTC
The particular machine is a pentium3 700MHz, running totally headless, being used primarily as a router, VPN (both OpenVPN and PPtP), mail and proxy server.  Yes, I'll upgrade the profile at some point in the next week :).

Portage 2.1.2.2 (default-linux/x86/2006.0, gcc-4.1.1, glibc-2.5-r0, 2.6.19.2 i686)
=================================================================
System uname: 2.6.19.2 i686 Pentium III (Katmai)
Gentoo Base System release 1.12.9
Timestamp of tree: Mon, 21 May 2007 10:50:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
dev-lang/python:     2.3.5-r3, 2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.61
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.16.1-r3
sys-devel/gcc-config: 1.3.14
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.20-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium3 -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /home/tc/root/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c /home/tc/root/etc/env.d"
CXXFLAGS="-O2 -march=pentium3 -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="metadata-transfer sandbox sfperms skiprocheck strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://pug.lan/gentoo-portage"
USE="apache2 apm berkdb bitmap-fonts bzip2 cli cracklib crypt cups dri eds emboss encode foomaticdb gdbm gif gpm gstreamer iconv imlib isdnlog jpeg kde ldap libg++ libwww logrotate mad midi mikmod mp3 mpeg mudflap mysql ncurses nptl nptlonly ogg openmp pam pcre png pppd readline reflection samba session smux spl ssl truetype truetype-fonts type1-fonts userlocales v4l vorbis x86 xml xorg zlib zvbi" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="apm ark chips cirrus cyrix dummy fbdev glint i128 i740 i810 imstt mach64 mga neomagic nsc nv r128 radeon rendition s3 s3virge savage siliconmotion sis sisusb tdfx tga trident tseng v4l vesa vga via vmware voodoo"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 3 Alin Năstac (RETIRED) gentoo-dev 2007-05-28 03:55:43 UTC
There are 2 pid files created by pppd: pppX.pid and ppp-pppX.pid. The latter has 2 lines, the first being the PID.
When I implemented pppd.sh, I remember using ppp-pppX.pid, although the baselayout repository seems to contradict me.

@Jaco: Please run the following command and check if the ppp-pppX.pid file is created immediately after net.pppX start:
    /etc/init.d/net.pppX start && ls -l /var/run

@Roy: If I'm right and ppp-pppX.pid is created immediately, read the PID from the first line of that file.
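
A rough sketch of reading the PID from the first line of such a file, assuming only what is stated above (two lines, the PID first). read_link_pid is a hypothetical helper name, not an existing baselayout function.

```shell
#!/bin/sh
# Sketch only: extract the PID from a linkname pid file
# (/var/run/ppp-<linkname>.pid), whose first line is the PID.

# read_link_pid FILE
# Prints the PID and returns 0; returns 1 if the file is unreadable
# or its first line is not a plain decimal number.
read_link_pid() {
    [ -r "$1" ] || return 1
    first=$(head -n 1 "$1")
    case $first in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric first line
    esac
    echo "$first"
}
```

A caller could then run something like pid=$(read_link_pid /var/run/ppp-ppp1.pid) and treat a non-zero exit status as "no pid file yet".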
Comment 4 Jaco Kroon 2007-05-28 05:27:11 UTC
Already checked; no, it doesn't.  Also note that you want to use ppp-${LINKNAME}.pid since you can predict that name.  The pppX.pid file is somewhat unpredictable, as explained in the original post: if unit X is already in use, pppd auto-allocates a unit number and then uses pppN.pid, where you have no clue what N is.

And no, you are in fact using ppp-pppX.pid (IFNAME == linkname); there is even a comment about it:

    # Set linkname because we need /var/run/ppp-${linkname}.pid
    # This pidfile has the advantage of being there, even if ${iface} interface was never started
    opts="linkname ${iface} ${opts}"

And then:

    eval start-stop-daemon --start --exec /usr/sbin/pppd \
        --pidfile "/var/run/ppp-${iface}.pid" -- "${opts}" >/dev/null

and:

    start-stop-daemon --stop --exec /usr/sbin/pppd \
        --pidfile "${pidfile}" --retry 30

with pidfile set to /var/run/ppp-$1.pid ($1 == $IFACE).

But just above that I saw something that for a brief moment looked like it might provide a solution: updetach.  Together with maxfail, I thought it would make net.pppX fail if it can't connect (which it will).  Then in postdown (does it get called when startup fails?  are there any other hooks for failure?) I could switch accounts and issue /etc/init.d/net.${IFNAME} restart </dev/null &>/dev/null &, which would step on to the next account (yes, I'm trying to auto-switch accounts when I reach a certain hard cap).  Unfortunately that will only work on boot, since it needs to be a clean pppd start from the init system.  So I will need to hack pppd in any case to add a hook for authentication failures, perhaps /etc/ppp/auth-fail.  Would such a patch be generally usable?  What happens currently if a user sets maxfail?

Anyway, that isn't really part of this bug report.
Comment 5 Alin Năstac (RETIRED) gentoo-dev 2007-05-28 18:54:13 UTC
OK, I discovered the problem.
remove_pidfiles() is a function which deletes both pid files and is called from multiple places, not only before exiting. The reason for this behaviour is best explained by this comment which I found in auth.c:
    /*
     * Delete pid files before disestablishing ppp.  Otherwise it
     * can happen that another pppd gets the same unit and then
     * we delete its pid file.
     */
This is true for the /var/run/pppX.pid file, but not for /var/run/ppp-$linkname.pid. The kernel has nothing to do with the linkname the user chooses, nor is that linkname guarded against conflicts in user-space code. It is the user's responsibility to select a unique linkname.

I've fixed it in ppp-2.4.4-r5 by applying a new patch called linkpidfile.patch.
Comment 6 Jaco Kroon 2007-05-28 21:51:11 UTC
Fix confirmed.  Works like a charm, thanks.