Bug 189777 - br2684ctl.sh doesn't test for success
Summary: br2684ctl.sh doesn't test for success
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] baselayout
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Dialup Developers
URL:
Whiteboard: openrc:oldnet due:20101011
Keywords:
Depends on:
Blocks:
 
Reported: 2007-08-22 01:54 UTC by Renato Caldas
Modified: 2010-09-12 08:47 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
br2684ctl.sh (br2684ctl.sh, 1.63 KB, text/plain)
2007-09-12 17:52 UTC, Alin Năstac (RETIRED)

Description Renato Caldas 2007-08-22 01:54:18 UTC
I have an Alcatel DynaMiTe USB modem (similar to the SpeedTouch) in a headless box, used as a LAN router. The ADSL connection runs over PPPoE and has worked fine so far, except for one irritating detail.

The /etc/init.d/net.nas0 is started automatically at boot, but unfortunately it's started _before_ the modem is properly set up. This means br2684ctl will fail to create the nas0 bridge.

Currently, the net.nas0 init script will report a correct startup, which is not true. A subsequent call to "/etc/init.d/net.nas0 status" will report it as crashed. The correct behavior would be to test the return value of br2684ctl.

On the other hand, instead of just testing if the br2684ctl started correctly it would be nice for the script to mark net.nas0 as inactive and keep trying until the bridge is established.

BTW, I'm using baselayout-2.
Thanks,
Comment 1 Roy Marples (RETIRED) gentoo-dev 2007-08-22 08:36:21 UTC
We cannot do this: because we have to force br2684ctl into the background, it doesn't actually return an error/success code to us.

To solve this bug, the br2684ctl program will need to be fixed first.
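
A minimal sketch of the general problem, in plain shell rather than the actual baselayout code: once a command has been pushed into the background, the launcher only learns that the launch happened, not whether the program later failed. `fake_daemon` is a hypothetical stand-in for a program whose setup fails.

```shell
#!/bin/sh
# Hypothetical stand-in for a daemon whose setup fails after launch.
fake_daemon() {
    return 3
}

# Launch in the background: "$?" only says that the launch itself worked.
fake_daemon &
launch_status=$?
daemon_pid=$!
echo "status seen at launch: ${launch_status}"

# The real exit status is only available by waiting for the process,
# which a start function that has to return promptly cannot do.
real_status=0
wait "${daemon_pid}" || real_status=$?
echo "status after waiting: ${real_status}"
```

This is why the failure has to be detected inside the program itself, before it detaches.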
Comment 2 Alin Năstac (RETIRED) gentoo-dev 2007-08-22 08:53:46 UTC
I suppose I can call fork() in br2684ctl after it has successfully created the nas interface, but making it retry starting the connection is a different matter, and without that it is impossible to have an inactive status.

Do you consider inactive status as a must?
Comment 3 Alin Năstac (RETIRED) gentoo-dev 2007-08-22 09:13:16 UTC
Actually, the current br2684ctl supports a -b (background) option.
You might want to use it instead of s-s-d. That way it will be possible to determine whether it failed or not.

This is what I think will work in baselayout-1.12:
--- br2684ctl.sh        2007-05-21 17:31:24.000000000 +0300
+++ br2684ctl.sh        2007-08-22 12:11:41.000000000 +0300
@@ -41,9 +41,7 @@
        fi

        einfo "Starting RFC 2684 Bridge control on ${iface}"
-       start-stop-daemon --start --exec /sbin/br2684ctl --background \
-               --make-pidfile --pidfile "/var/run/br2684ctl-${iface}.pid" \
-               -- -c "${number}" ${!opts}
+       /sbin/br2684ctl -c "${number}" ${!opts} -b
        eend $?
 }

@@ -51,7 +49,7 @@
 br2684ctl_post_stop() {
        local iface="$1"
        local number="${iface#${iface%%[0-9]}}"
-       local pidfile="/var/run/br2684ctl-${iface}.pid"
+       local pidfile="/var/run/${iface}.pid"

        [[ $(interface_type "${iface}") != "nas" ]] && return 0

Comment 4 Roy Marples (RETIRED) gentoo-dev 2007-08-22 09:26:06 UTC
/var/run/${iface}.pid is hardly descriptive, is it?

Could we patch it so that we can either give it a pidfile to create OR prefix it with br2684ctl?

Also, could you go through the code and ensure it creates the pidfile immediately after forking so we can continue using it on baselayout-2 with s-s-d? After all, it would be handy to know if it's crashed for whatever reason.
Comment 5 Renato Caldas 2007-08-22 13:19:26 UTC
(In reply to comment #2)
> Do you consider inactive status as a must?
> 

It is the only way to keep the headless box "headless". Currently I need to ssh into it to make the DSL connection work. You can imagine the headache it would be in case of a power loss or a dropped DSL line.
Comment 6 Renato Caldas 2007-08-24 13:53:02 UTC
So what changes to the br2684ctl program are required to make this work?

BTW I've contacted Marco d'Itri and he told me he's not actively maintaining the Debian package code. The linux-atm tree has evolved since, and Marco advised me to use their sources for any future changes.

He also suggested porting his work to the main tree, which I'll probably do in my spare time.

In the meantime, is it feasible to move to the linux-atm cvs sources for br2684ctl?
Comment 7 Alin Năstac (RETIRED) gentoo-dev 2007-08-24 14:32:58 UTC
(In reply to comment #6)
> In the meantime, is it feasible to move to the linux-atm cvs sources for
> br2684ctl?

I don't see why not.
To recap what needs to be done:
 - the pid file should be renamed to /var/run/br2684ctl-${iface}.pid
 - add a retry option (probably -r, if it doesn't collide with other options) followed by the time period between retries
 - add a persist option (keep trying to create the nas interface until someone kills the process)
 - up till now, all the work was done while reading options. The desired sequence of operations is: read options, call fork, create the pid file, then do whatever is needed to create the nas interface.
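
The retry and persist behaviour proposed above can be sketched in shell. This is only an illustration of the loop logic, not br2684ctl code (which is written in C). `retry_cmd` and `create_interface` are hypothetical names, and treating a maximum of 0 as "persist" is an assumption made for the example.

```shell
#!/bin/sh
# retry_cmd INTERVAL MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND until it succeeds, sleeping INTERVAL seconds between tries.
# MAX_TRIES=0 means "persist": keep trying until the command succeeds
# or the script is killed.
retry_cmd() {
    interval=$1
    max_tries=$2
    shift 2
    tries=0
    while :; do
        if "$@"; then
            return 0
        fi
        tries=$((tries + 1))
        if [ "${max_tries}" -gt 0 ] && [ "${tries}" -ge "${max_tries}" ]; then
            return 1
        fi
        sleep "${interval}"
    done
}

# Hypothetical stand-in for interface creation: fails twice, then succeeds.
attempts=0
create_interface() {
    attempts=$((attempts + 1))
    [ "${attempts}" -ge 3 ]
}

if retry_cmd 0 5 create_interface; then
    echo "interface created after ${attempts} attempts"
fi
```

In the daemon, a loop like this would run after the fork and pid file creation, so that the pid file exists while the retries are still in progress.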
Comment 8 Renato Caldas 2007-08-27 14:35:43 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > In the meantime, is it feasible to move to the linux-atm cvs sources for
> > br2684ctl?
> 
> I don't see why not.

Great! The debian changes are all part of the linux-atm cvs now, so the change should have no ill side effects :)

> Recapitulation of what needs to be done:
>  - pid file should be renamed to /var/run/br2684ctl-${iface}.pid

This part is done (on the linux-atm cvs).

>  - add a retry option (probably -r if it isn't collide with other options)
> followed by a time period between retries
>  - add persist option (try to create nas interface until someone kill the
> execution)

These two I'll try to do this week.

>  - up till now, all the work was done while reading options. The wanted
> operations sequence is this: read options, call fork, create pid file, do stuff
> needed for creation of the nas interface.

I have one question regarding this: how does the ssd know when the daemon is inactive or started?

Comment 9 SpanKY gentoo-dev 2007-08-27 14:47:04 UTC
i'm not familiar at all with br2684ctl, but are you saying that it's been fully integrated into the linux-atm project ?
Comment 10 Alin Năstac (RETIRED) gentoo-dev 2007-08-27 14:49:59 UTC
(In reply to comment #8)
> I have one question regarding this: how does the ssd know when the daemon is
> inactive or started?

Probably ssd is happy if there is a process matching the content of the pid file, but there is a better question to ask: who will inform baselayout about nas0 status change?

I can see only one answer to this: invent 2 scripts (e.g. /etc/br2684ctl/up and /etc/br2684ctl/down) that will be executed on inactive->active and active->inactive transitions. 
Roy, what do you think? Is it feasible to implement something similar with PPP baselayout support starting from this?
Comment 11 Renato Caldas 2007-08-27 15:02:21 UTC
(In reply to comment #9)
> i'm not familiar at all with br2684ctl, but are you saying that it's been fully
> integrated into the linux-atm project ?
> 

Yes, it is now part of the linux-atm 2.5.0 branch. When that will be released, I'm not sure. We will probably need to have a separate br2684ctl ebuild for some time.
Comment 12 SpanKY gentoo-dev 2007-08-27 15:24:28 UTC
but if you're proposing making the br2684ctl package really just a linux-atm snapshot, seems like it'd make sense to just make a linux-atm snapshot package ?

doesnt matter to me, just askin :)
Comment 13 Renato Caldas 2007-08-27 19:39:02 UTC
(In reply to comment #12)
> but if you're proposing making the br2684ctl package really just a linux-atm
> snapshot, seems like it'd make sense to just make a linux-atm snapshot package
> ?

I'm proposing to make a package of just the br2684ctl part. Its directory has a Makefile.am, it's a matter of throwing in a configure.ac, a GPL license and a couple of other files. We could as well do a linux-atm snapshot, I'm just not sure of how safe it would be.

BUT I don't think it's possible to have a baselayout compatible with both the current stable br2684ctl and the current linux-atm one. On the other hand, baselayout-2 is itself unstable, so there may be no issue here :)
Comment 14 Roy Marples (RETIRED) gentoo-dev 2007-08-28 07:03:19 UTC
No, s-s-d itself just starts and stops daemons. The RC system handles the state of a service outside of this. However, start-stop-daemon does store the information needed to work out whether the daemon is still running, so we can set a "crashed" status.

Also, the init scripts themselves (which also means the net modules) don't actually set the status themselves as the RC system does this depending on the exit code of the start/stop functions. At most the init script can set itself to be "inactive" which means that a daemon it has started will re-enter the service to continue starting up.
Comment 15 Renato Caldas 2007-08-28 11:13:22 UTC
(In reply to comment #14)
> Also, the init scripts themselves (which also means the net modules) don't
> actually set the status themselves as the RC system does this depending on the
> exit code of the start/stop functions. At most the init script can set itself
> to be "inactive" which means that a daemon it has started will re-enter the
> service to continue starting up.

I see. But when will the daemon continue the startup? Is it the RC system that polls the startup function? Or does the daemon need to "do" something for the startup function to be reentered?

PS: There should definitely be some documentation for the baselayout development.. I wonder if someone would volunteer to do that..
Comment 16 Roy Marples (RETIRED) gentoo-dev 2007-08-28 11:37:53 UTC
(In reply to comment #15)
> I see. But when will the daemon continue the startup?

That depends on the daemon. In the case of ifplugd or netplugd it's when you plug a working network cable in. For say openvpn it's when it connects to a remote endpoint.

> Is it the RC system that
> polls the startup function? Or does the daemon need to "do" something for the
> startup function to be reentered?

No.
See above answer.

> PS: There should definitely be some documentation for the baselayout
> development.. I wonder if someone would volunteer to do that..

Yes there should. On my TODO list when I stop fixing bugs :)
Comment 17 Alin Năstac (RETIRED) gentoo-dev 2007-08-28 11:45:17 UTC
As I already said, you need to add up/down script support to br2684ctl. This is how br2684ctl will inform baselayout that net.nasX is active or inactive. Please note that br2684ctl only sets up the Ethernet interface. It will be baselayout's duty to configure the IP settings when the following example /etc/br2684ctl/up script is executed by the br2684ctl daemon:

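#!/bin/sh
# The following parameters are available:
# $1 = interface name

# Only act if an init script exists for this interface.
if [ -x "/etc/init.d/net.$1" ]; then
        # Start the service only if it is not already started.
        if ! "/etc/init.d/net.$1" --quiet status ; then
                # Mark this run as a background start for the init script.
                export IN_BACKGROUND="true"
                "/etc/init.d/net.$1" --quiet start
        fi
fi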

Comment 18 Renato Caldas 2007-09-02 19:22:48 UTC
Hello again,

There is something that puzzles me here: the br2684ctl _doesn't_ remove the nasX interface when it quits. But the interface _does_ get removed when I run "/etc/init.d/net.nas0 stop". So who is removing it??

Thanks
Comment 19 Alin Năstac (RETIRED) gentoo-dev 2007-09-03 04:53:49 UTC
The application opens 2 sockets, one in create_br() and the other one in assign_vcc(). When the application closes them, the kernel destroys the nas interface. Right now, this only happens when br2684ctl exits (when all file descriptors are closed).

The application is very simple:
 - creating the interface: open those 2 sockets and configure them
 - destroying the interface: close the 2 sockets
Current source code looks really bad. I hope your variant will look a lot better.
Comment 20 Renato Caldas 2007-09-06 13:11:35 UTC
(In reply to comment #7)
> I don't see why not.
> Recapitulation of what needs to be done:
(...)
>  - up till now, all the work was done while reading options. The wanted
> operations sequence is this: read options, call fork, create pid file, do stuff
> needed for creation of the nas interface.

Done. I'm still waiting for the review from the linux-atm guys (it's not in the tree yet).

There's an important issue pending regarding the support for specifying multiple interfaces. I'm for removing it, as that's what's making the code a mess.

As for the up/down scripts, maybe they're not required: maybe ppp checks for the existence of the interface before starting (I'm not sure, there's no documentation about that). So possibly there's no need for the inactive state.

(In reply to comment #19)
> The application opens 2 sockets, one in create_br() and the other one in
> assign_vcc(). When the application close them, the kernel will destroy the nas
> interface. Right now, this gets done only when br2684ctl exits (all file
> descriptors are closed).

What was puzzling me was that when br2684ctl was started without the "-a" argument it wouldn't complain, wouldn't work as expected and wouldn't remove the interface. That's corrected now, as the options get checked before "doing stuff". Thanks for the pointer :)
Comment 21 Alin Năstac (RETIRED) gentoo-dev 2007-09-06 13:25:30 UTC
(In reply to comment #20)
> There's an important issue pending regarding the support for specifying
> multiple interfaces.. I'm for removing it, as that's what's making the code a
> mess..

Yeah, I also think this should go away. No need for multiple interfaces from our pov.

> As for the up/down scripts, maybe that's not required: maybe ppp checks for the
> existance of the interface before starting (I'm not sure, there's no
> documentation about that..). So possibly there's no need for the inactive
> state.

pppd doesn't even know it uses an RFC 2684 bridge, never mind restarting br2684ctl. Besides, pppd isn't the program that starts the bridge in the first place. If you want to have active/inactive states on nas interfaces, you will *have* to add up/down scripts, otherwise all the trouble is (mostly) in vain. Sorry...
Comment 22 Alin Năstac (RETIRED) gentoo-dev 2007-09-07 06:34:49 UTC
Ah, I think there might be another way to implement active/inactive status on nas interfaces. Run "udevmonitor --kernel --env" to check if you can use udev for that.
Comment 23 Renato Caldas 2007-09-07 07:59:48 UTC
(In reply to comment #22)
> Ah, I think there might be another way to implement active/inactive status on
> nas interfaces. Run "udevmonitor --kernel --env" to check if you can use udev
> for that.
> 

Yep I guess I can! Really clever solution :)

The output of:
/etc/init.d/net.ppp0 stop
/etc/init.d/net.nas0 stop
/etc/init.d/net.nas0 start
/etc/init.d/net.ppp0 start

is this:

$ udevmonitor --kernel --environment
udevmonitor will print the received events for:
UEVENT the kernel uevent

UEVENT[1189151774.978253] remove   /class/net/ppp0 (net)
ACTION=remove
DEVPATH=/class/net/ppp0
SUBSYSTEM=net
SEQNUM=637
INTERFACE=ppp0
IFINDEX=8

UEVENT[1189151783.448011] remove   /class/net/nas0 (net)
ACTION=remove
DEVPATH=/class/net/nas0
SUBSYSTEM=net
SEQNUM=638
INTERFACE=nas0
IFINDEX=4

UEVENT[1189151799.561949] add      /class/net/nas0 (net)
ACTION=add
DEVPATH=/class/net/nas0
SUBSYSTEM=net
SEQNUM=639
INTERFACE=nas0
IFINDEX=9

UEVENT[1189151815.321265] add      /class/net/ppp0 (net)
ACTION=add
DEVPATH=/class/net/ppp0
SUBSYSTEM=net
SEQNUM=640
INTERFACE=ppp0
IFINDEX=10
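
The fields in these events (ACTION, SUBSYSTEM, INTERFACE) are enough to drive a hotplug decision. A minimal sketch, assuming a hypothetical helper `net_event_command`. It is not the actual /lib/udev/net.sh, and restricting it to nas interfaces is a choice made only for this illustration. It prints the init script call that would be made instead of running it.

```shell
#!/bin/sh
# net_event_command ACTION SUBSYSTEM INTERFACE
# Prints the init script command matching one kernel uevent, or nothing
# if the event is not of interest.
net_event_command() {
    action=$1
    subsystem=$2
    interface=$3

    # Only network interface events matter here.
    [ "${subsystem}" = "net" ] || return 0

    # Only react to nas interfaces (assumption for this sketch).
    case "${interface}" in
        nas[0-9]*) ;;
        *) return 0 ;;
    esac

    case "${action}" in
        add)    echo "/etc/init.d/net.${interface} start" ;;
        remove) echo "/etc/init.d/net.${interface} stop" ;;
    esac
}

# Events taken from the udevmonitor output above.
net_event_command add net nas0
net_event_command remove net ppp0
```

Any other action value falls through the final `case` and prints nothing, so unrelated events are ignored.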
Comment 24 Alin Năstac (RETIRED) gentoo-dev 2007-09-12 17:52:09 UTC
Created attachment 130750
br2684ctl.sh

Try this br2684ctl.sh and see if it works for you.

Don't forget to set RC_HOTPLUG=yes in conf.d/rc and add RC_NEED_ppp0="net.nas0" to conf.d/net.
Comment 25 Renato Caldas 2007-09-16 13:46:22 UTC
Sorry for the delay..

(In reply to comment #24)
> Created an attachment (id=130750) [edit]
> br2684ctl.sh
> 
> Try this br2684ctl.sh and see if it works for you.

It doesn't: net.nas0 is stuck in the inactive state. Bringing it up (via "ifconfig nas0 up") doesn't change that.
Comment 26 Alin Năstac (RETIRED) gentoo-dev 2007-09-17 06:15:12 UTC
Do the following test:
 - run "/etc/init.d/net.nas0 start" (it should end with a "Backgrounding" message). Check that the br2684ctl process and the nas0 interface exist.
 - run "IN_HOTPLUG=1 /etc/init.d/net.nas0 start". This should complete the net.nas0 setup and also start net.ppp0 if this service is listed in the default runlevel.
Comment 27 Renato Caldas 2007-09-19 15:09:30 UTC
It works :) But now we need to figure out how to get it starting automatically.

Another thing: the "do stuff first" patch I sent to the linux-atm list still has no feedback. Over two weeks have passed, so I'm thinking of "bumping" the topic. It may be poor netiquette, but I really want to move forward.

They don't seem to like the idea of removing the "multiple interface" support, so I'm also considering a fork on br2684ctl to clean it up..
Comment 28 Alin Năstac (RETIRED) gentoo-dev 2007-09-19 18:24:36 UTC
(In reply to comment #27)
> It works :) But now we need to figure out how to get it starting automatically.
Other than setting RC_HOTPLUG=no or RC_HOTPLUG="!net.nas0", I don't know what could cause this. Try enabling the debug log in udev by running "udevcontrol log_priority=debug" and see if /lib/udev/net.sh does what it is supposed to.

> They don't seem to like the idea of removing the "multiple interface" support,
> so I'm also considering a fork on br2684ctl to clean it up..

.. or you could keep the multiple interface support if they want it so bad. Just use dynamically allocated (and reallocated) arrays to keep the settings of the controlled nas interfaces as well as their file descriptors.
Comment 29 Alin Năstac (RETIRED) gentoo-dev 2007-10-25 20:08:52 UTC
Any news about this? 
Comment 30 Renato Caldas 2007-10-25 20:22:38 UTC
Sorry for the lack of news.. I haven't had much time lately to dedicate to this subject..

Regarding the br2684ctl patches, the "do stuff first" patch was only accepted on October 21st (4 days ago), and the linux-atm list has been kind of dead lately.

I've proposed your solution to the multiple-interface problem to them, but there's still no answer. I guess I'll just make the changes and submit the patches to them.

Regarding the udev problem I haven't had much time to fiddle with it.. Maybe this weekend.. Sorry.. And thanks for your interest and help!
Comment 31 Renato Caldas 2007-12-08 18:25:43 UTC
(In reply to comment #29)
> Any news about this? 
> 

Hey Alin!

I finally got some time to test it :)

The udev.sh segfaults! That's why it didn't work.. I'll try to figure out why and report soon. Thanks!!

Dec  8 18:21:47 [udevd] msg_queue_insert: seq 777 queued, 'add' 'net'
Dec  8 18:21:47 [udevd] udev_event_run: seq 777 forked, pid [2526], 'add' 'net', 0 seconds old
Dec  8 18:21:47 [udevd-event] wait_for_sysfs: file '/sys/devices/virtual/net/nas0/address' appeared after 0 loops
Dec  8 18:21:47 [udevd-event] udev_rules_get_name: no node name set, will use kernel name 'nas0'
Dec  8 18:21:47 [udevd-event] run_program: 'net.sh nas0 start'
Dec  8 18:21:47 [udevd-event] run_program: '/lib/udev/net.sh' (stderr) '/lib/udev/net.sh: line 28:  2528 Segmentation fault      IN_HOTPLUG=1 "${SCRIPT}" --quiet "${ACTION}"'
Dec  8 18:21:47 [udevd-event] run_program: '/lib/udev/net.sh' returned with status 139
Dec  8 18:21:47 [udevd-event] pass_env_to_socket: passed -1 bytes to socket '/org/kernel/udev/monitor', 
Dec  8 18:21:47 [udevd-event] udev_event_run: seq 777 finished with -1
Dec  8 18:21:47 [udevd] udev_done: seq 777, pid [2526] exit with 1, 0 seconds old
Comment 32 Renato Caldas 2007-12-08 18:27:18 UTC
(In reply to comment #31)
> The udev.sh segfaults!

That is, /lib/udev/net.sh, sorry ;)
Comment 33 Renato Caldas 2007-12-08 18:47:44 UTC
As it turns out, the segfaulter was net.nas0. I was grep'ing the log for udev and missed some important lines. Here's the full output:

Dec  8 18:36:44 [udevd] msg_queue_insert: seq 785 queued, 'add' 'net'
Dec  8 18:36:44 [udevd] udev_event_run: seq 785 forked, pid [3147], 'add' 'net', 0 seconds old
Dec  8 18:36:44 [udevd-event] wait_for_sysfs: file '/sys/devices/virtual/net/nas0/address' appeared after 0 loops
Dec  8 18:36:44 [udevd-event] udev_rules_get_name: no node name set, will use kernel name 'nas0'
Dec  8 18:36:44 [udevd-event] run_program: 'net.sh nas0 start'
Dec  8 18:36:44 [kernel] net.nas0[3149]: segfault at 4854415c eip b7e87903 esp bf880994 error 4
Dec  8 18:36:44 [br2684ctl] Interface "nas0" created sucessfully_
Dec  8 18:36:44 [udevd-event] run_program: '/lib/udev/net.sh' (stderr) '/lib/udev/net.sh: line 28:  3149 Segmentation fault      IN_HOTPLUG=1 "${SCRIPT}" --quiet "${ACTION}"'
Dec  8 18:36:44 [udevd-event] run_program: '/lib/udev/net.sh' returned with status 139
Dec  8 18:36:44 [udevd-event] pass_env_to_socket: passed -1 bytes to socket '/org/kernel/udev/monitor', 
Dec  8 18:36:44 [udevd-event] udev_event_run: seq 785 finished with -1
Dec  8 18:36:44 [udevd] udev_done: seq 785, pid [3147] exit with 1, 0 seconds old
Dec  8 18:36:44 [br2684ctl] Communicating over ATM 0.0.35, encapsulation: LLC_
Dec  8 18:36:44 [br2684ctl] Interface configured
Dec  8 18:36:44 [br2684ctl] RFC 1483/2684 bridge daemon started_
Dec  8 18:36:44 [/etc/init.d/net.nas0] WARNING: net.nas0 has started, but is inactive

It seems that the udev event is triggered before the nas0 interface is ready..
Comment 34 Alin Năstac (RETIRED) gentoo-dev 2007-12-09 07:22:37 UTC
Edit br2684ctl.sh and replace the line
  [ ${IN_HOTPLUG:-0} = 1 ] && return 0 # bridge setup finished
with
  [ ${IN_HOTPLUG:-0} = 1 ] && sleep 1s && return 0 # bridge setup finished
Alternatively, you could add a sleep in br2684ctl utility right after pidfile creation.

Anyway, these are merely workarounds. The proper solution would require baselayout to lock net.${iface} while an instance of that script is still running.
Comment 35 Roy Marples (RETIRED) gentoo-dev 2007-12-09 09:07:12 UTC
(In reply to comment #34)
> Anyway, these are merely workarounds. The proper solution would require
> baselayout to lock net.${iface} while an instance of that script is still
> running.

baselayout already locks it.
To see this, open two terminals and start the same init script in each. Obviously make the init script sleep a little to make things easier.
Comment 36 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-09-11 18:04:17 UTC
mrness:
Is this still needed/broken? The locking is correct.
I'll close 2010/10/11 if there is no response.
Comment 37 Renato Caldas 2010-09-11 22:35:20 UTC
Sorry, I've not been using gentoo for a while now.. I guess you can close it, thanks!
Comment 38 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-09-12 08:47:16 UTC
This should work, and the user can't test anymore.