Upon shutting down a domU under Xen, the script "/etc/xen/scripts/vif-bridge" is invoked with an "offline" argument. This applies to the recommended setup of connecting domUs to the dom0 through a bridged device named xenbr0. The relevant snippet of code reads as follows:

==========================================
case "$command" in
    online)
        setup_virtual_bridge_port "$dev"
        mtu="`ip link show $bridge | awk '/mtu/ { print $5 }'`"
        if [ -n "$mtu" ] && [ "$mtu" -gt 0 ]
        then
                ip link set $dev mtu $mtu || :
        fi
        add_to_bridge "$bridge" "$dev"
        ;;

    offline)
        do_without_error brctl delif "$bridge" "$dev"
        do_without_error ifconfig "$dev" down
        ;;

    add)
        setup_virtual_bridge_port "$dev"
        add_to_bridge "$bridge" "$dev"
        ;;
esac
==========================================

The function "do_without_error" called from the "offline)" pattern in the "case" statement is defined in /etc/xen/scripts/xen-hotplug-common.sh, which is indirectly sourced through /etc/xen/scripts/vif-common.sh, and reads as follows:

==========================================
do_without_error() {
  "$@" 2>/dev/null || log debug "$@ failed"
}
==========================================

The call 'do_without_error brctl delif "$bridge" "$dev"' executes brctl delif "$bridge" "$dev", and the call 'do_without_error ifconfig "$dev" down' executes ifconfig "$dev" down. Both discard any error output, but in case of an error (i.e. a non-zero exit code) they still log a "failed" message to syslog as follows:

==========================================
Feb 26 22:14:29 vm-host logger: /etc/xen/scripts/vif-bridge: brctl delif xenbr0 vif1.0 failed
Feb 26 22:14:29 vm-host logger: /etc/xen/scripts/vif-bridge: ifconfig vif1.0 down failed
==========================================

Upon investigation, the problem seems to be that the network device (at least for paravirtualized guests using the netfront/netback device model) has already been destroyed by the dom0 kernel by the time the script is run.
This is evidenced by the following entries in syslog preceding the error messages quoted above:

==========================================
Feb 26 22:14:29 vm-host kernel: [ 6169.989895] xenbr0: port 1(vif1.0) entered disabled state
Feb 26 22:14:29 vm-host kernel: [ 6170.007496] xenbr0: port 1(vif1.0) entered disabled state
Feb 26 22:14:29 vm-host kernel: [ 6170.007568] device vif1.0 left promiscuous mode
Feb 26 22:14:29 vm-host kernel: [ 6170.007571] xenbr0: port 1(vif1.0) entered disabled state
==========================================

These findings are further supported by the error messages of the commands run through "do_without_error" (captured by redirecting stderr to a file rather than to /dev/null), which are as follows:

==========================================
for brctl:    "interface vif1.0 does not exist!"
for ifconfig: "vif1.0: ERROR while getting interface flags: No such device"
==========================================

Suggested fix:

for brctl: check whether the interface still exists and is still linked to the bridge prior to invoking the brctl command
for ifconfig: check whether the interface still exists and is still up prior to invoking the ifconfig command

as follows:

==========================================
case "$command" in
    online)
        setup_virtual_bridge_port "$dev"
        mtu="`ip link show $bridge | awk '/mtu/ { print $5 }'`"
        if [ -n "$mtu" ] && [ "$mtu" -gt 0 ]
        then
                ip link set $dev mtu $mtu || :
        fi
        add_to_bridge "$bridge" "$dev"
        ;;

    offline)
        brctl show "$bridge" | grep -q "$dev" && do_without_error brctl delif "$bridge" "$dev"
        ifconfig -s | grep -q "$dev" && do_without_error ifconfig "$dev" down
        ;;

    add)
        setup_virtual_bridge_port "$dev"
        add_to_bridge "$bridge" "$dev"
        ;;
esac
==========================================

In terms of functionality my suggested fix does not change anything: if the interface is still linked to the bridge (or is still up), which might be the case for devices passed through via PCI from dom0 to a domU, the removal from the bridge (or bringing the interface down) is performed exactly as before. It does, however, do away with the nasty error messages in the syslog.
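
To make the logging behaviour described above easier to reproduce without a running domU, here is a minimal sketch. It reuses the do_without_error definition quoted in the report; the log function and the fake_delif command are hypothetical stand-ins (the real log helper in the Xen hotplug scripts writes to syslog, and the real failing command is brctl/ifconfig on a vanished interface).

```shell
#!/bin/sh
# Hypothetical stub for the hotplug scripts' "log" helper: the real one goes
# to syslog, this one simply prints "[level] message" to stdout.
log() {
    level="$1"; shift
    echo "[$level] $*"
}

# Definition as quoted in the report: stderr of the command is discarded,
# but a non-zero exit status still produces a debug log entry.
do_without_error() {
    "$@" 2>/dev/null || log debug "$@ failed"
}

# Hypothetical stand-in for "brctl delif" on an interface that is already
# gone: it complains on stderr and returns a non-zero status.
fake_delif() {
    echo "interface $2 does not exist!" >&2
    return 1
}

do_without_error fake_delif xenbr0 vif1.0   # stderr hidden, failure logged
do_without_error true                       # success: nothing is logged
```

The first call prints only the "failed" line from log; the diagnostic that explains why the command failed never becomes visible, which is why stderr had to be redirected to a file to find the cause.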
@kk, your report looks sane to me, but I feel this fix should go upstream (instead of being kept only in Gentoo); other distros would benefit as well. Would you kindly report this to Xen upstream?

Also, I'd personally try to avoid the "do something | grep -q xx && do others" style and would adjust it slightly as follows (grep's -q and -s options are not portable, see man 1 grep):

    offline)
        if brctl show "$bridge" | grep "$dev" > /dev/null 2>&1 ; then
            do_without_error brctl delif "$bridge" "$dev"
        fi
        if ifconfig -s "$dev" > /dev/null 2>&1 ; then
            do_without_error ifconfig "$dev" down
        fi
        ;;
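
One caveat that applies to both variants of the offline) branch above: grep "$dev" is an unanchored regex match, so "vif1.0" would also match a port named "vif1.01", and the "." matches any character. A possible alternative, sketched here under the assumption of a Linux dom0 with sysfs mounted at /sys, is to test for the per-interface entries directly. The helper names iface_exists and is_bridge_port and the SYSFS_NET variable are made up for this illustration and are not part of the Xen scripts.

```shell
#!/bin/sh
# Root of the per-interface sysfs tree. Overridable only so the helpers can be
# tried against a scratch directory (hypothetical variable, not used by Xen).
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# True if a network interface with exactly this name exists.
iface_exists() {
    [ -e "$SYSFS_NET/$1" ]
}

# True if interface $2 is currently a port of bridge $1
# (the Linux bridge lists its ports under <bridge>/brif/).
is_bridge_port() {
    [ -e "$SYSFS_NET/$1/brif/$2" ]
}

# Possible use inside the offline) branch (assumption, untested with Xen):
#   if is_bridge_port "$bridge" "$dev" ; then
#       do_without_error brctl delif "$bridge" "$dev"
#   fi
#   if iface_exists "$dev" ; then
#       do_without_error ifconfig "$dev" down
#   fi
```

Note that iface_exists only checks that the device is present, not that it is up; bringing down an interface that is already down is harmless, so the sketch does not try to distinguish the two.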
Yes, in this case reporting upstream seems apt. Do you know where and how? There are the xen-user and xen-devel MLs, and Xen afaik has a Bugzilla. The URL text box is usually used for a link to an upstream bug, but it can also hold a link to an e-mail published on a xensource ML page.
Yixun & Ian, thanks for your comments. I'll initially send an e-Mail to the xen-user ML. It's anyway mainly a copy&paste thing from my report here - albeit I'll be amending the fix I had suggested with what Yixun has provided. If appropriate I am sure Ian Campbell on there will redirect/resend my mail to the xen-devel mailing list. Yixun or Ian - would anyone of you like to be on CC? Thanks, KK
(In reply to KK from comment #3)
> Yixun & Ian,
> thanks for your comments. I'll initially send an e-Mail to the xen-user ML.
> It's anyway mainly a copy&paste thing from my report here - albeit I'll be
> amending the fix I had suggested with what Yixun has provided.
>
> If appropriate I am sure Ian Campbell on there will redirect/resend my mail
> to the xen-devel mailing list.
>

Just tell me fellow Ian @ gentoo sent you. I've been in contact with him extensively lately. Adding a link to this bug would be apt.

> Yixun or Ian - would anyone of you like to be on CC?
>
> Thanks,
>
> KK

hmm, well, add me if you care to. Won't hurt.
No further news