Bug 831743 - sys-apps/openrc: net-online should wait for IPv6 DAD announcements before returning
Status: UNCONFIRMED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: OpenRC
Hardware: All Linux
Importance: Normal normal
Assignee: OpenRC Team
URL: https://www.agwa.name/blog/post/bewar...
 
Reported: 2022-01-21 16:40 UTC by Holger Hoffstätte
Modified: 2022-01-25 11:11 UTC


Description Holger Hoffstätte 2022-01-21 16:40:38 UTC
I recently started using IPv6 addresses in my DNS server, which is set to start immediately after net-online, and noticed that after a reboot the DNS server would often be non-functional, leading to further fallout in other system services due to broken DNS. Investigation led me to DAD (Duplicate Address Detection), an IPv6 feature: the interface is marked as UP while its address is still "tentative", i.e. while the kernel is checking the network for duplicate addresses. This check may take 1-2 seconds, and software that starts in that window gets only the tentative address and may fail.
This seems to be a widespread problem, with many postings on StackOverflow etc.

IMHO net-online should wait until an interface is completely usable before returning.

As documented in $URL, Linux has gained several sysctl options to control this behaviour; I tried setting both accept_dad and optimistic_dad, but without success.
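To see what these options are currently set to, a small sketch along the following lines may help. The function name show_dad_sysctls and its optional second argument (an alternate root directory, mainly useful for testing) are made up for illustration, and dad_transmits is a related kernel option that I did not experiment with. Note that optimistic_dad only exists when the kernel was built with CONFIG_IPV6_OPTIMISTIC_DAD, which is one possible reason for setting it to have no effect.

```shell
# Print the DAD-related sysctls for one interface.
# show_dad_sysctls and the optional root argument are hypothetical helpers,
# not part of OpenRC or iproute2.
show_dad_sysctls()
{
	dev=$1
	root=${2:-/proc/sys/net/ipv6/conf}
	for knob in accept_dad optimistic_dad dad_transmits; do
		if [ -r "$root/$dev/$knob" ]; then
			echo "$knob=$(cat "$root/$dev/$knob")"
		else
			# missing file: IPv6 disabled or option not compiled in
			echo "$knob=unavailable"
		fi
	done
}
```

Usage would be something like "show_dad_sysctls eth0", where eth0 is a placeholder interface name.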

The only thing that reliably works for me is waiting until the "tentative" attribute has been cleared, which typically happens in 1-2 seconds max.

Various distributions and init systems have apparently tackled this problem, e.g. Debian in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=705996 or systemd as discussed in https://serverfault.com/questions/766253/ensure-systemd-wait-for-ipv6-before-start-service-unit


Reproducible: Always

Steps to Reproduce:
1. start IPv6-aware service right after net-online
2. service tries to bind to a still-tentative address and may fail



Expected Results:  
net-online should wait for interface address to be no longer tentative.


I added the following to my /etc/init.d/net-online and it reliably does the trick:

start_post()
{
	for dev in ${interfaces}; do
		while true; do
			sleep 1
			# no global IPv6 address yet: try again
			status=$(ip address list dev "$dev" | grep "inet6.*global") || continue
			# no address is tentative any more: this interface is done
			echo "$status" | grep -q "tentative" || break
		done
	done
}

This works for me (I needed a quick fix). Please don't take this verbatim; it would probably need to be integrated with the timeout counter already in net-online, and it likely doesn't work when IPv6 is disabled, etc. However, it should help get the discussion going.
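As a starting point for that discussion, here is a rough sketch of a bounded variant. The function name wait_for_dad and its timeout argument are hypothetical and not tied to the existing net-online timeout counter. It relies on the "tentative" flag filter of "ip address show", and it treats an interface without any tentative address as ready. That means it returns immediately when IPv6 is disabled, but it also does not wait for an address that has yet to be assigned.

```shell
# Sketch only: wait_for_dad and its timeout argument are hypothetical names,
# not part of OpenRC's net-online.
# Returns 0 once no address on the interface is tentative, 1 on timeout.
wait_for_dad()
{
	dev=$1
	timeout=${2:-10}
	while [ "$timeout" -gt 0 ]; do
		# "tentative" limits the listing to addresses still undergoing DAD
		pending=$(ip -6 -o address show dev "$dev" tentative 2>/dev/null)
		[ -z "$pending" ] && return 0
		sleep 1
		timeout=$((timeout - 1))
	done
	return 1
}
```

In start_post() this could be called once per interface, e.g. wait_for_dad "$dev" 5 || ewarn "$dev: IPv6 address still tentative", with the value 5 picked arbitrarily.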