I configured my software watchdog to monitor eth0 using
interface = eth0
in watchdog.conf. This worked until the interface's byte counter exceeded 2147483647. The cause is that the check_iface() function in src/iface.c uses
unsigned int bytes = atoi(line + i + strlen(dev->name) + 1);
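to read the byte counter of an interface. But atoi() returns an int, which is limited to 2147483647 where int is 32 bits wide. Once the counter passes that limit the parsed value stays at 2147483647 and no longer changes between checks, so check_iface() returns 101/ENETUNREACH and watchdog may reboot the machine for no reason.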
In my logfiles this issue shows up as
Feb 19 17:45:20 [watchdog] device eth0 received 2147483647 bytes
Feb 19 17:45:20 [watchdog] device eth0 did not receive anything since last check
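To illustrate the failure mode, here is a minimal sketch and not the actual watchdog code. parse_saturating_int() and traffic_seen() are hypothetical helpers; the first one emulates the saturation at INT_MAX that the log shows (the C standard leaves atoi() overflow undefined, so saturation is only what was observed here).

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for the atoi() call: parse a decimal counter and
 * saturate at INT_MAX, mimicking the value seen in the log. */
static int parse_saturating_int(const char *s)
{
    long long v = strtoll(s, NULL, 10);

    if (v > INT_MAX)
        return INT_MAX;
    if (v < INT_MIN)
        return INT_MIN;
    return (int)v;
}

/* Returns 1 if the two samples look different after parsing, 0 otherwise. */
static int traffic_seen(const char *old_field, const char *new_field)
{
    return parse_saturating_int(old_field) != parse_saturating_int(new_field);
}
```

With a 32-bit int, traffic_seen("3000000000", "3000001000") yields 0 although the counter clearly moved, which matches the "did not receive anything" message.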
What is your proposed solution to this problem? Making the variable larger (long long) will only make the problem appear later...
Well, if /sbin/ifconfig can live with 'unsigned long long', it should be enough for watchdog as well. 18 million terabytes is _a lot_ for today.
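A sketch of what the 'unsigned long long' variant could look like, using the standard strtoull(). parse_counter() is a made-up name for illustration, and the caller is assumed to pass a pointer to the start of the counter field.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse an unsigned decimal counter. Returns 0 on success and stores the
 * value in *out; returns -1 if no digits were found or the value does not
 * fit into an unsigned long long. */
static int parse_counter(const char *s, unsigned long long *out)
{
    char *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(s, &end, 10);
    if (end == s || errno == ERANGE)
        return -1;
    *out = v;
    return 0;
}
```

Unlike atoi(), this reports out-of-range input instead of silently returning a wrong value.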
To be safe for the future one could read the value as a string and convert only the last 9 digits using atoi(). For watchdog it currently only matters that there was traffic, not how much. (I know that this will fail if you have exactly 1 GB of traffic between two watchdog runs, but it's still possible to use 'long long' and 15 digits to minimize the chance.)
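The last-9-digits idea could be sketched roughly like this. last_digits() and KEEP_DIGITS are hypothetical names; nine digits are at most 999999999, which always fits into a 32-bit int, so atoi() is safe on the truncated copy.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define KEEP_DIGITS 9

/* Convert only the last KEEP_DIGITS digits of the decimal number that
 * starts at s (after optional leading whitespace). */
static int last_digits(const char *s)
{
    char buf[KEEP_DIGITS + 1];
    size_t len = 0;

    while (isspace((unsigned char)*s))
        s++;
    while (isdigit((unsigned char)s[len]))
        len++;
    if (len > KEEP_DIGITS) {
        s += len - KEEP_DIGITS;   /* skip the leading digits */
        len = KEEP_DIGITS;
    }
    memcpy(buf, s, len);
    buf[len] = '\0';
    return atoi(buf);
}
```

The blind spot mentioned above is visible here: "5000000000" and "6000000000" both map to 0, so a difference of exactly 10^9 bytes goes unnoticed.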
Another way might be to operate entirely on strings using strcmp().
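A rough sketch of the string-only variant, assuming both samples point at the start of the counter field. It uses strncmp() rather than plain strcmp() so that only the digits of the counter are compared and not the rest of the line; digit_run() and counter_changed() are hypothetical helpers.

```c
#include <ctype.h>
#include <string.h>

/* Length of the leading run of decimal digits in s. */
static size_t digit_run(const char *s)
{
    size_t n = 0;

    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Returns 1 if the two counter fields differ, 0 if they are identical. */
static int counter_changed(const char *old_field, const char *new_field)
{
    size_t old_len = digit_run(old_field);
    size_t new_len = digit_run(new_field);

    if (old_len != new_len)
        return 1;
    return strncmp(old_field, new_field, old_len) != 0;
}
```

Since no number is ever converted, the width of the counter does not matter at all.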
Tell me what you'd prefer and I can write the patch if you are considering resolving this issue.
(In reply to comment #2)
> Well, if /sbin/ifconfig can live with 'unsigned long long' it should be enough
> for watchdog as well. 18 million terabytes is _a lot_ for today.
> Tell me what you'd prefer and I can write the patch if you think about
> resolving this issue.
It's not really what I would prefer that counts - it is what the upstream author prefers. Personally I think strcmp() is the most sane approach.
Please write a patch for this issue and submit it upstream for inclusion. Once it gets accepted upstream, please reopen this bug report and I'll take care of the Gentoo side of things. Thanks.
Created attachment 80309
Patch to resolve the iface limit using strings
I'll submit this patch upstream as well. Maybe someone replies this time.
IIRC the kernel doesn't count more than 4 GB of traffic anyway, so the counter resets itself after 4 GB. AFAIK it's an unsigned int, thus 32 bits, thus 4 GB. I once wanted to use ifconfig to monitor the bandwidth used on an interface and ran into this limitation. Changing it to a 64-bit int wasn't easy either, and the kernel developers said that it was pointless anyhow, more or less.
So I'd say: make it 32 bits or more and simply compare the two ints; if old != new, it's fine. The only thing to watch out for is that it gets probed at least once every X minutes, X being small enough for your interface speed, e.g. for 10 Mbit an hour is plenty, whereas for 100 Mbit at least every 10 minutes is needed.
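As a rough check on such intervals, the time a counter needs to wrap can be estimated as below. This is only a sketch under stated assumptions: a 32-bit byte counter as described in this comment, a link saturated in one direction, and 1 Mbit/s taken as 1,000,000 bits per second. wrap_seconds() is a hypothetical helper.

```c
/* Seconds until a 32-bit byte counter wraps at a sustained line rate.
 * Assumes 1 Mbit/s = 1,000,000 bits per second. */
static double wrap_seconds(double mbit_per_s)
{
    const double counter_bytes = 4294967296.0;  /* 2^32 */

    return counter_bytes * 8.0 / (mbit_per_s * 1e6);
}
```

Under these assumptions wrap_seconds(10) is about 3436 s (roughly 57 minutes) and wrap_seconds(100) about 344 s (just under 6 minutes). A plain old != new comparison only misses traffic if the counter lands on exactly the same value again, so these figures are the shortest intervals at which that could happen.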