Bug 123404 - sys-apps/watchdog broken interface monitoring
Summary: sys-apps/watchdog broken interface monitoring
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Henrik Brix Andersen
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-02-19 09:43 UTC by Christoph Probst
Modified: 2006-10-20 05:44 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Patch to resolve the iface limit using strings (watchdog_5.2.4-iface-byte-limit.diff,1.60 KB, patch)
2006-02-20 13:40 UTC, Christoph Probst

Description Christoph Probst 2006-02-19 09:43:45 UTC
I configured my software watchdog to monitor eth0 using

  interface               = eth0

in watchdog.conf. This worked until my traffic exceeded 2147483647 bytes. This is because the check_iface() function in src/iface.c defines

  unsigned int bytes = atoi(line + i + strlen(dev->name) + 1);

to find out the number of bytes sent over one interface. But atoi() returns an int, which is limited to 2147483647, so the parsed value stops changing once the real counter passes that limit. Hence check_iface() returns 101/ENETUNREACH and watchdog may reboot the machine for no reason.

In my logfiles this issue shows up as

  Feb 19 17:45:20 [watchdog] device eth0 received 2147483647 bytes
  Feb 19 17:45:20 [watchdog] device eth0 did not receive anything since last check
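
For illustration only, a minimal sketch of the wider-integer alternative: it parses the counter with strtoull() instead of atoi(). This is not the code in watchdog and not the attached patch. The helper name parse_counter is hypothetical, and the sketch assumes the byte counter is a decimal field at the start of the string it is given.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not part of watchdog): parse a decimal byte
 * counter into an unsigned long long.
 * Returns 0 on success, -1 if no digits were found or the value
 * does not fit in unsigned long long. */
int parse_counter(const char *field, unsigned long long *out)
{
    char *end;

    assert(field != NULL && out != NULL);

    errno = 0;
    *out = strtoull(field, &end, 10);   /* skips leading whitespace */
    if (end == field || errno == ERANGE)
        return -1;
    return 0;
}
```

With a helper like this, a counter of 2147483648 or more is read as its real value instead of being pinned at 2147483647.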
Comment 1 Henrik Brix Andersen 2006-02-19 10:27:08 UTC
What is your proposed solution to this problem? Making the variable larger (long long) will only make the problem appear later...
Comment 2 Christoph Probst 2006-02-19 13:01:17 UTC
Well, if /sbin/ifconfig can live with 'unsigned long long' it should be enough for watchdog as well. 18 million terabytes is _a lot_ for today.

To be safe for the future one could read the value as a string and convert only the last 9 digits using atoi(). For watchdog it is currently only important that there was traffic, not how much. (I know that this will fail if you have exactly 1 GB of traffic between two watchdog runs, but it's still possible to use 'long long's and 15 digits to minimize the chance.)
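
As a rough sketch of the last-9-digits idea (the helper name counter_tail is hypothetical and nothing here is taken from watchdog), the digit run can be truncated before it is handed to atoi(). The largest possible result is 999999999, which fits in a 32-bit int.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the numeric value of the last (up to)
 * nine digits of the digit run at the start of field. */
int counter_tail(const char *field)
{
    char tail[10];          /* 9 digits plus terminating NUL */
    size_t len = 0;
    size_t keep;

    assert(field != NULL);

    while (isspace((unsigned char)*field))
        field++;
    while (isdigit((unsigned char)field[len]))
        len++;

    keep = len < 9 ? len : 9;
    memcpy(tail, field + len - keep, keep);
    tail[keep] = '\0';
    return atoi(tail);      /* an empty digit run yields 0 */
}
```

Two counters that differ by an exact multiple of 1000000000 bytes produce the same tail, which is the blind spot mentioned above.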

Another way might be entirely operating on strings using strcmp().
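
A sketch of what a purely string-based check could look like, for illustration only and not the attached patch. The names copy_counter, counter_changed and COUNTER_MAX are hypothetical, and the buffer size is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define COUNTER_MAX 32  /* assumed upper bound on stored digits, incl. NUL */

/* Hypothetical helper: copy the digit run at the start of field
 * (after optional whitespace) into dst as a NUL-terminated string. */
void copy_counter(char dst[COUNTER_MAX], const char *field)
{
    size_t n = 0;

    assert(dst != NULL && field != NULL);

    while (isspace((unsigned char)*field))
        field++;
    while (n < COUNTER_MAX - 1 && isdigit((unsigned char)field[n])) {
        dst[n] = field[n];
        n++;
    }
    dst[n] = '\0';
}

/* Returns 1 if the counter text differs from the previously saved
 * text, 0 if it is identical. No integer conversion is involved. */
int counter_changed(const char *prev, const char *field)
{
    char now[COUNTER_MAX];

    assert(prev != NULL);

    copy_counter(now, field);
    return strcmp(prev, now) != 0;
}
```

The caller would keep the previous counter text per interface and overwrite it after each check. Since only equality matters, the size of the number never comes into play.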

Tell me what you'd prefer and I can write the patch if you are thinking about resolving this issue.
Comment 3 Henrik Brix Andersen 2006-02-19 13:46:40 UTC
(In reply to comment #2)
> Well, if /sbin/ifconfig can live with 'unsigned long long' it should be enough
> for watchdog as well. 18 million terabytes is _a lot_ for today.

True.

> Tell me what you'd prefer and I can write the patch if you are thinking about
> resolving this issue.

It's not really what I would prefer that counts - it is what the upstream author prefers. Personally I think strcmp() is the most sane approach.

Please write a patch for this issue and submit it upstream for inclusion. Once it gets accepted upstream, please reopen this bug report and I'll take care of the Gentoo side of things. Thanks.


Comment 4 Christoph Probst 2006-02-20 13:40:42 UTC
Created attachment 80309
Patch to resolve the iface limit using strings

I'll submit this patch upstream as well. Maybe someone replies this time.
Comment 5 Olliver Schinagl 2006-10-20 05:44:06 UTC
IIRC the kernel doesn't count more than 4 GB of traffic anyway, so the counter resets itself after 4 GB. AFAIK it's an unsigned int, thus 32 bits, thus 4 GB. I once wanted to use ifconfig to monitor the bandwidth used on an interface and ran into this limitation. Changing it to a 64-bit int wasn't easy either, and the kernel developers said it was pointless anyhow, more or less.

So I'd say make it 32 bits or more and simply compare the two ints: if old != new, it's fine. The only thing to watch out for is that it gets probed at least once every X minutes, with X small enough for your interface speed, e.g. for 10 Mbit an hour is plenty, whereas for 100 Mbit at least every 10 minutes is needed, etc.
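
To make the interval estimate concrete, here is a small sketch under two stated assumptions: the counter wraps at 2^32 bytes and the link is fully saturated. The function names saw_traffic and wrap_seconds are hypothetical. Under those assumptions the wrap time works out to roughly 57 minutes at 10 Mbit/s and a little under 6 minutes at 100 Mbit/s, so the probe interval would have to stay below those figures for the old != new test to be unambiguous.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if the 32-bit byte counter moved between two
 * samples, 0 otherwise. A wrap from a large value to a small one still
 * counts as a change. */
int saw_traffic(uint32_t old_bytes, uint32_t new_bytes)
{
    return old_bytes != new_bytes;
}

/* Hypothetical helper: seconds until a counter that wraps at 2^32 bytes
 * returns to the same value, for a link saturated at the given rate
 * in bits per second. */
double wrap_seconds(double bits_per_second)
{
    assert(bits_per_second > 0.0);
    return 4294967296.0 / (bits_per_second / 8.0);
}
```

For example, wrap_seconds(10e6) / 60.0 gives the 10 Mbit/s figure in minutes.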