Bug 131723 - wrong outgoing IP address in traceroute
Summary: wrong outgoing IP address in traceroute
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal
Assignee: Eldad Zack (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-04-29 15:05 UTC by Jan Kundrát (RETIRED)
Modified: 2007-05-07 21:44 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
strace.err (strace.err,12.17 KB, text/plain)
2006-10-09 15:15 UTC, Jan Kundrát (RETIRED)
strace.ok (strace.ok,20.51 KB, text/plain)
2006-10-09 15:15 UTC, Jan Kundrát (RETIRED)
experimental ebuild, attempt to solve source address problem. (traceroute-1.4_p12-r5.ebuild,1.75 KB, text/plain)
2006-10-13 17:37 UTC, Eldad Zack (RETIRED)

Description Jan Kundrát (RETIRED) gentoo-dev 2006-04-29 15:05:38 UTC
I have a multihomed host. Many of its interfaces have multiple IP addresses from several IP ranges: some are local to a small part of the network (and unroutable elsewhere), others are "global" in our environment. The interfaces have non-standard names like "eth-wl-jih". I use baselayout with iproute2 to assign addresses to interfaces.

The problem is that traceroute simply selects the first IP address of the interface that it uses for sending packets (determined from routing table, I guess) instead of using the correct address as could be determined from the routing table. I suspect that using ifconfig for setup (thus creating "aliased interfaces") is likely to fix the problem, but I prefer iproute2.

In this particular setup, the problem was caused by specifying the "local" address as the first one in /etc/conf.d/net and the "routable" address as the second one. Packets were sent with the "local" address even though I tracerouted a host that is reached through the default route (please note that checking the default route is *not* enough, as you might have a lot of other routes with higher priority).

Credits for debugging this problem go to Jan Krupa, <krupaj@mobilnews.cz>.
Comment 1 Markus Ullmann (RETIRED) gentoo-dev 2006-10-08 14:41:13 UTC
This has to be fixed upstream as we can't do much about that, but I doubt they'll fix it anyway as the last release was made in 2000
Comment 2 Eldad Zack (RETIRED) gentoo-dev 2006-10-09 14:06:44 UTC
I'll take this
Comment 3 Eldad Zack (RETIRED) gentoo-dev 2006-10-09 14:06:54 UTC
I'll take this
Comment 4 Eldad Zack (RETIRED) gentoo-dev 2006-10-09 14:09:37 UTC
Please give me a more technical test case:

route tables (route -n)
commands, results, expected results - if it isn't clear from traceroute's output, tcpdumps as well.



Comment 5 Jan Kundrát (RETIRED) gentoo-dev 2006-10-09 15:13:55 UTC
> route tables (route -n)

pstros ~ # ip r | wc -l
143

If you really need it, let me know. Even if it helped, changing the route table is not a long-term option.

> commands, results, expected results - if it isn't clear from traceroute's
> output, tcpdumps as well.

This is the setup that works. Please note that 10.18.64.25/30 is listed as the first address of the interface.

Our network uses addresses from the 10.0.0.0/8 subnet. The local end of the "upstream" link that has a default route has the IP address 10.18.64.25/30.

pstros ~ # ip a show dev eth-wl-jih
3: eth-wl-jih: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:d0:b7:07:80:b7 brd ff:ff:ff:ff:ff:ff
    inet 10.18.64.25/30 brd 10.18.64.27 scope global eth-wl-jih
    inet 192.168.5.1/30 brd 192.168.5.3 scope global eth-wl-jih
    inet 10.18.6.249/29 brd 10.18.6.255 scope global eth-wl-jih
    inet6 fe80::2d0:b7ff:fe07:80b7/64 scope link
       valid_lft forever preferred_lft forever

pstros ~ # ip r get 10.19.1.1
10.19.1.1 via 10.18.64.26 dev eth-wl-jih  src 10.18.64.25
    cache  mtu 1500 advmss 1460 metric 10 64
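The `src` field in that output is the source address the kernel prefers for this destination. As a rough illustration only, the fields can be pulled out of the text like this. The helper name `parse_route_get` is hypothetical, and the parsing assumes the single-route output format shown above:

```python
import re


def parse_route_get(output: str) -> dict:
    """Extract gateway, device and preferred source from `ip route get` text.

    Hypothetical helper for illustration; assumes the format shown above.
    """
    fields = {}
    for key in ("via", "dev", "src"):
        # Each keyword is followed by whitespace and a single value.
        match = re.search(rf"\b{key}\s+(\S+)", output)
        if match:
            fields[key] = match.group(1)
    return fields


# Output copied from the `ip r get 10.19.1.1` call above.
SAMPLE = (
    "10.19.1.1 via 10.18.64.26 dev eth-wl-jih  src 10.18.64.25\n"
    "    cache  mtu 1500 advmss 1460 metric 10 64\n"
)

if __name__ == "__main__":
    print(parse_route_get(SAMPLE))
```

For the sample above this yields `10.18.64.25` as the source address, which is the address traceroute should be using.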

Now when I traceroute 10.19.1.1, I get a proper and functional traceroute (see attached strace.ok):

pstros ~ # strace -o strace.ok traceroute -n 10.19.1.1
traceroute to 10.19.1.1 (10.19.1.1), 30 hops max, 40 byte packets
 1  10.18.64.26  3.962 ms  2.710 ms  5.320 ms
 2  10.18.64.34  4.339 ms  4.672 ms  7.490 ms
 3  10.19.9.241  5.895 ms  4.327 ms  6.277 ms
 4  10.19.3.194  6.579 ms  7.609 ms  5.069 ms
 5  10.19.1.1  6.869 ms  5.182 ms  8.199 ms

However, when I set the eth-wl-jih interface like this -- the only change is another address added as the first one of the interface:
pstros ~ # ip a show eth-wl-jih
3: eth-wl-jih: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:d0:b7:07:80:b7 brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.1/24 brd 192.168.99.255 scope global eth-wl-jih
    inet 10.18.64.25/30 brd 10.18.64.27 scope global eth-wl-jih
    inet 192.168.5.1/30 brd 192.168.5.3 scope global eth-wl-jih
    inet 10.18.6.249/29 brd 10.18.6.255 scope global eth-wl-jih
    inet6 fe80::2d0:b7ff:fe07:80b7/64 scope link
       valid_lft forever preferred_lft forever

traceroute sends packets (see the "strace.err" attachment) that never receive a reply, because they are sent with the source address 192.168.99.1.

The workaround is to list the "most commonly used outgoing IP address" as the first one on every interface. A proper fix would be to parse the routing table and bind to the respective address.
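To make the difference concrete, here is a small sketch comparing the two strategies on the addresses from this report. It is a simplification, not traceroute's or the kernel's actual logic: the function names are made up, and "pick the address whose subnet contains the next hop" only approximates real source address selection.

```python
import ipaddress

# Addresses of eth-wl-jih in the failing setup, in the order shown by `ip a`.
ADDRESSES = [
    "192.168.99.1/24",
    "10.18.64.25/30",
    "192.168.5.1/30",
    "10.18.6.249/29",
]
GATEWAY = "10.18.64.26"  # next hop reported by `ip r get 10.19.1.1`


def first_address(addresses):
    """What traceroute effectively does: take the interface's first address."""
    return str(ipaddress.ip_interface(addresses[0]).ip)


def address_for_gateway(addresses, gateway):
    """Simplified route-aware choice: the address sharing a subnet with the gateway."""
    gw = ipaddress.ip_address(gateway)
    for addr in addresses:
        iface = ipaddress.ip_interface(addr)
        if gw in iface.network:
            return str(iface.ip)
    return None


if __name__ == "__main__":
    print(first_address(ADDRESSES))                 # 192.168.99.1
    print(address_for_gateway(ADDRESSES, GATEWAY))  # 10.18.64.25
```

The first result is the unroutable address that ends up in the packets; the second matches the `src` reported by `ip r get`.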

The two promised files will be attached.
Comment 6 Jan Kundrát (RETIRED) gentoo-dev 2006-10-09 15:15:26 UTC
Created attachment 99240
strace.err
Comment 7 Jan Kundrát (RETIRED) gentoo-dev 2006-10-09 15:15:48 UTC
Created attachment 99241
strace.ok
Comment 8 Eldad Zack (RETIRED) gentoo-dev 2006-10-13 17:36:28 UTC
I think I have a solution. The problem is that traceroute tries to be smart and determine the source address itself, but doesn't have enough information...

All I did was remove the "guessing" code, which amounts to letting the kernel sort it out instead. It works fine on my machine.
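For readers who want to see what "letting the kernel sort it out" looks like from user space, one common technique is to connect an unbound UDP socket to the destination and read back the local address the kernel assigned. This is only a sketch of that general idea, not necessarily what the patch does; the function name is made up, and 33434 is simply traceroute's traditional base port.

```python
import socket


def kernel_source_address(destination: str, port: int = 33434) -> str:
    """Ask the kernel which source address it would use for `destination`.

    connect() on a UDP socket sends no packets; it only makes the kernel
    perform a route lookup and pick a local address for the socket.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((destination, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # Loopback is used here so the example runs on any machine.
    print(kernel_source_address("127.0.0.1"))
```

On the reporter's host, calling this with 10.19.1.1 would be expected to return the same address as the `src` field of `ip r get 10.19.1.1`, regardless of the order in which addresses were added to the interface.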

I'm attaching an ebuild here, try it and see if it solves your problem.
Comment 9 Eldad Zack (RETIRED) gentoo-dev 2006-10-13 17:37:38 UTC
Created attachment 99610
experimental ebuild, attempt to solve source address problem.
Comment 10 Eldad Zack (RETIRED) gentoo-dev 2006-10-13 17:38:45 UTC
BTW, the patch is here:

dev.gentoo.org/~eldad/patches/traceroute-1.4a12-let_kernel_find_address.patch


If it solves your problem I'll add it to the patchball and commit a proper ebuild to cvs.
Comment 11 Jan Kundrát (RETIRED) gentoo-dev 2006-10-14 02:33:56 UTC
This patch works when added to the (current stable) 1.4_p12-r2 ebuild.

I've looked at the -r5 ebuild (the version you've based your patch on) and it seems to differ a lot. For example, it says "nasty hack until bug 93363 is fixed" although that bug has already been fixed, the LDFLAGS filtering seems to differ, and it installs this setuid app in such a way that even non-wheel users can use it.

That said, your ebuild works for me here as well (2.6 hardened-sources), although I'll stick with the -r2 version with your patch until the above-mentioned issues are solved.

And it would be nice if BSD people could check that it doesn't break for them.
Comment 12 Eldad Zack (RETIRED) gentoo-dev 2006-10-16 14:48:15 UTC
Can you get some BSD guys to test this too before I commit the changes?
Comment 13 Roy Marples (RETIRED) gentoo-dev 2006-10-18 03:00:32 UTC
Doesn't break my FreeBSD, but I've not used the test case as noted above.
Can't fault the logic of the patch though :)
Comment 14 Eldad Zack (RETIRED) gentoo-dev 2006-11-18 04:38:06 UTC
in CVS.
Comment 15 Daniel Drake (RETIRED) gentoo-dev 2007-05-07 21:44:50 UTC
this bug introduced bug #158851