Bug 158317 - >=sys-kernel/gentoo-sources-2.6.17-r8 - web pages don't finish loading
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Reported: 2006-12-16 12:46 UTC by Joel Kammet
Modified: 2006-12-16 21:28 UTC

Attachments
wireshark tcp capture - plain text (tcp-trace.txt, 15.20 KB, text/plain)
2006-12-16 19:21 UTC, Joel Kammet

Description Joel Kammet 2006-12-16 12:46:30 UTC
I can't get certain web pages to finish loading in any browser (tried Firefox-bin, Konqueror, Galeon, Epiphany, Links) under kernel 2.6.18-gentoo-r4 or 2.6.17-gentoo-r8. I haven't noticed any problems loading pages at other websites, but on the other hand, this problem occurs ONLY on my Gentoo machine when running these kernels. These pages seem to work fine on every other machine I try: my laptop running Ubuntu Dapper, an old desktop running Debian Sarge, another old desktop running RH9.

They also work on this Gentoo box if I boot it in WinXP, and -- I just tried booting an old kernel 2.6.14-gentoo-r5 and they work correctly under that too. It might be that I did something different in my kernel configuration, but what? Or is it a bug in the newer kernels?

Here are links to the problem pages:

This one loads only as far as the second line of the course search app (the box to enter the semester) and then hangs.
http://websql.brooklyn.cuny.edu/course_search/

This one LOOKS like it loads a full page, but really never finishes (links to "J" or any subsequent letter don't work):
http://websql.brooklyn.cuny.edu/course_search/acad/dept_list.jsp?div=G
Comment 1 Daniel Drake (RETIRED) gentoo-dev 2006-12-16 15:18:18 UTC
I think you have a broken router in your path.

Does this help:

echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
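
To check the current setting before and after running the command above, the same file can be read back. A minimal sketch in Python, assuming a Linux system; the function name `window_scaling_enabled` is made up for illustration. Note that a value written with `echo` into /proc/sys lasts only until the next reboot.

```python
from pathlib import Path

# Path taken from the echo command above; it exists only on Linux.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/tcp_window_scaling")


def window_scaling_enabled(path: Path = SYSCTL_PATH) -> bool:
    """Return True if the sysctl file holds a non-zero value."""
    return int(path.read_text().strip()) != 0
```

Calling `window_scaling_enabled()` with no argument reads the live kernel setting; passing another path is only useful for testing.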
Comment 2 Joel Kammet 2006-12-16 19:21:05 UTC
Created attachment 104193
wireshark tcp capture - plain text

This capture starts about 4 seconds before trying to load the course-search page, and ends after about 15 or 20 seconds of "no progress".
Comment 3 Joel Kammet 2006-12-16 19:26:06 UTC
(In reply to comment #1)
> I think you have a broken router in your path.
> 
> Does this help:
> 
> echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
> 
Yes, disabling TCP window scaling gets those pages to load, but why should I have to disable it only for these kernels? I have 3 other machines in front of me, all going through the same router: one running Ubuntu's 2.6.15-27-386, one running Debian's 2.6.16-2-686, and one running Red Hat's 2.4.20-31-9. All of them have TCP window scaling enabled, and all of them load those pages correctly.

I posted a wireshark capture of the tcp activity while trying (unsuccessfully) to load the course-search page.  Maybe it will help show you what's happening. 
Comment 4 Daniel Drake (RETIRED) gentoo-dev 2006-12-16 19:52:30 UTC
Newer kernels pick a much larger TCP window scale factor, based on the amount of RAM.

On my 1GB system, upgrading to the newer kernel made my window scale value jump from 2 (factor 4) to 7 (factor 128). You can see this is a huge increase. The same kind of thing will be happening on your system.

For example:

Beforehand, my system would be requesting 512 bytes of data (i.e. 128 bytes with window scale 2, calculation 2^2 * 128). Due to a broken router in my path, the remote end saw that as a request for 128 bytes. However, as this is a long distance connection the remote end will only send 50 bytes at a time to reduce the undesirable effects of packet loss, so I didn't see any degradation.

On new kernels, when my system requests 512 bytes, it is actually requesting 4 bytes with window scale 7 (calculation 2^7 * 4). Due to the broken router, the remote end sees this as a request for 4 bytes of data, and responds with a measly 4 bytes. My system then requests another 512 bytes, but due to the broken router it only gets another 4, so many more requests are needed, resulting in excessively long load times.
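
The arithmetic in the two scenarios above can be sketched as follows. This is an illustration only: the function names are made up, and the "broken router" is modeled simply as something that makes the remote end ignore the scale factor and use the raw header value.

```python
def advertised_field(window_bytes: int, shift: int) -> int:
    """Value placed in the TCP header's window field for a desired window."""
    return window_bytes >> shift


def window_seen(field: int, shift: int, scale_honored: bool) -> int:
    """Window the remote end believes it may fill."""
    return field << shift if scale_honored else field


# The same 512-byte window under the old (shift 2) and new (shift 7) scale.
for shift in (2, 7):
    field = advertised_field(512, shift)
    print(
        f"shift={shift} field={field} "
        f"honored={window_seen(field, shift, True)} "
        f"ignored={window_seen(field, shift, False)}"
    )
```

With the scale honored, both shifts give the same 512-byte window. With it ignored, the larger shift leaves a far smaller window, which is the slowdown described above.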

I was actually able to get my ISP to fix the issue after I narrowed it down to connections made through their transparent NetApp webcache. I also hear that Windows Vista ships with window scaling turned on, so you could use this as a way to motivate your ISP to fix it :)
Comment 5 Joel Kammet 2006-12-16 20:09:54 UTC
(In reply to comment #4)

Looking at the TCP capture I posted, it looks to me as if my system is requesting a window scaling factor of 6, and the server is replying with 0.  (Isn't that what the "WS=6" in the SYN and the "WS=0" in the SYN+ACK mean?)

Since window scaling is an option, isn't the server simply refusing the option, and shouldn't my system accept 0 scaling for the duration of the session?

Why do you attribute this problem to a router?  Or am I misreading the wireshark output?

Please note also that this behavior has been confirmed by two other people on the forums: http://forums.gentoo.org/viewtopic.php?p=3787991#3787991
and that the problem occurs only for certain files coming out of that server.
Comment 6 Daniel Drake (RETIRED) gentoo-dev 2006-12-16 20:41:06 UTC
Then that server may well be behind a broken router and it might not be your fault or your ISP's fault. But I saw the same kind of behavior when my ISP had the problem. I couldn't explain exactly why it affected only certain sites, but I could explain the problem to them and they did confirm it.

Your interpretation of the logs is incorrect, because the window scale is not negotiated. Instead, two scale factors are stated, one from each computer, one for each direction.

Your computer establishes the A-->B connection, saying "you should send data to me with a window scale factor of 6".

The remote computer acknowledges the above packet and establishes the B-->A connection, with a field in the SYN+ACK saying that the window scale in the other direction is 0. According to RFC 1323 this means:
"The value 'shift.cnt' may be zero (offering to scale, while applying a scale factor of 1 to the receive window)."

So, the fact that the remote end specified a window scale at all (even if zero) means that it is saying "I support window scaling, and I accept your window scale factor"
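
The rule described above can be written as a small sketch. It assumes the RFC 1323 behavior that scaling is enabled only when both the SYN and the SYN+ACK carry the Window Scale option; the function name and the use of `None` for "option absent" are illustrative choices, not taken from any real API.

```python
from typing import Optional, Tuple


def negotiated_shifts(
    syn_ws: Optional[int], synack_ws: Optional[int]
) -> Tuple[int, int]:
    """Return (shift for the client's advertised windows,
    shift for the server's advertised windows).

    None means the Window Scale option was absent from that segment.
    """
    if syn_ws is None or synack_ws is None:
        # Either side omitted the option: no scaling in either direction.
        return (0, 0)
    # Each side's shift applies only to the windows that side advertises.
    return (syn_ws, synack_ws)
```

For the capture discussed here (WS=6 in the SYN, WS=0 in the SYN+ACK) this returns `(6, 0)`. Only a SYN+ACK with no Window Scale option at all would turn scaling off.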
Comment 7 Daniel Drake (RETIRED) gentoo-dev 2006-12-16 20:49:59 UTC
On 2.6.19 clamping the window will actually limit the window scaling and you can do this on a per-host/per-route basis. Example:

# ip route add 216.145.246.23/32 via 10.8.0.1 window 65535
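
Why a 65535-byte clamp removes the scaling can be seen from a simplified sketch of how a scale shift might be chosen. It is loosely modeled on the idea of halving the largest possible receive window until it fits in the 16-bit window field; the actual kernel logic may differ between versions, and the buffer size in the usage note is a hypothetical value.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value the 16-bit window field can hold
MAX_SHIFT = 14               # upper limit on the shift count in RFC 1323


def initial_wscale(max_rcv_buffer: int, window_clamp: int = 0) -> int:
    """Smallest shift that lets the (possibly clamped) window fit in 16 bits.

    A window_clamp of 0 means "no clamp".
    """
    space = max_rcv_buffer
    if window_clamp:
        space = min(space, window_clamp)
    shift = 0
    while space > MAX_UNSCALED_WINDOW and shift < MAX_SHIFT:
        space >>= 1
        shift += 1
    return shift
```

With a hypothetical 4 MiB maximum receive buffer, `initial_wscale(4 * 1024 * 1024)` gives a non-zero shift. Adding `window_clamp=65535` brings it down to 0, so no scaling is requested on that route.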
Comment 8 Joel Kammet 2006-12-16 21:28:13 UTC
(In reply to comment #7)
> On 2.6.19 clamping the window will actually limit the window scaling and you
> can do this on a per-host/per-route basis. Example:
> 
> # ip route add 216.145.246.23/32 via 10.8.0.1 window 65535
> 
LOL.  I just finished posting the same suggestion on the forum & was about to post it here.

Thanks for your help Daniel.