
Bug 25597

Summary: 2.4.20-gentoo-r5 never sees SYN/ACK for SYN it sent
Product: Gentoo Linux
Reporter: seatec-astronomy
Component: [OLD] Core system
Assignee: x86-kernel (DEPRECATED) <x86-kernel>
Status: VERIFIED WORKSFORME
Severity: blocker
CC: steel300
Priority: Highest
Version: unspecified
Hardware: x86
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Attachments:
linux-2.4.20-gentoo-r5 config (attachment 17480)
linux-2.4.20-gentoo-r5 config (attachment 17481)
Even MORE simple link detection. (attachment 20444)

Description seatec-astronomy 2003-07-30 17:29:13 UTC
I had problems with my network for hours today. While ping (ICMP) and DNS (UDP)
worked fine, TCP did not. My post to the forums gives the best overview:
http://forums.gentoo.org/viewtopic.php?t=71122&highlight=seatec&sid=ba5f8d5dea55da725d4e93c0460b9dda
In the end I fixed it by replacing 2.4.20-gentoo-r5 with a vanilla 2.4.20.
After booting the new kernel everything was back to normal. The problem was:
when I sent a SYN out to the internet, my Gentoo machine never "saw" the SYN/ACK.
It was delivered (I sniffed the LAN, and it definitely reached the Gentoo machine),
but when I sniffed locally on the problematic machine, it never showed up.
Apparently the kernel threw it away. The NIC is an onboard 3Com 3c940
gigabit ethernet. The driver was included in the kernel sources. I already had
the network up and working with that kernel. I have absolutely no clue why it
stopped working.
(I'm trying to give accurate information in a few sentences, but this is my first
post to any Bugzilla. I'm sorry if I haven't provided all the information you need.)
Comment 1 Tim Yamin (RETIRED) gentoo-dev 2003-08-26 15:20:53 UTC
Do you get this with -r6? Are you using any netfilter software/firewall/blah, is anything like that enabled/built as modules in your kernel?
Comment 2 seatec-astronomy 2003-08-26 16:19:53 UTC
I know it sounds like it, but there was no firewall of any sort in between. The only thing I know is that the machine sent out SYNs and never saw the SYN/ACKs that actually came back. I still have that kernel config and source. Right now I'm still using a 2.4.20, but I will update in the next few days, so I haven't tested -r6 yet.
CONFIG_NETFILTER and CONFIG_FILTER were not enabled in the config. My NIC is an onboard gigabit ethernet. Maybe there's a problem with the driver implementation?
Comment 3 Jay Pfeifer (RETIRED) gentoo-dev 2003-09-07 08:24:24 UTC
Which NIC are you using? Could you also attach your kernel .config and relevant system info?

Thanks,

Jay
Comment 4 seatec-astronomy 2003-09-10 22:27:31 UTC
Created attachment 17480 [details]
linux-2.4.20-gentoo-r5 config
Comment 5 seatec-astronomy 2003-09-10 22:28:34 UTC
Created attachment 17481 [details]
linux-2.4.20-gentoo-r5 config
Comment 6 seatec-astronomy 2003-09-10 22:33:08 UTC
Argh. Had problems attaching the file. Bugzilla gave me an error msg. :}
The NIC is an onboard chip, a 3Com 3c940. It's a gigabit ethernet chip.
The system is an Intel Pentium 4 2.8 GHz (Northwood). The board is an Asus P4C800 Deluxe. The machine has 512 MB of Infineon RAM.
Comment 7 seatec-astronomy 2003-09-19 08:36:54 UTC
bash-2.05b$ uname -a
Linux IceDragon 2.4.20-gentoo-r7 #1 Fri Sep 19 15:11:45 CEST 2003 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz GenuineIntel GNU/Linux
I recompiled my kernel tonight, after running a vanilla 2.4.18 for weeks. With gentoo-r7 I'm now experiencing the same problems as with gentoo-r5, only it's even more bizarre.
While I'm able to connect to most hosts, www.heise.de, www.google.com and many others just don't work. The symptoms are the same. I see the SYN leave my host, I sniff on the router (some SuSE box) and see the SYN/ACK coming back from the remote host (e.g. www.google.com), but the sniffer running on gentoo-r7 never sees that SYN/ACK coming in.
This is _not_ an application problem. I tried it with different browsers and the telnet client. They work fine. They do what they're supposed to do: send out a SYN and wait for the reply to complete the handshake. The reply just never reaches them. My MTU is 1500. So what's wrong? Do you want that kernel config too?
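
The handshake test described above can be reproduced without a browser. What follows is only a minimal sketch, assuming Python 3 is available on the affected machine; the helper name probe_tcp and the use of port 80 are assumptions for illustration (the report names the hosts but not a port). A "timeout" result is consistent with a SYN/ACK that never arrives, while "refused" means a reset came back, so packets are getting through in both directions.

```python
import errno
import socket


def probe_tcp(host, port, timeout=5.0):
    """Attempt a single TCP handshake and classify the outcome.

    probe_tcp is a hypothetical helper, not part of any tool named in
    this report.
    """
    try:
        # create_connection() resolves the name and performs the
        # SYN / SYN-ACK / ACK exchange; the socket is closed on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No answer within the timeout: the SYN/ACK never reached us.
        return "timeout"
    except ConnectionRefusedError:
        # The peer answered with a reset, so replies do arrive.
        return "refused"
    except OSError as exc:
        # Anything else (DNS failure, unreachable network, ...).
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
```

Running probe_tcp("www.heise.de", 80) on both the affected kernel and a working one would show whether the difference lies in the handshake itself. It does not replace the packet capture, which is what shows where the SYN/ACK is lost.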
Comment 8 Brian Jackson (RETIRED) gentoo-dev 2003-09-19 09:27:57 UTC
Is the gigE driver you are using sk98lin? Have you been able to reproduce this with another NIC or with a different kernel version? I'd be interested to see how a vanilla 2.4.22 treats you.
Comment 9 seatec-astronomy 2003-09-19 10:00:45 UTC
Yes, I'm using the sk98lin driver that came with the kernel (I didn't patch it myself).
I'll try a vanilla 2.4.22 as soon as I find time, probably Sunday, otherwise Monday. I do not have any other NIC in that machine. As I said, vanilla 2.4.18 worked without any problems.
Comment 10 Tim Yamin (RETIRED) gentoo-dev 2003-10-11 05:05:51 UTC
Okay: this is probably a gentoo-sources problem, and it looks like somebody
did a messy backport in <= -r5...

Can you modify the patches-.... archive for gentoo-sources, get rid
of the "sk98lin" file, run
"ebuild /usr/portage/sys-kernel/gentoo-sources/gentoo-sources-2.4.20-r7.ebuild digest"
and remerge gentoo-sources?
Comment 11 Tim Yamin (RETIRED) gentoo-dev 2003-10-21 09:51:49 UTC
* ping * -> see comment #10
Comment 12 seatec-astronomy 2003-10-25 04:22:34 UTC
Sorry, I was pretty busy lately. Did you see that I had the same problem with
r7? I did not patch r7 myself (IIRC, I may be wrong). I think it already contained
that sk98lin driver. So now what?
Comment 13 Tim Yamin (RETIRED) gentoo-dev 2003-10-25 05:01:40 UTC
Yes, -r7 contains the driver, so does -r8. I asked whether you could rip
it out of the patches-... tarball [it's called sk98lin] and run 'patch' with
it, telling it to reverse the patch when asked.
Comment 14 Jordan Ritter 2003-11-08 11:10:11 UTC
Created attachment 20444 [details]
Even MORE simple link detection.

Instead of using mii-tool to get link state, we can re-use the link state
"RUNNING" from ifconfig (which comes from the ioctl SIOCGIFFLAGS, which equates
to the netif_carrier_ok (netdevice.h) from the kernel's net_device struct).
 

This patch simply augments setup_env() with one more variable.
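
The flag check described above can be sketched in Python, assuming Linux. This is not the attached patch, only an illustration of reading the same flag. The request number 0x8913 for SIOCGIFFLAGS and the IFF_UP / IFF_RUNNING bit values are taken from the Linux headers; the helper names are hypothetical.

```python
import fcntl
import socket
import struct

SIOCGIFFLAGS = 0x8913   # Linux ioctl request number (linux/sockios.h)
IFF_UP = 0x1            # interface is administratively up (linux/if.h)
IFF_RUNNING = 0x40      # driver reports the link as operational

# struct ifreq: 16-byte interface name followed by a union. Here only
# the leading 16-bit flags field is read; the rest is padding
# (40 bytes in total, the size on 64-bit Linux).
IFREQ_FORMAT = "16sH22x"


def interface_flags(ifname):
    """Return the raw interface flags via the SIOCGIFFLAGS ioctl."""
    request = struct.pack(IFREQ_FORMAT, ifname.encode("ascii"), 0)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock.fileno(), SIOCGIFFLAGS, request)
    _name, flags = struct.unpack(IFREQ_FORMAT, reply)
    return flags


def link_is_running(flags):
    """True when the interface is both up and reported as running."""
    return bool(flags & IFF_UP) and bool(flags & IFF_RUNNING)
```

For example, link_is_running(interface_flags("eth0")) gives the same information as the "RUNNING" word in ifconfig output; the interface name eth0 is an assumption.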
Comment 15 Jordan Ritter 2003-11-08 11:15:08 UTC
ahh shit, sorry, I got confused by bugzilla.  apologies.
Comment 16 Jason Cox (RETIRED) gentoo-dev 2004-04-08 18:36:15 UTC
Is this still valid? Is the error still there? If not, I'm marking this closed.
Comment 17 seatec-astronomy 2004-04-09 01:05:00 UTC
I cannot access the machine until September, and I'm still not running a Gentoo kernel on it. So honestly, I do not know whether it's fixed or not. Apparently I was the only one reporting this issue? I'd close this case if I were you.
Comment 18 Jason Cox (RETIRED) gentoo-dev 2004-04-09 01:14:55 UTC
Closing. Marking WORKSFORME, since this is an isolated incident that no one else could reproduce.
Comment 19 Jason Cox (RETIRED) gentoo-dev 2004-04-09 01:15:13 UTC
Closed.