Bug 297314 - 'hw csum failure' errors on Sun quad port HME using VLAN with sys-kernel/gentoo-sources-2.6.31-r6
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Reported: 2009-12-17 19:10 UTC by Tomasz Orzechowski
Modified: 2010-02-19 17:25 UTC


Description Tomasz Orzechowski 2009-12-17 19:10:58 UTC
I am getting very frequent 'hw csum failure' messages (430k in 10 days, so about one every 2 seconds on average) on a VLAN configured on one Ethernet port of a Sun quad-port HME card. Each 'hw csum failure' message is followed by a line such as 'Pid: 0, comm: swapper Not tainted 2.6.31-gentoo-r6', in which the pid/comm varies with no apparent pattern, and then by a stack trace, included below.

There are no errors reported on the underlying interface, and before the VLAN was created the same physical link on this port showed no problems. The link is not heavily used, peaking at about 10 Mbit/s, or roughly 10% of capacity. Traffic is flowing, with no apparent packet loss at the IP level. Neither rebooting the machine nor removing and reinserting the cable resolves the issue.

Call Trace:                                                     
 [<c11fea91>] netdev_rx_csum_fault+0x31/0x40                    
 [<c11f9e49>] __skb_checksum_complete_head+0x59/0x60            
 [<c11f9e5b>] __skb_checksum_complete+0xb/0x10                  
 [<c12745a4>] nf_ip_checksum+0xa4/0x110                         
 [<c1274500>] ? nf_ip_checksum+0x0/0x110                        
 [<f8265ddb>] tcp_error+0xcb/0x240 [nf_conntrack]               
 [<c10076d9>] ? nommu_map_page+0x39/0x70                        
 [<f8265d10>] ? tcp_error+0x0/0x240 [nf_conntrack]              
 [<f826266f>] nf_conntrack_in+0xdf/0x4b0 [nf_conntrack]         
 [<c10076a0>] ? nommu_map_page+0x0/0x70                         
 [<c11fef31>] ? dev_hard_start_xmit+0x241/0x380                 
 [<c12150ed>] ? __qdisc_run+0x12d/0x1b0                         
 [<f828d2e0>] ? ipv4_conntrack_in+0x0/0x20 [nf_conntrack_ipv4]  
 [<f828d2fa>] ipv4_conntrack_in+0x1a/0x20 [nf_conntrack_ipv4]   
 [<c1232b87>] nf_iterate+0x57/0x80                              
 [<c123a690>] ? ip_rcv_finish+0x0/0x2c0                         
 [<c1232dcd>] nf_hook_slow+0x4d/0xc0                            
 [<c123a690>] ? ip_rcv_finish+0x0/0x2c0                         
 [<c123ade6>] ip_rcv+0x1f6/0x280                                
 [<c123a690>] ? ip_rcv_finish+0x0/0x2c0                         
 [<c123abf0>] ? ip_rcv+0x0/0x280                                
 [<c11fe2a2>] netif_receive_skb+0x2a2/0x520                     
 [<c1200cf9>] process_backlog+0x69/0x90                         
 [<c12011e7>] net_rx_action+0x97/0x110                          
 [<c1022b83>] __do_softirq+0x73/0x100                           
 [<c103f695>] ? handle_IRQ_event+0x35/0xc0                      
 [<c1022c3a>] do_softirq+0x2a/0x30                              
 [<c1022eca>] irq_exit+0x2a/0x40                                
 [<c1004922>] do_IRQ+0x42/0x90                                  
 [<c11f6edc>] ? __kfree_skb+0x3c/0x90                           
 [<c1003189>] ? common_interrupt+0x29/0x30                      
 [<c1003189>] common_interrupt+0x29/0x30                        
 [<c1040000>] ? synchronize_irq+0x80/0xc0                       
 [<c107fbc1>] ? __mnt_is_readonly+0x1/0x20                      
 [<c107fc3b>] ? mnt_clone_write+0xb/0x20                        
 [<c107fc8e>] mnt_want_write_file+0x3e/0x50                     
 [<c107e328>] file_update_time+0x38/0xd0                        
 [<c10499fa>] __generic_file_aio_write_nolock+0x20a/0x4e0       
 [<c1128507>] ? do_con_write+0x367/0x1aa0                       
 [<c103277a>] ? atomic_notifier_call_chain+0x1a/0x20            
 [<c1126042>] ? notify_update+0x22/0x30                         
 [<c1049f74>] generic_file_aio_write+0x54/0xc0                  
 [<c10b5abd>] ext3_file_write+0x2d/0xc0                         
 [<c106cc5c>] do_sync_write+0xcc/0x110                          
 [<c102ef30>] ? autoremove_wake_function+0x0/0x50               
 [<c111d7f8>] ? tty_ldisc_deref+0x8/0x10                        
 [<c11182d9>] ? tty_write+0x1a9/0x1d0                           
 [<c106d3b9>] vfs_write+0x99/0x150                              
 [<c1101255>] ? copy_to_user+0x35/0x50                          
 [<c106cb90>] ? do_sync_write+0x0/0x110                         
 [<c106d92d>] sys_write+0x3d/0x70                               
 [<c1002b48>] sysenter_do_call+0x12/0x26                        
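
When reading a trace like the one above, it can help to separate the frames the kernel reports as part of the call chain from those prefixed with '?', which are addresses found on the stack that the unwinder could not verify. The following is a minimal Python sketch, not part of the original report; the function names and the SAMPLE excerpt are illustrative, and the regular expression only assumes the line layout visible in this trace.

```python
import re

# Matches frames in the layout used by the trace in this report, e.g.
#   [<f8265ddb>] tcp_error+0xcb/0x240 [nf_conntrack]
#   [<c1274500>] ? nf_ip_checksum+0x0/0x110
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)

# A few lines copied from the trace above, used only as sample input.
SAMPLE = """\
 [<c11fea91>] netdev_rx_csum_fault+0x31/0x40
 [<c11f9e49>] __skb_checksum_complete_head+0x59/0x60
 [<c1274500>] ? nf_ip_checksum+0x0/0x110
 [<f8265ddb>] tcp_error+0xcb/0x240 [nf_conntrack]
"""


def parse_frames(trace_text):
    """Return one dict per stack frame found in trace_text."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a frame line (e.g. the "Call Trace:" header)
        frames.append({
            "addr": int(match.group("addr"), 16),
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "size": int(match.group("size"), 16),
            "module": match.group("module"),  # None for built-in code
            "reliable": match.group("unreliable") is None,
        })
    return frames


def reliable_symbols(trace_text):
    """Symbols of frames without the '?' marker, in trace order."""
    return [f["symbol"] for f in parse_frames(trace_text) if f["reliable"]]


if __name__ == "__main__":
    for name in reliable_symbols(SAMPLE):
        print(name)
```

Applied to the full trace, the unmarked frames give the receive path from ip_rcv through nf_conntrack_in and tcp_error to netdev_rx_csum_fault, which is where the message is printed.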

Reproducible: Always

Steps to Reproduce:
1. Configure a VLAN on a port of the Sun HME card
2. Generate traffic over the VLAN
3. Watch dmesg
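
The steps above could be sketched roughly as follows. The report does not say how the VLAN was created, so this sketch assumes iproute2 and the 8021q module; the interface name eth1, VLAN ID 100 and address 192.0.2.1/24 are placeholders, not values from the report. The script only prints the commands and does not run them, since they need root and real hardware.

```python
import shlex


def vlan_setup_commands(parent="eth1", vlan_id=100, address="192.0.2.1/24"):
    """Build the commands for step 1 (all argument values are placeholders)."""
    vlan_dev = f"{parent}.{vlan_id}"
    return [
        # Load 802.1Q VLAN support.
        ["modprobe", "8021q"],
        # Create the VLAN interface on top of the HME port.
        ["ip", "link", "add", "link", parent, "name", vlan_dev,
         "type", "vlan", "id", str(vlan_id)],
        # Give it an address and bring it up.
        ["ip", "addr", "add", address, "dev", vlan_dev],
        ["ip", "link", "set", "dev", vlan_dev, "up"],
    ]


def watch_command():
    """Step 3: dump the kernel ring buffer, to be searched for the message."""
    return ["dmesg"]


if __name__ == "__main__":
    for argv in vlan_setup_commands():
        print(shlex.join(argv))
    # Step 2 (generating traffic over the VLAN) is left to the reader.
    print(shlex.join(watch_command()) + " | grep 'hw csum failure'")
```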

Actual Results:  
'hw csum failure' errors with stack traces in dmesg/log

Expected Results:  
Nothing.
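
The rate quoted in the description (430k messages in 10 days, about one every 2 seconds) can be checked, and recomputed for another log, with a small Python sketch like the one below. The helper names are illustrative, and the only assumption about the log format is that each occurrence contains the text 'hw csum failure' on its own line.

```python
def count_csum_failures(log_text, needle="hw csum failure"):
    """Count log lines that contain the checksum failure message."""
    return sum(1 for line in log_text.splitlines() if needle in line)


def average_interval_seconds(message_count, days):
    """Average number of seconds between messages over the given period."""
    if message_count <= 0:
        raise ValueError("message_count must be positive")
    return days * 86400 / message_count


if __name__ == "__main__":
    # Figures from the description: 430k messages in 10 days.
    print(round(average_interval_seconds(430_000, 10), 2))  # prints 2.01
```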
Comment 1 George Kadianakis (RETIRED) gentoo-dev 2009-12-30 15:30:12 UTC
Greetings,

could you try reporting this issue upstream (http://bugzilla.kernel.org/)?
If you do so, could you paste us the link to the upstream bug report here?
Comment 2 Tomasz Orzechowski 2010-01-04 13:47:14 UTC
This does indeed seem to be an upstream bug affecting Sun HME cards and VLANs in general.

http://bugzilla.kernel.org/show_bug.cgi?id=9270

I hope I am marking the change of status correctly.
Comment 3 Mike Pagano gentoo-dev 2010-01-09 18:47:57 UTC
Can you attach the full dmesg with the error?

What was the last working kernel? Please test with 2.6.32 and, if that still fails, with git 2.6.33_rcX.