Bug 432568 - Detected Hardware Unit Hang
Summary: Detected Hardware Unit Hang
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo Linux bug wranglers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-08-24 09:51 UTC by M. Prášek
Modified: 2012-08-24 13:42 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Description M. Prášek 2012-08-24 09:51:39 UTC
Kernel 3.5.2-gentoo

One of the two Intel cards in the machine:

[ 8641.708407] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
[ 8641.708407]   TDH                  <3ac>
[ 8641.708407]   TDT                  <3f5>
[ 8641.708407]   next_to_use          <3f5>
[ 8641.708407]   next_to_clean        <3ab>
[ 8641.708407] buffer_info[next_to_clean]:
[ 8641.708407]   time_stamp           <1007f377e>
[ 8641.708407]   next_to_watch        <3ae>
[ 8641.708407]   jiffies              <1007f48cc>
[ 8641.708407]   next_to_watch.status <0>
[ 8641.708407] MAC Status             <80383>
[ 8641.708407] PHY Status             <792d>
[ 8641.708407] PHY 1000BASE-T Status  <3800>
[ 8641.708407] PHY Extended Status    <3000>
[ 8641.708407] PCI Status             <10>
[ 8642.720015] ------------[ cut here ]------------
[ 8642.720024] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x260/0x270()
[ 8642.720026] Hardware name: G33-DS3R
[ 8642.720030] NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
[ 8642.720032] Modules linked in: xt_addrtype xt_hashlimit xt_recent nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat_irc nf_conntrack_irc ipv6 dm_mod
[ 8642.720050] Pid: 0, comm: swapper/0 Not tainted 3.5.2-gentooK4-NO_preemption,1000HZ #3
[ 8642.720052] Call Trace:
[ 8642.720055]  <IRQ>  [<ffffffff8103468a>] warn_slowpath_common+0x7a/0xb0
[ 8642.720065]  [<ffffffff81034761>] warn_slowpath_fmt+0x41/0x50
[ 8642.720069]  [<ffffffff815dd8a0>] dev_watchdog+0x260/0x270
[ 8642.720074]  [<ffffffff810425c9>] run_timer_softirq+0x199/0x2d0
[ 8642.720078]  [<ffffffff8104251f>] ? run_timer_softirq+0xef/0x2d0
[ 8642.720082]  [<ffffffff815dd640>] ? __netdev_watchdog_up+0x80/0x80
[ 8642.720086]  [<ffffffff8103c069>] __do_softirq+0xa9/0x150
[ 8642.720093]  [<ffffffff81700e4c>] call_softirq+0x1c/0x30
[ 8642.720097]  [<ffffffff81003c55>] do_softirq+0x75/0xb0
[ 8642.720101]  [<ffffffff8103bd27>] irq_exit+0x97/0xd0
[ 8642.720105]  [<ffffffff8101fd89>] smp_apic_timer_interrupt+0x69/0xa0
[ 8642.720109]  [<ffffffff817005dc>] apic_timer_interrupt+0x6c/0x80
[ 8642.720111]  <EOI>  [<ffffffff8100a17b>] ? mwait_idle+0x6b/0x90
[ 8642.720119]  [<ffffffff8100a172>] ? mwait_idle+0x62/0x90
[ 8642.720123]  [<ffffffff8100a4d8>] cpu_idle+0x88/0xd0
[ 8642.720127]  [<ffffffff816ce2ed>] rest_init+0xad/0xc0
[ 8642.720131]  [<ffffffff816ce240>] ? csum_partial_copy_generic+0x170/0x170
[ 8642.720136]  [<ffffffff81aa0c6c>] start_kernel+0x328/0x335
[ 8642.720140]  [<ffffffff81aa0772>] ? kernel_init+0x19d/0x19d
[ 8642.720144]  [<ffffffff81aa032d>] x86_64_start_reservations+0x131/0x136
[ 8642.720148]  [<ffffffff81aa041f>] x86_64_start_kernel+0xed/0xf4
[ 8642.720150] ---[ end trace 474f23482ccf5ee7 ]---
[ 8642.720279] e1000e 0000:03:00.0: eth1: Reset adapter
[ 8645.948025] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 8754.704395] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
[ 8754.704395]   TDH                  <8e4>
[ 8754.704395]   TDT                  <926>
[ 8754.704395]   next_to_use          <926>
[ 8754.704395]   next_to_clean        <8e3>
[ 8754.704395] buffer_info[next_to_clean]:
[ 8754.704395]   time_stamp           <10080eef2>
[ 8754.704395]   next_to_watch        <8e6>
[ 8754.704395]   jiffies              <100810230>
[ 8754.704395]   next_to_watch.status <0>
[ 8754.704395] MAC Status             <80383>
[ 8754.704395] PHY Status             <792d>
[ 8754.704395] PHY 1000BASE-T Status  <3800>
[ 8754.704395] PHY Extended Status    <3000>
[ 8754.704395] PCI Status             <10>
[ 8756.704089] e1000e 0000:03:00.0: eth1: Reset adapter
[ 8759.893018] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 8781.708405] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
[ 8781.708405]   TDH                  <3f4>
[ 8781.708405]   TDT                  <423>
[ 8781.708405]   next_to_use          <423>
[ 8781.708405]   next_to_clean        <3f3>
[ 8781.708405] buffer_info[next_to_clean]:
[ 8781.708405]   time_stamp           <1008158f1>
[ 8781.708405]   next_to_watch        <3f7>
[ 8781.708405]   jiffies              <100816bac>
[ 8781.708405]   next_to_watch.status <0>
[ 8781.708405] MAC Status             <80383>
[ 8781.708405] PHY Status             <792d>
[ 8781.708405] PHY 1000BASE-T Status  <3800>
[ 8781.708405] PHY Extended Status    <3000>
[ 8781.708405] PCI Status             <10>
[ 8784.720031] e1000e 0000:03:00.0: eth1: Reset adapter
[ 8787.951018] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

There are two cards, both in PCIe x1 slots. Both cards experienced sudden 2-3 s traffic drops at rates around 200 Mbit/s, but only eth1 adds this error to dmesg.

This is critical because it is the main router in our WISP company.
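
The register dump in the log can be read more easily once the hex values are decoded. The following is a minimal sketch, not part of the original report: it parses the first hang dump and derives two numbers. TDH and TDT are the transmit descriptor head and tail registers, so their difference is the number of descriptors queued to the hardware but not yet processed. The stall time assumes that time_stamp holds the jiffies value recorded when the buffer was queued, and that HZ is 1000 (taken from the "1000HZ" tag in the kernel version string, which is only a hint). The function names parse_dump and summarize are illustrative.

```python
import re

# First hang dump from the report, with the dmesg timestamp prefix removed.
DUMP = """\
TDH                  <3ac>
TDT                  <3f5>
next_to_use          <3f5>
next_to_clean        <3ab>
time_stamp           <1007f377e>
next_to_watch        <3ae>
jiffies              <1007f48cc>
"""

# Assumption: the kernel was built with CONFIG_HZ=1000, as the
# "1000HZ" tag in the reported version string suggests.
HZ = 1000


def parse_dump(text):
    """Return {field name: integer value} for every 'name <hex>' pair."""
    pairs = re.findall(r"(\w+)\s+<([0-9a-fA-F]+)>", text)
    return {name: int(value, 16) for name, value in pairs}


def summarize(fields, hz=HZ):
    """Return (pending descriptors, seconds the oldest buffer has waited).

    Ring wrap-around is not handled: this assumes TDT >= TDH,
    which holds for all three dumps in the report.
    """
    pending = fields["TDT"] - fields["TDH"]
    stalled = (fields["jiffies"] - fields["time_stamp"]) / hz
    return pending, stalled


if __name__ == "__main__":
    pending, stalled = summarize(parse_dump(DUMP))
    print(f"descriptors pending in hardware: {pending}")
    print(f"oldest buffer waiting for: {stalled:.2f} s")
```

Under those assumptions, the first dump shows 73 descriptors outstanding and a buffer that has waited about 4.4 s, which is consistent with the 2-3 s traffic drops being followed by a watchdog reset.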
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2012-08-24 13:42:22 UTC
1) Does this happen frequently with the 3.5.2 kernel?
2) Does it never happen with an older kernel?
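
One way to put numbers on question 1 is to count the hang messages per interface and look at the gaps between them. The sketch below is an illustration only: the pattern is matched against the message format shown in this report, SAMPLE is copied from the log above, and the names hang_times and gaps are hypothetical helpers.

```python
import re

# Matches lines such as:
# [ 8641.708407] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
HANG_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+e1000e \S+ (\w+): Detected Hardware Unit Hang",
    re.M,
)

# Lines copied from the log in the description.
SAMPLE = """\
[ 8641.708407] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
[ 8642.720279] e1000e 0000:03:00.0: eth1: Reset adapter
[ 8754.704395] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
[ 8781.708405] e1000e 0000:03:00.0: eth1: Detected Hardware Unit Hang:
"""


def hang_times(dmesg_text):
    """Return {interface: [seconds since boot, ...]} for each hang line."""
    times = {}
    for stamp, iface in HANG_RE.findall(dmesg_text):
        times.setdefault(iface, []).append(float(stamp))
    return times


def gaps(stamps):
    """Return the seconds elapsed between consecutive hang events."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for iface, stamps in hang_times(SAMPLE).items():
        spacing = ", ".join(f"{g:.1f} s" for g in gaps(stamps))
        print(f"{iface}: {len(stamps)} hangs, gaps: {spacing}")
```

Running the same functions over saved dmesg output from both the 3.5.2 kernel and an older kernel would give a direct comparison for question 2.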