174732 – SMP irq affinity strangeness

Status:	RESOLVED INVALID

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	[OLD] Core system
Hardware:	AMD64 Linux

Importance:	High trivial
Assignee:	Gentoo Kernel Bug Wranglers and Kernel Maintainers

Reported:	2007-04-16 01:06 UTC by Martin Práek
Modified:	2007-05-01 17:00 UTC
CC List:	0 users

Description Martin Práek 2007-04-16 01:06:07 UTC

I found some strange IRQ affinity behavior.

On my AMD dual-core system, almost all IRQs are handled by only one CPU (CPU1). Kernel 2.6.19-gentoo-r6.
The output of cat /proc/interrupts looks like this:


           CPU0       CPU1       
  0:       1670    1674264   IO-APIC-edge      timer
  1:          7       5944   IO-APIC-edge      i8042
  8:          0          1   IO-APIC-edge      rtc
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          5        115   IO-APIC-edge      i8042
 14:          0         60   IO-APIC-edge      ide0
 20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 21:          0          0   IO-APIC-fasteoi   libata
 22:          0          0   IO-APIC-fasteoi   libata
 23:        103      47646   IO-APIC-fasteoi   libata, ohci_hcd:usb2
504:          0          1   PCI-MSI-edge      eth1
505:         16       8646   PCI-MSI-edge      eth0
NMI:        147         97 
LOC:    1675620    1675573 
ERR:          0






It seems that nearly all interrupts go to CPU1, except for a small minority that end up on CPU0 (a few per 1000). I would expect the interrupt load to be distributed round-robin across both CPUs, so the counters should differ only slightly, not by orders of magnitude. I am not running irqbalance.
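To put a number on the imbalance described above, here is a minimal Python sketch that parses /proc/interrupts-style text and computes each CPU's share of an IRQ's count. The function names (`parse_interrupts`, `cpu_share`) and the `SAMPLE` string are illustrative only; the sample rows are copied from the output above.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        # Keep only the leading numeric columns (rows like ERR have fewer).
        counts = [int(f) for f in rest.split()[:ncpus] if f.isdigit()]
        table[label.strip()] = counts
    return table


def cpu_share(counts):
    """Return each CPU's fraction of the total count (zeros if no interrupts)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# A few rows copied from the report above.
SAMPLE = """\
           CPU0       CPU1
  0:       1670    1674264   IO-APIC-edge      timer
  1:          7       5944   IO-APIC-edge      i8042
505:         16       8646   PCI-MSI-edge      eth0
ERR:          0
"""

if __name__ == "__main__":
    for irq, counts in parse_interrupts(SAMPLE).items():
        print(irq, counts, ["%.4f" % s for s in cpu_share(counts)])
```

On a live system the same functions could be fed the contents of /proc/interrupts instead of `SAMPLE`.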


I can change an IRQ's affinity as usual by writing an affinity mask to /proc/irq/xxx/smp_affinity: setting it to "1" sends the IRQ to CPU0, and setting it to "2" sends it to CPU1. But setting it to "03" (i.e. allowing the IRQ on both CPU0 and CPU1) does not change where the IRQ is delivered; it stays on the last CPU used.
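For reference, the affinity mask is a hexadecimal bitmask in which bit N stands for CPU N. The following Python sketch decodes and builds such masks; the helper names (`cpus_in_mask`, `mask_for_cpus`) are made up for illustration. Note that the mask only lists which CPUs are permitted to receive the IRQ.

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in a hex affinity mask."""
    # Wide masks may be written as comma-separated 32-bit groups.
    value = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def mask_for_cpus(cpus):
    """Build the hex mask string that allows exactly the given CPUs."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    return format(value, "x")


if __name__ == "__main__":
    for mask in ("1", "2", "03"):
        print(mask, "->", cpus_in_mask(mask))
    print(mask_for_cpus([0, 1]))
```

Writing a mask to /proc/irq/xxx/smp_affinity requires root and is deliberately left out of this sketch.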


This behavior is not consistent with the kernel's Documentation/IRQ-affinity.txt.

I tried compiling different Gentoo kernel versions from 2.6.17 to 2.6.20, with no change. I also tried compiling with cpusets enabled and disabled, various combinations of schedulers, etc., with no difference.

However, with CONFIG_HOTPLUG_CPU (support for CPU hotplug) enabled, the result is slightly different: all interrupts except LOC end up on CPU0.

           CPU0       CPU1       
  0:    9633060          0   IO-APIC-edge      timer
  1:      52979          0   IO-APIC-edge      i8042
  8:          1          0   IO-APIC-edge      rtc
  9:          0          0   IO-APIC-fasteoi   acpi
 12:        992          0   IO-APIC-edge      i8042
 14:         60          0   IO-APIC-edge      ide0
 20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 21:          0          0   IO-APIC-fasteoi   libata
 22:     885865          0   IO-APIC-fasteoi   libata, eth1
 23:      37353          0   IO-APIC-fasteoi   libata, ohci_hcd:usb2
505:      15357          0   PCI-MSI-edge      eth0
NMI:        700        277 
LOC:    9702632    9517839 
ERR:          0



My hardware: Asus M2N-E motherboard, BIOS 0802 (the latest available), AMD dual-core X2 5200+, e1000 NIC, SATA disks, and an old PCI video card.

Reproducible: Always

Steps to Reproduce:
1. Look at /proc/interrupts.
2. Generate some IRQ traffic (ping the NIC, copy some files on disk, etc.).
3. Observe which CPU's (core's) IRQ counters increase in /proc/interrupts.
4. Try a kernel with CONFIG_HOTPLUG_CPU and observe the difference.
5. The actual behavior is not consistent with the kernel's Documentation/IRQ-affinity.txt.
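The observation in steps 1 to 3 can be scripted by taking two snapshots of /proc/interrupts and subtracting the counters. This is only a sketch: the names `read_counts` and `counter_delta` are hypothetical, the `BEFORE` values are taken from this report, and the `AFTER` values are invented purely to illustrate the calculation.

```python
def read_counts(text):
    """Return {irq_label: [per-CPU counts]} from /proc/interrupts-style text."""
    rows = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(rows[0].split())  # header row lists CPU0, CPU1, ...
    counts = {}
    for row in rows[1:]:
        label, _, rest = row.partition(":")  # split at the first colon only
        counts[label.strip()] = [int(f) for f in rest.split()[:ncpus] if f.isdigit()]
    return counts


def counter_delta(before, after):
    """Per-CPU increase of each IRQ counter between two snapshots."""
    return {irq: [a - b for b, a in zip(before[irq], after[irq])]
            for irq in after if irq in before}


# BEFORE rows come from this report; AFTER values are made up for illustration.
BEFORE = """\
           CPU0       CPU1
 23:        103      47646   IO-APIC-fasteoi   libata, ohci_hcd:usb2
505:         16       8646   PCI-MSI-edge      eth0
"""
AFTER = """\
           CPU0       CPU1
 23:        103      47700   IO-APIC-fasteoi   libata, ohci_hcd:usb2
505:         16       8746   PCI-MSI-edge      eth0
"""

if __name__ == "__main__":
    print(counter_delta(read_counts(BEFORE), read_counts(AFTER)))
```

On a real machine, the two snapshots would be read from /proc/interrupts a few seconds apart while the traffic from step 2 is running.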
Actual Results:  
The output of cat /proc/interrupts looks like this:


           CPU0       CPU1       
  0:       1670    1674264   IO-APIC-edge      timer
  1:          7       5944   IO-APIC-edge      i8042
  8:          0          1   IO-APIC-edge      rtc
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          5        115   IO-APIC-edge      i8042
 14:          0         60   IO-APIC-edge      ide0
 20:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 21:          0          0   IO-APIC-fasteoi   libata
 22:          0          0   IO-APIC-fasteoi   libata
 23:        103      47646   IO-APIC-fasteoi   libata, ohci_hcd:usb2
504:          0          1   PCI-MSI-edge      eth1
505:         16       8646   PCI-MSI-edge      eth0
NMI:        147         97 
LOC:    1675620    1675573 
ERR:          0



Expected Results:  
IRQs are routed round-robin to all CPUs, as described in the kernel's Documentation/IRQ-affinity.txt.

Comment 1 Daniel Drake (RETIRED) gentoo-dev 2007-04-16 11:41:17 UTC

I think this is normal behaviour for a dual core processor. The documentation you are reading refers to true SMP systems with separate processors.

Comment 2 Kevin Bowling 2007-04-27 02:12:48 UTC

Try 'emerge irqbalance'. It should be a base dependency IMHO.

@dsd
Dual core is pretty much dual CPU in this regard. The only real difference between dual core and dual CPU is I/O bandwidth, of which AMD has plenty.

Comment 3 Daniel Drake (RETIRED) gentoo-dev 2007-05-01 17:00:46 UTC

There was once an LKML discussion about this, and if I remember correctly, the default behaviour of most/all interrupts going to one core is perfectly normal for dual core setups, and actually balancing them between the two cores is mostly/entirely redundant here (unless you're talking dual physical processors). However, I may be misremembering and I can't find this discussion now...

Regardless of that, I strongly doubt you have enough interrupt sources to warrant balancing -- at least your /proc/interrupts isn't particularly large. If you do want balancing, the kernel has a basic mechanism on some archs (CONFIG_IRQBALANCE), or there is the more complete userspace alternative (irqbalance). If neither actually changes much, then it simply means you aren't getting enough interrupts for them to have to be 'distributed' further than usual.

Also, I don't see where IRQ-affinity.txt says that interrupts are handled round-robin style by the individual CPUs.