Bug 254682 - [2.6.26 regression] preprocessor slowdown
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: http://bugzilla.kernel.org/show_bug.c...
Whiteboard: watch-linux-bugzilla
Reported: 2009-01-12 20:53 UTC by Steven Patrick
Modified: 2009-01-29 16:49 UTC (History)
Attachments
distccmon-gnome snapshot from 2.6.25.20 (fast.png,26.07 KB, image/png)
2009-01-12 20:55 UTC, Steven Patrick
distccmon-gnome snapshot from 2.6.26 (slow.png,27.20 KB, image/png)
2009-01-12 20:55 UTC, Steven Patrick
output from git bisect log (git_bisect.log,1.69 KB, text/plain)
2009-01-22 23:02 UTC, Steven Patrick
kernel config file (kernel-config-x86_64-2.6.26-rc3.gz,17.95 KB, application/octet-stream)
2009-01-22 23:50 UTC, Steven Patrick

Description Steven Patrick 2009-01-12 20:53:40 UTC
Greetings.

  I apologize if this is not the right place to report this
problem, but I don't know of a better one yet.  Please direct me
to a better forum if necessary.

  Here's my situation.  I've recently upgraded the kernels
on ~30 computers at work (from 2.6.21 to 2.6.27).  These 
computers are used to build and test software we develop.  
We speed up the building process using distcc.  However,
after the kernel upgrade, the builds are much slower.
The preprocessing stage seems to be at least 10 times
slower.
  As evidence of this slowdown I am attaching two images created
using distccmon-gnome.  Both snapshots were taken shortly 
after starting builds in a clean sandbox.  The only difference
is the kernel.  "fast.png" was generated while running
kernel 2.6.25.20.  "slow.png" was generated with 2.6.26.
The light purple sections indicate the preprocessing times
for each file.
  This slowdown is observed on both 32-bit and 64-bit computers
and using either gcc or the Intel compiler. (The Intel compiler
builds do not use distcc, but they are also slower.)  Strangely
enough, it's still faster to use an NFS-mounted sandbox on a
machine with an older kernel than the same sandbox on the local
machine with a newer kernel.  (This suggests to me that it
is neither a disk nor a network I/O problem.)
  As you might guess, I've already narrowed it down to
something between 2.6.25 and 2.6.26.  All the kernels I built
used the default configuration as supplied by genkernel.
I also see the same issue with a 2.6.28 kernel configured without 
using genkernel.  I also see it in both gentoo-sources and
vanilla-sources.
  Can I get the old (better) performance in a newer kernel?
  Are the rest of my file accesses also much slower?
  I've tried all kinds of internet searches to find an
explanation for this slowdown.  If there is one, I'm
clearly not using the right search terms.
  I will happily supply any more information anyone might
request and/or test new/different kernels.

Thanks,
Steven
Comment 1 Steven Patrick 2009-01-12 20:55:09 UTC
Created attachment 178219
distccmon-gnome snapshot from 2.6.25.20
Comment 2 Steven Patrick 2009-01-12 20:55:48 UTC
Created attachment 178221
distccmon-gnome snapshot from 2.6.26
Comment 3 Daniel Drake (RETIRED) gentoo-dev 2009-01-16 23:48:48 UTC
Thanks for your investigation so far!

It is really hard to guess at the cause of this kind of bug. If you have the time and patience, it would be great if you could run the following process:
http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

Use v2.6.25 as good and v2.6.26 as bad. It will require you to test about 14 kernels, but assuming a smooth bisection it should pinpoint the exact commit which introduced the bug.
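
As a rough illustration of where a figure like "about 14 kernels" comes from: bisection is a binary search over the commits between the good and bad tags, so the number of builds grows with the base-2 logarithm of the commit count. The sketch below assumes roughly 10,000 commits between the two tags; that number is an illustrative assumption rather than a measured count, and git bisect prints its own estimate as it runs.

```python
def bisect_steps(commit_count: int) -> int:
    """Worst-case number of test builds a binary search needs to
    narrow `commit_count` candidate commits down to a single one."""
    if commit_count < 1:
        raise ValueError("commit_count must be at least 1")
    steps = 0
    remaining = commit_count
    while remaining > 1:
        # Each test build discards about half of the candidates;
        # keep the larger half to model the worst case.
        remaining = (remaining + 1) // 2
        steps += 1
    return steps


if __name__ == "__main__":
    # 10_000 is an assumed commit count, used only for illustration.
    print(bisect_steps(10_000))
```

Skipped or unbuildable commits add extra builds on top of this estimate.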
Comment 4 Steven Patrick 2009-01-22 23:02:00 UTC
Created attachment 179393
output from git bisect log
Comment 5 Steven Patrick 2009-01-22 23:15:31 UTC
  Well, the original bisect between v2.6.25 and v2.6.26 didn't
go too well.  So, I manually narrowed it down to between
v2.6.26-rc3 and v2.6.26-rc4.  Using those two values the bisection
went smoothly, yielding the newly attached log.
  I'm just confused by two things.  First, the message connected to
the identified commit only mentions NFS updates.  I'm not really
using NFS when experiencing the original problem.  (I do use amd to
automount home directories, etc.)  Second, the diffs for the
identified commit appear to include more than NFS-related files.
  What should I do now?
Comment 6 Steven Patrick 2009-01-22 23:50:10 UTC
Created attachment 179395
kernel config file

  The kernel config file I used for both the faster kernel
and the last few iterations of the slower kernel.
Comment 7 Steven Patrick 2009-01-28 19:49:34 UTC
I've submitted this to bugzilla.kernel.org.
It's bug 12564 there.
Comment 8 Steven Patrick 2009-01-29 16:49:13 UTC
  I've found the root cause of the problem.  The kernel
patch identified is the one where the kernel started
honoring the "noac" NFS option.  And it turns out that amd
was using this option when doing its top level mounts.
By telling amd to use a non-zero "auto_attrcache" value, 
all of my performance issues went away.
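
For anyone checking whether a machine is affected in the same way, here is a minimal sketch that scans a mount table for NFS-type mounts carrying the "noac" option. It assumes the text is in the /proc/mounts layout (device, mount point, filesystem type, comma-separated options) and that the option shows up there literally as "noac". The function name and the sample lines are hypothetical, and how amd's own top-level mounts appear in the table may differ between versions.

```python
NFS_TYPES = {"nfs", "nfs4"}


def find_noac_mounts(mounts_text):
    """Return (mount_point, device) pairs for NFS-type mounts whose
    option list contains "noac" (attribute caching disabled)."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        if fs_type in NFS_TYPES and "noac" in options.split(","):
            hits.append((mount_point, device))
    return hits


if __name__ == "__main__":
    # Hypothetical sample input; on a real system, pass the contents
    # of /proc/mounts instead.
    sample = (
        "/dev/sda1 / ext3 rw,noatime 0 0\n"
        "server:/export /home nfs rw,noac,vers=3 0 0\n"
        "server:/data /data nfs rw,vers=3 0 0\n"
    )
    for mount_point, device in find_noac_mounts(sample):
        print(f"{mount_point} ({device}) is mounted with noac")
```

Any mount it reports is a candidate for the same fix, namely re-enabling attribute caching for that mount.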