Greetings. I apologize if this is not the right place to report this problem, but I don't know of a better one yet. Please direct me to a better forum if necessary.

Here's my situation. I recently upgraded the kernels on ~30 computers at work (from 2.6.21 to 2.6.27). These computers are used to build and test the software we develop, and we speed up the build process using distcc. However, after the kernel upgrade, the builds are much slower. The preprocessing stage seems to be at least 10 times slower.

As evidence of this slowdown I am attaching two images created using distccmon-gnome. Both snapshots were taken shortly after starting builds in a clean sandbox. The only difference is the kernel: "fast.png" was generated while running kernel 2.6.25.20, and "slow.png" was generated with 2.6.26. The light purple sections indicate the preprocessing times for each file.

This slowdown is observed on both 32- and 64-bit computers and with either gcc or the Intel compiler. (The Intel compiler builds do not use distcc, but they are also slower.) Strangely enough, it's still faster to use an NFS-mounted sandbox on a machine with an older kernel than the same sandbox on the local machine with a newer kernel. (This suggests to me that it is neither a disk nor a network I/O problem.)

As you might guess, I've already narrowed it down to something between 2.6.25 and 2.6.26. All the kernels I built used the default configuration as supplied by genkernel. I also see the same issue with a 2.6.28 kernel configured without using genkernel, and in both gentoo-sources and vanilla-sources.

Can I get the old (better) performance in a newer kernel? Are the rest of my file accesses also much slower? I've tried all kinds of internet searches to find an explanation for this slowdown. If there is one, I'm clearly not using the right search terms.

I will happily supply any more information anyone might request and/or test new or different kernels.

Thanks,
Steven
Created attachment 178219 [details] distccmon-gnome snapshot from 2.6.25.20
Created attachment 178221 [details] distccmon-gnome snapshot from 2.6.26
Thanks for your investigation so far! It is really hard to guess at the cause of this kind of bug. If you have the time and patience, it would be great if you could run the following process:

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

Use v2.6.25 as good and v2.6.26 as bad. It will require you to test about 14 kernels, but assuming a smooth bisection it should pinpoint the exact commit which introduced the bug.
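For reference, the bisection described above can be sketched roughly as follows. This is only an outline under stated assumptions: the helper names (bisect_start, bisect_mark, bisect_finish) and the ./linux path are hypothetical, and the build, boot and timing steps between markings still have to be done by hand for each kernel.

```shell
# Hypothetical helpers wrapping git bisect. "$1" is always the path to a
# clone of the kernel git tree (an assumption, not part of this report).

# bisect_start <repo> <good-tag> <bad-tag>
bisect_start() {
    # git bisect start takes the bad revision first, then the good one.
    git -C "$1" bisect start "$3" "$2"
}

# bisect_mark <repo> good|bad
# Call after building, booting and timing the kernel git checked out.
bisect_mark() {
    git -C "$1" bisect "$2"
}

# bisect_finish <repo> <logfile>
# Save the log for attaching to the bug, then restore the original checkout.
bisect_finish() {
    git -C "$1" bisect log > "$2"
    git -C "$1" bisect reset
}

# Example session (not run automatically):
#   bisect_start ./linux v2.6.25 v2.6.26
#   ... build, boot, run the distcc build ...
#   bisect_mark ./linux bad      # or: good
#   ... repeat until git names the first bad commit ...
#   bisect_finish ./linux bisect.log
```

Each marking makes git check out the next candidate, which roughly halves the remaining range of commits.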
Created attachment 179393 [details] output from git bisect log
Well, the original bisect between v2.6.25 and v2.6.26 didn't go too well, so I manually narrowed it down to between v2.6.26-rc3 and v2.6.26-rc4. Using those two values the bisection went smoothly, yielding the newly attached log. I'm just confused by two things. First, the message connected to the identified commit only mentions NFS updates, and I'm not really using NFS when experiencing the original problem. (I do use amd to automount home directories, etc.) Second, the diffs related to the identified commit appear to include more than NFS-related files. What should I do now?
Created attachment 179395 [details] kernel config file

The kernel config file I used for both the faster kernel and the last few iterations of the slower kernel.
I've submitted this to bugzilla.kernel.org. It's bug 12564 there.
I've found the root cause of the problem. The kernel patch identified is the one where the kernel started honoring the "noac" NFS option, which disables attribute caching. It turns out that amd was using this option when doing its top-level mounts. By telling amd to use a non-zero "auto_attrcache" value, all of my performance issues went away.
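For anyone wanting to check whether they are affected, here is a minimal sketch that lists NFS-type mounts carrying the "noac" option. It assumes the option shows up literally as "noac" in the options field of /proc/mounts, which may not hold on every kernel version. The function name list_noac_mounts is hypothetical.

```shell
# list_noac_mounts [mounts-file]
# Print the mount point of every NFS-type mount whose option list contains
# "noac". Reads /proc/mounts unless another file is given.
list_noac_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # Fields: device, mount point, fs type, options, dump, pass.
    # Wrapping the options in commas makes ",noac," match the exact option
    # and not a longer name that merely starts with "noac".
    awk '$3 ~ /^nfs/ && ("," $4 ",") ~ /,noac,/ { print $2 }' "$mounts_file"
}

# Example (not run automatically):
#   list_noac_mounts
```

If amd's top-level mount points show up in the output, the attribute cache setting described above is worth a look.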