I have a 2.5GHz PowerMac (7,3 architecture) that frequently turns itself off
under load when running the 2.6.12 kernel.
Operations that will make this happen include calculating the module
dependencies during boot and the command 'emerge --update --deep --pretend
world'. In the latter case it crashes while calculating dependencies.
When the crash does occur, the machine is starting to do something CPU
intensive. The fans start getting louder and then the machine turns itself off.
This problem does not occur with the 2.6.11 kernel, so I suspect that there is
a problem with the thermal support for this machine in the newer kernel.
Steps to Reproduce:
1. Install the latest stable kernel on a 2.5GHz PowerMac
2. Boot, run 'emerge --sync', and 'emerge --update --deep --pretend world'
Actual Results:
The machine should turn itself off when the fans start up
Expected Results:
Should run to completion
The problem has also been mentioned in the forums:
can you tell me exactly which kernel you're using (gentoo-sources,
I had this problem when running both gentoo-sources 2.6.12-r6 and
gentoo-sources 2.6.12-r4. The problem hasn't occurred when I'm running
In both cases my configuration has been the default config file (from
arch/ppc64/configs) with the addition of support for the HFS+ filesystem.
I hope that this helps.
Does this occur on other machines (trying to isolate if this may be a hardware
Also, is there anything in your system log that you would be able to share?
Depending on how soon the machine powers off, there may be some interesting tidbits.
I am also running a PowerMac7,3 with kernel 2.6.12 and have no problems so far.
Can you please provide your emerge --info output? Maybe you have some bad kernel
headers, CFLAGS, or other "not so common" installation.
Created attachment 66042 [details]
Created attachment 66043 [details]
Syslog messages from boot to crash.
Created attachment 66044 [details]
Syslog messages from boot to crash.
Comment on attachment 66043 [details]
Syslog messages from boot to crash.
Sorry for the double post
I have finally had a chance to perform some more testing on this problem. Alas
I do not have access to another G5, but I haven't had similar problems when
running MacOS 10.3 or 10.4. When running gentoo-sources-2.6.11-r8 this
problem has happened once since I posted this bug, and the machine has had
quite active loads applied to it in that time. In contrast I can get it to
crash within 5 minutes using gentoo-sources-2.6.12-r8.
Attached are my config file for gentoo-sources-2.6.12-r8 and the syslog output
from boot to crash.
The execution sequence was:
- Boot kernel
- Let X bomb out because /dev/mouse wasn't present (this is the cause of the X
errors in the syslog. I didn't see this with 2.6.11-r4 and haven't investigated
- Login as root
- Run 'emerge --update --deep --pretend world' multiple times (about 7) until
the crash occurs.
The contents of 'emerge --info' are:
Portage 18.104.22.168-r2 (default-linux/ppc64/2005.0, gcc-3.4.4,
glibc-22.214.171.12441102-r1, 2.6.11-gentoo-r8 ppc64)
System uname: 2.6.11-gentoo-r8 ppc64 PPC970FX, altivec supported
Gentoo Base System version 1.6.13
sys-devel/autoconf: 2.13, 2.59-r6
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
CFLAGS="-O2 -pipe -mcpu=970 -mtune=970"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CXXFLAGS="-O2 -pipe -mcpu=970 -mtune=970"
FEATURES="autoconfig distlocks sandbox sfperms strict"
USE="X altivec bash-completion berkdb bitmap-fonts bzip2 cups eds emacs emboss
esd fam fortran gdbm gif gnome gpm gstreamer gtk gtk2 imlib java jpeg kde
libwww motif mozilla ncurses nls opengl pam perl png postgres ppc64 python qt
readline ssl tcpd tetex tiff truetype truetype-fonts type1-fonts unicode xml2
xv zlib userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTDIR_OVERLAY
Do you get the same results with vanilla-sources?
Since the last update to this bug I've tried a number of different kernels and
configurations with pretty much the same result. This includes
vanilla-sources-2.6.13 and gentoo-sources-2.6.13.
The problem does seem to be related to increasing load. A common crash point is
during the calculation of module dependencies at boot. At this point the fans
are at their loudest since startup. However, if the root file system is
force-checked during boot (because the machine has been rebooted so often :-)
then this reduces the CPU load, the fans slow down and then the calculation of
module dependencies passes without incident.
This problem has also occurred while using the 2005.1 install CD. This
installation kernel turned off while I was decompressing the stage3 tarball using
the tar command line listed in the handbook.
I'm beginning to suspect the suggestion above that it might be a hardware
problem. My only hesitation is that I do not see this problem at all when
running MacOS X (10.3 and 10.4).
I'm now going to try installing some other distributions to see if that will
provide some useful data on the problem.
Examining the source code of the G5 fan driver in vanilla 126.96.36.199, I have
discovered a possible cause.
When the machine first gets too hot (line therm_pm72.c:1664) or stays too hot for
too long (line 1675) then the machine could be powered off, which is the symptom
that I'm seeing.
In the first case it is shut off because an attempt to run the userland program
/sbin/critical_overtemp has failed. This program is not present on the 2005.1
minimal CD, nor in the stage3 tar ball so if this code path is taken the machine
I found on my PowerMac 7,3 (2x2.3GHz) that the fan just wasn't getting up to
speed in time to cool the processors down. The fan is supposed to idle at
300rpm, but drops as low as 150rpm. It can take up to 2 seconds to get to full
speed (approx. 3000rpm), but the CPU can reach a critical temperature in a much
shorter time frame. The following change to the kernel keeps the fan at a
decent speed whilst keeping noise to a minimum. It eliminated the problem
for me at a room temperature of 25C.
--- linux/drivers/macintosh/therm_pm72.c.original 2005-10-02
+++ linux/drivers/macintosh/therm_pm72.c 2005-10-02 17:38:44.000000000 +1000
@@ -511,8 +512,8 @@
 	if (id == FCU_FAN_ABSENT_ID)
 		return -EINVAL;
 
-	if (rpm < 300)
-		rpm = 300;
+	if (rpm < 1500)
+		rpm = 1500;
 	else if (rpm > 8191)
 		rpm = 8191;
 	buf[0] = rpm >> 5;
Increasing the min rpm to 2500 eliminated the problem at temperatures up to 29C,
but drastically increased the noise level.
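For anyone who wants to experiment with the floor value outside the kernel, here is a rough standalone sketch of the clamp and the `rpm >> 5` shift shown in the diff. The function and constant names are made up for illustration, and the packing of the low byte is my assumption about how a 13-bit value (maximum 8191) would be split across two bytes; it is not copied from the driver.

```c
#include <stdint.h>

/* Hypothetical names. 8191 is the ceiling from the diff; the floor is a
 * parameter so that 300, 1500 and 2500 can be compared. */
enum { FAN_RPM_CEIL = 8191 };

/* Clamp a requested fan speed into [floor, FAN_RPM_CEIL]. */
static int clamp_fan_rpm(int rpm, int floor)
{
	if (rpm < floor)
		return floor;
	if (rpm > FAN_RPM_CEIL)
		return FAN_RPM_CEIL;
	return rpm;
}

/* Split a clamped 13-bit speed across two bytes.
 * buf[0] holds the upper 8 bits (the rpm >> 5 from the diff).
 * buf[1] holds the lower 5 bits, left-aligned (ASSUMPTION, see above). */
static void encode_fan_rpm(int rpm, uint8_t buf[2])
{
	buf[0] = (uint8_t)(rpm >> 5);
	buf[1] = (uint8_t)(rpm << 3);
}
```

With a floor of 1500, a request for 150rpm comes back as 1500, which is the whole point of the change: the fan never gets the chance to spin down to a speed it cannot recover from quickly.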
I forgot to mention, the critical temperature appears to be about 85C on either
processor. If you get above that, the kernel will call "/sbin/critical_overtemp"
(which I'm guessing should be a script that runs '/sbin/shutdown -h now'), wait
30 seconds, and then turn the power off.
I've seen the cpu temperature reach 84.3C on one CPU in MacOS 10.4.2 without
triggering a shutdown.
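To make the sequence described above easier to follow, here is a small model of it in plain C. It is only a sketch of the behaviour as I described it (roughly 85C threshold, helper call, 30 second wait, power off), not the driver's actual code. All names are hypothetical, and the model assumes one tick per second and that the countdown keeps running once the helper has been called.

```c
#include <stdbool.h>

#define CRITICAL_TEMP_C  85.0 /* approximate threshold, from observation */
#define SHUTDOWN_WAIT_S  30   /* wait between helper call and power off */

enum overtemp_action { ACT_NONE, ACT_CALL_HELPER, ACT_POWER_OFF };

struct overtemp_state {
	bool helper_called;        /* has the userland helper been invoked? */
	int  seconds_since_helper; /* ticks elapsed since that call */
};

/* Advance the model by one second, given the hottest CPU temperature.
 * Returns what the caller would be expected to do on this tick. */
static enum overtemp_action overtemp_tick(struct overtemp_state *st,
					  double temp_c)
{
	if (!st->helper_called) {
		if (temp_c > CRITICAL_TEMP_C) {
			st->helper_called = true;
			st->seconds_since_helper = 0;
			return ACT_CALL_HELPER; /* e.g. /sbin/critical_overtemp */
		}
		return ACT_NONE;
	}
	/* Helper already called: count down regardless of temperature. */
	if (++st->seconds_since_helper >= SHUTDOWN_WAIT_S)
		return ACT_POWER_OFF;
	return ACT_NONE;
}
```

In this model a reading of 84.3C does nothing, which matches what I saw under MacOS, while anything above 85C starts a countdown that ends with the power being cut.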
If your kernel configuration contains "# CONFIG_IOMMU_VMERGE is not set" then
the kernel cannot turn the power off, preventing the pesky unprompted shutdowns
(at a cost of "Shutdown timed out, power off now !" printks and the risk of
turning your shiny PowerMac into an expensive stone).
Created attachment 70270 [details, diff]
I contacted benh and he gave me this patch, with a good explanation of what is going on:
Hi Markus ! Here's what i posted to other people with the same problem
recently, feel free to copy that to the gentoo bug, I'm still waiting
for feedback on the proposed patch.
-------- Forwarded Message --------
From: Benjamin Herrenschmidt <firstname.lastname@example.org>
Subject: Problems with overtemp conditions on a PowerMac G5
Date: Fri, 07 Oct 2005 11:09:50 +1000
I'm doing this group mailing to people who have reported so far issues
with the machine shutting down abruptly after putting some load on the
I've investigated the issue, and came up with a couple of facts here:
One is that some machines seem to have either an incorrect thermal
calibration data, or simply a defective CPU<->heatsink connection, and
the other one is that Darwin/OS X makes this problem "invisible" by
silently slowing the CPU down when it heats up too much (thus they can
claim the problem doesn't exist and don't have to service the faulty
machines I suppose).
I have made a patch to the linux thermal driver that may help. The idea
is that if the driver detects a critical thermal condition, it doesn't
shut down right away, but gives a few seconds with fans at full speed
for the condition to clear up instead of shutting down.
Please let me know if that helps
If you test this patch, then please leave a comment and/or contact benh (benh
at kernel dot crashing dot org) directly.
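For readers who want the idea without reading the attachment, Ben's description (fans to full speed for a few seconds before giving up) can be sketched roughly as below. This is not the patch itself. The names are invented for illustration, and GRACE_TICKS is an arbitrary stand-in for "a few seconds", assuming one tick per second.

```c
#include <stdbool.h>

#define GRACE_TICKS 5 /* hypothetical length of the grace period */

enum thermal_action { FANS_NORMAL, FANS_FULL, SHUT_DOWN };

struct thermal_state {
	int critical_ticks; /* consecutive ticks spent in a critical condition */
};

/* Advance by one tick. While the condition is critical the fans are
 * forced to full speed; shutdown only happens if the condition has not
 * cleared by the end of the grace period. */
static enum thermal_action thermal_tick(struct thermal_state *st,
					bool critical)
{
	if (!critical) {
		st->critical_ticks = 0; /* condition cleared, start over */
		return FANS_NORMAL;
	}
	if (++st->critical_ticks > GRACE_TICKS)
		return SHUT_DOWN;
	return FANS_FULL;
}
```

The difference from the old behaviour is the reset: a short spike that the fans can deal with never reaches the shutdown branch.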
Created attachment 75071 [details, diff]
Now here we go! This patch solves the problem completely for me! Benh has already sent this patch upstream, so it will be included in the next kernel release!
@kernel herd: Would you mind adding this to gentoo-sources?
Looks good, but we'll wait for it to hit Linus' tree first.
it hit Linus' tree:
As the OP of this bug let me add my confirmation of Ben's comments in his git log. This latest patch doesn't eliminate the crashes entirely but reduces them so significantly that my machine is stable enough for my purposes.
I'd encourage this latest patch be added to gentoo-sources as soon as possible.
Fixed in genpatches-2.6.14-7 (gentoo-sources-2.6.14-r6)