
Bug 177952

Summary: [2.6.21 regression] ACPI S5 (Power off) is broken
Product: Gentoo Linux
Reporter: Vladimir Pouzanov <farcaller>
Component: [OLD] Core system
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel>
Status: VERIFIED NEEDINFO
Severity: normal
CC: kim, radek
Priority: High
Version: unspecified
Hardware: x86
OS: Linux
Whiteboard: linux-2.6.21-regression linux-2.6.22
Package list:
Runtime testing required: ---
Attachments: linux-2.6.21-gentoo config; diff from 2.6.20 to 2.6.21

Description Vladimir Pouzanov 2007-05-10 18:10:56 UTC
My notebook no longer powers down on request; I get a "Power Down." message instead. This happens on vanilla sources; gentoo and suspend2 sources are also broken.

Reproducible: Always

Steps to Reproduce:
1. Power down notebook by issuing 'suspend' or alt+sysrq+o

Actual Results:  
"Power Down." message

Expected Results:  
Powered off device

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 72
model name      : AMD Turion(tm) 64 X2 Mobile Technology TL-52
stepping        : 2
cpu MHz         : 800.000
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8legacy ts fid vid ttp tm stc
bogomips        : 1608.46
clflush size    : 64

# lspci
00:00.0 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.1 RAM memory: nVidia Corporation C51 Memory Controller 0 (rev a2)
00:00.2 RAM memory: nVidia Corporation C51 Memory Controller 1 (rev a2)
00:00.3 RAM memory: nVidia Corporation C51 Memory Controller 5 (rev a2)
00:00.4 RAM memory: nVidia Corporation C51 Memory Controller 4 (rev a2)
00:00.5 RAM memory: nVidia Corporation C51 Host Bridge (rev a2)
00:00.6 RAM memory: nVidia Corporation C51 Memory Controller 3 (rev a2)
00:00.7 RAM memory: nVidia Corporation C51 Memory Controller 2 (rev a2)
00:02.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:03.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:04.0 PCI bridge: nVidia Corporation C51 PCI Express Bridge (rev a1)
00:09.0 RAM memory: nVidia Corporation MCP51 Host Bridge (rev a2)
00:0a.0 ISA bridge: nVidia Corporation MCP51 LPC Bridge (rev a3)
00:0a.1 SMBus: nVidia Corporation MCP51 SMBus (rev a3)
00:0b.0 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)
00:0b.1 USB Controller: nVidia Corporation MCP51 USB Controller (rev a3)
00:0d.0 IDE interface: nVidia Corporation MCP51 IDE (rev a1)
00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1)
00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2)
00:10.1 Audio device: nVidia Corporation MCP51 High Definition Audio (rev a2)
00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 Ethernet controller: Atheros Communications, Inc. AR5006EG 802.11 b/g Wireless PCI Express Adapter (rev 01)
04:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce Go 7600] (rev a1)
05:01.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller
05:01.1 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 19)
05:01.2 System peripheral: Ricoh Co Ltd Unknown device 0843 (rev 01)
05:01.3 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 0a)
Comment 1 Vladimir Pouzanov 2007-05-10 18:12:14 UTC
Created attachment 118773 [details]
linux-2.6.21-gentoo config
Comment 2 Kim Højgaard-Hansen 2007-05-10 18:27:05 UTC
Could you elaborate a little more on the "power down" message?

Is this from dmesg? If so, could you attach it as well?
Comment 3 Vladimir Pouzanov 2007-05-10 18:36:58 UTC
This is the last output from the kernel, not a dmesg, just a printk ;) Then the kernel is completely halted.

I've tried to make printk more verbose and got last line:
acpi_power_off called (drivers/acpi/sleep/poweroff.c:45)
Comment 4 Kim Højgaard-Hansen 2007-05-10 19:25:41 UTC
and the last working kernel version was?
Comment 5 Vladimir Pouzanov 2007-05-10 19:31:14 UTC
Sorry, forgot to mention that. Last known good is linux-2.6.20-gentoo-r8
Comment 6 Kim Højgaard-Hansen 2007-05-10 20:54:20 UTC
which suspension mode do you use?

there should be "platform" "shutdown" and "reboot" , you could try all of them
Comment 7 Vladimir Pouzanov 2007-05-10 20:58:00 UTC
> there should be "platform" "shutdown" and "reboot" , you could try all of them

I assume that "shutdown" is /sbin/shutdown and "reboot" is /sbin/reboot. What's "platform" then and how can I try it?
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2007-05-10 22:46:10 UTC
I'm a little confused as to whether this bug is about suspend-to-disk or power off..

Does powering off without suspending work? i.e. just running "halt"
how about "reboot"?
Comment 9 Vladimir Pouzanov 2007-05-11 08:20:37 UTC
This is not about suspend2 or any other kind of suspend, just power off. halt, poweroff, shutdown and alt+sysrq+o all behave the same: I get the "Power Down." message. reboot works. suspend2 works too, but after suspending (to disk) I get the same "Power Down." message.
Comment 10 Daniel Drake (RETIRED) gentoo-dev 2007-05-11 12:16:18 UTC
OK. Can you shut down/halt when booted with the "acpi=off" kernel parameter?
Comment 11 Vladimir Pouzanov 2007-05-11 12:56:57 UTC
The only difference with acpi=off is that the system keeps working after the "Power Down."/"System halted." message, i.e. the kernel is not halted.
Comment 12 Kim Højgaard-Hansen 2007-05-12 10:17:14 UTC
Could you specify which configuration parameters (if any) you changed from 2.6.20-r8 to 2.6.21?

Could you try with a newer version of vanilla-sources?
Comment 13 Vladimir Pouzanov 2007-05-12 11:16:17 UTC
Created attachment 118986 [details]
diff from 2.6.20 to 2.6.21
Comment 14 Vladimir Pouzanov 2007-05-12 11:17:53 UTC
Do you mean git version by 'a newer version of vanilla-sources'? AFAIK, 2.6.21.1 is the latest one.
Comment 15 Kim Højgaard-Hansen 2007-05-12 12:07:14 UTC
So you did try 2.6.21.1? Then yes, perhaps the git snapshot.

Also, could you try hitting some keys when you get the power down message? I found someone reporting that he could halt the machine that way.
Comment 16 Kim Højgaard-Hansen 2007-05-13 15:32:17 UTC
Could you test with the newly released 2.6.22-rc1?
Comment 17 Kevin Bowling 2007-05-13 22:28:14 UTC
From the dialog in this bug, I don't know if you guys understand what the reporter is saying.  The kernel is no longer able to soft off the machine.  Remember Windows 95 "It is not safe to turn off your computer." messages?  The kernel does the same thing if it cannot soft off a machine.
Comment 18 Daniel Drake (RETIRED) gentoo-dev 2007-05-13 23:49:13 UTC
I may be wrong, but I think the kernel prints that message even when "automatic" power down will happen immediately after.

Also, this worked in earlier kernels, so is a valid bug regardless of what appears on screen.
Comment 19 Vladimir Pouzanov 2007-05-14 07:01:03 UTC
OK, it's working again in 2.6.22-r1; however, half of the screen gets filled with some messages. I'm not sure what they say, because the power goes off within half a second.
Comment 20 Daniel Drake (RETIRED) gentoo-dev 2007-05-20 04:25:28 UTC
I've researched this and can't find anything other than another unsolved report (http://lkml.org/lkml/2007/5/8/281). I've looked through the various resources (LKML, kernel bugzilla, git logs) but I can't see anything obvious that would have fixed this for 2.6.22. It's also hard to say which part of the kernel the original bug is related to -- it doesn't appear to be ACPI.

If you're interested in helping further, you could do an inverse bisection. First read about bisections here: http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

You need to do the opposite, with the aim of finding the patch which solved the issue in 2.6.22 (so that we can backport it). Mark 2.6.21 as good, and 2.6.22-rc1 as bad. When you find a kernel that fails to shut down, mark it as GOOD. When you find a kernel that shuts down OK, mark it as BAD. Eventually you'll reach the "first bad commit", which will be the commit that fixed the bug.
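The inverted labelling can be easier to follow as a small simulation. The sketch below is only an illustration, not how git is implemented: it assumes a strictly linear history (real kernel history contains merges), and the commit names `c0`..`c9` and the helper `find_fixing_commit` are hypothetical.

```python
from typing import Callable, Sequence


def find_fixing_commit(
    commits: Sequence[str],
    shuts_down_ok: Callable[[str], bool],
) -> str:
    """Return the first commit at which the machine powers off again.

    Assumes `commits` is in chronological order, commits[0] is known to be
    broken (labelled "good" for the bisection) and commits[-1] is known to be
    fixed (labelled "bad" for the bisection).
    """
    lo, hi = 0, len(commits) - 1  # lo: still broken ("good"), hi: fixed ("bad")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if shuts_down_ok(commits[mid]):
            hi = mid  # the fix is already present: mark as BAD
        else:
            lo = mid  # power off still fails: mark as GOOD
    return commits[hi]  # the "first bad commit" is the one that fixed the bug


# Hypothetical linear history in which the fix landed in commit c6.
history = [f"c{i}" for i in range(10)]
fixing = find_fixing_commit(history, lambda c: int(c[1:]) >= 6)
print(fixing)  # c6
```

The only change from an ordinary bisection is the meaning of the two labels: "bad" marks the state that appeared later in history, which here happens to be the working one.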

Given that this is a time consuming process (I estimate 13 kernels required to complete the bisection) and you have already found some working configurations, I understand if you aren't willing to go to this trouble -- this bug will simply sit around until 2.6.22 is released (unless more info appears).
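An estimate like this follows from bisection halving the candidate range at every step, so the number of kernels to build grows with the base-2 logarithm of the number of commits in the range. A rough sketch, assuming a linear history; the helper name and the commit counts are made up for illustration and are not the real size of the 2.6.21..2.6.22-rc1 range, and git's own estimate can differ by a step:

```python
def bisect_steps(n_commits: int) -> int:
    """Approximate number of builds needed to bisect a range of n_commits.

    Uses integer arithmetic for ceil(log2(n_commits)); returns 0 for ranges
    of zero or one commit, where nothing needs to be tested.
    """
    return max(n_commits - 1, 0).bit_length()


# Hypothetical range sizes: 13 steps covers anything from 4097 to 8192 commits.
for n in (4096, 4097, 8192):
    print(n, bisect_steps(n))
```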
Comment 21 Vladimir Pouzanov 2007-05-22 20:36:25 UTC
bisect is a really nice thing! Last good revision is 2e42005bcdb4f63bed1cea7f537a5534d4bd7a57. Next revisions:
f3d2e7865c816258c699ff965768e46b50d536d3
c5fc42ac4d4d6d3e3f619290b86890cb3725d2f8
8f34890dce60f7df6dd23a0d04977c6572adaab8
4bf273939c99fae5bae399f51c417a552d74b97f
a4bbb810dedaecf74d54b16b6dd3c33e95e1024c
fail to compile. Next one (ad71860a17ba33eb0e673e9e2cf5ba0d8e3e3fdd) compiles, S5 is broken there. diff between 2e42005b and ad71860a is 9833 lines long and it makes me feel nervous.
Comment 22 Daniel Drake (RETIRED) gentoo-dev 2007-05-22 21:21:45 UTC
Can you please post the output of "git bisect log" ?
Comment 23 Daniel Drake (RETIRED) gentoo-dev 2007-05-22 21:23:05 UTC
Also, am I right in assuming that for the ones that failed to compile, you marked them as good/bad? If so, that probably messed up the bisection result. This is really the only bisect downfall (sorry, I should have mentioned that earlier)
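To illustrate why untestable commits should be left unlabelled rather than marked good or bad, here is a minimal simulation. It is a sketch under assumptions, not git's algorithm: the history is linear, the commit names are hypothetical, and the usual labelling applies (bad means the bug is present). Current Git versions offer `git bisect skip` for this situation.

```python
from typing import Callable, Optional, Sequence, Set, Tuple


def bisect_with_skips(
    commits: Sequence[str],
    has_bug: Callable[[str], Optional[bool]],
) -> Tuple[str, str]:
    """Bisect while leaving commits that cannot be built unlabelled.

    `has_bug` returns True (bug present), False (bug absent) or None (the
    commit does not compile). commits[0] must be good and commits[-1] bad.
    Returns (last known good, first known bad); any commits between the two
    could not be tested.
    """
    lo, hi = 0, len(commits) - 1
    skipped: Set[int] = set()
    while True:
        candidates = [i for i in range(lo + 1, hi) if i not in skipped]
        if not candidates:
            return commits[lo], commits[hi]
        target = (lo + hi) // 2
        # Test the still-untested commit closest to the midpoint.
        mid = min(candidates, key=lambda i: abs(i - target))
        result = has_bug(commits[mid])
        if result is None:
            skipped.add(mid)  # cannot build: neither good nor bad
        elif result:
            hi = mid
        else:
            lo = mid


# Hypothetical history: the bug arrives in c5, and c4..c6 fail to compile.
history = [f"c{i}" for i in range(10)]


def check(commit: str) -> Optional[bool]:
    n = int(commit[1:])
    return None if 4 <= n <= 6 else n >= 5


print(bisect_with_skips(history, check))  # ('c3', 'c7')
```

The result is a range rather than a single commit, which is the honest answer when the commits in between cannot be tested. Labelling a non-compiling commit as bad instead would narrow the search toward it even though its real state is unknown.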
Comment 24 Vladimir Pouzanov 2007-05-22 21:36:04 UTC
git-bisect start
# bad: [de46c33745f5e2ad594c72f2cf5f490861b16ce1] Linux 2.6.21
git-bisect bad de46c33745f5e2ad594c72f2cf5f490861b16ce1
# good: [62d0cfcb27cf755cebdc93ca95dabc83608007cd] Linux 2.6.20
git-bisect good 62d0cfcb27cf755cebdc93ca95dabc83608007cd
# bad: [7292576043666ff39946dee14641fe719ba8c7e8] ACPI: fix S3 fan resume issue
git-bisect bad 7292576043666ff39946dee14641fe719ba8c7e8
# bad: [4768fbcbcfbbcacb785ae08eef33767a0b4fdcdd] [NET]: Fix whitespace errors.
git-bisect bad 4768fbcbcfbbcacb785ae08eef33767a0b4fdcdd
# bad: [905adce4094d64a6691df994e424fbf486301adc] Merge master.kernel.org:/pub/scm/linux/kernel/git/bart/ide-2.6
git-bisect bad 905adce4094d64a6691df994e424fbf486301adc
# good: [64358164f5bfe5e11d4040c1eb674c29e1436ce5] USB: remove duplicate device id from zc0301
git-bisect good 64358164f5bfe5e11d4040c1eb674c29e1436ce5
# bad: [21d37bbc65e39a26856de6b14be371ff24e0d03f] Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
git-bisect bad 21d37bbc65e39a26856de6b14be371ff24e0d03f
# bad: [5008740e27540e4069a2f8235f8308aba46036a2] ACPICA: Update version to 20061215
git-bisect bad 5008740e27540e4069a2f8235f8308aba46036a2
# bad: [977a6226feae3e2c10a4d8227625ff0f04b49239] ACPICA: Fix trace output name and whitespace
git-bisect bad 977a6226feae3e2c10a4d8227625ff0f04b49239
# bad: [c5a7156959e89b32260ad6072bbf5077bcdfbeee] ACPICA: Disable all wake GPEs after first one recieved
git-bisect bad c5a7156959e89b32260ad6072bbf5077bcdfbeee
# good: [f93a21c7184de3db962d01f11eb2ddad5396c824] ACPICA: Update version to 20060721
git-bisect good f93a21c7184de3db962d01f11eb2ddad5396c824
# bad: [4bf273939c99fae5bae399f51c417a552d74b97f] ACPICA: Fix for FADT conversion in 64-bit mode
git-bisect bad 4bf273939c99fae5bae399f51c417a552d74b97f
# bad: [f3d2e7865c816258c699ff965768e46b50d536d3] ACPICA: Implement simplified Table Manager
git-bisect bad f3d2e7865c816258c699ff965768e46b50d536d3

I marked the failed compilation as bad (that was f3d2e78); after that I hunted for good/bad with cg-seek.
Comment 25 Daniel Drake (RETIRED) gentoo-dev 2007-05-22 23:02:10 UTC
(In reply to comment #21)
> bisect is a really nice thing! Last good revision is
> 2e42005bcdb4f63bed1cea7f537a5534d4bd7a57.

good as in "shuts down OK"?

> Next one (ad71860a17ba33eb0e673e9e2cf5ba0d8e3e3fdd) compiles,
> S5 is broken there.

OK - I think you've followed the general guide on my weblog, which tracks down the commit that ADDED the bug.

That's really useful when the bug isn't already solved in a newer version, and may still prove to be useful, but I was hoping that you'd invert it in order to find the patch that FIXED the bug (read comment #20).

If you can confirm that you did the bisection in the style written on my weblog (the 'normal' way), I'll write to the ACPI list with that info, asking if they know which patch solved the issue. If they don't then you can consider starting the inverted bisection a day or 2 later (I know it's a time consuming process so let's give email a shot first!)
Comment 26 Daniel Drake (RETIRED) gentoo-dev 2007-06-09 15:18:12 UTC
Please see comment #25.
Comment 27 Vladimir Pouzanov 2007-06-09 15:24:55 UTC
I've switched to .22-r4 with manually applied gentoo & suspend2 patchsets. So, if nobody else finds this, the bug can be safely discarded.