Bug 252229 - [2.6.26 regression] sata_nv hang on boot when SMP/HIGHMEM enabled
Summary: [2.6.26 regression] sata_nv hang on boot when SMP/HIGHMEM enabled
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard: linux-2.6.26-regression
Keywords:
Depends on:
Blocks:
 
Reported: 2008-12-23 00:45 UTC by M Baker
Modified: 2009-04-09 02:04 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Config for 2.6.28-rc9 which hangs on boot (.config, 86.21 KB, text/plain)
2008-12-24 19:37 UTC, M Baker
.config for 2.6.25-r6 which does NOT hang on boot (.config, 79.38 KB, text/plain)
2008-12-24 20:59 UTC, M Baker
lspci -vvvv output for gentoo kernel 2.6.25-r6 (lspci_kernel-2.6.25.log, 33.07 KB, text/plain)
2008-12-31 04:55 UTC, M Baker
git bisect log for 2.6.25 (git_bisect_kernel-2.6.25_sata_nv_hang.log, 2.20 KB, text/plain)
2008-12-31 04:58 UTC, M Baker
config used for git bisect (.config, 79.10 KB, text/plain)
2008-12-31 05:03 UTC, M Baker
git bisect showing single commit that hangs system (git_bisect_sata_nv_try_2.log, 2.27 KB, text/plain)
2009-01-10 03:16 UTC, M Baker
/var/log/messages from booting machine (messages.log.1, 47.05 KB, text/plain)
2009-01-13 02:28 UTC, M Baker

Description M Baker 2008-12-23 00:45:48 UTC
Gentoo with kernel 2.6.26 (all patches) through 2.6.27 almost always hangs at boot while trying to load the sata_nv module (not always; it depends on kernel options).  Kernel 2.6.25 always boots without problems.  I have an EVGA motherboard with an nvidia i780 chipset, 4 GB of RAM installed, and an Intel 3450 CPU (quad core).

If I switch to a single-processor (non-SMP) kernel, or use the kernel's low-memory-only option (1GB only), then the machine boots fine.

Used genkernel to build the kernel.  Most options are default, but I turned off virtualization support (so I can run virtualbox) and switched on the core2 architecture option.

During the hang, the HD activity light is on almost constantly (it blinks a little) for a minute or so, then shuts off.  The machine does not respond to ctrl+alt+delete; I need to use the reset button to reboot.

Reproducible: Always

Steps to Reproduce:
1.  boot machine with kernel 2.6.26 or 2.6.25
2.  
3.

Actual Results:  
Hangs loading the sata_nv module.

Expected Results:  
should load without hang
Comment 1 M Baker 2008-12-23 00:47:12 UTC
Correction to steps: 

Steps to Reproduce:
1.  boot machine with kernel 2.6.26 or 2.6.27
2.  
3.
Comment 2 Wormo (RETIRED) gentoo-dev 2008-12-24 08:18:49 UTC
Is this reproducible with vanilla sources? If so, do you want to try doing git bisect using mainline git repository to figure out which patch triggered this problem?
Comment 3 M Baker 2008-12-24 16:06:44 UTC
Tried with vanilla-sources-2.6.28-rc9 and that hangs as well.  I can do a git bisect if you point me to some instructions.
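
For readers unfamiliar with the bisect idea suggested in comment 2: it is a binary search over the history between a known-good and a known-bad kernel. The sketch below is only an illustration in Python, not how git implements it. It assumes a strictly linear history (real kernel history contains merges) and that every commit after the first bad one is also bad; `first_bad`, `is_bad`, and the numbered commits are hypothetical names.

```python
def first_bad(commits, is_bad):
    """Binary-search a linear history for the first commit where is_bad() is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_commits_tested).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1  # each test stands for one build-and-boot cycle
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi], tested


if __name__ == "__main__":
    # Hypothetical history: commits numbered 0..1000, hang introduced at 700.
    history = list(range(1001))
    culprit, boots = first_bad(history, lambda c: c >= 700)
    print(culprit, boots)
```

Each halving costs one kernel build and one test boot, so even a history of about a thousand commits needs only around ten cycles. A single wrong good/bad verdict, however, steers the search into the wrong half.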
Comment 5 Sergey Ovcharenko 2008-12-24 18:29:43 UTC
Did you try to build sata_nv in kernel (not as a module)?
Can you provide us with the boot logs?
Also post your .config please.
Comment 6 M Baker 2008-12-24 19:30:38 UTC
(In reply to comment #5)
> Did you try to build sata_nv in kernel (not as a module)?
> Can you provide us with the boot logs?
> Also post your .config please.
> 

Which boot log files do you want and against which kernel?

Comment 7 M Baker 2008-12-24 19:31:15 UTC
(In reply to comment #5)
> Did you try to build sata_nv in kernel (not as a module)?
> Can you provide us with the boot logs?
> Also post your .config please.
> 

Yes, I tried built into kernel...no change.
Comment 8 M Baker 2008-12-24 19:35:29 UTC
(In reply to comment #4)
> Certainly, here are a couple of useful guides:
> http://www.kernel.org/doc/local/git-quick.html
> http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
> 

This will take some time.  Thanks for the instructions and Merry Christmas to you.
Comment 9 M Baker 2008-12-24 19:37:26 UTC
Created attachment 176310 [details]
Config for 2.6.28-rc9 which hangs on boot
Comment 10 Markos Chandras (RETIRED) gentoo-dev 2008-12-24 19:58:13 UTC
Can you attach the .config file of the 2.6.25 kernel, since we know it works there? Thanks
Comment 11 M Baker 2008-12-24 20:59:37 UTC
Created attachment 176317 [details]
.config for 2.6.25-r6 which does NOT hang on boot
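
With a working and a hanging config both attached, one way to compare them is to diff the option sets rather than the raw files. A minimal sketch, assuming the usual kernel .config syntax (`CONFIG_X=value` lines and `# CONFIG_X is not set` comments); `parse_config` and `diff_configs` are hypothetical helper names, and the sample text in the usage section is made up, not taken from the attachments.

```python
import re

# "CONFIG_FOO=value" lines and "# CONFIG_FOO is not set" comment lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value} for a kernel .config; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(good_text, bad_text):
    """Return {option: (good_value, bad_value)} for options that differ.

    An option missing from one file is reported with value None.
    """
    good, bad = parse_config(good_text), parse_config(bad_text)
    return {
        name: (good.get(name), bad.get(name))
        for name in sorted(set(good) | set(bad))
        if good.get(name) != bad.get(name)
    }


if __name__ == "__main__":
    # Made-up fragments for illustration only.
    good = "CONFIG_SMP=y\n# CONFIG_HIGHMEM4G is not set\nCONFIG_SATA_NV=m\n"
    bad = "CONFIG_SMP=y\nCONFIG_HIGHMEM4G=y\nCONFIG_SATA_NV=m\n"
    for name, (g, b) in diff_configs(good, bad).items():
        print(name, g, "->", b)
```

Options that exist in only one kernel version show up with `None` on the other side, which separates "new option" noise from settings that actually changed.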
Comment 12 Markos Chandras (RETIRED) gentoo-dev 2008-12-24 23:22:34 UTC
Could you please boot 2.6.25 kernel and give us an lspci -vvv output?

Thanks
Comment 13 M Baker 2008-12-31 04:55:29 UTC
Created attachment 176927 [details]
lspci -vvvv output for gentoo kernel 2.6.25-r6
Comment 14 M Baker 2008-12-31 04:58:32 UTC
Created attachment 176929 [details]
git bisect log for 2.6.25

git bisect reports 50be4917ee70218f59e04dec029121b97fb9cb3d as the first bad commit, i.e. the first one that hangs.
Comment 15 M Baker 2008-12-31 05:03:03 UTC
Created attachment 176931 [details]
config used for git bisect

I used an alternate .config (attached) for the git bisect builds because of compile errors.  It turns off all the unused file systems (reiserfs, dlock, ) and is what I used to reproduce hang/no hang during the bisect.  It is basically exactly what genkernel outputs, except with those file systems turned off.
Comment 16 M Baker 2008-12-31 05:04:14 UTC
Note that the git bisect log is only partial, since I had to restart it a few times in the middle due to various compile errors and occasional hangs in some of the bisect builds.  It does narrow down to the specific commit though.
Comment 17 Daniel Drake (RETIRED) gentoo-dev 2009-01-09 17:06:01 UTC
Not really, it narrowed it down to a merge commit (the parent of many others), and a large merge at that. How strange. I'll try and look at the log in more detail.
Comment 18 M Baker 2009-01-10 03:15:27 UTC
I went through git-bisect again, this time starting with the known bad one, and it narrowed it down to a different commit.  My mistake or git's... who knows.

Anyway, attached is the new bisect log.
Comment 19 M Baker 2009-01-10 03:16:06 UTC
Created attachment 177933 [details]
git bisect showing single commit that hangs system
Comment 20 Markos Chandras (RETIRED) gentoo-dev 2009-01-10 11:15:50 UTC
(In reply to comment #19)
> Created an attachment (id=177933) [edit]
> git bisect showing single commit that hangs system
> 

Great. That helps a lot

Could you please try the latest vanilla-sources (2.6.28) and verify that the problem still exists?

Thank you
Comment 21 M Baker 2009-01-10 17:13:19 UTC
Very much still present in vanilla-sources-2.6.28.
Comment 22 George Kadianakis (RETIRED) gentoo-dev 2009-01-10 19:48:26 UTC
(In reply to comment #21)
> Very much still present in vanilla-sources-2.6.28.
> 

Could you provide us with the last few lines of booting output just before the hang?
Also, the dmesg output of a working kernel (2.6.25?) would be really appreciated, to see which message would come normally after the hang.
Comment 23 M Baker 2009-01-13 02:28:54 UTC
Created attachment 178316 [details]
/var/log/messages from booting machine
Comment 24 M Baker 2009-01-13 02:38:21 UTC
Last few messages

Scanning for pcmcia_core...loaded
Scanning for sata_promise...loaded
Scanning for sata_sil...loaded
Scanning for sata_sil24...loaded
Scanning for sata_svw...loaded
Scanning for sata_nv...<hang>
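
The console output above follows a regular "Scanning for NAME...loaded" pattern, so the module being probed when the machine hangs can be picked out mechanically from a captured console log. A minimal sketch, assuming that line format holds for every module; `last_unfinished_module` is a hypothetical helper, and `<hang>` is the reporter's annotation rather than real console output.

```python
import re

# "Scanning for <module>...<status>"; status is "loaded" on success.
_LINE = re.compile(r"^Scanning for (\S+?)\.\.\.(.*)$")


def last_unfinished_module(console_text):
    """Return the last module whose 'Scanning for' line lacks 'loaded', or None."""
    unfinished = None
    for line in console_text.splitlines():
        m = _LINE.match(line.strip())
        if m and m.group(2).strip() != "loaded":
            unfinished = m.group(1)
    return unfinished


if __name__ == "__main__":
    log = (
        "Scanning for sata_sil24...loaded\n"
        "Scanning for sata_svw...loaded\n"
        "Scanning for sata_nv...<hang>\n"
    )
    print(last_unfinished_module(log))
```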
Comment 25 Andrew 2009-01-15 17:52:32 UTC
I have also had problems with  nv_sata and the 2.6.26 and 2.6.27 kernels
(see my report in http://forums.gentoo.org/viewtopic-t-727145-highlight-.html)
I don't know if my report is related to your bug or not. In summary:
(1) going from gentoo-sources-2.6.24-r8 to 2.6.26-r4 and 2.6.27-r7
more or less broke my sata_nv. (broken dmesg in the forum report.)

(2) By running mint linux with the 2.6.27 kernel and studying lsmod,
I was able to configure 2.6.27-r7 by just adding a few items to my old configuration so that it ran quite satisfactorily.

(3) 2.6.26-r4 failed to work when I used the configuration obtained from (2).


I don't know whether any of this is of interest to you or not.
If it is, let me know if you want any more info.
Comment 26 Daniel Drake (RETIRED) gentoo-dev 2009-01-15 18:16:06 UTC
(In reply to comment #25)
> I have also had problems with  nv_sata and the 2.6.26 and 2.6.27 kernels
> (see my report in http://forums.gentoo.org/viewtopic-t-727145-highlight-.html)
> I don't know if my report is related to your bug or not. In summary:

Then please file a new bug. It is very frustrating to deal with 2 problems on 1 bug, so please help us avoid the potential frustration if you are in doubt! :)
Comment 27 Andrew 2009-01-15 19:23:53 UTC
(In reply to comment #26)
> (In reply to comment #25)
> > I have also had problems with  nv_sata and the 2.6.26 and 2.6.27 kernels
> > (see my report in http://forums.gentoo.org/viewtopic-t-727145-highlight-.html)
> > I don't know if my report is related to your bug or not. In summary:
> 
> Then please file a new bug. It is very frustrating to deal with 2 problems on 1
> bug, so please help us avoid the potential frustration if you are in doubt! :)
> 
Please ignore my comment 25.
Thanks for explaining this to me and sorry to have bothered you. I thought (mistakenly) that the additional info might be helpful to you.  Actually I don't consider that I have a problem, as I have configured a 2.6.27 kernel that works just fine, even though I can't use make oldconfig to configure it.

I won't be filing a new bug as from my perspective everything is now fine.
Comment 28 M Baker 2009-02-14 13:25:30 UTC
I have tried this again with 2.6.29-rc4.  It doesn't matter if SMP or high mem is turned on or off, it still hangs.  

I think the SMP/high mem thing is a red herring and I didn't test it right the first time around.
Comment 29 M Baker 2009-04-09 02:04:53 UTC
Tried again with 2.6.29-gentoo-r1 and it failed.  I then pulled the plugs on both hard drives and switched them to different SATA ports on the motherboard... now it works fine.  They were previously on ports 0 and 1, and are now on something like 6 and 7.

This bug can be closed since this is no longer a problem for me.  I will reopen if it starts failing again.