Bug 203877 - gentoo-sources-2.6.23-r3 - Processes get stuck in disk state
Summary: gentoo-sources-2.6.23-r3 - Processes get stuck in disk state
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-12-31 12:47 UTC by Kai Krakow
Modified: 2008-05-10 16:34 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments

Description Kai Krakow 2007-12-31 12:47:23 UTC
I first encountered this error when starting Konqueror or Firefox on sites which load many graphics. My KDE processes start to hang in disk state one after the next until the whole desktop hard locks. If I switch from X to the console I can log in as root, but not as a regular user, and check with "ps axuw" that some processes hang in disk state. Switching back to X hard locks the machine; only SSH logins as root are possible then. Reboot hangs. I can do Alt+SysRq+I, ...+S, +U to unmount the partitions and then reboot without filesystem corruption. CPU-intensive programs like games (Neverwinter Nights, Quake 4, Doom 3) work although they do intensive IO from time to time. But desktop apps which load many small files in a short time (Konqueror, Firefox) make the system block.
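
For anyone collecting data on a similar hang, the "ps axuw" check above can be narrowed to just the stuck processes. The following is a minimal sketch, not something used in this report: it assumes a Linux /proc filesystem, and the function names are made up for illustration. It reads the state field from /proc/<pid>/stat and lists processes in uninterruptible sleep (state "D", which ps and top show as disk state).

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left].strip())
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for processes in uninterruptible sleep (state 'D')."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc here (not Linux, or not mounted)
    blocked = []
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Running this in a loop from a root console or SSH session while the desktop is locking up would show which processes enter state D first.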

I first thought it was related to my home partition being on NFSv4, but I got the same problem on my laptop, which does not do IO on any NFS mounts. It is much harder to reproduce there, though. Both systems have in common that their filesystems are on XFS.

The only way to fix it was to go back to gentoo-sources-2.6.22-r9. There have been no unusual syslog or dmesg entries before or during the time the hanging processes appear. Usually only KDE processes hang in disk state. Only once did I see a login process hanging in disk state. I cannot tell if this happens with other filesystems because all my systems are on XFS.

If you need more info please tell me.

Reproducible: Sometimes

Steps to Reproduce:
1. Install gentoo-sources-2.6.23-r3 on a system with XFS filesystems.
2. Run IO intensive applications which do many small IO actions in a short time, e.g. Konqueror or Firefox on web pages with many images
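
The load in step 2 could be roughly approximated without a browser. The sketch below is an assumption about what such a test might look like, not a confirmed reproducer for this bug: it writes many small files with an fsync after each, reads them back, and deletes them, loosely imitating a browser cache. The function name, file count and file size are arbitrary illustration values, and the read-back will mostly be served from the page cache.

```python
import os
import tempfile


def small_file_churn(directory, count=1000, size=4096):
    """Write, sync, read back and delete `count` small files.

    Returns the total number of bytes read back.
    """
    payload = os.urandom(size)
    paths = [os.path.join(directory, "obj%05d" % i) for i in range(count)]

    # Many small synchronous writes, one file each.
    for path in paths:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())

    # Read every file back, then remove it.
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
        os.remove(path)
    return total


if __name__ == "__main__":
    # Create the scratch directory under the current directory so the
    # test runs on whatever filesystem it is started from (e.g. XFS).
    with tempfile.TemporaryDirectory(dir=".") as scratch:
        print(small_file_churn(scratch))
```

Starting it from a directory on the XFS partition in question, possibly several instances in parallel, keeps the IO on the filesystem being tested.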
Actual Results:  
Sometimes the web browser gets stuck in disk state, other processes supporting the desktop follow, until the whole desktop stops responding.

Expected Results:  
Should work without problems as it did before gentoo-sources-2.6.23 series.

System partitions on XFS, /boot on reiserfs3, /home on NFSv4 on first machine, /home on XFS on second machine. Firefox and Konqueror caches are symlinked to local mounts (XFS). I suppose NFS has nothing to do with this. No unusual dmesg or syslog entries appear.
Comment 1 Kai Krakow 2008-01-26 20:27:08 UTC
Trying the latest gentoo-sources-2.6.23-r6 changes the behaviour but the system still becomes unresponsive. Processes now aren't shown to be in disk state when I look at ps or top but the system still comes to a full stop during intensive IO operations (not bulk transfers) as described before.

Maybe this helps track the problem down to some of the patches that changed between r3 and r6.
Comment 2 Mike Pagano gentoo-dev 2008-02-24 22:02:21 UTC
A similar problem was discussed extensively on the upstream mailing lists for 2.6.23.X.

Can you test with gentoo-sources-2.6.24-r2 and if the problem persists, please test with the latest development kernel which is 2.6.25-rc2 as of this comment.
Comment 3 Kai Krakow 2008-02-26 15:17:06 UTC
(In reply to comment #2)
> A similar problem was discussed extensively on the upstream mailing lists for
> 2.6.23.X.

Do you have a link so I can do some reading? Would be nice... ;-)

> Can you test with gentoo-sources-2.6.24-r2 and if the problem persists, please
> test with the latest development kernel which is 2.6.25-rc2 as of this comment.

I will try the 24 version then and let you know.
Comment 4 Kai Krakow 2008-03-02 05:40:18 UTC
> Can you test with gentoo-sources-2.6.24-r2

I have tested with 2.6.24-r3. The problem persists but the symptoms changed: Instead of "top" showing processes in disk state, these processes now just hang in sleep state. I did not try to kill the processes. First konqueror stopped responding, later kdesktop stopped responding. After switching to tty and back to X, even X stopped responding (no more ctrl+alt+f# switching to console possible). I did Alt+SysRq+S,U,B then.

> and if the problem persists, please
> test with the latest development kernel which is 2.6.25-rc2 as of this comment.

I suppose I need to use vanilla sources for that? Any gentoo specific patches I should apply first?

Comment 5 Mike Pagano gentoo-dev 2008-03-10 15:29:48 UTC
No specific gentoo patches to apply, just always grab the latest development kernel for testing which is 2.6.25-rc5 at this point.
Comment 6 Kai Krakow 2008-03-22 19:33:12 UTC
I haven't tried 2.6.25 yet. But I have been running 2.6.24 pretty successfully since I switched from the CFQ to the AS IO scheduler. At least I have had no lockups since. But I admit I haven't been using my system very intensively since then.
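
For readers who want to try the same scheduler switch: on 2.6 kernels of this era the IO scheduler can be inspected and changed per block device at runtime through /sys/block/<device>/queue/scheduler, where the active scheduler is shown in square brackets and AS is listed as "anticipatory". The sketch below is only an illustration under those assumptions; the helper names and the device name "sda" are hypothetical, and it is not necessarily how the switch was done in this report.

```python
def active_scheduler(sysfs_text):
    """Return the scheduler marked with [brackets], or None if none is."""
    # The file looks like: "noop anticipatory deadline [cfq]"
    for name in sysfs_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


def scheduler_path(device):
    """Path of the per-device scheduler control file in sysfs."""
    return "/sys/block/%s/queue/scheduler" % device


def read_scheduler(device):
    """Read the currently active IO scheduler for `device`."""
    with open(scheduler_path(device)) as f:
        return active_scheduler(f.read())


def set_scheduler(device, name):
    """Select an IO scheduler for `device`; requires root."""
    with open(scheduler_path(device), "w") as f:
        f.write(name)
```

As root, `set_scheduler("sda", "anticipatory")` followed by `read_scheduler("sda")` should then report "anticipatory", provided that scheduler is built into the kernel. The change does not survive a reboot.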
Comment 7 Mike Pagano gentoo-dev 2008-03-23 13:25:48 UTC
Thanks for the update. Let us know what the results are if you have a chance to try a more intensive test.
Comment 8 Kai Krakow 2008-04-02 07:39:50 UTC
Using AS instead of CFQ seemed to completely fix it on my setup with 2.6.24. I assume this is related to changes in CFQ not playing well with NFS/XFS setups. I will try CFQ again when gentoo-sources updates to 2.6.25 stable...
Comment 9 Mike Pagano gentoo-dev 2008-04-25 00:59:10 UTC
(In reply to comment #8)
> I will try CFQ again when gentoo-sources updates to 2.6.25 stable...
> 

Have you had a chance to test with the latest gentoo-sources-2.6.25-rX release?
Comment 10 Daniel Drake (RETIRED) gentoo-dev 2008-05-10 16:34:41 UTC
Please reopen with the results if you're interested in pursuing this further.