Bug 335329 - sys-kernel/gentoo-sources-2.6.35-r4 hangs periodically
Summary: sys-kernel/gentoo-sources-2.6.35-r4 hangs periodically
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-30 21:19 UTC by Marcus Becker
Modified: 2010-09-29 18:50 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge_info.log, 3.72 KB, text/plain), 2010-08-30 21:21 UTC, Marcus Becker
/var/log/messages (messages.log, 1.21 MB, text/plain), 2010-08-30 21:23 UTC, Marcus Becker
/var/log/Xorg.0.log.old (Xorg.0.log.old.log, 29.65 KB, text/plain), 2010-08-30 21:23 UTC, Marcus Becker
genkernel config, the same I used for 2.6.35 (kernel-config-x86-2.6.34-gentoo-r6, 82.21 KB, text/plain), 2010-08-30 21:26 UTC, Marcus Becker
kernel config for 2.6.35 (same that works fine in 2.6.34) (kernel-config-x86-2.6.35-gentoo-r4, 84.25 KB, text/plain), 2010-09-07 19:15 UTC, Marcus Becker
manual configuration for 2.6.35 very tidy (config-2.6.35, 70.53 KB, text/plain), 2010-09-11 11:20 UTC, Marcus Becker
dmesg while running after boot and the laptop attached to the docking station. (dmesg_running.txt, 69.19 KB, text/plain), 2010-09-11 11:23 UTC, Marcus Becker
lsmod (lsmod.txt, 449 bytes, text/plain), 2010-09-12 16:05 UTC, Marcus Becker
/var/log/messages with 2.6.35-r9 (messages_2.6.35-r9.txt, 8.27 KB, text/plain), 2010-09-29 18:46 UTC, Marcus Becker

Description Marcus Becker 2010-08-30 21:19:52 UTC
I installed gentoo-sources, compiled the kernel, and at first could not find a problem. The intelfb driver for X works OK but produces a lot of error messages in Xorg.0.log. This might be the problem, but I am not sure.

About every 5-10 seconds the system hangs for about 0.5 seconds. I notice it when playing any media on the system. The same happens if I keep moving the mouse in circles over the desktop: after 5-10 seconds it hangs briefly, and then I can move it again.

Since I mostly noticed it with media such as video, audio, or YouTube, I tried to improve the responsiveness of the kernel: I tried a tickless and a non-tickless system, increased and decreased the timer frequency, tried other parameters, and built with and without genkernel. There are no fatal errors in the kernel logs (except for problems with dhcpcd, which I saw in another bug report). I attached the messages from yesterday until today, covering several reboots :)

Finally, to make sure that the kernel is the cause, I masked 2.6.35, installed 2.6.34, compiled it, and booted: no problem.

Reproducible: Always

Steps to Reproduce:
1. use kernel 2.6.35-r4
2. play any media file or move the mouse, type something etc.

Actual Results:  
hangs for half a second every now and then


Expected Results:  
smooth playback and no interruptions

I am attaching emerge --info, /var/log/messages, and Xorg.0.log.old.
Comment 1 Marcus Becker 2010-08-30 21:21:54 UTC
Created attachment 245409 [details]
emerge --info
Comment 2 Marcus Becker 2010-08-30 21:23:10 UTC
Created attachment 245410 [details]
/var/log/messages
Comment 3 Marcus Becker 2010-08-30 21:23:45 UTC
Created attachment 245411 [details]
/var/log/Xorg.0.log.old
Comment 4 Marcus Becker 2010-08-30 21:26:38 UTC
Created attachment 245412 [details]
genkernel config, the same I used for 2.6.35
Comment 5 Marcus Becker 2010-09-01 16:19:48 UTC
If you want, I could disable all the cpuidle options and see whether they cause the problem.
Comment 6 Marcus Becker 2010-09-07 19:15:14 UTC
Created attachment 246389 [details]
kernel config for 2.6.35 (same that works fine in 2.6.34)
Comment 7 Marcus Becker 2010-09-07 19:20:06 UTC
I made a little script:
#!/bin/bash

for i in {1..50}
do
        date
        sleep .5
done

The script hung between the first and the second 20:16:01 line, but it continues as if no time had been lost :)

Tue Sep  7 20:15:59 BST 2010
Tue Sep  7 20:15:59 BST 2010
Tue Sep  7 20:16:00 BST 2010
Tue Sep  7 20:16:01 BST 2010 <<-- hung here for a moment
Tue Sep  7 20:16:01 BST 2010
Tue Sep  7 20:16:02 BST 2010
Tue Sep  7 20:16:02 BST 2010
Tue Sep  7 20:16:03 BST 2010

There are no entries at all in /var/log/messages during this time.
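
Because date only prints whole seconds, a half-second stall is hard to see in the output above. A minimal sketch of a variant that measures each iteration and reports only the ones that overran, assuming GNU date (for the %N nanosecond format) and a sleep that accepts fractional seconds; the function names and the 200 ms threshold in the usage line are illustrative, not part of the original script. It relies on the wall clock, so if the clock itself stops during a hang (which the output above may hint at), the stall will not show up here either.

```shell
# now_ms: current wall-clock time in milliseconds (GNU date assumed for %N).
now_ms() {
    echo $(( $(date +%s%N) / 1000000 ))
}

# overrun_ms START END EXPECTED: milliseconds by which the interval
# START..END exceeded EXPECTED; prints 0 if it did not overrun.
overrun_ms() {
    over=$(( $2 - $1 - $3 ))
    if [ "$over" -gt 0 ]; then echo "$over"; else echo 0; fi
}

# watch_stalls COUNT THRESHOLD_MS: sleep 0.5 s COUNT times and print one
# line for every iteration that overran by more than THRESHOLD_MS.
watch_stalls() {
    count=$1
    threshold=$2
    prev=$(now_ms)
    i=0
    while [ "$i" -lt "$count" ]; do
        sleep 0.5
        cur=$(now_ms)
        over=$(overrun_ms "$prev" "$cur" 500)
        if [ "$over" -gt "$threshold" ]; then
            echo "$(date '+%T') stall: +${over} ms"
        fi
        prev=$cur
        i=$((i + 1))
    done
}
```

For example, `watch_stalls 50 200` runs for roughly 25 seconds and stays silent unless an iteration took more than 0.7 seconds in total.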
Comment 8 Mike Pagano gentoo-dev 2010-09-07 23:23:24 UTC
System hangs are some of the most difficult problems to troubleshoot.
It could be a hardware misconfiguration.

I always start with a clean .config file when upgrading a major version. Please try this; you can use http://kmuto.jp/debian/hcl/ to help you determine the drivers you need.

When you run your bash script, is that in a terminal with no X running? If not, can you tell me if it hangs without X?

Is there anything interesting in dmesg after the hang? Can I see that, please?


Comment 9 Marcus Becker 2010-09-08 07:11:39 UTC
(In reply to comment #8)

With the bash script, the same happens on tty2 and in xterm with 2.6.35.

I thought I would use a rather generic genkernel configuration for the upgrade. The weird thing is that the exact same config works fine in 2.6.34. I have run Linux on this laptop ever since I got it and never had this problem.
Currently both kernels are installed: if I boot into 2.6.35 it hangs, and if I boot into 2.6.34 it doesn't. Nothing was changed in the BIOS or in the hardware.

I'll do a manual configuration again when I am home and post my config and dmesg (in ~12 h; I have to go to Oxford later :().
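
Since the same configuration behaves differently on the two kernels, one low-effort check is to compare the .config files that each build actually ended up with, because options that are new in 2.6.35 get filled in with defaults. A minimal sketch, assuming both files are available locally; the function name diff_kconfig and the paths in the usage line are illustrative assumptions, not taken from this report.

```shell
# diff_kconfig OLD NEW: show kernel options that differ between two
# .config files. Only "CONFIG_...=..." and "# CONFIG_... is not set"
# lines are compared; both sides are sorted so ordering does not matter.
diff_kconfig() {
    old_sorted=$(mktemp) || return 2
    new_sorted=$(mktemp) || return 2
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | sort > "$old_sorted"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | sort > "$new_sorted"
    status=0
    # diff exits with 1 when the files differ; keep that as our result.
    diff "$old_sorted" "$new_sorted" || status=$?
    rm -f "$old_sorted" "$new_sorted"
    return "$status"
}
```

Usage might look like `diff_kconfig /usr/src/linux-2.6.34-gentoo-r6/.config /usr/src/linux-2.6.35-gentoo-r4/.config`; lines starting with `<` come from the first file and lines starting with `>` from the second.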
Comment 10 Marcus Becker 2010-09-11 11:20:38 UTC
Created attachment 246815 [details]
manual configuration for 2.6.35 very tidy

Still the same hanging issue. I am also attaching the dmesg taken after booting and running the script.
For dhcpcd to work, I have to kill it once or twice and try again. Eventually it picks up an IP address.
Comment 11 Marcus Becker 2010-09-11 11:23:01 UTC
Created attachment 246817 [details]
dmesg while running after boot and the laptop attached to the docking station.
Comment 12 Mike Pagano gentoo-dev 2010-09-12 13:26:13 UTC
Can you attach the output of lsmod, please?
Comment 13 Marcus Becker 2010-09-12 16:05:40 UTC
Created attachment 246985 [details]
lsmod
Comment 14 Marcus Becker 2010-09-13 08:50:39 UTC
I am wondering whether I am the only one having this problem. This afternoon I'll try removing all the ACPI and CPU scheduling options. As I read it, this is one of the "improvements" in the new kernel...
CONFIG_USE_GENERIC_SMP_HELPERS=y
CONFIG_X86_32_SMP=y
Could this be a problem on a single-core CPU from now on?
"x86, apic: Map the local apic when parsing the MP table."
http://www.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.35.4
Comment 15 Marcus Becker 2010-09-13 09:12:48 UTC
Here is a guy with the exact same CPU as I have:
https://bugzilla.kernel.org/show_bug.cgi?id=17411

For me the system doesn't hang totally, just every few seconds for about 0.5 seconds, which makes playing media or using the mouse a pain.
Could it really be the SMP setting? I'll check that later.
Comment 16 Marcus Becker 2010-09-13 19:19:04 UTC
Nope, still the same problem without SMP enabled.
If I disable all of ACPI and cpufreq scaling (yes, like an old AT machine), I still have the problem.
Were there any changes to the scheduler in 2.6.35?
Comment 17 Marcus Becker 2010-09-13 20:42:10 UTC
tried several different settings again:
tickless system
frequencies (1000 and 300)
low latency desktop, desktop, server
cfq, deadline and noop
slab and unallocated
pentium-pro, pentium-m and generic x86

If you have any more ideas, please let me know...
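
When cycling through this many settings, it helps to confirm which ones a given build actually contains. A minimal sketch that pulls the relevant lines out of a .config; the function name show_latency_options is illustrative, and the option list is an assumption about which Kconfig symbols correspond to the settings listed above (tickless, timer frequency, preemption model, I/O scheduler, slab allocator, processor family).

```shell
# show_latency_options CONFIG_FILE: print the lines of a kernel .config
# that correspond to the settings tried above, whether set or
# "is not set". The trailing [= ] keeps longer names such as
# CONFIG_SLUB_DEBUG from matching.
show_latency_options() {
    grep -E '^(# )?CONFIG_(NO_HZ|HZ(_[0-9]+)?|PREEMPT(_NONE|_VOLUNTARY)?|DEFAULT_IOSCHED|IOSCHED_[A-Z]+|SLAB|SLUB|SMP|M686|MPENTIUMM|X86_GENERIC)[= ]' "$1"
}
```

For example, `show_latency_options /usr/src/linux/.config`, assuming /usr/src/linux points at the kernel tree that was built.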
Comment 18 Mike Pagano gentoo-dev 2010-09-29 14:32:44 UTC
The upstream bug reporter notes that he cannot reproduce the issue with later versions of 2.6.35.x.

Can you please test with the latest version, which is gentoo-sources-2.6.35-r9 at the time of this comment?
Comment 19 Marcus Becker 2010-09-29 18:46:38 UTC
Created attachment 248997 [details]
/var/log/messages with 2.6.35-r9

The date.sh script now runs without interruptions!

This is the log from while I was playing a movie. Playback still hangs, but with more time in between (feels like ~20-30 s).
The interruptions are longer and worse (about 1 s).
The kernel was compiled using genkernel with mostly default options.

Conclusion: the hanging is now caused not by the system but rather by the dhcpcd problem? Maybe every time dhcpcd kicks in, it lags X + mouse + video?

So no interruptions on the console anymore :)
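
One way to test the dhcpcd hypothesis is to see when dhcpcd was active and compare those times with the moments playback stalled. A minimal sketch, assuming the traditional syslog line format in which the first three fields are month, day, and time; the function name dhcpcd_events is illustrative.

```shell
# dhcpcd_events LOGFILE: count dhcpcd log lines per timestamp (one-second
# resolution), so bursts of activity can be compared with observed stalls.
# Assumes lines start with "Mon DD HH:MM:SS".
dhcpcd_events() {
    grep 'dhcpcd' "$1" | awk '{ print $1, $2, $3 }' | uniq -c
}
```

For example, `dhcpcd_events /var/log/messages | tail` shows the most recent bursts; if they recur at roughly the same 20-30 second spacing as the stalls, that would support the theory.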
Comment 20 Marcus Becker 2010-09-29 18:50:32 UTC
Got dhcpcd working by killing it a few times and restarting it.
No interruptions anymore...