I just had a situation where emerging neon hung while running configure. Examining the running processes, I identified "uname -p" as the culprit. Calling it manually hung as well, as did attaching strace to an already running instance. Sending SIGKILL did not work either. Running "strace uname -p" showed me that it hung while reading from /proc/cpuinfo. "cat /proc/cpuinfo" hung as well. Rebooting the system solved the problem. I don't know how, or even if, this can be reproduced. An ugly bug, that one.

I'm using sys-kernel/gentoo-sources-2.6.17-r1, and after reboot my cpuinfo looks like this:

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 3
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 4
cpu MHz : 3000.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
bogomips : 6026.10

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 3
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 4
cpu MHz : 3000.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc pni monitor ds_cpl cid xtpr
bogomips : 6020.50

# uname -a
Linux server 2.6.17-gentoo-r1 #1 SMP Thu Jul 6 10:19:21 CEST 2006 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel GNU/Linux

I'm using SMT (CONFIG_SMP=y and CONFIG_X86_HT=y) and cpufreqd (CONFIG_X86_P4_CLOCKMOD=y). I don't know if either one has anything to do with this issue.
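A process that ignores SIGKILL and cannot be attached to with strace is typically in uninterruptible sleep (state "D"). The report does not say which state the hung "uname -p" was in, so that is an assumption here. A minimal sketch for listing such processes by reading /proc/<pid>/stat follows; the function names are made up for illustration.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so use the first '(' and the last ')' as delimiters.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def find_uninterruptible(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state 'D'."""
    stuck = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return stuck  # no procfs available at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unparsable
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in find_uninterruptible():
        print(pid, comm)
```

A process that shows up in this list for a short moment is normal (disk I/O, for example). Only one that stays there across repeated runs would match the hang described above.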
It just happened again. I configured some package several times, and suddenly it got stuck reading cpuinfo. So this problem was not there from the start, but rather occurred sometime while the system was up. Connecting to cpufreqd fails as well:

$ cpufreqd-get
socket I'll try to connect: /tmp/cpufreqd-6BRIMP/cpufreqd

And killing it is no better: be it SIGTERM or SIGKILL, the process stays there. Attaching strace to it hangs as well, as with the uname above.

Because I was asked about my preemption settings on IRC today:

# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y

I will disable CONFIG_PREEMPT_BKL for the time being and see if that helps.
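Since the hang appears only some time after boot, it can help to probe /proc/cpuinfo without risking the probing shell itself. The following is a rough sketch, not something used in this report: it lets a child "cat" do the read and gives up after a timeout. The function name and the 5 second default are arbitrary choices. The child is deliberately neither killed nor waited for after a timeout, because a reader stuck in the kernel would not react to signals and waiting for it would block the caller too.

```python
import subprocess


def read_completes(path="/proc/cpuinfo", timeout=5.0):
    """Return True if 'cat path' finishes successfully within timeout seconds."""
    reader = subprocess.Popen(
        ["cat", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        reader.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not kill or reap the child here: if it is stuck inside the
        # kernel, it ignores signals and a blocking wait would never return.
        return False
    return reader.returncode == 0


if __name__ == "__main__":
    print("cpuinfo readable:", read_completes())
```

Run periodically (from cron, for example), this would give a rough time window for when the hang first appears, which could then be matched against the kernel log.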
Please attach dmesg output from after the hang occurs.
(In reply to comment #2)
> Please attach dmesg output from after the hang occurs.

I did not save the dmesg, but the last event was hours before the hang, and there was nothing related.
And what was the last event?
(In reply to comment #4)
> And what was the last event?

Some message about an unknown PS/2 mouse after I last used my KVM switch. But accessing /proc/cpuinfo definitely worked several times after that.

Oh, I have this line in my logs:

Jul 9 04:24:22 server kernel: [4358097.856000] logips2pp: Detected unknown logitech mouse model 94

I rebooted my system at 12:36, which in my estimation was less than an hour after the bug occurred. So there were about 7 hours without any kernel messages.
Ok, in that case just post a dmesg from a clean boot please.
Created attachment 91316 [details]
Boot messages of my 2.6.17-r1 kernel

(In reply to comment #6)
> Ok, in that case just post a dmesg from a clean boot please.

I grabbed the messages from my kern.log. It is my understanding that they contain all dmesg contents as well. This log is from the latest boot before comment #1, that is, a boot after which the error occurred. In the meantime I've switched to 2.6.17-r2 and disabled CONFIG_PREEMPT_BKL, so if I rebooted now and grabbed the dmesg contents, things would be slightly different.
Created attachment 91317 [details]
Boot messages of my 2.6.17-r1 kernel

(In reply to comment #7)
> I grabbed the messages from my kern.log.

Lost a few lines in the process, sorry about that.
Please post dmesg output even if it is from a slightly different kernel. syslog often misses stuff...
Also you need to reproduce this on a clean kernel (i.e. no fritz stuff, not even loaded then unloaded: must be completely untainted)
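Whether a running kernel counts as tainted can be read from /proc/sys/kernel/tainted: the value is 0 for an untainted kernel and a nonzero bitmask otherwise. A small sketch of that check follows; the helper names are only illustrative.

```python
def is_tainted(raw_value):
    """Interpret the text read from /proc/sys/kernel/tainted."""
    # The file holds a decimal bitmask; 0 means no taint flag has been set
    # since boot. Unloading a tainting module does not reset it.
    return int(raw_value.strip()) != 0


def read_taint(path="/proc/sys/kernel/tainted"):
    """Return the raw contents of the taint file."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    print("tainted" if is_tainted(read_taint()) else "untainted")
```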
(In reply to comment #9)
> Please post dmesg output even if it is from a slightly different kernel.
> syslog often misses stuff...

Currently the dmesg contents are incomplete; there have been too many messages since the last boot. And I don't want to reboot my system just now: I'm accessing it remotely, and if anything goes wrong I'd probably have to drive there just to fix it. I'll create this log as soon as I manage to, probably around the end of the week.

(In reply to comment #10)
> Also you need to reproduce this on a clean kernel (i.e. no fritz stuff, not
> even loaded then unloaded: must be completely untainted)

I'll see if I can get capisuite working with mISDN as well. If so, I'll switch and have a clean kernel. Otherwise this will have to wait even longer, until I find some time when I can do without my capisuite answering machine.
(In reply to comment #11)
> Currently the dmesg contents is incomplete, too many messages since the last
> boot.

The dmesg from when you last booted the machine is stored in /var/log/dmesg - you can attach that :)
In that case I'll close this for now, please reopen once you have reproduced on a clean kernel and provided the extra info. At that point you'd need to test the latest development kernel too, which is currently 2.6.18-rc1.
Created attachment 92068 [details]
dmesg on 2.6.17-gentoo-r2

(In reply to comment #12)
> The dmesg from when you last booted the machine is stored in /var/log/dmesg -
> you can attach that :)

Thanks! It just happened again. I got that file and combined it with the current dmesg; they had enough overlap. This is the kernel from my comment #7, 2.6.17-r2 with CONFIG_PREEMPT_BKL disabled. I'm now installing vanilla sources and will boot them without fritzcapi.
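How the two logs were combined is not stated above. One way to do it, as a sketch: treat /var/log/dmesg as the older part and the current dmesg output as the newer part, find the longest run of lines that ends the older log and also starts the newer one, and keep that run only once. The function name is hypothetical.

```python
def merge_overlapping(older, newer):
    """Join two lists of log lines, dropping the overlap between them."""
    max_overlap = min(len(older), len(newer))
    # Try the longest possible overlap first so repeated lines are not
    # mistaken for a shorter match.
    for size in range(max_overlap, 0, -1):
        if older[-size:] == newer[:size]:
            return older + newer[size:]
    # No overlap found: the ring buffer has wrapped past the end of the
    # boot log, so there is a gap between the two parts.
    return older + newer


if __name__ == "__main__":
    boot_log = ["line 1", "line 2", "line 3"]
    current = ["line 2", "line 3", "line 4"]
    print("\n".join(merge_overlapping(boot_log, current)))
```

If no overlap is found, the result is a plain concatenation, and it is worth marking the gap by hand before attaching the file.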
(In reply to comment #13)
> At that point you'd need to test the latest development kernel too

2.6.18-rc2 does not work for me. I just filed bug 141015 about this.

(In reply to comment #10)
> Also you need to reproduce this on a clean kernel (i.e. no fritz stuff, not
> even loaded then unloaded: must be completely untainted)

At least I have capisuite working with mISDN instead of fritzcapi, so my kernel should be untainted, although it still uses modules that are not from the main kernel source tree.
Created attachment 92636 [details]
dmesg on 2.6.17-gentoo-r2 untainted

(In reply to comment #10)
> Also you need to reproduce this on a clean kernel (i.e. no fritz stuff, not
> even loaded then unloaded: must be completely untainted)

OK, it just happened again, with mISDN instead of fritzcapi this time.

(In reply to comment #13)
> In that case I'll close this for now, please reopen once you have reproduced
> on a clean kernel and provided the extra info.

Reproduced with a clean, i.e. untainted, kernel. I merged /var/log/dmesg and the current dmesg to build the attached file.

> At that point

What point? Before or after reopening? Reopening is done in an instant, but trying to reproduce the bug may take months, and actually reproducing it certainly won't coincide with me reopening this bug.

> you'd need to test the latest development kernel too.

As 2.6.18-rc2 still does not work for me, that is out of the question for now. I could try the 2.6.17.6 vanilla sources, if that is any help.
As this is about an ancient kernel using ancient drivers, I seem to be the only one who was ever affected, and I don't encounter it any more, I'm changing the resolution from NEEDINFO to OBSOLETE.