During operation I get occasional softlocks. The system hangs completely for one or two minutes and then returns to normal. Could anybody please help me? Is my hardware buggy? Should I replace the mainboard? Are there any BIOS settings I should change? I have AHCI, ALPE and ASP enabled. If you need more information, please let me know.

regards
Bjoern

Reproducible: Always

Steps to Reproduce:
1. Power on the PC.
2. Use it or leave it idle.
3. The softlock WILL occur in any case (normally only once per boot, but today I had 2 softlocks).

Actual Results:
Collection of softlocks:

Softlock Nr.1 (08.03.2007)
BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ> [<ffffffff80254eb0>] softlockup_tick+0xda/0xf5
 [<ffffffff8023a7da>] update_process_times+0x42/0x68
 [<ffffffff80216bb1>] smp_local_timer_interrupt+0x34/0x52
 [<ffffffff80217285>] smp_apic_timer_interrupt+0x44/0x5f
 [<ffffffff8020a016>] apic_timer_interrupt+0x66/0x70
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802562e3>] handle_edge_irq+0xe4/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80236a21>] __do_softirq+0x3e/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff802084ab>] mwait_idle+0x0/0x45
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI> [<ffffffff8024164c>] worker_thread+0x0/0x14a
 [<ffffffff802084ed>] mwait_idle+0x42/0x45
 [<ffffffff8020842f>] cpu_idle+0x88/0xbf
 [<ffffffff807bb762>] start_kernel+0x2a2/0x2ae
 [<ffffffff807bb15c>] _sinittext+0x15c/0x160

irq 7: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff80255b53>] __report_bad_irq+0x30/0x72
 [<ffffffff80255d53>] note_interrupt+0x1be/0x203
 [<ffffffff80256824>] handle_level_irq+0xbc/0xf4
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff8025680f>] handle_level_irq+0xa7/0xf4
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802562e3>] handle_edge_irq+0xe4/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802562e3>] handle_edge_irq+0xe4/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80236a21>] __do_softirq+0x3e/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff802084ab>] mwait_idle+0x0/0x45
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI> [<ffffffff8024164c>] worker_thread+0x0/0x14a
 [<ffffffff802084ed>] mwait_idle+0x42/0x45
 [<ffffffff8020842f>] cpu_idle+0x88/0xbf
 [<ffffffff807bb762>] start_kernel+0x2a2/0x2ae
 [<ffffffff807bb15c>] _sinittext+0x15c/0x160
handlers:
[<ffffffff80487893>] (usb_hcd_irq+0x0/0x52)
Disabling IRQ #7

Expected Results:
System should operate smoothly.
Hardware Environment:
00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub (rev c0)
00:01.0 PCI bridge: Intel Corporation 82975X PCI Express Root Port (rev c0)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 SATA controller: Intel Corporation 82801GR/GH (ICH7 Family) SATA AHCI Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:00.0 Multimedia audio controller: Creative Labs SB X-Fi
01:01.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
01:01.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
01:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
02:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
02:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 20)
05:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce 7600 GT] (rev a1)
Softlock Nr.4 2007.03.13 19:00:14
BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ> [<ffffffff80254eb0>] softlockup_tick+0xda/0xf5
 [<ffffffff8023a7da>] update_process_times+0x42/0x68
 [<ffffffff80216bb1>] smp_local_timer_interrupt+0x34/0x52
 [<ffffffff80217285>] smp_apic_timer_interrupt+0x44/0x5f
 [<ffffffff8020a016>] apic_timer_interrupt+0x66/0x70
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802562e3>] handle_edge_irq+0xe4/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff8025a992>] pfn_to_page+0x2e/0x36
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff802777fc>] kmem_cache_free+0x40/0x1b0
 [<ffffffff880dcdbd>] :sky2:sky2_tx_complete+0xc9/0x134
 [<ffffffff880deb8d>] :sky2:sky2_poll+0x76f/0x920
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff804d8b12>] net_rx_action+0xae/0x1b7
 [<ffffffff80236a2c>] __do_softirq+0x49/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI>

irq 1275: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff80255b53>] __report_bad_irq+0x30/0x72
 [<ffffffff80255d53>] note_interrupt+0x1be/0x203
 [<ffffffff802562f8>] handle_edge_irq+0xf9/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff8025a992>] pfn_to_page+0x2e/0x36
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff802777fc>] kmem_cache_free+0x40/0x1b0
 [<ffffffff880dcdbd>] :sky2:sky2_tx_complete+0xc9/0x134
 [<ffffffff880deb8d>] :sky2:sky2_poll+0x76f/0x920
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff804d8b12>] net_rx_action+0xae/0x1b7
 [<ffffffff80236a2c>] __do_softirq+0x49/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI>
handlers:
[<ffffffff8045b039>] (ahci_interrupt+0x0/0x45a)
Disabling IRQ #1275

Softlock Nr.5 2007.03.16 17:00:14
BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ> [<ffffffff80254eb0>] softlockup_tick+0xda/0xf5
 [<ffffffff8023a7da>] update_process_times+0x42/0x68
 [<ffffffff80216bb1>] smp_local_timer_interrupt+0x34/0x52
 [<ffffffff80217285>] smp_apic_timer_interrupt+0x44/0x5f
 [<ffffffff8020a016>] apic_timer_interrupt+0x66/0x70
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802562e3>] handle_edge_irq+0xe4/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff804cc8a0>] pci_conf1_read+0x0/0xc6
 [<ffffffff88128967>] :nvidia:_nv003350rm+0xf/0x10
 [<ffffffff8830f5af>] :nvidia:_nv007025rm+0x7d/0xb0
 [<ffffffff883fb35d>] :nvidia:_nv001501rm+0x29/0xfe
 [<ffffffff883f9bf6>] :nvidia:_nv001505rm+0x2e/0x4e
 [<ffffffff883f9dd1>] :nvidia:_nv001511rm+0x39/0x52
 [<ffffffff8824b21f>] :nvidia:_nv005982rm+0x3b/0x108
 [<ffffffff8841bb89>] :nvidia:_nv009292rm+0xe9/0x658
 [<ffffffff88115032>] :nvidia:_nv003618rm+0xe/0xdc
 [<ffffffff882623c8>] :nvidia:_nv009294rm+0x50/0x64
 [<ffffffff8838a555>] :nvidia:_nv004932rm+0x165/0x4fa
 [<ffffffff8838097f>] :nvidia:_nv004943rm+0x8b/0xd2
 [<ffffffff8812afaf>] :nvidia:_nv002554rm+0x99/0xbe
 [<ffffffff8812fde1>] :nvidia:rm_isr_bh+0x53/0x56
 [<ffffffff88433377>] :nvidia:nv_kern_isr_bh+0x16/0x18
 [<ffffffff80236af8>] tasklet_action+0x53/0x9d
 [<ffffffff80236a2c>] __do_softirq+0x49/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI>

irq 14: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff80255b53>] __report_bad_irq+0x30/0x72
 [<ffffffff80255d53>] note_interrupt+0x1be/0x203
 [<ffffffff802562f8>] handle_edge_irq+0xf9/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff804cc8a0>] pci_conf1_read+0x0/0xc6
 [<ffffffff88128967>] :nvidia:_nv003350rm+0xf/0x10
 [<ffffffff8830f5af>] :nvidia:_nv007025rm+0x7d/0xb0
 [<ffffffff883fb35d>] :nvidia:_nv001501rm+0x29/0xfe
 [<ffffffff883f9bf6>] :nvidia:_nv001505rm+0x2e/0x4e
 [<ffffffff883f9dd1>] :nvidia:_nv001511rm+0x39/0x52
 [<ffffffff8824b21f>] :nvidia:_nv005982rm+0x3b/0x108
 [<ffffffff8841bb89>] :nvidia:_nv009292rm+0xe9/0x658
 [<ffffffff88115032>] :nvidia:_nv003618rm+0xe/0xdc
 [<ffffffff882623c8>] :nvidia:_nv009294rm+0x50/0x64
 [<ffffffff8838a555>] :nvidia:_nv004932rm+0x165/0x4fa
 [<ffffffff8838097f>] :nvidia:_nv004943rm+0x8b/0xd2
 [<ffffffff8812afaf>] :nvidia:_nv002554rm+0x99/0xbe
 [<ffffffff8812fde1>] :nvidia:rm_isr_bh+0x53/0x56
 [<ffffffff88433377>] :nvidia:nv_kern_isr_bh+0x16/0x18
 [<ffffffff80236af8>] tasklet_action+0x53/0x9d
 [<ffffffff80236a2c>] __do_softirq+0x49/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI>
handlers:
[<ffffffff80452fcc>] (ata_interrupt+0x0/0x206)
Disabling IRQ #14

Softlock Nr.6 2007.03.16 20:16:05
BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ> [<ffffffff80254eb0>] softlockup_tick+0xda/0xf5
 [<ffffffff8023a7da>] update_process_times+0x42/0x68
 [<ffffffff80216bb1>] smp_local_timer_interrupt+0x34/0x52
 [<ffffffff80217285>] smp_apic_timer_interrupt+0x44/0x5f
 [<ffffffff8020a016>] apic_timer_interrupt+0x66/0x70
 [<ffffffff80255171>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802562e3>] handle_edge_irq+0xe4/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80259b9e>] mempool_free_slab+0x0/0xe
 [<ffffffff80435d47>] scsi_put_command+0x48/0x61
 [<ffffffff80439370>] scsi_next_command+0x25/0x39
 [<ffffffff8043959a>] scsi_end_request+0xbb/0xc9
 [<ffffffff804396e5>] scsi_io_completion+0xec/0x2c9
 [<ffffffff8044798e>] sd_rw_intr+0x188/0x1b3
 [<ffffffff80580f1e>] _spin_unlock_irqrestore+0x16/0x31
 [<ffffffff80399dcc>] blk_done_softirq+0x5c/0x6a
 [<ffffffff80236a2c>] __do_softirq+0x49/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI>

irq 1275: nobody cared (try booting with the "irqpoll" option)

Call Trace:
 <IRQ> [<ffffffff80255b53>] __report_bad_irq+0x30/0x72
 [<ffffffff80255d53>] note_interrupt+0x1be/0x203
 [<ffffffff802562f8>] handle_edge_irq+0xf9/0x128
 [<ffffffff8020ba39>] do_IRQ+0xf1/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 [<ffffffff80259b9e>] mempool_free_slab+0x0/0xe
 [<ffffffff80435d47>] scsi_put_command+0x48/0x61
 [<ffffffff80439370>] scsi_next_command+0x25/0x39
 [<ffffffff8043959a>] scsi_end_request+0xbb/0xc9
 [<ffffffff804396e5>] scsi_io_completion+0xec/0x2c9
 [<ffffffff8044798e>] sd_rw_intr+0x188/0x1b3
 [<ffffffff80580f1e>] _spin_unlock_irqrestore+0x16/0x31
 [<ffffffff80399dcc>] blk_done_softirq+0x5c/0x6a
 [<ffffffff80236a2c>] __do_softirq+0x49/0xb8
 [<ffffffff8020a56c>] call_softirq+0x1c/0x28
 [<ffffffff8020b8f7>] do_softirq+0x2c/0x7d
 [<ffffffff802369d7>] irq_exit+0x36/0x42
 [<ffffffff8020ba85>] do_IRQ+0x13d/0x160
 [<ffffffff80209931>] ret_from_intr+0x0/0xa
 <EOI>
handlers:
[<ffffffff8045b039>] (ahci_interrupt+0x0/0x45a)
Disabling IRQ #1275
Erm, don't you think that telling us which kernel version(s) are affected would be important here?
Sorry, I had to remove the "emerge --info" output because the comment was too long, and then I forgot to add it back. Currently I am using vanilla 2.6.21-rc1, which fixes a bug with my NIC. All kernel versions from 2.6.19 to 2.6.21-rc1 are affected.

Here is emerge --info:

Portage 2.1.2.2 (default-linux/amd64/2006.1, gcc-4.1.1, glibc-2.5-r0, 2.6.21-rc1 x86_64)
=================================================================
System uname: 2.6.21-rc1 x86_64 Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
Gentoo Base System release 1.12.9
Timestamp of tree: Fri, 16 Mar 2007 17:50:01 +0000
dev-java/java-config: 1.3.7, 2.0.31
dev-lang/python: 2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.61
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils: 2.16.1-r3
sys-devel/gcc-config: 1.3.14
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.17-r2
ACCEPT_KEYWORDS="amd64"
AUTOCLEAN="ja"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=nocona -O2 -pipe -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c"
CXXFLAGS="-march=nocona -O2 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ ftp://ftp.easynet.nl/mirror/gentoo/ http://distfiles.gentoo.org http://www.ibiblio.org/pub/Linux/distributions/gentoo"
LANG="de_DE.utf8"
LC_ALL="de_DE.utf8"
LINGUAS="de sv"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /usr/portage/local/layman/xeffects"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X a52 aac aalib aiglx alsa amd64 ares asf bash-completion berkdb bitmap-fonts bittorrent bluetooth bzip2 cairo cdparanoia cli connectionstatus cpudetection cracklib crypt css cups curl dbus dga divx4linux dri dts dvd dvdr dvdread edl emovix encode exif fam fbcon ffmpeg flac fortran gdbm gif gimp glitz gnutls gpm gtk gtk2 hal highlight history iconv imagemagick isdnlog java jpeg jpeg2k kde libg++ lirc live lm_sensors logitech-mouse lzo mad madwifi matroska metalink midi modplug mp3 musepack musicbrainz mythtv ncurses network nfs nls nptl nptlonly nsplugin nvidia ogg openal opengl pam pam_console pcre pda pdf perl png ppds pppd python qt3 quicktime readline reflection rtc samba scanner sdl session sndfile spell spl ssl svg tcltk tcpd theora tiff tk transcode transparency truetype truetype-fonts type1-fonts unicode usb utempter v4l v4l2 vcd vorbis xine xinerama xml xorg xvid yahoo zlib"
ALSA_CARDS="hda-intel"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol"
ELIBC="glibc"
INPUT_DEVICES="keyboard mouse evdev"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LINGUAS="de sv"
LIRC_DEVICES="dvico"
USERLAND="GNU"
VIDEO_CARDS="nvidia vesa fbdev"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LDFLAGS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Reopen...
I tried without ALPE and ASP, but I still get softlocks. After disabling them I got a softlock targeting:

[<ffffffff80487893>] (usb_hcd_irq+0x0/0x52)
Disabling IRQ #18

I'll see whether the softlocks targeting ahci and ata still occur with ALPE and ASP turned off.

regards
Bjoern
Nothing changed after turning ALPE and ASP off, so now I am in a catch-22. Any ideas?

regards
Bjoern
Please reproduce this without the binary nvidia module loaded (and make sure it is not loaded at any point during that boot). Also post /proc/interrupts from before and after the soft lockup (if possible).
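A minimal sketch of how the two requested /proc/interrupts snapshots could be compared, assuming each was saved as plain text (for example with `cat /proc/interrupts > before.txt`). The function names `parse_interrupts` and `diff_interrupts` are hypothetical helpers written for this illustration, not part of any existing tool.

```python
def parse_interrupts(text):
    """Return {irq_label: total count summed over all CPUs}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                        # not an "IRQ: counts" row
        counts = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break                       # reached the chip/handler columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def diff_interrupts(before, after):
    """Return {irq_label: increase} for every counter that changed."""
    old = parse_interrupts(before)
    new = parse_interrupts(after)
    return {irq: new[irq] - old.get(irq, 0)
            for irq in new if new[irq] != old.get(irq, 0)}
```

Calling `diff_interrupts(open("before.txt").read(), open("after.txt").read())` lists only the counters that moved. A line whose count jumps far more than the others during the hang may be the one worth looking at first.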
I switched to the latest kernel, 2.6.21-rc5, and now it's hard to reproduce. I switched to that kernel because I noticed some changes to the jmicron driver in the changelog. The system now locks up completely, so I can't access any logs, nor can I read any error message. This happens both with and without the nvidia driver, but I can't guarantee that it's the same error.

Should I try to hunt the bug with the latest kernel, or should I switch back to the one where the system came back after a lock? Do you have any ideas how I can capture error messages other than sitting in front of the monitor and staring at a TTY where the error will be dumped?

I'll investigate further and post my results here. Suggestions on how to get debug output from such a crash would be welcome.

regards
blubbi
Stick with 2.6.21-rc5 for now. Does the hard hang appear randomly after some time, or what? Have you ever seen it happen while on the console without X running?
Okay, here's the error without the nvidia CS driver. All I can give you is a picture of the error: http://olausson.name/temp/IMG_4192.JPG

Any ideas what this error is about? I realized that I had ext4dev as the filesystem, so I switched back to ext3 and booted again with the nvidia CS driver, and the PC locked up again. Now I am going to try to get a "monitor shot" with ext3 and the NV driver.

Regards
blubbi
Here are my interrupts:

           CPU0       CPU1
   0:     75418          0   IO-APIC-edge      timer
   1:         2          0   IO-APIC-edge      i8042
   6:         3          0   IO-APIC-edge      floppy
   8:         1          0   IO-APIC-edge      rtc
   9:         0          0   IO-APIC-fasteoi   acpi
  12:         4          0   IO-APIC-edge      i8042
  14:       631          0   IO-APIC-edge      libata
  15:         0          0   IO-APIC-edge      libata
  16:         0          0   IO-APIC-fasteoi   libata
  17:      4178          0   IO-APIC-fasteoi   libata, uhci_hcd:usb3
  18:      9517          0   IO-APIC-fasteoi   uhci_hcd:usb4
  19:       262          0   IO-APIC-fasteoi   uhci_hcd:usb5, HDA Intel
  20:      1841          0   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
  21:         3          0   IO-APIC-fasteoi   ohci1394
  22:         3          0   IO-APIC-fasteoi   bttv0
  23:     17053          0   IO-APIC-fasteoi   wifi0
1274:       641          0   PCI-MSI-edge      eth0
1275:      8285          0   PCI-MSI-edge      libata
 NMI:         0          0
 LOC:     74001      73927
 ERR:         0
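Since the "nobody cared" messages name an IRQ line rather than a single device, it can help to see which lines in the listing above are shared by more than one handler. The following is a rough sketch only: `shared_irqs` is a hypothetical helper, and it assumes the column layout shown in this listing (per-CPU counts, one interrupt-type column, then a comma-separated handler list); other kernel versions may print additional columns.

```python
def shared_irqs(text):
    """Return {irq_label: [handlers]} for lines with more than one handler."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())            # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        # Skip the per-CPU counters and the single interrupt-type column
        # (assumed layout, e.g. "IO-APIC-fasteoi"); the remainder is the
        # comma-separated list of registered handlers.
        tail = " ".join(fields[ncpus + 1:])
        handlers = [h.strip() for h in tail.split(",") if h.strip()]
        if len(handlers) > 1:
            shared[label.strip()] = handlers
    return shared
```

Applied to the listing above, this reports IRQs 17, 19 and 20 as shared; IRQ 17, for instance, is used by both libata and uhci_hcd:usb3.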
Finally I got a screenshot of the whole bug output for you. Luckily I had a konsole open with some text in it. The system locked up and the keyboard was no longer functional; all I could use was my mouse. So I copied and pasted the letters for "dmesg > BUG" together with the mouse and pasted a whole selected line to send it off.

To be safe, I took a screenshot too. I got it the same way: again with the mouse, I collected all the letters for "cat BUG" and then took a screenshot of the output. And it was good that I did: after a reboot I couldn't find the BUG file. So here's the "monitor shot": http://olausson.name/temp/IMG_4193.JPG

This one is without the nvidia CS driver and without ext4dev.

regards
Bjoern
That's useful, thanks. It seems like an odd problem though, possibly a hardware issue. Can you run memtest for a few passes and check that it doesn't bring up any errors?
(In reply to comment #13)
> That's useful, thanks. It seems like an odd problem though, possibly a hardware
> issue. Can you run memtest for a few passes and check that it doesn't bring up
> any errors?

No problem, I'll run it tomorrow. Thanks for your help. I could provide remote access (ssh) if it would help.

regards
Bjoern
Memtest86+ has now run for 3.5 hours: 10 tests without any error. Do you want to access this nasty machine via ssh?

regards and thanks
Bjoern
What's your best guess?

1) Hardware error
2) Driver bug

If 1), I'll go and grab a new board.
If 2), I'll pray that it will be fixed ;-)

When I run WinXP, my CD-R drive (Plextor, PATA), which is connected to the jmicron controller, is not found. I have to use the Device Manager to "search for new hardware"; after that, the drive shows up in Explorer and in the Device Manager, and everything works fine.

Is there anything more I can do to help you?

regards
Bjoern
By the way, I posted the same bug on bugzilla.kernel.org ( http://bugzilla.kernel.org/show_bug.cgi?id=8259 ) but haven't received an answer so far. That is why I opened this bug in the Gentoo bug tracker. Maybe we should focus on one bug tracker, perhaps the kernel.org one? http://bugs.gentoo.org/show_bug.cgi?id=171185

regards
Bjoern
We'll watch the upstream bug, thanks.