I'm running gentoo-sources-2.6.14-r5 on an IBM 5664 Netfinity server with no problems until I try running Rosegarden. Every time I do, the system reboots after 30-60 seconds with the following error message in the kernel log:

Uhhuh. NMI received for unknown reason 35 on CPU 0.

The one unusual hardware component of the server is an M Audio Delta 1010 that has served me faithfully to date using Ardour. It may have something to do with the MIDI interfaces, since this is the first time I'm trying to do anything with MIDI. Note the device initialization phase of dmesg includes:

oprofile: using NMI interrupt.

I did an strace of rosegarden and it generated several megabytes of data. The messages rosegarden prints don't appear to reveal any glaring problems to my ignorant eyes. I searched the IBM website for non-maskable interrupt issues without success. As I bought the box on eBay, I do not have IBM software or documentation, nor can I afford support.

I would appreciate the assistance of the more knowledgeable among this list with determining the best direction to go with this problem. The full kernel log and dmesg files follow.

= = = = = = = = = = = = = = = = = = = = = = = = = = =
dlc@muse ~ $ sudo cat /var/log/kernel/current
Password:
Jan 16 01:12:00 [kernel] [4294667.296000] Linux version 2.6.14-gentoo-r5 (root@muse) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0,pie-8.7.8)) #1 SMP PREEMPT Mon Jan 16 01:00:29 EST 2006
Jan 16 01:12:02 [kernel] [ 40.966297] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Jan 16 01:12:30 [kernel] [ 68.762937] mtrr: Serverworks LE rev < 6 detected. Write-combining disabled.
Jan 16 01:12:30 [kernel] [ 68.763128] mtrr: your processor doesn't support write-combining
Jan 16 01:26:48 [kernel] [ 931.386133] Uhhuh. NMI received for unknown reason 35 on CPU 0.
Jan 16 01:29:59 [kernel] [4294667.296000] Linux version 2.6.14-gentoo-r5 (root@muse) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0,pie-8.7.8)) #1 SMP PREEMPT Mon Jan 16 01:00:29 EST 2006
Jan 16 01:30:01 [kernel] [ 41.265632] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Jan 16 01:30:16 [kernel] [ 56.039429] mtrr: Serverworks LE rev < 6 detected. Write-combining disabled.
dlc@muse ~ $
= = = = = = = = = = = = = = = = = = = = = = = = = = =
dlc@muse ~ $ dmesg
[4294667.296000] Linux version 2.6.14-gentoo-r5 (root@muse) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)) #1 SMP PREEMPT Mon Jan 16 01:00:29 EST 2006
[4294667.296000] BIOS-provided physical RAM map:
[4294667.296000] BIOS-e820: 0000000000000000 - 000000000009e000 (usable)
[4294667.296000] BIOS-e820: 000000000009e000 - 00000000000a0000 (reserved)
[4294667.296000] BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[4294667.296000] BIOS-e820: 0000000000100000 - 000000001fffb8c0 (usable)
[4294667.296000] BIOS-e820: 000000001fffb8c0 - 0000000020000000 (ACPI data)
[4294667.296000] BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
[4294667.296000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[4294667.296000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[4294667.296000] 511MB LOWMEM available.
[4294667.296000] found SMP MP-table at 0009e140
[4294667.296000] On node 0 totalpages: 131067
[4294667.296000] DMA zone: 4096 pages, LIFO batch:1
[4294667.296000] Normal zone: 126971 pages, LIFO batch:31
[4294667.296000] HighMem zone: 0 pages, LIFO batch:1
[4294667.296000] DMI 2.1 present.
[4294667.296000] Intel MultiProcessor Specification v1.4
[4294667.296000] Virtual Wire compatibility mode.
[4294667.296000] OEM ID: IBM GNK Product ID: Teton SMP APIC at: 0xFEE00000
[4294667.296000] Processor #1 6:8 APIC version 17
[4294667.296000] Processor #0 6:8 APIC version 17
[4294667.296000] I/O APIC #14 Version 17 at 0xFEC00000.
[4294667.296000] I/O APIC #15 Version 17 at 0xFEC01000.
[4294667.296000] Enabling APIC mode: Flat. Using 2 I/O APICs
[4294667.296000] Processors: 2
[4294667.296000] Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
[4294667.296000] Built 1 zonelists
[4294667.296000] Kernel command line: BOOT_IMAGE=G2-2.6.14-r5 root=804 devfs=mount
[4294667.296000] mapped APIC to ffffd000 (fee00000)
[4294667.296000] mapped IOAPIC to ffffc000 (fec00000)
[4294667.296000] mapped IOAPIC to ffffb000 (fec01000)
[4294667.296000] Initializing CPU#0
[4294667.296000] PID hash table entries: 2048 (order: 11, 32768 bytes)
[ 0.000000] Detected 866.763 MHz processor.
[ 83.528552] Using tsc for high-res timesource
[ 83.530340] Console: colour VGA+ 80x25
[ 83.532344] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 83.535141] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 83.580862] Memory: 515400k/524268k available (2206k kernel code, 8336k reserved, 750k data, 200k init, 0k highmem)
[ 83.580910] Checking if this processor honours the WP bit even in supervisor mode... Ok.
[ 83.640761] Calibrating delay using timer specific routine.. 1734.30 BogoMIPS (lpj=867151)
[ 83.640862] Security Framework v1.0.0 initialized
[ 83.640911] Mount-cache hash table entries: 512
[ 83.641140] CPU: After generic identify, caps: 0387fbff 00000000 00000000 00000000 00000000 00000000 00000000
[ 83.641156] CPU: After vendor identify, caps: 0387fbff 00000000 00000000 00000000 00000000 00000000 00000000
[ 83.641175] CPU: L1 I cache: 16K, L1 D cache: 16K
[ 83.641203] CPU: L2 cache: 256K
[ 83.641222] CPU serial number disabled.
[ 83.641247] CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
[ 83.641261] Intel machine check architecture supported.
[ 83.641286] Intel machine check reporting enabled on CPU#0.
[ 83.641322] mtrr: v2.0 (20020519)
[ 83.641351] Enabling fast FPU save and restore... done.
[ 83.641376] Enabling unmasked SIMD FPU exception support... done.
[ 83.641410] Checking 'hlt' instruction... OK.
[ 83.645009] CPU0: Intel Pentium III (Coppermine) stepping 06
[ 83.645080] Booting processor 1/0 eip 2000
[ 83.655398] Initializing CPU#1
[ 83.715742] Calibrating delay using timer specific routine.. 1732.25 BogoMIPS (lpj=866129)
[ 83.715753] CPU: After generic identify, caps: 0387fbff 00000000 00000000 00000000 00000000 00000000 00000000
[ 83.715764] CPU: After vendor identify, caps: 0387fbff 00000000 00000000 00000000 00000000 00000000 00000000
[ 83.715779] CPU: L1 I cache: 16K, L1 D cache: 16K
[ 83.715784] CPU: L2 cache: 256K
[ 83.715788] CPU serial number disabled.
[ 83.715793] CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
[ 83.715803] Intel machine check architecture supported.
[ 83.715810] Intel machine check reporting enabled on CPU#1.
[ 83.716168] CPU1: Intel Pentium III (Coppermine) stepping 06
[ 83.716350] Total of 2 processors activated (3466.56 BogoMIPS).
[ 83.716542] ENABLING IO-APIC IRQs
[ 83.716569] BIOS bug, IO-APIC#1 ID is 15 in the MPC table!...
[ 83.716592] ... fixing up to 15. (tell your hw vendor)
[ 83.716866] ..TIMER: vector=0x31 pin1=2 pin2=0
[ 83.726902] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 83.726928] ...trying to set up timer (IRQ0) through the 8259A ...
[ 83.726941] ..... (found pin 0) ...works.
[ 83.838724] checking TSC synchronization across 2 CPUs: passed.
[ 0.001023] Brought up 2 CPUs
[ 0.001915] NET: Registered protocol family 16
[ 0.002239] PCI: PCI BIOS revision 2.10 entry at 0xfd34c, last bus=4
[ 0.002269] mtrr: your CPUs had inconsistent fixed MTRR settings
[ 0.002289] mtrr: probably your BIOS does not setup all CPUs.
[ 0.002313] mtrr: corrected configuration.
[ 0.003615] SCSI subsystem initialized
[ 0.003664] PCI: Probing PCI hardware
[ 0.003689] PCI: Probing PCI hardware (bus 00)
[ 0.004054] Boot video device is 0000:00:01.0
[ 0.005078] PCI: Discovered peer bus 01
[ 0.005877] PCI->APIC IRQ transform: 0000:00:02.0[A] -> IRQ 225
[ 0.005907] PCI->APIC IRQ transform: 0000:00:0a.0[A] -> IRQ 161
[ 0.005939] PCI->APIC IRQ transform: 0000:00:0f.2[A] -> IRQ 9
[ 0.005982] PCI->APIC IRQ transform: 0000:01:05.0[A] -> IRQ 169
[ 0.006008] PCI->APIC IRQ transform: 0000:01:07.0[A] -> IRQ 201
[ 0.009367] Machine check exception polling timer started.
[ 0.011241] audit: initializing netlink socket (disabled)
[ 0.011289] audit(1137392965.261:1): initialized
[ 0.011763] Initializing Cryptographic API
[ 0.011887] pci_hotplug: PCI Hot Plug PCI Core version: 0.5
[ 0.011910] ibmphpd: IBM Hot Plug PCI Controller Driver version: 0.6
[ 0.015533] Real Time Clock Driver v1.12
[ 0.017241] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 0.017341] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.017369] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
[ 0.017679] ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[ 0.017889] ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
[ 0.018419] parport0: PC-style at 0x378 [PCSPP,TRISTATE]
[ 0.100193] mice: PS/2 mouse device common for all mice
[ 0.100222] io scheduler noop registered
[ 0.100282] io scheduler deadline registered
[ 0.100385] Floppy drive(s): fd0 is 1.44M
[ 0.115330] FDC 0 is a National Semiconductor PC87306
[ 0.116492] pcnet32.c:v1.30j 29.04.2005 tsbogend@alpha.franken.de
[ 0.116613] pcnet32: PCnet/FAST III 79C975 at 0x2000, 00 06 29 a8 d8 95 assigned IRQ 225.
[ 0.116813] eth0: registered as PCnet/FAST III 79C975
[ 0.116849] pcnet32: 1 cards_found.
[ 0.116970] Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
[ 0.117000] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[ 0.117216] Probing IDE interface ide0...
[ 0.126777] input: AT Translated Set 2 keyboard on isa0060/serio0
[ 0.296636] logips2pp: Detected unknown logitech mouse model 62
[ 0.342294] input: ImExPS/2 Logitech Explorer Mouse on isa0060/serio1
[ 0.917997] hda: LTN485S, ATAPI CD/DVD-ROM drive
[ 1.223680] Probing IDE interface ide1...
[ 1.743568] ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
[ 1.744592] hda: ATAPI 48X CD-ROM drive, 120kB Cache
[ 1.744639] Uniform CD-ROM driver Revision: 3.20
[ 1.757433] scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
[ 1.757438] <Adaptec 2940 Ultra SCSI adapter>
[ 1.757441] aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
[ 1.757444]
[ 16.760710] Vendor: SEAGATE Model: ST318203LC Rev: 0002
[ 16.760922] Type: Direct-Access ANSI SCSI revision: 02
[ 16.760987] scsi0:A:0:0: Tagged Queuing enabled. Depth 32
[ 16.761035] target0:0:0: Beginning Domain Validation
[ 16.766186] target0:0:0: wide asynchronous.
[ 16.769550] target0:0:0: Domain Validation skipping write tests
[ 16.769954] target0:0:0: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
[ 16.774323] target0:0:0: Ending Domain Validation
[ 17.044195] Vendor: SEAGATE Model: ST318203LC Rev: 0003
[ 17.044395] Type: Direct-Access ANSI SCSI revision: 02
[ 17.044457] scsi0:A:2:0: Tagged Queuing enabled. Depth 32
[ 17.044506] target0:0:2: Beginning Domain Validation
[ 17.049608] target0:0:2: wide asynchronous.
[ 17.052950] target0:0:2: Domain Validation skipping write tests
[ 17.053348] target0:0:2: FAST-20 WIDE SCSI 40.0 MB/s ST (50 ns, offset 8)
[ 17.057702] target0:0:2: Ending Domain Validation
[ 20.211323] SCSI device sda: 35566480 512-byte hdwr sectors (18210 MB)
[ 20.212806] SCSI device sda: drive cache: write back
[ 20.213707] SCSI device sda: 35566480 512-byte hdwr sectors (18210 MB)
[ 20.215189] SCSI device sda: drive cache: write back
[ 20.215217] sda: sda1 sda2 sda3 sda4
[ 20.226722] Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
[ 20.229867] SCSI device sdb: 35566480 512-byte hdwr sectors (18210 MB)
[ 20.231337] SCSI device sdb: drive cache: write back
[ 20.232212] SCSI device sdb: 35566480 512-byte hdwr sectors (18210 MB)
[ 20.233692] SCSI device sdb: drive cache: write back
[ 20.233714] sdb: sdb1 sdb2 sdb3 sdb4
[ 20.240753] Attached scsi disk sdb at scsi0, channel 0, id 2, lun 0
[ 20.240889] Advanced Linux Sound Architecture Driver Version 1.0.10rc1 (Mon Sep 12 08:13:09 2005 UTC).
[ 20.266387] ALSA device list:
[ 20.266420] #0: M Audio Delta 1010 at 0x4b00, irq 201
[ 20.266444] oprofile: using NMI interrupt.
[ 20.266588] NET: Registered protocol family 2
[ 20.277026] IP route cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 20.277425] TCP established hash table entries: 32768 (order: 6, 393216 bytes)
[ 20.278966] TCP bind hash table entries: 32768 (order: 6, 393216 bytes)
[ 20.280787] TCP: Hash tables configured (established 32768 bind 32768)
[ 20.280817] TCP reno registered
[ 20.280929] TCP bic registered
[ 20.280998] NET: Registered protocol family 1
[ 20.281037] NET: Registered protocol family 17
[ 20.281091] Starting balanced_irq
[ 20.281150] Using IPI Shortcut mode
[ 20.293174] EXT3-fs: INFO: recovery required on readonly filesystem.
[ 20.293206] EXT3-fs: write access will be enabled during recovery.
[ 20.434486] (fs/jbd/recovery.c, 255): journal_recover: JBD: recovery, exit status 0, recovered transactions 415317 to 415365
[ 20.434501] (fs/jbd/recovery.c, 257): journal_recover: JBD: Replayed 921 and revoked 149/153 blocks
[ 21.285801] EXT3-fs: recovery complete.
[ 21.285859] kjournald starting. Commit interval 5 seconds
[ 21.287246] EXT3-fs: mounted filesystem with ordered data mode.
[ 21.287701] VFS: Mounted root (ext3 filesystem) readonly.
[ 21.288013] Freeing unused kernel memory: 200k freed
[ 24.434187] Adding 996000k swap on /dev/sdb2. Priority:-1 extents:1 across:996000k
[ 24.443726] Adding 996000k swap on /dev/sda2. Priority:-2 extents:1 across:996000k
[ 24.743493] EXT3 FS on sda4, internal journal
[ 34.305509] kjournald starting. Commit interval 5 seconds
[ 34.306179] EXT3 FS on sdb3, internal journal
[ 34.306189] EXT3-fs: mounted filesystem with ordered data mode.
[ 34.321230] usbcore: registered new driver usbfs
[ 34.322680] usbcore: registered new driver hub
[ 41.265632] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
[ 56.039429] mtrr: Serverworks LE rev < 6 detected. Write-combining disabled.
[ 56.039447] mtrr: your processor doesn't support write-combining
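
A note on reading "unknown reason 35" in the log above. As far as I can tell, the i386 kernel prints that value in hex, and it is the byte read from I/O port 0x61 (the NMI status and control port on PC-compatible chipsets). The sketch below decodes it under that assumption. The bit descriptions are the commonly documented PC layout, not something taken from this board's documentation, and the function names are made up for illustration.

```python
from typing import List

# Assumed layout of I/O port 0x61 on PC-compatible hardware (not verified
# against the ServerWorks chipset in this machine).
PORT_61_BITS = {
    7: "SERR# / memory parity error reported",
    6: "IOCHK# (I/O channel check) error reported",
    5: "timer 2 output high",
    4: "refresh toggle",
    3: "IOCHK# NMI reporting disabled",
    2: "SERR# / parity NMI reporting disabled",
    1: "speaker data enabled",
    0: "timer 2 gate enabled",
}


def decode_nmi_reason(reason_hex: str) -> List[str]:
    """Return descriptions of the bits set in the reason byte, high bit first."""
    value = int(reason_hex, 16)
    return [
        description
        for bit, description in sorted(PORT_61_BITS.items(), reverse=True)
        if value & (1 << bit)
    ]


def has_known_source(reason_hex: str) -> bool:
    """True if either error-source bit (7 or 6) is set in the reason byte."""
    return bool(int(reason_hex, 16) & 0xC0)


if __name__ == "__main__":
    for line in decode_nmi_reason("35"):
        print(line)
    print("known source:", has_known_source("35"))
```

If that reading is right, 0x35 has neither error-source bit set, which would be why the kernel calls the reason "unknown". Bits 4 and 5 change on their own, so they probably say little about the cause.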
Strange. I don't recall seeing this reported before. The usual course of action is to test the most recent kernels and then report it as an upstream bug. Can you please confirm that the bug still appears on the latest development kernel (currently 2.6.16-rc1)?
(In reply to comment #1)
> Strange. I don't recall seeing this reported before. The usual course of action
> is to test the most recent kernels and then report it as an upstream bug. Can
> you please confirm that the bug still appears on the latest development kernel
> (currently 2.6.16-rc1)?

Okay, I just confirmed 2.6.16-rc1 shares the problem. The kernel log:

Jan 21 23:50:36 [kernel] [4294667.296000] Linux version 2.6.16-rc1 (root@muse) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)) #1 SMP PREEMPT Sat Jan 21 21:08:16 EST 2006
Jan 21 23:50:38 [kernel] [ 42.427995] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Jan 21 23:51:02 [kernel] [ 66.101887] mtrr: Serverworks LE rev < 6 detected. Write-combining disabled.
Jan 21 23:59:01 [kernel] [ 549.275209] Uhhuh. NMI received for unknown reason 35 on CPU 0.
Jan 22 00:02:12 [kernel] [4294667.296000] Linux version 2.6.16-rc1 (root@muse) (gcc version 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)) #1 SMP PREEMPT Sat Jan 21 21:08:16 EST 2006
Jan 22 00:02:14 [kernel] [ 43.295865] eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Jan 22 00:02:32 [kernel] [ 61.127642] mtrr: Serverworks LE rev < 6 detected. Write-combining disabled.
Jan 22 00:02:32 [kernel] [ 61.127848] mtrr: your processor doesn't support write-combining
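
To compare crashes across kernels, it may help to pull just the NMI entries out of the kernel log along with the uptime at which each one fired. Below is a rough sketch. The line layout (date, "[kernel]", bracketed uptime) is assumed from the entries pasted in this report, and the function name is made up for illustration.

```python
import re

# Matches lines shaped like the ones pasted in this report, e.g.
# "Jan 21 23:59:01 [kernel] [ 549.275209] Uhhuh. NMI received for unknown reason 35 on CPU 0."
NMI_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \[kernel\] "
    r"\[\s*(?P<uptime>\d+\.\d+)\] Uhhuh\. NMI received for unknown reason "
    r"(?P<reason>[0-9a-fA-F]+) on CPU (?P<cpu>\d+)\."
)


def find_nmi_events(lines):
    """Return (timestamp, uptime_seconds, reason_hex, cpu) for each NMI entry."""
    events = []
    for line in lines:
        match = NMI_LINE.match(line.strip())
        if match:
            events.append(
                (
                    match.group("stamp"),
                    float(match.group("uptime")),
                    match.group("reason"),
                    int(match.group("cpu")),
                )
            )
    return events


# Two lines copied from the 2.6.16-rc1 log in this comment.
SAMPLE = [
    "Jan 21 23:51:02 [kernel] [ 66.101887] mtrr: Serverworks LE rev < 6 detected. Write-combining disabled.",
    "Jan 21 23:59:01 [kernel] [ 549.275209] Uhhuh. NMI received for unknown reason 35 on CPU 0.",
]

if __name__ == "__main__":
    for stamp, uptime, reason, cpu in find_nmi_events(SAMPLE):
        print(f"{stamp}: NMI reason {reason} on CPU {cpu} after {uptime:.0f}s uptime")
```

To run it against the real file, something like `find_nmi_events(open("/var/log/kernel/current"))` should work, given read access to the log.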
Can you see if any BIOS upgrades are available? It would also be worth booting a kernel without ACPI or APM and seeing if that makes any difference.
They are already disabled. I'll check on the BIOS upgrade question.
No new updates since version 1.16 from October 2002, which is what I'm running.
I found another way to trigger the NMI 35. Using qjackctl's Connections window to connect the Delta's MIDI to ardour's seq causes a 6.22 second lockout followed by a reboot after 15 more seconds. The kernel NMI message is generated in this instance, also.
Should I report the bug upstream? If so, would that be at kernel.org's bugzilla service or somewhere else?
Yes. I'm out of ideas, so please report this at http://bugzilla.kernel.org and post the new URL here.
Created bug 5960 at http://bugzilla.kernel.org/show_bug.cgi?id=5960