Bug 249751 - [2.6.27.6 regression] vmware guest panics on boot with CONFIG_VMI=Y
Summary: [2.6.27.6 regression] vmware guest panics on boot with CONFIG_VMI=Y
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: http://bugzilla.kernel.org/show_bug.c...
Whiteboard: linux-2.6.27.6-regression
Keywords:
Depends on:
Blocks:
 
Reported: 2008-12-04 00:01 UTC by Norman Back
Modified: 2008-12-17 14:00 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
config-2.6.27-gentoo-r4-3 (config-2.6.27-gentoo-r4-3, 54.25 KB, text/plain)
2008-12-04 00:04 UTC, Norman Back
Extract of vmware debug log (vmware-VMI-bug.log, 13.45 KB, text/plain)
2008-12-04 00:11 UTC, Norman Back
/proc/cpuinfo (cpuinfo, 1.31 KB, text/plain)
2008-12-04 06:36 UTC, Norman Back
lspci from the vmware guest (guest-lspci, 2.63 KB, text/plain)
2008-12-04 06:58 UTC, Norman Back

Description Norman Back 2008-12-04 00:01:26 UTC
Panic when starting kernel-2.6.27-r4 in vmware-workstation-6.5.0 when the kernel is compiled with CONFIG_VMI=Y.

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
BUG: Int 14: CR2 fbe00000
      EDI c05b1f98  ESI fbe00000  EBP 00a6e003  ESP c05b1f7c
      EBX c05b1f98  EDX 0000000e  ECX 00000003  EAX fbe00000
      err 00000000  EIP c05db95c   CS 00000062  flg 00010092
 Stack: c00cc618 c00cc625 00000003 00000000 00000000 00000563 c05b1ff8 fbe00000
        fbe10000 fbe00000 c05dba7e c05b1ff8 c05b1ff8 00646513 00609000 c05bac50
        00000800 00099d00 c059a000 00a6e003 00000800 00099d00 c059a000 c05b66d2

If the kernel is recompiled with CONFIG_VMI=N, it boots without error.
The panic also occurs with sys-kernel/git-sources-2.6.28_rc7-r1.
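
For readers unfamiliar with the early-boot dump format: Int 14 is the x86 page-fault vector, CR2 holds the faulting linear address, and "err" is the page-fault error code. The following is a minimal Python sketch for decoding that error code, assuming the standard x86 page-fault error-code bit layout. The helper name decode_pf_error is purely illustrative and is not part of any kernel tooling.

```python
# Decode the x86 page-fault error code printed as "err" in the early BUG dump.
# Each entry: (bit mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel (supervisor) mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(err):
    """Return a list of human-readable facts for a page-fault error code."""
    facts = []
    for mask, if_set, if_clear in PF_BITS:
        if err & mask:
            facts.append(if_set)
        elif if_clear is not None:
            facts.append(if_clear)
    return facts


if __name__ == "__main__":
    err = 0x00000000  # "err 00000000" from the dump above
    cr2 = 0xFBE00000  # "CR2 fbe00000": the faulting linear address
    print(f"fault at {cr2:#010x}: " + ", ".join(decode_pf_error(err)))
```

For the values in this report, the sketch describes a read from kernel mode of a page that was not present, i.e. the kernel touched linear address fbe00000 before any mapping for it existed.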


Reproducible: Always

Steps to Reproduce:
1. Compile sys-kernel/gentoo-sources-2.6.27-r4 with CONFIG_VMI=Y
2. Boot it as a guest in app-emulation/vmware-workstation-6.5.0.118166.

Actual Results:  
Decompressing Linux... Parsing ELF... done.
Booting the kernel.
BUG: Int 14: CR2 fbe00000
      EDI c05b1f98  ESI fbe00000  EBP 00a6e003  ESP c05b1f7c
      EBX c05b1f98  EDX 0000000e  ECX 00000003  EAX fbe00000
      err 00000000  EIP c05db95c   CS 00000062  flg 00010092
 Stack: c00cc618 c00cc625 00000003 00000000 00000000 00000563 c05b1ff8 fbe00000
        fbe10000 fbe00000 c05dba7e c05b1ff8 c05b1ff8 00646513 00609000 c05bac50
        00000800 00099d00 c059a000 00a6e003 00000800 00099d00 c059a000 c05b66d2


Expected Results:  
Successful boot.

# emerge --info
WARNING: One or more repositories have missing repo_name entries:

        /usr/local/portage/profiles/repo_name

NOTE: Each repo_name entry should be a plain text file containing a
unique name for the repository on the first line.
Portage 2.2_rc16 (default/linux/x86/2008.0, gcc-3.4.6, glibc-2.6.1-r0, 2.6.27-gentoo-r4-2 i686)
=================================================================
System uname: Linux-2.6.27-gentoo-r4-2-i686-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_6000+-with-glibc2.0
Timestamp of tree: Wed, 03 Dec 2008 03:02:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.4 [enabled]
app-shells/bash:     3.2_p33
dev-lang/python:     2.4.4-r6, 2.5.2-r7
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache:     2.4-r7
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.61-r2
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1-r1
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.23-r3
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-xp -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-O2 -march=athlon-xp -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="ccache distcc distlocks parallel-fetch preserve-libs protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://gentoo.blueyonder.co.uk ftp://mirrors.blueyonder.co.uk/mirrors/gentoo http://www.mirrorservice.org/sites/www.ibiblio.org/gentoo/ ftp://ftp.mirrorservice.org/sites/www.ibiblio.org/gentoo/"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j10"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://fox2/gentoo-portage"
USE="3dnow X a52 aac accessibility acl alsa amarok amr arts authdaemond berkdb bzip2 caps cdr cli cracklib crypt cups dri dts dv dvd dvdr dvdread encode fam fortran gdbm gpm gtk iconv ieee1394 ipv6 isdnlog jack jpeg kde kdeprefix lm_sensors midi mmx mp3 mpeg mudflap mysql ncurses network nls nptl nptlonly ogg opengl openmp oss pam pcre pdf perl png pppd python qt3 readline reflection samba sasl sdl session spl sse ssl sysfs tcpd tiff truetype type1 uk_bleb uk_rt unicode v4l v4l2 vorbis x86 xine xinerama xorg xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="fbdev glint i810 intel mach64 mga neomagic nv r128 radeon savage sis tdfx trident vesa vga via vmware voodoo"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Norman Back 2008-12-04 00:04:29 UTC
Created attachment 174206
config-2.6.27-gentoo-r4-3
Comment 2 Norman Back 2008-12-04 00:11:47 UTC
Created attachment 174207
Extract of vmware debug log

The interesting lines in this log are the panic:

Dec 03 22:58:25.136: vcpu-0| Unknown int 10h func 0x0000
Dec 03 22:58:25.315: vcpu-0| Entering paravirt mode on vcpu 0
Dec 03 22:58:25.925: vcpu-0| Exiting on CLI;HLT at 0x60:0xc0100359
Dec 03 22:58:25.945: vmx| Stopping VCPU threads...

and the screen shot:

Dec 03 22:58:26.005: vmx| Decompressing Linux... Parsing ELF... done.
Dec 03 22:58:26.006: vmx| Booting the kernel.
Dec 03 22:58:26.006: vmx|
Dec 03 22:58:26.007: vmx|
Dec 03 22:58:26.007: vmx|
Dec 03 22:58:26.008: vmx|
Dec 03 22:58:26.008: vmx|
Dec 03 22:58:26.009: vmx|
Dec 03 22:58:26.010: vmx|
Dec 03 22:58:26.011: vmx| BUG: Int 14: CR2 fbe00000
Dec 03 22:58:26.011: vmx|      EDI c05b1f98  ESI fbe00000  EBP 00a6e003  ESP c05b1f7c
Dec 03 22:58:26.012: vmx|      EBX c05b1f98  EDX 0000000e  ECX 00000003  EAX fbe00000
Dec 03 22:58:26.012: vmx|      err 00000000  EIP c05db95c   CS 00000062  flg 00010092
Dec 03 22:58:26.013: vmx| Stack: c00cc618 c00cc625 00000003 00000000 00000000 00000563 c05b1ff8 fbe00000
Dec 03 22:58:26.013: vmx|        fbe10000 fbe00000 c05dba7e c05b1ff8 c05b1ff8 00646513 00609000 c05bac50
Dec 03 22:58:26.014: vmx|        00000800 00099d00 c059a000 00a6e003 00000800 00099d00 c059a000 c05b66d2
Comment 3 Norman Back 2008-12-04 06:36:42 UTC
Created attachment 174213
/proc/cpuinfo
Comment 4 Norman Back 2008-12-04 06:58:40 UTC
Created attachment 174214
lspci from the vmware guest

lspci from the vmware guest after booting with CONFIG_VMI=N
Comment 5 Ryan Tandy 2008-12-04 07:02:05 UTC
Please try with CONFIG_COMPAT_VDSO disabled and see if that changes anything.

Is there a kernel version where this option has worked for you?  Is the host
system running Gentoo?  What kernel is it running?
Comment 6 Norman Back 2008-12-04 07:20:43 UTC
VMware host uname -a:
Linux diamond 2.6.27-gentoo-r4-1 #1 SMP PREEMPT Sat Nov 22 07:50:50 GMT 2008 i686 AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ AuthenticAMD GNU/Linux

The motherboard is an ASRock ALiveNF5SLI-1394 with 8 GB of RAM.

I will try CONFIG_COMPAT_VDSO=N on the guest as suggested.
Comment 7 Norman Back 2008-12-04 09:00:51 UTC
Tried sys-kernel/gentoo-sources-2.6.24-r8 and it boots OK as a VMware guest with CONFIG_VMI=Y.
Comment 8 Norman Back 2008-12-04 16:02:23 UTC
After a bit more testing with CONFIG_VMI=Y:
sys-kernel/gentoo-sources-2.6.27-r2 boots OK 
sys-kernel/gentoo-sources-2.6.27-r3 panics with Int 14: CR2
Comment 9 Markos Chandras (RETIRED) gentoo-dev 2008-12-04 17:23:42 UTC
Can you clarify several things please?

Where are you trying to enable/disable the CONFIG_VMI option? On the host's kernel or on the guest's?

Is the first attachment from the host machine or the guest? When you try different kernels, are you using the *exact* same configuration?

Comment 10 Norman Back 2008-12-04 18:37:21 UTC
"Where are you trying to enable/disable the CONFIG_VMI option? On the host's kernel or on the guest's?"
I was trying to enable the CONFIG_VMI option on the guest.

"Is the first attachment from the host machine or the guest?"
From the guest. The host is running sys-kernel/gentoo-sources-2.6.27-r4

"When you try different kernels, are you using the *exact* same configuration?"
I used "make oldconfig" to upgrade .config from sys-kernel/gentoo-sources-2.6.27-r2 to sys-kernel/gentoo-sources-2.6.27-r3 accepting the default replies.
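
Since "make oldconfig" with default answers can add or change options between kernel versions, it can help to compare the two resulting .config files directly. The following is a rough Python sketch, assuming the usual .config syntax ("CONFIG_FOO=value" and "# CONFIG_FOO is not set"). The function names and the sample fragments are hypothetical and are not taken from the attached config.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option in a kernel .config to its value ('n' when not set)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(old, new):
    """Return {option: (old_value, new_value)} for every option that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


if __name__ == "__main__":
    # Hypothetical fragments for illustration only.
    old = parse_config("CONFIG_VMI=y\n# CONFIG_COMPAT_VDSO is not set\n")
    new = parse_config(
        "CONFIG_VMI=y\nCONFIG_COMPAT_VDSO=y\nCONFIG_X86_RESERVE_LOW_64K=y\n"
    )
    for name, (before, after) in diff_configs(old, new).items():
        print(f"{name}: {before} -> {after}")
```

An option that exists only in the newer config shows up with None as its old value, which makes symbols introduced by the upgrade easy to spot.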
Comment 11 Axel Dyks 2008-12-04 18:58:16 UTC
(In reply to comment #10)
> I used "make oldconfig" to upgrade .config from
> sys-kernel/gentoo-sources-2.6.27-r2 to sys-kernel/gentoo-sources-2.6.27-r3
> accepting the default replies.

So this comes down to the difference between K_GENPATCHES_VER=4 (2.6.27-r2)
  http://sources.gentoo.org/viewcvs.py/linux-patches/genpatches-2.6/tags/2.6.27-4/
and K_GENPATCHES_VER=5 (2.6.27-r3)
  http://sources.gentoo.org/viewcvs.py/linux-patches/genpatches-2.6/tags/2.6.27-5/

or in terms of "vanilla"

  2.6.27.4 <--> 2.6.27.6

Could you try "vanilla" sources? This would narrow it down further.

Cheers
Axel
Comment 12 Norman Back 2008-12-04 22:58:49 UTC
2.6.27.4 and 2.6.27.5 boot OK.
2.6.27.6 panics with Int 14: CR2
Comment 13 Daniel Drake (RETIRED) gentoo-dev 2008-12-04 23:51:20 UTC
Great, thanks for the fast diagnosis.
Comment 14 Daniel Drake (RETIRED) gentoo-dev 2008-12-04 23:57:02 UTC
Please try disabling CONFIG_X86_RESERVE_LOW_64K on 2.6.27.6
Comment 15 Norman Back 2008-12-05 08:17:30 UTC
(In reply to comment #14)
> Please try disabling CONFIG_X86_RESERVE_LOW_64K on 2.6.27.6

Tried this but still panics Int 14: CR2
Comment 16 Axel Dyks 2008-12-05 12:38:42 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > Please try disabling CONFIG_X86_RESERVE_LOW_64K on 2.6.27.6
> 
> Tried this but still panics Int 14: CR2
 
Hmm, not all of the changes in "setup_arch" are disabled by setting
CONFIG_X86_RESERVE_LOW_64K=n. Only the "bad_bios_dmi_table" is made empty.

You could try to insert various "printk()" statements into "setup_arch()"
(arch/x86/kernel/setup.c) from 2.6.27.6 and maybe figure out at which
point the kernel crashes.

Just an idea. This is what I would try next ... 
Comment 17 Daniel Drake (RETIRED) gentoo-dev 2008-12-05 12:55:00 UTC
There are other changes in 2.6.27.6 and our guess that it was related to the 64k reservation was probably wrong.

As a next step I would suggest doing a bisection to find (for sure) the exact commit that introduced the bug.

The process is described here:

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

but you want to use the following git tree, not the one described there:

git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.27.y.git

Use v2.6.27.5 as good and v2.6.27.6 as bad.
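
The idea behind the bisection can be sketched in a few lines of Python. This is only a model of the binary search that git bisect performs on a linear history, not a replacement for it. The commit list and the is_bad predicate are hypothetical stand-ins for "build the kernel, boot it in VMware, and see whether it panics".

```python
def first_bad(commits, is_bad):
    """Return (first bad commit, number of tests) for an ordered commit list.

    Assumes a linear history in which every commit before the culprit is
    good, the culprit and everything after it are bad, and the last commit
    is already known to be bad (like v2.6.27.6 in this report).
    """
    lo, hi = 0, len(commits) - 1  # the first bad commit lies in commits[lo..hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid      # the culprit is mid or earlier
        else:
            lo = mid + 1  # the culprit is after mid
    return commits[lo], tests


if __name__ == "__main__":
    # Hypothetical history of 100 commits between the good and the bad tag.
    history = [f"c{i:03d}" for i in range(1, 101)]
    culprit = "c042"
    found, tests = first_bad(history, lambda c: c >= culprit)
    print(found, tests)
```

Each test halves the remaining range, so a range of up to 128 candidate commits needs at most 7 build-and-boot cycles.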
Comment 18 Daniel Drake (RETIRED) gentoo-dev 2008-12-05 12:57:52 UTC
FYI, the above process will require you to test about 7 kernels before telling you which commit is bad
Comment 19 Norman Back 2008-12-05 15:16:01 UTC
(In reply to comment #18)
> FYI, the above process will require you to test about 7 kernels before telling
> you which commit is bad

I have used bisection once before (successfully). I'll give it a try later.

Comment 21 Norman Back 2008-12-05 17:25:40 UTC
(In reply to comment #18)
> FYI, the above process will require you to test about 7 kernels before telling
> you which commit is bad
> 

Done!

5c371b31be32033b0a4a993431484da8a2305369 is first bad commit
commit 5c371b31be32033b0a4a993431484da8a2305369
Author: Yinghai Lu <yhlu.kernel@gmail.com>
Date:   Mon Sep 22 02:52:26 2008 -0700

    x86: fix CONFIG_X86_RESERVE_LOW_64K=y

    commit 2216d199b1430d1c0affb1498a9ebdbd9c0de439 upstream

    The bad_bios_dmi_table() quirk never triggered because we do DMI setup
    too late. Move it a bit earlier.

    Also change the CONFIG_X86_RESERVE_LOW_64K quirk to operate on the e820
    table directly instead of messing with early reservations - this handles
    overlaps (which do occur in this low range of RAM) more gracefully.

    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

:040000 040000 b7b81ffb62eddf60c2d8545a61566f0d34c1b2a9 858d983687c53db5304015a245ee0c23f10c266d M      arch

Comment 22 Axel Dyks 2008-12-05 17:44:57 UTC
(In reply to comment #21)
> (In reply to comment #18)
> > FYI, the above process will require you to test about 7 kernels before telling
> > you which commit is bad
> > 
> 
> Done!
> 
> 5c371b31be32033b0a4a993431484da8a2305369 is first bad commit

Exactly what I guessed.
dmi_scan_machine must not be called at this stage.
There is a comment in "setup.c" (* NOTE: ...) that indicates the earliest
point at which early_ioremap may be called on x86-32.

We are currently working on a patch that reverts the entire
CONFIG_X86_RESERVE_LOW_64K stuff, i.e. the last five commits
to arch/x86/kernel/setup.c.
Comment 23 Axel Dyks 2008-12-05 17:52:11 UTC
Could you open an upstream bug on http://bugzilla.kernel.org/
and if done so, post the link here?
Comment 24 Norman Back 2008-12-05 19:53:42 UTC
(In reply to comment #23)
> Could you open an upstream bug on http://bugzilla.kernel.org/
> and if done so, post the link here?
> 

Done

http://bugzilla.kernel.org/show_bug.cgi?id=12167
Comment 25 Axel Dyks 2008-12-05 20:25:36 UTC
(In reply to comment #24)
> (In reply to comment #23)
> > Could you open an upstream bug on http://bugzilla.kernel.org/
> > and if done so, post the link here?
> > 
> 
> Done
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=12167

Thanks!

Now that we know what caused the problem, I think we can skip
the patch that reverts the CONFIG_X86_RESERVE_LOW_64K stuff
in favor of helping upstream to actually SOLVE the problem.
Comment 26 Ryan Tandy 2008-12-13 23:29:43 UTC
Zach has posted his fix to LKML: http://lkml.org/lkml/2008/12/13/149
Comment 27 Axel Dyks 2008-12-14 02:10:43 UTC
(In reply to comment #26)
> Zach has posted his fix to LKML: http://lkml.org/lkml/2008/12/13/149
Yeah, and he attached it to http://bugzilla.kernel.org/show_bug.cgi?id=12167#c21  
But it didn't make it into 2.6.27.9 and so far I haven't spotted it in Linus'
tree ... 
Comment 28 Daniel Drake (RETIRED) gentoo-dev 2008-12-14 18:13:28 UTC
Zach sent a fix upstream...
Comment 29 Daniel Drake (RETIRED) gentoo-dev 2008-12-14 18:15:18 UTC
...which is included in gentoo-sources-2.6.27-r6. Thanks for your help working on this one.
Comment 30 Norman Back 2008-12-15 08:34:14 UTC
(In reply to comment #29)
> ...which is included in gentoo-sources-2.6.27-r6. Thanks for your help working
> on this one.

.. and tested OK.

# uname -r
2.6.27-gentoo-r6-1
# dmesg | grep -i vmi
VMI: Found VMware, Inc. Hypervisor OPROM, API version 3.0, ROM version 1.0
vmi: registering clock event vmi-timer. mult=12582912 shift=22
vmi: registering clock event vmi-timer. mult=12582912 shift=22
Booting paravirtualized kernel on vmi
vmi: registering clock source khz=3000000