Bug 231463 - app-emulation/vmware-server-1.0.6.91891: vmware-serverd segfaults while starting/loading .vmx files
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Server
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo VMWare Bug Squashers [disabled]
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-07-10 21:06 UTC by Chris Frederick
Modified: 2008-07-12 03:46 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
vmware log file from opening an existing virtual host .vmx file (vmware-serverd.existing.log,17.33 KB, text/plain)
2008-07-10 21:09 UTC, Chris Frederick
vmware log file from creating a new virtual host .vmx file (vmware-serverd.new.log,20.15 KB, text/plain)
2008-07-10 21:09 UTC, Chris Frederick

Description Chris Frederick 2008-07-10 21:06:35 UTC
vmware-server crashes when loading .vmx files.  This happens both with existing vmware guests and when creating a new guest.

Reproducible: Always

Steps to Reproduce:
1. Start the vmware service.
2. Connect to the server via vmware-server-console.
3a. Open an existing .vmx file,
or
3b. Create a new virtual machine and follow the steps in the wizard.  The crash happens after setting the virtual disk options.

Actual Results:  
vmware-serverd drops the connection to vmware-server-console.  Running top on the server shows a zombie vmware-serverd process.  The /var/log/vmware/vmware-serverd.log file shows a signal 11 (segfault) after loading the .vmx file.
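
For anyone wanting to confirm the same symptom, here is a minimal sketch that scans the log for the signal 11 entry.  The report does not quote the log line itself, so the "signal 11" pattern is an assumption about its wording; only the log path comes from the report.

```python
import re

# Assumed pattern: the report only says the log "shows a signal 11 (segfault)";
# the exact wording of the log line is not given, so this match is a guess.
SIGNAL_RE = re.compile(r"signal\s+11\b", re.IGNORECASE)


def find_segfault_lines(lines):
    """Return (line_number, text) pairs for lines that mention signal 11."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if SIGNAL_RE.search(text):
            hits.append((number, text.rstrip("\n")))
    return hits


def scan_log(path="/var/log/vmware/vmware-serverd.log"):
    """Scan the log file named in the report and return any matching lines."""
    with open(path, errors="replace") as handle:
        return find_segfault_lines(handle)
```

The lines just before a hit are usually the interesting ones, since they show what was being loaded when the process died.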

Expected Results:  
The virtual machine should load and be ready to start.

I've run revdep-rebuild and emerge -eav world to "freshen" the system, but nothing changed.  I've also tried the 1.0.4, 1.0.5, and 1.0.6 versions of vmware-server, removing /etc/vmware each time to keep a clean configuration.  None of this has helped.

emerge --info
Portage 2.1.4.4 (default-linux/x86/2007.0, gcc-4.1.2, glibc-2.6.1-r0, 2.6.24-gentoo-r8 i686)
=================================================================
System uname: 2.6.24-gentoo-r8 i686 Intel(R) Xeon(TM) CPU 3.00GHz
Timestamp of tree: Tue, 08 Jul 2008 12:45:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
app-shells/bash:     3.2_p33
dev-lang/python:     2.4.4-r13
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.61-r2
sys-devel/automake:  1.10.1
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.23-r3
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-O2 -march=pentium4 -pipe -fomit-frame-pointer"
DISTDIR="/usr/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://gentoo.mirrors.tds.net/gentoo"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/overlay"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="acl acpi apache2 berkdb bzip2 cli cracklib crypt cups dri fam fortran ftp gd gdbm iconv imap isdnlog ldap maildir midi mmx mudflap ncurses nptl nptlonly openmp pam pcre perl php postgres pppd python readline reflection samba session spl sse sse2 ssl tcpd threads unicode vim-syntax x86 xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="vga vesa radeon"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Chris Frederick 2008-07-10 21:09:13 UTC
Created attachment 160081 [details]
vmware log file from opening an existing virtual host .vmx file
Comment 2 Chris Frederick 2008-07-10 21:09:49 UTC
Created attachment 160082 [details]
vmware log file from creating a new virtual host .vmx file
Comment 3 Mike Auty (RETIRED) gentoo-dev 2008-07-10 22:51:39 UTC
Hi Chris,

Have you ever had vmware-server working on this system, or not?  There haven't been any major changes to any of the vmware-server packages recently (only to vmware-server-console), so I can't really imagine what's causing the problem.

Your emerge info all looks very trouble free: everything seems stable and gcc is only 4.1.2 (so no 4.2+ weirdness), so I can't see any issues there.  Could you please specify what filesystem /vmware is using, and any other pertinent information you can think of relating to vmware reading/writing files, or writing special/unusual files such as sockets or locks in nearby locations?  I know there have been reported issues with JFS filesystems (as noted at the end of the install procedure), so perhaps it's something along those lines?  Any extra information you can provide would be helpful.  Otherwise it's not clear what we can do, given the closed nature of the program...
Comment 4 Chris Frederick 2008-07-11 13:56:00 UTC
Hello Mike,

This server is running 2 web servers, a database server, and our secondary mail server.  The server is relatively new, but it has been running for over a month.  The last thing I did before the failure was upgrade the RAM (1G to 4G) and the kernel to support the extra RAM (high mem).  That was last week, but the server ran ok until the weekend.  There was no scheduled maintenance done on the server over the weekend, so I don't know what could have changed to cause the segmentation fault.

/vmware is on an ext3 file system across a RAID 5 array (hardware based).  The rest of the file systems are on a separate drive, using ext3.  /tmp is mounted as tmpfs with noexec,nosuid.  As the server is/was accessible to the internet, it was running the hardened profile.  I followed the HOWTO_Install_VMWare_Server gentoo wiki for setting up the server.  When the failure occurred, I started disabling pax/grsec options in the kernel.  When those didn't work, I switched profiles to the default 2.6 profile (thus the 'emerge -eav's I mentioned).

Anything else I should look into?
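
The filesystem and mount options described above can be read straight from /proc/mounts.  A minimal sketch follows; the helper name mount_options is made up for illustration, and nothing in this thread establishes whether the noexec /tmp actually matters to vmware-serverd.

```python
def mount_options(mounts_text, mount_point):
    """Return the fstype and option list for mount_point, or None.

    Expects /proc/mounts-style text, one mount per line:
        device mount_point fstype options dump pass
    If the mount point is listed more than once, the last entry wins,
    since later mounts shadow earlier ones.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = {"fstype": fields[2], "options": fields[3].split(",")}
    return found


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        text = handle.read()
    for point in ("/vmware", "/tmp"):
        print(point, mount_options(text, point))
```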
Comment 5 Mike Auty (RETIRED) gentoo-dev 2008-07-11 14:29:19 UTC
Hmmmm, no, not that I can think of.  When you say you upgraded to highmem, did you go to 64-bit resources and PAE (i.e., 64GB support, to get full use of the 4GB of RAM) or just to high mem (4GB of RAM, but free only reports 3.2GB-ish)?

The rest of it all sounds fine.  My best guess at the moment is that either some of the ram is damaged, and after the first boot, vmware server started getting loaded into the damaged ram (seems unlikely), or a similar situation this time with highmem and lowmem causing the issues?

The hardened stuff may also have been an issue, but given that the biggest event to happen beforehand was the RAM change, I'm gonna stick with that as the best guess culprit...
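
The highmem question above can be answered from /proc/meminfo, which on 32-bit kernels built with highmem support lists HighTotal and LowTotal alongside MemTotal.  A minimal sketch, with made-up helper names:

```python
def meminfo_values(meminfo_text):
    """Parse /proc/meminfo-style text into a {field: number} dict.

    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts
    and are stored as-is.
    """
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def summarize(values):
    """Report total, highmem and lowmem in MiB (0 if a field is absent)."""
    def to_mib(kb):
        return kb // 1024

    return {
        "total_mib": to_mib(values.get("MemTotal", 0)),
        "high_mib": to_mib(values.get("HighTotal", 0)),
        "low_mib": to_mib(values.get("LowTotal", 0)),
    }


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        print(summarize(meminfo_values(handle.read())))
```

A total well short of the installed 4GB would point to the plain highmem case rather than PAE.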
Comment 6 Chris Frederick 2008-07-11 15:34:42 UTC
I just went to high mem (no 64bit).  I didn't know if 64bit would cause an issue or not.  The memory is all registered ecc 512M chips pulled from an unused server, so I'm guessing that a bad chip would be noticed by the system and/or bios.  Would the 64bit method work better?
Comment 7 Mike Auty (RETIRED) gentoo-dev 2008-07-11 17:00:15 UTC
No, it's marked as experimental and would be far more likely to cause problems.  You should stick with highmem at best.

If this is a production server, has it been rebooted since the vmware issues started?  I'm clutching at straws here because I've got no other ideas I'm afraid...
Comment 8 Chris Frederick 2008-07-11 22:01:08 UTC
Got it working.  I use ldap for authentication, so I had to log into vmware-server as root, since that's the only local account; vmware ships with incompatible openssl libraries, so it can't use TLS on the ldap connection.  I installed stunnel and rigged an ldaps tunnel between the ldap server and the vmware server.  I then 'chown -R'd the /vmware directory to a non-root account, and I was able to add the existing hosts back into vmware and start them up.  I'm not sure why root wasn't able to run the virtual servers after it had been doing so for over a month, but it seems to be working now.

I'd chalk this up to a vmware openssl incompatibility with pam_ldap.  Again, I'm not sure why root didn't work, but at least this fixed it, so the virtual servers are back up and running.
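
In case it helps anyone repeating the chown step, here is a rough sketch for checking that everything under a directory ended up owned by the intended account.  The function name is made up, and the uid to pass in is whatever non-root account you chose; the thread does not say which one was used.

```python
import os


def paths_not_owned_by(root, uid):
    """Return paths under root (including root itself) not owned by uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat checks a symlink itself rather than whatever it points to.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched
```

Calling paths_not_owned_by("/vmware", uid) with the account's numeric uid should return an empty list once the recursive chown has covered everything.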

Thanks for the help Mike.
Comment 9 Mike Auty (RETIRED) gentoo-dev 2008-07-12 03:46:49 UTC
No probs.  I think there's an amd64 pam_ldap bug somewhere, but I don't know if it's related.  I'm going to mark this as WORKSFORME since I couldn't duplicate the bug; if you think it should be marked differently, please feel free to reopen it and let me know...  5:)