Bug 597554 - =sys-kernel/hardened-sources-4.7.6: Kernel panic when starting KVM guests
Summary: =sys-kernel/hardened-sources-4.7.6: Kernel panic when starting KVM guests
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Hardened
Hardware: All Linux
Importance: Normal critical
Assignee: The Gentoo Linux Hardened Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-19 16:19 UTC by Christian Roessner
Modified: 2018-10-11 23:29 UTC (History)
5 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
kern.log (kern.log.gz,20.99 KB, application/x-gzip)
2016-10-19 16:19 UTC, Christian Roessner
Details
lspci -vvxxx (lspci.txt.gz,9.42 KB, application/x-gzip)
2016-10-19 16:19 UTC, Christian Roessner
Details
dmidecode (dmidecode.gz,4.32 KB, application/x-gzip)
2016-10-19 16:20 UTC, Christian Roessner
Details
Kernel configuration (config.gz,25.67 KB, application/x-gzip)
2016-10-19 16:20 UTC, Christian Roessner
Details
diff btwn old no-crash 4.7.5 and new crashing 4.7.5 (config-4.7.5-hardened-16-old-new.diff,3.88 KB, patch)
2016-10-22 16:38 UTC, miro.rovis
Details | Diff
syslog Call Trace 4.7.9-hardened (messages_161023_0047_kernel_4.7.9,7.89 KB, text/plain)
2016-10-23 01:46 UTC, miro.rovis
Details
The emerge--info.txt , complete, as root. Also for later. (emerge--info.txt,16.67 KB, text/plain)
2016-10-24 09:39 UTC, miro.rovis
Details
config of 4.7.9-hardened w/o SANITIZE (config-4.7.9-hardened-161024_09,117.32 KB, text/x-mpsub)
2016-10-24 09:41 UTC, miro.rovis
Details
syslog w/ Call Trace (messages_161024_1110_g5n,66.93 KB, text/plain)
2016-10-24 09:48 UTC, miro.rovis
Details
config-4.7.9-hardened-161024_12.diff (config-4.7.9-hardened-161024_12.diff,5.85 KB, patch)
2016-10-24 10:59 UTC, miro.rovis
Details | Diff
messages_161024_1325_g5n (messages_161024_1325_g5n,1.95 KB, text/plain)
2016-10-24 11:48 UTC, miro.rovis
Details
iff config-4.8.3-161024_14 config-4.8.3-161024_1430 (config-4.8.3-161024_1430.diff,757 bytes, patch)
2016-10-24 12:51 UTC, miro.rovis
Details | Diff
config-4.8.3-161024_1430_CallTrace.txt (config-4.8.3-161024_1430_CallTrace.txt,2.89 KB, text/plain)
2016-10-24 14:08 UTC, miro.rovis
Details
config-4.8.3-161024_1430_CallTrace_v2.txt (config-4.8.3-161024_1430_CallTrace_v2.txt,2.90 KB, text/plain)
2016-10-24 15:21 UTC, miro.rovis
Details
GentooVM (GentooVM,235 bytes, application/x-shellscript)
2016-10-24 20:48 UTC, miro.rovis
Details
GentooVM_PART1.sh (GentooVM_PART1.sh,182 bytes, text/x-sh)
2016-10-24 20:56 UTC, miro.rovis
Details
GentooVM_PART1_mem.sh (GentooVM_PART1_mem.sh,194 bytes, text/x-sh)
2016-10-24 21:09 UTC, miro.rovis
Details
GentooVM_PART1_monitor.sh (GentooVM_PART1_monitor.sh,201 bytes, text/x-sh)
2016-10-24 21:13 UTC, miro.rovis
Details
config-4.7.9-161023_CallTrace.txt (config-4.7.9-161023_CallTrace.txt,8.40 KB, text/plain)
2016-10-24 21:46 UTC, miro.rovis
Details
kernel-4.7.9-hardened-161024_09_at_161024_2353.syslog (kernel-4.7.9-hardened-161024_09_at_161024_2353.syslog,737 bytes, text/plain)
2016-10-24 22:03 UTC, miro.rovis
Details
config-4.7.9-hardened-161023 (config-4.7.9-hardened-161023,117.27 KB, text/x-mpsub)
2016-10-25 02:20 UTC, miro.rovis
Details
hardened-4.7.9-161023-161025_0332_test_syslog_kernel_1_boot.log (hardened-4.7.9-161023-161025_0332_test_syslog_kernel_1_boot.log,53.97 KB, text/x-log)
2016-10-25 02:21 UTC, miro.rovis
Details
hardened-4.7.9-161023-161025_0332_test_syslog_kernel_2_CallTrace.log (hardened-4.7.9-161023-161025_0332_test_syslog_kernel_2_CallTrace.log,8.88 KB, text/x-log)
2016-10-25 02:21 UTC, miro.rovis
Details
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_1_boot.log (hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_1_boot.log,59.75 KB, text/x-log)
2016-10-25 02:22 UTC, miro.rovis
Details
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_2_CallTrace_NONE.log (hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_2_CallTrace_NONE.log,704 bytes, text/x-log)
2016-10-25 02:23 UTC, miro.rovis
Details
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_3.stout (hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_3.stout,166 bytes, text/plain)
2016-10-25 02:23 UTC, miro.rovis
Details
messages_161108_125631_g5n with Call Traces (messages_161108_125631_g5n,9.59 KB, text/plain)
2016-11-08 13:56 UTC, miro.rovis
Details
config-4.7.10-hardened-r2-161107_06.diff from config-4.7.9-hardened-161024_09 (config-4.7.10-hardened-r2-161107_06.diff,1.07 KB, text/plain)
2016-11-08 14:14 UTC, miro.rovis
Details
Qemu_GentooVM_170108_emerge--info.txt (Qemu_GentooVM_170108_emerge--info.txt,16.78 KB, text/plain)
2017-01-08 17:06 UTC, miro.rovis
Details
config-4.8.15-hardened-r2-170106_05 (config-4.8.15-hardened-r2-170106_05,118.15 KB, text/plain)
2017-01-08 17:09 UTC, miro.rovis
Details
config-4.9.1-170108_04 (config-4.9.1-170108_04,115.27 KB, text/plain)
2017-01-08 17:11 UTC, miro.rovis
Details
config-4.9.1-170108_05.diff (config-4.9.1-170108_05.diff,698 bytes, text/plain)
2017-01-08 17:12 UTC, miro.rovis
Details
config-4.8.15-hardened-r2-170108_05.diff (config-4.8.15-hardened-r2-170108_05.diff,882 bytes, text/plain)
2017-01-08 17:13 UTC, miro.rovis
Details
config-4.8.15-hardened-r2-170108_10.diff (config-4.8.15-hardened-r2-170108_10.diff,8.41 KB, patch)
2017-01-08 17:14 UTC, miro.rovis
Details | Diff
config-4.8.15-hardened-r2-170108_18.diff (config-4.8.15-hardened-r2-170108_18.diff,6.65 KB, patch)
2017-01-08 19:38 UTC, miro.rovis
Details | Diff
config-4.8.15-hardened-r2-170108_20.diff (config-4.8.15-hardened-r2-170108_20.diff,707 bytes, patch)
2017-01-08 20:01 UTC, miro.rovis
Details | Diff
Strace of failing qemu on hardened-sources-4.8.17-r2 with sysfs protection (strace_sysrestrict_4.8.17-r2_public,18.84 KB, text/plain)
2017-02-23 12:52 UTC, Étienne Buira
Details
Remove grsec protection from debugfs (unprotect_debugfs.diff,537 bytes, patch)
2017-02-23 19:52 UTC, Étienne Buira
Details | Diff

Description Christian Roessner 2016-10-19 16:19:22 UTC
Created attachment 450752 [details]
kern.log

I just upgraded hardened-sources (stable) to latest 4.7.6 release. On my HP ProLiant Server, the server boots and at the moment where I try to start a KVM guest, the kernel panics (See kern.log as attached)

Falling back to 4.4.8-hardened-r1
Comment 1 Christian Roessner 2016-10-19 16:19:56 UTC
Created attachment 450754 [details]
lspci -vvxxx
Comment 2 Christian Roessner 2016-10-19 16:20:19 UTC
Created attachment 450756 [details]
dmidecode
Comment 3 Christian Roessner 2016-10-19 16:20:41 UTC
Created attachment 450758 [details]
Kernel configuration
Comment 4 Christian Roessner 2016-10-19 16:21:37 UTC
The following info is for the "running" kernel!

emerge --info
Portage 2.3.0 (python 2.7.10-final-0, hardened/linux/amd64/no-multilib, gcc-4.9.3, glibc-2.22-r4, 4.4.8-hardened-r1 x86_64)
=================================================================
System uname: Linux-4.4.8-hardened-r1-x86_64-Intel-R-_Xeon-R-_CPU_L5640_@_2.27GHz-with-gentoo-2.2
KiB Mem:    49452540 total,  22810492 free
KiB Swap:   16777212 total,  16777212 free
Timestamp of repository gentoo: Tue, 18 Oct 2016 21:15:01 +0000
sh bash 4.3_p48
ld GNU ld (Gentoo 2.25.1 p1.1) 2.25.1
ccache version 3.2.4 [enabled]
app-shells/bash:          4.3_p48::gentoo
dev-lang/perl:            5.22.2::gentoo
dev-lang/python:          2.7.10-r1::gentoo, 3.4.3-r1::gentoo
dev-util/ccache:          3.2.4::gentoo
dev-util/cmake:           3.5.2-r1::gentoo
dev-util/pkgconfig:       0.28-r2::gentoo
sys-apps/baselayout:      2.2::gentoo
sys-apps/openrc:          0.21.7::gentoo
sys-apps/sandbox:         2.10-r1::gentoo
sys-devel/autoconf:       2.69::gentoo
sys-devel/automake:       1.14.1::gentoo, 1.15::gentoo
sys-devel/binutils:       2.25.1-r1::gentoo
sys-devel/gcc:            4.9.3::gentoo
sys-devel/gcc-config:     1.7.3::gentoo
sys-devel/libtool:        2.4.6::gentoo
sys-devel/make:           4.1-r1::gentoo
sys-kernel/linux-headers: 4.3::gentoo (virtual/os-headers)
sys-libs/glibc:           2.22-r4::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.europe.gentoo.org/gentoo-portage
    priority: -1000

croessner
    location: /usr/local/portage
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/easy-rsa /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--keep-going --with-bdeps=y --binpkg-respect-use=y --binpkg-changed-deps=y --usepkg=y --rebuilt-binaries=y --rebuilt-binaries-timestamp=20140405050000"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs ccache compressdebug config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://de-mirror.org/gentoo/ rsync://de-mirror.org/gentoo/"
LANG="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j25"
PKGDIR="/export/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="acl adns aio amd64 bacula-clientonly bacula-console bash-completion berkdb bindist btrfs bzip2 caps cli cracklib crypt curl cxx device-mapper dri gdbm hardened iconv ipv6 justify logrotate loop-aes lzo mmap mmx mmxext modules ncurses nls nptl nscd ntp openmp openssl pam pax_kernel pcre pie readline seccomp session sse sse2 ssl ssp tcpd threads unicode urandom vim-syntax xattr xtpax zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog aggregation cgroups contextswitch cpu cpufreq curl curl_json curl_xml disk email entropy ethstat exec filecount fscache hddtemp ipmi iptables logfile log_logstash multimeter netlink network nfs nginx ntpd numa openvpn ping processes protocols python sensors snmp uptime users uuid virt" CPU_FLAGS_X86="mmx sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" L10N="de en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console 
presenter-minimizer" LINGUAS="de en" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_4" QEMU_SOFTMMU_TARGETS="x86_64 i386" QEMU_USER_TARGETS="x86_64 i386" RUBY_TARGETS="ruby20 ruby21" USERLAND="GNU" VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 5 Anthony Basile gentoo-dev 2016-10-19 18:02:55 UTC
Thanks for the report.  I try my best to check many situations but I can't check everything.  I didn't test for this scenario before stabilizing.  I'll let the grsec/pax people know.
Comment 6 PaX Team 2016-10-19 19:55:39 UTC
1. can you try a vanilla kernel?
2. what's the qemu command line you use to start the vm?
Comment 7 Christian Roessner 2016-10-20 06:20:15 UTC
(In reply to PaX Team from comment #6)
> 1. can you try a vanilla kernel?
I will send feedback after this comment.

> 2. what's the qemu command line you use to start the vm?

/usr/bin/qemu-system-x86_64 -name guest=db.roessner-net.de,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-db.roessner-net.de/master-key.aes -machine pc-i440fx-2.5,accel=kvm,usb=off,vmport=off -cpu Westmere,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+pclmuldq,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+dca,+arat,+pdpe1gb,+rdtscp -m 1024 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 3e7575b2-266d-497f-b933-9c1ddaa24cf1 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-db.roessner-net.de/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot menu=on,strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive if=none,id=drive-ide0-0-1,readonly=on -device ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -drive file=/var/lib/libvirt/images/db.roessner-net.de.img,format=raw,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=22,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:17:02:30,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -chardev 
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channel/target/domain-1-db.roessner-net.de/org.qemu.guest_agent.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0 -spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device i6300esb,id=watchdog0,bus=pci.0,addr=0x9 -watchdog-action reset -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -msg timestamp=on

It is started with libvirt

virsh start db.roessner-net.de

The XML looks like this:

<domain type='kvm'>
  <name>db.roessner-net.de</name>
  <uuid>3e7575b2-266d-497f-b933-9c1ddaa24cf1</uuid>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <vmport state='off'/>
  </features>
  <cpu mode='host-model'>
    <model fallback='allow'/>
  </cpu>
  <clock offset='utc'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='no'/>
    <suspend-to-disk enabled='no'/>
  </pm>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='block' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/db.roessner-net.de.img'/>
      <target dev='vda' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:17:02:30'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='unix'>
      <source mode='bind'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='spice' autoport='yes'>
      <listen type='address'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <redirdev bus='usb' type='spicevmc'>
    </redirdev>
    <watchdog model='i6300esb' action='reset'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </watchdog>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/random</backend>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </rng>
  </devices>
</domain>
Comment 8 Christian Roessner 2016-10-20 06:50:29 UTC
(In reply to PaX Team from comment #6)
> 1. can you try a vanilla kernel?

Tested right now:

=vanilla-sources-4.7.8 is working perfectly
Comment 9 Agostino Sarubbo gentoo-dev 2016-10-20 12:58:15 UTC
It works for me with qemu-2.5.1
Comment 10 Christian Roessner 2016-10-20 13:34:14 UTC
(In reply to Agostino Sarubbo from comment #9)
> It works for me with qemu-2.5.1

I use the stable version 2.7.0-r4, which does not work. Older versions are no longer available.

I also run the latest stable libvirt. In short: my system is up to date with stable packages ;-)
Comment 11 miro.rovis 2016-10-22 10:49:04 UTC
PaX Team suggested, in this thread:
linux-grsec-4.7.7 locks up within 30 minutes
https://forums.grsecurity.net/viewtopic.php?f=3&t=4586&start=15#p16689
that the bug I encountered and reported here:
banning user... until system restart for ... kernel crash w/ Qemu
https://forums.grsecurity.net/viewtopic.php?f=3&t=4593
is the same as the one on this page. I posted a few Call Traces there, in case they are of any help.
Comment 12 miro.rovis 2016-10-22 13:54:21 UTC
Just rsync'd and tried 4.7.8-hardened. Same crash (just no Call Trace, log corrupted instead).
Comment 13 miro.rovis 2016-10-22 16:21:35 UTC
Actually, 4.7.5 crashes just like 4.7.6 and 4.7.7 when compiled with the .config of my 4.7.7 that crashed.

Here are my two configs of 4.7.5, with localversion -160929 and localversion -161022:

$ ls -lABRgo /mnt/sr0/config-4.7.5-hardened-16*[2-9]
-rw-r--r-- 1 117369 2016-09-29 17:17 /mnt/sr0/config-4.7.5-hardened-160929
-rw-r--r-- 1 120012 2016-10-22 16:34 /mnt/sr0/config-4.7.5-hardened-161022
$

The older one runs KVM guests fine; the newer one crashes as shown.

The diff between the two:

$ diff /mnt/sr0/config-4.7.5-hardened-16*[2-9]
56c56
< CONFIG_LOCALVERSION="-160929"
---
> CONFIG_LOCALVERSION="-161022"
90a91
> # CONFIG_IRQ_DOMAIN_DEBUG is not set
146c147,161
< # CONFIG_CGROUPS is not set
---
> CONFIG_CGROUPS=y
> CONFIG_PAGE_COUNTER=y
> CONFIG_MEMCG=y
> CONFIG_MEMCG_SWAP=y
> CONFIG_MEMCG_SWAP_ENABLED=y
> # CONFIG_BLK_CGROUP is not set
> # CONFIG_CGROUP_SCHED is not set
> # CONFIG_CGROUP_PIDS is not set
> # CONFIG_CGROUP_FREEZER is not set
> # CONFIG_CGROUP_HUGETLB is not set
> # CONFIG_CPUSETS is not set
> # CONFIG_CGROUP_DEVICE is not set
> # CONFIG_CGROUP_CPUACCT is not set
> # CONFIG_CGROUP_PERF is not set
> # CONFIG_CGROUP_DEBUG is not set
281a297
> # CONFIG_GCOV_KERNEL is not set
367a384
> # CONFIG_IOSF_MBI_DEBUG is not set
570a588
> # CONFIG_ACPI_CUSTOM_METHOD is not set
709a728
> CONFIG_NET_EGRESS=y
844c863,864
< # CONFIG_NETFILTER_NETLINK_GLUE_CT is not set
---
> CONFIG_NF_CT_NETLINK_HELPER=y
> CONFIG_NETFILTER_NETLINK_GLUE_CT=y
884,885c904,905
< # CONFIG_NETFILTER_XT_CONNMARK is not set
< # CONFIG_NETFILTER_XT_SET is not set
---
> CONFIG_NETFILTER_XT_CONNMARK=y
> CONFIG_NETFILTER_XT_SET=y
890c910
< # CONFIG_NETFILTER_XT_TARGET_CHECKSUM is not set
---
> CONFIG_NETFILTER_XT_TARGET_CHECKSUM=y
920a941
> # CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
950c971
< # CONFIG_NETFILTER_XT_MATCH_PHYSDEV is not set
---
> CONFIG_NETFILTER_XT_MATCH_PHYSDEV=y
1083c1104
< # CONFIG_BRIDGE_EBT_MARK_T is not set
---
> CONFIG_BRIDGE_EBT_MARK_T=y
1109c1130,1194
< # CONFIG_NET_SCHED is not set
---
> CONFIG_NET_SCHED=y
> 
> #
> # Queueing/Scheduling
> #
> # CONFIG_NET_SCH_CBQ is not set
> CONFIG_NET_SCH_HTB=y
> # CONFIG_NET_SCH_HFSC is not set
> # CONFIG_NET_SCH_PRIO is not set
> # CONFIG_NET_SCH_MULTIQ is not set
> # CONFIG_NET_SCH_RED is not set
> # CONFIG_NET_SCH_SFB is not set
> CONFIG_NET_SCH_SFQ=y
> # CONFIG_NET_SCH_TEQL is not set
> # CONFIG_NET_SCH_TBF is not set
> # CONFIG_NET_SCH_GRED is not set
> # CONFIG_NET_SCH_DSMARK is not set
> # CONFIG_NET_SCH_NETEM is not set
> # CONFIG_NET_SCH_DRR is not set
> # CONFIG_NET_SCH_MQPRIO is not set
> # CONFIG_NET_SCH_CHOKE is not set
> # CONFIG_NET_SCH_QFQ is not set
> # CONFIG_NET_SCH_CODEL is not set
> # CONFIG_NET_SCH_FQ_CODEL is not set
> # CONFIG_NET_SCH_FQ is not set
> # CONFIG_NET_SCH_HHF is not set
> # CONFIG_NET_SCH_PIE is not set
> CONFIG_NET_SCH_INGRESS=y
> # CONFIG_NET_SCH_PLUG is not set
> 
> #
> # Classification
> #
> CONFIG_NET_CLS=y
> # CONFIG_NET_CLS_BASIC is not set
> # CONFIG_NET_CLS_TCINDEX is not set
> # CONFIG_NET_CLS_ROUTE4 is not set
> CONFIG_NET_CLS_FW=y
> CONFIG_NET_CLS_U32=y
> # CONFIG_CLS_U32_PERF is not set
> # CONFIG_CLS_U32_MARK is not set
> # CONFIG_NET_CLS_RSVP is not set
> # CONFIG_NET_CLS_RSVP6 is not set
> # CONFIG_NET_CLS_FLOW is not set
> # CONFIG_NET_CLS_CGROUP is not set
> # CONFIG_NET_CLS_BPF is not set
> # CONFIG_NET_CLS_FLOWER is not set
> # CONFIG_NET_EMATCH is not set
> CONFIG_NET_CLS_ACT=y
> CONFIG_NET_ACT_POLICE=y
> CONFIG_NET_ACT_GACT=y
> # CONFIG_GACT_PROB is not set
> # CONFIG_NET_ACT_MIRRED is not set
> # CONFIG_NET_ACT_IPT is not set
> # CONFIG_NET_ACT_NAT is not set
> # CONFIG_NET_ACT_PEDIT is not set
> # CONFIG_NET_ACT_SIMP is not set
> # CONFIG_NET_ACT_SKBEDIT is not set
> # CONFIG_NET_ACT_CSUM is not set
> CONFIG_NET_ACT_VLAN=y
> # CONFIG_NET_ACT_BPF is not set
> # CONFIG_NET_ACT_CONNMARK is not set
> # CONFIG_NET_ACT_IFE is not set
> # CONFIG_NET_CLS_IND is not set
> CONFIG_NET_SCH_FIFO=y
1123a1209,1210
> # CONFIG_CGROUP_NET_PRIO is not set
> # CONFIG_CGROUP_NET_CLASSID is not set
1476a1564
> # CONFIG_IFB is not set
1478c1566,1567
< # CONFIG_MACVLAN is not set
---
> CONFIG_MACVLAN=y
> CONFIG_MACVTAP=y
1539a1629
> # CONFIG_SKY2_DEBUG is not set
3509a3600
> # CONFIG_MCE_AMD_INJ is not set
3699a3791
> # CONFIG_NFSD_FAULT_INJECTION is not set
3786a3879
> # CONFIG_DYNAMIC_DEBUG is not set
3799c3892
< # CONFIG_DEBUG_FS is not set
---
> CONFIG_DEBUG_FS=y
3919a4013
> # CONFIG_LKDTM is not set
3973a4068
> # CONFIG_DEBUG_BOOT_PARAMS is not set
$

Regards!
Miroslav Rovis
https://www.CroatiaFidelis.hr
Comment 14 miro.rovis 2016-10-22 16:38:20 UTC
Created attachment 451046 [details, diff]
diff btwn old no-crash 4.7.5 and new crashing 4.7.5

I should have attached that diff... Correcting that now.
Comment 15 miro.rovis 2016-10-23 01:46:42 UTC
Created attachment 451082 [details]
syslog Call Trace 4.7.9-hardened

For clarity I attach the Call Trace of the syslog with 4.7.9-hardened kernel.
Fallback to 4.4.8-hardened-r1 works here too, with all the netfilter advanced frills, with libvirt and all.
Comment 16 PaX Team 2016-10-23 13:05:32 UTC
i tried to reproduce this without success so let me ask you guys for a few more tests. note that we don't touch any related code and the irqfd list handling itself is simple enough that i don't see how it would be wrong so my guess is that there's probably a higher level race and/or use-after-free condition somewhere that clears the irqfds.items list pointers to NULL (which isn't a valid state even for an otherwise empty list, hence the oops/NULL-deref). so the tests:

1. try to disable SANITIZE
2. try to disable everything in grsec (but still patch it in)
3. try a vanilla kernel (no grsec patched in) but with PAGE_POISONING (new feature, imitates a subset of SANITIZE) with and then without PAGE_POISONING_ZERO as well.

these tests will hopefully narrow the problem down a bit. also if you can come up with a simpler reproducer than what Christian and Miro posted, i'd like to know.
Comment 17 PaX Team 2016-10-23 13:20:31 UTC
(In reply to PaX Team from comment #16)
> 3. try a vanilla kernel (no grsec patched in) but with PAGE_POISONING (new
> feature, imitates a subset of SANITIZE) with and then without
> PAGE_POISONING_ZERO as well.
for the above tests you'll also have to pass page_poison=on on the kernel command line to actually activate poisoning.
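
For reference, a minimal sketch of how one might confirm that a test kernel really has poisoning built in and activated, as described above. The function name check_page_poison and the default paths are assumptions for illustration; only the option names and the page_poison=on parameter come from this thread.

```shell
#!/bin/sh
# Sketch: report whether page poisoning is built in and switched on.
# check_page_poison is a hypothetical helper, not an existing tool.
# $1: kernel .config to inspect (default path is an assumption)
# $2: file holding the kernel command line (default: /proc/cmdline)
check_page_poison() {
    config=${1:-/usr/src/linux/.config}
    cmdline=${2:-/proc/cmdline}

    # Built-in options appear as "CONFIG_FOO=y" in the .config.
    for opt in CONFIG_PAGE_POISONING CONFIG_PAGE_POISONING_ZERO; do
        if grep -q "^${opt}=y$" "$config"; then
            echo "$opt=y"
        else
            echo "$opt not set"
        fi
    done

    # Split the command line into one parameter per line and look
    # for an exact "page_poison=on" entry.
    if tr ' ' '\n' < "$cmdline" | grep -qx 'page_poison=on'; then
        echo "page_poison=on present"
    else
        echo "page_poison=on missing"
    fi
}
```

Calling check_page_poison with no arguments after booting the test kernel inspects the default paths; pass the config file and a saved command line explicitly to check a kernel that is not currently running.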
Comment 18 miro.rovis 2016-10-24 01:12:23 UTC
(In reply to PaX Team from comment #17)
> (In reply to PaX Team from comment #16)
> > 3. try a vanilla kernel (no grsec patched in) but with PAGE_POISONING (new
> > feature, imitates a subset of SANITIZE) with and then without
> > PAGE_POISONING_ZERO as well.
> for the above tests you'll also have to pass page_poison=on on the kernel
> command line to actually activate poisoning.

I'll do what I can to run the tests you suggest. It will be slow, because I'm not advanced enough and my systems are all slow (and for other reasons), but I'll be working on this.

If anyone else can do it, pls. do: given the above, I'm unlikely to have results from any of the tests soon (a few hours if I'm very lucky, more likely 10-20 hours; I can't tell).
Comment 19 miro.rovis 2016-10-24 09:39:23 UTC
Created attachment 451296 [details]
The emerge--info.txt , complete, as root. Also for later.

First test (no SANITIZE) done. Kernel config and Call Trace should follow.
Comment 20 miro.rovis 2016-10-24 09:41:11 UTC
Created attachment 451298 [details]
config of 4.7.9-hardened w/o SANITIZE

The config of 4.7.9-hardened of 16-10-24 at 9h, complete.
Comment 21 miro.rovis 2016-10-24 09:48:16 UTC
Created attachment 451300 [details]
syslog w/ Call Trace

The messages_161024_1110_g5n when the Call Trace happened, at 11:10.

I proceeded exactly as in my previous tries, i.e. I followed the
https://wiki.gentoo.org/wiki/QEMU/Linux_guest
guide.

And it happened just like most of the previous times.

Next, I'll try test 2) that PaX Team suggested (disable everything in grsec, but still patch it in).

However, while I taught beginners to grsec-patch vanilla kernel in Debian Forums, I have never yet patched a hardened-sources ;-) The big boys always did it for me... If I get in trouble, maybe you could, blueness and swift, tell the Forum people to let me in the forums, so I can ask for help (I'm still banned since a few months ago)? Could you?

Regards!
---
Miroslav Rovis
Zagreb, Croatia
http://www.CroatiaFidelis.hr
Try refute: [url=http://www.crmbuyer.com/story/39565.html]rootkit hooks in kernel[/url],
[url=https://forums.grsecurity.net/viewtopic.php?f=7&t=2522]linux capabilities for intrusion[/url]? (Linus?)
Comment 22 miro.rovis 2016-10-24 10:34:01 UTC
I thought about this:
> However, while I taught beginners to grsec-patch vanilla kernel in Debian
> Forums, I have never yet patched a hardened-sources ;-) The big boys always
> did it for me... If I get in trouble, maybe you could, blueness and swift,
> tell the Forum people to let me in the forums, so I can ask for help (I'm
> still banned since a few months ago)? Could you?
and I figured out that I do not need any special ebuild rewriting or the like. I can use the same kernel source, in which grsec is already patched in (I just looked it up in the build logs of hardened-sources-4.7.9 in portage/logs), and only disable everything in grsec in its config.

So I'll be doing the test 2) now.
Comment 23 miro.rovis 2016-10-24 10:59:33 UTC
Created attachment 451310 [details, diff]
config-4.7.9-hardened-161024_12.diff

This is the diff btwn the:

config-4.7.9-hardened-161024_09 (posted complete, 2-3 or so comments above)
and:

config-4.7.9-hardened-161024_12 (being compiled as I write)

PaX Team, if that is not quite what you meant, pls. do tell!

Once it compiles, I'll reboot into it and run same commands as quite a few times previously by now. If anyone can think of a better way to test for this bug, pls. tell!
Comment 24 PaX Team 2016-10-24 11:06:14 UTC
(In reply to miro.rovis from comment #23)
> This is the diff btwn the:
> 
> config-4.7.9-hardened-161024_09 (posted complete, 2-3 or so comments above)
> and:
> 
> config-4.7.9-hardened-161024_12 (being compiled as I write)
> 
> PaX Team, if that is not quite what you meant, pls. do tell!
this diff is somewhat confusing, can you try 'diff -u old_file new_file' next time please?
Comment 25 miro.rovis 2016-10-24 11:45:19 UTC
(In reply to PaX Team from comment #24)
> this diff is somewhat confusing, can you try 'diff -u old_file new_file'
> next time please?
I will.

I'll attach the results (qemu booted fine). Next.
Comment 26 miro.rovis 2016-10-24 11:48:51 UTC
Created attachment 451314 [details]
messages_161024_1325_g5n

These are the messages roughly during the time of the qemu command run.

The Linux_guest wiki page test went fine. I booted into guest gentoo minimal
amd64 install CD, and rebooted and quit. All with the same commands as
previously.

Here are the logs. With the ' port 0 '. And it's all happening in the Air-Gapped that never sees online (I still have to add more to the topic on the grsec Forums about it, but I guess it's unrelated to here).

Next I'll try the test 3) that you asked for.
Comment 27 miro.rovis 2016-10-24 12:47:31 UTC
(In reply to PaX Team from comment #16)
> 3. try a vanilla kernel (no grsec patched in) but with PAGE_POISONING (new
> feature, imitates a subset of SANITIZE) with and then without
> PAGE_POISONING_ZERO as well.
Does the order matter? I've just compiled (but not yet booted into) 4.8.3, but without PAGE_POISONING.

config-4.8.3-161024_14

And I'm now compiling the

config-4.8.3-161024_1430

with PAGE_POISONING and PAGE_POISONING_ZERO. I'll give the diff next. If the order matters, which one do I run first?

I'll try, in some 15-20 minutes that it takes to compile, the with PAGE_POISONING and PAGE_POISONING_ZERO first.
Comment 28 miro.rovis 2016-10-24 12:51:27 UTC
Created attachment 451318 [details, diff]
diff config-4.8.3-161024_14 config-4.8.3-161024_1430
Comment 29 PaX Team 2016-10-24 13:11:20 UTC
(In reply to miro.rovis from comment #26)
> These are the messages roughly during the time of the qemu command run.
> 
> The Linux_guest wiki page test went fine. I booted into guest gentoo minimal
> amd64 install CD, and rebooted and quit. All with the same commands as
> previously.
so it means that it's one of the grsecurity features that triggers the oops. this is good news because now you can do a binary search on these options to find out which one.
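A minimal sketch of such a binary search, assuming exactly one option is responsible and that options can be enabled independently of each other (real kernel options have dependencies, so this is an idealization). The names `find_culprit` and `triggers_bug` are hypothetical; `triggers_bug` stands in for the manual step "build a kernel with only these options enabled, boot it, start the guest, and report whether the oops appears".

```python
def find_culprit(options, triggers_bug):
    """Return the one option that triggers the bug by halving the candidates.

    options      -- list of option names (assumed: exactly one is the culprit)
    triggers_bug -- hypothetical callback; given the list of options to enable,
                    returns True if the resulting kernel shows the oops
    """
    candidates = list(options)
    while len(candidates) > 1:
        # Enable only the first half of the remaining candidates.
        half = candidates[: len(candidates) // 2]
        if triggers_bug(half):
            # The culprit is among the enabled half.
            candidates = half
        else:
            # The culprit must be in the half that was left disabled.
            candidates = candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    # Placeholder option names; the "culprit" here is arbitrary, not a finding.
    names = ["OPT_A", "OPT_B", "OPT_C", "OPT_D", "OPT_E"]
    print(find_culprit(names, lambda enabled: "OPT_D" in enabled))
```

With N options this needs about log2(N) kernel builds instead of N, which is the point of bisecting rather than toggling options one at a time.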
Comment 30 PaX Team 2016-10-24 13:12:40 UTC
(In reply to miro.rovis from comment #27)
> (In reply to PaX Team from comment #16)
> > 3. try a vanilla kernel (no grsec patched in) but with PAGE_POISONING (new
> > feature, imitates a subset of SANITIZE) with and then without
> > PAGE_POISONING_ZERO as well.
> Does the order matter? I've just compiled (but not yet booted into) 4.8.3,
> but without PAGE_POISONING.
no, you can try them in any order, they're independent tests.
Comment 31 miro.rovis 2016-10-24 13:26:43 UTC
Went for the with PAGE_POISONING (and _ZERO).

I was just about expecting all to go fine, because the qemu script from the wiki page ran fine, booted the install-gentoo-minimal-something.iso and rebooted and quit, but then upon issuing:
# shutdown -r 0
to reboot and try the without both PAGE_POISONING (and _ZERO), and there the panic!

Since it is not guaranteed that it will be in the system log, I had better manually copy it (my cellphone with the camera is broken).

In the next comment, or I'll make it an attachment. In short, there is no string "kvm_irqfd_release" there, and neither kvm on its own, but it is to do with my old Hauppauge HVR3000 Hybrid TV card, because there are cx8800 strings...

Useful to manually type and post an attachment with likely pretty accurate Call Trace? Or not needed?
Comment 32 miro.rovis 2016-10-24 14:08:44 UTC
Created attachment 451322 [details]
config-4.8.3-161024_1430_CallTrace.txt

A few lines at the start should have uppercase changed to lowercase and lowercase to uppercase.
This is not necessarily the final transcription of it, if important parts are not clear enough.
So, PaX Team, pls. do tell me, do I need to go again through all of the screen (eyes hurting a bit, but I will if it is necessary) to check on exactly which lines?
The screen will be waiting for your reply. Not rebooting till then.
Comment 33 PaX Team 2016-10-24 15:07:39 UTC
(In reply to miro.rovis from comment #32)
> So, PaX Team, pls. do tell me, do I need to go again through all of the
> screen (eyes hurting a bit, but I will if it is necessary) to check on
> exactly which lines?
it's a different problem that you should perhaps report to kernel devs. so for now i'd suggest to bisect grsec options to find which one causes the irqfd oops.
Comment 34 miro.rovis 2016-10-24 15:19:27 UTC
(In reply to PaX Team from comment #33)
> (In reply to miro.rovis from comment #32)
> > So, PaX Team, pls. do tell me, do I need to go again through all of the
> > screen (eyes hurting a bit, but I will if it is necessary) to check on
> > exactly which lines?
> it's a different problem that you should perhaps report to kernel devs.
I see. Hmmmh...
Anyway, first I'll post the corrected version which I already prepared, of the Call Trace (in case that it isn't in the logs). Next.
> so
> for now i'd suggest to bisect grsec options to find which one causes the
> irqfd oops.
And then, how do I do that? Using something like half the options to see if the problem is in that half, and go on like that (I looked up "binary search" in duckduckgo.com)?
Comment 35 miro.rovis 2016-10-24 15:21:16 UTC
Created attachment 451328 [details]
config-4.8.3-161024_1430_CallTrace_v2.txt

This will be the final version in case the Call Trace is not to be found in the syslog.
Comment 36 miro.rovis 2016-10-24 15:29:00 UTC
(In reply to miro.rovis from comment #34)
> (In reply to PaX Team from comment #33)
> > (In reply to miro.rovis from comment #32)
> > so
> > for now i'd suggest to bisect grsec options to find which one causes the
> > irqfd oops.
> And then, how do I do that? Using something like half the options to see if
> the problem is in that half, and go on like that (I looked up "binary
> search" in duckduckgo.com)?
I've slept very little last night. I'll probably be off dozing away soon. Else I'm sick for certain.
And also, I need to study Qemu and its options first, to make reasonable command lines out of the few that are there.
So I need time now.

If anyone else is reading here and wants to step in and try, great!

Ah, first I'll go once without PAGE_POISONING, to finish the proposed test, and post it.
Comment 37 miro.rovis 2016-10-24 15:42:56 UTC
I booted into 4.8.3 vanilla compiled without PAGE_POISONING, and I saw the same Call Trace as with 4.8.3 vanilla compiled with PAGE_POISONING (well, with minor differences, but lots of hex numbers were the same).

Sure the main difference was this one was localversion 14, not 1430.

I was right to take it down manually. There is nothing in the system log for either of the with/without PAGE_POISONING vanilla kernel tests.

Tests done. Need time now.
Comment 38 miro.rovis 2016-10-24 20:48:04 UTC
Created attachment 451342 [details]
GentooVM

To make (belatedly) more clear which command I used in all the examples in the bug page, I'll first post the GentooVM command (it differs from the current command of that name in the Wiki page, because it uses the -netdev syntax, as per the current note on that Wiki page https://wiki.gentoo.org/wiki/QEMU/Linux_guest).
Comment 39 miro.rovis 2016-10-24 20:56:47 UTC
Created attachment 451344 [details]
GentooVM_PART1.sh

Sorry that the previous GentooVM attachment doesn't show in Bugzilla but has to be downloaded. I'll add '.sh' to the next command to try and see if it then will.

And I cut the last part from the command that causes panic.
That is the first next test that I will run.
I am now back at:
4.7.9-hardened
with all the cgroup and advanced netfilter frills that libvirt complains about ("is not set when it should be") if they are not configured when libvirt compiles.
Comment 40 miro.rovis 2016-10-24 21:07:15 UTC
The command:
GentooVM_PART1.sh -boot d -cdrom install-amd64-minimal-20161020.iso
ran fine, except moaning about "no space left" and dropping me to an emergency shell. (necessary to post the system logs to confirm this?)
So I now will run the command like the GentooVM_PART1.sh, but with the memory option added, next.
Comment 41 miro.rovis 2016-10-24 21:09:53 UTC
Created attachment 451348 [details]
GentooVM_PART1_mem.sh

GentooVM_PART1_mem.sh booted without "no space left" complaint.
Comment 42 miro.rovis 2016-10-24 21:13:35 UTC
Created attachment 451350 [details]
GentooVM_PART1_monitor.sh

The command:
GentooVM_PART1_monitor.sh -boot d -cdrom install-amd64-minimal-20161020.iso
also ran fine.
Comment 43 miro.rovis 2016-10-24 21:17:28 UTC
I'm sorry I messed up again. Pls forget all these wrong last three tests or so.
I ran them with the grsec patched in but all options disabled. Which means I ran them with the kernel of the localversion -161024_12 instead of -161024_09 (see attachment https://597554.bugs.gentoo.org/attachment.cgi?id=451310 with the diff of those two)
Rerunning those three last tests next, but with the 4.7.9-hardened, localversion -161024_09.
Comment 44 miro.rovis 2016-10-24 21:34:58 UTC
Nope, but I was closer. First I just ran with yesterday's (European time) kernel:
which from the 
4.7.9-hardened, localversion -161024_09
differs only in that SANITIZE is enabled in it.
And I don't want to lose the Call Trace, but have no patience nor time to manually type it again (since in grsec-hardened kernel of late it seems to me Call Traces are mostly saved in the logs anyway).
The difference btwn these already posted Call Traces I'll try and tell...
Compared to this one:
https://597554.bugs.gentoo.org/attachment.cgi?id=451082
It has similar text (not the same!) to:

Oct 23 00:47:39 g5n kernel: [  170.480825] RAX: 0000000000000000 RBX: ffff8803f88b4000 RCX: 0000000000000001
Oct 23 00:47:39 g5n kernel: [  170.480974] RDX: 0000000000000001 RSI: ffff880427845000 RDI: ffff8803f88b4a50
Oct 23 00:47:39 g5n kernel: [  170.481122] RBP: ffffc9000b423d50 R08: 0000000000000000 R09: 0000000000000000
Oct 23 00:47:39 g5n kernel: [  170.481271] R10: ffff8804278450d0 R11: 0000039286840000 R12: ffff8803f88b4a58
Oct 23 00:47:39 g5n kernel: [  170.481419] R13: ffff8803f88b4a50 R14: ffff8800b999d9c0 R15: ffff8800b9d441a8
Oct 23 00:47:39 g5n kernel: [  170.481569] FS:  0000039286640b00(0000) GS:ffff88043fc80000(0000) knlGS:0000000000000000
Oct 23 00:47:39 g5n kernel: [  170.481738] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 23 00:47:39 g5n kernel: [  170.481882] CR2: 0000000000000000 CR3: 0000000002187000 CR4: 00000000000006f0
Oct 23 00:47:39 g5n kernel: [  170.482029] Stack:
Oct 23 00:47:39 g5n kernel: [  170.482075]  ffff8803f88b4000 0000000000000008 ffff8800b9808b40 ffffc9000b423d68
Oct 23 00:47:39 g5n kernel: [  170.482248]  ffffffff8101a334 ffff880427845000 ffffc9000b423da8 ffffffff8119ce39
Oct 23 00:47:39 g5n kernel: [  170.482419]  ffff8804278450d0 ffff880427845080 ffff8803f883e400 ffff8803f883e840
Oct 23 00:47:39 g5n kernel: [  170.482590] Call Trace:
Oct 23 00:47:39 g5n kernel: [  170.482650]  [<ffffffff8101a334>] kvm_vm_release+0x17/0x32
Oct 23 00:47:39 g5n kernel: [  170.482768]  [<ffffffff8119ce39>] __fput+0x121/0x1d5
Oct 23 00:47:39 g5n kernel: [  170.482874]  [<ffffffff8119cf33>] ____fput+0x11/0x22
Oct 23 00:47:39 g5n kernel: [  170.482981]  [<ffffffff810d347e>] task_work_run+0x8c/0xb3
Oct 23 00:47:39 g5n kernel: [  170.483111]  [<ffffffff810bb9c1>] do_exit+0x40f/0x999
Oct 23 00:47:39 g5n kernel: [  170.483220]  [<ffffffff810dc364>] ? wake_up_state+0x1d/0x2d
Oct 23 00:47:39 g5n kernel: [  170.483341]  [<ffffffff810c4c04>] ? signal_wake_up_state+0x2c/0x4b
Oct 23 00:47:39 g5n kernel: [  170.483471]  [<ffffffff810bbfdd>] do_group_exit+0x48/0xb8
Oct 23 00:47:39 g5n kernel: [  170.483586]  [<ffffffff810bc05f>] sys_exit_group+0x12/0x1a
Oct 23 00:47:39 g5n kernel: [  170.483705]  [<ffffffff81b63924>] entry_SYSCALL_64_fastpath+0x13/0xa3
Oct 23 00:47:39 g5n kernel: [  170.483841] Code: 00 00 00 00 55 48 89 e5 41 55 41 54 49 89 fc 4d 8d ac 24 50 0a 00 00 49 81 c4 58 0a 00 00 53 4c 89 ef e8 b7 44 b4 00 49 8b 04 24 <48> 8b 18 48 8d b8 40 ff ff ff 48 81 eb c0 00 00 00 48 8d 87 c0 
Oct 23 00:47:39 g5n kernel: [  170.484609] RIP  [<ffffffff8101ec78>] kvm_irqfd_release+0x27/0x85
Oct 23 00:47:39 g5n kernel: [  170.484744]  RSP <ffffc9000b423d38>
Oct 23 00:47:39 g5n kernel: [  170.484818] CR2: 0000000000000000

and does not have the rest, including the

...
kvm_irqfd_release
...

Just if I lose it, that it be approximately clear what it looked like.
Comment 45 miro.rovis 2016-10-24 21:46:54 UTC
Created attachment 451352 [details]
config-4.7.9-161023_CallTrace.txt

Of course I was wrong. It was there, it only didn't show on the screen (wouldn't fit on 1024x768 ?).

Now do I try the same with:

4.7.9-hardened, localversion -161024_09

(SANITIZE disabled)? I think that I should.
Comment 46 miro.rovis 2016-10-24 22:03:50 UTC
Created attachment 451354 [details]
kernel-4.7.9-hardened-161024_09_at_161024_2353.syslog

This is all that happened:

$ GentooVM_PART1.sh -boot d -cdrom install-amd64-minimal-20161020.iso 
ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory
failed to initialize KVM: Cannot allocate memory
$

So I guess I should try GentooVM_PART1_mem.sh command.

Pls. PaX Team do tell me if I'm still making wrong choices.
Comment 47 miro.rovis 2016-10-24 22:07:26 UTC
And I now get that same "Cannot allocate memory failed to initialize KVM" error with any of:
GentooVM_PART1.sh
GentooVM_PART1_mem.sh
and even
GentooVM
(see previous attachments).

What's happening? I'm lost...
Comment 48 miro.rovis 2016-10-24 22:18:17 UTC
The same kernel as in:
Comment 21 
And the same command:
GentooVM -boot d -cdrom install-amd64-minimal-20161020.iso 
as in:
https://597554.bugs.gentoo.org/attachment.cgi?id=451300
(the attachment of that Comment).
Also with the later variants of GentooVM_<something>.
But now it all flops, nothing happens, no errors in the logs, qemu doesn't start, and it's uninfluenced from outside Air-Gapped machine...
Comment 49 miro.rovis 2016-10-25 02:19:56 UTC
I'll rerun tests with 
first) my regular, previous to PaX Team call for tests more or less complete grsec-including features
second) the no-SANITIZE one.
I'll be including the kernel config comparison with the one posted in 
https://bugs.gentoo.org/show_bug.cgi?id=597554#c20

I can confirm that the one in my machine that I posted in comment 20 is:
bc5961bb27839b11202b95e6ddd189d7fab3444970e259fa2a9725097e3e852a  config-4.7.9-hardened-161024_09
just as (should be) when you download it.
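A downloaded attachment can be checked against such a value with a short sketch like the following, assuming the 64-hex-digit value above is a SHA-256 sum (its length is consistent with that, but the comment does not say so). The helper names `sha256_of` and `matches` are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns the empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's digest equals the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Usage would be along the lines of `matches("config-4.7.9-hardened-161024_09", "bc5961bb...")` with the full value from the comment pasted in.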

Below also the first of the next five attachments should be:

11e03066799548c1680a288a935579d64e76bc5acd837b71e17e3661d5552397  config-4.7.9-hardened-161023

And I'll post the config-4.7.9-hardened-161023 for comparison.

then the syslog to confirm it, then the test itself for each, with the Call Trace or if not with analysis.

I have a few hours at most to work on this. I have important things to do for maybe even two or three days afterwards, so after a few more tests, I'll be off.

So there are now two tests, complete. These files I'll post, it is:
config-4.7.9-hardened-161023
hardened-4.7.9-161023-161025_0332_test_syslog_kernel_1_boot.log
hardened-4.7.9-161023-161025_0332_test_syslog_kernel_2_CallTrace.log
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_1_boot.log
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_2_CallTrace_NONE.log
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_3.stout

I'll attach these files one by one, in the above order.
Comment 50 miro.rovis 2016-10-25 02:20:52 UTC
Created attachment 451360 [details]
config-4.7.9-hardened-161023
Comment 51 miro.rovis 2016-10-25 02:21:23 UTC
Created attachment 451362 [details]
hardened-4.7.9-161023-161025_0332_test_syslog_kernel_1_boot.log
Comment 52 miro.rovis 2016-10-25 02:21:59 UTC
Created attachment 451364 [details]
hardened-4.7.9-161023-161025_0332_test_syslog_kernel_2_CallTrace.log
Comment 53 miro.rovis 2016-10-25 02:22:41 UTC
Created attachment 451366 [details]
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_1_boot.log
Comment 54 miro.rovis 2016-10-25 02:23:09 UTC
Created attachment 451368 [details]
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_2_CallTrace_NONE.log
Comment 55 miro.rovis 2016-10-25 02:23:45 UTC
Created attachment 451370 [details]
hardened-4.7.9-161024_09_161025_0347_test_syslog_kernel_3.stout
Comment 56 miro.rovis 2016-10-25 02:31:45 UTC
Now some kind advice would be great to hear on what I'll try to understand (with the last available time left for now), so that, if advised soon, I can run more tests.

What does this mean?

If the only difference btwn the two kernels, one panic'ing, the other not doing the job is:

# diff -u config-4.7.9-hardened-161023 config-4.7.9-hardened-161024_09 
--- config-4.7.9-hardened-161023	2016-10-23 00:40:35.000000000 +0200
+++ config-4.7.9-hardened-161024_09	2016-10-25 04:10:47.379531793 +0200
@@ -53,7 +53,7 @@
 CONFIG_INIT_ENV_ARG_LIMIT=32
 CONFIG_CROSS_COMPILE=""
 # CONFIG_COMPILE_TEST is not set
-CONFIG_LOCALVERSION="-161023"
+CONFIG_LOCALVERSION="-161024_09"
 # CONFIG_LOCALVERSION_AUTO is not set
 CONFIG_HAVE_KERNEL_GZIP=y
 CONFIG_HAVE_KERNEL_BZIP2=y
@@ -3905,6 +3905,7 @@
 # Memory Debugging
 #
 # CONFIG_PAGE_EXTENSION is not set
+# CONFIG_DEBUG_PAGEALLOC is not set
 # CONFIG_PAGE_POISONING is not set
 # CONFIG_DEBUG_OBJECTS is not set
 # CONFIG_SLUB_DEBUG_ON is not set
@@ -4153,7 +4154,7 @@
 #
 # Miscellaneous hardening features
 #
-CONFIG_PAX_MEMORY_SANITIZE=y
+# CONFIG_PAX_MEMORY_SANITIZE is not set
 CONFIG_PAX_MEMORY_STACKLEAK=y
 CONFIG_PAX_MEMORY_STRUCTLEAK=y
 # CONFIG_PAX_MEMORY_UDEREF is not set
#

which means, with SANITIZE: crash, w/o SANITIZE: no work...

If that's the only difference, what is the next test to do? I'm still at a loss.
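To double-check that nothing else differs, two configs can also be compared option by option rather than line by line. This is only a sketch under the assumption that the files use the usual `CONFIG_X=value` and `# CONFIG_X is not set` line forms seen in the diff above; `parse_config` and `changed_options` are hypothetical helper names.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def changed_options(old_text, new_text):
    """Return {name: (old_value, new_value)} for options that differ.

    None means the option does not appear in that file at all.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Fed the contents of the two config files, it reports only the options whose value changed or which appear in just one of the files, independent of where in the file they sit.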
Comment 57 miro.rovis 2016-10-25 03:39:23 UTC
While I wait just a little longer for advice on what to do now, I'll revisit from comment:
https://bugs.gentoo.org/show_bug.cgi?id=597554#c19
and also comment 20 and comment 21.
I will check out again my logs, and look up more closely because it is unbelievable that back then there was, with:
hardened-161024_09
a panic and a Call Trace, and after more tests, and after the install of vanilla kernel, there was only, with the same:
hardened-161024_09
[there was only] qemu no-work.

That discrepancy is striking to me. But I won't post about it if there is no misposting of my attachments and some error of some kind in my presenting of them, and if there are no more replies to read here...

Would shutting down the machine, and disconnecting it from the mains, and pressing the on-switch to discharge, and restarting after a few more minutes and rerunning the test with the no-SANITIZE
hardened-161024_09
kernel make any difference? I'll try that too...
Comment 58 miro.rovis 2016-10-25 04:45:50 UTC
(In reply to miro.rovis from comment #57)
> While I wait just a little longer for advice on what to do now, I'll revisit
> from comment:
> https://bugs.gentoo.org/show_bug.cgi?id=597554#c19
> and also comment 20 and comment 21.
> I will check out again my logs, and look up more closely because it is
> unbelievable that back then there was, with:
> hardened-161024_09
> a panic and a Call Trace, and after more tests, and after the install of
> vanilla kernel, there was only, with the same:
> hardened-161024_09
> [there was only] qemu no-work.
> 
> That discrepancy is striking to me. But I won't post about it if there is no
> misposting of my attachments and some error of some kind in my presenting of
> them, and if there are no more replies to read here...
OK, I'll still confirm that all those logs are correct and correspond to what I'll archive for a little longer, if need appeared that someone else make sure.
 
> Would shutting down the machine, and disconnecting it from the mains, and
> pressing the on-switch to discharge, and restarting after a few more minutes
> and rerunning the test with the no-SANITIZE
> hardened-161024_09
> kernel make any difference? I'll try that too...
No, it didn't make any difference. We are at consistent crash for hardened-161023 and now newly consistent "ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory" as in
 https://bugs.gentoo.org/show_bug.cgi?id=597554#c46
for hardened-161024_09

And we are still at (only) 4.4.8-hardened-r1 doing the job with libvirt and qemu and guests.

Everything best I wish to everybody!
Comment 59 Francisco Blas Izquierdo Riera (RETIRED) gentoo-dev 2016-10-25 20:00:10 UTC
Little googling around revealed a use after free on kvm_irqfd_release but no patches nor comments on it :(
https://lkml.org/lkml/2016/6/21/541

My guess is that this is caused by some issue on the upstream kernel and the only reason why this is noticed more actively with memory sanitization is because the memory is filled with 0s triggering a null pointer dereference instead of something else.

Just throwing around some pointers (pun intended) in case they are of help.
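To illustrate the guess above, here is a toy model in Python (not kernel code, and not a reproduction of the actual bug): a circular list head points to itself when empty, so a walk over an empty list ends immediately, while a head whose pointers were overwritten with zeros fails on the first dereference. The names `ListHead`, `walk` and `zero_fill` are made up for this sketch; `zero_fill` stands in for freed memory being sanitized.

```python
class ListHead:
    """Toy circular doubly linked list head: an empty list points to itself."""

    def __init__(self):
        self.next = self
        self.prev = self


def walk(head):
    """Count entries by following next pointers until we are back at head."""
    count = 0
    node = head.next
    while node is not head:
        # If node is None (the zeroed case), this attribute access raises
        # AttributeError, the Python analogue of a NULL pointer dereference.
        node = node.next
        count += 1
    return count


def zero_fill(head):
    """Model sanitized (zero-filled) memory: both pointers become None."""
    head.next = None
    head.prev = None


if __name__ == "__main__":
    head = ListHead()
    print(walk(head))       # a valid empty list is walked without error
    zero_fill(head)
    try:
        walk(head)
    except AttributeError as exc:
        print("dereference of zeroed pointer:", exc)
```

Without zero-filling, stale but plausible-looking pointers could let such a walk continue for a while, which would fit the guess that the underlying problem is only noticed more readily with sanitization.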
Comment 60 miro.rovis 2016-10-26 04:07:02 UTC
(In reply to Francisco Blas Izquierdo Riera from comment #59)
> Little googling around revealed a use after free on kvm_irqfd_release but no
> patches nor comments on it :(
> https://lkml.org/lkml/2016/6/21/541
> 
> My guess is that this is caused by some issue on the upstream kernel and the
> only reason why this is noticed more actively with memory sanitization is
> because the memory is filled with 0s triggering a null pointer dereference
> instead of something else.
> 
> Just throwing around some pointers (pun intended) in case they are of help.

kernel with kasan compiled in fails to boot #41
https://github.com/google/kasan/issues/41

I like helping getting whitehats on use-after-free bugs... If only we get the perpetrators, willing or nilling perpetrators...
Comment 61 miro.rovis 2016-11-08 13:56:21 UTC
Created attachment 452712 [details]
messages_161108_125631_g5n with Call Traces

This bug appears to carry on to 4.7.10-hardened-r2 .

It's recorded in the syslog that I attach, what happened.

Nov  8 12:52:19
$ qemu-img create -f qcow2 GentooVM.img 15G

Nov  8 12:56:01 (but the cat... is exec'd at bottom of the 30 seconds later
created messages_161108_125631_g5n, below)
# sleep 30 && cat /var/log/messages | grep -aE -A30000 68798.280977 \
	> messages_$(date +%y%m%d_%H%M%S)_g5n
( NOTE: I broke the line for readability in email )

Nov  8 12:56:09
$ GentooVM -boot d -cdrom install-amd64-minimal-20161020.iso
(where GentooVM is attachment:
 https://597554.bugs.gentoo.org/attachment.cgi?id=451342
at comment:
 https://bugs.gentoo.org/show_bug.cgi?id=597554#c38)

Nov  8 12:56:09
( $ the script GentooVM command line reported by grsec:
qemu-system-x86_64 -enable-kvm -cpu host -drive file=GentooVM.img ... )

Nov  8 12:56:11
( reported by kernel: BUG: unable to handle kernel NULL pointer... )
( two Call Traces reported by kernel all in that second )
 Pls. see attachment for that.

Nov  8 12:56:31
( grsec reports the execution of the cat, date and grep commands from close to
the top of this post --I only cut the excessive after lines (-A30000) from the
attachment)

I can't let this go on without trying... more testing and maybe get you
seniors to identify the roots of what causes this issue and then fix it.

My system I always build in Air-Gapped. While I can not be sure, it is still
likely that if the "higher level race and/or use-after-free condition" (which
might be what breaks it here, as PaX Team said in
https://bugs.gentoo.org/show_bug.cgi?id=597554#c16 ) are causing this, these
race and/or condition have likely, or at the least it is perfectly possible
that, they have been brought into my Air-Gapped by some package(s) that
was/were signed-allowed into some of the portage snapshots, which I
exclusively use for my emerge-webrsync Air-Gapped updates of my Gentoo (I keep
all the portage snapshots for longer yet, just in case, as well as all the
packages that I ever installed since I went Air-Gapped from scratch some
maybe three years ago now).

Should I try and follow PaX Team's suggestion by first: revising my previous
steps, but with the updated kernels, both hardened of 4.4.x and 4.7.x (or if
it gets to 4.{7,8}.x in the meantime) series as well as vanilla kernels, when
I followed PaX Team's advice at
https://bugs.gentoo.org/show_bug.cgi?id=597554#c16 , and then go for his
suggestion at https://bugs.gentoo.org/show_bug.cgi?id=597554#c29 and do a
binary search on the options given to qemu-system-x86_64 and hopefully narrow
down as to the root causes of this?

Kind regards!
Miroslav Rovis
Zagreb, Croatia
http://www.CroatiaFidelis.hr
Comment 62 miro.rovis 2016-11-08 14:14:15 UTC
Created attachment 452714 [details]
config-4.7.10-hardened-r2-161107_06.diff from config-4.7.9-hardened-161024_09

The kernel config diff of today's Call Traces.

The reference config is:
https://bugs.gentoo.org/attachment.cgi?id=451298
at comment:
https://bugs.gentoo.org/show_bug.cgi?id=597554#c20

I hope the old emerge --info is fine, as in:
https://bugs.gentoo.org/attachment.cgi?id=451296
at comment
https://bugs.gentoo.org/show_bug.cgi?id=597554#c19

but if any more is needed, you seniors pls. do tell!
Comment 63 Étienne Buira 2016-11-14 20:17:49 UTC
After being hit by this bug with hardened-sources-4.7.10 (kvm going bad with a similar stack trace, other parts of the system looked fine) on my workstation, i ran some tests on a spare box.

I could not reproduce the issue on the test box, whatever kernel version (tried vanilla 4.7.6 with page poisoning enabled & 4.4.8, hardened-4.7.10 with and without SANITIZE). Thus, i'm inclined to think the bug happens only on some hardware configurations.

The test box (where i couldn't reproduce the bug) dates back to 2008 and has 8GB of RAM.

The workstation (where the bug happens) has 32GB of RAM.

To other people who experience the bug, can you report a bit more about your hardware?

FWIW: looking at kvm_irqfd_release source, generated code, and the stack trace, it looks like the whole kvm struct is freed/zeroed.
Comment 64 miro.rovis 2016-11-16 11:56:36 UTC
(In reply to Étienne Buira from comment #63)
> To other people who experience the bug, can you report a bit more about your
> hardware?
My hardware is as old as this Gentoo Forum post of mine (posted right when I bought it):
https://forums.gentoo.org/viewtopic-t-940916-postdays-0-postorder-asc-start-0.html#7173430
The HDDs and some peripherals only may be newer currently.
Comment 65 miro.rovis 2017-01-08 17:06:46 UTC
Created attachment 459184 [details]
Qemu_GentooVM_170108_emerge--info.txt

There have been some changes with this bug (just: not solved yet).

I've followed (completely correctly this time around) the procedure outlined
by PaX Team in Comment 16:
> i tried to reproduce this without success so let me ask you guys for a few
> more tests. note that we don't touch any related code and the irqfd list
> handling itself is simple enough that i don't see how it would be wrong so
> my guess is that there's probably a higher level race and/or use-after-free
> condition somewhere that clears the irqfds.items list pointers to NULL
> (which isn't a valid state even for an otherwise empty list, hence the
> oops/NULL-deref). so the tests:
> 
> 1. try to disable SANITIZE
> 2. try to disable everything in grsec (but still patch it in)
> 3. try a vanilla kernel (no grsec patched in) but with PAGE_POISONING (new
> feature, imitates a subset of SANITIZE) with and then without
> PAGE_POISONING_ZERO as well.
> 
> these tests will hopefully narrow the problem down a bit. also if you can
> come up with a simpler reproducer than what Christian and Miro posted, i'd
> like to know.

and in Comment 17:
> for the above tests you'll also have to pass page_poison=on on the
> kernel command line to actually activate poisoning.

The attachment to this comment is:

Qemu_GentooVM_170108_emerge--info.txt

Here are the kernels that I did the testing with, first the names of the
kernels, well: of their config's, next their config's (for just the two
"master" kernels) or the diffs from their respective "master" (for the
remaining three).

Unless the internet should misbehave (just generally saying, my Gentoo works
just fine; meaning only: I don't control the future, and least of all do I
control, say, my provider), I am attaching the config's/the config.diff's
next, in successive order, to this Bugzilla.

But there's one that I don't need to attach, because it's already there, and I
carefully checked it: I used that same script in all of today's tests. It's the
script that I named "GentooVM", just as in the wiki page:
https://wiki.gentoo.org/wiki/QEMU/Linux_guest#Configuration
(which I have never yet been able to complete _with_ the _hardened_), and it
is at the address:

https://bugs.gentoo.org/attachment.cgi?id=451342

near Comment 38 above.

The command line I used in all the five tests was, again (just with the
current ISO from: https://www.gentoo.org/downloads/):

$ GentooVM -boot d -cdrom install-amd64-minimal-20170105.iso

So these are the kernels that I tested with today:

-rw-r--r-- 1  120981 2017-01-06 05:42 config-4.8.15-hardened-r2-170106_05
-rw-r--r-- 1  118033 2017-01-08 04:28 config-4.9.1-170108_04
-rw-r--r-- 1  118044 2017-01-08 06:23 config-4.9.1-170108_05
-rw-r--r-- 1  121028 2017-01-08 07:12 config-4.8.15-hardened-r2-170108_05
-rw-r--r-- 1  120082 2017-01-08 11:24 config-4.8.15-hardened-r2-170108_10

And accordingly these [will] be the attachments to follow, in this order, next:

config-4.8.15-hardened-r2-170106_05
config-4.9.1-170108_04 
config-4.9.1-170108_05.diff
config-4.8.15-hardened-r2-170108_05.diff
config-4.8.15-hardened-r2-170108_10.diff

The diffs I obtained in this way:

diff -u config-4.9.1-170108_04 config-4.9.1-170108_05 > \
	config-4.9.1-170108_05.diff

diff -u config-4.8.15-hardened-r2-170106_05 \
	config-4.8.15-hardened-r2-170108_05 > \
	config-4.8.15-hardened-r2-170108_05.diff

diff -u config-4.8.15-hardened-r2-170106_05 \
	config-4.8.15-hardened-r2-170108_10 > \
	config-4.8.15-hardened-r2-170108_10.diff

After I, hopefully, post them, I will tell how each of the five tests fared.
And I'm bracing (teeth gnashing ;-) ) for the binary search on which of the
grsecurity options clashes with the likely race condition and/or
use-after-free condition we (likely) have here
(
because, in short, only the kernels corresponding to
config-4.8.15-hardened-r2-170106_05 and config-4.8.15-hardened-r2-170108_05
(the with and without SANITIZE) didn't start the VM at all, short report on
the error:
"ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory"
), and the config-4.8.15-hardened-r2-170108_10 is grsec compiled-in but no
options whatsoever selected at all.
Comment 66 miro.rovis 2017-01-08 17:09:31 UTC
Created attachment 459188 [details]
config-4.8.15-hardened-r2-170106_05
Comment 67 miro.rovis 2017-01-08 17:11:48 UTC
Created attachment 459190 [details]
config-4.9.1-170108_04
Comment 68 miro.rovis 2017-01-08 17:12:49 UTC
Created attachment 459192 [details]
config-4.9.1-170108_05.diff
Comment 69 miro.rovis 2017-01-08 17:13:56 UTC
Created attachment 459194 [details]
config-4.8.15-hardened-r2-170108_05.diff
Comment 70 miro.rovis 2017-01-08 17:14:33 UTC
Created attachment 459196 [details, diff]
config-4.8.15-hardened-r2-170108_10.diff
Comment 71 miro.rovis 2017-01-08 17:24:40 UTC
In more detail now, how those kernels fared.

Wherever "kvm: zapping shadow pages" appears below (text from
/var/log/messages), the VM started and booted, all fine.

So only config-4.8.15-hardened-r2-170106_05 and
config-4.8.15-hardened-r2-170108_05 failed to start the VM, let alone boot it.

config-4.8.15-hardened-r2-170106_05

$ GentooVM -boot d -cdrom install-amd64-minimal-20170105.iso 
ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory
failed to initialize KVM: Cannot allocate memory
$ 

---

config-4.9.1-170108_04

page_poison=on both POISON, and POISON_ZERO

Jan  8 06:27:07 g5n login[3983]: ROOT LOGIN  on '/dev/tty5'
Jan  8 06:28:20 g5n kernel: [  147.964388] kvm [4390]: vcpu0, guest rIP: 0xffffffff8103a831 unhandled rdmsr: 0xc0010048
Jan  8 06:28:20 g5n kernel: [  148.006688] kvm: zapping shadow pages for mmio generation wraparound
Jan  8 06:28:20 g5n kernel: [  148.024985] kvm: zapping shadow pages for mmio generation wraparound

---

config-4.9.1-170108_05

page_poison=on both POISON, *no* POISON_ZERO

Jan  8 06:34:56 g5n sudo:     miro : TTY=pts/11 ; PWD=/home/miro ; USER=root ; COMMAND=/bin/bash
Jan  8 06:35:39 g5n kernel: [  211.424682] kvm [4404]: vcpu0, guest rIP: 0xffffffff8103a831 unhandled rdmsr: 0xc0010048
Jan  8 06:35:39 g5n kernel: [  211.467392] kvm: zapping shadow pages for mmio generation wraparound
Jan  8 06:35:39 g5n kernel: [  211.487056] kvm: zapping shadow pages for mmio generation wraparound

---

config-4.8.15-hardened-r2-170108_05

$ GentooVM -boot d -cdrom install-amd64-minimal-20170105.iso 
ioctl(KVM_CREATE_VM) failed: 12 Cannot allocate memory
failed to initialize KVM: Cannot allocate memory
$

---

config-4.8.15-hardened-r2-170108_10

Jan  8 11:31:10 g5n sudo:     miro : TTY=pts/12 ; PWD=/home/miro ; USER=root ; COMMAND=/bin/bash
Jan  8 11:32:50 g5n kernel: [  329.157388] kvm [4450]: vcpu0, guest rIP: 0xffffffff8103a831 unhandled rdmsr: 0xc0010048
Jan  8 11:32:50 g5n kernel: [  329.200026] kvm: zapping shadow pages for mmio generation wraparound
Jan  8 11:32:50 g5n kernel: [  329.218557] kvm: zapping shadow pages for mmio generation wraparound
Jan  8 11:34:49 g5n kernel: [  447.564079] kvm [4450]: vcpu0, guest rIP: 0xffffffff8103a831 unhandled rdmsr: 0xc0010048

Next comes the binary search for the grsecurity option that clashes with virtualization (or reveals things...). The tests above were already half a day's work, so please bear with me a while longer.
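
One narrowing step of the binary search mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names bisect_candidates and first_half are hypothetical, and restricting the candidates to GRKERNSEC/PAX options set to "y" is an assumption, not something established in this bug.

```shell
#!/bin/sh
# Sketch only: one narrowing step of a binary search over kernel options.
# The function names and the GRKERNSEC/PAX filter are illustrative assumptions.

# Print options that are "=y" in the failing config but not in the working one.
bisect_candidates() {
    bad=$1
    good=$2
    grep -E '^CONFIG_(GRKERNSEC|PAX)[A-Z0-9_]*=y$' "$bad" |
        grep -vxF -f "$good" |
        sed 's/=y$//'
}

# Read a candidate list on stdin and print its first half (rounded up):
# the options to switch off in the next test build.
first_half() {
    list=$(cat)
    n=$(printf '%s\n' "$list" | grep -c . || true)
    printf '%s\n' "$list" | head -n $(( (n + 1) / 2 ))
}
```

If the VM starts with the first half switched off, the culprit is in that half; otherwise it is in the remainder. Either way the list to test shrinks by about half per kernel build.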
Comment 72 yandereson 2017-01-08 18:51:34 UTC
KVM gives me problems when "CONFIG_GRKERNSEC_SYSFS_RESTRICT" is enabled in the kernel; you should try turning that off.
It gives me the exact same error as you stated above.

https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity_and_PaX_Configuration_Options#Sysfs.2Fdebugfs_restriction
Comment 73 miro.rovis 2017-01-08 19:38:02 UTC
Created attachment 459214 [details, diff]
config-4.8.15-hardened-r2-170108_18.diff

(In reply to yandereson from comment #72)
> KVM gives me problems with this enabled in kernel
> "CONFIG_GRKERNSEC_SYSFS_RESTRICT" should try and turn that off.
> It gives me exact same error as you stated above.
> 
> https://en.wikibooks.org/wiki/Grsecurity/Appendix/
> Grsecurity_and_PaX_Configuration_Options#Sysfs.2Fdebugfs_restriction
Apparently that is it! Thanks!

While a kernel is compiling whose only difference from my full,
security-optimized grsecurity-hardened config is that GRKERNSEC_SYSFS option
being disabled, here is the result from my initial binary search:

The attachment config-4.8.15-hardened-r2-170108_18.diff is derived like this:

diff -u config-4.8.15-hardened-r2-170106_05 \
	config-4.8.15-hardened-r2-170108_18 > \
	config-4.8.15-hardened-r2-170108_18.diff

And qemu booted fine:

Jan  8 19:17:51 g5n kernel: [  264.064129] grsec: (miro:U:/usr/bin/qemu-system-x86_64) exec of /usr/bin/qemu-system-x86_64 (qemu-system-x86_64 -enable-kvm -cpu host -drive file=GentooVM.img,if=virtio -netdev user,id=vmnic,hostname=gentoovm -device virt) by /usr/bin/qemu-system-x86_64[GentooVM:4437] uid/euid:1000/1000 gid/egid:1000/1000, parent /bin/bash[bash:4306] uid/euid:1000/1000 gid/egid:1000/1000
...
Jan  8 19:17:59 g5n kernel: [  272.513073] kvm [4437]: vcpu0, guest rIP: 0xffffffff8103a831 unhandled rdmsr: 0xc0010048
Jan  8 19:17:59 g5n kernel: [  272.556214] kvm: zapping shadow pages for mmio generation wraparound
Jan  8 19:17:59 g5n kernel: [  272.573557] kvm: zapping shadow pages for mmio generation wraparound

That is likely because, if you grep the diff for that string, you get:

$ grep GRKERNSEC_SYSFS config-4.8.15-hardened-r2-170108_18.diff 
-CONFIG_GRKERNSEC_SYSFS_RESTRICT=y
+# CONFIG_GRKERNSEC_SYSFS_RESTRICT is not set
$
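
The same check can also be made against a full config file rather than a diff. A minimal sketch, assuming a plain-text kernel config; config_opt_state is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Sketch only: print the state of one kernel option in a config file.
# Prints "y", "m" or a literal value, or "unset" when the option is
# commented out ("# CONFIG_X is not set") or absent.
config_opt_state() {
    opt=$1
    file=$2
    val=$(sed -n "s/^${opt}=//p" "$file" | head -n 1)
    if [ -n "$val" ]; then
        printf '%s\n' "$val"
    else
        printf 'unset\n'
    fi
}

# Example call (file name taken from the attachments in this bug):
# config_opt_state CONFIG_GRKERNSEC_SYSFS_RESTRICT config-4.8.15-hardened-r2-170106_05
```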

Will post the (likely) confirmation upon the (likely) final compilation, as
the binary search might not be necessary any more.
Comment 74 miro.rovis 2017-01-08 20:01:29 UTC
Created attachment 459218 [details, diff]
config-4.8.15-hardened-r2-170108_20.diff

Please see config-4.8.15-hardened-r2-170108_20.diff; its sole difference
from my security-optimized grsecurity-hardened config is this:

+# CONFIG_GRKERNSEC_SYSFS_RESTRICT is not set

And qemu booted the CD image just fine!

Jan  8 20:47:58 g5n kernel: [  192.256308] grsec: (miro:U:/usr/bin/qemu-system-x86_64) exec of /usr/bin/qemu-system-x86_64 (qemu-system-x86_64 -enable-kvm -cpu host -drive file=GentooVM.img,if=virtio -netdev user,id=vmnic,hostname=gentoovm -device virt) by /usr/bin/qemu-system-x86_64[GentooVM:4208] uid/euid:1000/1000 gid/egid:1000/1000, parent /bin/bash[bash:4079] uid/euid:1000/1000 gid/egid:1000/1000
...
Jan  8 20:48:03 g5n kernel: [  197.427606] kvm [4208]: vcpu0, guest rIP: 0xffffffff8103a831 unhandled rdmsr: 0xc0010048
Jan  8 20:48:03 g5n kernel: [  197.476991] kvm: zapping shadow pages for mmio generation wraparound
Jan  8 20:48:03 g5n kernel: [  197.509899] kvm: zapping shadow pages for mmio generation wraparound

Phew!

PaX Team, spender, blueness, what are the security considerations here?
Is virtualization stripping us of another protective layer here? How badly
will we miss that protection, if I may ask, and if any of you have time to tell
us about it?
Comment 75 Étienne Buira 2017-02-23 10:02:46 UTC
I can confirm CONFIG_GRKERNSEC_SYSFS_RESTRICT exposes the bug.

From an strace of qemu (compared with and without sysfs_restrict), everything looks the same up to the VM-creation ioctl.
Comment 76 Brad Spengler 2017-02-23 12:19:37 UTC
Can you paste me from the log what sysfs entries qemu is requiring access to?

Thanks,
-Brad
Comment 77 Étienne Buira 2017-02-23 12:43:49 UTC
hardened-sources-4.8.17-r2 fails in a cleaner way (ENOMEM).

@Brad: no sysfs access is required. I have open("/dev/kvm", O_RDWR|O_CLOEXEC) = 9, followed by some ioctls on fd 9, then ioctl(9, KVM_CREATE_VM, 0)        = -1 ENOMEM (Cannot allocate memory)

Reading the sources, I see only kzalloc(sizeof(struct whatever), GFP_KERNEL), kvm_kvzalloc, alloc_percpu, and kmalloc(sizeof(struct whatever), GFP_KERNEL) that can return ENOMEM. Hard to believe it has something to do with sysfs protection!
Comment 78 Étienne Buira 2017-02-23 12:52:28 UTC
Created attachment 464872 [details]
Strace of failing qemu on hardened-sources-4.8.17-r2 with sysfs protection
Comment 79 Brad Spengler 2017-02-23 13:06:21 UTC
Can you apply https://grsecurity.net/~spender/debugfs_debug.diff and give me the kernel logs it produces?  I'm not sure how SYSFS_RESTRICT would be causing the failure of any of that code, but we'll get it figured out and fixed.

Thanks!
-Brad
Comment 80 Étienne Buira 2017-02-23 15:46:59 UTC
"failed to create directory", nice catch :)
Comment 81 Étienne Buira 2017-02-23 18:14:04 UTC
I tried to understand a bit more about this issue; here is what I got:

dentry creation failed in fs/debugfs/inode.c:start_creation with EACCES
/sys/kernel/debug/kvm is rwx------ root:root
I start qemu as a user (member of the kvm group)
hence, the task had no right to create a dentry there

After pth="/sys/kernel/debug/kvm"; chown :kvm $pth; chmod 770 $pth ; the directory can be created, but it then fails to create pf_fixed: "failed to create debugfs file for pf_fixed" (i.e. the first entry)
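
For anyone wanting to repeat the permission check described above, a rough sketch follows. check_dir_access is a hypothetical helper name, and the test covers only ordinary permission bits as seen by the calling user; any additional grsecurity restriction would not show up here.

```shell
#!/bin/sh
# Sketch only: report how a directory looks to the current user.
# check_dir_access is a hypothetical helper name, not part of any tool.
check_dir_access() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing or unreachable"
    elif [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable"
    else
        echo "denied"
    fi
}

# Run as the same unprivileged user that starts qemu:
check_dir_access /sys/kernel/debug/kvm
```

Note that "missing or unreachable" is also what an unprivileged user would see if a parent directory such as /sys/kernel/debug cannot be traversed.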
Comment 82 Étienne Buira 2017-02-23 19:52:57 UTC
Created attachment 464920 [details, diff]
Remove grsec protection from debugfs

Also, as the debugfs mountpoint is hard enough to reach, I considered removing its grsec protection (done with the attached patch). It works nicely with this patch applied (with some sysfs protection still in place).
Comment 83 Brad Spengler 2017-02-23 22:17:09 UTC
Apply this instead:
https://grsecurity.net/~spender/kvm.diff

I'll be applying it to the next patches.

-Brad
Comment 84 Brad Spengler 2017-02-23 22:19:19 UTC
Just to add to this, I believe GRKERNSEC_KMEM being enabled would prevent this problem from happening as well, as it'd force DEBUG_FS off and then make kvm's check for debugfs presence fail, letting the function return 0 and continuing without error.

-Brad
Comment 85 Étienne Buira 2017-02-24 08:54:01 UTC
Your patch is better (despite generating a warning) and works. Thanks!