I started the ovs-* daemons, then shut down the KVM guests, and then ran 'rmmod openvswitch' followed by 'modprobe openvswitch'. Here is the result:
[36037.399202] br0: port 2(ovs-most) entered disabled state
[36079.367201] openvswitch: Open vSwitch switching datapath
[36079.367214] PAX: please report this to pageexec@freemail.hu
[36079.367217] BUG: unable to handle kernel NULL pointer dereference at (nil)
[36079.367239] IP: [<ffffffffa0027015>] ovs_init_net+0x15/0x3c [openvswitch]
[36079.367257] PGD 417ff8000
[36079.367268] Oops: 0000 [#1] SMP
[36079.367281] Modules linked in: openvswitch(+) scsi_dh_rdac scsi_dh vhost_net tun ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bridge stp llc virtio_scsi virtio_pci virtio_mmio virtio_balloon virtio_rng virtio_ring virtio timeriomem_rng rng_core ipv6 ioatdma dca intel_mid_dma inet_lro hid_generic usbhid dm_mod sr_mod cdrom iTCO_wdt coretemp kvm_intel kvm crc32_pclmul crc32c_intel aesni_intel xts aes_x86_64 lpc_ich lrw gf128mul ehci_pci fan thermal mfd_core processor ehci_hcd ablk_helper thermal_sys e1000e usbcore ptp i2c_i801 ahci cryptd pps_core libahci usb_common i2c_core button hwmon unix [last unloaded: openvswitch]
[36079.367520] CPU 0
[36079.367526] Pid: 29548, comm: modprobe Not tainted 3.9.4-hardened-r1 #1 Supermicro X9SCL/X9SCM/X9SCL/X9SCM
[36079.367545] RIP: 0010:[<ffffffffa0027015>] [<ffffffffa0027015>] ovs_init_net+0x15/0x3c [openvswitch]
[36079.367565] RSP: 0018:ffff8804054ffbf8 EFLAGS: 00010202
[36079.367576] RAX: 0000000000000001 RBX: ffffffffa0023680 RCX: 0000000000000000
[36079.367609] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81659600
[36079.367641] RBP: ffff8804054ffbf8 R08: 0000000000010da0 R09: ffff880416098fe0
[36079.367672] R10: ffff88042f175400 R11: 0000000000000000 R12: ffff880416098fe0
[36079.367702] R13: ffffffff81659600 R14: ffff880419495f00 R15: 0000000000000000
[36079.367733] FS: 0000026de5f53700(0000) GS:ffff88042fc00000(0000) knlGS:0000000000000000
[36079.367778] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[36079.367806] CR2: 0000000000000000 CR3: 000000000160d000 CR4: 00000000001407f0
[36079.367836] DR0: 00000000000000a0 DR1: 0000000000000000 DR2: 0000000000000003
[36079.367866] DR3: 00000000000000b0 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[36079.367897] Process modprobe (pid: 29548, threadinfo ffff8801bd598538, task ffff8801bd598000)
[36079.367944] Stack:
[36079.367965] ffff8804054ffc48 ffffffff81386e9e ffff8804054ffc20 ffffffff81031056
[36079.368019] ffffffffa0023680 ffffffff81659600 ffffffffa0023680 ffff8804054ffc68
[36079.368073] ffff8804054ffe98 ffffffffa0019360 ffff8804054ffc98 ffffffff8138713a
[36079.368125] Call Trace:
[36079.368150] [<ffffffff81386e9e>] ops_init+0xae/0x150
[36079.368177] [<ffffffff81031056>] ? native_pax_close_kernel+0x26/0x40
[36079.368208] [<ffffffffa0023680>] ? .LC11+0x454/0x454 [openvswitch]
[36079.368238] [<ffffffffa0023680>] ? .LC11+0x454/0x454 [openvswitch]
[36079.368267] [<ffffffff8138713a>] register_pernet_operations+0xea/0x160
[36079.368297] [<ffffffffa0023680>] ? .LC11+0x454/0x454 [openvswitch]
[36079.368326] [<ffffffff813871de>] register_pernet_device+0x2e/0x70
[36079.368356] [<ffffffffa002703c>] ? ovs_init_net+0x3c/0x3c [openvswitch]
[36079.368386] [<ffffffffa002707f>] dp_init+0x43/0x2e6a [openvswitch]
[36079.368416] [<ffffffffa002703c>] ? ovs_init_net+0x3c/0x3c [openvswitch]
[36079.368445] [<ffffffff81000367>] do_one_initcall+0x147/0x170
[36079.368475] [<ffffffff81099cf4>] load_module+0x1e34/0x2480
[36079.368502] [<ffffffff81095c90>] ? sys_getegid16+0x50/0x50
[36079.368531] [<ffffffffa0027168>] ? dp_init+0x12c/0x2e6a [openvswitch]
[36079.368560] [<ffffffff8109a493>] sys_init_module+0x153/0x220
[36079.368589] [<ffffffff81431375>] system_call_fastpath+0x18/0x1d
[36079.368617] Code: <3b> 02 76 02 0f 0b ff c8 48 98 48 8b 44 c2 18 48 85 c0 75 02 0f 0b
[36079.368704] RIP [<ffffffffa0027015>] ovs_init_net+0x15/0x3c [openvswitch]
[36079.368734] RSP <ffff8804054ffbf8>
[36079.368758] CR2: 0000000000000000
[36079.369125] ---[ end trace 5bc8de1a729cfb76 ]---
I'm not sure whether this is somehow related to bug #469500. I didn't recompile net-misc/openvswitch against the new kernel.
# emerge --info
Portage 2.1.11.62 (hardened/linux/amd64, gcc-4.7.3, glibc-2.15-r3, 3.9.4-hardened-r1 x86_64)
=================================================================
System uname: Linux-3.9.4-hardened-r1-x86_64-Intel-R-_Xeon-R-_CPU_E3-1230_V2_@_3.30GHz-with-gentoo-2.2
KiB Mem: 16450984 total, 10936604 free
KiB Swap: 4193264 total, 4193264 free
Timestamp of tree: Sat, 01 Jun 2013 04:15:01 +0000
ld GNU gold (GNU Binutils 2.22) 1.11
ccache version 3.1.9 [enabled]
app-shells/bash: 4.2_p45
dev-lang/python: 2.7.3-r3, 3.2.3-r2
dev-util/ccache: 3.1.9
dev-util/cmake: 2.8.10.2-r2
dev-util/pkgconfig: 0.28
sys-apps/baselayout: 2.2
sys-apps/openrc: 0.11.8
sys-apps/sandbox: 2.5
sys-devel/autoconf: 2.69
sys-devel/automake: 1.11.6, 1.12.6
sys-devel/binutils: 2.22-r1
sys-devel/gcc: 4.7.3, 4.8.0
sys-devel/gcc-config: 1.7.3
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82-r4
sys-kernel/linux-headers: 3.7 (virtual/os-headers)
sys-libs/glibc: 2.15-r3
Repositories: gentoo qemu-init
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -frecord-gcc-switches -fno-unwind-tables -fno-asynchronous-unwind-tables -fexpensive-optimizations"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -march=native -frecord-gcc-switches -fno-unwind-tables -fno-asynchronous-unwind-tables -fexpensive-optimizations"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs ccache collision-protect config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="pl_PL.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,--sort-common"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="-O"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/qemu-init"
SYNC="rsync://gentoo-mirror/gentoo-portage"
USE="acl acpi amd64 avx bash-completion caps custom-cflags cxx hardened hwdb iconv mmxext multilib nls openmp sse2 sse3 sse4 sse41 sse4_1 ssse3 threads udev unicode vim-syntax xattr" ABI_X86="64" CURL_SSL="openssl" ELIBC="glibc" GRUB_PLATFORMS="multiboot pc" KERNEL="linux" LINGUAS="en" PYTHON_TARGETS="python2_7 python3_2" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" USERLAND="GNU"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, USE_PYTHON
Created attachment 349874: kernel config
Naturally, I can't reproduce it now...
(In reply to Marcin Mirosław from comment #0)
> I'm not sure if it can be somehow related to bug #469500. I didn't recompile
> net-misc/openvswitch with new kernel.
How about you recompile the module for the current kernel and try that instead? Mismatched versions are never a good idea; CONFIG_MODVERSIONS exists for a reason.
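One way to check for the kind of mismatch mentioned above is to compare the module's vermagic string with the running kernel release. The following is only a minimal Python sketch and not something posted in this report: the function names are hypothetical, it assumes `modinfo` from kmod is installed, and it relies on the first whitespace-separated field of vermagic being the kernel release the module was built for.

```python
import platform
import subprocess
from typing import Optional


def vermagic_release(vermagic: str) -> str:
    """Return the first field of a vermagic string (the kernel release)."""
    fields = vermagic.split()
    return fields[0] if fields else ""


def module_matches_kernel(vermagic: str, running_release: str) -> bool:
    """True if the module was built for the given kernel release."""
    return vermagic_release(vermagic) == running_release


def read_vermagic(module: str = "openvswitch") -> Optional[str]:
    """Ask modinfo for the module's vermagic; None if that fails."""
    try:
        result = subprocess.run(
            ["modinfo", "-F", "vermagic", module],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def check_module(module: str = "openvswitch") -> Optional[bool]:
    """Compare the module on disk with the running kernel (hypothetical helper)."""
    vermagic = read_vermagic(module)
    if vermagic is None:
        return None  # modinfo missing or module not found
    return module_matches_kernel(vermagic, platform.release())
```

Calling `check_module()` on the affected host would return `True` when the module found by `modinfo` was built for the running kernel, `False` on a mismatch, and `None` if `modinfo` could not be queried. It says nothing about symbol CRCs, which is what CONFIG_MODVERSIONS actually checks at load time.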
I just noticed that I'm using the module shipped with the kernel, not the one from net-misc/openvswitch, so the module in use was built for the running kernel. I don't have a good environment for trying to reproduce the problem because this is a production machine.
(In reply to Marcin Mirosław from comment #5)
> I'm using module from kernel not from net-misc/openvswitch - I just noticed
> it. So it was used module from running kernel.
In that case it'd be nice if you could eventually test a vanilla kernel as well; right now I don't see what in grsec could cause this.
On the second try I couldn't reproduce it. If I find a way to reproduce it on hardened, I'll try a vanilla kernel.
Now I get a hard lock of the host when I try to run `rmmod openvswitch`. The hard lock also occurs on vanilla-sources-3.9.7, so it looks like grsec doesn't change the kernel's behaviour here.
(In reply to Marcin Mirosław from comment #8)
> Now I have hard lock of host when I'm trying to do `rmmmod openvswitch`.
> Hard lock appears on vanilla-sources-3.9.7 also. it looks grsec changes
> nothing in kernel behaviour.
Okay, this is purely a vanilla issue, so I'll send it their way.
1. Is anything written to the log (e.g. /var/log/messages) before the hard lock?
2. Are you still able to use Magic SysRq? [1]
3. Is this sys-apps/kmod or something else? Which version?
4. Which kernel version was the last working version?
5. Can you try a later version (3.9.7) and a development version (3.10-rc7)?
It would be nice to know whether it work(ed|s) at some point, so we can inspect which changes caused this. Also, if there is a chance that some output is written before the hard lock, that would be very useful debugging information.
[1]: http://en.wikipedia.org/wiki/Magic_SysRq_key
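Whether Magic SysRq can help during a hard lock depends partly on the value in /proc/sys/kernel/sysrq: 0 disables it, 1 enables every function, and larger values are a bitmask of allowed function groups. As a rough sketch only (the helper names are hypothetical and the descriptions are paraphrased from the kernel's sysrq documentation), the setting could be decoded like this:

```python
from typing import List, Optional

# Bitmask groups used when the sysrq value is greater than 1.
SYSRQ_FUNCTIONS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def enabled_sysrq_functions(value: int) -> List[str]:
    """List the function groups allowed by a kernel.sysrq value."""
    if value == 0:
        return []  # SysRq completely disabled
    if value == 1:
        return list(SYSRQ_FUNCTIONS.values())  # everything enabled
    return [desc for bit, desc in SYSRQ_FUNCTIONS.items() if value & bit]


def read_sysrq_setting(path: str = "/proc/sys/kernel/sysrq") -> Optional[int]:
    """Read the current setting; None if the file is missing or unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None
```

For collecting task and CPU backtraces before a lockup, the "debugging dumps" group (bit 8) is the relevant one; `enabled_sysrq_functions(read_sysrq_setting() or 0)` shows whether it is allowed on a given host.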
Ad. 1. No.
Ad. 2. Probably not. First of all, my only console access is over IPMI, which isn't very reliable. Pressing any key doesn't wake the console from blanking. I haven't tested Magic SysRq yet; I'll try to.
Ad. 3. [ebuild R ] sys-apps/kmod-12-r1 USE="tools -debug -doc -lzma -static-libs -zlib"
Ad. 4. I didn't try. I've only recently started using openvswitch.
Ad. 5. I'll try. It's a production machine, so I can't reboot it too frequently :) I can't promise I'll do it this week. I'll try to reproduce the problem in a VM, where testing would be easier.
Have you since been able to reproduce this? Did you try a more recent kernel? As for the original call trace, I just noted this line:
> [36079.368177] [<ffffffff81031056>] ? native_pax_close_kernel+0x26/0x40
That call is not present in a vanilla kernel, which makes inspecting this harder. Could you get us a new call trace from vanilla sources without PaX?
I've tried to reproduce the problem on the host and inside a VM (with gentoo-sources and with hardened-sources), without success. I've also tried on 3.9.4-hardened-r1, again without success.
(In reply to Marcin Mirosław from comment #13)
> I've tried to reproduce problem on host and inside vm (with gentoo-sources
> and with hardened-sources). Without success. I've also tried on
> 3.9.4-hardened-r1, again without success.
I'm confused. Is the bug in comment 0 reproducible or not?
Anthony, when I ran the tests two days ago I couldn't reproduce it. Maybe that's because I changed the openvswitch configuration a little, I don't know. Maybe it was some kind of glitch.
(In reply to Marcin Mirosław from comment #15)
> Anthony, when I did tests two days ago I couldn't reproduce it. Maybe
> because I changed a little configuration of openvswitch, dunno. Maybe it was
> a some kind of glitch.
Please reopen this bug if you hit this again in the future. Thank you in advance.