I start the ipsec daemon (strongswan). Nearly every time I ping a remote host just after ipsec is up, I get an oops. I have seen this on 3 different VMs, with at least these kernels: 3.7.5-r1, 3.8.3, 3.8.6. They are full VMs running under VMware ESXi.

Reproducible: Sometimes

Steps to Reproduce:
1. Have a hardened-gentoo kernel.
2. Configure ipsec with strongswan with a network-to-network tunnel.
3. Test ipsec connectivity with a simple ping.

Actual Results:
The ping process can't be killed. It is in D state in htop. I get an oops.

Expected Results:
No oops.

The oops:
May 20 12:51:06 sargeras kernel: PAX: please report this to pageexec@freemail.hu
May 20 12:51:06 sargeras kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000000002a0
May 20 12:51:06 sargeras kernel: IP: [<ffffffff813bd7b6>] xfrm_output_one+0xa7/0x230
May 20 12:51:06 sargeras kernel: PGD 7ca5f000
May 20 12:51:06 sargeras kernel: Thread overran stack, or stack corrupted
May 20 12:51:06 sargeras kernel: Oops: 0000 [#1] SMP
May 20 12:51:06 sargeras kernel: Modules linked in: xfrm_user vsock(O) vmsync(O) coretemp processor thermal_sys microcode vmci(O)
May 20 12:51:06 sargeras kernel: CPU 0
May 20 12:51:06 sargeras kernel: Pid: 2274, comm: ping Tainted: G O 3.8.6-hardened #2 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
May 20 12:51:06 sargeras kernel: RIP: 0010:[<ffffffff813bd7b6>] [<ffffffff813bd7b6>] xfrm_output_one+0xa7/0x230
May 20 12:51:06 sargeras kernel: RSP: 0018:ffff88007b4b98e8 EFLAGS: 00010286
May 20 12:51:06 sargeras kernel: RAX: 000000000000021c RBX: ffff88007b400d80 RCX: 0000000000000000
May 20 12:51:06 sargeras kernel: RDX: 00000000fffffde4 RSI: 0000000000000000 RDI: ffff88007b400d80
May 20 12:51:06 sargeras kernel: RBP: ffff88007b4b9918 R08: 00000000d97586c6 R09: 0000000000000600
May 20 12:51:06 sargeras kernel: R10: ffff88007b4b9718 R11: ffff88007ada90f0 R12: 0000000000000000
May 20 12:51:06 sargeras kernel: R13: 8000000000000000 R14: 000000000203a8c0 R15: 0000000000000000
May 20 12:51:06 sargeras kernel: FS: 0000032bc14a9700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
May 20 12:51:06 sargeras kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May 20 12:51:06 sargeras kernel: CR2: 00000000000002a0 CR3: 0000000001434000 CR4: 00000000000007f0
May 20 12:51:06 sargeras kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 20 12:51:06 sargeras kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 20 12:51:06 sargeras kernel: Process ping (pid: 2274, threadinfo ffff88007cb1dc40, task ffff88007cb1d850)
May 20 12:51:06 sargeras kernel: Stack:
May 20 12:51:06 sargeras kernel: ffff88007b4b9948 ffffffff00000001 0000000000000001 ffff88007b400d80
May 20 12:51:06 sargeras kernel: 8000000000000000 000000000203a8c0 ffff88007b4b9958 ffffffff813bda44
May 20 12:51:06 sargeras kernel: ffffffff813ed84c ffff88007b400d80 0000000000000004 ffff88007b400d80
May 20 12:51:06 sargeras kernel: Call Trace:
May 20 12:51:06 sargeras kernel: [<ffffffff813bda44>] xfrm_output_resume+0x105/0x131
May 20 12:51:06 sargeras kernel: [<ffffffff813ed84c>] ? xfrm6_extract_output+0x3d/0x3d
May 20 12:51:06 sargeras kernel: [<ffffffff813ed84c>] ? xfrm6_extract_output+0x3d/0x3d
May 20 12:51:06 sargeras kernel: [<ffffffff813bda8a>] xfrm_output2+0x1a/0x22
May 20 12:51:06 sargeras kernel: [<ffffffff8136e65d>] ? ip_setup_cork+0xfb/0xfb
May 20 12:51:06 sargeras kernel: [<ffffffff813bdb4b>] xfrm_output+0xb9/0xca
May 20 12:51:06 sargeras kernel: [<ffffffff813ed84c>] ? xfrm6_extract_output+0x3d/0x3d
May 20 12:51:06 sargeras kernel: [<ffffffff813ed86c>] xfrm6_output_finish+0x20/0x28
May 20 12:51:06 sargeras kernel: [<ffffffff813b4ea1>] xfrm4_output+0x78/0x87
May 20 12:51:06 sargeras kernel: [<ffffffff8136fcd7>] ip_local_out+0x31/0x3b
May 20 12:51:06 sargeras kernel: [<ffffffff81370cff>] ip_send_skb+0x15/0x41
May 20 12:51:06 sargeras kernel: [<ffffffff81370d68>] ip_push_pending_frames+0x3d/0x4a
May 20 12:51:06 sargeras kernel: [<ffffffff81390ab1>] raw_sendmsg+0x365/0x401
May 20 12:51:06 sargeras kernel: [<ffffffff8139b811>] inet_sendmsg+0x97/0xa6
May 20 12:51:06 sargeras kernel: [<ffffffff8130aaa7>] sock_sendmsg+0x9e/0xc5
May 20 12:51:06 sargeras kernel: [<ffffffff81319973>] ? verify_iovec+0x168/0x1e9
May 20 12:51:06 sargeras kernel: [<ffffffff8130afc4>] __sys_sendmsg+0x3d5/0x4cf
May 20 12:51:06 sargeras kernel: [<ffffffff8130ab61>] ? sockfd_lookup_light+0x2a/0x73
May 20 12:51:06 sargeras kernel: [<ffffffff8130e912>] sys_sendmsg+0x43/0x6a
May 20 12:51:06 sargeras kernel: [<ffffffff81427cfa>] system_call_fastpath+0x18/0x1d
May 20 12:51:06 sargeras kernel: Code: 85 f6 7f 08 31 f6 85 d2 7f 0c eb 1f 85 d2 b8 00 00 00 00 0f 48 d0 b9 20 00 00 00 48 89 df e8 6a 94 f5 ff 85 c0 0f 85 79 01 00 00 <49> 8b 84 24 a0 02 00 00 49 be 00 00 00 00 00 00 00 80 48 89 de
May 20 12:51:06 sargeras kernel: RIP [<ffffffff813bd7b6>] xfrm_output_one+0xa7/0x230
May 20 12:51:06 sargeras kernel: RSP <ffff88007b4b98e8>
May 20 12:51:06 sargeras kernel: CR2: 00000000000002a0
May 20 12:51:06 sargeras kernel: ---[ end trace 15eb41c127dbce11 ]---

I recompiled the kernel with more debug info so that I could get the line where the bug occurred:

sargeras ~ # addr2line -e /usr/src/linux/vmlinux ffffffff813bd7b6
/usr/src/linux/net/xfrm/xfrm_output.c:57

This refers to this line:
err = x->outer_mode->output(x, skb);

My emerge --info:

sargeras ~ # emerge --info
Portage 2.1.11.62 (hardened/linux/amd64, gcc-4.6.3, glibc-2.15-r3, 3.8.6-hardened x86_64)
=================================================================
System uname: Linux-3.8.6-hardened-x86_64-Intel-R-_Core-TM-_i7_CPU_920_@_2.67GHz-with-gentoo-2.2
KiB Mem: 2058088 total, 1550160 free
KiB Swap: 0 total, 0 free
Timestamp of tree: Wed, 15 May 2013 15:45:01 +0000
ld GNU ld (GNU Binutils) 2.22
app-shells/bash: 4.2_p45
dev-lang/python: 2.7.3-r3, 3.2.3-r2
dev-util/cmake: 2.8.9
dev-util/pkgconfig: 0.28
sys-apps/baselayout: 2.2
sys-apps/openrc: 0.11.8
sys-apps/sandbox: 2.5
sys-devel/autoconf: 2.69
sys-devel/automake: 1.11.6, 1.12.6
sys-devel/binutils: 2.22-r1
sys-devel/gcc: 4.6.3
sys-devel/gcc-config: 1.7.3
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82-r4
sys-kernel/linux-headers: 3.7 (virtual/os-headers)
sys-libs/glibc: 2.15-r3
Repositories: gentoo kveer
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=native -mtune=native -mmmx -msse -msse2 -msse3 -msse4.1 -msse4.2 -maes -mpclmul -mpopcnt -mcx16 -mfpmath=sse -fomit-frame-pointer -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/apache2-php5.4/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cgi-php5.4/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/php/cli-php5.4/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=native -mtune=native -mmmx -msse -msse2 -msse3 -msse4.1 -msse4.2 -maes -mpclmul -mpopcnt -mcx16 -mfpmath=sse -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="ftp://mirror.ovh.net/gentoo-distfiles/ ftp://ftp.free.fr/mirrors/ftp.gentoo.org/"
LANG="fr_FR.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/Kveer"
SYNC="rsync://rsync.fr.gentoo.org/gentoo-portage"
USE="acl amd64 bash-completion berkdb bzip2 caps cli cracklib crypt cxx dri fam gdbm gpm hardened iconv ipv6 justify mmx modules mudflap multilib ncurses nls nptl openmp pam pax_kernel pcre readline session sse sse2 ssl tcpd threads unicode urandom vim-syntax xattr zlib"
ABI_X86="64"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation setenvif speling status unique_id userdir usertrack vhost_alias"
APACHE2_MPMS="prefork"
CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author"
CAMERAS="ptp2"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
ELIBC="glibc"
GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx"
INPUT_DEVICES="keyboard mouse evdev"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer"
LINGUAS="fr"
NGINX_MODULES_HTTP="access autoindex browser charset dav empty_gif fancyindex fastcgi flv geo geoip gzip headers_more mp4 passenger proxy referer rewrite scgi secure_link spdy stub_status sub upload_progress"
OFFICE_IMPLEMENTATION="libreoffice"
PHP_TARGETS="php5-3 php5-4"
PYTHON_SINGLE_TARGET="python2_7"
PYTHON_TARGETS="python2_7 python3_2"
RUBY_TARGETS="ruby18"
USERLAND="GNU"
VIDEO_CARDS="fbdev glint intel mach64 mga nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l"
XTABLES_ADDONS="geoip gradm ipp2p ipset pknock"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON

Tested with strongswan 5.0.0 and 5.0.4. IPSec uses ECC certificates for mutual auth. I can run some tests if asked, although I'm not a kernel dev at all.
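For anyone reading the oops: the fault address can be cross-checked against the "Code:" line. The sketch below is not part of the original report; the helper names faulting_bytes and fault_address are hypothetical, and all values are copied from the dump. As far as I can tell, the marked bytes 49 8b 84 24 a0 02 00 00 decode to a load from 0x2a0(%r12), and R12 is 0 in the register dump, which would match CR2 = 00000000000002a0 and suggests that x itself is NULL when x->outer_mode is read. Treat that reading as an inference, not a confirmed diagnosis.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, values are copied from the oops.

# Print the bytes from the faulting instruction onward, i.e. everything
# starting at the <..>-marked byte of an oops "Code:" line.
faulting_bytes() {
    printf '%s\n' "$1" | sed -n 's/.*<\([0-9a-f][0-9a-f]\)>\(.*\)/\1\2/p'
}

# Effective address of a base+displacement access, printed as 16 hex digits.
fault_address() {
    printf '%016x\n' "$(( $1 + $2 ))"
}

# Tail of the "Code:" line from the oops above.
code='85 c0 0f 85 79 01 00 00 <49> 8b 84 24 a0 02 00 00 49 be 00 00 00 00 00 00 00 80 48 89 de'
faulting_bytes "$code"

# R12 (the base register) is 0 in the dump; 0x2a0 is the displacement
# encoded in the instruction bytes. The sum should equal CR2.
fault_address 0x0 0x2a0
```

The second command prints 00000000000002a0, the same value as CR2. With a kernel source tree at hand, piping the whole oops into scripts/decodecode should give a proper disassembly instead of a manual decode.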
Created attachment 348726 [details] my current kernel config with debugging enabled
pipacs, hardened-sources-3.8.6 = grsecurity-2.9.1-3.8.6-201304052305. I have not pushed any of the latest because of the low mem issue on x86.
this is a null ptr deref in the kernel. does vanilla work better?
No, it doesn't. (vanilla 3.8.6, same .config)
I finally managed to make the bug reproducible. This is the situation:
- 2 servers, A and B, each with 2 interfaces: one public with IPv6 and IPv4, the other private with IPv4 only.
- An ipsec IPv6 tunnel linking the two private IPv4 networks.
- This iptables rule on A:
iptables -t nat -A POSTROUTING -o enp2s0 -s 192.168.14.32/27 -j MASQUERADE --random
enp2s0 is the public interface on A. This rule provides internet access to the computers in the private network.

A ping from A to B triggers the oops: ping 192.168.3.2. If the tunnel runs over IPv4 instead, the ping just prints nothing, as if the packet were dropped internally; I think this is what I should get instead of the oops.

The oops seems in fact to be triggered by an iptables misconfiguration: I must not masquerade packets that are going to be processed by ipsec. If I change the rule to:
iptables -t nat -A POSTROUTING -s 192.168.14.32/27 -o enp2s0 -m policy --dir out --pol none -j MASQUERADE --random
everything works properly.
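A minimal sketch of swapping the rule, assuming the subnet and interface quoted in this comment. The helper names old_rule and new_rule are hypothetical, and the script only prints the iptables commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch only: subnet and interface come from the comment above; the helper
# names are hypothetical. Commands are printed, not executed.
LAN="192.168.14.32/27"
WAN_IF="enp2s0"

# $1 is the iptables action: -A (append), -D (delete) or -C (check).

# Original rule: masquerades everything leaving the public interface,
# including packets that an ipsec policy would still have to transform.
old_rule() {
    echo "iptables -t nat $1 POSTROUTING -o $WAN_IF -s $LAN -j MASQUERADE --random"
}

# Workaround rule: only masquerade packets that match no outbound ipsec policy.
new_rule() {
    echo "iptables -t nat $1 POSTROUTING -s $LAN -o $WAN_IF -m policy --dir out --pol none -j MASQUERADE --random"
}

old_rule -D   # remove the unconditional MASQUERADE rule
new_rule -A   # add the policy-aware one
```

Once the printed commands look right, they can be run as root, for example by piping the script's output into sh. This only avoids the trigger; the kernel should not oops on the original rule either.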
you should also check 3.9 and perhaps some 3.10-rc and if the problem is still there, report this to the kernel devs.
(In reply to Veovis from comment #4)
> No, it doesn't. (vanilla 3.8.6, same .config)

I'm pushing this to the people handling the vanilla kernel.
(In reply to PaX Team from comment #6)
> you should also check 3.9 and perhaps some 3.10-rc and if the problem is
> still there, report this to the kernel devs.

Please do this first (3.9.7 / 3.10-rc7) so we know whether upstream has already fixed this; if it is not fixed, I will look into the code two days from now.
Thanks for correcting my report. I am on vacation right now, so I cannot test the newest kernel until July 8. I reported the bug on the netdev ML, but it has not received any comment so far. A couple of weeks ago I saw a bug very close to mine that received some patches.
I'm testing sys-kernel/hardened-sources-3.9.9 on my buggy systems. This precise bug seems to be solved, at least as of this version. I still get kernel panics caused by ipsec kernel code on recent kernels (3.8 and 3.9 branches, up to 3.9.5 at least), but that is another bug. I'll report back if I get the kernel trace.
No longer in the tree. @Veovis: does the latest version of 3.10/3.11 work for you?
(In reply to Agostino Sarubbo from comment #11)
> @Veovis: work for you the latest version of 3.10/3.11 ?

@Veovis: Please test the latest versions.
Hi,

Right now I have 3 systems on 3.10.1-hardened-r1, but with only a short uptime (48 hours). No kernel panic to report yet.

Probably unrelated (or maybe specific to my hosting provider), but if I use IPv6 IPsec for tunneling, I have a lot of connection problems: ping is OK, but the connection between the Exchange server and Active Directory is not, with lots of 0-length packets. With IPv4 IPsec, all is fine.