Bug 529352

Summary: sys-kernel/hardened-sources-3.15.10-r1: PAX_SIZE_OVERFLOW option killing interrupt handler with IPv6 failure
Product: Gentoo Linux
Reporter: Colton Reeder <cdr>
Component: Hardened
Assignee: The Gentoo Linux Hardened Team <hardened>
Status: RESOLVED FIXED
Severity: critical
CC: 8573dd, alexander, creideiki+gentoo-bugzilla, pageexec, re.emese, spender
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
Attachments: stack dump
hardened 3.17.2 config
proposed fix
netdev patch

Description Colton Reeder 2014-11-15 16:18:49 UTC
I'm running an IPv6-in-IPv4 tunnel on this machine. I noticed recently that it was randomly rebooting. It took me a while to figure out that this occurs when the WAN is down. The machine reboots in a cycle until the connection is restored. The only way to stop this is to disable the IPv6 tunnel and related services. I can reliably cause a crash by pinging a v6 address and unplugging my modem. I've tried getting more info, but all I have is the stack trace. I was led to PaX by searching for the functions in the stack trace until I came across report_size_overflow.
Disabling the PAX_SIZE_OVERFLOW option in the kernel config stops the panic.

Reproducible: Always

Steps to Reproduce:
1. Hardened kernel with PAX_SIZE_OVERFLOW enabled
2. Dual stack IPv4/IPv6
3. Ping v6 address
4. Disable connection

Actual Results:  
Kernel Panic and stack dump

Expected Results:  
Should see "Destination unreachable: Address unreachable" reported from `ping`
Comment 1 Colton Reeder 2014-11-15 16:19:46 UTC
Created attachment 389416 [details]
stack dump
Comment 2 Anthony Basile gentoo-dev 2014-11-15 16:24:13 UTC
Thanks for the report and the analysis.  I'm passing this by upstream.  They'll probably want to know if it happens on the latest version which is 3.17.3 which I'll put on the tree in an hour or so.
Comment 3 Colton Reeder 2014-11-15 16:43:40 UTC
(In reply to Anthony Basile from comment #2)
> Thanks for the report and the analysis.  I'm passing this by upstream. 
> They'll probably want to know if it happens on the latest version which is
> 3.17.3 which I'll put on the tree in an hour or so.

I tested hardened-sources; I think it was plain 3.17.2, not the -r1.
But yes, I'll try 3.17.3 when it drops.
Comment 4 PaX Team 2014-11-15 18:03:39 UTC
can you run 'addr2line -e vmlinux -fip ffffffff8175ce91' on your kernel image (vmlinux is in the build dir)?

for best results enable CONFIG_DEBUG_INFO and perhaps CONFIG_DEBUG_INFO_REDUCED. if you no longer have vmlinux around then just take any size overflow report and use the corresponding vmlinux file and the address reported for _decode_session6.
Comment 5 PaX Team 2014-11-15 18:07:34 UTC
forgot to mention but if you happen to have the logs from just before what you posted then we'd also see what the overflow instrumentation itself reported (netconsole/SOL/etc may help capture it).
Comment 6 Emese Revfy 2014-11-15 20:01:54 UTC
Thanks for the report. Could you please send me the whole dmesg (it contains important data, e.g., the line number), your .config, and the result of the following (all net/ipv6/xfrm6_policy.* files):
make net/ipv6/xfrm6_policy.o EXTRA_CFLAGS="-fdump-tree-all -fdump-ipa-all"
Comment 7 Colton Reeder 2014-11-15 23:29:15 UTC
Created attachment 389464 [details]
hardened 3.17.2 config
Comment 8 Colton Reeder 2014-11-15 23:31:06 UTC
(In reply to PaX Team from comment #4)
> can you run 'addr2line -e vmlinux -fip ffffffff8175ce91' on your kernel
> image (vmlinux is in the build dir)?
> 
> for best results enable CONFIG_DEBUG_INFO and perhaps
> CONFIG_DEBUG_INFO_REDUCED. if you no longer have vmlinux around then just
> take any size overflow report and use the corresponding vmlinux file and the
> address reported for _decode_session6.

I did this for the 3.17.3-hardened kernel I just built, which crashes the same way.
addr2line -e vmlinux -fip ffffffff817971c6
_decode_session6 at /usr/src/linux-3.17.3-hardened/net/ipv6/xfrm6_policy.c:184
Comment 9 Colton Reeder 2014-11-15 23:48:30 UTC
(In reply to Emese Revfy from comment #6)
> Thanks for the report. Could you please send me the whole dmesg (there are
> some important data in it e.g., line number), your .config and the result of
> (all net/ipv6/xfrm6_policy.* files):
> make net/ipv6/xfrm6_policy.o EXTRA_CFLAGS="-fdump-tree-all -fdump-ipa-all"

I posted the .config for the kernel I'm currently testing.
The dmesg is useless. I have a serial console and I've tried 'dmesg -w' to get live output, but the stack is dumped before anything else appears. I could still send it to you if you want, but all it has is a bunch of iptables logging messages.
I'll email you the xfrm6_policy files directly, since even after compression they are still too big to attach.
Comment 10 Colton Reeder 2014-11-15 23:55:28 UTC
(In reply to Colton Reeder from comment #7)
> Created attachment 389464 [details]
> hardened 3.17.2 config

Whoops, I made a typo; this is actually the 3.17.3 config.
Comment 11 PaX Team 2014-11-16 00:44:46 UTC
to complete the cross-reference links, this is also tracked at https://forums.grsecurity.net/viewtopic.php?f=1&t=4083 ;)
Comment 12 Alexander Wetzel 2014-11-16 19:43:40 UTC
Created attachment 389530 [details, diff]
proposed fix

This is based on my understanding and may be wrong, but it fixed the issue for me:

It looks like the IP tunnel driver has a small bug when an ICMPv4 unreachable packet must be translated to ICMPv6. As I understand it, the existing IPv4 skb is copied, modified, and sent out as an IPv6 packet. Here the network_header length is "reset", but the transport_header length is not. The Linux kernel seems to have no issue with that and sends out the skb successfully. (I assume the wrong value is corrected later somehow.)

Some calls later, the function skb_network_header_len subtracts the unchanged (bigger) network_header size from the transport_header size, causing an integer underflow.
A PaX-enabled kernel catches that and the kernel panics...
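
A minimal sketch of the arithmetic described above, written in Python rather than kernel C. It is not the kernel code: the function names, the 32-bit width of the result, and the offset values are all assumptions chosen for illustration. It only shows why subtracting a larger network header offset from a stale, smaller transport header offset wraps around to a huge unsigned value, and how a checked variant can refuse the subtraction instead.

```python
U32_MASK = 0xFFFFFFFF  # assumption: the header length is a 32-bit unsigned value


def network_header_len(transport_header: int, network_header: int) -> int:
    """Model 'transport_header - network_header' with unsigned wraparound."""
    return (transport_header - network_header) & U32_MASK


def checked_network_header_len(transport_header: int, network_header: int) -> int:
    """Same subtraction, but refuse to wrap (loosely analogous to an overflow check)."""
    if transport_header < network_header:
        raise OverflowError("transport_header is smaller than network_header")
    return transport_header - network_header


# Hypothetical offsets: both headers set consistently.
print(network_header_len(transport_header=54, network_header=14))

# Hypothetical offsets: network_header was moved forward, transport_header is stale.
print(network_header_len(transport_header=34, network_header=62))

try:
    checked_network_header_len(transport_header=34, network_header=62)
except OverflowError as exc:
    print("rejected:", exc)
```

With consistent offsets the result is a small header length. With the stale transport offset the unchecked version silently yields a value close to 2**32, which is the kind of result an overflow check would flag.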
Comment 13 PaX Team 2014-11-16 20:12:11 UTC
thanks for the update, two requests for those that can reproduce this:

1. test the patch ;)
2. can you run it by lkml/netdev and get a verdict on it?

also i wonder if there's a way to find such missing reset calls by static analysis...
Comment 14 Colton Reeder 2014-11-16 20:20:50 UTC
(In reply to PaX Team from comment #13)
> thanks for the update, two requests for those that can reproduce this:
> 
> 1. test the patch ;)
> 2. can you run it by lkml/netdev and get a verdict on it?
> 
> also i wonder if there's a way to find such missing reset calls by static
> analysis...

I'm running a modified hardened-3.15.10-r1 with this patch now and it works as advertised.
Comment 15 Alexander Wetzel 2014-11-20 16:08:20 UTC
I sent a mail to the netdev mailing list some days ago: 

http://marc.info/?l=linux-netdev&m=141626149417509&w=2

So far no feedback.
Comment 16 Colton Reeder 2014-11-22 07:29:51 UTC
(In reply to alexander.wetzel from comment #15)
> I sent a mail to the netdev mailing list some days ago: 
> 
> http://marc.info/?l=linux-netdev&m=141626149417509&w=2
> 
> So far no feedback.

I would reply to bump it if I could, but since I wasn't subscribed when it was posted, that looks pretty difficult to do.
Comment 17 Anthony Basile gentoo-dev 2014-11-29 13:15:49 UTC
(In reply to PaX Team from comment #13)
> thanks for the update, two requests for those that can reproduce this:
> 
> 1. test the patch ;)
> 2. can you run it by lkml/netdev and get a verdict on it?
> 
> also i wonder if there's a way to find such missing reset calls by static
> analysis...

@pipacs, did your patch make it into grsecurity-3.0-3.17.4-201411260107? If so, then this fix is in the tree with hardened-sources-3.17.4-r1. I'd like to remove the earlier versions which have this issue.
Comment 18 PaX Team 2014-11-29 21:59:59 UTC
i didn't have a patch for this yet and no, it isn't in grsec. we're still waiting on a verdict. an initial assessment from some network guys was that this is not the correct fix but i don't have more information yet.
Comment 19 Alexander Wetzel 2014-12-04 16:33:20 UTC
Created attachment 390932 [details, diff]
netdev patch

There is a new "real" fix proposed on netdev:
http://marc.info/?l=linux-netdev&m=141768340108789&w=2

I've removed the first fix we found and tried this one; it works for me.
So here is the (hopefully correct) fix from the mailing list.
Comment 20 PaX Team 2014-12-04 20:36:44 UTC
thanks for following up on it, we'll take the proposed fix in the next patches.
Comment 21 Colton Reeder 2014-12-07 12:19:07 UTC
I've switched to the netdev patch and can confirm it works.