I had been using kernel 3.12.x until today, when I wanted to try out 3.14. I have shorewall (shorewall-core + shorewall + shorewall6) set up on my system, and starting it appears to crash the kernel. I had originally enabled the new nftables as a module and thought that might be causing the issue, but the same thing happens even after disabling nftables and rebuilding. The same thing happened when I tested with 3.13. If I turn off the shorewall services, the system boots fine and otherwise works as expected. Reproducible: Always
Created attachment 374358: emerge info
What actually happens, then? Does the kernel panic? Does the init system hang? What happens if you start shorewall after boot time?
Kernel panic. Starting manually after system is finished loading makes no difference.
1) Please post your `emerge --info net-firewall/shorewall' output in a comment. 2) Userland tools shouldn't be able to do that, but we still don't have enough information. Could you get a screenshot or write down the messages?
Created attachment 374442: Shorewall emerge info
Created attachment 374444: shorewall config files archive
Created attachment 374446: kernel-3.14.0 config
Created attachment 374448: Kernel Panic Photo
I agree with Jeroen. shorewall cannot cause this crash by itself (shorewall is just a perl script which calls "iptables[6]" commands). This crash must be related to the kernel and/or net-firewall/iptables. Can you please attach a dmesg, too? I am interested in your network hardware/driver. Does your system always crash immediately when you start shorewall (so is this reproducible), or when does this happen?
(In reply to Thomas D. from comment #9)
> Does your system always immediately crash when you start shorewall (so is
> this reproducible) or when does this happen?

Comment #3.
@ Reuben: What are the last lines you see after you run
# shorewall start
(right, "shorewall start"; don't use the runscript "/etc/init.d/shorewall")?
If it crashes before you see a "done." line, please restart the system and run
# shorewall trace start
This should give us detailed output about the command that is causing the problem. I'll try to reproduce the problem tomorrow.
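Since the machine panics shortly after the command runs, trace output that only exists on the console is easily lost. A minimal sketch of one way to keep it, assuming a hypothetical helper named `run_logged` and a log path of your choosing (neither is part of shorewall): each output line is appended to a file and flushed with `sync` before the next one is read.

```shell
# Hypothetical helper (not part of shorewall): run a command, append every
# output line to a log file and flush it to disk right away, so the log has
# a better chance of surviving a sudden kernel panic.
run_logged() {
    log=$1
    shift
    # Merge stderr into stdout so trace and error output end up in one file.
    "$@" 2>&1 | while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line" >> "$log"
        sync                      # force the line out of the page cache
        printf '%s\n' "$line"     # still show it on the terminal
    done
}

# Example invocation (as root); the log path is an assumption:
#   run_logged /root/shorewall-trace.log shorewall trace start
```

Calling `sync` once per line is slow for a long trace, but it narrows the window in which output can be lost; whether the final lines reach the disk before the panic is still not guaranteed.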
I'm aware that shorewall itself is not causing the crash. I just filed it under shorewall because that's the quickest way for me to reproduce it. There are a lot of boilerplate rules that it generates, and I have no idea which rule (or combination of rules) triggers the panic. I'm willing to bet the issue stems from a bug introduced with the new nftables added in 3.13. The panic is reproducible and consistently happens a little over a second after shorewall finishes compiling the rules (i.e. reaches the "done" message). I'm uploading the iptables rules as shown by "iptables(6) -L -v -n" under kernel 3.12.14, as well as the output from dmesg when booting 3.14.0 with shorewall(6) disabled, and also hardware info from lshw.
Created attachment 374632: iptables
Created attachment 374634: iptables6
Created attachment 374636: hardware
Created attachment 374638: dmesg
I upgraded my test system to 3.14 (I used my configuration from 3.10 without any adjustments, just "make oldconfig" from genkernel, so CONFIG_NF_TABLES is not set, same as yours). I am currently unable to reproduce your problem. Could you please test with only shorewall (no shorewall6, so no ip6tables)? Then we don't need to check whether any IPv6 stuff is involved. Also, from reading your rules file it looks like you are using the xt_geoip module from net-firewall/xtables-addons. Could you please remove that rule for testing, so we can make sure it is not the xt_geoip module?
Shorewall only loads IPv4 rules, and that is what has been causing the panic. (When I start shorewall manually, shorewall6 has not yet been started.)
OK, I disabled both the rule in the "rules" config and the "GEOIPDIR" option in "shorewall.conf". The kernel still panics, with less delay. I'm adding the output of lsmod from under 3.12.14 with shorewall started, so that should give an idea of which modules are actually in use.
Created attachment 374640: kernel modules
I am now running your shorewall configuration (this should load the same modules) and my system doesn't panic.
- Remove all the rules using helpers (SMBBI, mDNSbi..).
- If this is not enough, remove all the rules. Does it crash with an empty rules file? If not, re-add the rules one by one.
- Maybe your system panics when a helper module gets traffic? You could temporarily isolate the system (pull the cables) to see whether it still panics when it sees no traffic. In any case, try to get a crash dump.
- Maybe ipt_do_tables+0x144 is already enough to use objdump on the iptables module and see something.
- You said 3.12.x was working for you. Maybe you can do a kernel bisect?
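As a rough sketch of the objdump idea: the panic message gives a symbol plus an offset, so the faulting instruction sits at the function's start address (as listed in the disassembly) plus 0x144. The helper name `addr_plus_offset` and the module file name `ip_tables.ko` below are assumptions for illustration; if the code is built into the kernel, vmlinux would be the file to inspect instead.

```shell
# Hypothetical helper: add a hex offset to a hex base address and print
# the result in hex, e.g. function start + offset from the panic message.
addr_plus_offset() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Possible workflow (file name and addresses are placeholders):
#   1. Find the start address of the function named in the panic:
#        objdump -d ip_tables.ko | grep 'ipt_do_table'
#   2. Add the offset from the panic message to that start address:
#        addr_plus_offset <start address> 0x144
#   3. Disassemble only the area around the result:
#        objdump -d --start-address=<result - some bytes> \
#                   --stop-address=<result + some bytes> ip_tables.ko
```

For example, `addr_plus_offset 0x1000 0x144` prints `0x1144`. A kernel built with debug info makes the surrounding disassembly much easier to map back to source lines.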
Well, the rabbit hole goes deeper... I disabled all directives in the rules config and unplugged the network cables before booting. No dice, it still panics. I'm going to try tomorrow to see if I can crash it directly with iptables. I've never tried kernel debugging. Where can I read up on how to obtain a crash dump? The panic error messages do consistently reference ipt_do_tables+0x144. What do I do with objdump? I've heard a kernel bisect can take a considerable amount of time, so that seems like a last resort, and I've not done that either. I assume the rebuild/testing process is automated somehow?
I also forgot to ask, do I need to rebuild with some debug options enabled to make any sort of dump useful? My kernel config currently only has the bare minimum of debug options enabled.
Can you try without ebtables, too? You cannot replace the r8169 driver, can you? E.g. use a different NIC (and disable this driver) or use the driver from
http://www.realtek.com.tw/downloads/downloadsView.aspx?Langid=1&PNid=13&PFid=4&Level=5&Conn=4&DownTypeID=3&GetDown=false&Downloads=true ?
This is just a feeling, but if you search for r8169 and kernel panic you will see some reports for 3.13+, like https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=741667
Regarding kernel debugging and what you need in your kernel, see https://wiki.gentoo.org/wiki/Kernel_Crash_Dumps
Regarding the use of objdump, see https://wiki.ubuntu.com/Kernel/KernelDebuggingTricks#Objdump
Regarding bisect, see https://wiki.gentoo.org/wiki/Kernel_git-bisect
It will take some time, but it is not as hard as it sounds.
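To give a feeling for the effort involved in a bisect: git halves the suspect range on every step, so the number of kernels to build and boot grows only with the logarithm of the number of commits. A small sketch, assuming a hypothetical helper `bisect_steps` and assuming that the mainline tags v3.12 and v3.13 behave like the 3.12.x and 3.13 kernels tested here:

```shell
# Hypothetical helper: estimate how many build-and-boot rounds a bisect
# needs for a given number of commits (smallest k with 2^k >= commits).
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$(( span * 2 ))
        steps=$(( steps + 1 ))
    done
    printf '%s\n' "$steps"
}

# Possible workflow inside a kernel git tree (tags are an assumption):
#   git rev-list --count v3.12..v3.13   # number of commits in the range
#   git bisect start
#   git bisect bad v3.13                # first version known to panic
#   git bisect good v3.12               # last version known to work
#   ...build, boot, start shorewall, then report the result:
#   git bisect good                     # or: git bisect bad
#   git bisect reset                    # when finished
```

As an illustration with a made-up commit count, `bisect_steps 12000` prints `14`, so a range of that size means roughly fourteen rounds of building and booting.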
Well... I would have posted a kernel dump by now, if the kernel didn't panic again after being started the second time. It might be because I'm loading the same kernel as the secondary (crash) kernel. I'll try building a 3.12 debug kernel to load as the backup kernel and see if that helps.
This has since been fixed as of kernel 3.16.