Summary: | sys-kernel/gentoo-sources-4.19.86 - WARNING: CPU: 1 PID: 20926 at net/ipv4/tcp_output.c:911 tcp_wfree+0x29/0xe2 | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Vieri <rentorbuy> |
Component: | Current packages | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | CC: | jstein |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | https://lkml.org/lkml/2020/2/24/130 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: |
kernel syslog
kernel .config kernel syslog |
This could be fixed in later kernels. I would try upgrading to the latest gentoo-sources, which is 4.19.98 as of this writing. You could also try the vanilla version of this kernel. Let us know if the oops still happens.

A kernel panic halts the kernel. Your kernel does not halt. There is no panic.

(In reply to Jeroen Roovers from comment #2)
> A kernel panic halts the kernel. Your kernel does not halt. There is no
> panic.

Why do you say that? As clearly reported in my first post, my system halted because there was a kernel panic. The system was useless. It did not respond to anything. I had to hard-reboot it. The attached file shows the log just before the kernel halted. The other log snippet shows up once in a while (at very variable intervals), but that does not halt the system. However, it shows that there's something to worry about, and it also *seems* related to the kernel panic I experienced (related to network IRQs). Anyway, I am currently in the process of updating to the latest stable gentoo-sources. In any case, this bug report *is* about a kernel panic.

This morning I rebooted the system with the new kernel.
So this is day 1, and I've already spotted 2 glitches: Jan 28 14:05:50 kernel: ------------[ cut here ]------------ Jan 28 14:05:50 kernel: WARNING: CPU: 0 PID: 5410 at net/ipv4/tcp_output.c:915 tcp_wfree+0x29/0xe2 Jan 28 14:05:50 kernel: Modules linked in: arc4 ecb md4 sha512_ssse3 sha512_generic cmac cifs ccm fscache nfnetlink_queue autofs4 xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_TARPIT(O) xt_comment xt_iprange xt_geoip(O) xt_set xt_NFQUEUE ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt_hashlimit xt_tcpudp xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox Jan 28 14:05:50 kernel: ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port ip_set_hash_net ip_set_hash_ip ip_set nfnetlink l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ip6table_filter ip6_tables sha256_ssse3 sha256_generic mcryptd sha1_ssse3 sha1_generic ipv6 arptable_filter arp_tables xt_iface(O) xt_conntrack iptable_mangle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter ip_tables x_tables bpfilter sch_fq_codel sch_fq snd_hda_codec_analog snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_pcm snd_timer snd k8temp ohci_pci parport_pc soundcore floppy ohci_hcd parport asus_atk0110 thermal ehci_pci fan ehci_hcd button i2c_nforce2 ata_generic pata_amd pata_acpi msdos configfs fuse f2fs jfs btrfs zstd_decompress zstd_compress xxhash lzo_compress Jan 28 14:05:50 kernel: zlib_deflate sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise ata_piix ahci libahci libata nvme 
nvme_core virtio_crypto crypto_engine virtio_pci virtio_balloon virtio_rng virtio_console virtio_blk virtio_ring virtio Jan 28 14:05:50 kernel: CPU: 0 PID: 5410 Comm: proftpd Tainted: G O 4.19.97-gentoo-x86_64 #1 Jan 28 14:05:50 kernel: Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010 Jan 28 14:05:50 kernel: RIP: 0010:tcp_wfree+0x29/0xe2 Jan 28 14:05:50 kernel: Code: c3 55 53 8b 87 e0 00 00 00 48 8b 6f 18 ff c8 f0 29 85 44 01 00 00 0f 88 b5 4e 08 00 75 0e 48 c7 c7 c3 3e d8 81 e8 2c 22 9c ff <0f> 0b 8b 85 44 01 00 00 3d 40 02 00 00 76 1a 65 48 8b 05 9a 99 95 Jan 28 14:05:50 kernel: RSP: 0000:ffff88811fc03df0 EFLAGS: 00010246 Jan 28 14:05:50 kernel: RAX: 0000000000000024 RBX: ffff88805c4ddc00 RCX: 0000000000000000 Jan 28 14:05:50 kernel: RDX: 0000000000000000 RSI: ffff88811fc152d8 RDI: ffff88811fc152d8 Jan 28 14:05:50 kernel: RBP: ffff888005e9f440 R08: 0000000000000001 R09: 000000000000fc00 Jan 28 14:05:50 kernel: R10: 0000000000000000 R11: 0000000000000044 R12: 0000000000000000 Jan 28 14:05:50 kernel: R13: ffff88811af107c0 R14: 000000000000003e R15: ffff88811af10000 Jan 28 14:05:50 kernel: FS: 00007f99b34c7740(0000) GS:ffff88811fc00000(0000) knlGS:0000000000000000 Jan 28 14:05:50 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 28 14:05:50 kernel: CR2: 00007f99b3d20000 CR3: 0000000074504000 CR4: 00000000000006f0 Jan 28 14:05:50 kernel: Call Trace: Jan 28 14:05:50 kernel: <IRQ> Jan 28 14:05:50 kernel: skb_release_head_state+0x74/0xa4 Jan 28 14:05:50 kernel: skb_release_all+0xa/0x20 Jan 28 14:05:50 kernel: __kfree_skb+0xa/0x14 Jan 28 14:05:50 kernel: e1000_put_txbuf+0x73/0x86 Jan 28 14:05:50 kernel: e1000_clean_tx_irq+0xb4/0x23f Jan 28 14:05:50 kernel: e1000e_poll+0x5a/0x223 Jan 28 14:05:50 kernel: net_rx_action+0x12e/0x305 Jan 28 14:05:50 kernel: __do_softirq+0x114/0x267 Jan 28 14:05:50 kernel: irq_exit+0x58/0x64 Jan 28 14:05:50 kernel: do_IRQ+0xaa/0xc8 Jan 28 14:05:50 kernel: 
common_interrupt+0xf/0xf Jan 28 14:05:50 kernel: </IRQ> Jan 28 14:05:50 kernel: RIP: 0033:0x7f99b3d39917 Jan 28 14:05:50 kernel: Code: 00 66 90 8b b5 f4 02 00 00 85 f6 0f 84 c2 00 00 00 48 8b 45 70 c7 44 24 74 00 00 00 00 48 c7 44 24 78 00 00 00 00 48 8b 40 08 <48> 89 44 24 18 48 8b 45 68 48 8b 40 08 48 89 44 24 10 48 8b 85 00 Jan 28 14:05:50 kernel: RSP: 002b:00007ffc803179b0 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffde Jan 28 14:05:50 kernel: RAX: 00007f99b3ce32d0 RBX: 0000000000000001 RCX: 0000000000000000 Jan 28 14:05:50 kernel: RDX: 0000000000000000 RSI: 000000000000000f RDI: 0000564024f736c6 Jan 28 14:05:50 kernel: RBP: 00007f99b3d1c000 R08: 0000000000000001 R09: 00007f99b3d593f0 Jan 28 14:05:50 kernel: R10: 00007f99b3d59130 R11: 00007ffc80317b88 R12: 0000564024f75ada Jan 28 14:05:50 kernel: R13: 0000000000000018 R14: 00007ffc80317ae0 R15: 00007f99b351d8b8 Jan 28 14:05:50 kernel: ---[ end trace b7d8a2809485a990 ]--- Jan 28 14:54:00 kernel: ------------[ cut here ]------------ Jan 28 14:54:00 kernel: WARNING: CPU: 0 PID: 0 at net/ipv4/tcp_output.c:915 tcp_wfree+0x29/0xe2 Jan 28 14:54:00 kernel: Modules linked in: arc4 ecb md4 sha512_ssse3 sha512_generic cmac cifs ccm fscache nfnetlink_queue autofs4 xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_TARPIT(O) xt_comment xt_iprange xt_geoip(O) xt_set xt_NFQUEUE ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt_hashlimit xt_tcpudp xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox Jan 28 14:54:00 kernel: ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port 
ip_set_hash_net ip_set_hash_ip ip_set nfnetlink l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ip6table_filter ip6_tables sha256_ssse3 sha256_generic mcryptd sha1_ssse3 sha1_generic ipv6 arptable_filter arp_tables xt_iface(O) xt_conntrack iptable_mangle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter ip_tables x_tables bpfilter sch_fq_codel sch_fq snd_hda_codec_analog snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_pcm snd_timer snd k8temp ohci_pci parport_pc soundcore floppy ohci_hcd parport asus_atk0110 thermal ehci_pci fan ehci_hcd button i2c_nforce2 ata_generic pata_amd pata_acpi msdos configfs fuse f2fs jfs btrfs zstd_decompress zstd_compress xxhash lzo_compress Jan 28 14:54:00 kernel: zlib_deflate sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise ata_piix ahci libahci libata nvme nvme_core virtio_crypto crypto_engine virtio_pci virtio_balloon virtio_rng virtio_console virtio_blk virtio_ring virtio Jan 28 14:54:00 kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W O 4.19.97-gentoo-x86_64 #1 Jan 28 14:54:00 kernel: Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010 Jan 28 14:54:00 kernel: RIP: 0010:tcp_wfree+0x29/0xe2 Jan 28 14:54:00 kernel: Code: c3 55 53 8b 87 e0 00 00 00 48 8b 6f 18 ff c8 f0 29 85 44 01 00 00 0f 88 b5 4e 08 00 75 0e 48 c7 c7 c3 3e d8 81 e8 2c 22 9c ff <0f> 0b 8b 85 44 01 00 00 3d 40 02 00 00 76 1a 65 48 8b 05 9a 99 95 Jan 28 14:54:00 kernel: RSP: 0018:ffff88811fc03df0 EFLAGS: 00010246 Jan 28 14:54:00 kernel: RAX: 0000000000000024 RBX: ffff8881171be000 RCX: 0000000000000000 Jan 28 14:54:00 kernel: RDX: 0000000000000000 RSI: ffff88811fc152d8 RDI: ffff88811fc152d8 Jan 28 14:54:00 kernel: RBP: ffff88808a4604c0 R08: 0000000000000001 R09: 0000000000005300 Jan 28 14:54:00 kernel: R10: 0000000000000000 R11: 0000000000000044 R12: 0000000000000000 Jan 28 14:54:00 kernel: R13: ffff88811af107c0 R14: 
000000000000003e R15: ffff88811af10000 Jan 28 14:54:00 kernel: FS: 0000000000000000(0000) GS:ffff88811fc00000(0000) knlGS:0000000000000000 Jan 28 14:54:00 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 28 14:54:00 kernel: CR2: 00007f654a52f000 CR3: 0000000117384000 CR4: 00000000000006f0 Jan 28 14:54:00 kernel: Call Trace: Jan 28 14:54:00 kernel: <IRQ> Jan 28 14:54:00 kernel: skb_release_head_state+0x74/0xa4 Jan 28 14:54:00 kernel: skb_release_all+0xa/0x20 Jan 28 14:54:00 kernel: __kfree_skb+0xa/0x14 Jan 28 14:54:00 kernel: e1000_put_txbuf+0x73/0x86 Jan 28 14:54:00 kernel: e1000_clean_tx_irq+0xb4/0x23f Jan 28 14:54:00 kernel: e1000e_poll+0x5a/0x223 Jan 28 14:54:00 kernel: net_rx_action+0x12e/0x305 Jan 28 14:54:00 kernel: __do_softirq+0x114/0x267 Jan 28 14:54:00 kernel: irq_exit+0x58/0x64 Jan 28 14:54:00 kernel: do_IRQ+0xaa/0xc8 Jan 28 14:54:00 kernel: common_interrupt+0xf/0xf Jan 28 14:54:00 kernel: </IRQ> Jan 28 14:54:00 kernel: RIP: 0010:default_idle+0x9b/0x122 Jan 28 14:54:00 kernel: Code: 3b 00 eb e0 e8 07 5b 94 ff 89 ee 48 c7 c7 20 c2 03 82 e8 dc 1a 94 ff 8b 05 5a 27 d2 00 85 c0 7e 07 0f 00 2d b9 e4 4b 00 fb f4 <65> 44 8b 25 5d 9c 8c 7e 8b 05 8f a9 9b 00 85 c0 7e 70 65 8b 05 4c Jan 28 14:54:00 kernel: RSP: 0018:ffffffff82003ea0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffde Jan 28 14:54:00 kernel: RAX: 0000000000000000 RBX: ffffffff82011780 RCX: ffffffff82035950 Jan 28 14:54:00 kernel: RDX: 00000000239745ba RSI: 0000000000000000 RDI: 0000000000000000 Jan 28 14:54:00 kernel: RBP: 0000000000000000 R08: 00000ffc5f8d457c R09: 0000000000000400 Jan 28 14:54:00 kernel: R10: ffff88808249f500 R11: 0000000000000002 R12: 0000000000000000 Jan 28 14:54:00 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Jan 28 14:54:00 kernel: do_idle+0xb2/0x172 Jan 28 14:54:00 kernel: cpu_startup_entry+0x6a/0x6c Jan 28 14:54:00 kernel: start_kernel+0x480/0x49e Jan 28 14:54:00 kernel: secondary_startup_64+0xa4/0xb0 Jan 28 14:54:00 kernel: ---[ end trace 
b7d8a2809485a991 ]---

The kernel hasn't panicked *yet*, but I expect it to behave like the previous one: after a week or so, I may get a system freeze with a message similar to the one attached to this report. Even without a system freeze, having these messages show up in the log is a showstopper. What do you suggest I try next? Should I try vanilla-sources, or should I go for gentoo-sources 5.5.0?

No need to test vanilla sources: gentoo-sources only adds stuff, we usually don't patch existing code. So I would recommend testing v5.5 to see whether this is already fixed. Either way, you will have to bisect the kernel in the end: if it's fixed in 5.5, we probably want to identify the fix so it can be backported to LTS kernels. If it isn't fixed yet, we need to identify the first bad commit causing the problem in order to find a fix.

Booted 5.5 today. Getting the same behavior as in my previous message. The messages are not the same, but they still seem to be related to networking:
Jan 29 09:11:14 kernel: ------------[ cut here ]------------ Jan 29 09:11:14 kernel: refcount_t: addition on 0; use-after-free.
Jan 29 09:11:14 kernel: WARNING: CPU: 0 PID: 25403 at lib/refcount.c:25 refcount_warn_saturate+0x88/0xe8 Jan 29 09:11:14 kernel: Modules linked in: nfnetlink_queue autofs4 xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_TARPIT(O) xt_comment xt_iprange xt_geoip(O) xt_set xt_NFQUEUE ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt_hashlimit xt_tcpudp xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port ip_set_hash_net ip_set_hash_ip ip_set nfnetlink l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ip6table_filter ip6_tables sha1_ssse3 sha1_generic ipv6 arptable_filter arp_tables xt_iface(O) xt_conntrack iptable_mangle iptable_nat nf_nat Jan 29 09:11:14 kernel: nf_conntrack nf_defrag_ipv4 nf_defrag_ipv6 iptable_raw iptable_filter ip_tables x_tables sch_fq_codel sch_fq bpfilter snd_hda_codec_analog snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_pcm snd_timer snd soundcore k8temp parport_pc ohci_pci ohci_hcd floppy parport ehci_pci thermal ehci_hcd asus_atk0110 fan ata_generic i2c_nforce2 button pata_amd pata_acpi msdos configfs fuse f2fs jfs btrfs zstd_decompress zstd_compress xxhash lzo_compress zlib_deflate sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise ata_piix ahci libahci libata nvme nvme_core virtio_crypto crypto_engine virtio_pci virtio_balloon virtio_rng virtio_console virtio_blk virtio_ring virtio Jan 29 09:11:14 kernel: CPU: 0 PID: 25403 Comm: TX#01 Tainted: G O 5.5.0-gentoo-x86_64 #1 Jan 29 09:11:14 kernel: Hardware name: System 
manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010 Jan 29 09:11:14 kernel: RIP: 0010:refcount_warn_saturate+0x88/0xe8 Jan 29 09:11:14 kernel: Code: 05 4b c7 d7 00 01 e8 5e da ca ff 0f 0b c3 80 3d 3b c7 d7 00 00 75 72 48 c7 c7 14 6f df 81 c6 05 2b c7 d7 00 01 e8 3f da ca ff <0f> 0b c3 80 3d 1b c7 d7 00 00 75 53 48 c7 c7 40 6f df 81 c6 05 0b Jan 29 09:11:14 kernel: RSP: 0018:ffffc900002bf888 EFLAGS: 00010282 Jan 29 09:11:14 kernel: RAX: 0000000000000000 RBX: ffff888118373500 RCX: 0000000000000007 Jan 29 09:11:14 kernel: RDX: 0000000000001b14 RSI: ffffc900002bf774 RDI: ffff88811fc18620 Jan 29 09:11:14 kernel: RBP: ffffc900002bf908 R08: 0000000000000001 R09: 000000000000fd00 Jan 29 09:11:14 kernel: R10: 0000000000000000 R11: 000000000000004c R12: ffff888118373500 Jan 29 09:11:14 kernel: R13: ffff888117d5b100 R14: ffffffffa06b4300 R15: 0000000000000068 Jan 29 09:11:14 kernel: FS: 00007fd65d0eb700(0000) GS:ffff88811fc00000(0000) knlGS:0000000000000000 Jan 29 09:11:14 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 29 09:11:14 kernel: CR2: 00007ff1e9b1f000 CR3: 00000001180e6000 CR4: 00000000000006f0 Jan 29 09:11:14 kernel: Call Trace: Jan 29 09:11:14 kernel: nf_queue_entry_get_refs+0x60/0xa0 Jan 29 09:11:14 kernel: nf_queue+0xcf/0x202 Jan 29 09:11:14 kernel: ? dst_mtu+0xd/0xd Jan 29 09:11:14 kernel: nf_reinject+0x187/0x194 Jan 29 09:11:14 kernel: nfqnl_recv_verdict+0x37f/0x3a5 [nfnetlink_queue] Jan 29 09:11:14 kernel: nfnetlink_rcv_msg+0x164/0x20a [nfnetlink] Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? 
__switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? nfnetlink_net_init+0x8c/0x8c [nfnetlink] Jan 29 09:11:14 kernel: netlink_rcv_skb+0x7d/0xd1 Jan 29 09:11:14 kernel: nfnetlink_rcv+0x10f/0x130 [nfnetlink] Jan 29 09:11:14 kernel: netlink_unicast+0x10c/0x1a5 Jan 29 09:11:14 kernel: netlink_sendmsg+0x29d/0x2d3 Jan 29 09:11:14 kernel: sock_sendmsg_nosec+0x20/0x2a Jan 29 09:11:14 kernel: ____sys_sendmsg+0xe6/0x14f Jan 29 09:11:14 kernel: ? copy_msghdr_from_user+0xfe/0x128 Jan 29 09:11:14 kernel: ___sys_sendmsg+0x7a/0xb2 Jan 29 09:11:14 kernel: ? do_futex+0x208/0x940 Jan 29 09:11:14 kernel: ? common_interrupt+0xa/0xf Jan 29 09:11:14 kernel: __sys_sendmsg+0x4c/0x7f Jan 29 09:11:14 kernel: do_syscall_64+0x15d/0x189 Jan 29 09:11:14 kernel: ? __up_read+0x12/0x3b Jan 29 09:11:14 kernel: ? __do_page_fault+0x2f6/0x38a Jan 29 09:11:14 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Jan 29 09:11:14 kernel: RIP: 0033:0x7fd6641162e1 Jan 29 09:11:14 kernel: Code: 00 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 26 e9 ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 c7 48 89 44 24 08 e8 5a e9 ff ff 48 Jan 29 09:11:14 kernel: RSP: 002b:00007fd65d0e9820 EFLAGS: 00000293 ORIG_RAX: 000000000000002e Jan 29 09:11:14 kernel: RAX: ffffffffffffffda RBX: 00007fd65d0e9900 RCX: 00007fd6641162e1 Jan 29 09:11:14 kernel: RDX: 0000000000000000 RSI: 00007fd65d0e9870 RDI: 0000000000000005 Jan 29 09:11:14 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000301 Jan 29 09:11:14 kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 Jan 29 09:11:14 kernel: R13: 00007fd650268dd8 R14: 000000000a000000 R15: 0000000000000001 Jan 29 09:11:14 kernel: ---[ end trace b074762294df7e7d ]--- Jan 29 09:11:14 kernel: ------------[ cut here ]------------ Jan 29 09:11:14 
kernel: refcount_t: underflow; use-after-free. Jan 29 09:11:14 kernel: WARNING: CPU: 1 PID: 25394 at lib/refcount.c:28 refcount_warn_saturate+0xa7/0xe8 Jan 29 09:11:14 kernel: Modules linked in: nfnetlink_queue autofs4 xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_TARPIT(O) xt_comment xt_iprange xt_geoip(O) xt_set xt_NFQUEUE ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt_hashlimit xt_tcpudp xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port ip_set_hash_net ip_set_hash_ip ip_set nfnetlink l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ip6table_filter ip6_tables sha1_ssse3 sha1_generic ipv6 arptable_filter arp_tables xt_iface(O) xt_conntrack iptable_mangle iptable_nat nf_nat Jan 29 09:11:14 kernel: nf_conntrack nf_defrag_ipv4 nf_defrag_ipv6 iptable_raw iptable_filter ip_tables x_tables sch_fq_codel sch_fq bpfilter snd_hda_codec_analog snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_pcm snd_timer snd soundcore k8temp parport_pc ohci_pci ohci_hcd floppy parport ehci_pci thermal ehci_hcd asus_atk0110 fan ata_generic i2c_nforce2 button pata_amd pata_acpi msdos configfs fuse f2fs jfs btrfs zstd_decompress zstd_compress xxhash lzo_compress zlib_deflate sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise ata_piix ahci libahci libata nvme nvme_core virtio_crypto crypto_engine virtio_pci virtio_balloon virtio_rng virtio_console virtio_blk virtio_ring virtio Jan 29 09:11:14 kernel: CPU: 1 PID: 25394 Comm: RX-NFQ#0 Tainted: G W O 5.5.0-gentoo-x86_64 
#1 Jan 29 09:11:14 kernel: Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010 Jan 29 09:11:14 kernel: RIP: 0010:refcount_warn_saturate+0xa7/0xe8 Jan 29 09:11:14 kernel: Code: 05 2b c7 d7 00 01 e8 3f da ca ff 0f 0b c3 80 3d 1b c7 d7 00 00 75 53 48 c7 c7 40 6f df 81 c6 05 0b c7 d7 00 01 e8 20 da ca ff <0f> 0b c3 80 3d fb c6 d7 00 00 75 34 48 c7 c7 68 6f df 81 c6 05 eb Jan 29 09:11:14 kernel: RSP: 0018:ffffc90000817900 EFLAGS: 00010286 Jan 29 09:11:14 kernel: RAX: 0000000000000000 RBX: ffff888118373500 RCX: 0000000000000007 Jan 29 09:11:14 kernel: RDX: 0000000000001b52 RSI: ffffc900008177ec RDI: ffff88811fc98620 Jan 29 09:11:14 kernel: RBP: ffff888118373500 R08: 0000000000000001 R09: 0000000000001500 Jan 29 09:11:14 kernel: R10: 0000000000000000 R11: 0000000000000048 R12: 0000000000000001 Jan 29 09:11:14 kernel: R13: ffff888117d5b100 R14: ffff8881161bd9c0 R15: ffff888118373500 Jan 29 09:11:14 kernel: FS: 00007fd6618f4700(0000) GS:ffff88811fc80000(0000) knlGS:0000000000000000 Jan 29 09:11:14 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 29 09:11:14 kernel: CR2: 00007fd657172000 CR3: 00000001180e6000 CR4: 00000000000006e0 Jan 29 09:11:14 kernel: Call Trace: Jan 29 09:11:14 kernel: nf_queue_entry_release_refs+0x62/0xa2 Jan 29 09:11:14 kernel: nf_reinject+0x5d/0x194 Jan 29 09:11:14 kernel: nfqnl_recv_verdict+0x37f/0x3a5 [nfnetlink_queue] Jan 29 09:11:14 kernel: nfnetlink_rcv_msg+0x164/0x20a [nfnetlink] Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? 
__switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x40/0x70 Jan 29 09:11:14 kernel: ? __switch_to_asm+0x34/0x70 Jan 29 09:11:14 kernel: ? nfnetlink_net_init+0x8c/0x8c [nfnetlink] Jan 29 09:11:14 kernel: netlink_rcv_skb+0x7d/0xd1 Jan 29 09:11:14 kernel: nfnetlink_rcv+0x10f/0x130 [nfnetlink] Jan 29 09:11:14 kernel: netlink_unicast+0x10c/0x1a5 Jan 29 09:11:14 kernel: netlink_sendmsg+0x29d/0x2d3 Jan 29 09:11:14 kernel: sock_sendmsg_nosec+0x20/0x2a Jan 29 09:11:14 kernel: ____sys_sendmsg+0xe6/0x14f Jan 29 09:11:14 kernel: ? copy_msghdr_from_user+0xfe/0x128 Jan 29 09:11:14 kernel: ___sys_sendmsg+0x7a/0xb2 Jan 29 09:11:14 kernel: ? netlink_recvmsg+0x2b2/0x2e0 Jan 29 09:11:14 kernel: __sys_sendmsg+0x4c/0x7f Jan 29 09:11:14 kernel: do_syscall_64+0x15d/0x189 Jan 29 09:11:14 kernel: ? copy_kernel_to_fpregs+0x21/0x2a Jan 29 09:11:14 kernel: ? switch_fpu_return+0x54/0x6b Jan 29 09:11:14 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 Jan 29 09:11:14 kernel: RIP: 0033:0x7fd6641162e1 Jan 29 09:11:14 kernel: Code: 00 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 26 e9 ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2c 44 89 c7 48 89 44 24 08 e8 5a e9 ff ff 48 Jan 29 09:11:14 kernel: RSP: 002b:00007fd6618f1e80 EFLAGS: 00000293 ORIG_RAX: 000000000000002e Jan 29 09:11:14 kernel: RAX: ffffffffffffffda RBX: 00007fd6618f1f60 RCX: 00007fd6641162e1 Jan 29 09:11:14 kernel: RDX: 0000000000000000 RSI: 00007fd6618f1ed0 RDI: 0000000000000005 Jan 29 09:11:14 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000301 Jan 29 09:11:14 kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 Jan 29 09:11:14 kernel: R13: 00007fd650268dd8 R14: 0000000000000000 R15: 0000000000000000 Jan 29 09:11:14 kernel: ---[ end trace b074762294df7e7e ]--- No system freeze yet. Should I try downgrading to gentoo-sources 4.9.203 or even all the way down to 4.4.203? 
I was wondering if these errors could have something to do with old failing hardware. So I installed Gentoo on a brand new high-end server, and got similar error messages in syslog. The system hasn't panicked yet, but I'm getting messages such as: Feb 3 15:41:55 kernel: ------------[ cut here ]------------ Feb 3 15:41:55 kernel: WARNING: CPU: 14 PID: 0 at net/ipv4/tcp_output.c:915 tcp_wfree+0x29/0xe2 Feb 3 15:41:55 kernel: Modules linked in: arc4 ecb md4 sha512_ssse3 sha512_generic cmac cifs ccm fscache autofs4 nfnetlink_queue xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_iface(O) xt_statistic xt_connmark xt_TARPIT(O) xt_comment xt_iprange xt_geoip(O) xt_set xt_NFQUEUE arptable_filter arp_tables ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc iptable_nat nf_nat_ipv4 xt_mark iptable_mangle xt_TCPMSS xt_hashlimit xt_tcpudp xt_CT iptable_raw xt_multiport xt_conntrack nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_nat nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink Feb 3 15:41:55 kernel: nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port ip_set_hash_net ip_set_hash_ip ip_set nfnetlink sch_fq_codel sch_fq iptable_filter ip_tables x_tables bpfilter mlx5_core mlxfw tls strparser sha1_mb mcryptd sha1_ssse3 sha1_generic ipv6 crct10dif_pclmul bnxt_en ghash_clmulni_intel ixgbe i2c_piix4 ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq button aesni_intel crypto_simd cryptd glue_helper aes_x86_64 algif_rng algif_aead algif_hash algif_skcipher af_alg xts crc32c_intel crc32_pclmul crc32_generic sha256_generic msdos configfs fuse f2fs 
jfs btrfs zstd_decompress zstd_compress xxhash Feb 3 15:41:55 kernel: lzo_compress zlib_deflate multipath dm_zero dm_verity dm_thin_pool dm_persistent_data dm_snapshot dm_raid dm_mirror dm_region_hash dm_log dm_flakey dm_delay dm_crypt dm_bufio dm_bio_prison dm_mod dax hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey hid_microsoft hid_logitech hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech sl811_hcd ohci_hcd uhci_hcd usb_storage xhci_pci xhci_hcd ehci_pci ehci_hcd pata_sl82c105 pata_via pata_jmicron pata_marvell pata_netcell pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_platform pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar Feb 3 15:41:55 kernel: pata_rz1000 pata_sil680 pata_radisys pata_pdc2027x pata_mpiix aic94xx libsas lpfc crc_t10dif crct10dif_common qla2xxx megaraid_mbox megaraid_mm aacraid sx8 DAC960 hpsa 3w_9xxx 3w_xxxx 3w_sas mptsas mptfc scsi_transport_fc atp870u dc395x qla1280 dmx3191d sym53c8xx gdth initio BusLogic arcmsr aic7xxx aic79xx sg mpt3sas raid_class scsi_transport_sas megaraid megaraid_sas mptspi mptscsih mptbase scsi_transport_spi pdc_adma sata_inic162x sata_mv sata_qstor sata_vsc sata_uli sata_sis pata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise ata_piix ahci libahci libata nvme nvme_core virtio_crypto crypto_engine virtio_pci virtio_balloon virtio_rng virtio_console virtio_blk virtio_ring virtio Feb 3 15:41:55 kernel: CPU: 14 PID: 0 Comm: swapper/14 Tainted: G O 4.19.97-gentoo-x86_64 #1 Feb 3 15:41:55 kernel: Hardware name: Supermicro AS -1114S-WTRT/H12SSW-NT, BIOS 1.0b 11/15/2019 Feb 3 15:41:55 kernel: RIP: 0010:tcp_wfree+0x29/0xe2 Feb 3 15:41:55 kernel: Code: c3 55 53 8b 87 e0 00 00 00 48 8b 6f 18 ff c8 f0 29 85 44 01 00 00 0f 88 d9 4e 08 00 75 0e 48 c7 c7 c3 3e d8 81 e8 e4 1f 9c 
ff <0f> 0b 8b 85 44 01 00 00 3d 40 02 00 00 76 1a 65 48 8b 05 be 95 95 Feb 3 15:41:55 kernel: RSP: 0018:ffff88884ed83e28 EFLAGS: 00010246 Feb 3 15:41:55 kernel: RAX: 0000000000000024 RBX: ffff888804e218e8 RCX: 0000000000000000 Feb 3 15:41:55 kernel: RDX: 0000000000000000 RSI: ffff88884ed952d8 RDI: ffff88884ed952d8 Feb 3 15:41:55 kernel: RBP: ffff88877b8f6ac0 R08: 0000000000000001 R09: 0000000000025400 Feb 3 15:41:55 kernel: R10: 0000000000000000 R11: 0000000000000044 R12: 00000000ffffff07 Feb 3 15:41:55 kernel: R13: 000000000000004a R14: ffffc90002595150 R15: ffff888805d34070 Feb 3 15:41:55 kernel: FS: 0000000000000000(0000) GS:ffff88884ed80000(0000) knlGS:0000000000000000 Feb 3 15:41:55 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 3 15:41:55 kernel: CR2: 00007fe6880c4cbc CR3: 000000080dc80000 CR4: 0000000000340ee0 Feb 3 15:41:55 kernel: Call Trace: Feb 3 15:41:55 kernel: <IRQ> Feb 3 15:41:55 kernel: skb_release_head_state+0x74/0xa4 Feb 3 15:41:55 kernel: skb_release_all+0xa/0x20 Feb 3 15:41:55 kernel: __kfree_skb+0xa/0x14 Feb 3 15:41:55 kernel: igb_poll+0xbe/0xbf3 Feb 3 15:41:55 kernel: net_rx_action+0x12e/0x305 Feb 3 15:41:55 kernel: __do_softirq+0x114/0x267 Feb 3 15:41:55 kernel: irq_exit+0x58/0x64 Feb 3 15:41:55 kernel: do_IRQ+0xaa/0xc8 Feb 3 15:41:55 kernel: common_interrupt+0xf/0xf Feb 3 15:41:55 kernel: </IRQ> Feb 3 15:41:55 kernel: RIP: 0010:cpuidle_enter_state+0x245/0x297 Feb 3 15:41:55 kernel: Code: ff 31 ff e8 0e 4a a4 ff 45 84 ed 74 12 9c 58 0f ba e0 09 73 03 0f 0b fa 31 ff e8 7c 72 a7 ff fb 48 ba ff ff ff ff f3 01 00 00 <48> 2b 2c 24 b8 ff ff ff 7f 48 39 d5 7f 0d 48 89 e8 b9 e8 03 00 00 Feb 3 15:41:55 kernel: RSP: 0018:ffffc90000143e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda Feb 3 15:41:55 kernel: RAX: ffff88884ed9ddc0 RBX: ffff88882c2e8e00 RCX: 000000000000001f Feb 3 15:41:55 kernel: RDX: 000001f3ffffffff RSI: 000000002c235072 RDI: 0000000000000000 Feb 3 15:41:55 kernel: RBP: 00000e678104be06 R08: 00000e678104be06 R09: 
0000000000000000 Feb 3 15:41:55 kernel: R10: 0000000000000000 R11: ffff88884ed9de40 R12: 0000000000000002 Feb 3 15:41:55 kernel: R13: 0000000000000000 R14: ffffffff820b4d38 R15: 0000000000000000 Feb 3 15:41:55 kernel: do_idle+0x104/0x172 Feb 3 15:41:55 kernel: cpu_startup_entry+0x6a/0x6c Feb 3 15:41:55 kernel: start_secondary+0x187/0x1a2 Feb 3 15:41:55 kernel: secondary_startup_64+0xa4/0xb0 Feb 3 15:41:55 kernel: ---[ end trace a6f6b996e449997f ]---

It's always about net/ipv4/tcp_output.c. Using 5.5 doesn't fix this. What can I try now? Should it be reported upstream?

What old version of your kernel seems to work? Can you paste your .config? Have you run 'make oldconfig' to port the options across to new versions, and have you checked all dependencies are being pulled in? Does a plain 'defconfig' work, and/or does a genkernel-built kernel work OK? Something doesn't check out right here...

(In reply to Michael 'veremitz' Everitt from comment #8)
> What old version of your kernel seems to work?

For the older hardware (for which I opened this bug report -- I can't test the new hardware with that older kernel): 4.9.34-gentoo amd64 gentoo-sources

> Can you paste your .config ?

Will attach file.

> Have you 'make oldconfig' to port the options across to new versions, and
> have you checked all dependencies are being pulled in?

# make oldconfig
scripts/kconfig/conf --oldconfig Kconfig
#
# configuration written to .config
#

How do I check that all dependencies are pulled in?

> Does a plain 'defconfig' work, and/or does a genkernel-build kernel work OK?

I can't use defconfig as my system requires features that are disabled by default. I use genkernel to build the kernel and modules.

Created attachment 611552
kernel .config
kernel config file I used to build the kernel with genkernel
A lot more cases today (will attach file). What the traces have in common is that automount is called right before each one (not necessarily with errors). I don't know if that's just a coincidence, or if automount (which tries to access shares on the network) is responsible.

Created attachment 611556 [details]
kernel syslog
# cat /proc/sys/kernel/tainted
4608

which I presume is the ORed value of:
512 (W): A kernel warning has occurred.
4096 (O): An out-of-tree module has been loaded.

What does "out-of-tree" mean exactly? Does it refer to any kernel module provided by a package other than sys-kernel/*-sources (e.g. xtables-addons)?

Here's an example of what shows up in syslog right before the dreaded kernel message. It's not always the same, but it's always about automount (used by proftpd).

proftpd[5651]: pam_unix(ftp:session): session opened for user myftpuser by (uid=0)
automount[17294]: handle_packet: type = 3
automount[17294]: handle_packet_missing_indirect: token 37708, name .pam_environment, request pid 5651
automount[17294]: dev_ioctl_send_fail: token = 37708
automount[17294]: handle_packet: type = 3
automount[17294]: handle_packet_missing_indirect: token 37709, name etc, request pid 5651
automount[17294]: dev_ioctl_send_fail: token = 37709
automount[17294]: handle_packet: type = 3
automount[17294]: handle_packet_missing_indirect: token 37710, name etc, request pid 5651
automount[17294]: dev_ioctl_send_fail: token = 37710
automount[17294]: handle_packet: type = 3
automount[17294]: handle_packet_missing_indirect: token 37711, name etc, request pid 5651
automount[17294]: dev_ioctl_send_fail: token = 37711

Expanding on comments 8 and 9, I know for sure that I didn't have these messages in 4.12.12-gentoo amd64 gentoo-sources.

Hey, I'm really desperate now... I re-installed everything from scratch on perfectly new enterprise-grade hardware. I used genkernel to install gentoo-sources (stable). It's even worse now.
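For reference, the taint value quoted above can be checked by hand: 4608 = 512 + 4096, i.e. bits 9 and 12 of the bitmask. Below is a minimal shell sketch of that decoding. The function name decode_taint is made up for illustration, and only the two flags named in this report (W and O) are mapped; any other set bit is simply printed by its bit number.

```shell
#!/bin/sh
# Decode a kernel taint bitmask into flag letters.
# decode_taint is a hypothetical helper, not an existing tool.
# Only the two bits discussed in this report are named.
decode_taint() {
    value=$1
    bit=0
    out=""
    while [ "$value" -gt 0 ]; do
        if [ $((value & 1)) -eq 1 ]; then
            case $bit in
                9)  out="$out W" ;;        # 512: a kernel warning has occurred
                12) out="$out O" ;;        # 4096: an out-of-tree module has been loaded
                *)  out="$out bit$bit" ;;  # any other taint bit, left unnamed here
            esac
        fi
        value=$((value >> 1))
        bit=$((bit + 1))
    done
    # Strip the leading space added by the first append.
    echo "${out# }"
}

decode_taint 4608
```

On a live system the argument would be the content of /proc/sys/kernel/tainted, e.g. decode_taint "$(cat /proc/sys/kernel/tainted)".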
Just so you get an idea of what's happening:

# grep -i taint messages
Feb 7 22:20:07 gw2 kernel: CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:09 gw2 kernel: CPU: 4 PID: 0 Comm: swapper/4 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:11 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:12 gw2 kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:13 gw2 kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:14 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:15 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:15 gw2 kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:16 gw2 kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:17 gw2 kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:18 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:18 gw2 kernel: CPU: 2 PID: 19519 Comm: RX-NFQ#2 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:20 gw2 kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:20 gw2 kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:24 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:28 gw2 kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:34 gw2 kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:34 gw2 kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:37 gw2 kernel: CPU: 4 PID: 0 Comm: swapper/4 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:37 gw2 kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:38 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:39 gw2 kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:39 gw2 kernel: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:40 gw2 kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:42 gw2 kernel: CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:42 gw2 kernel: CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:44 gw2 kernel: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:45 gw2 kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:46 gw2 kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:49 gw2 kernel: CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:50 gw2 kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:50 gw2 kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W OE 4.19.97-gentoo-x86_64 #1
Feb 7 22:20:50 gw2 kernel: CPU: 4 PID: 19519 Comm: RX-NFQ#2 Tainted: G W OE 4.19.97-gentoo-x86_64 #1

... and there's lots, lots more...

What in the world is wrong? What can I try? Sure, the system hasn't frozen yet, but you must agree that this is totally unusual.

Thanks!

The trace always starts with the following line, approximately every 2 seconds:
kernel: WARNING: CPU: 6 PID: 0 at net/ipv4/tcp_output.c:915 tcp_wfree.cold+0xc/0x13 This time the kernel is not tainted: Feb 11 16:53:40 kernel: ------------[ cut here ]------------ Feb 11 16:53:40 kernel: WARNING: CPU: 6 PID: 0 at net/ipv4/tcp_output.c:915 tcp_wfree.cold+0xc/0x13 Feb 11 16:53:40 kernel: Modules linked in: autofs4 nfnetlink_queue xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_comment xt_iprange l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel xt_set xt_NFQUEUE xt_AUDIT ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt_hashlimit xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port Feb 11 16:53:40 kernel: ip_set_hash_net ip_set_hash_ip ip_set nfnetlink ip6table_filter ip6_tables arptable_filter arp_tables xt_conntrack iptable_mangle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw sch_fq tcp_cdg tcp_bbr iptable_filter ip_tables bpfilter mlx5_ib ipmi_ssif ib_uverbs edac_mce_amd kvm_amd kvm ast ttm irqbypass crct10dif_pclmul efi_pstore ghash_clmulni_intel drm_kms_helper pcspkr efivars ixgbe igb sp5100_tco mlx5_core drm joydev bnxt_en i2c_algo_bit mdio i2c_piix4 mlxfw ccp dca i2c_core ipmi_si ipmi_devintf ipmi_msghandler pinctrl_amd pcc_cpufreq acpi_cpufreq mac_hid efivarfs aesni_intel crypto_simd cryptd glue_helper aes_x86_64 algif_rng algif_aead algif_hash algif_skcipher af_alg crc32c_intel crc32_pclmul crc32_generic msdos fat cramfs overlay squashfs Feb 11 16:53:40 kernel: loop fuse f2fs xfs nfs lockd grace 
sunrpc fscache jfs reiserfs btrfs ext4 mbcache jbd2 multipath linear raid10 raid1 raid0 dm_zero dm_verity reed_solomon dm_thin_pool dm_switch dm_snapshot dm_raid raid456 md_mod async_raid6_recov async_memcpy async_pq raid6_pq dm_mirror dm_region_hash dm_log_writes dm_log_userspace dm_log dm_integrity async_xor async_tx xor dm_flakey dm_delay dm_crypt dm_cache_smq dm_cache dm_persistent_data libcrc32c dm_bufio dm_bio_prison dm_mod firewire_core crc_itu_t hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey hid_microsoft hid_logitech_dj hid_logitech ff_memless hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech sl811_hcd ohci_hcd uhci_hcd uas usb_storage xhci_plat_hcd pata_sl82c105 pata_via pata_jmicron Feb 11 16:53:40 kernel: pata_marvell pata_netcell pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmcia pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_oldpiix pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pata_cmd64x pata_efar pata_sil680 pata_pdc2027x pata_mpiix lpfc nvmet_fc qla2xxx megaraid_mbox megaraid_mm aacraid sx8 hpsa 3w_9xxx 3w_xxxx 3w_sas mptsas mptfc scsi_transport_fc atp870u dc395x qla1280 dmx3191d sym53c8xx gdth initio BusLogic arcmsr aic7xxx aic79xx sr_mod cdrom sg sd_mod mpt3sas raid_class scsi_transport_sas megaraid megaraid_sas mptspi mptscsih mptbase scsi_transport_spi pdc_adma sata_inic162x sata_mv sata_qstor sata_vsc sata_uli sata_sis pata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 Feb 11 16:53:40 kernel: sata_sil sata_promise ata_piix ahci libahci nvme_fc nvme_loop nvmet nvme_rdma rdma_cm iw_cm ib_cm ib_core configfs ipv6 crc_ccitt nvme_fabrics nvme nvme_core Feb 11 16:53:40 kernel: CPU: 6 PID: 0 Comm: swapper/6 Not tainted 4.19.97-gentoo-x86_64 #1 Feb 11 16:53:40 kernel: Hardware name: Supermicro AS -1114S-WTRT/H12SSW-NT, BIOS 1.0b 11/15/2019 Feb 11 16:53:40 kernel: RIP: 
0010:tcp_wfree.cold+0xc/0x13 Feb 11 16:53:40 kernel: Code: 9d 04 00 00 00 5b c6 85 9b 04 00 00 00 5d c3 48 c7 c7 70 93 06 b0 e8 f7 f7 94 ff 0f 0b c3 48 c7 c7 70 93 06 b0 e8 e8 f7 94 ff <0f> 0b e9 46 a5 ff ff 48 c7 c7 70 93 06 b0 e8 d5 f7 94 ff 0f 0b b8 Feb 11 16:53:40 kernel: RSP: 0018:ffff9e9c2b183d90 EFLAGS: 00010246 Feb 11 16:53:40 kernel: RAX: 0000000000000024 RBX: ffff9e9bc099cee8 RCX: 0000000000000000 Feb 11 16:53:40 kernel: RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 0000000000000300 Feb 11 16:53:40 kernel: RBP: ffff9e9bbef09980 R08: ffff9e9c2b1968b8 R09: 0000000000000001 Feb 11 16:53:40 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9e9bc099cee8 Feb 11 16:53:40 kernel: R13: ffff9e9a900100a8 R14: ffff9e9c0155a8c0 R15: 0000000000000026 Feb 11 16:53:40 kernel: FS: 0000000000000000(0000) GS:ffff9e9c2b180000(0000) knlGS:0000000000000000 Feb 11 16:53:40 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 11 16:53:40 kernel: CR2: 00007f90cb702820 CR3: 00000007d41b2000 CR4: 0000000000340ee0 Feb 11 16:53:40 kernel: Call Trace: Feb 11 16:53:40 kernel: <IRQ> Feb 11 16:53:40 kernel: skb_release_head_state+0x64/0xb0 Feb 11 16:53:40 kernel: skb_release_all+0xe/0x30 Feb 11 16:53:40 kernel: consume_skb+0x27/0x80 Feb 11 16:53:40 kernel: bnxt_tx_int+0xd0/0x360 [bnxt_en] Feb 11 16:53:40 kernel: bnxt_poll+0x20f/0x870 [bnxt_en] Feb 11 16:53:40 kernel: net_rx_action+0x148/0x3b0 Feb 11 16:53:40 kernel: __do_softirq+0xe8/0x2f1 Feb 11 16:53:40 kernel: irq_exit+0x100/0x110 Feb 11 16:53:40 kernel: do_IRQ+0x81/0xe0 Feb 11 16:53:40 kernel: common_interrupt+0xf/0xf Feb 11 16:53:40 kernel: </IRQ> Feb 11 16:53:40 kernel: RIP: 0010:cpuidle_enter_state+0xc3/0x320 Feb 11 16:53:40 kernel: Code: e8 82 68 a0 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 30 02 00 00 31 ff e8 84 55 a6 ff fb 66 0f 1f 44 00 00 <48> ba cf f7 53 e3 a5 9b c4 20 4c 29 f5 48 89 e8 48 c1 fd 3f 48 f7 Feb 11 16:53:40 kernel: RSP: 0018:ffffb4440021fe80 EFLAGS: 00000246 ORIG_RAX: 
ffffffffffffffd6
Feb 11 16:53:40 kernel: RAX: ffff9e9c2b1a2200 RBX: ffff9e9c02dfe800 RCX: 000000000000001f
Feb 11 16:53:40 kernel: RDX: 0000000000000000 RSI: 000000002c234d74 RDI: 0000000000000000
Feb 11 16:53:40 kernel: RBP: 000002de50bf72ea R08: 000002de50bf72ea R09: 0000000000000035
Feb 11 16:53:40 kernel: R10: 00000000ffffffff R11: ffff9e9c2b1a12e8 R12: 0000000000000002
Feb 11 16:53:40 kernel: R13: ffffffffb03954a0 R14: 000002de4de20c75 R15: ffff9e95044bcc80
Feb 11 16:53:40 kernel: do_idle+0x1dc/0x270
Feb 11 16:53:40 kernel: cpu_startup_entry+0x6f/0x80
Feb 11 16:53:40 kernel: start_secondary+0x1a7/0x200
Feb 11 16:53:40 kernel: secondary_startup_64+0xb6/0xc0
Feb 11 16:53:40 kernel: ---[ end trace 828aa59c66af655f ]---

Hi,

I think I've found the root cause of this issue, or at least how to reproduce it. The warning messages I reported (which *could* lead to a system hang after running for a long period) disappear if I stop using NFQUEUE.

In my specific case I use NFQUEUE balance 0:5 with iptables-1.6.1. As an IPS I'm using suricata 5.0.1 with the following arguments (among others):

-q 0 -q 1 -q 2 -q 3 -q 4 -q 5

I've reproduced this behavior in several recent Linux kernel versions.
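For anyone trying to reproduce this, the setup described above might look roughly like the following. This is only a sketch under stated assumptions: the report does not give the actual rule, so the chain (FORWARD) and the absence of any match criteria are guesses, and Suricata's other arguments are omitted because they were not posted.

```shell
# Assumed rule: hand forwarded packets to NFQUEUE, load-balanced over queues 0..5.
# The FORWARD chain is an assumption; the report only says "NFQUEUE balance 0:5".
iptables -I FORWARD -j NFQUEUE --queue-balance 0:5

# Suricata listening on the same six queues, as listed in the report
# (its remaining arguments are not known and are left out here).
suricata -q 0 -q 1 -q 2 -q 3 -q 4 -q 5
```

Both commands need root, and removing the NFQUEUE rule again is the step that, per the report, makes the warnings disappear.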
A reminder of the kernel warning message: Feb 13 17:10:01 kernel: ------------[ cut here ]------------ Feb 13 17:10:01 kernel: WARNING: CPU: 5 PID: 0 at net/ipv4/tcp_output.c:915 tcp_wfree.cold+0xc/0x13 Feb 13 17:10:01 kernel: Modules linked in: autofs4 nfnetlink_queue l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_comment xt_iprange xt_set xt_NFQUEUE xt_AUDIT ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt _hashlimit xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp n f_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_p ptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port Feb 13 17:10:01 kernel: ip_set_hash_net ip_set_hash_ip ip_set nfnetlink ip6table_filter ip6_tables arptable_filter arp_tables xt_conntrack iptable_ma ngle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw sch_fq tcp_cdg tcp_bbr iptable_filter ip_tables bpfilter mlx5_ib ip mi_ssif ib_uverbs edac_mce_amd ast kvm_amd ttm kvm drm_kms_helper igb irqbypass efi_pstore crct10dif_pclmul ghash_clmulni_intel sp5100_tco efivars pcspkr mlx5 _core drm ixgbe bnxt_en joydev i2c_algo_bit i2c_piix4 mdio ccp mlxfw dca i2c_core ipmi_si ipmi_devintf ipmi_msghandler pinctrl_amd pcc_cpufreq mac_hid acpi_cp ufreq efivarfs aesni_intel crypto_simd cryptd glue_helper aes_x86_64 algif_rng algif_aead algif_hash algif_skcipher af_alg crc32c_intel crc32_pclmul crc32_gen eric msdos fat cramfs overlay squashfs Feb 13 17:10:01 kernel: loop fuse f2fs xfs nfs lockd grace sunrpc fscache jfs reiserfs btrfs ext4 mbcache jbd2 multipath linear 
raid10 raid1 raid0 dm _zero dm_verity reed_solomon dm_thin_pool dm_switch dm_snapshot dm_raid raid456 md_mod async_raid6_recov async_memcpy async_pq raid6_pq dm_mirror dm_region_ha sh dm_log_writes dm_log_userspace dm_log dm_integrity async_xor async_tx xor dm_flakey dm_delay dm_crypt dm_cache_smq dm_cache dm_persistent_data libcrc32c dm _bufio dm_bio_prison dm_mod firewire_core crc_itu_t hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey hid_microsoft hid_logitech_dj hid_logite ch ff_memless hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech sl811_hcd ohci_hcd uhci_hcd uas usb_storage xhci_plat_ hcd pata_sl82c105 pata_via pata_jmicron Feb 13 17:10:01 kernel: pata_marvell pata_netcell pata_pdc202xx_old pata_triflex pata_atiixp pata_opti pata_amd pata_ali pata_it8213 pata_pcmcia pcmc ia pcmcia_core pata_ns87415 pata_ns87410 pata_serverworks pata_oldpiix pata_artop pata_it821x pata_optidma pata_hpt3x2n pata_hpt3x3 pata_hpt37x pata_hpt366 pa ta_cmd64x pata_efar pata_sil680 pata_pdc2027x pata_mpiix lpfc nvmet_fc qla2xxx megaraid_mbox megaraid_mm aacraid sx8 hpsa 3w_9xxx 3w_xxxx 3w_sas mptsas mptfc scsi_transport_fc atp870u dc395x qla1280 dmx3191d sym53c8xx gdth initio BusLogic arcmsr aic7xxx aic79xx sr_mod cdrom sg sd_mod mpt3sas raid_class scsi_transpo rt_sas megaraid megaraid_sas mptspi mptscsih mptbase scsi_transport_spi pdc_adma sata_inic162x sata_mv sata_qstor sata_vsc sata_uli sata_sis pata_sis sata_sx4 sata_nv sata_via sata_svw sata_sil24 Feb 13 17:10:01 kernel: sata_sil sata_promise ata_piix ahci libahci nvme_fc nvme_loop nvmet nvme_rdma rdma_cm iw_cm ib_cm ib_core configfs ipv6 crc_c citt nvme_fabrics nvme nvme_core Feb 13 17:10:01 kernel: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.19.97-gentoo-x86_64 #1 Feb 13 17:10:01 kernel: Hardware name: Supermicro AS -1114S-WTRT/H12SSW-NT, BIOS 1.0b 11/15/2019 Feb 13 17:10:01 kernel: RIP: 0010:tcp_wfree.cold+0xc/0x13 Feb 13 17:10:01 kernel: Code: 9d 04 00 00 00 5b c6 
85 9b 04 00 00 00 5d c3 48 c7 c7 70 93 06 a2 e8 f7 f7 94 ff 0f 0b c3 48 c7 c7 70 93 06 a2 e8 e8 f7 94 ff <0f> 0b e9 46 a5 ff ff 48 c7 c7 70 93 06 a2 e8 d5 f7 94 ff 0f 0b b8 Feb 13 17:10:01 kernel: RSP: 0018:ffff9e15eb143d90 EFLAGS: 00010246 Feb 13 17:10:01 kernel: RAX: 0000000000000024 RBX: ffff9e15787094e8 RCX: 0000000000000000 Feb 13 17:10:01 kernel: RDX: 0000000000000000 RSI: ffff9e15eb1568b8 RDI: ffff9e15eb1568b8 Feb 13 17:10:01 kernel: RBP: ffff9e15011f1100 R08: ffff9e15eb1568b8 R09: 0000000000000001 Feb 13 17:10:01 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9e15787094e8 Feb 13 17:10:01 kernel: R13: ffff9e0ec3ab10a8 R14: ffff9e15e39de8c0 R15: 000000000000008e Feb 13 17:10:01 kernel: FS: 0000000000000000(0000) GS:ffff9e15eb140000(0000) knlGS:0000000000000000 Feb 13 17:10:01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 13 17:10:01 kernel: CR2: 00007f711864a690 CR3: 0000000804968000 CR4: 0000000000340ee0 Feb 13 17:10:01 kernel: Call Trace: Feb 13 17:10:01 kernel: <IRQ> Feb 13 17:10:01 kernel: skb_release_head_state+0x64/0xb0 Feb 13 17:10:01 kernel: skb_release_all+0xe/0x30 Feb 13 17:10:01 kernel: consume_skb+0x27/0x80 Feb 13 17:10:01 kernel: bnxt_tx_int+0xd0/0x360 [bnxt_en] Feb 13 17:10:01 kernel: bnxt_poll+0x20f/0x870 [bnxt_en] Feb 13 17:10:01 kernel: net_rx_action+0x148/0x3b0 Feb 13 17:10:01 kernel: __do_softirq+0xe8/0x2f1 Feb 13 17:10:01 kernel: irq_exit+0x100/0x110 Feb 13 17:10:01 kernel: do_IRQ+0x81/0xe0 Feb 13 17:10:01 kernel: common_interrupt+0xf/0xf Feb 13 17:10:01 kernel: </IRQ> Feb 13 17:10:01 kernel: RIP: 0010:cpuidle_enter_state+0xc3/0x320 Feb 13 17:10:01 kernel: Code: e8 82 68 a0 ff 80 7c 24 0b 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 30 02 00 00 31 ff e8 84 55 a6 ff fb 66 0f 1f 44 00 00 <48> ba cf f7 53 e3 a5 9b c4 20 4c 29 f5 48 89 e8 48 c1 fd 3f 48 f7 Feb 13 17:10:01 kernel: RSP: 0018:ffffbfbac0217e80 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd6 Feb 13 17:10:01 kernel: RAX: ffff9e15eb162200 RBX: 
ffff9e15c4491c00 RCX: 000000000000001f
Feb 13 17:10:01 kernel: RDX: 0000000000000000 RSI: 000000002c234c74 RDI: 0000000000000000
Feb 13 17:10:01 kernel: RBP: 000000653d4d3728 R08: 000000653d4d3728 R09: 0000000000002707
Feb 13 17:10:01 kernel: R10: 0000000000003268 R11: ffff9e15eb1612e8 R12: 0000000000000002
Feb 13 17:10:01 kernel: R13: ffffffffa23954a0 R14: 000000653d200b61 R15: ffff9e0ec44a2640
Feb 13 17:10:01 kernel: do_idle+0x1dc/0x270
Feb 13 17:10:01 kernel: cpu_startup_entry+0x6f/0x80
Feb 13 17:10:01 kernel: start_secondary+0x1a7/0x200
Feb 13 17:10:01 kernel: secondary_startup_64+0xb6/0xc0
Feb 13 17:10:01 kernel: ---[ end trace 70699422f7793e3b ]---

# ethtool -a isp1
Pause parameters for isp1:
Autonegotiate: on
RX: on
TX: on
RX negotiated: on
TX negotiated: on

# ethtool -c isp1
Coalesce parameters for isp1:
Adaptive RX: off TX: off
stats-block-usecs: 1000000
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
rx-usecs: 14
rx-frames: 15
rx-usecs-irq: 1
rx-frames-irq: 1
tx-usecs: 28
tx-frames: 30
tx-usecs-irq: 2
tx-frames-irq: 2
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

# ethtool -g isp1
Ring parameters for isp1:
Pre-set maximums:
RX: 2047
RX Mini: 0
RX Jumbo: 8191
TX: 2047
Current hardware settings:
RX: 511
RX Mini: 0
RX Jumbo: 2044
TX: 511

# ethtool -i isp1
driver: bnxt_en
version: 1.9.2
firmware-version: 214.0.191.0
expansion-rom-version:
bus-info: 0000:c6:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

# ethtool -k isp1
Features for isp1:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: on
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: on
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: on
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: on
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: on
tls-hw-record: off [fixed]

Regards,

Vieri

The problem is reproducible when using Suricata (or a similar program) in NFQ repeat mode. It goes away if I stop using repeat mode. It seems to be a netfilter issue.

Did you contact the netfilter team as advised by upstream?

(In reply to Mike Pagano from comment #20)
> Did you contact the netfilter team as advised by upstream?

Yes -- https://marc.info/?l=netfilter&m=158214108315073&w=2

I previously contacted the Suricata ML, and they told me to contact the netfilter ML too:
https://lists.openinfosecfoundation.org/pipermail/oisf-users/2020-February/017411.html

From the email list:

"I'll ask the Suricata ML what they think about that."

Any response from upstream in that ML?

(In reply to Mike Pagano from comment #22)
> From the email list:
>
> "I'll ask the Suricata ML what they think about that."
>
> Any response from upstream in that ML?

Yes, they said that netfilter should take care of it:
https://lists.openinfosecfoundation.org/pipermail/oisf-users/2020-February/017411.html

Same advice here:
https://lkml.org/lkml/2020/2/24/130

The netfilter team has been notified.

Is this still an issue with later kernels?

I haven't had this problem anymore. I guess it has been fixed, or now it just "works for me" after updating my systems. |
Created attachment 605028 [details] kernel syslog Same hardware running on older gentoo-sources for years without issues. I've recently upgraded to 4.19.86-gentoo-x86_64, and after a week or so, I got a kernel panic and system freeze. It seems to be network-related. Kernel panic log: [see attached file] Even when not reaching system freeze, I see the following messages in syslog every now and then: Jan 26 17:21:34 kernel: ------------[ cut here ]------------ Jan 26 17:21:34 kernel: WARNING: CPU: 1 PID: 20926 at net/ipv4/tcp_output.c:911 tcp_wfree+0x29/0xe2 Jan 26 17:21:34 kernel: Modules linked in: arc4 ecb md4 sha512_ssse3 sha512_generic cmac cifs ccm fscache nfnetlink_queue autofs4 xt_mac xt_REDIRECT xt_limit xt_nat xt_recent xt_statistic xt_connmark xt_TARPIT(O) xt_comment xt_iprange xt_geoip(O) xt_set xt_NFQUEUE ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark xt_TCPMSS xt_hashlimit xt_tcpudp xt_CT xt_multiport nfnetlink_log xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp pppoe pppox Jan 26 17:21:34 kernel: ppp_generic slhc ip_set_hash_mac ip_set_bitmap_port ip_set_hash_net ip_set_hash_ip ip_set nfnetlink l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel ip6table_filter ip6_tables sha256_ssse3 sha256_generic mcryptd sha1_ssse3 sha1_generic ipv6 arptable_filter arp_tables xt_iface(O) xt_conntrack iptable_mangle iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_raw iptable_filter ip_tables x_tables sch_fq_codel bpfilter sch_fq snd_hda_codec_analog snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_pcm k8temp snd_timer parport_pc floppy 
ohci_pci parport fan snd asus_atk0110 ohci_hcd soundcore thermal ehci_pci button ehci_hcd ata_generic i2c_nforce2 pata_amd pata_acpi msdos configfs fuse f2fs jfs btrfs zstd_decompress zstd_compress xxhash lzo_compress Jan 26 17:21:34 kernel: zlib_deflate sata_nv sata_via sata_svw sata_sil24 sata_sil sata_promise ata_piix ahci libahci libata nvme nvme_core virtio_crypto crypto_engine virtio_pci virtio_balloon virtio_rng virtio_console virtio_blk virtio_ring virtio Jan 26 17:21:34 kernel: CPU: 1 PID: 20926 Comm: W#01 Tainted: G O 4.19.86-gentoo-x86_64 #1 Jan 26 17:21:34 kernel: Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010 Jan 26 17:21:34 kernel: RIP: 0010:tcp_wfree+0x29/0xe2 Jan 26 17:21:34 kernel: Code: c3 55 53 8b 87 e0 00 00 00 48 8b 6f 18 ff c8 f0 29 85 44 01 00 00 0f 88 0f 4e 08 00 75 0e 48 c7 c7 83 3b d8 81 e8 b6 30 9c ff <0f> 0b 8b 85 44 01 00 00 3d 40 02 00 00 76 1a 65 48 8b 05 cc a8 95 Jan 26 17:21:34 kernel: RSP: 0000:ffff88811fc83ee8 EFLAGS: 00010246 Jan 26 17:21:34 kernel: RAX: 0000000000000024 RBX: ffff88807ffceee8 RCX: 0000000000000000 Jan 26 17:21:34 kernel: RDX: 0000000000000000 RSI: ffff88811fc952d8 RDI: ffff88811fc952d8 Jan 26 17:21:34 kernel: RBP: ffff8880376e2600 R08: 0000000000000001 R09: 0000000000009c00 Jan 26 17:21:34 kernel: R10: 0000000000000000 R11: 0000000000000044 R12: ffff88807ffceee8 Jan 26 17:21:34 kernel: R13: 0000000000000000 R14: 0000000000000002 R15: 0000000000000002 Jan 26 17:21:34 kernel: FS: 00007f35b6f15700(0000) GS:ffff88811fc80000(0000) knlGS:0000000000000000 Jan 26 17:21:34 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 26 17:21:34 kernel: CR2: 000055ba3dcdf848 CR3: 000000010f2dc000 CR4: 00000000000006e0 Jan 26 17:21:34 kernel: Call Trace: Jan 26 17:21:34 kernel: <IRQ> Jan 26 17:21:34 kernel: skb_release_head_state+0x74/0xa4 Jan 26 17:21:34 kernel: skb_release_all+0xa/0x20 Jan 26 17:21:34 kernel: __kfree_skb+0xa/0x14 Jan 26 17:21:34 kernel: 
net_tx_action+0xff/0x1bc
Jan 26 17:21:34 kernel: __do_softirq+0x114/0x267
Jan 26 17:21:34 kernel: irq_exit+0x58/0x64
Jan 26 17:21:34 kernel: do_IRQ+0xaa/0xc8
Jan 26 17:21:34 kernel: common_interrupt+0xf/0xf
Jan 26 17:21:34 kernel: </IRQ>
Jan 26 17:21:34 kernel: RIP: 0033:0x564f07ea47e7
Jan 26 17:21:34 kernel: Code: 49 89 fc 55 48 89 cd 53 48 89 d3 4c 8b b2 a0 00 00 00 eb 12 0f 1f 80 00 00 00 00 41 80 7e 01 00 75 41 49 83 c6 10 41 0f b6 16 <49> 8b 4e 08 48 89 ee 4c 89 e7 48 8d 04 d5 00 00 00 00 48 29 d0 48
Jan 26 17:21:34 kernel: RSP: 002b:00007f35b6f13fb0 EFLAGS: 00000206 ORIG_RAX: ffffffffffffffd7
Jan 26 17:21:34 kernel: RAX: 0000564f093d71f0 RBX: 0000564f093d6880 RCX: 00007f359c009bc0
Jan 26 17:21:34 kernel: RDX: 0000000000000004 RSI: 0000564f0ffc7f90 RDI: 00007f35940d3110
Jan 26 17:21:34 kernel: RBP: 00007f359c009bc0 R08: 00007f35b6f14100 R09: 00007f35b6f14100
Jan 26 17:21:34 kernel: R10: 0000000001080007 R11: 0000000000005885 R12: 00007f35940d3110
Jan 26 17:21:34 kernel: R13: 0000564f08319400 R14: 0000564f0ffc7f50 R15: 00007f3574328aa0
Jan 26 17:21:34 kernel: ---[ end trace f5d35299bace3ecb ]---

It seems to be related to IRQs and networking. Is this a bug? I haven't tried vanilla-sources, which is why I'm filing a report here and not on the kernel ML.

Thanks