Bug 923358 - sys-kernel/gentoo-sources >6.6 free_large_kmalloc
Status: UNCONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal major
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: https://www.spinics.net/lists/stable/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-30 19:42 UTC by Josh G
Modified: 2024-02-01 17:31 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Description Josh G 2024-01-30 19:42:54 UTC
Getting this warning (trace below) on every gentoo-sources version from 6.6 onward that I have tried.

Virtual machine on Proxmox host:
Linux whphx13 6.3.0-gentoo #1 SMP PREEMPT_DYNAMIC Sun Apr 30 23:14:46 MST 2023 x86_64 AMD Ryzen 9 5950X 16-Core Processor AuthenticAMD GNU/Linux

Same behavior with nfs-utils-2.6.4-r*

Any suggestions?
-Josh

[  408.179751] ------------[ cut here ]------------
[  408.179754] WARNING: CPU: 14 PID: 5128 at mm/slab_common.c:975 free_large_kmalloc+0x48/0x80
[  408.179775] Modules linked in: nfsd xt_MASQUERADE xt_nat iptable_nat nf_nat xt_set lz4 lz4_compress ip_set_hash_ip zstd zstd_compress zram zsmalloc ip_set_hash_net ip_set sch_fq tcp_bbr crc32_pclmul crc32c_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd bochs virtio_net drm_vram_helper net_failover drm_ttm_helper failover ttm
[  408.179812] CPU: 14 PID: 5128 Comm: nfsd Not tainted 6.7.2-gentoo #3
[  408.179815] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014
[  408.179817] RIP: 0010:free_large_kmalloc+0x48/0x80
[  408.179821] Code: f7 da 48 63 d2 48 8b 03 be 06 00 00 00 48 c1 e8 3a 48 8b 3c c5 20 48 8c 82 e8 e4 60 ff ff 89 ee 48 89 df 5b 5d e9 58 a4 03 00 <0f> 0b 80 3d 93 ae 5f 01 00 74 0b 48 c7 c2 00 f0 ff ff 31 ed eb c5
[  408.179823] RSP: 0018:ffffc90000e47b88 EFLAGS: 00010246
[  408.179827] RAX: 0080000000000000 RBX: ffffea0000808880 RCX: 0000000000000000
[  408.179830] RDX: 0000000000000001 RSI: ffffffffa0222a55 RDI: ffffea0000808880
[  408.179831] RBP: ffff8980c4f88240 R08: ffffffff82606b18 R09: ffffffff82606b18
[  408.179832] R10: 0000000000000400 R11: 0000000000000000 R12: ffff8980c58bc028
[  408.179834] R13: ffff898223d18240 R14: ffff898001ad8900 R15: ffff8980c4f88000
[  408.179836] FS:  0000000000000000(0000) GS:ffff88803eb80000(0000) knlGS:0000000000000000
[  408.179840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  408.179841] CR2: 00007f374c425550 CR3: 00000100dcd3e000 CR4: 0000000000750ef0
[  408.179843] PKRU: 55555554
[  408.179845] Call Trace:
[  408.179850]  <TASK>
[  408.179852]  ? __warn+0x78/0x140
[  408.179866]  ? free_large_kmalloc+0x48/0x80
[  408.179869]  ? report_bug+0x18d/0x1c0
[  408.179885]  ? handle_bug+0x3a/0x70
[  408.179889]  ? exc_invalid_op+0x13/0x60
[  408.179895]  ? asm_exc_invalid_op+0x16/0x20
[  408.179904]  ? nfsd_setuser+0x185/0x270 [nfsd]
[  408.179927]  ? free_large_kmalloc+0x48/0x80
[  408.179929]  nfsd4_encode_fattr4+0x390/0x500 [nfsd]
[  408.179950]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.179955]  ? exportfs_decode_fh_raw+0x101/0x2b0
[  408.179968]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.179971]  ? __kmem_cache_alloc_node+0x108/0x1e0
[  408.179978]  ? security_prepare_creds+0xcc/0xf0
[  408.179990]  ? security_prepare_creds+0xcc/0xf0
[  408.179992]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.179995]  ? __kmalloc+0x43/0x150
[  408.179997]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.180000]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.180003]  ? security_prepare_creds+0x55/0xf0
[  408.180005]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.180008]  ? prepare_creds+0x18a/0x290
[  408.180017]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.180020]  ? nfsd_setuser+0x185/0x270 [nfsd]
[  408.180038]  nfsd4_encode_getattr+0x23/0x30 [nfsd]
[  408.180054]  nfsd4_encode_operation+0xa1/0x2b0 [nfsd]
[  408.180071]  nfsd4_proc_compound+0x1cb/0x660 [nfsd]
[  408.180088]  nfsd_dispatch+0xc7/0x220 [nfsd]
[  408.180110]  svc_process_common+0x3a6/0x610
[  408.180116]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[  408.180130]  svc_process+0x13a/0x170
[  408.180132]  svc_recv+0x801/0xa20
[  408.180137]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  408.180151]  nfsd+0x7b/0xe0 [nfsd]
[  408.180165]  kthread+0xee/0x120
[  408.180168]  ? __pfx_kthread+0x10/0x10
[  408.180171]  ret_from_fork+0x2b/0x40
[  408.180176]  ? __pfx_kthread+0x10/0x10
[  408.180178]  ret_from_fork_asm+0x1b/0x30
[  408.180184]  </TASK>
[  408.180185] ---[ end trace 0000000000000000 ]---
[  408.180187] object pointer: 0x00000000e625140d
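
A note for reading the call trace above: entries prefixed with "?" are addresses the unwinder found on the stack but could not confirm as part of the call chain, so the unmarked entries are the ones to read first. Below is a minimal Python sketch for filtering them, assuming the dmesg line layout shown in this report. The names reliable_frames, FRAME_RE and SAMPLE are illustrative only and do not come from any existing tool.

```python
import re

# One call-trace line: "[ time] ", an optional "? " marker, then
# symbol+0xoffset/0xsize and an optional "[module]" suffix.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<guess>\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)

# Excerpt copied from the trace in this report.
SAMPLE = """
[  408.179850]  <TASK>
[  408.179852]  ? __warn+0x78/0x140
[  408.179927]  ? free_large_kmalloc+0x48/0x80
[  408.179929]  nfsd4_encode_fattr4+0x390/0x500 [nfsd]
[  408.179950]  ? srso_alias_return_thunk+0x5/0xfbef5
[  408.180038]  nfsd4_encode_getattr+0x23/0x30 [nfsd]
[  408.180054]  nfsd4_encode_operation+0xa1/0x2b0 [nfsd]
[  408.180071]  nfsd4_proc_compound+0x1cb/0x660 [nfsd]
[  408.180088]  nfsd_dispatch+0xc7/0x220 [nfsd]
[  408.180110]  svc_process_common+0x3a6/0x610
[  408.180184]  </TASK>
"""


def reliable_frames(trace_text):
    """Return (symbol, module) for frames between <TASK> and </TASK>
    that are not marked with "?". module is None for built-in code."""
    frames = []
    in_trace = False
    for line in trace_text.splitlines():
        if "</TASK>" in line:
            break
        if "<TASK>" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.match(line)
        if match and not match.group("guess"):
            frames.append((match.group("symbol"), match.group("module")))
    return frames


for symbol, module in reliable_frames(SAMPLE):
    print(symbol, f"[{module}]" if module else "(built-in)")
```

On the excerpt, this keeps only the chain from nfsd4_encode_fattr4 down to svc_process_common and drops the "?" entries.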
Comment 1 Mike Pagano gentoo-dev 2024-01-31 12:10:38 UTC
Last working kernel?
Comment 2 Josh G 2024-01-31 15:30:31 UTC
(In reply to Mike Pagano from comment #1)
> Last working kernel?

6.3.0-gentoo

I jumped from 6.3.0 to 6.6.3, and then tried 6.7.2.

I even tried starting fresh from defconfig and then adding the modules and options we need.

This is a virtual machine on Proxmox 7.4 / QEMU 7.2. I'm not using memory ballooning, there are no RAM errors on the host, and nothing else indicates a hardware problem.
The host is an AMD Ryzen 9 5950X with 128 GB of ECC RAM.

Strangest thing I've seen since I found an OOM bug in the kernel 15ish years ago.

Thanks!
-Josh
Comment 3 Mike Pagano gentoo-dev 2024-01-31 16:20:26 UTC
There are some patches around this area in 6.8.  Not sure if you feel brave enough to try a git-sources-6.8-rcX.
Comment 4 Josh G 2024-01-31 18:54:59 UTC
(In reply to Mike Pagano from comment #3)
> There are some patches around this area in 6.8.  Not sure if you feel brave
> enough to try a git-sources-6.8-rcX.

I'm tempted, but prudence says I shouldn't run an RC kernel on a production machine. I might try it over the weekend.

Appreciate the responses.
Comment 5 Mike Pagano gentoo-dev 2024-02-01 17:15:02 UTC
This VM is running gentoo-sources 6.3.0? Can you upgrade this please?
Comment 6 Josh G 2024-02-01 17:28:20 UTC
(In reply to Mike Pagano from comment #5)
> This VM is running gentoo-sources 6.3.0? Can you upgrade this please?

Mike, I'd love to.

Note that my problem is with either the NFSD driver or the memory manager on versions >6.3. 

If I had to bet, it's something wonky in the NFSv4.x code of the NFSD driver.

Per your suggestion to try the fixes in 6.8, I'm waiting for the release and will then give vanilla a shot.

Other than that, I'm open to suggestions.
Comment 7 Josh G 2024-02-01 17:31:48 UTC
The kernel trace is from gentoo-sources 6.7.2-r1.

Brilliant me pasted the uname -a output from after I reverted to 6.3.

-J
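
The mix-up described in this comment can be caught mechanically by comparing the kernel release printed in the trace header with the one in the uname -a output. This is a rough Python sketch, assuming the header layout seen in this report ("... Not tainted <release> #<build>") and the usual uname -a field order. The helper names are hypothetical.

```python
import re

# The taint status in a WARNING/oops header is followed by the kernel
# release and the build number, e.g. "Not tainted 6.7.2-gentoo #3".
RELEASE_RE = re.compile(
    r"(?:Not tainted|Tainted: .+?) (?P<release>\d+\.\d+\S*) #\d+"
)

# Both lines are copied from this report.
TRACE_HEADER = "[  408.179812] CPU: 14 PID: 5128 Comm: nfsd Not tainted 6.7.2-gentoo #3"
UNAME_LINE = "Linux whphx13 6.3.0-gentoo #1 SMP PREEMPT_DYNAMIC Sun Apr 30 23:14:46 MST 2023 x86_64"


def release_from_trace(header_line):
    """Return the kernel release named in a trace header, or None."""
    match = RELEASE_RE.search(header_line)
    return match.group("release") if match else None


def release_from_uname(uname_line):
    """uname -a prints: sysname nodename release version ..."""
    return uname_line.split()[2]


def same_kernel(header_line, uname_line):
    """True when the trace and the uname output name the same release."""
    return release_from_trace(header_line) == release_from_uname(uname_line)


if not same_kernel(TRACE_HEADER, UNAME_LINE):
    print(
        "trace is from", release_from_trace(TRACE_HEADER),
        "but uname reports", release_from_uname(UNAME_LINE),
    )
```

Run on the two lines from this report, the check flags the mismatch between the 6.7.2-gentoo trace and the 6.3.0-gentoo uname output.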