Hi,

I have an OverlayFS merging 2 lower layers (btrfs). This merge is exported to my network using NFS. This worked for a very long time, but suddenly stopped after an update a few months ago (sorry, I didn’t notice the problem at once, so I can’t say exactly when it broke).

Here is the error:

[ 815.449084] BUG: kernel NULL pointer dereference, address: 0000000000000068
[ 815.449108] #PF: supervisor read access in kernel mode
[ 815.449120] #PF: error_code(0x0000) - not-present page
[ 815.449132] PGD 0 P4D 0
[ 815.449145] Oops: 0000 [#1] PREEMPT SMP PTI
[ 815.449159] CPU: 1 PID: 1863 Comm: rpc.mountd Not tainted 6.6.21-gentoo #1
[ 815.449176] Hardware name: System manufacturer System Product Name/PRIME B360M-K, BIOS 2811 05/27/2020
[ 815.449192] RIP: 0010:ovl_encode_real_fh+0x36/0x120
[ 815.449213] Code: 55 49 89 fd 41 54 41 89 d4 ba 98 00 00 00 53 48 83 ec 10 48 8b 3d 9a f6 f0 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <4c> 8b 76 68 be c0 0d 00 00 e8 8c da d6 ff 48 85 c0 0f 84 bb 00 00
[ 815.449241] RSP: 0018:ffffc9000099bdf0 EFLAGS: 00010246
[ 815.449256] RAX: 0000000000000000 RBX: ffffc9000099bebc RCX: 0000000000000000
[ 815.449270] RDX: 0000000000000098 RSI: 0000000000000000 RDI: ffff888100041800
[ 815.449283] RBP: ffffc9000099be28 R08: 0000000000000000 R09: 0000000000000000
[ 815.449296] R10: 000000000002d3b0 R11: 0000000000000002 R12: 0000000000000001
[ 815.449309] R13: ffff888296e97480 R14: ffff888296e97480 R15: 0000000000000000
[ 815.449322] FS: 00007f55117087c0(0000) GS:ffff88845dc80000(0000) knlGS:0000000000000000
[ 815.449339] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 815.449351] CR2: 0000000000000068 CR3: 000000011cbe4002 CR4: 00000000003706e0
[ 815.449365] Call Trace:
[ 815.449374] <TASK>
[ 815.449381] ? __die+0x1a/0x60
[ 815.449399] ? page_fault_oops+0x158/0x440
[ 815.449414] ? generic_permission+0x30/0x220
[ 815.449432] ? exc_page_fault+0x3d9/0x6a0
[ 815.449448] ? asm_exc_page_fault+0x22/0x30
[ 815.449463] ? ovl_encode_real_fh+0x36/0x120
[ 815.449478] ? preempt_count_add+0x65/0xa0
[ 815.449496] ovl_encode_fh+0x25d/0x420
[ 815.449511] exportfs_encode_fh+0x2b/0x70
[ 815.449532] do_sys_name_to_handle+0xaf/0x1a0
[ 815.449558] __x64_sys_name_to_handle_at+0x98/0xc0
[ 815.449575] do_syscall_64+0x38/0x90
[ 815.449588] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 815.449603] RIP: 0033:0x7f551198ecce
[ 815.449614] Code: 48 8b 0d 65 51 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 2f 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 32 51 0c 00 f7 d8 64 89 01 48
[ 815.449641] RSP: 002b:00007ffe7561b2a8 EFLAGS: 00000206 ORIG_RAX: 000000000000012f
[ 815.449659] RAX: ffffffffffffffda RBX: 0000561d0f629f18 RCX: 00007f551198ecce
[ 815.449672] RDX: 00007ffe7561b350 RSI: 0000561d0f629f18 RDI: 00000000ffffff9c
[ 815.449685] RBP: 0000561d0f629f18 R08: 0000000000000000 R09: 0000000000000001
[ 815.449698] R10: 00007ffe7561b2bc R11: 0000000000000206 R12: 000000000000000a
[ 815.449711] R13: 0000561d0f629f00 R14: 0000000000000001 R15: 0000000000000000
[ 815.449725] </TASK>
[ 815.449733] CR2: 0000000000000068
[ 815.449743] ---[ end trace 0000000000000000 ]---
[ 815.449754] RIP: 0010:ovl_encode_real_fh+0x36/0x120
[ 815.449770] Code: 55 49 89 fd 41 54 41 89 d4 ba 98 00 00 00 53 48 83 ec 10 48 8b 3d 9a f6 f0 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <4c> 8b 76 68 be c0 0d 00 00 e8 8c da d6 ff 48 85 c0 0f 84 bb 00 00
[ 815.449796] RSP: 0018:ffffc9000099bdf0 EFLAGS: 00010246
[ 815.449809] RAX: 0000000000000000 RBX: ffffc9000099bebc RCX: 0000000000000000
[ 815.449822] RDX: 0000000000000098 RSI: 0000000000000000 RDI: ffff888100041800
[ 815.449835] RBP: ffffc9000099be28 R08: 0000000000000000 R09: 0000000000000000
[ 815.449847] R10: 000000000002d3b0 R11: 0000000000000002 R12: 0000000000000001
[ 815.449859] R13: ffff888296e97480 R14: ffff888296e97480 R15: 0000000000000000
[ 815.449872] FS: 00007f55117087c0(0000) GS:ffff88845dc80000(0000) knlGS:0000000000000000
[ 815.449887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 815.449899] CR2: 0000000000000068 CR3: 000000011cbe4002 CR4: 00000000003706e0
[ 815.449913] note: rpc.mountd[1863] exited with irqs disabled

The current version of my kernel is:
Linux 6.6.21-gentoo #1 SMP PREEMPT_DYNAMIC Sat Mar 16 17:04:48 CET 2024 x86_64 Intel(R) Pentium(R) Gold G5400 CPU @ 3.70GHz GenuineIntel GNU/Linux

NFS export:
/exports/media -sync,no_subtree_check,mp,all_squash,ro,fsid=1 *.my.local.network xx.xx.0.0/16(insecure)

fstab:
overlay /exports/media overlay lowerdir=/var/media/Internes:/var/media/Externes,nfs_export=on,redirect_dir=nofollow 0 0

I used to have index=on, but dmesg reported it as useless, so I removed it. That did not change anything.

Please tell me if I should report the bug upstream instead.
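For anyone trying to reproduce this without going through rpc.mountd: the user-space half of the trace shows the crash being reached through the name_to_handle_at syscall (ORIG_RAX 0x12f is 303, RDI ffffff9c is AT_FDCWD, R8 is 0 for flags). The following is only a minimal sketch of that call through ctypes, not what rpc.mountd actually does internally. The helper name encode_handle is made up for illustration. Be careful: on an affected kernel, calling it on a path inside the overlay mount is expected to trigger the same oops.

```python
import ctypes
import os

MAX_HANDLE_SZ = 128  # upper bound for f_handle, as defined in <fcntl.h>
AT_FDCWD = -100      # "relative to the current directory"; 0xffffff9c as a 32-bit value


class FileHandle(ctypes.Structure):
    """Mirror of struct file_handle with room for the largest handle."""
    _fields_ = [
        ("handle_bytes", ctypes.c_uint),
        ("handle_type", ctypes.c_int),
        ("f_handle", ctypes.c_ubyte * MAX_HANDLE_SZ),
    ]


def encode_handle(path):
    """Hypothetical helper: ask the kernel for a file handle for `path`.

    Returns (handle_type, handle_bytes, mount_id) or raises OSError.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    handle = FileHandle()
    handle.handle_bytes = MAX_HANDLE_SZ  # tell the kernel how much space we have
    mount_id = ctypes.c_int()
    rc = libc.name_to_handle_at(
        AT_FDCWD,
        os.fsencode(path),
        ctypes.byref(handle),
        ctypes.byref(mount_id),
        0,  # flags, same as in the trace (R8 = 0)
    )
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    return (
        handle.handle_type,
        bytes(handle.f_handle[:handle.handle_bytes]),
        mount_id.value,
    )


# Example (assumption: /exports/media is the overlay mount from the fstab line):
#   print(encode_handle("/exports/media"))
```

On a filesystem that cannot encode handles this should simply raise OSError, so running it first on a path outside the overlay is a cheap sanity check.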
Is this vanilla, gentoo-sources, or something else? What was the last working kernel? Have you tried the latest 6.8.x kernel (6.8.2 as of this writing) to see if something was fixed?
Hi,

Sorry for the lack of information. This is gentoo-sources, v6.6.21 (but I am pretty sure that the problem appeared before this version). I haven’t tried v6.8 yet, but I will now and report the result.
Just tested with "Linux 6.8.4-gentoo" and the problem is the same.
(In reply to Stéphane Veyret from comment #3)
> Just tested with "Linux 6.8.4-gentoo" and the problem is the same.

What was the last working kernel?
Unfortunately, as I said, I didn’t notice the problem at once, so I cannot say for sure when it broke. What I can say is that I update my computer once a week with the latest stable gentoo-sources kernel, and it was still working at the end of December. Sorry for not being more precise.
Is this still an issue with later kernels?
No luck with gentoo-sources-6.9.1 either:

[dim. 19 mai 12:07:27 2024] BUG: kernel NULL pointer dereference, address: 0000000000000070
[dim. 19 mai 12:07:27 2024] #PF: supervisor read access in kernel mode
[dim. 19 mai 12:07:27 2024] #PF: error_code(0x0000) - not-present page
[dim. 19 mai 12:07:27 2024] PGD 0 P4D 0
[dim. 19 mai 12:07:27 2024] Oops: 0000 [#1] PREEMPT SMP PTI
[dim. 19 mai 12:07:27 2024] CPU: 3 PID: 3236 Comm: rpc.mountd Not tainted 6.9.1-gentoo #1
[dim. 19 mai 12:07:27 2024] Hardware name: System manufacturer System Product Name/PRIME B360M-K, BIOS 2811 05/27/2020
[dim. 19 mai 12:07:27 2024] RIP: 0010:ovl_encode_real_fh+0x36/0x120
[dim. 19 mai 12:07:27 2024] Code: 55 49 89 fd 41 54 41 89 d4 ba 98 00 00 00 53 48 83 ec 10 48 8b 3d ca 27 f0 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <4c> 8b 76 70 be c0 0d 00 00 e8 cc f3 d9 ff 48 85 c0 0f 84 bb 00 00
[dim. 19 mai 12:07:27 2024] RSP: 0018:ffffb46fc14abde8 EFLAGS: 00010246
[dim. 19 mai 12:07:27 2024] RAX: 0000000000000000 RBX: ffffb46fc14abeb4 RCX: 0000000000000000
[dim. 19 mai 12:07:27 2024] RDX: 0000000000000098 RSI: 0000000000000000 RDI: ffffa0fe80042800
[dim. 19 mai 12:07:27 2024] RBP: ffffb46fc14abe20 R08: 0000000000000000 R09: 0000000000000000
[dim. 19 mai 12:07:27 2024] R10: ffffb46fc14abeb0 R11: 0000000000000002 R12: 0000000000000001
[dim. 19 mai 12:07:27 2024] R13: ffffa0fe87a54600 R14: ffffa0fe87a54600 R15: 0000000000000000
[dim. 19 mai 12:07:27 2024] FS: 00007f92d28f47c0(0000) GS:ffffa101ddd80000(0000) knlGS:0000000000000000
[dim. 19 mai 12:07:27 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[dim. 19 mai 12:07:27 2024] CR2: 0000000000000070 CR3: 000000021326c002 CR4: 00000000003706f0
[dim. 19 mai 12:07:27 2024] Call Trace:
[dim. 19 mai 12:07:27 2024] <TASK>
[dim. 19 mai 12:07:27 2024] ? __die+0x1a/0x60
[dim. 19 mai 12:07:27 2024] ? page_fault_oops+0x157/0x450
[dim. 19 mai 12:07:27 2024] ? generic_permission+0x30/0x220
[dim. 19 mai 12:07:27 2024] ? inode_permission+0xd2/0x180
[dim. 19 mai 12:07:27 2024] ? generic_permission+0x30/0x220
[dim. 19 mai 12:07:27 2024] ? exc_page_fault+0x3de/0x6a0
[dim. 19 mai 12:07:27 2024] ? asm_exc_page_fault+0x22/0x30
[dim. 19 mai 12:07:27 2024] ? ovl_encode_real_fh+0x36/0x120
[dim. 19 mai 12:07:27 2024] ? preempt_count_add+0x64/0xa0
[dim. 19 mai 12:07:27 2024] ovl_encode_fh+0x25d/0x400
[dim. 19 mai 12:07:27 2024] ? __kmalloc+0x157/0x3b0
[dim. 19 mai 12:07:27 2024] exportfs_encode_fh+0x2b/0x70
[dim. 19 mai 12:07:27 2024] do_sys_name_to_handle+0xbc/0x1f0
[dim. 19 mai 12:07:27 2024] __x64_sys_name_to_handle_at+0x98/0xc0
[dim. 19 mai 12:07:27 2024] do_syscall_64+0x48/0x110
[dim. 19 mai 12:07:27 2024] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[dim. 19 mai 12:07:27 2024] RIP: 0033:0x7f92d2b761fe
[dim. 19 mai 12:07:27 2024] Code: 48 8b 0d 1d cc 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 2f 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ea cb 0c 00 f7 d8 64 89 01 48
[dim. 19 mai 12:07:27 2024] RSP: 002b:00007fff1c241938 EFLAGS: 00000206 ORIG_RAX: 000000000000012f
[dim. 19 mai 12:07:27 2024] RAX: ffffffffffffffda RBX: 000055fcd3c11f38 RCX: 00007f92d2b761fe
[dim. 19 mai 12:07:27 2024] RDX: 00007fff1c2419e0 RSI: 000055fcd3c11f38 RDI: 00000000ffffff9c
[dim. 19 mai 12:07:27 2024] RBP: 000055fcd3c11f38 R08: 0000000000000000 R09: 0000000000000001
[dim. 19 mai 12:07:27 2024] R10: 00007fff1c24194c R11: 0000000000000206 R12: 000000000000000a
[dim. 19 mai 12:07:27 2024] R13: 000055fcd3c11f20 R14: 0000000000000001 R15: 0000000000000000
[dim. 19 mai 12:07:27 2024] </TASK>
[dim. 19 mai 12:07:27 2024] CR2: 0000000000000070
[dim. 19 mai 12:07:27 2024] ---[ end trace 0000000000000000 ]---
[dim. 19 mai 12:07:27 2024] RIP: 0010:ovl_encode_real_fh+0x36/0x120
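Side note for comparing the 6.6.21 and 6.9.1 reports: both fault at the same place, ovl_encode_real_fh+0x36, and only the fault address moves from 0x68 to 0x70. That matches the displacement byte in the faulting instruction (`<4c> 8b 76 68` versus `<4c> 8b 76 70`) with RSI being 0 in both dumps, which suggests the same NULL pointer with a structure field that shifted by 8 bytes between versions. Below is a small sketch that pulls those fields out of an oops so two reports can be compared mechanically. The function name summarize_oops and the two excerpt constants are made up for illustration; the excerpts are copied from the reports in this thread.

```python
import re

# Kernel-mode RIP line, e.g. "RIP: 0010:ovl_encode_real_fh+0x36/0x120"
RIP_RE = re.compile(r"RIP: 0010:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")
# Faulting address, e.g. "CR2: 0000000000000068"
CR2_RE = re.compile(r"CR2: ([0-9a-f]{16})")

OOPS_6_6_21 = """\
RIP: 0010:ovl_encode_real_fh+0x36/0x120
CR2: 0000000000000068 CR3: 000000011cbe4002 CR4: 00000000003706e0
"""

OOPS_6_9_1 = """\
RIP: 0010:ovl_encode_real_fh+0x36/0x120
CR2: 0000000000000070 CR3: 000000021326c002 CR4: 00000000003706f0
"""


def summarize_oops(text):
    """Return the faulting symbol, offset, function size and fault address."""
    rip = RIP_RE.search(text)
    cr2 = CR2_RE.search(text)
    if rip is None or cr2 is None:
        raise ValueError("no kernel-mode RIP or CR2 line found")
    return {
        "symbol": rip.group(1),
        "offset": int(rip.group(2), 16),
        "size": int(rip.group(3), 16),
        "fault_address": int(cr2.group(1), 16),
    }


if __name__ == "__main__":
    old = summarize_oops(OOPS_6_6_21)
    new = summarize_oops(OOPS_6_9_1)
    print("same location:", (old["symbol"], old["offset"]) == (new["symbol"], new["offset"]))
    print("fault address delta:", new["fault_address"] - old["fault_address"])
```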
Can you reproduce this with some debug settings enabled in your kernel?

CONFIG_DEBUG_KERNEL
CONFIG_DEBUG_INFO
CONFIG_KALLSYMS

And then post the error?
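To check whether a given kernel config already has those three options before rebuilding, something like the sketch below may help. It only parses the text of a config file; the function name missing_options is made up for illustration, and where the config lives is an assumption about your setup (typically /usr/src/linux/.config for gentoo-sources, or /proc/config.gz if CONFIG_IKCONFIG_PROC is enabled).

```python
WANTED = ("CONFIG_DEBUG_KERNEL", "CONFIG_DEBUG_INFO", "CONFIG_KALLSYMS")


def missing_options(config_text, wanted=WANTED):
    """Return the options from `wanted` that are not set to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set" comments.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in wanted if opt not in enabled]


# Example (assumption: the config is at /usr/src/linux/.config):
#   with open("/usr/src/linux/.config") as f:
#       print(missing_options(f.read()))
```

An empty list means all three options are already enabled in that config.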
Hello,

I wanted to try your suggestion with the kernel from gentoo-sources-6.9.7, but this version no longer shows the error. The bug seems to be fixed. Thank you for your help.