Bug 928379 - kernel NULL pointer dereference with overlayfs and nfs
Summary: kernel NULL pointer dereference with overlayfs and nfs
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal major
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-04-01 09:59 UTC by Stéphane Veyret
Modified: 2024-07-01 05:21 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Description Stéphane Veyret 2024-04-01 09:59:52 UTC
Hi,

I have an OverlayFS merging 2 lower layers (btrfs). This merge is exported to my network using NFS.
This worked for a very long time, but stopped after an update a few months ago (sorry, I didn’t notice the problem right away, so I can’t say exactly when it broke).

Here is the error:

[  815.449084] BUG: kernel NULL pointer dereference, address: 0000000000000068
[  815.449108] #PF: supervisor read access in kernel mode
[  815.449120] #PF: error_code(0x0000) - not-present page
[  815.449132] PGD 0 P4D 0 
[  815.449145] Oops: 0000 [#1] PREEMPT SMP PTI
[  815.449159] CPU: 1 PID: 1863 Comm: rpc.mountd Not tainted 6.6.21-gentoo #1
[  815.449176] Hardware name: System manufacturer System Product Name/PRIME B360M-K, BIOS 2811 05/27/2020
[  815.449192] RIP: 0010:ovl_encode_real_fh+0x36/0x120
[  815.449213] Code: 55 49 89 fd 41 54 41 89 d4 ba 98 00 00 00 53 48 83 ec 10 48 8b 3d 9a f6 f0 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <4c> 8b 76 68 be c0 0d 00 00 e8 8c da d6 ff 48 85 c0 0f 84 bb 00 00
[  815.449241] RSP: 0018:ffffc9000099bdf0 EFLAGS: 00010246
[  815.449256] RAX: 0000000000000000 RBX: ffffc9000099bebc RCX: 0000000000000000
[  815.449270] RDX: 0000000000000098 RSI: 0000000000000000 RDI: ffff888100041800
[  815.449283] RBP: ffffc9000099be28 R08: 0000000000000000 R09: 0000000000000000
[  815.449296] R10: 000000000002d3b0 R11: 0000000000000002 R12: 0000000000000001
[  815.449309] R13: ffff888296e97480 R14: ffff888296e97480 R15: 0000000000000000
[  815.449322] FS:  00007f55117087c0(0000) GS:ffff88845dc80000(0000) knlGS:0000000000000000
[  815.449339] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  815.449351] CR2: 0000000000000068 CR3: 000000011cbe4002 CR4: 00000000003706e0
[  815.449365] Call Trace:
[  815.449374]  <TASK>
[  815.449381]  ? __die+0x1a/0x60
[  815.449399]  ? page_fault_oops+0x158/0x440
[  815.449414]  ? generic_permission+0x30/0x220
[  815.449432]  ? exc_page_fault+0x3d9/0x6a0
[  815.449448]  ? asm_exc_page_fault+0x22/0x30
[  815.449463]  ? ovl_encode_real_fh+0x36/0x120
[  815.449478]  ? preempt_count_add+0x65/0xa0
[  815.449496]  ovl_encode_fh+0x25d/0x420
[  815.449511]  exportfs_encode_fh+0x2b/0x70
[  815.449532]  do_sys_name_to_handle+0xaf/0x1a0
[  815.449558]  __x64_sys_name_to_handle_at+0x98/0xc0
[  815.449575]  do_syscall_64+0x38/0x90
[  815.449588]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  815.449603] RIP: 0033:0x7f551198ecce
[  815.449614] Code: 48 8b 0d 65 51 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 2f 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 32 51 0c 00 f7 d8 64 89 01 48
[  815.449641] RSP: 002b:00007ffe7561b2a8 EFLAGS: 00000206 ORIG_RAX: 000000000000012f
[  815.449659] RAX: ffffffffffffffda RBX: 0000561d0f629f18 RCX: 00007f551198ecce
[  815.449672] RDX: 00007ffe7561b350 RSI: 0000561d0f629f18 RDI: 00000000ffffff9c
[  815.449685] RBP: 0000561d0f629f18 R08: 0000000000000000 R09: 0000000000000001
[  815.449698] R10: 00007ffe7561b2bc R11: 0000000000000206 R12: 000000000000000a
[  815.449711] R13: 0000561d0f629f00 R14: 0000000000000001 R15: 0000000000000000
[  815.449725]  </TASK>
[  815.449733] CR2: 0000000000000068
[  815.449743] ---[ end trace 0000000000000000 ]---
[  815.449754] RIP: 0010:ovl_encode_real_fh+0x36/0x120
[  815.449770] Code: 55 49 89 fd 41 54 41 89 d4 ba 98 00 00 00 53 48 83 ec 10 48 8b 3d 9a f6 f0 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <4c> 8b 76 68 be c0 0d 00 00 e8 8c da d6 ff 48 85 c0 0f 84 bb 00 00
[  815.449796] RSP: 0018:ffffc9000099bdf0 EFLAGS: 00010246
[  815.449809] RAX: 0000000000000000 RBX: ffffc9000099bebc RCX: 0000000000000000
[  815.449822] RDX: 0000000000000098 RSI: 0000000000000000 RDI: ffff888100041800
[  815.449835] RBP: ffffc9000099be28 R08: 0000000000000000 R09: 0000000000000000
[  815.449847] R10: 000000000002d3b0 R11: 0000000000000002 R12: 0000000000000001
[  815.449859] R13: ffff888296e97480 R14: ffff888296e97480 R15: 0000000000000000
[  815.449872] FS:  00007f55117087c0(0000) GS:ffff88845dc80000(0000) knlGS:0000000000000000
[  815.449887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  815.449899] CR2: 0000000000000068 CR3: 000000011cbe4002 CR4: 00000000003706e0
[  815.449913] note: rpc.mountd[1863] exited with irqs disabled
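
For readers who want to cross-check the oops, here is a minimal Python sketch that decodes the marked faulting instruction and compares it with the register dump. It is only an illustration: the function name `decode_fault` is made up, and it handles just the one instruction form seen here (a 64-bit `mov` load with an 8-bit displacement). On this trace it reports a load from RSI+0x68 into R14 with RSI equal to 0, which is consistent with the reported fault address 0x68, and suggests (but does not prove) that a NULL pointer was passed as the second argument of ovl_encode_real_fh.

```python
import re

# Register names as they appear in the oops register dump, in x86-64 encoding order.
REGS = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI",
        "R08", "R09", "R10", "R11", "R12", "R13", "R14", "R15"]


def decode_fault(oops_text):
    """Decode the faulting 'mov r64, [base+disp8]' and compare it with CR2.

    Hypothetical helper: it only understands REX.W + 0x8b + ModRM(mod=01) + disp8
    and raises ValueError for any other instruction.
    """
    # The first "Code:" line in the oops is the kernel one; <xx> marks the faulting byte.
    code = re.search(r"Code: (.+)", oops_text).group(1).split()
    start = next(i for i, tok in enumerate(code) if tok.startswith("<"))
    rex, opcode, modrm, disp = (int(tok.strip("<>"), 16)
                                for tok in code[start:start + 4])
    if not (0x48 <= rex <= 0x4F and opcode == 0x8B
            and modrm >> 6 == 1 and modrm & 7 != 4):
        raise ValueError("only 'mov r64, [base+disp8]' is handled in this sketch")
    dest = REGS[((modrm >> 3) & 7) | ((rex & 4) << 1)]
    base = REGS[(modrm & 7) | ((rex & 1) << 3)]
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    base_val = int(re.search(rf"\b{base}: ([0-9a-f]{{16}})", oops_text).group(1), 16)
    cr2 = int(re.search(r"\bCR2: ([0-9a-f]{16})", oops_text).group(1), 16)
    return {
        "dest": dest,
        "base": base,
        "base_value": base_val,
        "disp": disp,
        "cr2": cr2,
        "matches_cr2": (base_val + disp) & 0xFFFFFFFFFFFFFFFF == cr2,
    }
```

Feeding it the text of the oops above (for example saved from `dmesg`) should be enough; the timestamps in front of each line do not need to be stripped.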

The current version of my kernel is: Linux 6.6.21-gentoo #1 SMP PREEMPT_DYNAMIC Sat Mar 16 17:04:48 CET 2024 x86_64 Intel(R) Pentium(R) Gold G5400 CPU @ 3.70GHz GenuineIntel GNU/Linux

NFS export:
/exports/media	-sync,no_subtree_check,mp,all_squash,ro,fsid=1 *.my.local.network xx.xx.0.0/16(insecure)

fstab:
overlay						/exports/media		overlay	lowerdir=/var/media/Internes:/var/media/Externes,nfs_export=on,redirect_dir=nofollow 0 0

I used to have index=on, but dmesg reported it as useless, so I removed it. Removing it did not change the problem.

Please tell me if I should report the bug upstream instead.
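
The call trace shows rpc.mountd reaching the crash through the name_to_handle_at system call (with RDI = 0xffffff9c, i.e. AT_FDCWD), so the same kernel path can probably be exercised without an NFS client. The following is a minimal sketch under that assumption, calling the libc wrapper through ctypes; the helper name `encode_handle` is hypothetical. On an affected kernel, running it against the overlay mount point would likely trigger the same oops, so it should only be tried on a machine where that is acceptable.

```python
import ctypes
import os

MAX_HANDLE_SZ = 128  # maximum handle size, as defined in <fcntl.h>
AT_FDCWD = -100      # "relative to current directory", 0xffffff9c as a 32-bit value


class FileHandle(ctypes.Structure):
    """struct file_handle, with room for the largest possible handle."""
    _fields_ = [
        ("handle_bytes", ctypes.c_uint),
        ("handle_type", ctypes.c_int),
        ("f_handle", ctypes.c_ubyte * MAX_HANDLE_SZ),
    ]


def encode_handle(path, flags=0):
    """Ask the kernel for a file handle for `path` (hypothetical helper).

    Returns (handle_type, handle_bytes, mount_id) or raises OSError.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    handle = FileHandle()
    handle.handle_bytes = MAX_HANDLE_SZ  # tell the kernel how much space we have
    mount_id = ctypes.c_int()
    rc = libc.name_to_handle_at(AT_FDCWD, os.fsencode(path),
                                ctypes.byref(handle), ctypes.byref(mount_id), flags)
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    return (handle.handle_type,
            bytes(handle.f_handle[:handle.handle_bytes]),
            mount_id.value)
```

For example, `encode_handle("/exports/media")` targets the overlay mount from the fstab line above. A filesystem that does not support file handles should make the call fail with an OSError rather than crash.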
Comment 1 Mike Pagano gentoo-dev 2024-04-02 21:46:42 UTC
Is this vanilla, gentoo-sources, or something else?

Last working kernel?

Have you tried the latest 6.8.X kernel (6.8.2 as of this writing) to see if something was fixed?
Comment 2 Stéphane Veyret 2024-04-06 09:46:45 UTC
Hi,

Sorry for the lack of information.

This is gentoo-sources, v6.6.21 (but I am pretty sure that the problem appeared before this version).
I haven’t tried v6.8 yet, but I will now and report the result.
Comment 3 Stéphane Veyret 2024-04-06 10:57:57 UTC
Just tested with « Linux 6.8.4-gentoo » and the problem is the same.
Comment 4 Mike Pagano gentoo-dev 2024-04-07 13:41:09 UTC
(In reply to Stéphane Veyret from comment #3)
> Just tested with « Linux 6.8.4-gentoo » and the problem is the same.

What was the last working kernel?
Comment 5 Stéphane Veyret 2024-04-07 14:06:19 UTC
Unfortunately, as I said, I didn’t notice the problem right away, so I cannot say for sure when it broke. What I can say is that I update my computer once a week with the latest stable gentoo-sources kernel, and it was still working at the end of December. Sorry for not being more precise.
Comment 6 Mike Pagano gentoo-dev 2024-05-08 17:20:39 UTC
Is this still an issue with later kernels?
Comment 7 Stéphane Veyret 2024-05-19 10:10:27 UTC
No better luck with gentoo-sources-6.9.1:

[dim. 19 mai 12:07:27 2024] BUG: kernel NULL pointer dereference, address: 0000000000000070
[dim. 19 mai 12:07:27 2024] #PF: supervisor read access in kernel mode
[dim. 19 mai 12:07:27 2024] #PF: error_code(0x0000) - not-present page
[dim. 19 mai 12:07:27 2024] PGD 0 P4D 0 
[dim. 19 mai 12:07:27 2024] Oops: 0000 [#1] PREEMPT SMP PTI
[dim. 19 mai 12:07:27 2024] CPU: 3 PID: 3236 Comm: rpc.mountd Not tainted 6.9.1-gentoo #1
[dim. 19 mai 12:07:27 2024] Hardware name: System manufacturer System Product Name/PRIME B360M-K, BIOS 2811 05/27/2020
[dim. 19 mai 12:07:27 2024] RIP: 0010:ovl_encode_real_fh+0x36/0x120
[dim. 19 mai 12:07:27 2024] Code: 55 49 89 fd 41 54 41 89 d4 ba 98 00 00 00 53 48 83 ec 10 48 8b 3d ca 27 f0 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <4c> 8b 76 70 be c0 0d 00 00 e8 cc f3 d9 ff 48 85 c0 0f 84 bb 00 00
[dim. 19 mai 12:07:27 2024] RSP: 0018:ffffb46fc14abde8 EFLAGS: 00010246
[dim. 19 mai 12:07:27 2024] RAX: 0000000000000000 RBX: ffffb46fc14abeb4 RCX: 0000000000000000
[dim. 19 mai 12:07:27 2024] RDX: 0000000000000098 RSI: 0000000000000000 RDI: ffffa0fe80042800
[dim. 19 mai 12:07:27 2024] RBP: ffffb46fc14abe20 R08: 0000000000000000 R09: 0000000000000000
[dim. 19 mai 12:07:27 2024] R10: ffffb46fc14abeb0 R11: 0000000000000002 R12: 0000000000000001
[dim. 19 mai 12:07:27 2024] R13: ffffa0fe87a54600 R14: ffffa0fe87a54600 R15: 0000000000000000
[dim. 19 mai 12:07:27 2024] FS:  00007f92d28f47c0(0000) GS:ffffa101ddd80000(0000) knlGS:0000000000000000
[dim. 19 mai 12:07:27 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[dim. 19 mai 12:07:27 2024] CR2: 0000000000000070 CR3: 000000021326c002 CR4: 00000000003706f0
[dim. 19 mai 12:07:27 2024] Call Trace:
[dim. 19 mai 12:07:27 2024]  <TASK>
[dim. 19 mai 12:07:27 2024]  ? __die+0x1a/0x60
[dim. 19 mai 12:07:27 2024]  ? page_fault_oops+0x157/0x450
[dim. 19 mai 12:07:27 2024]  ? generic_permission+0x30/0x220
[dim. 19 mai 12:07:27 2024]  ? inode_permission+0xd2/0x180
[dim. 19 mai 12:07:27 2024]  ? generic_permission+0x30/0x220
[dim. 19 mai 12:07:27 2024]  ? exc_page_fault+0x3de/0x6a0
[dim. 19 mai 12:07:27 2024]  ? asm_exc_page_fault+0x22/0x30
[dim. 19 mai 12:07:27 2024]  ? ovl_encode_real_fh+0x36/0x120
[dim. 19 mai 12:07:27 2024]  ? preempt_count_add+0x64/0xa0
[dim. 19 mai 12:07:27 2024]  ovl_encode_fh+0x25d/0x400
[dim. 19 mai 12:07:27 2024]  ? __kmalloc+0x157/0x3b0
[dim. 19 mai 12:07:27 2024]  exportfs_encode_fh+0x2b/0x70
[dim. 19 mai 12:07:27 2024]  do_sys_name_to_handle+0xbc/0x1f0
[dim. 19 mai 12:07:27 2024]  __x64_sys_name_to_handle_at+0x98/0xc0
[dim. 19 mai 12:07:27 2024]  do_syscall_64+0x48/0x110
[dim. 19 mai 12:07:27 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[dim. 19 mai 12:07:27 2024] RIP: 0033:0x7f92d2b761fe
[dim. 19 mai 12:07:27 2024] Code: 48 8b 0d 1d cc 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 2f 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ea cb 0c 00 f7 d8 64 89 01 48
[dim. 19 mai 12:07:27 2024] RSP: 002b:00007fff1c241938 EFLAGS: 00000206 ORIG_RAX: 000000000000012f
[dim. 19 mai 12:07:27 2024] RAX: ffffffffffffffda RBX: 000055fcd3c11f38 RCX: 00007f92d2b761fe
[dim. 19 mai 12:07:27 2024] RDX: 00007fff1c2419e0 RSI: 000055fcd3c11f38 RDI: 00000000ffffff9c
[dim. 19 mai 12:07:27 2024] RBP: 000055fcd3c11f38 R08: 0000000000000000 R09: 0000000000000001
[dim. 19 mai 12:07:27 2024] R10: 00007fff1c24194c R11: 0000000000000206 R12: 000000000000000a
[dim. 19 mai 12:07:27 2024] R13: 000055fcd3c11f20 R14: 0000000000000001 R15: 0000000000000000
[dim. 19 mai 12:07:27 2024]  </TASK>
[dim. 19 mai 12:07:27 2024] CR2: 0000000000000070
[dim. 19 mai 12:07:27 2024] ---[ end trace 0000000000000000 ]---
[dim. 19 mai 12:07:27 2024] RIP: 0010:ovl_encode_real_fh+0x36/0x120
Comment 8 Mike Pagano gentoo-dev 2024-06-14 16:44:07 UTC
Can you reproduce this with some debug settings enabled in your kernel?


CONFIG_DEBUG_KERNEL
CONFIG_DEBUG_INFO
CONFIG_KALLSYMS

And then post the error?
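
As a convenience, here is a small Python sketch for checking whether a kernel configuration already enables the three options listed above. The helper names are made up, and the file locations are assumptions: /proc/config.gz only exists when the kernel was built with CONFIG_IKCONFIG_PROC, and /boot/config-<release> only exists if the config was installed there.

```python
import gzip
import os
import platform

WANTED = ("CONFIG_DEBUG_KERNEL", "CONFIG_DEBUG_INFO", "CONFIG_KALLSYMS")


def missing_options(config_text, wanted=WANTED):
    """Return the options from `wanted` that are not set to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in wanted if opt not in enabled]


def read_running_config():
    """Read the running kernel's config from common locations, or return None."""
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as fh:
            return fh.read()
    boot_config = f"/boot/config-{platform.release()}"
    if os.path.exists(boot_config):
        with open(boot_config) as fh:
            return fh.read()
    return None


if __name__ == "__main__":
    text = read_running_config()
    if text is None:
        print("no kernel config found in the assumed locations")
    else:
        print("missing:", missing_options(text) or "none")
```

Any option reported as missing would need to be enabled (for example through `make menuconfig`) and the kernel rebuilt before reproducing the error.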
Comment 9 Stéphane Veyret 2024-07-01 05:21:31 UTC
Hello,
I wanted to try your suggestion with the kernel from gentoo-sources-6.9.7, but the error no longer occurs with this version. The bug seems to be fixed.
Thank you for your help.