Summary: | sys-kernel/gentoo-sources-4.14.114-r1: bcache.cached_dev_detach_finish : invalid opcode: 0000 [#1] SMP PTI, 'BUG_ON(atomic_read(&dc->count));' | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | grzybowskik |
Component: | Current packages | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
Status: | RESOLVED NEEDINFO | ||
Severity: | major | ||
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: |
detailed_configuration_and_traceback
bcache patch bundle |
Created attachment 576770 [details]
bcache patch bundle
This bundle contains 15 patches that fix bcache in kernels based on versions 4.14.114/115/116/117/118/119.
├── 1001.bcache_avoid_nested_58f913dce281.patch
├── 1002.bcache_fix_comment_type_b1e8139e48b5.patch
├── 1003.bcache_rewrite_multipart_1dbe32ad0a82.patch
├── 1007.bcache_dont_writeback_failed_io_5fa89fb9a86b.patch
├── 1015.bcache_convert_cacheed_dev_atomi_to_refcount_3b304d24a718.patch
├── 1016.bcache_update_bucket_real_time_d44c2f9e7cc0.patch
├── 1019.bcache_4.15_backport_writeback.patch
├── 1020.bcache_convert_timers_8376d3c1f989.patch
├── 1021.bcache_journal_comment_bb22cafd7568.patch
├── 1025.bcache_use_PTR_ERR_OR_ZERO_9d13411784e2.patch
├── 1028.bcache_fix_wrong_return_bch_debug_init_539d39eb2708.patch
├── 1031.bcache_improve_efficiency_closure_e4bf791937d8.patch
├── 1033.bcache_reduce_cache_set_device_iteration_2831231d4c3f.patch
└── 1036.bcache_closure_move_control_bits_3609c471a1b8.patch
The patches were created by back-porting commits from the Linux kernel git tree.
Current state of bcache in kernels 4.14.114/115/116/117/118/119:
Creating backing and cache devices and attaching them works, but detaching does not work at all.
I/O errors are not detected, which might cause unexpected problems.
Detach functionality can be fixed by applying patch 1019.bcache_4.15_backport_writeback.patch.
After that we can detach, but first we need to write none into cache_mode:
echo none > /sys/block/bcacheX/bcache/cache_mode
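The detach sequence described here could be scripted roughly as follows. This is only a minimal sketch: the function name `bcache_detach` and the `SYSFS_ROOT` override are hypothetical (the override exists only so the logic can be tried against a scratch directory), and writing `1` to `detach` assumes the kernel's documented behaviour that any write to that file requests a detach.

```shell
#!/bin/sh
# Sketch: switch a bcache device to cache mode "none", then request a detach.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

bcache_detach() {
    dev="$1"                                   # e.g. bcache0
    dir="${SYSFS_ROOT:-/sys}/block/$dev/bcache"

    if [ ! -d "$dir" ]; then
        echo "no bcache directory for $dev: $dir" >&2
        return 1
    fi

    # Stop caching new I/O before detaching.
    echo none > "$dir/cache_mode" || return 1

    # Any write to "detach" asks the kernel to detach from the cache set.
    echo 1 > "$dir/detach" || return 1
}
```

On a real system this would be run as root, for example `bcache_detach bcache0`.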
Next we need to fix I/O error detection; otherwise, if we attempt to detach while there are active I/O operations on the caching device, we end up with a broken symlink in /sys/block/bcacheX/bcache.
So to fix bcache sensibly in kernel 4.14.X we need to apply all of the above patches.
The above patches were tested on kernel 4.14.115.
After applying them we can create backing and cache devices, attach, detach, disable, and partition with the fixed implementation. I/O error detection works, and the cache device is detached with the following messages in dmesg:
"""
bcache: bch_count_io_errors() sdc: IO error on writing data to cache, recovering
bcache: error on ed80d9c1-5fc2-4390-bd9e-f633dd6a3a4f:
journal io error
, disabling caching
bcache: cached_dev_detach_finish() Caching disabled for dm-1
bcache: bch_count_io_errors() sdc: IO error on writing btree, recovering
bcache: cache_set_free() Cache set ed80d9c1-5fc2-4390-bd9e-f633dd6a3a4f unregistered
"""
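To look for the same sequence on another machine, the relevant kernel messages can be filtered out of the log. A minimal sketch; the function name `bcache_detach_events` is hypothetical, and the pattern only covers the three bcache functions that appear in the output above.

```shell
#!/bin/sh
# Sketch: keep only the bcache messages from the three functions seen above.
# Reads a kernel log on stdin, so it works with dmesg output or a saved file.

bcache_detach_events() {
    grep -E 'bcache: (bch_count_io_errors|cached_dev_detach_finish|cache_set_free)\(\)'
}
```

Typical use would be `dmesg | bcache_detach_events`.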
Tested performance with file copy and benchmarked with the FIO tool.
All of these patches except for the 4.15 backport are upstream and in current supported kernels. Do you experience this issue anymore?
1001.bcache_avoid_nested_58f913dce281.patch
    Commit: 58f913dce2814a9ea7260e93ed3a949e0d5565e3
    Date: 2017-10-16 09:07:26 -0600
1002.bcache_fix_comment_type_b1e8139e48b5.patch
    Commit: b1e8139e48b58e3bc1234e619c750ffd1394be2f
    Date: 2017-10-16 09:07:26 -0600
1003.bcache_rewrite_multipart_1dbe32ad0a82.patch
    Commit: 1dbe32ad0a82f39c6dfb7667c5da5c23b9333664
    Date: 2017-10-16 09:07:26 -0600
1007.bcache_dont_writeback_failed_io_5fa89fb9a86b.patch
    Commit: 5fa89fb9a86bcc0f0b3f21ab6087a8a4170dcd2c
    Date: 2017-10-16 09:07:26 -0600
1015.bcache_convert_cacheed_dev_atomi_to_refcount_3b304d24a718.patch
    Commit: 3b304d24a718ae779ee9c7f2014dd3b2d0893b70
    Date: 2017-10-30 15:57:54 -0600
1016.bcache_update_bucket_real_time_d44c2f9e7cc0.patch
    Commit: d44c2f9e7cc0041f0cd88df1fe7a1fceb713ab14
    Date: 2017-10-30 15:57:54 -0600
1019.bcache_4.15_backport_writeback.patch
1020.bcache_convert_timers_8376d3c1f989.patch
    Commit: 8376d3c1f98988ae7f9e9bc2d1eeeb7d61fd206c
    Date: 2017-11-14 20:11:57 -0700
1021.bcache_journal_comment_bb22cafd7568.patch
    Commit: bb22cafd75686d799dabfe422571fac4b5c2ed94
    Date: 2017-11-24 16:22:55 -0700
1025.bcache_use_PTR_ERR_OR_ZERO_9d13411784e2.patch
    Commit: 9d13411784e27227162857df25ab6817a1db2a73
    Date: 2018-01-08 13:29:00 -0700
1028.bcache_fix_wrong_return_bch_debug_init_539d39eb2708.patch
    Commit: 539d39eb27083405b82b9e604e88af01a9a46c63
    Date: 2018-01-08 13:29:00 -0700
1031.bcache_improve_efficiency_closure_e4bf791937d8.patch
    Commit: e4bf791937d82afca79e1df4063f72dbc6960ac7
    Date: 2018-01-08 13:29:00 -0700
1033.bcache_reduce_cache_set_device_iteration_2831231d4c3f.patch
    Commit: 2831231d4c3f999d2d062b23dfbc8b0faa4bc6e0
    Date: 2018-01-08 13:29:00 -0700
1036.bcache_closure_move_control_bits_3609c471a1b8.patch
    Commit: 3609c471a1b86bffc812d8a2f0299892aa11a5e6
    Date: 2018-01-09 12:18:51 -0700
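Whether a given upstream commit is contained in a particular tag or branch can be checked in a clone of the kernel git tree. The following is only a sketch: the function name `commit_in_ref` and the example tree path are hypothetical. Note that stable branches carry backports under different commit hashes, so a negative result on a stable branch does not by itself mean the fix is missing there.

```shell
#!/bin/sh
# Sketch: succeed if COMMIT is an ancestor of (i.e. contained in) REF
# inside the git checkout at TREE.

commit_in_ref() {
    tree="$1"      # path to a kernel git checkout
    commit="$2"    # full or abbreviated commit hash
    ref="$3"       # tag or branch to test against
    git -C "$tree" merge-base --is-ancestor "$commit" "$ref"
}
```

For example, `commit_in_ref /path/to/linux 58f913dce2814a9ea7260e93ed3a949e0d5565e3 v4.15 && echo contained`, where `/path/to/linux` stands in for a local clone.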
Created attachment 575462 [details]
detailed_configuration_and_traceback
This kernel module BUG is triggered on manual cached dev detach.
Tested kernels 4.14.101 - 4.14.114
To reproduce, create a bcache backing device and a caching device, attach, and attempt to detach:
make-bcache -B /dev/sda
make-bcache -C /dev/sdb
echo cdev.uuid > /sys/block/bcache0/bcache/attach
echo cdev.uuid > /sys/block/bcache0/bcache/detach
To improve the logs, I enabled dynamic_debug by writing
echo 'options bcache dyndbg=+pt' > /etc/modprobe.d/bcache.conf
The following BUG then appears in dmesg:
"""
kernel: [10279] bcache: __write_super() ver 1, flags 2305843009213693952, seq 0
kernel: ------------[ cut here ]------------
kernel: Kernel BUG at ffffffffa0784f25 [verbose debug info unavailable]
kernel: invalid opcode: 0000 [#1] SMP PTI
kernel: Modules linked in: drbd lru_cache iptable_filter ip_tables nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc mlx4_ib ib_core mlx4_en intel_rapl x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass intel_cstate iTCO_wdt iTCO_vendor_support mlx4_core igb input_leds bcache devlink intel_rapl_perf pcspkr led_class i2c_i801 ptp lpc_ich joydev ioatdma pps_core ipmi_si dca ipmi_devintf rtc_cmos ipmi_msghandler pcc_cpufreq acpi_pad sch_fq_codel btrfs xor zstd_decompress zstd_compress xxhash lzo_compress raid6_pq dm_crypt usbhid mxm_wmi crc32c_intel arcmsr ahci ast xhci_pci libahci ehci_pci ttm ehci_hcd xhci_hcd libata wmi button dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 27 PID: 295 Comm: kworker/27:1 Not tainted 4.14.114-gentoo-r1-201905 #1
kernel: Hardware name: Supermicro X10DRH LN4/X10DRH-ILN4, BIOS 2.0 01/30/2016
kernel: Workqueue: events cached_dev_detach_finish [bcache]
kernel: task: ffff88885cf84440 task.stack: ffffc90008a48000
kernel: RIP: 0010:cached_dev_detach_finish+0x55/0x1b0 [bcache]
kernel: RSP: 0018:ffffc90008a4bdf8 EFLAGS: 00010286
kernel: RAX: 00000000ffffffff RBX: ffff88905c200af0 RCX: 0000000000000000
kernel: RDX: 0000000000000001 RSI: ffff88905c200af8 RDI: ffffc90008a4be48
kernel: RBP: ffff88905c200000 R08: 0000000000000001 R09: 0000000000000001
kernel: R10: ffffc90008a4be30 R11: 00273bdd233be0d4 R12: ffffc90008a4bdf8
kernel: R13: 0000000000000000 R14: ffff88885fc60880 R15: 0000000000000000
kernel: FS: 0000000000000000(0000) GS:ffff88885fc40000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 000055f9fae819f8 CR3: 000000000220a004 CR4: 00000000003606e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel: process_one_work+0x1cf/0x370
kernel: worker_thread+0x208/0x380
kernel: ? process_one_work+0x370/0x370
kernel: kthread+0x11d/0x130
kernel: ? kthread_destroy_worker+0x40/0x40
kernel: ret_from_fork+0x35/0x40
kernel: Code: 89 44 24 70 31 c0 49 89 e4 4c 89 e7 f3 48 ab c7 44 24 28 01 00 00 a0 48 8b 83 d0 f5 ff ff a8 02 75 02 0f 0b 8b 43 f8 85 c0 74 02 <0f> 0b 48 c7 c7 00 1d 7a a0 e8 7d 37 0f e1 48 8d 7b 68 e8 54 91
kernel: RIP: cached_dev_detach_finish+0x55/0x1b0 [bcache] RSP: ffffc90008a4bdf8
kernel: ---[ end trace e213d6c6632e8c72 ]---
"""
The attachment includes the kernel config, traceback, GDB_disassemble, and emerge --info.