Bug 723974 - sys-kernel/gentoo-sources-5.6.8 crash in pm80xx kernel module
Summary: sys-kernel/gentoo-sources-5.6.8 crash in pm80xx kernel module
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo Linux bug wranglers
Reported: 2020-05-19 14:15 UTC by Tomas Thiemel
Modified: 2020-05-22 20:25 UTC
Description Tomas Thiemel 2020-05-19 14:15:27 UTC
- Not sure whether this is a hardware or a driver (kernel) problem.
- It happens under high disk activity (btrfs disk resize/balance).

Reproducible: Sometimes

Steps to Reproduce:
1. Hardware: Serial Attached SCSI controller: Adaptec PMC-Sierra PM8018 SAS HBA [Series 7H] (rev 06) with attached drives
2. Use a btrfs filesystem on a disk with some bad physical sectors.
3. Run "btrfs balance", or move data from one physical disk to another by shrinking the old disk to 0 in 1 GB steps.
Actual Results:  
Kernel panic:

[  +0.000004] BTRFS info (device dm-3): new size for /dev/mapper/vg3i-DATA4tmp is 843961073664
[ +15.566405] pm80xx0:: mpi_sata_event  2646:SATA EVENT 0x23
[  +0.000005] pm80xx0:: pm80xx_send_read_log  1737:Executing read log end
[  +0.000248] pm80xx0:: mpi_sata_event  2646:SATA EVENT 0x26
[  +0.000001] pm80xx0:: mpi_sata_event  2660:task or dev null
[  +0.000002] pm80xx0:: pm80xx_send_abort_all  1659:Executing abort task end

[159001.384204] BUG: kernel NULL pointer dereference, address: 0000000000000010
[159001.393016] #PF: supervisor read access in kernel mode
[159001.393017] #PF: error_code(0x0000) - not-present page
[159001.393017] PGD 0 P4D 0
[159001.393019] Oops: 0000 [#1] SMP
[159001.393021] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G       AW         5.6.8-gentoo-xeon #2
[159001.393022] Hardware name: MSI MS-7759/Z77MA-G45 (MS-7759), BIOS V1.9 03/01/2013
[159001.393025] RIP: 0010:intel_unmap_sg+0x16/0xb0
[159001.393027] Code: 5b 48 89 ef 5d 41 5c 4c 8b 00 41 5d 41 5e e9 c1 e4 ff ff 90 41 57 41 89 cf 41 56 41 55 49 89 fd 41 54 41 89 d4 55 48 89 f5 53 <4c> 8b 76 10 4c 89 c3 e8 5e fc ff ff 49 81 e6 00 f0 ff ff 84 c0 74
[159001.393027] RSP: 0018:ffffc900000dcde0 EFLAGS: 00010046
[159001.393028] RAX: 0000000000000000 RBX: ffff8887ec602100 RCX: 0000000000000000
[159001.393029] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f9ca50b0
[159001.393029] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[159001.393030] R10: ffffffff815da020 R11: 000000000001d604 R12: 0000000000000000
[159001.393030] R13: ffff8887f9ca50b0 R14: 0000000000000004 R15: 0000000000000000
[159001.393031] FS:  0000000000000000(0000) GS:ffff8887fe600000(0000) knlGS:0000000000000000
[159001.393032] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[159001.393033] CR2: 0000000000000010 CR3: 000000043acdb003 CR4: 00000000001626e0
[159001.393033] Call Trace:
[159001.393035]  <IRQ>
[159001.393040]  pm8001_ccb_task_free+0x1a0/0x1c0 [pm80xx]
[159001.393044]  pm8001_mpi_task_abort_resp+0xa7/0x120 [pm80xx]
[159001.393047]  process_oq+0x549/0x1be0 [pm80xx]
[159001.393051]  pm80xx_chip_isr+0x3e/0x80 [pm80xx]
[159001.393054]  tasklet_action_common.isra.0+0x42/0x90
[159001.393057]  __do_softirq+0xc8/0x206
[159001.393060]  irq_exit+0x9b/0xa0
[159001.581734]  do_IRQ+0x4c/0xd0
[159001.581737]  common_interrupt+0xf/0xf
[159001.581739]  </IRQ>
[159001.581741] RIP: 0010:cpuidle_enter_state+0x114/0x210
[159001.581743] Code: e8 c1 77 98 ff 31 ff 48 89 c5 e8 f7 81 98 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 e5 00 00 00 31 ff e8 90 9d 9d ff fb 45 85 ed <0f> 88 ad 00 00 00 49 63 d5 48 89 e8 48 2b 04 24 48 6b ca 68 48 8d
[159001.581744] RSP: 0018:ffffc90000077e78 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdc
[159001.581745] RAX: ffff8887fe7e3c00 RBX: ffffffff82280ee0 RCX: 000000000000001f
[159001.581746] RDX: 0000000000000000 RSI: 000000003350fd4c RDI: 0000000000000000
[159001.581747] RBP: 0000909c65042122 R08: 0000909c65042122 R09: 0000000000000d5a
[159001.581747] R10: ffff8887fe7e28c4 R11: ffff8887fe7e28a4 R12: ffffe8ffff000500
[159001.581748] R13: 0000000000000002 R14: 0000000000000000 R15: ffffffff82280fc8
[159001.581751]  ? cpuidle_enter_state+0xf9/0x210
[159001.581753]  cpuidle_enter+0x24/0x40
[159001.581755]  do_idle+0x1bb/0x210
[159001.581757]  cpu_startup_entry+0x14/0x20
[159001.581760]  start_secondary+0x138/0x160
[159001.581762]  secondary_startup_64+0xa4/0xb0
[159001.581765] Modules linked in: vhost_net vhost tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables tun nfsd auth_rpcgss oid_registry lockd grace sunrpc nls_iso8859_1 vfat fat binfmt_misc x86_pkg_temp_thermal kvm_intel kvm dummy crct10dif_pclmul f71882fg crc32_pclmul crc32c_intel coretemp ghash_clmulni_intel iTCO_wdt iTCO_vendor_support aesni_intel crypto_simd i2c_i801 cryptd i2c_core glue_helper ehci_pci xhci_pci ehci_hcd r8169 xhci_hcd realtek video pm80xx usbcore evdev libphy backlight mei_me ie31200_edac thermal fan lpc_ich usb_common mei mfd_core
[159001.772339] CR2: 0000000000000010
[159001.772352] ---[ end trace b6d7ee3b66aa4c6a ]---
[159001.772357] RIP: 0010:intel_unmap_sg+0x16/0xb0
[159001.772358] Code: 5b 48 89 ef 5d 41 5c 4c 8b 00 41 5d 41 5e e9 c1 e4 ff ff 90 41 57 41 89 cf 41 56 41 55 49 89 fd 41 54 41 89 d4 55 48 89 f5 53 <4c> 8b 76 10 4c 89 c3 e8 5e fc ff ff 49 81 e6 00 f0 ff ff 84 c0 74
[159001.772359] RSP: 0018:ffffc900000dcde0 EFLAGS: 00010046
[159001.772361] RAX: 0000000000000000 RBX: ffff8887ec602100 RCX: 0000000000000000
[159001.772361] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f9ca50b0
[159001.772362] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[159001.772362] R10: ffffffff815da020 R11: 000000000001d604 R12: 0000000000000000
[159001.772363] R13: ffff8887f9ca50b0 R14: 0000000000000004 R15: 0000000000000000
[159001.772364] FS:  0000000000000000(0000) GS:ffff8887fe600000(0000) knlGS:0000000000000000
[159001.772365] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[159001.772366] CR2: 0000000000000010 CR3: 000000043acdb003 CR4: 00000000001626e0
[159001.880183] Kernel panic - not syncing: Fatal exception in interrupt
[159001.887631] Kernel Offset: disabled
[159001.892206] Rebooting in 15 seconds..
[159016.995069] ACPI MEMORY or I/O RESET_REG.


Expected Results:  
no kernel panic

echo "### add new disks"
btrfs dev add /dev/sdi1 /dev/sdj1 /DATA/BACKUP

echo "### shrink old disks to zero -> move data to new disks by 1GB step; supports ^C to terminate the shrink"
for i in {2770..0}; do
	echo -ne "$i "
	btrfs fi resize 5:${i}G /DATA/BACKUP/ || break
	btrfs fi resize 6:${i}G /DATA/BACKUP/ || break
	sleep 0.01
done
done
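
A note on reading the trace above, offered as a sketch rather than a diagnosis. The faulting instruction bytes marked in the Code line, `<4c> 8b 76 10`, decode to `mov 0x10(%rsi),%r14`. RSI is 0 in the register dump and CR2 (the faulting address) is 0x10, which is consistent with a NULL pointer in RSI being dereferenced at offset 0x10 inside intel_unmap_sg. RSI carries the second function argument in the x86-64 calling convention; that this argument is the scatterlist handed down from pm8001_ccb_task_free is an assumption, not something the log states. The helper name `reg_value` and the `oops` variable below are hypothetical, and the two log lines are copied from the report.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the fault address (CR2) is RSI plus a small offset,
# the usual signature of a field access through a NULL base pointer.

# Two lines copied from the oops in this report.
oops='[159001.393029] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8887f9ca50b0
[159001.393033] CR2: 0000000000000010 CR3: 000000043acdb003 CR4: 00000000001626e0'

# reg_value NAME: read oops text on stdin and print the 16-digit hex value
# of the first occurrence of register NAME (hypothetical helper).
reg_value() {
    grep -o "$1: [0-9a-f]\{16\}" | head -n 1 | cut -d' ' -f2
}

rsi=$(printf '%s\n' "$oops" | reg_value RSI)
cr2=$(printf '%s\n' "$oops" | reg_value CR2)

# Bash arithmetic accepts the 0x prefix for hexadecimal literals.
offset=$(( 0x$cr2 - 0x$rsi ))

printf 'RSI=0x%s CR2=0x%s offset=0x%x\n' "$rsi" "$cr2" "$offset"
```

If the offset matches the displacement in the decoded instruction (0x10 here), the crash is a plain NULL dereference in that function rather than memory corruption at a random address, which may help when reporting the problem to the driver maintainers.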
Comment 1 Jonas Stein gentoo-dev 2020-05-22 20:25:09 UTC
I am sorry to read that you are having problems with the hardware or kernel. The situation seems to be somewhat complicated and requires some analysis.
We cannot help you efficiently via the bug tracker. The bug tracker is aimed at specific problems in ebuilds rather than at individual systems.

I have had very good experiences on the Gentoo IRC channels [1] with questions like this. There are also the forums and mailing lists [2,3].
I hope you understand that I will therefore close the bug here, and I wish you good luck on one of the channels mentioned [4].
Please reopen the ticket if you can point to a specific error in an ebuild or any Gentoo-related product.

[1] https://www.gentoo.org/get-involved/irc-channels/
[2] https://forums.gentoo.org/
[3] https://www.gentoo.org/get-involved/mailing-lists/all-lists.html
[4] https://www.gentoo.org/support/