Bug 930464 - sys-kernel/gentoo-sources-6.7.9: mkfs on luks USB flash drive blocks forever with "reset SuperSpeed USB device"
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: https://lore.kernel.org/cryptsetup/RA...
Reported: 2024-04-22 23:23 UTC by Daniel Pouzzner
Modified: 2024-06-15 09:02 UTC (History)
CC List: 2 users

Attachments
kernel config for 6.7.9-gentoo (linux-6.7.9-gentoo.config.gz, 40.89 KB, application/gzip)
2024-04-23 22:51 UTC, Daniel Pouzzner

Description Daniel Pouzzner 2024-04-22 23:23:28 UTC
I believe this is an upstream kernel problem but I'll get the ball rolling here.

I'm finding that on kernel 6.7.9, with a freshly created and opened luks partition, mkfs goes into D wait, with "reset SuperSpeed USB device number # using xhci_hcd" repeating every 30 seconds on each device.

The behavior occurs with either mkfs.btrfs or mkfs.ext4, and with either luks1 or luks2.

This is new behavior.  I've used the same procedure, same hardware, and same/similar media, dozens of times before without seeing this.

The hardware itself is a pair of Kingston FCR-HS4 flash readers, and the behavior occurs on both.  The media are 512 GB microSDXC cards.  I've tried it with fresh new Lexar and SanDisk media, both SKUs that have worked before.  Same bad result.

If I run mkfs.btrfs directly on the GPT partition, that succeeds.

If I luksOpen and mount previously formatted and mkfs.btrfs'd media, and access it extensively for an incremental backup run, that succeeds.

So what is specifically failing is mkfs on a luks partition.

The kernel log provides no information, just entries like this:

[Mon Apr 22 18:08:59 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:09:04 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
[Mon Apr 22 18:09:29 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:09:36 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
[Mon Apr 22 18:10:06 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:10:14 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
[Mon Apr 22 18:10:44 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:10:52 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
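
To check how regular the resets are, the timestamps in the excerpt above can be grouped by USB port and the gaps between consecutive resets computed. The following is a minimal sketch, not part of the original report; the function name reset_intervals and the regular expression are illustrative, and it assumes `dmesg -T` style timestamps in an English locale.

```python
import re
from collections import defaultdict
from datetime import datetime

# Lines copied verbatim from the kernel log excerpt in the report.
LOG = """\
[Mon Apr 22 18:08:59 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:09:04 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
[Mon Apr 22 18:09:29 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:09:36 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
[Mon Apr 22 18:10:06 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:10:14 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
[Mon Apr 22 18:10:44 2024] usb 2-2.3: reset SuperSpeed USB device number 12 using xhci_hcd
[Mon Apr 22 18:10:52 2024] usb 4-3: reset SuperSpeed USB device number 10 using xhci_hcd
"""

# Illustrative pattern: captures the bracketed timestamp and the USB port path.
RESET_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] usb (?P<port>\S+): reset SuperSpeed USB device"
)


def reset_intervals(log_text):
    """Return {port: [seconds between consecutive resets on that port]}."""
    stamps = defaultdict(list)
    for line in log_text.splitlines():
        m = RESET_RE.match(line)
        if m:
            when = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
            stamps[m["port"]].append(when)
    return {
        port: [int((b - a).total_seconds()) for a, b in zip(ts, ts[1:])]
        for port, ts in stamps.items()
    }


if __name__ == "__main__":
    for port, gaps in reset_intervals(LOG).items():
        print(port, gaps)
```

For this excerpt the gaps come out between 30 and 38 seconds on both ports, so the resets are periodic but not on an exact 30-second cadence.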


When the problem first occurred, I discovered I was able to free up the stuck mkfs process by pulling the cords on the Kingston readers, and then free up the luks contexts with "cryptsetup close".  That's handy, but I'd rather have it, you know, work.
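
One way to confirm that a stuck mkfs really is in uninterruptible sleep (the "D wait" described above) is to read the state field from /proc/<pid>/stat. The sketch below is illustrative only and was not part of the original report; the helper names parse_stat and uninterruptible_processes are hypothetical, and it assumes a Linux /proc filesystem.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = stat_line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    return int(pid_str), comm, tail.split()[0]


def uninterruptible_processes(proc="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

A process that stays in this list across several runs, while the reset messages keep repeating, matches the symptom described in this report.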
Comment 1 Jonas Stein gentoo-dev 2024-04-23 18:34:05 UTC
I'm sorry to read that you are having problems with USB. The situation seems to be a bit more complicated and requires some analysis.
We cannot help you efficiently via the bug tracker. The bug tracker is aimed at specific problems in ebuilds rather than at individual systems.

I have had very good experiences on the Gentoo IRC channels [1] with questions like this. Of course, there are also the forums and mailing lists [2,3].
I hope you understand that I will therefore close the bug here, and I wish you good luck on one of the channels mentioned [4].
Please reopen the ticket if you can point to a specific error in an ebuild or any Gentoo-related product.

[1] https://www.gentoo.org/get-involved/irc-channels/
[2] https://forums.gentoo.org/
[3] https://www.gentoo.org/get-involved/mailing-lists/all-lists.html
[4] https://www.gentoo.org/support/
Comment 2 Daniel Pouzzner 2024-04-23 19:01:45 UTC
Followup on the above:

Using an older system running kernel 5.4, but with the same flash readers and media, mkfs.btrfs succeeded immediately.  When I returned the readers to the original system, the new filesystem mounted without delay or errors.

It is fairly clear from the evidence that there has been a regression in the kernel.  A quick look through the applicable genpatches set didn't turn up anything obvious, and the problem wasn't there in January on kernel 6.4.3.
Comment 3 Jonas Stein gentoo-dev 2024-04-23 22:46:54 UTC
Thank you for the additional information. Could you please add the kernel config, just in case? Thanks.
Comment 4 Daniel Pouzzner 2024-04-23 22:51:38 UTC
Created attachment 891588
kernel config for 6.7.9-gentoo
Comment 5 Daniel Pouzzner 2024-04-23 22:59:57 UTC
Also, a diff of the configs for the 6.4.3 and 6.7.9 kernels doesn't turn up anything obviously related.
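
For anyone repeating the config comparison, a plain textual diff is noisy because options move around between kernel versions. The following is a minimal sketch of an option-by-option comparison; it is not the method used in this report. The function names and the sample option values are made up for illustration, and only the "CONFIG_X=value" and "# CONFIG_X is not set" line formats are assumed.

```python
import gzip
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option to its value; explicitly unset options map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = SET_RE.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = UNSET_RE.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def read_config(path):
    """Read a plain or gzip-compressed kernel config file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        return parse_config(fh.read())


def diff_configs(old, new):
    """Return sorted (option, old_value, new_value) for options that differ.

    A value of None means the option does not appear in that config at all.
    """
    return [
        (key, old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    ]


# Made-up sample data, NOT taken from the attached config.
OLD = "CONFIG_USB_STORAGE=m\nCONFIG_USB_UAS=m\n# CONFIG_DM_CRYPT is not set\n"
NEW = "CONFIG_USB_STORAGE=y\nCONFIG_USB_UAS=m\nCONFIG_DM_CRYPT=m\n"

if __name__ == "__main__":
    for option, before, after in diff_configs(parse_config(OLD), parse_config(NEW)):
        print(f"{option}: {before} -> {after}")
```

The kernel source tree also ships scripts/diffconfig, which performs a similar comparison directly on two config files.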

Another data point: for a couple hours I've had a btrfs send/receive pipeline writing to the filesystem that I mkfs'd from the old kernel 5.4 installation.  It's 100 GB into the transfer and going strong, no kernel messages, certainly no resets.

So the problem seems to be very specific to mkfs over dm-crypt on this kernel.  I'm not currently in a position to bisect the problem, but I'm satisfied that this is probably an upstream kernel bug.

I have an email in to one of the core Linux kernel devs for advice on next steps.
Comment 6 Mike Pagano gentoo-dev 2024-05-08 17:21:57 UTC
(In reply to Daniel Pouzzner from comment #5)
> Also, a diff of the configs for the 6.4.3 and 6.7.9 kernels doesn't turn up
> anything obviously related.
> 
> Another data point: for a couple hours I've had a btrfs send/receive
> pipeline writing to the filesystem that I mkfs'd from the old kernel 5.4
> installation.  It's 100 GB into the transfer and going strong, no kernel
> messages, certainly no resets.
> 
> So the problem seems to be very specific to mkfs over dm-crypt on this
> kernel.  I'm not currently in a position to bisect the problem, but I'm
> satisfied that this is probably an upstream kernel bug.
> 
> I have an email in to one of the core Linux kernel devs for advice on next
> steps.

Any information back from the developer?
Comment 7 Daniel Pouzzner 2024-05-08 17:49:54 UTC
Now that you mention it, nope, nothing.

I also sent a report to https://lore.kernel.org/cryptsetup/ (https://lore.kernel.org/cryptsetup/10e060de0431f88edeaf7fa395965c1763a6b749.camel@mega.nu/) but that hasn't led to anything particularly useful either.

I have my workaround for now -- mkfs on an older laptop (kernel 5.4) -- and I'll keep an eye on whether the problem is resolved after kernel upgrades.

If the problem persists, it's inevitable there will be more reports of it.  I'm assuming it's not some weird quirk of my kernel config, because this all worked fine on kernel 6.4.3 with substantially the same kernel config.

It's not hard to provoke the syndrome -- just sfdisk, cryptsetup luksFormat, cryptsetup open, and mkfs, with the underlying device a USB flash card reader.
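
The sequence above can be written down as a dry-run sketch for anyone who wants to try reproducing it. This is an illustration under stated assumptions, not the exact commands from the report: the device paths /dev/sdX and /dev/sdX1, the mapper name "repro", and the one-partition sfdisk script are placeholders. The code only prints the commands and never executes them, since every step destroys data on the target device.

```python
import shlex


def repro_commands(disk, partition, mapper_name="repro",
                   luks_type="luks2", mkfs="mkfs.btrfs"):
    """Return the reproduction steps as (argv, stdin_text) pairs.

    stdin_text is None for commands that take no scripted input.
    """
    return [
        # One GPT partition spanning the whole device (assumed layout).
        (["sfdisk", disk], "label: gpt\n,,L\n"),
        # cryptsetup will prompt for confirmation and a passphrase here.
        (["cryptsetup", "luksFormat", "--type", luks_type, partition], None),
        (["cryptsetup", "open", partition, mapper_name], None),
        # This is the step reported to hang on kernel 6.7.9.
        ([mkfs, f"/dev/mapper/{mapper_name}"], None),
    ]


def render(steps):
    """Format the steps as shell command lines for manual review."""
    lines = []
    for argv, stdin_text in steps:
        cmd = shlex.join(argv)
        if stdin_text is not None:
            cmd = f"printf %s {shlex.quote(stdin_text)} | {cmd}"
        lines.append(cmd)
    return lines


if __name__ == "__main__":
    # Placeholder device names; replace with the real reader's device.
    for line in render(repro_commands("/dev/sdX", "/dev/sdX1")):
        print(line)
```

Passing luks_type="luks1" or mkfs="mkfs.ext4" covers the other combinations reported to fail in the description.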

It would be really interesting to hear results from others with kernel 6.7.* (expect to fail), and 6.8.* or 6.9-RC.  Also results on kernel 6.5 and 6.6, which would help bisect it.
Comment 8 Mike Pagano gentoo-dev 2024-06-14 16:33:51 UTC
Have you tried the suggested workaround?

Kernel parameter: usb_storage.quirks
Comment 9 Daniel Pouzzner 2024-06-14 16:58:13 UTC
I'm waiting until I have a session up with a more recent kernel, to determine if the problem still exists at all.  The problem I was seeing is (so far) unique to mkfs invocation, and the usb_storage.quirks approach is general in impact and reported to fix a different problem than I'm seeing, so I'm holding off on trying it.

As I mentioned earlier, I unblocked myself with mkfs on an older system.
Comment 10 Mike Pagano gentoo-dev 2024-06-15 09:02:12 UTC
OK, let us know if this is still a problem.