Bug 935347 - install-amd64-minimal-20240630T170408Z.iso: Terminal permanently stops responding to input (freezes)
Status: UNCONFIRMED
Alias: None
Product: Gentoo Release Media
Classification: Unclassified
Component: InstallCD
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Gentoo Release Team
Reported: 2024-07-02 13:27 UTC by Cam Spiers
Modified: 2024-07-03 11:14 UTC
Description Cam Spiers 2024-07-02 13:27:12 UTC
Overview:

While attempting a Gentoo install using the Minimal Installation CD (on the machine, not via ssh), I observed several instances of the terminal becoming indefinitely unresponsive to input (until reboot).

The live CD booted into a bash session successfully, and I was able to run a number of steps, e.g. opening the handbook via links inside tmux and partitioning a disk. However, at random and with no apparent pattern in what triggers it, the terminal would become completely unresponsive to input and the cursor would stop blinking. The only option apparent to me was to reboot. I booted the live media multiple times, and each time it eventually froze.


Situations in which it "froze":

- Simply at a bash prompt without running any command (nothing user triggered even in the background)
- While using links inside a tmux session
- While using fdisk


Steps I tried during the "freeze":

- Ctrl-c
- Ctrl-z
- Ctrl-b + :kill-server (while in tmux)


Reproducible: Always

Steps to Reproduce:
1. Boot installation media
2. Wait between 0 and 20mins (even simply at a prompt without running anything)
Actual Results:  
Obviously the above "Steps to Reproduce" aren't the trigger of the issue, but the result is that at some point while in the live cd the terminal will become completely unresponsive.

Expected Results:  
It shouldn't become completely unresponsive.

Version of installation media used:

install-amd64-minimal-20240630T170408Z.iso

Method of preparing installation media:

sudo dd if=install-amd64-minimal-20240630T170408Z.iso of=/dev/sda bs=4096 status=progress && sync
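
As a cross-check on the write step above, the written media can be compared byte for byte against the ISO. This is only a sketch, not something done in this report: the function name `verify_written_image` is made up for illustration, and it assumes GNU `cmp` (for the `-n` byte limit) and read access to the target device.

```shell
#!/bin/sh
# Hypothetical helper: compare the first <size of ISO> bytes of a target
# (e.g. the USB device) against the ISO file it was written from.
verify_written_image() {
    iso=$1
    target=$2
    [ -r "$iso" ] || return 2
    # Size of the ISO in bytes; tr strips padding some wc builds emit.
    size=$(wc -c < "$iso" | tr -d ' ')
    # Limit the comparison to the ISO length, since the device is
    # normally larger than the image written to it.
    cmp -n "$size" "$iso" "$target"
}

# Example (not run here; reading /dev/sda normally requires root):
#   verify_written_image install-amd64-minimal-20240630T170408Z.iso /dev/sda
```

An exit status of 0 means the device starts with an exact copy of the ISO; a non-zero status means the write is incomplete or corrupted, or the ISO could not be read.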

Hardware:

MB: ASUS TUF Gaming B650-PLUS
RAM: XPG Lancer RGB DDR5 6000MHz 64GB (2x32GB) CL30 (ADATA AX5U6000C3032G-DCLARBK)
CPU: Ryzen 9 7950X3D
GPU: MSI 4080 16GB GAMING X TRIO

My RAM is running an XMP profile, however I ran memtest per immolo's suggestion, and a full run of memtest passed. I have been running Ubuntu, Fedora and Windows on this machine for the past year without any similar issue (I spend the most time in Ubuntu).

immolo in the Gentoo Discord provided me with a different minimal installation CD build using a different kernel (install-amd64-minimal-NM.iso, sha256 = 4782f23b3f91fafed7b5ceccfb13b874a220009684821193475353d78afabba1). Using this version for over an hour, I did not observe any freezes.
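
For anyone testing the same alternative build, a download can be checked against the sha256 quoted above before use. A minimal sketch, assuming GNU coreutils `sha256sum`; the function name `verify_sha256` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE has the EXPECTED sha256 digest.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum -c expects "<hash>  <filename>" lines; "-" reads them
    # from stdin. --quiet suppresses the "OK" line on success.
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c --quiet -
}

# Example (not run here), using the digest quoted in this report:
#   verify_sha256 install-amd64-minimal-NM.iso \
#       4782f23b3f91fafed7b5ceccfb13b874a220009684821193475353d78afabba1
```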
Comment 1 immolo 2024-07-02 13:38:32 UTC
To add some notes to this,

I originally became suspicious of genkernel being the issue when the livegui had no trouble but the installcd was freezing. Running through all the troubleshooting steps narrowed it down to a genkernel issue with a 7950x3d, but as it's only confirmed in the installcd, I've marked it as a releng bug and CCed the genkernel project.
Comment 2 immolo 2024-07-02 13:40:55 UTC
To clarify again:

livegui (dist-kernel) - works fine
installcd with genkernel - hard locks after 30 minutes
installcd with dist-kernel - works fine
Comment 3 Ben Kohler gentoo-dev 2024-07-02 14:31:07 UTC
Are you able to ssh in from another machine and see if it's otherwise responsive, and if there are any errors in dmesg?
Comment 4 immolo 2024-07-02 16:54:31 UTC
As discussed with Ben on IRC, this is a full lock, which suggests it could be a power management issue.

I can't see anything missing in the releng kconfig from a quick look at https://github.com/gentoo/releng/blob/master/releases/kconfig/amd64/amd64-6.6.30.config (I forgot to check this before CCing genkernel, so apologies if unneeded.)
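
A more systematic version of that "quick look" would be to diff the releng kconfig against the config of a kernel that does not lock up (for example the dist-kernel one). The sketch below is only an illustration: `kconfig_diff` is a made-up name, and the second path in the example is a placeholder, since no dist-kernel config path is given in this bug.

```shell
#!/bin/sh
# Hypothetical helper: print CONFIG_ options that differ between two
# kernel .config files. "<" lines come from the first file, ">" lines
# from the second.
kconfig_diff() {
    a=$1
    b=$2
    tmpdir=$(mktemp -d) || return 2
    # Keep both set options and "# CONFIG_X is not set" lines, sorted in
    # the C locale so both files are ordered the same way.
    grep -E '^(CONFIG_|# CONFIG_)' "$a" | LC_ALL=C sort > "$tmpdir/a"
    grep -E '^(CONFIG_|# CONFIG_)' "$b" | LC_ALL=C sort > "$tmpdir/b"
    # Drop diff's hunk headers and separators; keep only changed options.
    diff "$tmpdir/a" "$tmpdir/b" | grep -E '^[<>]'
    rm -rf "$tmpdir"
}

# Example (not run here; the second path is a placeholder):
#   kconfig_diff amd64-6.6.30.config /path/to/dist-kernel.config | grep -i -E 'CPU_FREQ|PSTATE|CPU_IDLE|ACPI'
```

Filtering the result for power management related symbols, as in the example, is one way to narrow the output; which symbols matter here is an open question in this bug.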

For users hitting this right now: if you have a 7000x3d CPU affected by this bug, please use the latest version of the livegui, as that will allow you to get around the issue and install Gentoo as normal.

Anyone who wishes to help provide more info to solve this fully: please mount a second storage device that won't be wiped at shutdown, run "tail -f /var/log/dmesg > /mnt/<2nddrv>/dmesg.log" on a 7950x3d for the full 30 minutes until it hard locks, then tar.xz the log file and attach it to this bug report so we can all look further into it.
Comment 5 Cam Spiers 2024-07-02 21:09:57 UTC
I should be able to provide the output of dmesg during a crash within the next 10hrs.
Comment 6 Cam Spiers 2024-07-03 11:14:35 UTC
Running the following after booting into live CD:

```
mount /dev/nvme2n1p2 /mnt/ubuntu-root
tail -f /var/log/dmesg > /mnt/ubuntu-root/dmesg.log
```

I captured the following:

```
[   15.985254] Loading firmware: mediatek/WIFI_MT7961_patch_mcu_1_2_hdr.bin
[   15.988903] mt7921e 0000:09:00.0: HW/SW Version: 0x8a108a10, Build Time: 20240219110958a

[   15.998642] Loading firmware: mediatek/WIFI_RAM_CODE_MT7961_1.bin
[   15.998739] mt7921e 0000:09:00.0: WM Firmware Version: ____010000, Build Time: 20240219111038
[   16.030756] Loading firmware: mediatek/WIFI_RAM_CODE_MT7961_1.bin
[   16.081037] wl: loading out-of-tree module taints kernel.
[   16.081040] wl: module license 'MIXED/Proprietary' taints kernel.
[   16.081041] Disabling lock debugging due to kernel taint
[   16.081042] wl: module license taints kernel.
```

However, I will note that this output appeared right after boot; nothing more was added when the freeze occurred. (I confirmed this by running cat on dmesg.log after boot and taking a photo of the output.)
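
One possible reason nothing new showed up, stated as an assumption rather than something confirmed in this bug: on OpenRC systems /var/log/dmesg is typically a snapshot written once during boot, so following it with tail would not pick up later kernel messages. A sketch of an alternative capture is below; it follows the live kernel ring buffer with `dmesg --follow` (util-linux) and flushes every line to disk. The function name `persist_lines` is hypothetical, and the mount point is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: append each line from stdin to a file and flush it
# to disk right away, so the last messages before a hard lock are more
# likely to survive the reset.
persist_lines() {
    out=$1
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$out"
        # Sync just this file where supported, otherwise sync everything.
        sync "$out" 2>/dev/null || sync
    done
}

# Example (not run here; needs root on the live CD):
#   mount /dev/nvme2n1p2 /mnt/ubuntu-root
#   dmesg --follow | persist_lines /mnt/ubuntu-root/dmesg.log
```

If the machine locks up without the kernel logging anything at all, this will still capture nothing new, which would itself be a useful data point.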