I run a Gentoo EFI VM with minimal kernel options on ESX 6.7 (Westmere CPU). Every package is unmasked.

Beginning with kernel version 6.5, my server fails at every boot. A CDROM is present, but it is not loaded. I used genkernel before and then tried dracut. That solved the problem, but with an initramfs built with dracut the CDROM module significantly slows the boot (and reboot) process. With an initramfs built with genkernel, the boot process hangs while determining the root partition (this is the last step udev reports):

Genkernel 4.3.6 (2023-09-28 23:28:37 UTC). Linux kernel 6.5.5-gentoo-x86_64-2023-09-29-02-23-09
Activating udev...
Determining root device (trying UUID=7a47f5d9-bde2-4004-82a4-8cb9c336564a)

It seems that in kernel version 6.5 udev works badly with the CDROM module. With kernel 6.4.x everything boots OK.

Reproducible: Always

Steps to Reproduce:
Build the kernel & initramfs using these steps:

> cat > ~/allno.config << EOF
CONFIG_SWAP=y
CONFIG_NO_HZ_IDLE=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_CPU_ISOLATION=y
CONFIG_IKCONFIG=m
CONFIG_IKCONFIG_PROC=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_RD_ZSTD=y
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
CONFIG_EXPERT=y
CONFIG_MULTIUSER=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_SMP=y
CONFIG_DMI=y
CONFIG_ARCH_RANDOM=y
CONFIG_JUMP_LABEL=y
CONFIG_VMAP_STACK=y
CONFIG_BLOCK=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_PARTITION_ADVANCED=y
CONFIG_MSDOS_PARTITION=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_SCRIPT=y
CONFIG_COREDUMP=y
CONFIG_COMPACTION=y
CONFIG_PACKET=m
CONFIG_PCI=y
CONFIG_PCI_MSI=y
CONFIG_PCI_QUIRKS=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_ALLOW_DEV_COREDUMP=y
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_NETDEVICES=y
CONFIG_INPUT_MOUSEDEV=m
CONFIG_INPUT_EVDEV=m
CONFIG_INPUT_KEYBOARD=y
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_HW_RANDOM=m
CONFIG_RANDOM_TRUST_BOOTLOADER=y
CONFIG_FB=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_HCTOSYS=y
CONFIG_RTC_SYSTOHC=y
CONFIG_RTC_INTF_SYSFS=y
CONFIG_RTC_INTF_PROC=y
CONFIG_RTC_INTF_DEV=y
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=866
CONFIG_FAT_DEFAULT_IOCHARSET="cp1251"
CONFIG_PROC_SYSCTL=y
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_1251=m
CONFIG_NLS_UTF8=m
CONFIG_KERNEL_ZSTD=y
CONFIG_64BIT=y
CONFIG_PROCESSOR_SELECT=y
CONFIG_CPU_SUP_INTEL=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_INTEL=y
CONFIG_NUMA=y
CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
CONFIG_ACPI=y
CONFIG_PCI_MMCONFIG=y
CONFIG_IA32_EMULATION=y
CONFIG_PNP_DEBUG_MESSAGES=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_KEYBOARD_ATKBD=m
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=m
CONFIG_HW_RANDOM_INTEL=m
CONFIG_RANDOM_TRUST_CPU=y
CONFIG_RTC_DRV_CMOS=y
CONFIG_RAS=y
CONFIG_GENTOO_LINUX=y
CONFIG_GENTOO_LINUX_UDEV=y
CONFIG_GENTOO_LINUX_PORTAGE=y
CONFIG_GENTOO_LINUX_INIT_SYSTEMD=y
CONFIG_GENTOO_PRINT_FIRMWARE_INFO=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_HZ_100=y
CONFIG_PREEMPT_NONE=y
CONFIG_BLK_DEV=y
CONFIG_BLK_DEV_SR=m
CONFIG_ISO9660_FS=m
CONFIG_JOLIET=y
CONFIG_UDF_FS=m
CONFIG_PCIEPORTBUS=y
CONFIG_PCIEASPM=y
CONFIG_HID_SUPPORT=y
CONFIG_HID=m
CONFIG_HIDRAW=y
CONFIG_HID_GENERIC=m
CONFIG_USB_HID=m
CONFIG_USB_HIDDEV=y
CONFIG_USB_SUPPORT=y
CONFIG_USB=m
CONFIG_USB_PCI=y
CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
CONFIG_USB_DEFAULT_PERSIST=y
CONFIG_USB_XHCI_HCD=m
CONFIG_USB_EHCI_HCD=m
CONFIG_USB_EHCI_TT_NEWSCHED=y
CONFIG_USB_OHCI_HCD=m
CONFIG_USB_OHCI_HCD_PCI=m
CONFIG_USB_UHCI_HCD=m
CONFIG_USB_STORAGE=m
CONFIG_USB_SERIAL=m
CONFIG_TYPEC=m
CONFIG_EFI=y
CONFIG_EFI_PARTITION=y
CONFIG_EFIVAR_FS=m
CONFIG_FB_EFI=y
CONFIG_MWESTMERE=y
CONFIG_CRYPTO_AES_NI_INTEL=m
CONFIG_HYPERVISOR_GUEST=y
CONFIG_PARAVIRT=y
CONFIG_X86_IOPL_IOPERM=y
CONFIG_VSOCKETS=m
CONFIG_VMWARE_VMCI_VSOCKETS=m
CONFIG_VMWARE_BALLOON=m
CONFIG_VMWARE_VMCI=m
CONFIG_SCSI_LOWLEVEL=y
CONFIG_VMWARE_PVSCSI=y
CONFIG_ATA=m
CONFIG_ATA_VERBOSE_ERROR=y
CONFIG_ATA_ACPI=y
CONFIG_ATA_SFF=y
CONFIG_ATA_BMDMA=y
CONFIG_ATA_PIIX=m
CONFIG_PATA_ACPI=m
CONFIG_VMXNET3=m
CONFIG_MOUSE_PS2_VMMOUSE=y
CONFIG_I2C_PIIX4=m
CONFIG_AGP=y
CONFIG_AGP_INTEL=y
CONFIG_DRM=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_VMWGFX=m
CONFIG_DRM_VMWGFX_FBCON=y
CONFIG_FUSE_FS=m
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_FAN=m
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_THERMAL=m
CONFIG_ACPI_NUMA=y
CONFIG_ACPI_APEI=y
CONFIG_X86_PM_TIMER=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_X86_INTEL_PSTATE=y
CONFIG_X86_ACPI_CPUFREQ=m
CONFIG_INTEL_IDLE=y
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_NF_CONNTRACK=m
CONFIG_NF_TABLES=m
CONFIG_NF_TABLES_INET=y
CONFIG_NFT_CT=m
CONFIG_NFT_REJECT=m
CONFIG_NF_REJECT_IPV4=m
CONFIG_NF_REJECT_IPV6=m
CONFIG_KEY_DH_OPERATIONS=y
CONFIG_AIO=y
CONFIG_MAGIC_SYSRQ=y
EOF
> cd /usr/src/linux
> make KCONFIG_ALLCONFIG=~/allno.config allnoconfig
> genkernel all --install --kernel-append-localversion="-$(date +%Y-%m-%d-%H-%M-%S)"
> grub-mkconfig -o /boot/grub/grub.cfg
> reboot

Actual Results:
Boot hangs

Expected Results:
System loads
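An option listed in a KCONFIG_ALLCONFIG fragment can end up unset in the generated .config without an obvious error, for example when its dependencies are not met or the symbol no longer exists in the tree. Before building, it can therefore help to check which requested lines actually survived. A minimal sketch, assuming the fragment and the generated config are plain text files; the helper name missing_options is made up for illustration:

```shell
# Print every line of the fragment that does not appear verbatim in the
# generated config. Both paths are supplied by the caller.
missing_options() {
    fragment=$1
    config=$2
    # -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file
    grep -F -x -v -f "$config" "$fragment"
}

# Example, with the paths assumed from the steps above:
#   missing_options ~/allno.config /usr/src/linux/.config
```

Any line printed is an option that was requested but is not set to the requested value in the final config.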
Can you do a git bisect between the last working kernel and the first non-working one?
Do these symptoms seem similar? https://lore.kernel.org/lkml/ZO2RlYCDl8kmNHnN@torres.zugschlus.de/
(In reply to Mike Pagano from comment #1)
> Can you do a git bisect between the last working kernel and the first
> non-working one?

I do not use git to get new package versions; I use Portage. I do not understand how to use git bisect in my situation.
(In reply to Mike Pagano from comment #2)
> Do these symptoms seem similar?
>
> https://lore.kernel.org/lkml/ZO2RlYCDl8kmNHnN@torres.zugschlus.de/

Yes, they seem similar, but my problem involves the CDROM, so the resemblance may be only superficial.
(In reply to Mike Pagano from comment #1)
> Can you do a git bisect between the last working kernel and the first
> non-working one?

I found out how to use git bisect, working on it...
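For readers in the same position: bisecting needs a git checkout of the kernel rather than the Portage-installed sources. A rough sketch follows, assuming the upstream tags v6.4 (good) and v6.5 (bad) bracket the regression and that the mainline tree reproduces it. The function name start_kernel_bisect and the checkout directory are hypothetical.

```shell
# Sketch only: wrapped in a function so that nothing runs until it is called.
start_kernel_bisect() {
    workdir=${1:-$HOME/linux-bisect}   # hypothetical checkout location
    git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git "$workdir" &&
    cd "$workdir" &&
    git bisect start &&
    git bisect bad v6.5 &&     # assumed: first release that hangs
    git bisect good v6.4       # assumed: last release that boots
}

# After each step git checks out a candidate commit. Build and boot it with
# the same config fragment as in the report, then record the outcome:
#   git bisect good    # the VM booted
#   git bisect bad     # the boot hung
# Repeat until git prints "<sha> is the first bad commit", then clean up:
#   git bisect reset
```

Each round halves the remaining commit range, so the number of test boots grows only logarithmically with the number of commits between the two tags.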
Upstream commit 2132df16f53b4f01ab25f5d404f36a22244ae342 should fix the issue.

commit 2132df16f53b4f01ab25f5d404f36a22244ae342
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Sep 15 11:20:34 2023 +0900

    scsi: core: ata: Do no try to probe for CDL on old drives

    Some old drives (e.g. an Ultra320 SCSI disk as reported by John) do not
    seem to execute MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
    commands correctly and hang when a non-zero service action is specified
    (one command format with service action case in scsi_report_opcode()).

    Currently, CDL probing with scsi_cdl_check_cmd() is the only caller using a
    non zero service action for scsi_report_opcode(). To avoid issues with
    these old drives, do not attempt CDL probe if the device reports support
    for an SPC version lower than 5 (CDL was introduced in SPC-5). To keep
    things working with ATA devices which probe for the CDL T2A and T2B pages
    introduced with SPC-6, modify ata_scsiop_inq_std() to claim SPC-6 version
    compatibility for ATA drives supporting CDL.

    SPC-6 standard version number is defined as Dh (= 13) in SPC-6 r09. Fix
    scsi_probe_lun() to correctly capture this value by changing the bit mask
    for the second byte of the INQUIRY response from 0x7 to 0xf.
    include/scsi/scsi.h is modified to add the definition SCSI_SPC_6 with the
    value 14 (Dh + 1). The missing definitions for the SCSI_SPC_4 and
    SCSI_SPC_5 versions are also added.

    Reported-by: John David Anglin <dave.anglin@bell.net>
    Fixes: 624885209f31 ("scsi: core: Detect support for command duration limits")
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Link: https://lore.kernel.org/r/20230915022034.678121-1-dlemoal@kernel.org
    Tested-by: David Gow <david@davidgow.net>
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
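The mask change described in the commit message can be checked with plain shell arithmetic. This only illustrates the numbers quoted there (version code Dh, masks 0x7 and 0xf, stored level Dh + 1) and is not kernel code; the helper name scsi_level_from is hypothetical.

```shell
# Apply a bit mask to the INQUIRY version code and add 1, mirroring the
# "Dh + 1" relation quoted in the commit message.
scsi_level_from() {
    version_byte=$1
    mask=$2
    echo $(( ($version_byte & $mask) + 1 ))
}

scsi_level_from 0x0D 0x7   # old mask: the top bit of Dh is lost, prints 6
scsi_level_from 0x0D 0xf   # new mask: prints 14, the value given for SCSI_SPC_6
```

With the 3-bit mask a device reporting SPC-6 would be recorded with a lower level, which is why the mask had to be widened before the new "SPC version lower than 5" check could work for ATA drives claiming SPC-6.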
In my environment, git bisect gave this result:

624885209f31eb9985bf51abe204ecbffe2fdeea is the first bad commit
commit 624885209f31eb9985bf51abe204ecbffe2fdeea
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Thu May 11 03:13:41 2023 +0200

    scsi: core: Detect support for command duration limits

    Introduce the function scsi_cdl_check() to detect if a device supports
    command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32
    and WRITE 32 commands are checked using the function scsi_report_opcode()
    to probe the rwcdlp and cdlp bits as they indicate the mode page defining
    the command duration limits descriptors that apply to the command being
    tested.

    If any of these commands support CDL, the field cdl_supported of struct
    scsi_device is set to 1 to indicate that the device supports CDL.

    Support for CDL for a device is advertizes through sysfs using the new
    cdl_supported device attribute. This attribute value is 1 for a device
    supporting CDL and 0 otherwise.

    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com>
    Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
    Link: https://lore.kernel.org/r/20230511011356.227789-9-nks@flawful.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

 Documentation/ABI/testing/sysfs-block-device |  9 ++++
 drivers/scsi/scsi.c                          | 81 ++++++++++++++++++++++++++++
 drivers/scsi/scsi_scan.c                     |  3 ++
 drivers/scsi/scsi_sysfs.c                    |  2 +
 include/scsi/scsi_device.h                   |  3 ++
 5 files changed, 98 insertions(+)
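The cdl_supported attribute mentioned in the commit message can be read from sysfs on a kernel that includes this commit. A minimal sketch, assuming the attribute is exposed as /sys/block/<disk>/device/cdl_supported (inferred from the sysfs-block-device ABI file touched by the commit, not verified here); the helper name list_cdl_support and its optional root argument are made up for illustration.

```shell
# List each block device that exposes cdl_supported, with its value
# (1 = supports CDL, 0 = does not, per the commit message).
list_cdl_support() {
    root=${1:-/sys/block}            # assumed sysfs location
    for attr in "$root"/*/device/cdl_supported; do
        [ -r "$attr" ] || continue   # glob did not match, or not readable
        disk=${attr#"$root"/}
        disk=${disk%%/*}
        printf '%s %s\n' "$disk" "$(cat "$attr")"
    done
}

list_cdl_support
```

On kernels older than 6.5, or for devices without the attribute, the function simply prints nothing.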
Thanks all!

commit 37ee7bd247fcae6c7a17e312250baaf379d80bc8
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Sep 15 11:20:34 2023 +0900

    scsi: core: ata: Do no try to probe for CDL on old drives

    commit 2132df16f53b4f01ab25f5d404f36a22244ae342 upstream.

-> fixed in 6.5.6
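To check whether a given kernel predates the fix, its release string can be compared against 6.5.6. A small sketch, assuming the regression is limited to 6.5.0 through 6.5.5, which is what this thread implies (6.4.x boots, 6.5.6 carries the backport); the helper name kernel_affected is hypothetical.

```shell
# Succeed (exit status 0) if the release string falls in the assumed
# affected range 6.5.0 .. 6.5.5.
kernel_affected() {
    ver=${1%%-*}                 # drop a local suffix such as -gentoo-x86_64
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    case $rest in
        *.*) patch=${rest#*.} ;;
        *)   patch=0 ;;          # "6.5" is treated as 6.5.0
    esac
    patch=${patch%%[!0-9]*}      # keep the leading digits only
    [ "$major" = 6 ] && [ "$minor" = 5 ] && [ "${patch:-0}" -lt 6 ]
}

if kernel_affected "$(uname -r)"; then
    echo "running kernel is in the assumed affected range; update to 6.5.6 or later"
else
    echo "running kernel is outside the assumed affected range"
fi
```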
Thank you very much.