I recently upgraded my system drive to a (very fast) SSD. Since then, the consolefont init script would fail with: putfont: KDFONTOP: Invalid argument Running "consolefont start" (or just "rc" for less typing) after full bootup+login would work just fine. With the new SSD, the init script first runs while the kernel video mode switch is still ongoing/settling, so I suspected that there was a race condition between KMS and consolefont. I asked WilliamH on IRC about this and axs suggest I add udev-settle to the sysinit runlevel. The result is twofold: - Font loading now works without trouble - There is a (barely noticeable, maybe 200ms) delay in the boot process. I don't really care about the slowdown, but it may indicate that this is not a real fix for the race condition but one that could have been achieved by something like "sleep 0.2". If it a proper fix, consolefont should have a dep on udev-settle.
(In reply to Tobias Klausmann from comment #0) > I recently upgraded my system drive to a (very fast) SSD. > > Since then, the consolefont init script would fail with: > > putfont: KDFONTOP: Invalid argument > > Running "consolefont start" (or just "rc" for less typing) after full > bootup+login would work just fine. > > With the new SSD, the init script first runs while the kernel video mode > switch is still ongoing/settling, so I suspected that there was a race > condition between KMS and consolefont. > > I asked WilliamH on IRC about this and axs suggest I add udev-settle to the > sysinit runlevel. > > The result is twofold: > > - Font loading now works without trouble > - There is a (barely noticeable, maybe 200ms) delay in the boot process. > > I don't really care about the slowdown, but it may indicate that this is not > a real fix for the race condition but one that could have been achieved by > something like "sleep 0.2". If it a proper fix, consolefont should have a > dep on udev-settle. consolefont can't have a dep on udev-settle,, and reading this bug, I doubt that adding udev-settle to any runlevel is the proper fix. Can you give more info about how setfont works and what it may be looking for?
The fundamental problem here is that the font-loading depends on the framebuffer device in the kernel (and, tangentially, the Kernel Mode Switching infrastructure). The initialization of that happens slow enough that it may still be going on by the time the consolefont init script runs. This is exacerbated by using SSDs and making the framebuffer device a module. The latter specifically means that the functionality consolefont relies on is only available after udev has seen the device, loaded the module and the initialization of the driver+hw has completed. I don't know if scripts like consolefont can rely on specific devices being available, or if availability-of-fb0 is racy with can-load-fonts.
(In reply to Tobias Klausmann from comment #2) > The fundamental problem here is that the font-loading depends on the > framebuffer device in the kernel (and, tangentially, the Kernel Mode > Switching infrastructure). The initialization of that happens slow enough > that it may still be going on by the time the consolefont init script runs. > This is exacerbated by using SSDs and making the framebuffer device a > module. The latter specifically means that the functionality consolefont > relies on is only available after udev has seen the device, loaded the > module and the initialization of the driver+hw has completed. > > I don't know if scripts like consolefont can rely on specific devices being > available, or if availability-of-fb0 is racy with can-load-fonts. Which console driver are you using? There are several: http://lxr.free-electrons.com/source/drivers/video/console/ The EINVAL is coming from the kernel on what should be a valid tty device. The effect of `udev-settle` could be to slow down boot enough to win a race inside the kernel. I will need to know the console driver being used in order to dig deeper. Would you also attach your kernel .config (which includes the kernel version) and specify which kernel source ebuild you are using?
Created attachment 414040 [details, diff] wait-tty.patch I would like it if you could try an experiment for me. Remove udev-settle from the sysinit runlevel, then apply this patch to consolefont. Let me know if this solves the issue. This patch causes the script to wait for /dev/tty* before it tries the setfont. Thanks much, William
(In reply to William Hubbs from comment #4) > Created attachment 414040 [details, diff] [details, diff] > wait-tty.patch > > I would like it if you could try an experiment for me. > > Remove udev-settle from the sysinit runlevel, then apply this patch to > consolefont. Let me know if this solves the issue. This patch causes the > script to wait for /dev/tty* before it tries the setfont. > > Thanks much, > > William Given that the open() in setfont succeeded, I do not think this patch will do what you intend for it to do. It would most likely just add a slightly delay to setfont doing the open()/ioctl() sequence that might enable us to win the race, but that is not a proper fix anymore than `sleep 0.2` would be. By the way, given that we are actively working on this, would it be appropriate to change the status of this to IN_PROGRESS?
The current kernel is a vanilla 4.3.0-rc3, but this has been happening since forever, at least the mid-3.x series I think. I've always run vanilla kernels, either from tgz or git. I'll attach the .config I currently use. Haven't changed the FB/Graphics side in a long time (except for switching between mod drivers vs. baked in, but that never really made a difference). The GRUB console is set to "console" aka text mode, and the GFXPAYLOAD is "preserve". lspci -vvv output for the graphics card: 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM] (prog-if 00 [VGA controller]) Subsystem: ASUSTeK Computer Inc. Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM] Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 28 Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M] Region 2: Memory at fe620000 (64-bit, non-prefetchable) [size=128K] Region 4: I/O ports at e000 [size=256] Expansion ROM at fe600000 [disabled] [size=128K] Capabilities: [50] Power Management version 3 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ MaxPayload 128 bytes, MaxReadReq 512 bytes DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend- LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+ Address: 00000000fee0800c Data: 41d1 Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?> Capabilities: [150 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Kernel driver in use: radeon
Created attachment 414048 [details] vanilla kernel config for v4.3.0-rc3
Created attachment 414130 [details] dmesg output, vanilla 4.3.0-rc3, radeon FB support as a module Note the initialization of the radeon drive happens about half a second after dracut switches the rootfs.
C
c
The comment about framebuffer support being a module got my attention when rereading this bug, and I noticed that consolefont doesn't have "after modules" in its dependencies. To test to see if this solves the issue, add the following to /etc/conf.d/consolefont, again without udev-settle being in any runlevel: rc_after="modules"
(In reply to William Hubbs from comment #11) > The comment about framebuffer support being a module got my attention > when rereading this bug, and I noticed that consolefont doesn't have > "after modules" in its dependencies. To test to see if this solves the > issue, add the following to /etc/conf.d/consolefont, again without > udev-settle being in any runlevel: > > rc_after="modules" I tried with this and it reproducibly fixes the matter (and removing it makes the error re-appear.
This is fixed in commit 4cf6b0e which will be in OpenRC-0.19.