Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 559540 - sys-apps/openrc: consolefont needs after modules in its dependencies
Summary: sys-apps/openrc: consolefont needs after modules in its dependencies
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: OpenRC (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: OpenRC Team
URL:
Whiteboard:
Keywords:
Depends on: 560066
Blocks: 523600
  Show dependency tree
 
Reported: 2015-09-03 17:17 UTC by Tobias Klausmann (RETIRED)
Modified: 2015-10-29 16:51 UTC (History)
2 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
wait-tty.patch (wait-tty.patch,435 bytes, patch)
2015-10-08 00:02 UTC, William Hubbs
Details | Diff
vanilla kernel config for v4.3.0-rc3 (kernelconfig-4.3.0-rc3,97.60 KB, text/plain)
2015-10-08 08:31 UTC, Tobias Klausmann (RETIRED)
Details
dmesg output, vanilla 4.3.0-rc3, radeon FB support as a module (ab,51.18 KB, text/plain)
2015-10-08 15:56 UTC, Tobias Klausmann (RETIRED)
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Klausmann (RETIRED) gentoo-dev 2015-09-03 17:17:48 UTC
I recently upgraded my system drive to a (very fast) SSD.

Since then, the consolefont init script would fail with:

putfont: KDFONTOP: Invalid argument

Running "consolefont start" (or just "rc" for less typing) after full bootup+login would work just fine.

With the new SSD, the init script first runs while the kernel video mode switch is still ongoing/settling, so I suspected that there was a race condition between KMS and consolefont.

I asked WilliamH on IRC about this and axs suggest I add udev-settle to the sysinit runlevel.

The result is twofold:

- Font loading now works without trouble
- There is a (barely noticeable, maybe 200ms) delay in the boot process.

I don't really care about the slowdown, but it may indicate that this is not a real fix for the race condition but one that could have been achieved by something like "sleep 0.2". If it a proper fix, consolefont should have a dep on udev-settle.
Comment 1 William Hubbs gentoo-dev 2015-10-06 16:58:15 UTC
(In reply to Tobias Klausmann from comment #0)
> I recently upgraded my system drive to a (very fast) SSD.
> 
> Since then, the consolefont init script would fail with:
> 
> putfont: KDFONTOP: Invalid argument
> 
> Running "consolefont start" (or just "rc" for less typing) after full
> bootup+login would work just fine.
> 
> With the new SSD, the init script first runs while the kernel video mode
> switch is still ongoing/settling, so I suspected that there was a race
> condition between KMS and consolefont.
> 
> I asked WilliamH on IRC about this and axs suggest I add udev-settle to the
> sysinit runlevel.
> 
> The result is twofold:
> 
> - Font loading now works without trouble
> - There is a (barely noticeable, maybe 200ms) delay in the boot process.
> 
> I don't really care about the slowdown, but it may indicate that this is not
> a real fix for the race condition but one that could have been achieved by
> something like "sleep 0.2". If it a proper fix, consolefont should have a
> dep on udev-settle.

consolefont can't have a dep on udev-settle,, and reading this bug, I doubt that adding udev-settle to any runlevel is the proper fix.

Can you give more info about how setfont works and what it may be looking for?
Comment 2 Tobias Klausmann (RETIRED) gentoo-dev 2015-10-07 11:30:00 UTC
The fundamental problem here is that the font-loading depends on the framebuffer device in the kernel (and, tangentially, the Kernel Mode Switching infrastructure). The initialization of that happens slow enough that it may still be going on by the time the consolefont init script runs. This is exacerbated by using SSDs and making the framebuffer device a module. The latter specifically means that the functionality consolefont relies on is only available after udev has seen the device, loaded the module and the initialization of the driver+hw has completed.

I don't know if scripts like consolefont can rely on specific devices being available, or if availability-of-fb0 is racy with can-load-fonts.
Comment 3 Richard Yao (RETIRED) gentoo-dev 2015-10-07 23:18:48 UTC
(In reply to Tobias Klausmann from comment #2)
> The fundamental problem here is that the font-loading depends on the
> framebuffer device in the kernel (and, tangentially, the Kernel Mode
> Switching infrastructure). The initialization of that happens slow enough
> that it may still be going on by the time the consolefont init script runs.
> This is exacerbated by using SSDs and making the framebuffer device a
> module. The latter specifically means that the functionality consolefont
> relies on is only available after udev has seen the device, loaded the
> module and the initialization of the driver+hw has completed.
> 
> I don't know if scripts like consolefont can rely on specific devices being
> available, or if availability-of-fb0 is racy with can-load-fonts.

Which console driver are you using? There are several:

http://lxr.free-electrons.com/source/drivers/video/console/

The EINVAL is coming from the kernel on what should be a valid tty device. The effect of `udev-settle` could be to slow down boot enough to win a race inside the kernel. I will need to know the console driver being used in order to dig deeper.

Would you also attach your kernel .config (which includes the kernel version) and specify which kernel source ebuild you are using?
Comment 4 William Hubbs gentoo-dev 2015-10-08 00:02:00 UTC
Created attachment 414040 [details, diff]
wait-tty.patch

I would like it if you could try an experiment for me.

Remove udev-settle from the sysinit runlevel, then apply this patch to
consolefont. Let me know if this solves the issue. This patch causes the
script to wait for /dev/tty* before it tries the setfont.

Thanks much,

William
Comment 5 Richard Yao (RETIRED) gentoo-dev 2015-10-08 08:19:20 UTC
(In reply to William Hubbs from comment #4)
> Created attachment 414040 [details, diff] [details, diff]
> wait-tty.patch
> 
> I would like it if you could try an experiment for me.
> 
> Remove udev-settle from the sysinit runlevel, then apply this patch to
> consolefont. Let me know if this solves the issue. This patch causes the
> script to wait for /dev/tty* before it tries the setfont.
> 
> Thanks much,
> 
> William

Given that the open() in setfont succeeded, I do not think this patch will do what you intend for it to do. It would most likely just add a slightly delay to setfont doing the open()/ioctl() sequence that might enable us to win the race, but that is not a proper fix anymore than `sleep 0.2` would be.

By the way, given that we are actively working on this, would it be appropriate to change the status of this to IN_PROGRESS?
Comment 6 Tobias Klausmann (RETIRED) gentoo-dev 2015-10-08 08:30:40 UTC
The current kernel is a vanilla 4.3.0-rc3, but this has been happening since forever, at least the mid-3.x series I think. I've always run vanilla kernels, either from tgz or git.

I'll attach the .config I currently use. Haven't changed the FB/Graphics side in a long time (except for switching between mod drivers vs. baked in, but that never really made a difference).

The GRUB console is set to "console" aka text mode, and the GFXPAYLOAD is "preserve".

lspci -vvv output for the graphics card:

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM] (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 28
	Region 0: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at fe620000 (64-bit, non-prefetchable) [size=128K]
	Region 4: I/O ports at e000 [size=256]
	Expansion ROM at fe600000 [disabled] [size=128K]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0800c  Data: 41d1
	Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Kernel driver in use: radeon
Comment 7 Tobias Klausmann (RETIRED) gentoo-dev 2015-10-08 08:31:09 UTC
Created attachment 414048 [details]
vanilla kernel config for v4.3.0-rc3
Comment 8 Tobias Klausmann (RETIRED) gentoo-dev 2015-10-08 15:56:08 UTC
Created attachment 414130 [details]
dmesg output, vanilla 4.3.0-rc3, radeon FB support as a module

Note the initialization of the radeon drive happens about half a second after dracut switches the rootfs.
Comment 9 William Hubbs gentoo-dev 2015-10-29 15:45:54 UTC
C
Comment 10 William Hubbs gentoo-dev 2015-10-29 15:46:10 UTC
c
Comment 11 William Hubbs gentoo-dev 2015-10-29 15:55:45 UTC
The comment about framebuffer support being a module got my attention
when rereading this bug, and I noticed that consolefont doesn't have
"after modules" in its dependencies.  To test to see if this solves the
issue, add the following to /etc/conf.d/consolefont, again without
udev-settle being in any runlevel:

rc_after="modules"
Comment 12 Tobias Klausmann (RETIRED) gentoo-dev 2015-10-29 16:15:27 UTC
(In reply to William Hubbs from comment #11)
> The comment about framebuffer support being a module got my attention
> when rereading this bug, and I noticed that consolefont doesn't have
> "after modules" in its dependencies.  To test to see if this solves the
> issue, add the following to /etc/conf.d/consolefont, again without
> udev-settle being in any runlevel:
> 
> rc_after="modules"

I tried with this and it reproducibly fixes the matter (and removing it makes the error re-appear.
Comment 13 William Hubbs gentoo-dev 2015-10-29 16:51:16 UTC
This is fixed in commit 4cf6b0e which will be in OpenRC-0.19.