Bug 836367 - Remove /usr/share/man from default inclusion list for docompress
Summary: Remove /usr/share/man from default inclusion list for docompress
Status: UNCONFIRMED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: PMS/EAPI
Hardware: All All
Importance: Normal normal
Assignee: PMS/EAPI
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-03-29 11:40 UTC by Alexis
Modified: 2023-04-09 12:23 UTC
CC List: 5 users

See Also:
Package list:
Runtime testing required: ---


Description Alexis 2022-03-29 11:40:08 UTC
At the moment, using the system-man USE flag with the mandoc package results in unusable man pages. This is due to man pages getting compressed with bzip2 by default, which the mandoc binary currently doesn't handle.

The problem can be addressed with the sledgehammer of setting PORTAGE_COMPRESS to the empty string, which also needlessly affects documentation installed in /usr/share/doc and /usr/share/info, and then re-emerging any packages with man pages. Depending on how long the Gentoo installation has been in use, this might involve a substantial number of packages, some of which are heavy builds (e.g. the llvm package). One can, of course, write some shell to manually bunzip2 existing man pages and update all the related symlinks, but this should perhaps not be necessary, and in any event it results in file conflict warnings the next time a package is emerge'd.
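A minimal sketch of the "some shell" approach mentioned above, for illustration only. The function name `decompress_man_tree` and its parameters are hypothetical, and this is not the script the reporter actually used. It assumes symlinks to compressed pages carry the same suffix as their targets, and it does not update Portage's record of installed files, which is why the file conflict warnings would still appear on the next emerge.

```shell
#!/bin/sh
# Sketch: decompress man pages under a directory tree and retarget symlinks.
# Usage: decompress_man_tree DIR [SUFFIX] [DECOMPRESSOR]
# SUFFIX defaults to bz2 and DECOMPRESSOR to bunzip2; both defaults are
# choices made for this illustration.
decompress_man_tree() {
    dir=$1
    suffix=${2:-bz2}
    decompress=${3:-bunzip2}

    # 1. Replace each symlink named *.SUFFIX with one that has the suffix
    #    stripped from both its own name and its target.
    find "$dir" -type l -name "*.$suffix" | while IFS= read -r link; do
        target=$(readlink "$link")
        rm -f "${link%.$suffix}"
        ln -s "${target%.$suffix}" "${link%.$suffix}"
        rm -f "$link"
    done

    # 2. Decompress the regular files in place (foo.1.bz2 becomes foo.1).
    find "$dir" -type f -name "*.$suffix" -exec "$decompress" {} +
}
```

Trying it on a copy of the tree first, for example `decompress_man_tree /tmp/man-copy`, would be the cautious way to check the result before touching /usr/share/man.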

It seems unlikely that bzip2'ing man pages makes much noticeable difference in disk space usage relative to the amount available on most systems nowadays. By removing /usr/share/man from the default inclusion list for docompress, the system-man USE flag will be more practically useful, and the sledgehammer of setting PORTAGE_COMPRESS to the empty string will not be required.

Reproducible: Always
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-29 11:45:58 UTC
I'm not thrilled with the idea of dropping it for everybody just because of a niche alternative implementation.

Anyway, does mandoc really support no compression at all? Or just not bzip2?
Comment 2 Alexis 2022-03-29 11:57:16 UTC
No compression at all, as far as i'm aware (and after double-checking the man pages for mandoc(1) and mandoc(3) just now). But i'd be happy to be wrong.

As a data point, when i first installed Gentoo, and installed the mandoc package with the system-man USE flag, i ended up with no man pages accessible, and, being new to Gentoo, no idea on how best to deal with it other than to not use the system-man USE flag with the mandoc package.
Comment 3 Mike Gilbert gentoo-dev 2022-03-29 19:06:59 UTC
I agree with Sam here: installing uncompressed man pages by default is not a good solution for the average user.

The default inclusion list is defined in PMS, so changing it would require an update to that document.
Comment 4 Ulrich Müller gentoo-dev 2022-03-30 00:11:18 UTC
Presumably PORTAGE_COMPRESS_EXCLUDE_SUFFIXES in make.conf could be used to control this?
Comment 5 Alexis 2022-03-30 00:32:58 UTC
Okay. So i currently have 215 atoms listed in /var/lib/portage/world, resulting in 2000 entries (including symlinks) in /usr/share/man/man1. tar'ing and then bzip2'ing that directory:

36239360 man1.tar
 8015812 man1.tar.bz2

a saving of ~28M. For my own understanding, are there benefits to bzip2'ing other than this saving? The immediate cost to people who enable `system-man` for `mandoc` is unusable man pages, with no indication as to what the problem is (well, until i added the `Mandoc` page to the wiki a couple of days ago), and the fix requires either re-emerging world or (as i've done) writing some shell+awk.
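Working only from the two byte counts listed above (this is arithmetic on the quoted numbers, not a fresh measurement), the difference comes to 28223548 bytes, a little under 27 MiB, with the compressed archive at roughly 22% of the original size. A short shell check, with variable names chosen for illustration:

```shell
# Byte counts as listed above for man1.tar and man1.tar.bz2.
uncompressed=36239360
compressed=8015812

# Integer arithmetic only, so the MiB and percentage figures round down.
saved=$((uncompressed - compressed))
echo "saved bytes: $saved"
echo "saved MiB (rounded down): $((saved / 1048576))"
echo "compressed size as % of original (rounded down): $((compressed * 100 / uncompressed))"
```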

@Ulrich, thanks for pointing out PORTAGE_COMPRESS_EXCLUDE_SUFFIXES; presumably i'd set that to something like "[1-9]"? That certainly removes the 'sledgehammer' aspect, though not the cost(s) i mention in the preceding para.
Comment 6 Mike Gilbert gentoo-dev 2022-03-30 00:41:47 UTC
(In reply to Alexis from comment #5)

The vast majority of Gentoo users use sys-apps/man-db, which supports several different compression formats.

Perhaps you should file a bug against app-text/mandoc to either add compression support or display a warning message to users upon installation.
Comment 7 Alexis 2022-03-30 01:19:22 UTC
(In reply to Mike Gilbert from comment #6)
> (In reply to Alexis from comment #5)
> 
> The vast majority of Gentoo users use sys-apps/man-db, which supports
> several different compression formats.
> 
> Perhaps you should file a bug against app-text/mandoc to either add
> compression support or display a warning message to users upon installation.

Yes, i assumed that the majority of Gentoo users use man-db. i guess i'm coming from the perspective: if Gentoo is providing `system-man` as a USE flag for `mandoc`, then surely from a QA point of view, the use of that flag should not result in a broken system?

It turns out adding bzip2 support to mandoc was discussed on the mandoc-discuss list in 2020, in the context of Gentoo support for mandoc; the start of the thread is here:

https://marc.info/?l=mandoc-discuss&m=160666427213578&w=2

The reply from Ingo Schwarze, the lead dev, is here:

https://marc.info/?l=mandoc-discuss&m=160668087317110&w=2

Excerpts from that:

> Compressing manual pages makes absolutely no sense to me in 2020.
>
> ...
>
> It would be easier to convince me to delete *.gz support than to
> add *.bz2 support.  Either is unrelated to the purpose of mandoc
> and linking to zlib is ugly and messy, i'd love to get rid of the
> dependency and of the rather ugly and error-prone code needed to
> support *.gz files.
> 
> Regarding *.bz2 in particular, i see very little chance to get
> /usr/lib/libbz2.so.*.* added to the OpenBSD base system.
>
> Regarding options, i think options are evil in general.  If you
> want an option, you need very good arguments why the feature is
> useful to such an unusual degree that it justifies an option.
> Here, i don't see any benefit at all to offset the cost in
> dependencies, complexity, and maintenance.

However, the above led me to actually test whether mandoc still has gzip support, and gzip'ing a man page followed by running `mandoc -a` on the result suggests it indeed does. If man pages are to be compressed, is there any possibility of changing the default from bzip2 to gzip?
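The gzip check described above could be repeated along these lines. This is a sketch only: the helper name `gzip_copy` is hypothetical, it expects an uncompressed page as input, and it works on a scratch copy so the installed page is left alone.

```shell
# Make a gzip-compressed copy of an uncompressed man page in a scratch
# directory and print the path of the copy. The original file is untouched.
gzip_copy() {
    page=$1
    scratch=$(mktemp -d) || return 1
    gzip -c "$page" > "$scratch/$(basename "$page").gz" || return 1
    printf '%s\n' "$scratch/$(basename "$page").gz"
}

# Usage (requires mandoc; ./page.1 is a placeholder for any uncompressed page):
#   mandoc -a "$(gzip_copy ./page.1)"
```

If mandoc renders the page from the .gz copy, gzip support is present in the installed build.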
Comment 8 Mike Gilbert gentoo-dev 2022-03-30 01:40:19 UTC
(In reply to Alexis from comment #7)
> Yes, i assumed that the majority of Gentoo users use man-db. i guess i'm
> coming from the perspective: if Gentoo is providing `system-man` as a USE
> flag for `mandoc`, then surely from a QA point of view, the use of that flag
> should not result in a broken system?

Giving users the power to screw up their system is pretty normal for Gentoo. Sometimes it isn't possible to fully support every option available. We do usually try to give ample warning though.

If the "system-man" USE flag on mandoc is "unsafe", we could always mask or remove the USE flag.
Comment 9 Alexis 2022-03-30 01:58:20 UTC
(In reply to Mike Gilbert from comment #8)

> If the "system-man" USE flag on mandoc is "unsafe", we could always mask or
> remove the USE flag.

Fwiw, masking seems to me like a reasonable option in this case; it keeps the option available for those who really want it (such as myself), but it does at least convey "here [might] be dragons".
Comment 10 Ulrich Müller gentoo-dev 2022-03-30 06:18:47 UTC
(In reply to Alexis from comment #7)
> However, the above led me to actually test whether mandoc still has gzip
> support, and gzip'ing a man page followed by running `mandoc -a` on the
> result suggests it indeed does. If man pages are to be compressed, is there any
> possibility of changing the default from bzip2 to gzip?

PORTAGE_COMPRESS="gzip" should do that.
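As a make.conf fragment, that suggestion would look roughly like the following. It is a sketch of a per-installation setting, not a change to the Gentoo-wide default, and it would only apply to packages merged after the setting is in place.

```shell
# /etc/portage/make.conf (fragment)
# Compress installed documentation and man pages with gzip instead of bzip2.
PORTAGE_COMPRESS="gzip"
```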


(In reply to Alexis from comment #5)
> @Ulrich, thanks for pointing out PORTAGE_COMPRESS_EXCLUDE_SUFFIXES;
> presumably i'd set that to something like "[1-9]"? That certainly removes
> the 'sledgehammer' aspect, though not the cost(s) i mention in the preceding
> para.

Yes, adding "[1-9] n [013]p" should do that. (You may want to keep the default, see PORTAGE_COMPRESS_EXCLUDE_SUFFIXES in make.conf(5).)
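A sketch of what that could look like in make.conf. The leading suffixes (css through png) are assumed here to match the shipped default and should be checked against make.conf(5) on the system in question; only the last three entries come from the suggestion above. They are read here as covering man sections 1 to 9, section n, and the POSIX sections 0p, 1p and 3p.

```shell
# /etc/portage/make.conf (fragment)
# Keep the usual exclusions (assumed default, verify in make.conf(5)) and
# additionally exclude man page suffixes from compression.
PORTAGE_COMPRESS_EXCLUDE_SUFFIXES="css gif htm[l]? jp[e]?g js pdf png [1-9] n [013]p"
```

As with PORTAGE_COMPRESS, this would only affect packages merged after the setting is in place; pages already installed in compressed form stay compressed.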


On a general note, I believe it is unlikely that we will change the default of the docompress inclusion list (see PMS section 12.3.11, https://projects.gentoo.org/pms/8/pms.html#x1-13100012.3.11), because this is the ebuild default. The package manager may choose to override it depending on the user's settings (like PORTAGE_COMPRESS or PORTAGE_COMPRESS_EXCLUDE_SUFFIXES). Removing /usr/share/man from the list would make compression of these files impossible, and there are users who are happy with the current default.

The discussion makes me wonder though why this isn't user-configurable in Portage in an easier way? Having a variable like COMPRESS_MASK similar to INSTALL_MASK would help with the problem at hand.

CCing Portage team.
Comment 11 Alexis 2022-03-30 07:23:21 UTC
(In reply to Ulrich Müller from comment #10)
> > If man pages are to be compressed, is there any
> > possibility of changing the default from bzip2 to gzip?
> 
> PORTAGE_COMPRESS="gzip" should do that.

Well, yes, for a particular Gentoo installation, and possibly well after the installation of a number of man pages that are bzip2-compressed, which would again result in existing man pages still being unusable once `system-man` is set for `mandoc`. However, by "the default", i meant, "the default used by Gentoo in the absence of any setting of PORTAGE_COMPRESS".

My primary concern at this point is that users aren't left with unusable man pages as a result of enabling `system-man` for `mandoc`. What i'm getting from this discussion is that the most that might be done is to mask this USE flag by default, so as to slightly raise the bar for users to get into the "unusable man pages" situation; and that, regardless, i should add a prominent warning to the Mandoc wiki page, noting the immediate consequences of changing this flag in the absence of the PORTAGE_COMPRESS* variables having already been changed from the defaults (i.e. prior to any installation of man pages on the system).

> Yes, adding "[1-9] n [013]p" should do that.

Okay, thanks, i'll update the Mandoc wiki page.
Comment 12 Alejandro Colomar 2023-04-09 12:22:51 UTC
man-db's man(1) seems to be extremely inefficient with .bz2 pages.
I suggest you reconsider :)

<https://lore.kernel.org/linux-man/53b0f991-7187-07ed-b2f8-4b6d8d7ffc3a@gmail.com/T/#m5e9037f5c562069a8945e346560a7ee8131df6be>

See the tests performed there, and consider running them on your own machine to confirm.