Bug 785064 - break out locales from sys-libs/glibc to a separate package
Summary: break out locales from sys-libs/glibc to a separate package
Status: RESOLVED WONTFIX
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-22 13:34 UTC by Joakim Tjernlund
Modified: 2021-05-15 16:11 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Description Joakim Tjernlund 2021-04-22 13:34:27 UTC
glibc's locale part should be split out into a separate pkg.
All those locales take a lot of space and are only source input for localedef.

A new locale pkg in minimal form should just install the selected locales into
locale-archive.

Wonder if gconv/iconv could be separate/USE flag too?
Comment 1 Sergei Trofimovich (RETIRED) gentoo-dev 2021-04-22 13:55:15 UTC
Locale archives are binaries that depend on the glibc version. Splitting locales out into a separate package would introduce a race window in which the system has a new glibc and old (potentially incompatible) locales. glibc used to SIGSEGV on outdated locales.

You can avoid building all the locales by populating /etc/locale.gen with a needed set.
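As a rough illustration of the suggestion above, a minimal locale list might look like the sketch below. The variable LOCALE_GEN and the scratch file it defaults to are assumptions made for this example so that it can be tried without touching /etc; the two locale entries are only examples.

```shell
# Sketch only: write a minimal locale list to a scratch file.
# LOCALE_GEN is a hypothetical variable, not something portage or glibc reads.
LOCALE_GEN="${LOCALE_GEN:-$(mktemp)}"

cat > "${LOCALE_GEN}" <<'EOF'
# One "<locale> <charmap>" pair per line; only these get generated.
en_GB.UTF-8 UTF-8
en_US.UTF-8 UTF-8
EOF

# Once the real /etc/locale.gen looks like this, regenerate and inspect (as root):
#   locale-gen
#   locale -a
```

This only limits which locales are compiled; the source files under /usr/share/i18n stay installed.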
Comment 2 Joakim Tjernlund 2021-04-22 17:56:26 UTC
(In reply to Sergei Trofimovich from comment #1)
> Locale archives are binaries that are dependent on glibc version. Splitting
> locales out to a separate package will introduce the race window when system
> has new glibc and old (potentially incompatible) locales. glibc used to
> SIGSEGV on outdated locales.

hmm, how long ago was SEGV? Make sure versions match with subslot or similar?  

> 
> You can avoid building all the locales by populating /etc/locale.gen with a
> needed set.

That still takes up a huge amount of disk space
Comment 3 SpanKY gentoo-dev 2021-04-22 19:02:56 UTC
the number of people who want to strip gconv/iconv & locales from their system is quite low.  certainly low enough to not justify the ebuild maintenance cost & stability risk for users.  the vast majority of people will want proper locale & encoding support in their systems.

if you're trying to create an extremely small embedded system, use INSTALL_MASK to strip out the files you don't want.
Comment 4 Joakim Tjernlund 2021-04-22 19:26:27 UTC
(In reply to SpanKY from comment #3)
> the number of people who want to strip gconv/iconv & locales from their
> system is quite low.  certainly low enough to not justify the ebuild
> maintenance cost & stability risk for users.  the vast majority of people
> will want proper locale & encoding support in their systems.

This is more about installing just the locales you need, not removing all locales.
The i18n dir is basically a waste of space (it is source code for building all
possible locales) after locale-gen has run. This is the space it takes on my system:
du -sh /usr/share/i18n/*
3,4M	/usr/share/i18n/charmaps
13M	/usr/share/i18n/locales
8,0K	/usr/share/i18n/SUPPORTED

> 
> if you're trying to create an extremely small embedded system, use
> INSTALL_MASK to strip out the files you don't want.

Furthermore, I tried INSTALL_MASK, but it only reduces the amount of source and is
very tedious to get right: the source files include/copy other files, so you end
up having to keep more source than you would expect.

locale-gen/localedef is to locales as gcc is to code.

One could consider a new USE flag for glibc that lets you specify which
locales you want and that, together with USE=compile-locales, would install
just the selected locales, without any source in i18n.
Comment 5 SpanKY gentoo-dev 2021-04-22 19:41:33 UTC
sorry, but i'm still not seeing the trade-off here being worth it.  i maintain that the number of people who care about this is significantly low, and those handful of people already have the escape hatch (INSTALL_MASK) to delete the things they really don't want.

if you want to pursue things upstream, there's a lot you could get done for a much wider audience.
* make localedef tooling a lot nicer wrt cross-compile & sysroot
* get commitments on stable/versioned archive formats
* optimize the install format of the various locales & charmaps

having a sep ebuild would require a circular dependency which portage does not handle well.  the runtime glibc & locale files must be kept in sync.  just because you can find a few versions that happen to work (i.e. not segfault) is irrelevant: we operate on the guarantees glibc provides, not on our wishful thinking.
Comment 6 Sergei Trofimovich (RETIRED) gentoo-dev 2021-04-22 19:58:15 UTC
(In reply to Joakim Tjernlund from comment #2)
> (In reply to Sergei Trofimovich from comment #1)
> > Locale archives are binaries that are dependent on glibc version. Splitting
> > locales out to a separate package will introduce the race window when system
> > has new glibc and old (potentially incompatible) locales. glibc used to
> > SIGSEGV on outdated locales.
> 
> hmm, how long ago was SEGV?

https://bugs.gentoo.org/674338#c5 is one example. What matters is ABI guarantees for future versions. I don't think we have them.

> Make sure versions match with subslot or similar?  

A subslot does not fix the problem: there is still a time window in which the new glibc is installed and the matching locales are not installed yet. portage does not have a way to express "build these two packages in sequence as close as possible".

> > 
> > You can avoid building all the locales by populating /etc/locale.gen with a
> > needed set.
> 
> That still takes up a huge amount of disk space

What numbers are we talking about here? glibc is not a small package even without locales. If you need better precision you might need to pick files manually anyway.
Comment 7 Joakim Tjernlund 2021-04-22 21:37:58 UTC
(In reply to SpanKY from comment #5)
> sorry, but i'm still not seeing the trade-off here being worth it.  i
> maintain that the number of people who care about this is significantly low,
> and those handful of people already have the escape hatch (INSTALL_MASK) to
> delete the things they really don't want.

This can be said about many things and INSTALL_MASK only solves part of the
problem and is tricky to calculate.

> 
> if you want to pursue things upstream, there's a lot you could get done for
> a much wider audience.
> * make localedef tooling a lot nicer wrt cross-compile & sysroot

I think localedef already provides this.

> * get commitments on stable/versioned archive formats
> * optimize the install format of the various locales & charmaps

None of that would address this issue.

> 
> having a sep ebuild would require a circular dependency which portage does
> not handle well.  the runtime glibc & locale files must be kept in sync. 
> just because you can find a few versions that happen to work (i.e. not
> segfault) is irrelevant: we operate on the guarantees glibc provides, not on
> our wishful thinking.

OK, so separate pkg is a bit tricky but USE flags should work.

I got an idea: USE=compile-locales with the locales I want in /etc/locale.gen (en_GB.UTF-8 UTF-8), combined with PKG_INSTALL_MASK="${PKG_INSTALL_MASK} /usr/share/i18n/charmaps/* /usr/share/i18n/locales/*" (and the same for INSTALL_MASK), should work?

It didn't quite work out: I just got the C and POSIX locales (no UTF-8 either).
Comment 8 Joakim Tjernlund 2021-04-22 22:08:43 UTC
(In reply to Joakim Tjernlund from comment #7)
> (In reply to SpanKY from comment #5)

> 
> I got an idea, USE=compile-locales with the locales I want in
> /etc/locale.gen (en_GB.UTF-8 UTF-8) with
> PKG_INSTALL_MASK="${PKG_INSTALL_MASK}
>  /usr/share/i18n/charmaps/* /usr/share/i18n/locales/*" (same for
> INSTALL_MASK) should work ?
> 
> It didn't quite work out, I just got locales C and POSIX (no UTF-8 either)

hmm, USE=compile-locales runs in src_install, not src_compile as I would have guessed. Has PKG_INSTALL_MASK already removed the files by the time locale-gen runs?
Comment 9 Joakim Tjernlund 2021-04-22 22:42:35 UTC
(In reply to Joakim Tjernlund from comment #8)
> (In reply to Joakim Tjernlund from comment #7)
> > (In reply to SpanKY from comment #5)
> 
> > 
> > I got an idea, USE=compile-locales with the locales I want in
> > /etc/locale.gen (en_GB.UTF-8 UTF-8) with
> > PKG_INSTALL_MASK="${PKG_INSTALL_MASK}
> >  /usr/share/i18n/charmaps/* /usr/share/i18n/locales/*" (same for
> > INSTALL_MASK) should work ?
> > 
> > It didn't quite work out, I just got locales C and POSIX (no UTF-8 either)
> 
> hmm, USE=compile-locales runs in src_install, not src_compile as I would
> have guessed. Has PKG_INSTALL_MASK already removed the files when locale
> runs?

no, I think there are 2 bugs:
1) The host is searched for the input source, not the build dir.
2) The ebuild looks for locale.gen in the build dir, not on the host.

I figure the problem is here somewhere:
run_locale_gen() {
	# if the host locales.gen contains no entries, we'll install everything
	local root="$1"
	local inplace=""

	if [[ "${root}" == "--inplace-glibc" ]] ; then
		inplace="--inplace-glibc"
		root="$2"
	fi

	local locale_list="${root}/etc/locale.gen"

	pushd "${ED}"/$(get_libdir) >/dev/null
        ...

Not sure exactly where though. Ideas?
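
One way to narrow this down might be to check which locale.gen a given root resolves to and whether it has any active entries, since the comment in run_locale_gen says an empty list means everything gets installed. The helper below is a hypothetical diagnostic, not part of the ebuild; the name active_locales and its single root argument are assumptions for this sketch.

```shell
# Hypothetical diagnostic: print the active entries of <root>/etc/locale.gen.
# Mirrors the "${root}/etc/locale.gen" lookup from the excerpt above.
active_locales() {
	local list="${1%/}/etc/locale.gen"
	if [ ! -f "${list}" ]; then
		echo "missing: ${list}" >&2
		return 1
	fi
	# Strip comments, then drop lines that are empty or whitespace only.
	sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "${list}"
}
```

Running it once against / and once against the image directory used during the build would show whether the two roots disagree about the locale list.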
Comment 10 Joakim Tjernlund 2021-04-24 15:28:16 UTC
fix for USE=compile-locales in
https://bugs.gentoo.org/785406
Comment 11 Georgy Yakovlev archtester gentoo-dev 2021-05-15 02:17:23 UTC
just my 2 cents,
this is a tested configuration that will leave you with effectively only the en locale, and a working one.

manpage masking is a bonus.

INSTALL_MASK="
/usr/share/locale/*
-/usr/share/locale/en
-/usr/share/locale/en@IPA
-/usr/share/locale/en@boldquot
-/usr/share/locale/en@quot
-/usr/share/locale/en@shaw
-/usr/share/locale/en_GB
-/usr/share/locale/en_GB.UTF-8
-/usr/share/locale/en_US
-/usr/share/locale/en_US.UTF-8
-/usr/share/locale/locale.alias
/usr/share/man/*
-/usr/share/man/cat*
-/usr/share/man/man*
"
Comment 12 Joakim Tjernlund 2021-05-15 16:11:24 UTC
(In reply to Georgy Yakovlev from comment #11)
> just my 2 cents, 
> this is tested configuration that will leave you effectively only with en
> locale, working one.
> 
> manpage masking is a bonus.
> 
> INSTALL_MASK="
> /usr/share/locale/*
> -/usr/share/locale/en
> -/usr/share/locale/en@IPA
> -/usr/share/locale/en@boldquot
> -/usr/share/locale/en@quot
> -/usr/share/locale/en@shaw
> -/usr/share/locale/en_GB
> -/usr/share/locale/en_GB.UTF-8
> -/usr/share/locale/en_US
> -/usr/share/locale/en_US.UTF-8
> -/usr/share/locale/locale.alias
> /usr/share/man/*
> -/usr/share/man/cat*
> -/usr/share/man/man*
> "

It is not that simple in general: you have /usr/share/i18n/{charmaps,locales} too.
Also, /usr/share/i18n/locales/en_GB has:
copy "i18n"
copy "iso14651_t1"
copy "en_US"
copy "i18n"
copy "i18n"
copy "en_US"
include "translit_combining";

So your mask is incomplete and still has locale source in the tree.
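
To get a feel for how far the copy and include directives quoted above reach, the dependencies can be walked recursively. This is a minimal sketch under stated assumptions: the function name locale_deps and the variables I18N_DIR and SEEN are made up for this example, the default directory is taken from the du output earlier in this bug, and charmap dependencies are not covered.

```shell
# Hypothetical helper: list a locale source file plus everything it pulls in
# through copy "name" or include "name" directives.
I18N_DIR="${I18N_DIR:-/usr/share/i18n/locales}"
SEEN=""

locale_deps() {
	local name="$1"
	local dep
	# Skip names already visited, to avoid duplicates and endless loops.
	case " ${SEEN} " in
		*" ${name} "*) return 0 ;;
	esac
	SEEN="${SEEN} ${name}"
	echo "${name}"
	# A referenced file that is not present is still reported, just not followed.
	[ -f "${I18N_DIR}/${name}" ] || return 0
	# Pull the quoted name out of each copy/include line and recurse into it.
	for dep in $(sed -n -E 's/^[[:space:]]*(copy|include)[[:space:]]*"([^"]*)".*/\2/p' "${I18N_DIR}/${name}"); do
		locale_deps "${dep}"
	done
}
```

Calling `locale_deps en_GB` prints one file name per line; every name in that list would have to be excluded from an INSTALL_MASK for localedef to still be able to build en_GB.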

Anyhow, I found a compromise for now using USE=compile-locales, but compile-locales
is a bit buggy, as described in https://bugs.gentoo.org/785406 (patch included, hint, hint :)