Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 721088

Summary: [Future EAPI] Don't export A
Product: Gentoo Hosted Projects Reporter: Ulrich Müller <ulm>
Component: PMS/EAPIAssignee: PMS/EAPI <pms>
Status: CONFIRMED ---    
Severity: normal CC: esigra, flow, mattst88, mgorny, sam
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=719202
https://bugs.gentoo.org/show_bug.cgi?id=720180
https://bugs.gentoo.org/show_bug.cgi?id=833567
https://bugs.gentoo.org/show_bug.cgi?id=127560
https://bugs.gentoo.org/show_bug.cgi?id=540146
Whiteboard: distant-future-eapi
Package list:
Runtime testing required: ---
Bug Depends on: 830854    
Bug Blocks: 174380, 830187    

Description Ulrich Müller gentoo-dev 2020-05-05 13:26:22 UTC
Variables exported to the environment are limited to MAX_ARG_STRLEN of Linux, which is 128 KiB. In the case of A this can be a problem for ebuilds with many distfiles.
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2020-05-05 13:48:41 UTC
We should probably make a clear definitive list of which variables are exported and which are not.  We could also kill A altogether, I suppose it's only needed as a convenience variable for default_src_unpack.
Comment 2 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2020-05-05 18:37:17 UTC
Let me expand on this.  In my opinion, having A as 'defined but not exported' is against the principle of the least surprise.  This is something people will simply miss or get wrong.  Someone implementing a PM on top of this is most likely not to care much whether A is exported or not, and fall into the same trap.

In the end, changing this doesn't really solve the problem.  It merely shifts the claim from 'ebuild is making unreasonable assumptions' to 'package manager author did not read our fine print'.  The end result is the same -- instead of making it easier to write compatible and reliable PMs, PMS just serves as an excuse to justify whatever implementation Portage achieved after working around hundred of issues.

In my opinion, this isn't worth the effort.  What we should do instead is set a tree policy preventing this issue from happening again.  We can change PMS to indicate that exporting it is *optional* but saying it *must not* be exported is just asking people to stand head down.
Comment 3 Zac Medico gentoo-dev 2020-05-26 04:59:42 UTC
(In reply to Michał Górny from comment #1)
> We could also kill A altogether, I suppose it's
> only needed as a convenience variable for default_src_unpack.

If we specify that DISTDIR is supposed to contain exactly those files which should be unpacked, then that suffices, and allows us to avoid the need to specify an unexported variable.
Comment 4 Ulrich Müller gentoo-dev 2020-05-26 07:05:07 UTC
(In reply to Zac Medico from comment #3)
> (In reply to Michał Górny from comment #1)
> > We could also kill A altogether, I suppose it's
> > only needed as a convenience variable for default_src_unpack.
> 
> If we specify that DISTDIR is supposed to contain exactly those files which
> should be unpacked, then that suffices, and allows us to avoid the need to
> specify an unexported variable.

unpack ${A} is very common and very convenient. So I guess if the package manager won't define A any longer, then ebuilds would either start defining it themselves, or do horrible things like globbing inside ${DISTDIR}.
Comment 5 Ulrich Müller gentoo-dev 2021-06-05 10:10:40 UTC
(In reply to Michał Górny from comment #2)

Closing.
Comment 6 Zac Medico gentoo-dev 2021-11-23 00:33:53 UTC
I move that we reconsider this bug, in the interest of long-term scalability. There's a working implementation in the patch for bug 720180, and having that, we need only add a new EAPI which does not export A.
Comment 7 Ulrich Müller gentoo-dev 2022-01-03 06:02:11 UTC
Maybe we should have a tree policy limiting the number of distfiles an ebuild can have? If it's several hundred, then it should be repacked?
Comment 8 Ulrich Müller gentoo-dev 2022-01-08 12:29:03 UTC
As discussed in #gentoo-pms, it is unlikely that we go for a special solution that just unexports A. Instead, we shall reconsider the whole list of variables defined in section 11.1:
https://projects.gentoo.org/pms/8/pms.html#x1-10900011.1

One proposal would be to make exporting of all variables listed in that table optional, except HOME, TMPDIR, and possibly ROOT.
Comment 9 Zac Medico gentoo-dev 2022-02-17 21:01:38 UTC
Ultimately, manifest bloat will become a show stopper here, which leads to bug 833567 as an alternative solution.
Comment 10 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2023-02-24 04:56:46 UTC
Env variables of single values: sure
Env variables of large lists: let's avoid this entirely.

As a general solution, instead of ${x}, what if we introduce ${x_ref} which is the name of a file containing all of the entries, one per line.

Also provides clarity for the terrible upstreams that want to use spaces in files.
Comment 11 Ulrich Müller gentoo-dev 2023-02-24 08:43:50 UTC
(In reply to Robin Johnson from comment #10)
> Also provides clarity for the terrible upstreams that want to use spaces in
> files.

AFAICS we cannot have whitespace in filenames for multiple reasons:
- SRC_URI is whitespace-separated
- A space in an URI would have to be escaped by %20, and these files would be downloaded under there literal name
- Manifest fields are separated by spaces, and Portage doesn't support the optional encoding from GLEP 74 (i.e. no \x20 or \u0020)

Also not sure if we would want support for this. SRC_URI arrows exist, so files can be renamed to something sane.
Comment 12 Ulrich Müller gentoo-dev 2023-04-27 08:16:16 UTC
I believe that the variables listed in table 11.1 can be categorised into four main groups:

(1) "Directory type" variables (some of these, like ROOT or EPREFIX, aren't directories themselves, but they are combined with other components to form a directory path):
DISTDIR, WORKDIR, S, ROOT, EROOT, SYSROOT, ESYSROOT, BROOT, T, TMPDIR, HOME, EPREFIX, D, ED

(2) Variables describing the package name and version:
P, PF, PN, CATEGORY, PV, PR, PVR

(3) All others in EAPI 8:
A, USE, EBUILD_PHASE, EBUILD_PHASE_FUNC, MERGE_TYPE, REPLACING_VERSIONS, REPLACED_BY_VERSION

(4) Legacy variables that existed in previous EAPIs:
PORTDIR, ECLASSDIR, DESTTREE, INSDESTTREE, AA, KV

Some of the variables from (1) are used by external commands (e.g. TMPDIR, HOME, ROOT), others are used by ebuild helpers (e.g. D, ED, EPREFIX, T). Presumably we would want to stay consistent there and also follow the KISS principle, i.e. keep exporting all of them.

From group (2) most notably PF is used by dodoc. PN and CATEGORY are used by keepdir. Not entirely sure what to do there, maybe the PM could export __PF as internal variable, or dodoc could get PF via IPC (or is that a crazy idea)?

AFAICS there's no need to export any variable from (3).
Comment 13 Ulrich Müller gentoo-dev 2023-04-27 09:01:35 UTC
(In reply to Ulrich Müller from comment #12)
Alternatively, we could go for the all-out approach, and stop exporting everything except TMPDIR and HOME (not sure about ROOT).

D, T, PF etc. would be retained by the PM as "conceptual variables" (we have that for DESTTREE and INSDESTTREE already). How the PM would pass them to helpers like dodoc would be left to the implementation. (For example, the PM could export a subset, or could write them to a file that would be sourced.)
Comment 14 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2023-04-27 12:56:03 UTC
Unfortunately, "leaving to the implementation" leaves us where we are now, i.e. you have to assume that they may be exported.
Comment 15 Ulrich Müller gentoo-dev 2023-04-27 16:01:34 UTC
(In reply to Michał Górny from comment #14)
> Unfortunately, "leaving to the implementation" leaves us where we are now,
> i.e. you have to assume that they may be exported.

By "implementation defined" I meant either exporting a subset under reserved names (like __E_PF etc.) or whatever other mechanism is suitable.

Portage already does this for conceptual variables like DESTTREE (which is exported as _E_DESTTREE).
Comment 16 Florian Schmaus gentoo-dev 2024-03-26 15:38:12 UTC
(In reply to Robin Johnson from comment #10)
> As a general solution, instead of ${x}, what if we introduce ${x_ref} which
> is the name of a file containing all of the entries, one per line.

I think this is very sensible and a common pattern to use when a list of something (typically files) grows too large to be passed reliably command-line argument (or via the process' environment). For example, javac has @filename arguments, which point javac to a file containing a list of files.

Ideally the title of this bug should read
   
  [Future EAPI] Don't export A, instead export ARCHIVES_FILE pointing to a file containing all entries of A

Once we have ARCHIVES_FILE - I leave the name of the variable up to future bikeshedding - the default implementation of src_unpack() in a future EAPI could be simply

__eapi9_src_unpack() {
  mapfile A "${ARCHIVES_FILE}"
  unpack "${A}"
}
Comment 17 Ulrich Müller gentoo-dev 2024-03-26 16:21:40 UTC
(In reply to Florian Schmaus from comment #16)
> __eapi9_src_unpack() {
>   mapfile A "${ARCHIVES_FILE}"
>   unpack "${A}"
> }

The problem is that the unpack command may run into a related problem depending on ARG_MAX of the system, i.e. we cannot assume unlimited length of command line arguments.
Comment 18 Ulrich Müller gentoo-dev 2024-03-26 16:22:55 UTC
(In reply to Florian Schmaus from comment #16)
>   unpack "${A}"

Also, should be ${A}, not "${A}".
Comment 19 Florian Schmaus gentoo-dev 2024-03-26 18:05:35 UTC
(In reply to Ulrich Müller from comment #17)
> (In reply to Florian Schmaus from comment #16)
> > __eapi9_src_unpack() {
> >   mapfile A "${ARCHIVES_FILE}"
> >   unpack "${A}"
> > }
> 
> The problem is that the unpack command may run into a related problem
> depending on ARG_MAX of the system, i.e. we cannot assume unlimited length
> of command line arguments.

Right, however, this is only problematic if `unpack` is an external binary as opposed to a shell function (which it currently is). And it seems no like something we can not address. The obvious solutions include
- enforcing that `unpack` is a shell function
- define a new command (unpack_archives) or extend unpack to read the files to unpack from a file

Actually, the latter, here shown re-use the '@' notation already used elsewhere for such cases, seems pretty nice:

__eapi9_src_unpack() {
  unpack @"${ARCHIVES_FILE}"
}
Comment 20 Ulrich Müller gentoo-dev 2024-03-26 18:23:09 UTC
(In reply to Florian Schmaus from comment #19)
> Right, however, this is only problematic if `unpack` is an external binary
> as opposed to a shell function (which it currently is).

It isn't: "Except where otherwise noted, they may be internal (shell functions or aliases) or external commands available in PATH; where this is not specified, ebuilds may not rely upon either behaviour."
https://projects.gentoo.org/pms/8/pms.html#x1-12000012.3
Comment 21 Florian Schmaus gentoo-dev 2024-03-26 18:44:27 UTC
(In reply to Ulrich Müller from comment #20)
> (In reply to Florian Schmaus from comment #19)
> > Right, however, this is only problematic if `unpack` is an external binary
> > as opposed to a shell function (which it currently is).
> 
> It isn't: "Except where otherwise noted, they may be internal (shell
> functions or aliases) or external commands available in PATH; where this is
> not specified, ebuilds may not rely upon either behaviour."
> https://projects.gentoo.org/pms/8/pms.html#x1-12000012.3

I am sorry, I was not clear.

Yes, PMS allows `unpack` to be anything that can be invoked via bash.

What I meant is that `unpack` is currently implemented as shell function in portage (unless I am mistaken) and that one option to avoid the large process environment when spawning unpack would be to mandate by PMS that `unpack` has to be a shell function in future EAPIs.

That said, that would not be my preferred way forward. An unpack-helper that takes a file, containing the archives to unpack, as argument seems more sensible.