Bug 782841 - Several package license violations
Summary: Several package license violations
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Licenses team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-14 14:35 UTC by orbea
Modified: 2023-08-08 06:30 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Description orbea 2021-04-14 14:35:38 UTC
I recently made this PR for samurai.

https://github.com/gentoo/gentoo/pull/20328

And to my surprise I was told, "you should never install the license file". As was kindly explained to me, Gentoo installs licenses in '/var/db/repos/gentoo/licenses/', and I later found this link.

https://devmanual.gentoo.org/general-concepts/licenses/index.html

Which claims:

> If the package sources include additional files that are neither installed nor used at build time, their license should not be listed.

However, in many (not all) cases, failing to include the license and copyright notice in their entirety with all copies of the software (the source code, the installed package, and any created binary packages) is technically a license violation.

At least in the case of samurai, which uses four distinct licenses, it is a clear violation of all four.

https://github.com/michaelforney/samurai/blob/master/LICENSE

ISC (Main license)

> Permission to use, copy, modify, and/or distribute this software for any purpose
> with or without fee is hereby granted, provided that the above copyright notice
> and this permission notice appear in all copies.

Apache2 (From ninja)

> You must give any other recipients of the Work or
> Derivative Works a copy of this License; and

MIT (Two licenses from Myrddin and musl)

> The above copyright notice and this permission notice shall be included in
> all copies or substantial portions of the Software.

This includes the customized copyright notices from the ISC and two MIT licenses.

> Copyright © 2017-2020 Michael Forney

> Copyright (c) 2013 Ori Bernstein <ori@eigenstate.org>

> Copyright © 2005-2014 Rich Felker, et al.

I think the only way to ensure the license file is always available in all potential cases is to install it, as is common practice in distros like Slackware or Fedora.

There is also the point that the only way to understand that samurai uses four distinct licenses is to actually read the file, meaning that it's very useful documentation. I am sure it would be similarly useful in many other packages.
Comment 1 Ulrich Müller gentoo-dev 2021-04-14 15:10:45 UTC
Binary packages include license information (it's in the xpak's metadata), and the license itself will be available in the licenses dir of the Gentoo repository. It wouldn't make sense to install thousands of identical copies of e.g. the GPL-2.

So what is the precise scenario where we distribute a package to the user but he doesn't have access to its license? As a source-based distro, we _always_ distribute the package's source code.
Comment 2 orbea 2021-04-14 19:02:27 UTC
> It wouldn't make sense to install thousands of identical copies of e.g. the GPL-2.

In the case of samurai most of the licenses would not be an exact duplicate copy of GPL-2 or any other widely used license.

> So what is the precise scenario where we distribute a package to the user but he doesn't have access to its license? As a source-based distro, we _always_ distribute the package's source code.

It's very hard to predict what a user will do with a binary package created on Gentoo or how many other people will use it, and many binary packages could not be distributed without the license included in its entirety.

In the case of a license like GPL-2 there may be many exact duplicates out there, but at the risk of repeating myself, the important part is that the license and copyright notice must be included with all copies. Since the package is a copy and the source directory is often not persistent, I strongly suggest the license file be explicitly included in the package to avoid any possible issues. Simply including the name of the license is insufficient.
Comment 3 Ulrich Müller gentoo-dev 2021-04-14 21:51:03 UTC
(In reply to orbea from comment #2)
> It's very hard to predict what a user will do with a binary package created on
> Gentoo or how many other people will use it, and many binary packages could
> not be distributed without the license included in its entirety.

Right, and there are copyleft licenses that require distribution of the source (or an offer to that end). We have no control over the actions of our users, and it's their responsibility to follow license terms when they redistribute binary packages without the corresponding source.

> package is a copy and the source directory is often not persistent I
> [...] I strongly suggest the license file be explicitly included in the
> package to avoid any possible issues. Simply including the name of the
> license is insufficient.

It is indeed. That's why the license file is available on every Gentoo system in the above-mentioned location.

BTW, Debian does things in a similar way. For example, you won't find the text of the GPL in their packages, but only the following notice in their "copyright" file: "On Debian systems, the complete text of the GNU General Public License can be found in `/usr/share/common-licenses/GPL-3'."
Comment 4 orbea 2021-04-17 14:03:21 UTC
> BTW, Debian does things in a similar way.

This seems entirely untrue.

I picked an arbitrary Debian package.

http://ftp.br.debian.org/debian/pool/main/p/pkgconf/libpkgconf-dev_1.7.4~git20210206+dcf529b-3_amd64.deb

Extracted it with 'ar x' and inside 'data.tar.xz' I spot:

./usr/share/doc/libpkgconf-dev/copyright

This file has every conceivable license pkgconf could be licensed under.

----

To be entirely honest, and without intending offense, I made this issue upon request and I don't have the energy to explain what is so wrong about what Gentoo is doing, but it's outright embarrassing.

I don't really care to fix the world, but can it at least be done in any package that lists my e-mail as one of the maintainers (samurai)?
Comment 5 Niklāvs Koļesņikovs 2021-04-17 15:56:11 UTC
IANAL but I think there are two distinct aspects here:
1) The LICENSE text (body) must be included for which the Portage's licenses directory might suffice but don't quote me on that.
2) The Copyright notice with years and authors, as each project lists them, must be reproduced. Many/most projects will have an About dialogue or a --help/--version switch that hopefully takes care of it but other things such as the fluid-soundfont probably need to explicitly provide the original license file to not violate this.

Furthermore the user needs to know that if they distribute the files further (such as binpkgs?) or for some data files even if they made something new that contains a substantial portion of the original, they need to include both the license and the copyright of the used data files (usually granting the recipient the same rights of distribution and re-use they had but in case of non-GPL with modified source code being optional).

Input from an actual lawyer greatly missed.
Comment 6 Niklāvs Koļesņikovs 2021-04-17 16:01:12 UTC
Special care must be taken when distributing pre-built libraries or applications with static linking - the former probably need an accompanying copyright/licence and the latter must reproduce not just their own but also all statically linked libraries' copyright notices.
Comment 7 Alec Warner (RETIRED) archtester gentoo-dev Security 2021-04-17 18:31:37 UTC
(In reply to Ulrich Müller from comment #3)
> (In reply to orbea from comment #2)
> > It's very hard to predict what a user will do with a binary package created on
> > Gentoo or how many other people will use it, and many binary packages could
> > not be distributed without the license included in its entirety.
> 
> Right, and there are copyleft licenses that require distribution of the
> source (or an offer to that end). We have no control over the actions of our
> users, and it's their responsibility to follow license terms when they
> redistribute binary packages without the corresponding source.

As another example, imagine I built samurai, then copied only the binaries to another machine (with rsync, let's say) so I don't have the COPYRIGHT file. Is this a license violation? Probably. Is there anything Gentoo can do about it? Probably not explicitly. We could put the COPYRIGHT files in binary packages and encourage people to use them (but see below).

> 
> > package is a copy and the source directory is often not persistent I
> > [...] I strongly suggest the license file be explicitly included in the
> > package to avoid any possible issues. Simply including the name of the
> > license is insufficient.
> 
> It is indeed. That's why the license file is available on every Gentoo
> system in the above-mentioned location.

The implication here then is that anyone who is distributing built Gentoo images without a copy of the tree inside (containing all of the licenses) has a high probability of violating one or more licenses? I tend to agree with OP at a high level, this coupling of licenses with the repo is unexpected.

Do we ever remove licenses from the licenses/ dir?

Briefly glancing at the git log, it appears we do this frequently: c759f4e2403444be9bf0ec8231968dba3e7ddd56, 806dd927b589a2bb99f5ebd978fe57478a9a7227, 6b86b5a912dbd6bbe8352a4223d92b3daf38c3a3

How does this work when I have a package merged to my livefs, the package is removed from ::gentoo, and then the license is removed from ::gentoo, so it's no longer in /var/db/repos?

I think this is another example of why this coupling is bad.
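
The failure mode described here (a license name that no longer resolves to a file in the repository) can be sketched as a simple check. This is a minimal sketch under stated assumptions: `flat_license_names` and `missing_license_files` are hypothetical helpers, not Portage functions; the LICENSE value is treated as a flat, space-separated list such as the "ISC Apache-2.0 MIT" quoted for samurai-1.2 elsewhere in this bug; and how the value is obtained for an installed package is left out.

```python
import tempfile
from pathlib import Path


def flat_license_names(license_value):
    """Split a flat LICENSE value such as "ISC Apache-2.0 MIT" into names.

    Assumption: no USE-conditional groups and no "|| ( ... )" alternatives.
    Real ebuild LICENSE strings can contain both and would need a parser.
    """
    return license_value.split()


def missing_license_files(license_names, licenses_dir):
    """Return the license names that have no file in `licenses_dir`."""
    licenses_dir = Path(licenses_dir)
    return sorted(
        name for name in set(license_names)
        if not (licenses_dir / name).is_file()
    )


if __name__ == "__main__":
    # Stand-in for /var/db/repos/gentoo/licenses after one license was removed.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("ISC", "MIT"):
            (Path(tmp) / name).write_text("license text\n")
        names = flat_license_names("ISC Apache-2.0 MIT")
        print(missing_license_files(names, tmp))
```

On a real system the second argument would be the licenses directory of the repository, and a non-empty result would mean an installed package refers to a license text that is no longer available locally.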

> 
> BTW, Debian does things in a similar way. For example, you won't find the
> text of the GPL in their packages, but only the following notice in their
> "copyright" file: "On Debian systems, the complete text of the GNU General
> Public License can be found in `/usr/share/common-licenses/GPL-3'."

I think this makes sense for common licenses (it's unlikely you will build a system that doesn't use GPL-2, for example) but it makes no sense for specific licenses that are not shared:

antarus@antarus-h8-1437c:/usr/share/common-licenses$ ls
Apache-2.0  Artistic  BSD  CC0-1.0  GFDL  GFDL-1.2  GFDL-1.3  GPL  GPL-1  GPL-2  GPL-3  LGPL  LGPL-2  LGPL-2.1  LGPL-3  MPL-1.1  MPL-2.0

Clearly most of the licenses are elsewhere as the OP notes earlier in the conversation.

-A
Comment 8 Ulrich Müller gentoo-dev 2021-04-17 19:24:46 UTC
(In reply to Niklāvs Koļesņikovs from comment #5)
> IANAL but I think there are two distinct aspects here:
> 1) The LICENSE text (body) must be included for which the Portage's licenses
> directory might suffice but don't quote me on that.
> 2) The Copyright notice with years and authors, as each project lists them,
> must be reproduced. Many/most projects will have an About dialogue or a
> --help/--version switch that hopefully takes care of it but other things
> such as the fluid-soundfont probably need to explicitly provide the original
> license file to not violate this.

Note that for 2) it is not sufficient to include the license file in the image of a binary package, because the source files may have different copyright headers (different years or different authors).


(In reply to Alec Warner from comment #7)
> (In reply to Ulrich Müller from comment #3)
> > Right, and there are copyleft licenses that require distribution of the
> > source (or an offer to that end). We have no control over the actions of our
> > users, and it's their responsibility to follow license terms when they
> > redistribute binary packages without the corresponding source.
> 
> As another example, imagine I built samurai, then copied only the binaries
> to another machine (with rsync, let's say) so I don't have the COPYRIGHT file.
> Is this a license violation? Probably. Is there anything Gentoo can do about
> it? Probably not explicitly. We could put the COPYRIGHT files in binary
> packages and encourage people to use them (but see below).

See above, that won't work as a general solution. Normally you should be on the safe side if you distribute the source package along with the binary (but there are exceptions like bindist-restricted packages).

> > It is indeed. That's why the license file is available on every Gentoo
> > system in the above-mentioned location.

> The implication here then is that anyone who is distributing built Gentoo
> images without a copy of the tree inside (containing all of the licenses)
> has a high probability of violating one or more licenses? I tend to agree
> with OP at a high level, this coupling of licenses with the repo is
> unexpected.

I think that focusing on MIT/ISC/Apache-2.0 is somewhat misleading because it is a special case. In the general case, installing the original copyright/license notice with the package doesn't solve the problem. For example, you couldn't distribute a binpkg of GPL licensed software, even if it included the license text in the image.
Comment 9 Niklāvs Koļesņikovs 2021-04-17 19:50:22 UTC
Of course you can - pretty much every distribution has been doing just that for decades - GPL merely requires that whoever provided the binaries must provide the exact source code upon user's request. GPLv3 further requires providing tooling and documentation to avoid cases where source code is only half the puzzle and/or hardware refuses unsigned software.
Comment 10 Alec Warner (RETIRED) archtester gentoo-dev Security 2021-04-17 21:13:02 UTC
(In reply to Ulrich Müller from comment #8)
> (In reply to Niklāvs Koļesņikovs from comment #5)
> > IANAL but I think there are two distinct aspects here:
> > 1) The LICENSE text (body) must be included for which the Portage's licenses
> > directory might suffice but don't quote me on that.
> > 2) The Copyright notice with years and authors, as each project lists them,
> > must be reproduced. Many/most projects will have an About dialogue or a
> > --help/--version switch that hopefully takes care of it but other things
> > such as the fluid-soundfont probably need to explicitly provide the original
> > license file to not violate this.
> 
> Note that for 2) it is not sufficient to include the license file in the
> image of a binary package, because the source files may have different
> copyright headers (different years or different authors).
> 
> 
> (In reply to Alec Warner from comment #7)
> > (In reply to Ulrich Müller from comment #3)
> > > Right, and there are copyleft licenses that require distribution of the
> > > source (or an offer to that end). We have no control over the actions of our
> > > users, and it's their responsibility to follow license terms when they
> > > redistribute binary packages without the corresponding source.
> > 
> > As another example, imagine I built samurai, then copied only the binaries
> > to another machine (with rsync, let's say) so I don't have the COPYRIGHT file.
> > Is this a license violation? Probably. Is there anything Gentoo can do about
> > it? Probably not explicitly. We could put the COPYRIGHT files in binary
> > packages and encourage people to use them (but see below).
> 
> See above, that won't work as a general solution. Normally you should be on
> the safe side if you distribute the source package along with the binary
> (but there are exceptions like bindist-restricted packages).
> 
> > > It is indeed. That's why the license file is available on every Gentoo
> > > system in the above-mentioned location.
> 
> > The implication here then is that anyone who is distributing built Gentoo
> > images without a copy of the tree inside (containing all of the licenses)
> > has a high probability of violating one or more licenses? I tend to agree
> > with OP at a high level, this coupling of licenses with the repo is
> > unexpected.
> 
> I think that focusing on MIT/ISC/Apache-2.0 is somewhat misleading because
> it is a special case. In the general case, installing the original
> copyright/license notice with the package doesn't solve the problem. For
> example, you couldn't distribute a binpkg of GPL licensed software, even if
> it included the license text in the image.

Sorry, I want to avoid speaking generally (trying to solve this for every license is kind of impossible, as you alluded to earlier).

https://github.com/michaelforney/samurai/blob/master/LICENSE

Samurai appears to be Apache, MIT, and ISC licensed.

For users who install samurai from source, they receive a copy of the LICENSE file when they build the software locally; so presumably in that case this is the system working properly and meeting the license requirements.

I'm not actually aware of any other instance where a user merges a package and literally receives a copy of the source code (and thus a verbatim copy of the LICENSE file.) E.g. if I install a binpkg; the binpkg (and xpak metadata) do *not* include the above LICENSE file verbatim. Instead they include our ebuild metadata (which for samurai-1.2 is 'LICENSE="ISC Apache-2.0 MIT"')

Here your assertion is that it doesn't matter if the verbatim LICENSE file is included or not, because we also don't distribute the source code in the binpkg either, and so the binpkg is simply not sufficient to meet license requirements for most packages anyway.

Do I have a proper understanding of your reasoning?

I do have to wonder how this works for Gentoo's binary media; we don't directly distribute the source code on the media (but we do make it available.) Do our binary media meet the license requirements then? Do we distribute all the necessary copyrights? The "MIT" license for example in our repository only has generic copyright statements.

Or are we certain that we don't need to do that?
Comment 11 Ulrich Müller gentoo-dev 2021-04-18 08:58:13 UTC
(In reply to Niklāvs Koļesņikovs from comment #9)
> Of course you can - pretty much every distribution has been doing just that
> for decades - GPL merely requires that whoever provided the binaries must
> provide the exact source code upon user's request.

That's exactly what the GPL-2 (section 3) doesn't allow you to do. You must either accompany the binary with the source code, or accompany it with a written offer to give _anyone_ a physical medium with the source code. "Providing the source code upon user's request" isn't enough.


(In reply to Alec Warner from comment #10)
> Here your assertion is that it doesn't matter if the verbatim LICENSE file
> is included or not, because we also don't distribute the source code in the
> binpkg either, and so the binpkg is simply not sufficient to meet license
> requirements for most packages anyway.
> 
> Do I have a proper understanding of your reasoning?

You do. Including the verbatim license file (and copyright notice) would fulfill the requirement for some packages but not for all. Not even for all free software packages, because e.g. the GPL has stricter requirements. 

> I do have to wonder how this works for Gentoo's binary media; we don't
> directly distribute the source code on the media (but we do make it
> available.) Do our binary media meet the license requirements then? Do we
> distribute all the necessary copyrights? The "MIT" license for example in
> our repository only has generic copyright statements.

IANAL, TINLA, but I believe that we're good for the binary media offered for download. https://www.gentoo.org/downloads/ has a link to our source mirrors, so arguably we make the source available together with the binary (and nothing says that it must be in the same tarball or ISO image). Distributing physical copies of our install media may be more problematic, though.
Comment 12 Niklāvs Koļesņikovs 2021-04-18 10:54:37 UTC
I feel like we're getting deep into the legal weeds but:
GPLv3 section 6 clause b states: "...or (2) access to copy the Corresponding Source from a network server at no charge." This effectively covers any "GPLv2 or later" licensed project. For GPLv2-only projects you might be right, but I'm sure this has been debated back and forth somewhere else, so I will defer my judgment to the GPLv2 experts.

As a side observation, both v2 and v3 of GPL probably require the ebuild as well, since it's a script used to configure and control building of the software - we might need to either timestamp/checksum them for future reference or just include them in our source code packages.
Comment 13 Ulrich Müller gentoo-dev 2021-04-18 12:08:53 UTC
(In reply to Niklāvs Koļesņikovs from comment #12)
> I feel like we're getting deep into the legal weeds but:
> GPLv3 section 6 clause b states: "...or (2) access to copy the Corresponding
> Source from a network server at no charge." This effectively covers any
> "GPLv2 or later" licensed project. For GPLv2-only projects you might be
> right but I'm sure this has been debated back and forth somewhere else, so I
> will defer my judgment to the GPLv2 experts.

Indeed I was talking about GPL-2:
(In reply to Ulrich Müller from comment #11)
>> That's exactly what the GPL-2 (section 3) doesn't allow you to do.

> As a side observation, both v2 and v3 of GPL probably require the ebuild as
> well, since it's a script used to configure and control building of the
> software - we might need to either timestamp/checksum them for future
> reference or just include them in our source code packages.

That's not entirely clear. If we follow that argument, we'd have to include a lot of other things, like eclasses, profiles, and the Portage sources.

The GPL-3 says about the "corresponding source":
"However, it does not include [...] generally available free programs which are used unmodified in performing those activities but which are not part of the work." The GPL-2 has a similar provision but is less clear about it.

Ebuilds (as well as the package manager) are generally available, free, and - most importantly - they are _not_ part of the work.
Comment 14 Niklāvs Koļesņikovs 2021-04-18 13:17:33 UTC
The following is my personal opinion and not a legal advice.

My understanding of that would be that e.g. when the build system calls gawk or sed, we do not need to include them, since they very much are "... generally available free programs ...". Meson and autotools would also fall under that. Whether Portage would count as commonly available is a bit murky, but I think it fits the spirit of that, and either way the mere offer to download it should be acceptable.

Furthermore, in GPLv2 section 3 there is also "..as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs...". This undoubtedly exempts Portage (at least so long as the binpkgs do not find their way to running beyond Gentoo). ;)

As for including the ebuild, GPLv2 section 3 does state: "...plus the scripts used to control compilation and installation of the executable." This of course means the build system (e.g. meson.build files et al). But since Gentoo binaries are produced by running the ebuild, and it very much does "control compilation and installation", I still remain on the side of counting the ebuild as being part of the source code package when Portage built binaries are being distributed.
Comment 15 Ulrich Müller gentoo-dev 2021-04-18 17:46:38 UTC
(In reply to Niklāvs Koļesņikovs from comment #14)
> As for including the ebuild, GPLv2 section 3 does state: "...plus the
> scripts used to control compilation and installation of the executable."
> This of course means the build system (e.g. meson.build files et al).

Up to this point I tend to agree.

> But since Gentoo binaries are produced by running the ebuild, and it very
> much does "control compilation and installation",

Installation isn't controlled by the ebuild but by the package manager. So if you count the ebuild as part of the source code then you must count the package manager as well (but see above, last two paragraphs of comment #13).

> I still remain on the side of counting the ebuild as being part of the
> source code package when Portage built binaries are being distributed.

Ebuilds (and also the package manager) are separate entities and are not part of the package. They are neither derived works of the package, nor is the package derived from them. They are also not intended to be sent upstream for inclusion with the upstream package.

If such distro-specific metadata was part of the sources then you couldn't have any GNU/Linux distros, because incompatible licenses would prevent many packages from being shipped as binaries. As an example, think of a package licensed under BSD-4 or CDDL (which are GPL incompatible) with a GPL-2 licensed package manager and ebuild (or whatever the equivalent of an ebuild in other distros is). If the resulting binary was a derived work of both, then it couldn't be distributed at all.
Comment 16 Niklāvs Koļesņikovs 2021-04-18 20:18:02 UTC
IANAL but I think that "and" there is meant to be understood as "any script responsible for either configuration or install" but that's beside the point, since I specifically had in mind the src_install and pkg_preinstall ebuild phase functions that can move, edit, remove and outright add new files after the build system's install stage has been run (or before it). In case of data files the ebuild can very much be the installer further qualifying it for being covered by GPLv2 section 2.

In my opinion, the spirit of GPLv2 that's further reinforced in GPLv3 is that how the binary was built is important and, when the ebuild also applies patching, the binary output at least at some point is no longer a mere conveying of the original source code and becomes its own derivative work. This is not even that outrageous of a claim - we already ship vanilla-sources as patched by GKH and gentoo-sources as patched by our Kernel team.

Most of the time we only apply trivial fixes to the build system, config files or backport already upstreamed commits, so the source code is either unchanged or remains under the same copyright, but in principle the ebuild could, either via patches or sed, make more significant changes. In the most extreme readings of copyright, even a backport by us rather than waiting for upstream to release a new minor version is inherently a derivative work, since we're effectively arranging their code in a way that upstream has never released.
Comment 17 Ulrich Müller gentoo-dev 2021-04-19 10:55:46 UTC
(In reply to Niklāvs Koļesņikovs from comment #16)
> IANAL but I think that "and" there is meant to be understood as "any script
> responsible for either configuration or install" but that's beside the
> point, since I specifically had in mind the src_install and pkg_preinstall
> ebuild phase functions that can move, edit, remove and outright add new
> files after the build system's install stage has been run (or before it). In
> case of data files the ebuild can very much be the installer further
> qualifying it for being covered by GPLv2 section 2.

The ebuild itself doesn't install the package on the system; it merely creates an image in ${D} which is installed ("merged") to ${ROOT} by the package manager. This is even clearer for binary packages about which we're talking here.
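
The two-step split described here can be illustrated with a toy model. This is not Portage code and makes no claim about how Portage implements either step; `stage_image` and `merge_image` are hypothetical stand-ins, and the installed file name is a placeholder. It only shows that the first step writes nothing outside the image directory, while a separate step copies the image into the target root.

```python
import shutil
import tempfile
from pathlib import Path


def stage_image(image_dir):
    """Stand-in for the ebuild's install phase: write files only under D."""
    target = Path(image_dir) / "usr" / "bin"
    target.mkdir(parents=True)
    (target / "example-tool").write_text("#!/bin/sh\n")


def merge_image(image_dir, root):
    """Stand-in for the package manager's merge: copy the image into ROOT.

    Returns the merged file paths relative to ROOT.
    """
    image_dir, root = Path(image_dir), Path(root)
    shutil.copytree(image_dir, root, dirs_exist_ok=True)  # Python 3.8+
    return sorted(
        p.relative_to(image_dir).as_posix()
        for p in image_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        image, root = Path(tmp) / "image", Path(tmp) / "root"
        image.mkdir()
        root.mkdir()
        stage_image(image)
        # Nothing has reached ROOT until the separate merge step runs.
        print((root / "usr" / "bin" / "example-tool").exists())
        print(merge_image(image, root))
```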

> In my opinion, the spirit of GPLv2 that's further reinforced in GPLv3 is
> that how the binary was built is important and, when the ebuild also applies
> patching, the binary output at least at some point is no longer a mere
> conveying of the original source code and becomes its own derivative work.
> This is not even that outrageous of a claim - we already ship
> vanilla-sources as patched by GKH and gentoo-sources as patched by our
> Kernel team.

> Most of the time we only apply trivial fixes to the build system, config
> files or backport already upstreamed commits, so the source code is either
> unchanged or remains under the same copyright, but in principle the ebuild
> could either via patches or sed make more significant changes. In the most
> extreme readings of copyright even a backport by us rather than waiting for
> upstream to release a new minor version is inherently a derivative work
> since we're effectively arranging their code in a way that upstream has
> never released.

Of course, patches are part of the "corresponding source". The difference is that patches are normally a derived work of the original while the ebuild is not. Also, patches aren't programs and as such the GPL-3 exception won't apply to them.
Comment 18 Ulrich Müller gentoo-dev 2021-04-19 11:31:42 UTC
What problem are we trying to solve here? We distribute everything in the open, especially the source packages, the license files, the package manager sources, ebuilds and eclasses.

Somebody who is redistributing binary packages may have to redistribute some of them in addition (depending on the package's specific license), but I don't believe that anything requires them to be included in the binary packages themselves.

As an interesting side note, Ututo is a binary distro derived from Gentoo. Last time I checked, their install media neither included package sources, nor were license files or copyright notices installed as part of packages. Nevertheless, the FSF approved Ututo as a free GNU/Linux distribution: https://www.gnu.org/distros/free-distros.html
Comment 19 orbea 2021-04-27 17:04:16 UTC
> What problem are we trying to solve here?

I was trying to provide useful and even required documentation for the samurai package... It's really discouraging to try to contribute to Gentoo when I can't even maintain a package without nonsense like this.
Comment 20 Ulrich Müller gentoo-dev 2021-04-27 18:07:49 UTC
Closing for now, by comment #8.
Comment 21 orbea 2021-04-28 14:50:58 UTC
@Ulrich 

You have not resolved this issue at all.
Comment 22 Ulrich Müller gentoo-dev 2021-04-28 16:03:11 UTC
@Licenses team, any opinion other than what I said above, especially in comment #8 and comment #18?
Comment 23 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2021-04-30 20:36:46 UTC
Further opinion as a License team member, sort of in the middle between OP and ulm's opinions.

This specific bug probably needs closing, with the improvements taken to the GLEP process and followed through to implementation, etc.

The present policy (specifically: 1. it says not to install LICENSE files; 2. it makes no statement about copyright notices) as it is written does indeed not cover all of the nuanced cases, and COULD stand an improvement.

The purpose of that improvement would be to enable any downstream providers/intermediaries of binpkgs (e.g. Ututo) to more easily comply with the requirements of licenses. The downstream consumers ALSO frequently do NOT have a portdir present on the system (hence /usr/portage/licenses is not present).

Gentoo as it stands, since it does not generally distribute binpkgs, is not in itself in violation of the licenses in most cases (I'd have to review exactly which licenses go into our current media, I'd probably find at least one violation, statistically).

Note that this distinguishes us from binary distributions, which always distribute binpkgs.

Taking some hypothetical package, it consists of a few pieces:
- LICENSE file, containing the terms & conditions of the license. It generally does NOT contain the names of the copyright holders for this package. It might or might not be installed after the package is built.
- Documentation & non-executable data files: this may or may not contain copyright attributions to the copyright holders, however it IS generally installed after the package is built.
- Build instructions: IS NOT installed after the package is built, but DOES contain license reference and names of copyright holders.
- Core source code: is in itself directly or indirectly (compiled) installed. Usually contains license references and names of copyright holders in the source, and SOMETIMES contains them in runtime output (e.g. --help).
- Auxiliary source code: is NOT installed, e.g. test cases, might not ever be executed. Usually contains license references and names of copyright holders in the source.

The licenses themselves, as noted in the thread, have two common requirements that are relevant for this discussion:
- distribution of the LICENSE to the end consumer
- distribution of the copyright notice to the end consumer

The licenses don't specify exactly HOW those requirements should be met (e.g. MUST it be same tarball, or simply should general usage cause them to wind up together)

To enable *safer* distribution of binary packages derived from Gentoo ebuilds/repository, we could try hard at compliance with most variants of licenses (excluding AGPL/SSPL for a moment):

1. Move licenses OUT of Portdir, and into their own repo/package. Old versions of the licenses package would ensure that a user who continues to distribute an old binpkg can also distribute the matching version of the licenses binpkg, which would contain the correct licenses.

1.1. This specifically covers the case where a binpkg with license A has continued distribution, and the license A was otherwise removed because no remaining source packages consumed it.

1.2. This also enables downstream binary distributions where PORTDIR is not present to simply install the package and be done.

2. Collated copyright notice files. Debian has these as /usr/share/doc/${PN}/copyright for 99% of their packages.

2.1. The notice files as a rough specification:
2.1.1. - MUST contain a reference to the correct license file.
2.1.2. - MUST contain the unique copyright attribution lines from every file in a source tarball
2.1.3. - MUST NOT contain the full text of a license otherwise available by exact reference
2.1.4. - MUST be included in binpkgs when USE=bindist is true or absent (may be excluded only with USE=-bindist)

2.2. I'd try to produce the copyright notice files with some automation that could roughly grep source files.

2.3. If there is something like an AUTHORS file, that should always be included in the binpkg.
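
The "roughly grep source files" idea from 2.2, feeding a notice file along the lines of 2.1.1 to 2.1.3, could look something like the following. This is a rough sketch only: the regular expression, the helper names `collect_copyright_lines` and `render_notice`, and the notice layout are all assumptions made for illustration, not an agreed format. The bindist rule in 2.1.4 is not modelled, and real copyright headers vary far more than this pattern covers.

```python
import re
import tempfile
from pathlib import Path

# Rough pattern for lines like "Copyright (c) 2013 Name <mail>" or
# "Copyright © 2017-2020 Name". An assumption, not a complete grammar.
COPYRIGHT_RE = re.compile(r"Copyright\s+(?:\(c\)|©)\s*\d{4}.*", re.IGNORECASE)


def collect_copyright_lines(source_dir):
    """Return the unique copyright attribution lines found under source_dir."""
    lines = set()
    for path in Path(source_dir).rglob("*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for raw in text.splitlines():
            match = COPYRIGHT_RE.search(raw)
            if match:
                # Trim whitespace and a trailing C comment closer ("*/").
                lines.add(match.group(0).strip().rstrip(" */"))
    return sorted(lines)


def render_notice(package, license_names, copyright_lines):
    """Render a notice that references licenses by name (2.1.1), lists the
    unique attributions (2.1.2), and embeds no license text (2.1.3)."""
    out = [f"Package: {package}"]
    out += [f"License: {name} (see the licenses directory)" for name in license_names]
    out += copyright_lines
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp)
        # Placeholder source files carrying headers quoted in this bug.
        (src / "a.c").write_text(
            "/* Copyright © 2017-2020 Michael Forney */\n", encoding="utf-8")
        (src / "b.c").write_text(
            "// Copyright (c) 2013 Ori Bernstein <ori@eigenstate.org>\n",
            encoding="utf-8")
        found = collect_copyright_lines(src)
        print(render_notice("samurai", ["ISC", "Apache-2.0", "MIT"], found))
```

A set is used so that an attribution repeated in many source files appears once in the notice, which is what "unique copyright attribution lines" in 2.1.2 asks for.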
Comment 24 Ulrich Müller gentoo-dev 2021-05-10 23:14:14 UTC
Somewhat related, the following was posted on the debian-legal mailing list today:
https://lists.debian.org/debian-legal/2021/05/msg00002.html

"In practice, shipping the relevant source for the binaries is likely enough to achieve license compliance, so shipping pedantically correct copyright/license info for the binaries is not necessary and shipping source is much easier to do, so that is what Debian tends to do."