Bug 550760 - sys-apps/portage-2.2.20 - /usr/bin/ebuild - Missing error message when skipping files with special characters in manifest creation
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - Ebuild Support
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
Reported: 2015-05-29 18:22 UTC by Michael Seifert
Modified: 2019-07-19 18:12 UTC
Description Michael Seifert 2015-05-29 18:22:37 UTC
When calling "/usr/bin/ebuild xyz-1.0.ebuild manifest", files with non-ASCII characters are skipped as discussed in bugs #411127 and #435934. As I see it, these two tickets are feature requests for adding support for Unicode file names.

However, if a file name is not valid, /usr/bin/ebuild should notify the user about that, which it currently does not.

Reproducible: Always

Steps to Reproduce:
Create a manifest for an ebuild where the name of one of the files in $FILESDIR contains a character such as '@'.
Actual Results:  
/usr/bin/ebuild terminates as expected, creating a Manifest which contains an entry for the ebuild, but not for the file in $FILESDIR.

Expected Results:  
/usr/bin/ebuild should exit with an error notification about the invalid file name.
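
A minimal Python sketch of the requested behaviour, for illustration only. It is not Portage's actual code: the whitelist ALLOWED_NAME_RE, the helper names partition_names and check_filesdir, and the wording of the message are all assumptions made for this example.

```python
import os
import re
import sys

# Assumption: an illustrative whitelist of permitted characters. It is not
# necessarily the set that Portage itself enforces.
ALLOWED_NAME_RE = re.compile(r"[A-Za-z0-9._+:-]+")


def partition_names(names):
    """Split names into (accepted, rejected) according to the whitelist."""
    accepted, rejected = [], []
    for name in names:
        if ALLOWED_NAME_RE.fullmatch(name):
            accepted.append(name)
        else:
            rejected.append(name)
    return accepted, rejected


def check_filesdir(filesdir):
    """Report every rejected file name on stderr.

    Returns 0 if all names are acceptable and 1 otherwise, so the caller
    can exit with an error instead of skipping files silently.
    """
    _accepted, rejected = partition_names(sorted(os.listdir(filesdir)))
    for name in rejected:
        print(f"!!! invalid file name, not added to Manifest: {name!r}",
              file=sys.stderr)
    return 1 if rejected else 0
```

With this whitelist, a name such as "foo@bar.patch" is rejected and reported, rather than being dropped from the Manifest without comment.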
Comment 1 Andrew Miller 2015-05-29 21:12:32 UTC
The "repoman manifest" command also doesn't tell the user that files with non-ASCII characters have been skipped.

I'm using sys-apps/portage-9999 built on May 27 (up to date with git).
Comment 2 Kerin Millar 2019-07-19 12:49:32 UTC
I just encountered this issue. One thing to keep in mind is that pathname components are arbitrary byte sequences, where any byte is legal except for 0x00 and 0x2f. No assumption can be made as to the encoding thereof. They could be UTF-8, ISO-8859-1, pure ASCII or even random bytes churned out by an RNG and would still be legal names. Consulting LC_CTYPE doesn't help either, because there is no guarantee that any name was previously written with the implied encoding.
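
To illustrate the point about byte sequences, here is a small Python sketch. The helper names is_legal_component and try_decode are made up for this example, and the sample names are arbitrary.

```python
def is_legal_component(raw: bytes) -> bool:
    """True if raw is non-empty and contains neither 0x00 nor 0x2f ('/')."""
    return len(raw) > 0 and b"\x00" not in raw and b"/" not in raw


def try_decode(raw: bytes, encoding: str):
    """Return the decoded text, or None if raw is invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The same visible name, written out under two different encodings.
latin1_name = "caf\u00e9.patch".encode("iso-8859-1")  # b'caf\xe9.patch'
utf8_name = "caf\u00e9.patch".encode("utf-8")         # b'caf\xc3\xa9.patch'

# Both byte strings are legal names as far as the byte-level rule goes.
assert is_legal_component(latin1_name)
assert is_legal_component(utf8_name)

# The ISO-8859-1 bytes are not valid UTF-8, so a UTF-8 decode fails.
assert try_decode(latin1_name, "utf-8") is None

# The UTF-8 bytes do decode as ISO-8859-1, but to a different string, so a
# successful decode says nothing about which encoding was intended.
assert try_decode(utf8_name, "iso-8859-1") != "caf\u00e9.patch"
```

Nothing in the bytes themselves records which encoding the author of the name had in mind, which is why guessing from the locale is unreliable.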

I'm not suggesting that portage/repoman should not have any constraints. However, as far as encodings are concerned, it either needs to:

a) Dictate an encoding and rigidly enforce it (UTF-8, for example)
b) Not assume any encoding and treat the names as just byte sequences

Just not anything in-between.
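
A rough Python sketch of what the two options could look like. This is only an illustration under my own assumptions, not a proposal for the actual Portage code; the function names validate_utf8_name and list_names_as_bytes are hypothetical.

```python
import os


def validate_utf8_name(raw: bytes) -> str:
    """Option (a): accept a name only if it is strictly valid UTF-8.

    Raises ValueError so the caller can report the offending name
    instead of skipping it.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(f"file name {raw!r} is not valid UTF-8") from exc


def list_names_as_bytes(directory: bytes):
    """Option (b): never decode at all.

    os.listdir() returns bytes entries when it is given a bytes path,
    so the names can be handled as opaque byte sequences throughout.
    """
    return sorted(os.listdir(directory))
```

Under option (a), a name such as b'caf\xe9.patch' is rejected with an explicit error. Under option (b) it is carried through unchanged, and any further restriction has to be expressed in terms of bytes.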

P.S. In my case, the offending characters were (ASCII) parentheses, which are sometimes used in kernel patch names.