Bug 547778 - sys-apps/portage-2.2.18: emerge --jobs... ignores and hides errors in postinst phase
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - Interface (emerge)
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
Keywords: InVCS
Blocks: 484436
Reported: 2015-04-26 08:56 UTC by Klaus Kusche
Modified: 2017-10-27 18:08 UTC

Description Klaus Kusche 2015-04-26 08:56:28 UTC
Today, during my "emerge --jobs=4 --keep-going ..." 
two packages failed due to errors in the postinst phase. 

However, emerge has completely ignored and hidden those errors:
* The packages were counted as "complete", not as "failed".
* Both packages were registered as "installed successfully",
emerge continued without recalculating the dependencies.
* emerge did not display any error messages or error logs,
neither when the packages failed nor at the end.

If I hadn't checked the error logs manually afterwards,
I wouldn't have noticed at all that 2 packages failed
(one was just a doc install failure, but the other one
was a more serious problem with the binaries installed).
Comment 1 Zac Medico gentoo-dev 2015-05-02 16:56:09 UTC
We should log the failed postinst phase via elog.
Comment 2 Zac Medico gentoo-dev 2015-05-02 19:31:19 UTC
There's a patch in the following branch:

https://github.com/zmedico/portage/tree/bug_547778
Comment 3 Zac Medico gentoo-dev 2015-05-02 19:32:38 UTC
The patch is posted for review here:

https://archives.gentoo.org/gentoo-portage-dev/message/cd5447f975d6c592f96f94f0a6babc7d
Comment 4 Zac Medico gentoo-dev 2015-05-02 19:43:56 UTC
(In reply to Klaus Kusche from comment #0)
> * emerge did not display any error messages or error logs,
> neither when the packages failed nor at the end.

If it called eerror or die, then you should have seen a corresponding eerror message. It's the responsibility of the ebuild to call eerror or die when appropriate, so you should file a separate bug for the ebuild if it doesn't do that.
Comment 5 Klaus Kusche 2015-05-02 20:06:18 UTC
One was a failing readme.gentoo_print_elog (no readme.gentoo) in postinst,
which definitely *did* call die, 
the other was bug 547752 = 547782, 
where the whole postinst step ends with fail status if I remember correctly.

In both cases, the error was correctly recorded as an error in the elog,
but when emerging silently (e.g. "--jobs=4 --keep-going" or 
"--quiet-build --quiet-fail"), it was neither counted nor displayed 
nor noticeable in any other way during the emerge.

If I remember correctly, the error was shown when emerging "normally"
(not silently), but I'd have to try again to make sure.
Comment 6 Zac Medico gentoo-dev 2015-05-02 23:21:49 UTC
(In reply to Zac Medico from comment #3)
> The patch is posted for review here:
> 
> https://archives.gentoo.org/gentoo-portage-dev/message/
> cd5447f975d6c592f96f94f0a6babc7d

This is in the master branch now.

(In reply to Klaus Kusche from comment #5)
> In both cases, the error was correctly recorded as an error in the elog,
> but when emerging silently (e.g. "--jobs=4 --keep-going" or 
> "--quiet-build --quiet-fail"), it was neither counted nor displayed 
> nor noticeable in any other way during the emerge.

Check your PORTAGE_ELOG_SYSTEM setting. The default setting enables the echo module, which will display the eerror messages just before emerge exits.

> If I remember correctly, the error was shown when emerging "normally"
> (not silently), but I'd have to try again to make sure.

You'll see it when it occurs, but you won't with --jobs or --quiet-build because all of that ebuild output is directed to the build log. Either way, you'll see it before emerge exits if you have PORTAGE_ELOG_SYSTEM="echo" enabled. You can use this command to check:

  portageq envvar PORTAGE_ELOG_SYSTEM
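
The check above can be scripted. This is a minimal sketch, not part of Portage: the helper name elog_has_module is hypothetical, and it assumes that entries in PORTAGE_ELOG_SYSTEM are separated by whitespace and may carry a per-module level list after a colon (for example "echo:error").

```shell
#!/bin/sh
# Return 0 if the named elog module appears in a PORTAGE_ELOG_SYSTEM value.
# ASSUMPTION: entries are whitespace-separated and may carry a ":levels" suffix.
elog_has_module() {
    setting=$1
    wanted=$2
    for entry in $setting; do
        # Strip an optional ":levels" suffix before comparing module names.
        if [ "${entry%%:*}" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Read the live value with the portageq command from this comment;
# the value stays empty on systems where portageq is not available.
current=$(portageq envvar PORTAGE_ELOG_SYSTEM 2>/dev/null || true)
if elog_has_module "$current" echo; then
    echo "echo module enabled"
else
    echo "echo module not enabled"
fi
```

With a setting such as "save" alone, the script reports that the echo module is not enabled, which would explain why nothing is shown before emerge exits.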
Comment 7 Klaus Kusche 2015-05-03 08:12:49 UTC
> (In reply to Klaus Kusche from comment #5)
> > In both cases, the error was correctly recorded as an error in the elog,
> > but when emerging silently (e.g. "--jobs=4 --keep-going" or 
> > "--quiet-build --quiet-fail"), it was neither counted nor displayed 
> > nor noticeable in any other way during the emerge.
> 
> Check your PORTAGE_ELOG_SYSTEM setting. The default setting enables the echo
> module, which will display the eerror messages just before emerge exits.

I have
PORTAGE_ELOG_CLASSES="info log warn error"
PORTAGE_ELOG_SYSTEM="save"

> > If I remember correctly, the error was shown when emerging "normally"
> > (not silently), but I'd have to try again to make sure.
> 
> You'll see it when it occurs, but you won't with --jobs or --quiet-build
> because all of that ebuild output is directed to the build log. Either way,
> you'll see it before emerge exits if you have PORTAGE_ELOG_SYSTEM="echo"
> enabled. You can use this command to check:
> 
>   portageq envvar PORTAGE_ELOG_SYSTEM

The problem is *not* that I did not see the error messages.

The problem is that portage completely *ignored* the errors:
* It registered the failed packages as "installed" in /var/db/pkg.
* It counted the packages as "completed", not as "failed".
The "failed" count was still displayed as zero at the end of the emerge.
* It did *not* recalculate the dependencies, it just continued emerging
(also packages which depend on the failed packages).

I don't want emerge to display the full error messages, 
I just want to have a list of the names of the failed packages at the end
(as it is done for packages failing in other phases).
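
Until emerge prints such a list itself, something similar can be approximated from the files written by the "save" module. This is only a sketch under assumptions that should be verified locally: that the module writes one .log file per package into a directory such as /var/log/portage/elog, and that an error section for the postinst phase begins with a line "ERROR: postinst". The function name and the ELOG_DIR variable are hypothetical.

```shell
#!/bin/sh
# Print the names of elog files that contain a postinst error section.
# ASSUMPTION: such a section starts with the line "ERROR: postinst".
list_postinst_errors() {
    dir=$1
    # -l prints only matching file names; -s suppresses errors for missing files.
    grep -ls '^ERROR: postinst' "$dir"/*.log 2>/dev/null || true
}

# ASSUMPTION: hypothetical default location; adjust to the local log directory.
list_postinst_errors "${ELOG_DIR:-/var/log/portage/elog}"
```

Each printed file name identifies a package to inspect; the output is empty when no file matches or the directory does not exist.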
Comment 8 Zac Medico gentoo-dev 2015-05-03 08:49:55 UTC
(In reply to Klaus Kusche from comment #7)
> The problem is that portage completely *ignored* the errors:
> * It registered the failed packages as "installed" in /var/db/pkg.

It registers them in /var/db/pkg *before* the postinst phase is executed, because at that point irreversible changes have been made to the installed system. Doing a rollback at that point could be accomplished in a couple of different ways:

1) Unmerging the files that were just installed, and merging the previously installed version if there was one (a backup binary package would have to have been created by FEATURES=unmerge-backup)

2) Rollback to a previous checkpoint using filesystem snapshot capabilities (see bug 40127).

> * It counted the packages as "completed", not as "failed".
> The "failed" count was still displayed as zero at the end of the emerge.
> * It did *not* recalculate the dependencies, it just continued emerging
> (also packages which depend on the failed packages).

Yes, in that sense, postinst failures currently are not treated as "real" failures. If postinst needs to do something that is so critical that we need to count it as a "real" failure, then it's probably doing something that it really should not be doing.

> I don't want emerge to display the full error messages, 
> I just want to have a list of the names of the failed packages at the end
> (as it is done for packages failing in other phases).

For what package(s) is the postinst phase so critical?
Comment 9 Klaus Kusche 2015-05-03 10:08:56 UTC
(In reply to Zac Medico from comment #8)
> (In reply to Klaus Kusche from comment #7)
> > The problem is that portage completely *ignored* the errors:
> > * It registered the failed packages as "installed" in /var/db/pkg.
> 
> It registers them in /var/db/pkg *before* the postinst phase is executed,
> because at that point irreversible changes have been made to the installed
> system. Doing a rollback at that point could be accomplished in a couple of
> different ways:
> 
> 1) Unmerging the files that were just installed, and merging the previously
> installed version if there was one (a backup binary package would have to
> have been created by FEATURES=unmerge-backup)
> 
> 2) Rollback to a previous checkpoint using filesystem snapshot capabilities
> (see bug 40127).

Ok, makes sense to have them marked as "installed".

> > * It counted the packages as "completed", not as "failed".
> > The "failed" count was still displayed as zero at the end of the emerge.
> > * It did *not* recalculate the dependencies, it just continued emerging
> > (also packages which depend on the failed packages).
> 
> Yes, in that sense, postinst failures currently are not treated as "real"
> failures. If postinst needs to do something that is so critical that we need
> to count it as a "real" failure, then it's probably doing something that it
> really should not be doing.
> 
> > I don't want emerge to display the full error messages, 
> > I just want to have a list of the names of the failed packages at the end
> > (as it is done for packages failing in other phases).

I *still* want to be informed in some way that something went wrong
and that I have to check the logs and perhaps correct things.

> For what package(s) is the postinst phase so critical?

For the two errors I really had:
One uncritical (just doc missing).
One quite critical: fcaps failed, executables might lack needed caps.

Just checked the postinst's of some of the packages installed here:
* Database and cache updates of all kinds 
  (mime cache, icon cache, fonts, texmf update, XML catalog, hwdb, docbook...)
* Certs rehash (!)
* Essential chown, chgrp, fcaps 
  (e.g. in polkit, eselect, openldap, cdrtools, sandbox, man, wireshark)
* eselect updates (e.g. mesa, python, xorg, ...)
* Creating users and groups (systemd)
* Creating / moving directories and files (openrc, portage, gcc-config),
  creating links (gawk, bash, clang)
* Generating unique ids (dbus, dhcpcd)
* Runlevel manipulation (kmod, udev, ...)
* Restarting daemons

So, postinst is not just writing nice hints to the log.
Postinst is doing essential things which need my attention when they fail.
Comment 10 Klaus Kusche 2015-05-03 10:10:28 UTC
... and which should perhaps be fixed *before* depending packages are emerged.
Comment 11 Zac Medico gentoo-dev 2015-05-03 18:35:46 UTC
(In reply to Klaus Kusche from comment #9)
> I *still* want to be informed in some way that something went wrong
> and that I have to check the logs and perhaps correct things.

I suppose we could make it show a message in the foreground, similar to a regular failure message, that includes the path of the build log.

For fcaps, I've filed bug 548516 which will allow fcaps to be called before the package is installed, during the (sandboxed) src_install phase.
Comment 12 Zac Medico gentoo-dev 2015-05-03 21:49:38 UTC
There's an additional patch in the following branch:

https://github.com/zmedico/portage/tree/bug_547778

I've posted it for review here:

https://archives.gentoo.org/gentoo-portage-dev/message/936447e45de05184d55668e66d585ae1

This makes postinst failures behave more like "real" failures in terms of output and the final emerge return code. However, in other respects, these packages will be treated as successful installations. This means that they will not cause emerge to immediately exit, and they will not trigger recalculation of dependencies when --keep-going is enabled.
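
Because the final return code now reflects postinst failures, a wrapper script can react to it. The following is a minimal sketch: the function name run_and_report is hypothetical, and the script only passes the exit status through and prints a reminder. It does not inspect emerge's output, so it cannot tell a postinst failure from any other failure.

```shell
#!/bin/sh
# Run a command, pass its exit status through, and print a reminder on failure.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "exit status $status: check the elog and build logs" >&2
    fi
    return "$status"
}

# Example call on a Gentoo system (shown as a comment, not executed here):
#   run_and_report emerge --jobs=4 --keep-going @world
```

With a Portage version that lacks this patch, emerge's exit status would not reflect a postinst failure, so the reminder would not be printed for it.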
Comment 14 Brian Dolbec (RETIRED) gentoo-dev 2015-05-19 19:47:07 UTC
Released in portage-2.2.19