Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 264130

Summary: PMS should require that file mtimes are preserved on merge
Product: Gentoo Hosted Projects Reporter: Ulrich Müller <ulm>
Component: PMS/EAPIAssignee: PMS/EAPI <pms>
Status: RESOLVED FIXED    
Severity: normal CC: common-lisp, Martin.vGagern, maxi
Priority: High    
Version: unspecified   
Hardware: All   
OS: All   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=906978
Whiteboard: in-eapi-3
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 174380, 263387    
Attachments: Patch for merge.tex
test-timestamps-1.ebuild
Patch for PMS
Patch for PMS
Patch for PMS
Updated patch for PMS.
Patch for PMS
Patch rebased against current PMS master

Description Ulrich Müller gentoo-dev 2009-03-29 01:42:31 UTC
Allowing the package manager to arbitrarily modify timestamps of installed files will lead to problems with several languages:
  - Lisp (*.fasl must be newer than corresponding *.lisp)
  - Emacs (*.elc must not be older than *.el)
  - ghdl/VHDL (libraries check their mtimes)
  - Python (*.pyc vs *.py, not sure if this one is still an issue)

If the PM does not preserve timestamps, there is (AFAICS) no reasonable way to work around this on the ebuild level. (Some horrible hacks exist, see for example impl-*-timestamp-hack in common-lisp-common*.eclass which overwrites its installed files in pkg_postinst.)

Portage preserves timestamps since version 2.1.2.10, see bug 181021.

Therefore, I suggest that PMS commit de6ee9c6ad50d4d130e9ad02f31bddced15293f4 is reverted, as far as regular files are concerned. PMS should explicitely require that file mtimes are preserved.
Comment 1 Ciaran McCreesh 2009-03-29 01:48:22 UTC
If you want this as a guarantee, propose it for EAPI 4. Current EAPIs don't provide this guarantee, and anything requiring it is broken.
Comment 2 Ulrich Müller gentoo-dev 2009-03-29 02:02:06 UTC
(In reply to comment #1)
> If you want this as a guarantee, propose it for EAPI 4. Current EAPIs don't
> provide this guarantee, and anything requiring it is broken.

I don't see why it would be useful that the package manager behaves differently on this depending on EAPI.
Comment 3 Ciaran McCreesh 2009-03-29 02:08:51 UTC
Not behave differently, necessarily. Just not guarantee it for any earlier EAPI.

We can't just make PMS say that mtimes are preserved because that'd be retroactively imposing new restrictions (which not all EAPI-0/1/2 package managers, including earlier Portage versions, meet). And if we go down that route, there's no point to having EAPIs at all.

The Portage change is just a move from one set of uncontrolled behaviour to a different set, and isn't something we can shove backwards into any existing EAPI. If you think you have a convincing case for requiring guaranteed behaviour (either way -- there're strong arguments that mtimes are better not preserved too), it's something that has to be done only starting with a future EAPI.
Comment 4 Ulrich Müller gentoo-dev 2009-03-29 03:14:45 UTC
> If you think you have a convincing case for requiring guaranteed behaviour
> (either way -- there're strong arguments that mtimes are better not preserved
> too),

See comment #0. Upstream packages require that mtimes are preserved, and if the package manager discards them, then this is not fixable in any reasonable way on the ebuild level.

I don't know about Paludis' algorithm, but Emacs packages generally work well with it, since installed *.elc files end up with equal or newer timestamps than *.el. However, I'm not happy with the situation, since it relies on unspecified features of either Portage or Paludis. And there is nothing that I could do to fix it for my packages.

With Lisp, especially SBCL, it is worse since it requires *.fasl to be newer than *.lisp (equal timestamps are not good enough).
Comment 5 Brian Harring (RETIRED) gentoo-dev 2009-03-29 23:53:44 UTC
Reason this is being exempted from eapi3?  It's a simple change, and 2 out of the 3 managers do this already...

This is assuming no serious blowback against it.  Yes it's late in the game, but it's not too late.
Comment 6 Ciaran McCreesh 2009-03-29 23:57:14 UTC
(In reply to comment #5)
> Reason this is being exempted from eapi3?  It's a simple change, and 2 out of
> the 3 managers do this already...

I hate the idea, because for packages distributed as a straight tarball people end up with lots of files installed with wonky timestamps. I'd much rather see future EAPIs require that mtimes be clobbered.
Comment 7 Brian Harring (RETIRED) gentoo-dev 2009-03-30 00:18:15 UTC
> I hate the idea, because for packages distributed as a straight tarball people
> end up with lots of files installed with wonky timestamps. I'd much rather see
> future EAPIs require that mtimes be clobbered.

So... you're not intending on delaying it till eapi4, you're intending on killing it in eapi4? good to know ;)

I personally don't think the "pkgs distributed as a straight tarball" is much of a scenario- presume you're talking about literal "here's the whole thing, drop it into root" since install and friends deal can deal w/ mtime without issue.  Basically view it as a special case w/ already retarded pkgs.

For the other 90% of pkgs that don't have that issue, getting bit in the ass by mtime resets (python and friends) I view as a far more common scenario.  Particularly dislike forced regeneration/.pyc cleanup that lacking mtime preservation leads to.  So... got some example pkgs to look at, or is this a thereotical concern?

Open to alternatives, but killing the .py/.pyc issue is high on my todo- so, examples pkgs or an alternative?
Comment 8 Ciaran McCreesh 2009-03-30 00:22:34 UTC
Not necessarily killing it entirely. Certainly not specifying as default behaviour, since for most packages that care either way it's the wrong thing.

If there's really a need for it, we could go with a dopreservemtime helper function, docompress-style, and make it an error if anything whose mtime is being preserved is older than when the build process started.
Comment 9 Brian Harring (RETIRED) gentoo-dev 2009-03-30 00:32:41 UTC
(In reply to comment #8)
> Not necessarily killing it entirely. Certainly not specifying as default
> behaviour, since for most packages that care either way it's the wrong thing.

Point out a few please (or examples of it that preferably are more then a one time broken release)- reason I ask is that from where I'm sitting most packages get it *right* automatically via their build framework, thus it's not the 'wrong thing' (especially for auto generated files via the ebuild).

> If there's really a need for it, we could go with a dopreservemtime helper
> function, docompress-style, and make it an error if anything whose mtime is
> being preserved is older than when the build process started.

My opinion mind you, but 'dopreservemtime' makes it harder/requires more work 80% to benefit a theoretical 20%.  Yes the ratios are off- point is, the screwed up pkgs that fall into that hypothetical 20% already require handholding- mtime resets wouldn't be anything different (and would fit a fair bit more naturally into said ebuilds due to the other idiocy the ebuild would have to correct).

One thing to keep in mind, pkgcore, as far as I can recall, has *always* preserved mtime- I can't recall the last time I ran down an ebuild incompatibility to that functionality.  It's worth noting in addition portage has had it a long while, and the incompatibilities/issues that have been seen is from *not* preserving mtime, rather then preservation.

Understand you want to make it easier for goofy releases, but don't agree that those releases are worth designing/optimizing for- too annoying of a hit for those that don't have issues (pet peeve being the pyc/pyo cleanup required of ebuilds due to lack of mtime preservation, effectively orphaned files dependant on the managers behaviour).
Comment 10 Ciaran McCreesh 2009-03-30 00:59:00 UTC
tar tjvf stage3-i686-20090325.tar.bz2 | grep -v 2009-03

Whole bunch of screwy mtimes for baselayout and ca-certs. Clearly not desired behaviour.
Comment 11 Ulrich Müller gentoo-dev 2009-03-30 05:50:21 UTC
(In reply to comment #5)
> Reason this is being exempted from eapi3?  It's a simple change, and 2 out
> of the 3 managers do this already...
> 
> This is assuming no serious blowback against it.  Yes it's late in the game,
> but it's not too late.

+1

(In reply to comment #8)
> Not necessarily killing it entirely. Certainly not specifying as default
> behaviour, since for most packages that care either way it's the wrong
> thing

Do you have a concrete example where old mtimes lead to problems? An ebuild that needs updated mtimes can fix it with a simple "find | xargs touch" in src_install. Whereas the other way around (i.e. if the package manager mangles mtimes) the ebuild cannot do anything about it.

> If there's really a need for it, we could go with a dopreservemtime helper
> function, docompress-style, and make it an error if anything whose mtime is
> being preserved is older than when the build process started.

You've got the cart before the horse. An "doupdatemtime" function could be added to some eclass immediately, without any EAPI bump. OTOH, that such a function doesn't exist by now seems to indicate that there is no urgent need for it.

(In reply to comment #10)
> Whole bunch of screwy mtimes for baselayout and ca-certs. Clearly not
> desired behaviour.

Surely you've filed a bug for this?
Comment 12 Ulrich Müller gentoo-dev 2009-03-30 12:54:22 UTC
Let me try to summarise what are the requirements (I hope we agree on these):
  a) Some packages need preserved ordering of file modification times,
     otherwise things will break at run time. Very few packages need exact
     file modification times.
  b) Files in the installed system should not have mtimes that are older than
     the build time of the package. (I still don't see what would be the
     problem with dropping this, but let's have this as a requirement for the
     following.)

Obviously Portage and Pkgcore fulfil only a) while Paludis fulfils only b).
Is it possible to fulfil both? I think that the following procedure would accomplish it:
  1. Record the timestamp when the build process starts, e.g., immediately
     before calling pkg_setup.
  2. After src_install, scan ${D} for files having modification times older
     than your timestamp, and update their mtimes to this timestamp.
  3. Preserve modification times when merging ${D} to ${ROOT}.

Steps 1 and 2 would be optional for the package manager; 3 should be mandatory.

This way, any files generated during the build process (*.pyc, *.fasl, *.elc) would end up with timestamps newer than their corresponding source files (*.py, *.lisp, *.el).

Problems remain for packages with pre-compiled files, or for packages requiring exact mtimes (like ghdl). But I think this affects only very few packages, and it could be fixed on the ebuild level. For example, a Lisp package with both *.lisp and precompiled *.fasl could touch its *.fasl files at the end of src_install.
Comment 13 Ciaran McCreesh 2009-03-30 14:43:22 UTC
(In reply to comment #12)
> Steps 1 and 2 would be optional for the package manager; 3 should be mandatory.

If we're going for defined behaviour at all, we should be doing it properly. So 1 and 2 would have to be mandatory if 3's mandatory.

I'd be inclined to suggest we require fixing 'future' mtimes too for consistency...
Comment 14 Ulrich Müller gentoo-dev 2009-03-30 14:57:13 UTC
(In reply to comment #13)
> If we're going for defined behaviour at all, we should be doing it properly.
> So 1 and 2 would have to be mandatory if 3's mandatory.

I won't object, but let's ask Portage and Pkgcore devs what they think.
It should not be difficult to implement this.

> I'd be inclined to suggest we require fixing 'future' mtimes too for
> consistency...

I had thought about this, but then dismissed the possibility as too far-fetched. Here we go:
  1. Record two timestamps:
     before calling pkg_setup, timestamp1;
     after src_install has completed, timestamp2.
  2. After src_install and before the phase function following it, scan ${D}
     for files having modification times older than timestamp1 or newer than
     timestamp2. Update their mtimes to timestamp1 or timestamp2, respectively.
  3. Preserve modification times when merging ${D} to ${ROOT}.
Comment 15 Ciaran McCreesh 2009-03-30 15:03:40 UTC
Mmm, another thing: do we really want to say specifically that mtimes are modified after src_install, and then preserved? Shouldn't we be saying that mtimes are fixed as part of the merge process instead?

You can tell the difference in pkg_preinst, so wording matters.
Comment 16 Ulrich Müller gentoo-dev 2009-03-30 15:33:35 UTC
(In reply to comment #15)
> Mmm, another thing: do we really want to say specifically that mtimes are
> modified after src_install, and then preserved? Shouldn't we be saying that
> mtimes are fixed as part of the merge process instead?
> 
> You can tell the difference in pkg_preinst, so wording matters.

I've chosen that wording intentionally. If mtimes are updated before pkg_preinst, then there is still an "emergency escape" for the ebuild to adjust things if there is really no other way. Hopefully it will not be needed, or only in very seldom cases (maybe for ghdl?). But if we don't provide this possibility, then people may resort to much worse "solutions" like [1] or [2].

[1] <http://sources.gentoo.org/viewcvs.py/gentoo-x86/eclass/common-lisp-common.eclass?r1=1.3&r2=1.4>
[2] Bug 83877 comment #36 (especially the fourth paragraph ;-)
Comment 17 Ciaran McCreesh 2009-03-30 15:39:31 UTC
(In reply to comment #16)
> I've chosen that wording intentionally.

That opens up a whole new can of worms then. If we start specifying exactly when one of the things we do around that time happens, we probably have to specify all of it. Which means tightening up specifications for strip behaviour, symlink fixing behaviour etc.
Comment 18 Maximilian Grothusmann 2009-03-30 16:00:44 UTC
Also have a look at #264308. python.eclass is broken with package managers that preserve mtime.
Comment 19 Ulrich Müller gentoo-dev 2009-03-30 16:12:50 UTC
(In reply to comment #17)
> If we start specifying exactly when one of the things we do around that time
> happens, we probably have to specify all of it. Which means tightening up
> specifications for strip behaviour, symlink fixing behaviour etc.

Do package managers presently do these things at different times?

If steps 1 and 2 were implemented in Portage, I'd guess that they must be done after src_install and before pkg_preinst. Otherwise, binary packages would not contain the updated timestamps.

(In reply to comment #18)
> Also have a look at #264308. python.eclass is broken with package managers
> that preserve mtime.

I think this problem would be addressed by the procedure outlined in comment #14.
Comment 20 Ciaran McCreesh 2009-03-30 16:17:08 UTC
(In reply to comment #19)
> Do package managers presently do these things at different times?

Probably. At least, we didn't try to copy Portage's behaviour when we implemented it for Paludis.

As I recall, Portage fixes symlinks as a process, whereas Paludis fixes them as part of the merge.
Comment 21 Marijn Schouten (RETIRED) gentoo-dev 2009-03-30 16:20:20 UTC
(In reply to comment #18)
> Also have a look at #264308. python.eclass is broken with package managers that
> preserve mtime.

That's misleading; preserving mtimes is a perfectly valid way of fulfilling no requirements on the mtimes of to-be-installed files. The problem is that the pyc/pyo files aren't even under control of the package manager because mtimes weren't preserved in the past.
Comment 22 Ulrich Müller gentoo-dev 2009-03-30 16:24:32 UTC
(In reply to comment #20)
> > Do package managers presently do these things at different times?
> 
> Probably. At least, we didn't try to copy Portage's behaviour when we
> implemented it for Paludis.

*Sigh* Then leave the exact time for fixing timestamps unspecified, too. And hope that all packages can live with it. For Lisp and Python it should be fine.
Comment 23 Ciaran McCreesh 2009-03-30 16:28:02 UTC
Heh, you should be glad we didn't copy Portage's behaviour exactly. Us copying Portage's behaviour exactly with mtime preservation, and then Portage changing it, is what lead to this whole mess.
Comment 24 Ulrich Müller gentoo-dev 2009-03-30 23:20:14 UTC
(In reply to comment #22)
> Then leave the exact time for fixing timestamps unspecified, too. And hope
> that all packages can live with it. For Lisp and Python it should be fine.

I've asked calchan about ghdl, and he told me that the libraries in question are created during build process. Since he didn't remember the details, I've verified it myself and there are no files in ${D} with old timestamps. So, ghdl should be fine.
Comment 25 Brian Harring (RETIRED) gentoo-dev 2009-04-06 02:45:49 UTC
Looking through this bug, the tally for preservation vs not (or weird alternatives that do forced resets) seems to be in favor of preservation, at least from a technical arg standpoint (including which is the common case, rather then the special case).

So... strengthen the anti-preservation case please- else I'd prefer to move forward so eapi3 can continue/finalize.  Realize it may not be a perfect magic bullet, but tend to believe there is no such thing either.
Comment 26 Ulrich Müller gentoo-dev 2009-04-07 12:44:24 UTC
(In reply to comment #25)
> Looking through this bug, the tally for preservation vs not (or weird
> alternatives that do forced resets) seems to be in favor of preservation, at
> least from a technical arg standpoint (including which is the common case,
> rather then the special case).

As I see it, there are three options:
  A. Always preserve timestamps when merging from D to ROOT, what was my
     original suggestion. Portage and Pkgcore already comply with this.
  B. Preserve timestamps, but optionally allow the package manager to update
     "old" ones. This is the suggestion from comment #12. Again, Portage and
     Pkgcore would be compliant already (since updating mtimes would be
     optional).
  C. As B, but with mandatory updating of "old" mtimes. For this, all three
     package managers would have to be changed.

> So... strengthen the anti-preservation case please- else I'd prefer to move
> forward so eapi3 can continue/finalize.

I still don't see any problems with A, but I'd also be fine with B or C. The important thing is that we get rid of the current undefined behaviour which leads to the known problems with Lisp and Python.
Comment 27 Ulrich Müller gentoo-dev 2009-04-09 20:50:32 UTC
Seems that the council doesn't want this, so it's sort of pointless to keep this bug open. Do what you want, I'm not going to waste any more time on this.
Comment 28 Marijn Schouten (RETIRED) gentoo-dev 2009-04-09 22:16:52 UTC
Ulrich is feeling a little melodramatic but we are still interested in this (and so are other parties), so I'm reopening this.
Comment 29 Ulrich Müller gentoo-dev 2009-07-03 09:23:02 UTC
This was already discussed by the council for EAPI 3 (but postponed). Therefore changing whiteboard status.
Comment 30 Ulrich Müller gentoo-dev 2009-08-22 20:26:14 UTC
Just for reference:
<http://archives.gentoo.org/gentoo-dev/msg_298f0ee6c441d33d5f52b901c8189235.xml>
Comment 31 Zac Medico gentoo-dev 2009-10-09 20:34:32 UTC
(In reply to comment #25)
> Looking through this bug, the tally for preservation vs not (or weird
> alternatives that do forced resets) seems to be in favor of preservation

For the record, I'm in favor of unconditional preservation of mtimes. If the package manager assumes a role in changing mtimes then that's taking control away from the ebuild and that seems like an unnecessary potential source of conflict.
Comment 32 Fabio Erculiani (RETIRED) gentoo-dev 2009-10-12 19:26:05 UTC
Entropy always preserved mtimes and is working fine on 18000 pkgs. So I think it's fine from binpkgs POV too.
Comment 33 Ulrich Müller gentoo-dev 2009-10-12 19:54:46 UTC
Included in EAPI 3 by today's council decision.
Comment 34 Ciaran McCreesh 2009-10-12 20:00:36 UTC
What did the Council say about handling files that are modified by the package manager?
Comment 35 Ulrich Müller gentoo-dev 2009-10-12 20:30:58 UTC
> What did the Council say about handling files that are modified by the package
> manager?

AFAICS, this mainly concerns behaviour of the prepall* functions (mainly prepallstrip), which PMS doesn't say much about anyway?

Comment 36 Ciaran McCreesh 2009-10-12 20:36:23 UTC
No, it concerns things like package manager handled compression and QA fixing. If we simply make PMS say that mtimes must be preserved, then Portage isn't compliant because it sometimes changes mtimes of things. On the other hand, if you say that the package manager can change the mtime of any file it wants to, we're back to ebuilds not being able to rely upon anything.
Comment 37 Ulrich Müller gentoo-dev 2009-10-12 23:12:18 UTC
The council's vote was "Always preserve timestamps when merging from D to ROOT." So this is about PMS chapter "Merging and unmerging", section "Regular files".

What happens to the timestamps in ${D} before merging is a different story.
IMO, we should simply document Portage's established behaviour (assuming that it hasn't recently changed), since I'm not aware of any problems with it.
Comment 38 Ciaran McCreesh 2009-10-12 23:13:44 UTC
Portage's established behaviour is to preserve mtimes except where it doesn't. Not exactly suitable language for a specification...
Comment 39 Ulrich Müller gentoo-dev 2009-10-12 23:32:17 UTC
It's not surprising that the _modification_ time of a file changes if the file's contents is _modified_, e.g. by stripping it. For Portage, this does not happen during merge, but between the src_install and pkg_preinst phases, and currently PMS has very little to say about it. I am not against specifying it better, but it's a different topic, maybe even independent of the EAPI.

When merging from D to ROOT (and this and nothing else is what this bug is about), recent Portage versions do preserve mtimes.
Comment 40 Ciaran McCreesh 2009-10-12 23:38:13 UTC
Well that's my point. If you say "mtimes are preserved", then Portage doesn't comply, since it modifies mtimes for some things. So there has to be a loophole.

But the loophole can't just be "except for any file the package manager potentially modifies" (not *actually* modifies, since some of the mtime changing ops Portage does can end up not changing anything), because if you say that then it's perfectly legal for a package manager to ignore all mtimes and say that it "potentially modifies" every file.

So what's the wording? Which files must the package manager preserve mtimes for, and which can it clobber?
Comment 41 Ulrich Müller gentoo-dev 2009-10-13 06:18:29 UTC
Created attachment 206929 [details, diff]
Patch for merge.tex

(In reply to comment #40)
> So what's the wording? Which files must the package manager preserve mtimes
> for, and which can it clobber?

See attachment.
Comment 42 David Leverton 2009-10-13 17:13:53 UTC
(In reply to comment #41)
> See attachment.

"Future versions of Paludis will perform two passes of ROT13 encryption on all merged files, for security purposes.  Naturally, the modification times will be updated accordingly."
Comment 43 Ulrich Müller gentoo-dev 2009-10-13 17:20:53 UTC
(In reply to comment #42)
> "Future versions of Paludis will perform two passes of ROT13 encryption on all
> merged files, for security purposes.  Naturally, the modification times will be
> updated accordingly."

Get a life.
Comment 44 David Leverton 2009-10-13 17:36:30 UTC
(In reply to comment #43)
> Get a life.

Write a proper spec, or let someone else do so.
Comment 45 Ulrich Müller gentoo-dev 2009-10-13 17:53:32 UTC
(In reply to comment #44)
> Write a proper spec, or let someone else do so.

Please first explain why a package manager should do such utter nonsense as in comment #42. Besides, the spec covers it, since that operation wouldn't modify the file contents.

If you think that the wording could be improved, how about contributing something constructive, like a patch with a better wording?

Otherwise, just shut up. I'm really tired of comments like yours.

Comment 46 Ciaran McCreesh 2009-10-13 17:56:22 UTC
(In reply to comment #45)
> (In reply to comment #44)
> > Write a proper spec, or let someone else do so.
> 
> Please first explain why a package manager should do such utter nonsense as in
> comment #42. Besides, the spec covers it, since that operation wouldn't modify
> the file contents.

It's a legitimate objection. The way you've worded it, it's entirely legal for package managers to clobber every single mtime. That's clearly not the intent, but we can't implement "what Ulrich means". The specification has to be specific, since it's impossible to code something that reads your mind.

As an example: is it legal to run a QA fixing script on any file that looks suspicious, where looking suspicious includes having an mtime from before the start of the build process?
Comment 47 Ulrich Müller gentoo-dev 2009-10-13 18:10:43 UTC
(In reply to comment #46)
> It's a legitimate objection. The way you've worded it, it's entirely legal
> for package managers to clobber every single mtime. That's clearly not the
> intent, but we can't implement "what Ulrich means". The specification has to
> be specific, since it's impossible to code something that reads your mind.

I think that you know perfectly well what is the intent here. Please help to find a better wording, if you think that the current one has loopholes.

> As an example: is it legal to run a QA fixing script on any file that looks
> suspicious, where looking suspicious includes having an mtime from before the
> start of the build process?

That would be option B of comment #26, and the council voted against it.
Comment 48 Ciaran McCreesh 2009-10-13 18:20:20 UTC
(In reply to comment #47)
> I think that you know perfectly well what is the intent here. Please help to
> find a better wording, if you think that the current one has loopholes.

The intent is to say "exactly what Portage does". The problem with that is that "exactly what Portage does" is modify mtimes on a pretty much arbitrary set of files, just not the ones you happen to want. Thus, the intent isn't something suitable for specification.

Either you need to tighten this up to forbid some of what Portage currently does (for some arbitrary version of Portage, since QA fixers change between versions), or you need to change your intent. We're not trying to be difficult here -- it's just that you haven't thought through what you're asking, and if we implement "don't change mtimes except where we have a good reason to do so", we'll almost certainly run into problems in the future.

> > As an example: is it legal to run a QA fixing script on any file that looks
> > suspicious, where looking suspicious includes having an mtime from before the
> > start of the build process?
> 
> That would be option B of comment #26, and the council voted against it.

But it *is* legal to run 'strip' on arbitrary files that look like they're probably strippable, even if they actually aren't, right?

Another example: is it legal to fix dodgy shebang lines automatically (for example, to support Prefix-installed Python that isn't in /usr/bin/), even if by doing so it will modify the mtimes of files that Python has byte-coded?
Comment 49 David Leverton 2009-10-13 19:00:15 UTC
(In reply to comment #45)
> Please first explain why a package manager should do such utter nonsense
> as in comment #42.

Not the point, it was a demonstration of how the wording is flawed.

> Besides, the spec covers it, since that operation wouldn't modify the
> file contents.

Yes it would, twice.

> If you think that the wording could be improved, how about contributing
> something constructive, like a patch with a better wording?

If people aren't going to insist on unreasonable requirements, sure.

> Otherwise, just shut up. I'm really tired of comments like yours.

Oh, there are lots of things I'm really tired of, but let's not go there.
Comment 50 Mark Loeser (RETIRED) gentoo-dev 2009-10-13 19:35:14 UTC
(In reply to comment #49)
> > Otherwise, just shut up. I'm really tired of comments like yours.
> 
> Oh, there are lots of things I'm really tired of, but let's not go there.

David,

Please take this nonsense somewhere else.  There were much easier and more constructive ways to make your original point without coming across like you did.

Comment 51 Ciaran McCreesh 2009-10-13 19:40:36 UTC
(In reply to comment #50)
> Please take this nonsense somewhere else.  There were much easier and more
> constructive ways to make your original point without coming across like you
> did.

Pick on the right person, please. Ulrich was the one who started the abuse in comment #43. Comment #42 was merely a demonstration that the wording Ulrich suggested was completely inadequate.
Comment 52 Mark Loeser (RETIRED) gentoo-dev 2009-10-13 19:42:34 UTC
(In reply to comment #51)
> Pick on the right person, please. Ulrich was the one who started the abuse in
> comment #43. Comment #42 was merely a demonstration that the wording Ulrich
> suggested was completely inadequate.
> 

Sorry, this isn't a discussion.  There was no reason for comment #42 to be made the way it was.  There were much easier ways to express the point, as I already stated.  If you have a problem with it, take it somewhere else, and not on this bug.  
Comment 53 David Leverton 2009-10-14 17:25:39 UTC
(In reply to comment #50)
> David,
> 
> Please take this nonsense somewhere else.  There were much easier and more
> constructive ways to make your original point without coming across like you
> did.
> 

Mark,

After discussing this on IRC with both you and UserRel, the only thing suggested that I might have done wrong is not get my point across clearly enough.  I apologise for this, and will try to use simpler arguments in future.

However, I have difficulty imagining how this could justify comments such as "get a life" or "take this nonsense somewhere else".  Might I suggest that if we all made an effort to be more civil to each other, Gentoo would be a lot more fun to contribute to for everyone.
Comment 54 Ulrich Müller gentoo-dev 2009-10-14 18:30:34 UTC
(In reply to comment #53)
> However, I have difficulty imagining how this could justify comments such as
> "get a life" [...]

I may have overreacted, in which case I apologise.

> Might I suggest that if we all made an effort to be more civil to each other,
> Gentoo would be a lot more fun to contribute to for everyone.

+1

The style in this bug was factual and civil until you entered the discussion with your maddening comment #42. After that, things escalated.
Comment 55 David Leverton 2009-10-14 19:54:14 UTC
(In reply to comment #54)
> (In reply to comment #53)
> > However, I have difficulty imagining how this could justify comments such as
> > "get a life" [...]
> 
> I may have overreacted, in which case I apologise.

Accepted, and I also apologise if I overreacted to any of your overreactions. ;-)

> The style in this bug was factual and civil until you entered the discussion
> with your maddening comment #42. After that, things escalated.
> 

That comment was not intended to be uncivil, but it seems it came across that way to some people, possibly due to language barriers, cultural differences or just different personalities.  As I say, I will do my best to avoid that happening in future.
Comment 56 Ciaran McCreesh 2009-10-15 22:56:31 UTC
Looking at implementing this, my current list of questions is:

* Preserve mtimes on what? Definitely just regular files?

* Preserve how? Is second-resolution enough, or are we expected to preserve nanosecond-resolution on systems that support it? What are the implications on nanosecond-resolution systems where some files are merged by a rename and some by a copy?

* Which files is the package manager allowed to modify mtimes for?

So far as I can see the Council didn't attempt to address any of these questions when mandating the EAPI 3 change. Without answers to these, I don't see the proposal as being implementable either as a standard or in code.
Comment 57 Ulrich Müller gentoo-dev 2009-12-08 16:15:36 UTC
From the preliminary summary of yesterday's council meeting:

| ACTIONS
| ========
| * ulm to talk with Zac Medico about defining current portage mtime
|   preservation behaviour for documenting in EAPI-3

| mtime preservation
| ==================
| The council majority voted to document precisely the current behavior of
| portage and what can be relied upon as part of the upcoming EAPI-3 (prefix
| support EAPI), so that since EAPI-3 the current portage behaviour can be
| relied upon from all compliant package managers. The exact behaviour needs
| to still get documented, however.

I've talked with Zac, and we come up with the following wording:

"In EAPIs listed in table [...], the package manager must preserve modification
times of regular files. This includes files being compressed before merging.
Exceptions to this are files newly created by the package manager and binary
object files being stripped of symbols. [1]

The modification time is guaranteed to be preserved with at least one-second
precision. If the package manager chooses to preserve sub-second timestamps,
then the full precision available with the underlying operating system,
filesystems, and programming language [2] must be used. Timestamps of all
files installed in the same destination filesystem must be handled in a
consistent way, especially their ordering [3] must be preserved. [4]

In other EAPIs, the behaviour with respect to file modification times is
undefined."


Remarks (not to appear in the final specification):

[1] A previous wording contained a "maybe others" placeholder, but we think
    that the above list is complete. Quoting zmedico:
    > I'm not sure about the "maybe others" part. You might just omit that for
    > now and we can update it later if we happen to notice something else.

[2] The "Python clause".
    Exact Portage behaviour is as follows (description from zmedico):

    If higher mtime precision is supported by both the filesystem used for
    $PORTAGE_TMPDIR and filesystem where the files will finally be installed,
    the floating-point mtime in units of seconds is preserved with the
    precision of a IEEE 754 double precision floating-point number which has
    a 53 bit significand. This gives 2^53 combinations which is equivalent to
    approximately 16 significant decimal digits.

    Given the current unix time which exceeds 10^9 seconds, with 10 digits to
    the left of the decimal and 6 to the right, the precision is 10^-6 seconds
    (1 microsecond). For mtimes after the year 2300 or so [the exact date is
    Sat, 20 Nov 2286 17:46:40 UTC], the unix time will exceed 10^10 seconds,
    which will leave only 5 digits of precision to the right of the decimal,
    giving a precision of 10^-5 seconds (10 microseconds).

[3] "Ordering must be preserved" means: if for any two timestamps the relation
    t1 <= t2 holds before merging, then it must also hold after merging. But I
    think we need not specify it in such detail.

[4] Thanks to Brian Harring for his comments on this paragraph.


As always, your comments are welcome. I would much prefer if we could reach
consensus with PMS and Paludis authors on the wording, which should of course
be based on the above mentioned council decision. In any case, we (zmedico and
myself) will proceed with this one week from now, even if no such consensus
could be reached.
Comment 58 Ciaran McCreesh 2009-12-08 16:25:34 UTC
Initial observations:

Portage doesn't get even single second precision right when the build filesystem supports sub-second resolutions.

Portage also doesn't preserve ordering, going by POSIX rules.

Some versions of Portage record the wrong mtime in VDB when sub-second resolutions are in effect. VDB's not part of the spec, but we need to consider this...
Comment 59 Ulrich Müller gentoo-dev 2009-12-08 16:49:33 UTC
(In reply to comment #58)
> Portage doesn't get even single second precision right when the build
> filesystem supports sub-second resolutions.

Then please provide us with an example where:
|t(before merge) - t(after merge)| >= 1 second

Comment 60 Ciaran McCreesh 2009-12-08 17:53:48 UTC
(In reply to comment #59)
> (In reply to comment #58)
> > Portage doesn't get even single second precision right when the build
> > filesystem supports sub-second resolutions.
> 
> Then please provide us with an example where:
> |t(before merge) - t(after merge)| >= 1 second

timestamp before merge: 123456789s 999999900ns, which as seen by a C app written to older POSIX standards, is 123456789s.

timestamp after merge: 123456790s 000000000ns, which as seen by a C app written to older POSIX standards, is 123456790s.

POSIX handles legacy code by defining st_mtime to be st_mtim.tv_sec. So what you're looking for is any case where st_mtim.tv_sec before merge != st_mtim.tv_sec after merge, which can variously consist of times with a sub-second part > 500 000 000ns, or with a second part approximately > 999 999 900 ns, depending upon which version of Portage you look at and which code-path is taken.

There are two distinct issues with Portage's code here.

The first is that rounding the float timestamp to get the second-resolution is wrong. It should instead take the seconds part, and ignore the remainder.

The second is that thanks to 754 floats being insufficiently accurate, timestamps in the form x.999 999 900 or higher will get rounded to (x + 1).000 000 000, so even if you discard the sub-second part as POSIX does, you still end up with the seconds part being wrong.
Comment 61 Martin von Gagern 2009-12-08 18:10:55 UTC
(In reply to comment #57)
> [3] "Ordering must be preserved" means: if for any two timestamps the relation
>     t1 <= t2 holds before merging, then it must also hold after merging. But I
>     think we need not specify it in such detail.

That definition would be bad, as it could be fulfilled by simply adjusting the
timestamp of all files to the same timestamp. As the Lisp example from comment
#0 suggests that differences need to be preserved, you'd have to either give
even more detail, or keep the "preserve order" formulation without the note and
rely on common sense and mathematical education. Should be enough.

How about rounding effects? If the file was created at time 12345.999, could
there be filesystems or operating systems reporting the integer resolution of
this file as 12346? If so, do you perhaps require complying package managers to
preserve the integral representation of the mtime on that os, however that
representation might be defined with respect to rounding?

I assume that all os would simply truncate timestamps, i.e. use floor to get
the integer representation. With a nanosecond resolution os and a millisecond
(e.g. python-based) package manager, rounding could still lead to trouble, as a
os timestamp could be rounded up by python, leading to a different integral
representation.

If mtimes are closely spaced, such slight rounding issues could impact order as
well, so I assume the definition should take them into account for that reason.

I just read that comment #60, wubmitted while I was reading, goes in a similar direction.
Comment 62 Ciaran McCreesh 2009-12-08 18:26:38 UTC
(In reply to comment #61)
> How about rounding effects? If the file was created at time 12345.999, could
> there be filesystems or operating systems reporting the integer resolution of
> this file as 12346? If so, do you perhaps require complying package managers to
> preserve the integral representation of the mtime on that os, however that
> representation might be defined with respect to rounding?

Timestamps aren't ever rounded. So far as POSIX is concerned, and so far as any C application using stat.st_mtime is concerned, Xs 000000000ns through Xs 999999999ns are all Xs. Filesystems that don't support nanosecond resolution timestamps similarly just set the seconds part to the number of whole seconds since the epoch, and the nanoseconds part is treated as being 0, even if you're one nanosecond away from the next whole second.

Using the newer stat.st_mtim approach, you get given the seconds and the nanoseconds as a pair of integer-ish values. Applications can choose to just use the tv_sec part of that pair, or to use both tv_sec and tv_nsec.

> I assume that all os would simply truncate timestamps, i.e. use floor to get
> the integer representation.

Sort of correct. There's no floor involved in the OS, however, since there aren't any floating point values. The nanosecond part is simply discarded or set to 0.

> With a nanosecond resolution os and a millisecond (e.g. python-based) package
> manager, rounding could still lead to trouble, as a os timestamp could be 
> rounded up by python, leading to a different integral
> representation.

Yup. Rounding is *always* wrong, because POSIX doesn't round. Truncation is also wrong if the resolution of your value isn't high enough to hold 123456890.999999999 without loss of precision, which doubles aren't.
Comment 63 Ulrich Müller gentoo-dev 2009-12-08 18:59:36 UTC
(In reply to comment #60)
> timestamp before merge: 123456789s 999999900ns
> timestamp after merge: 123456790s 000000000ns

I see only a difference of 100 nanoseconds here, which is well below the one second required by the spec.
Comment 64 Ciaran McCreesh 2009-12-08 19:01:41 UTC
(In reply to comment #63)
> (In reply to comment #60)
> > timestamp before merge: 123456789s 999999900ns
> > timestamp after merge: 123456790s 000000000ns
> 
> I see only a difference of 100 nanoseconds here, which is well below the one
> second required by the spec.

Because you conveniently ignored the entire rest of the comment. To a C application using the old stat.st_mtime interface, which as you pointed out is currently most things, there's a difference of one second there.
Comment 65 Ulrich Müller gentoo-dev 2009-12-08 19:26:59 UTC
(In reply to comment #64)
> > > timestamp before merge: 123456789s 999999900ns
> > > timestamp after merge: 123456790s 000000000ns
> > 
> > I see only a difference of 100 nanoseconds here, which is well below the
> > one second required by the spec.
> 
> Because you conveniently ignored the entire rest of the comment. To a C
> application using the old stat.st_mtime interface, which as you pointed out
> is currently most things, there's a difference of one second there.

If the real timestamp is 123456789.9999999 s, the old C application will get a time that is off by -0.9999999 s for the file before merging, and by +0.0000001 s for the file after merging. Both are below the maximum error of 1 s that is allowed.

I fully agree with you that usage of floating point numbers for timestamps sucks. However, this bug is not the place for such fundamental criticism of Python's concepts.

Please keep in mind that our task (by council's decision) is to document Portage's current behaviour, and I believe that the proposed wording covers the case of your example. And it's formulated in a way ("the full precision available with the underlying [...] programming language must be used") that should force Portage to upgrade to a better method as soon as one is available with Python.
Comment 66 Ciaran McCreesh 2009-12-08 19:37:02 UTC
(In reply to comment #65)
> If the real timestamp is 123456789.9999999 s, the old C application will get a
> time that is off by -0.9999999 s for the file before merging, and by +0.0000001
> s for the file after merging. Both are below the maximum error of 1 s that is
> allowed.

No, it won't. It will get the correct, as defined by POSIX, second-resolution time before merging, and an incorrect, as defined by POSIX, second-resolution time after merging. Any program that simply stores a list of (filename, second-resolution timestamp) pairs and checks that list for equality will break.

> Please keep in mind that our task (by council's decision) is to document
> Portage's current behaviour, and I believe that the proposed wording covers the
> case of your example. And it's formulated in a way ("the full precision
> available with the underlying [...] programming language must be used") that
> should force Portage to upgrade to a better method as soon as one is available
> with Python.

What you're documenting is behaviour that will sometimes break the very thing you're trying to make work. Let's have a look at what Python does:

	time_t mtime = srcstat->st_mtime;
	/* ... */
	PyMarshal_WriteLongToFile((long)mtime, fp, Py_MARSHAL_VERSION);

As you can see, it writes the second-part of the mtime to the compiled file. Then, to check:

static FILE *
check_compiled_module(char *pathname, time_t mtime, char *cpathname)
{
	/* ... */
	pyc_mtime = PyMarshal_ReadLongFromFile(fp);
	if (pyc_mtime != mtime) {
		if (Py_VerboseFlag)
			PySys_WriteStderr("# %s has bad mtime\n", cpathname);
		fclose(fp);
		return NULL;
	}

It reads the value back out, and again compares it to mtime using equality.

Thus, if the time at build is 1234567890.999999999, when writing the file, Python will write 1234567890. If the time after the merge is 1234567891.000000000, Python will compare 1234567891 to 1234567890, see that they are not equal, and barf.

Is it really the Council's decision to mandate a behaviour that cannot be used for bytecode-compiling Python modules?
Comment 67 Ulrich Müller gentoo-dev 2009-12-08 19:45:57 UTC
(In reply to comment #61)
> > [3] "Ordering must be preserved" means: if for any two timestamps the
> >     relation t1 <= t2 holds before merging, then it must also hold after
> >     merging. But I think we need not specify it in such detail.
> 
> That definition would be bad, as it could be fulfilled by simply adjusting
> the timestamp of all files to the same timestamp.

You must not take this definition alone. The spec also says: "The modification time is guaranteed to be preserved with at least one-second precision." So any timestamps that differ by >= 1 s cannot be adjusted to the same value. (And for sub-second differences there is no guarantee that different timestamps will stay different, because the target filesystem may truncate them.)
Comment 68 Martin von Gagern 2009-12-08 21:42:34 UTC
Would it make sense to require exactly second resolution timestamps, i.e. drop all sub-second information upon merge, but require the integral portion to be passed without any rounding effects? This would ensure maximum portability (as all os and most fs support it, with the exception of FAT iirc), maximum reproducability (as there are hopefully no bugs related to fractions of a second and unlikely rounding scenarios), and any packages or languages depending on sub-second resolution will require special care in any case (e.g. by the ebuild touching all files to integer second timestamps), and can probably be considered broken by design.

OK, this is not current portage behaviour, but PMS is not OOXML, we shouldn't be specifying buggy behaviour for the sake of compatibility with one specific app. And I assume the council's decision wasn't specifically about subsecond behaviour.
Comment 69 Ulrich Müller gentoo-dev 2009-12-08 22:07:38 UTC
Maybe the first sentence of the second paragraph can be tightened to:

  "The integer seconds part of the modification time is guaranteed to
   be preserved."

It should be easy to change Portage to preserve the integer seconds part (assuming for now that the example in comment #60 is real). We already have an idea how to do it. But I think the definite answer has to come from Zac.
So, stay tuned.
Comment 70 Ulrich Müller gentoo-dev 2009-12-11 06:48:43 UTC
Created attachment 212677 [details]
test-timestamps-1.ebuild

(In reply to comment #66)
> What you're documenting is behaviour that will sometimes break the very
> thing you're trying to make work.
> [...]
> Thus, if the time at build is 1234567890.999999999, when writing the file,
> Python will write 1234567890. If the time after the merge is
> 1234567891.000000000, Python will compare 1234567891 to 1234567890, see that
> they are not equal, and barf.

Yes, and the problem occured with a probability of 2^-23 = 1.19 * 10^-7. Anyway, it has been fixed in Portage 2.1.7.12 and 2.2_rc57.

Attached is an ebuild for testing.
Comment 71 Ulrich Müller gentoo-dev 2009-12-11 06:56:51 UTC
Created attachment 212678 [details, diff]
Patch for PMS

Looks like all points have been addressed, so here is the patch for PMS. The wording has been approved by zmedico.

Changes to previous version:
a) the seconds part is guarateed to be preserved, and
b) Support of sub-second timestamps is mandatory if supported by all filesytems.

(I didn't bother to clutter the patch with KDE conditionals, because kdebuild isn't part of the official PMS version. Add them yourself if you care.)
Comment 72 Ulrich Müller gentoo-dev 2009-12-11 07:12:37 UTC
(In reply to comment #60)
> The second is that thanks to 754 floats being insufficiently accurate,
> timestamps in the form x.999 999 900 or higher will get rounded to (x + 1).000
> 000 000, so even if you discard the sub-second part as POSIX does, you still
> end up with the seconds part being wrong.

Thank you for pointing this out, BTW.
Comment 73 Ulrich Müller gentoo-dev 2009-12-11 08:51:53 UTC
Created attachment 212680 [details, diff]
Patch for PMS

(In reply to comment #71)
> b) Support of sub-second timestamps is mandatory if supported by all
>    filesytems.

Can't do this, binpkgs don't preserve them (they are not mentioned in the spec, but I think we have to consider this). Thanks to ferringb for pointing this out.
Comment 74 David Leverton 2009-12-11 12:58:25 UTC
(In reply to comment #70)
> Anyway, it has been fixed in Portage 2.1.7.12 and 2.2_rc57.

I thought the goal of this exercise was to avoid changing Portage's behaviour?  If we're changing Portage after all, why do we need the exemption for stripped files?
Comment 75 Ulrich Müller gentoo-dev 2009-12-11 13:36:09 UTC
(In reply to comment #74)
> I thought the goal of this exercise was to avoid changing Portage's
> behaviour?

Right, but fixing of bugs is not forbidden. And I'm sure that the problem pointed out by Ciaran would have been fixed earlier if it only had happened more often. Remember, the chance to be hit by this issue was only one out of 8.4 millions (and even this only in the case that both ${D} and ${ROOT} were on different filesystems, but with support for nanosecond timestamps).
Comment 76 Ciaran McCreesh 2009-12-11 18:42:01 UTC
So far as I can see, Portage still doesn't give "the full precision available" with Python, since lopping 9s off the end isn't the most precise available algorithm. I think a less strict wording is required there.
Comment 77 Ciaran McCreesh 2009-12-11 18:50:39 UTC
Actually... Bigger problem. The code as-is still rounds mtimes up sometimes. This will lead to files looking newer than they really are if programs use st_mtim.

I really think the only viable solution here is to say something like:

Package managers must exactly preserve the seconds part of the mtime. The sub-second part must either be perfectly preserved or set to 0, and this must be done consistently across any particular destination filesystem.

Which again would need Portage changes, but Portage is already changing what it does...
Comment 78 Ulrich Müller gentoo-dev 2009-12-11 19:02:48 UTC
(In reply to comment #76)
> So far as I can see, Portage still doesn't give "the full precision
> available" with Python, since lopping 9s off the end isn't the most precise
> available algorithm.

Not sure if I understand you correctly, but this is about the branch that happens in 2^-23 of all cases? Without the change introduced in Portage 2.1.7.12, timestamps were changed upon merging as follows:

   1234567890.999999000 -> 1234567890.999999000
   1234567890.999999880 -> 1234567890.999999000
   1234567890.999999881 -> 1234567891.000000000
   1234567890.999999900 -> 1234567891.000000000

With that change, the last two behave as follows (first two are same as above):

   1234567890.999999881 -> 1234567890.999999000
   1234567890.999999900 -> 1234567890.999999000

So it is 1 µs off. Since all timestamps that I've seen in my tests with Python had the last three digits equal to zero, I don't see how we could do any better.

> I think a less strict wording is required there.

Then provide one please, if you still think that it's an issue.
Comment 79 Ulrich Müller gentoo-dev 2009-12-11 19:08:45 UTC
(In reply to comment #77)
> Actually... Bigger problem. The code as-is still rounds mtimes up sometimes.
> This will lead to files looking newer than they really are if programs use
> st_mtim.

And if you truncate to the next integer they will look older than they really are, and the difference to the real time will be much larger (namely 0.5 seconds, on average).

> Which again would need Portage changes,

Yes, and it would be an inferior solution, so we are not going to do this.
Comment 80 Ciaran McCreesh 2009-12-11 19:48:37 UTC
Have you verified that every package listed in comment #0 doesn't use sub-second resolutions if available?
Comment 81 Zac Medico gentoo-dev 2009-12-11 20:01:36 UTC
I've added this code to search for another digit of precision:

another_digit = None
for i in range(1, 9):
	i_str = str(i)
	if int_mtime != long(float(mtime_str + i_str)):
		break
	else:
		another_digit = i_str
if another_digit is not None:
	mtime_str += another_digit
	newmtime = float(mtime_str)
Comment 82 Ciaran McCreesh 2009-12-11 20:07:30 UTC
Hang on. Someone please explain:

How is partially preserving sub-second mtimes in a way that can round them up better than always setting them to 0?

As I see it, we're likely to encounter four possible behaviours:

* Code that uses st_mtime, and compares using <. In this case, any solution is fine.
* Code that uses st_mtime, and compares using =. In this case, newer Portage is fine, but the behaviour the Council voted on most recently isn't.
* Code that uses st_mtim, and compares it using <. In this case, what Portage currently does will sometimes break. Setting tv_nsec to 0, however, will have to always work.
* Code that uses st_mtim, and compares it using =. In this case, what Portage currently does will usually break. Setting tv_nsec to 0, however, will have to always work.

Why not just say that if tv_nsec can't be preserved exactly, it must be set to 0?
Comment 83 Ulrich Müller gentoo-dev 2009-12-11 20:33:01 UTC
(In reply to comment #80)
> Have you verified that every package listed in comment #0 doesn't use
> sub-second resolutions if available?

Yes:

(In reply to comment #0)
>   - Lisp (*.fasl must be newer than corresponding *.lisp)
>   - Emacs (*.elc must not be older than *.el)

All affected Lisp versions use st_mtime only. And it's not really relevant, since they use mtimes for comparison only. But sub-second resolution might be an advantage, since it might avoid equality of timestamps in rare cases.

>   - ghdl/VHDL (libraries check their mtimes)

AFAICS, creates a Time_Stamp_Id for labelling its libraries from a broken-down time (whatever is the equivalent of a "struct tm" in ADA). Seconds precision only.

>   - Python (*.pyc vs *.py, not sure if this one is still an issue)

See your own comment #66.
Comment 84 Ulrich Müller gentoo-dev 2009-12-11 20:53:31 UTC
(In reply to comment #83)
> All affected Lisp versions use st_mtime only. And it's not really relevant,
> since they use mtimes for comparison only. But sub-second resolution might be
> an advantage, since it might avoid equality of timestamps in rare cases.

I have to correct myself here. Of course, sub-second resolution doesn't make any difference in this case because the time is read with second precision only.
Comment 85 Ciaran McCreesh 2009-12-11 20:58:37 UTC
Ok, so what's our plan for the future, when those packages start using st_mtim as POSIX suggests?
Comment 86 Ulrich Müller gentoo-dev 2009-12-11 21:11:06 UTC
(In reply to comment #85)
> Ok, so what's our plan for the future, when those packages start using st_mtim
> as POSIX suggests?

Lisp/Emacs: Use mtimes for less-than comparison only, so it won't be an issue.

Python: Will read the timestamp with the precision given by the language, so it can't reasonably do equality comparison until it will support full precision, at which point the issue will go away.

Leaves us with GHDL which would have to change its timestamp format in libraries. And as it's a sci-* package I think they have other things to care about. ;-)
Comment 87 Ciaran McCreesh 2009-12-11 21:15:25 UTC
(In reply to comment #86)
> Lisp/Emacs: Use mtimes for less-than comparison only, so it won't be an issue.

Except that the current code will sometimes round timestamps up, and sometimes down.

> Python: Will read the timestamp with the precision given by the language, so it
> can't reasonably do equality comparison until it will support full precision,
> at which point the issue will go away.

Er. That's not even the case currently. Python uses C code to do mtime comparisons, and Python 3 is using st_mtim in various places in the C code already.

This whole thing reeks of "auto space like Word 95"... Seriously, why don't we just fix it properly?
Comment 88 Ulrich Müller gentoo-dev 2009-12-11 21:41:39 UTC
(In reply to comment #87)
> > Lisp/Emacs: Use mtimes for less-than comparison only, so it won't be an
> > issue.
> 
> Except that the current code will sometimes round timestamps up, and sometimes
> down.

So what? It doesn't change relative ordering and that's all what matters.

> > Python: Will read the timestamp with the precision given by the language,
> > so it can't reasonably do equality comparison until it will support full
> > precision, at which point the issue will go away.

> Er. That's not even the case currently. Python uses C code to do mtime
> comparisons, and Python 3 is using st_mtim in various places in the C code
> already.

Still, it's a hypothetical scenario if (and in which order) such changes in Python will take place.

> This whole thing reeks of "auto space like Word 95"... Seriously, why don't
> we just fix it properly?

Seems to me that we should use the maximum precision that is available in a reasonable way, and not arbitrarily truncate timestamps. Microsecond precision is better than second precision.
Comment 89 Ciaran McCreesh 2009-12-11 21:48:34 UTC
No, microsecond precision is not better than truncating to 0. Code that uses st_mtim must support truncation to 0, since some filesystems do that, but it is entirely reasonable for that code to use equality comparison for tv_nsec.

So far as everything except Python is concerned, times are either an integer-ish or a pair of integer-ishes, and they can be compared using either less than or equality. Since none of the programs we're discussing use Python code for any of the timestamp checking, why do we think that any of those programs could possibly use an "approximately equal" comparison algorithm?
Comment 90 Ulrich Müller gentoo-dev 2009-12-11 22:06:44 UTC
Again, let me remind you that we document Portage behaviour, and there are no known bugs with its current behaviour.
Comment 91 Ciaran McCreesh 2009-12-11 22:09:49 UTC
The Council said to document Portage behaviour. So does that mean Portage behaviour when the vote was taken? If so, that's not what's being documented now, because of the change to avoid corrupting the seconds part.

So I guess the Council wants us to document Portage's behaviour once the bugs are fixed, in which case I consider corruption of tv_nsec to be a bug, and would like you to explain why it is in fact a feature.
Comment 92 Ciaran McCreesh 2009-12-11 22:32:00 UTC
I had a closer look at the log, rather than the summary. Looks like what was voted on was what Portage did on the day of the meeting:

22:23:20 <Calchan>	we'll define mtime preservation based on what portage does today in EAPI3 onwards
22:23:36 <ulm>	right
22:23:50 <dertobi123>	ack

So it looks like we have to go back to the Council anyway unless we want to document that the second part can be clobbered by up to one second...
Comment 93 Ulrich Müller gentoo-dev 2009-12-11 22:32:36 UTC
(In reply to comment #91)
> So I guess the Council wants us to document Portage's behaviour once the bugs
> are fixed, in which case I consider corruption of tv_nsec to be a bug,

I don't. At least preserving it at µs level is less a bug than replacing it by zero. Average error is 500 ns in the first case vs 500000000 ns in the second case. (Well, in fact it's square distributions, so there are some additional factors of sqrt(12) for the standard deviation here, but I'm too lazy to work it out now. You get the idea.)
Comment 94 Ciaran McCreesh 2009-12-11 22:35:14 UTC
It's not an average error, though. It's either completely right or it's wrong. There's no "almost correct" here.
Comment 95 Ciaran McCreesh 2009-12-11 22:48:44 UTC
Having reviewed the log carefully, here's wording that I think matches the Council decision:

In EAPIs listed in (table), for all regular files except binary files that have been stripped of symbols, the package manager must ensure that the seconds part of the mtime of the merged file is either equal to or one second higher than the seconds part of the mtime of the file in the destdir, and that the full resolution mtime of the merged file is no more than two seconds different from the full resolution mtime of the file in the destdir.

Is everyone happy with this?
Comment 96 Ulrich Müller gentoo-dev 2009-12-11 22:53:38 UTC
(In reply to comment #92)
> Looks like what was voted on was what Portage did on the day of the meeting:

That's quibbling. And what was actually voted on was topic 5.3 of the agenda.

> So it looks like we have to go back to the Council anyway

There's no use in iterating this thing endlessly.

The current discussion on nanoseconds is purely academic anyway, since there's not a single known bug caused by current Portage behaviour.

> unless we want to document that the second part can be clobbered by up to
> one second...

That would be nonsensical, now that the bug in Portage has been fixed. And I'm sure that the council is of the same opinion.

And as has been pointed out before, the probability of hitting this was 2^-23, which is well below the chance of being killed by a stroke of lightning. ;-)
Comment 97 Ulrich Müller gentoo-dev 2009-12-11 22:57:41 UTC
(In reply to comment #95)
> the package manager must ensure that the seconds part of the mtime of the
> merged file is either equal to or one second higher than the seconds part of
> the mtime of the file in the destdir, [...]

> Is everyone happy with this?

Stop it, please.
Comment 98 Ciaran McCreesh 2009-12-11 23:00:11 UTC
The probability of it hitting is around one in ten million per file, but the average user probably has ten thousand such files on their system...

And no, the discussion on nanoseconds is ensuring we won't have to go over this whole thing again for EAPI 5. If you're saying that the Council decision allows Portage to fix its buggy behaviour for seconds before we write up the spec (which makes things a lot easier for us), then that same decision allows Portage to fix its buggy behaviour for nanoseconds before we write up the spec. Why don't we do that?

Why are we specifying something that we know is broken, and that we know will come back to haunt us later?
Comment 99 Ulrich Müller gentoo-dev 2009-12-11 23:04:27 UTC
(In reply to comment #98)
> The probability of it hitting is around one in ten million per file, but the
> average user probably has ten thousand such files on their system...

Then show me a real bug that was reported because of this.
Comment 100 Ciaran McCreesh 2009-12-11 23:06:44 UTC
(In reply to comment #99)
> Then show me a real bug that was reported because of this.

We don't know, because such failures are typically silently ignored. That does not mean that the bug was not genuine, however, nor that it did not affect people.
Comment 101 David Leverton 2009-12-11 23:08:53 UTC
(In reply to comment #99)
> Then show me a real bug that was reported because of this.

Really, why is it so difficult to just do things properly, and make sure that there won't ever be any such "real" bugs in future?
Comment 102 Zac Medico gentoo-dev 2009-12-11 23:13:17 UTC
Everyone can agree that the integral seconds portion needs to be preserved, and nanoseconds portion isn't used by any known packages. So, the spec doesn't need to be any tighter than that.
Comment 103 Ciaran McCreesh 2009-12-11 23:15:32 UTC
Isn't used by any known package *currently*. But since POSIX considers st_mtim to be the way to go, we can either hope that everyone ignores POSIX so we won't have to fix all this again next year, or we can fix it properly now.
Comment 104 Ulrich Müller gentoo-dev 2009-12-11 23:23:04 UTC
(In reply to comment #103)
> Isn't used by any known package *currently*. But since POSIX considers st_mtim
> to be the way to go, we can either hope that everyone ignores POSIX so we won't
> have to fix all this again next year, or we can fix it properly now.

The spec says "The package manager *may* also preserve sub-second timestamps" (my emphasis) so PM authors are free to implement it in a way that can truncate it to seconds precision.
Comment 105 Zac Medico gentoo-dev 2009-12-11 23:25:17 UTC
Even code that migrates to st_mtim may only use st_mtim.tv_sec and ignore st_mtim.tv_nsec, so it's possible that it won't be an issue. We can tighten the spec later if it becomes necessary.
Comment 106 Ciaran McCreesh 2009-12-11 23:26:33 UTC
Yes, but if it's only a may, ebuild authors cannot rely upon it happening. Thus, as soon as, say, Python 3.2 starts using st_mtim throughout (rather than just in some places, like 3.1 does), ebuild authors will no longer be able to use mtime preservation.

In order to prevent this from hitting us later on, we need to guarantee that tv_nsec won't be corrupted.

Why do we have to leave this to screw us over later, rather than just fixing it now?
Comment 107 Ulrich Müller gentoo-dev 2009-12-11 23:29:04 UTC
(In reply to comment #106)
> In order to prevent this from hitting us later on, we need to guarantee that
> tv_nsec won't be corrupted.

It's not being corrupted, it's a slight loss of precision.
Comment 108 Zac Medico gentoo-dev 2009-12-11 23:29:26 UTC
(In reply to comment #106)
> Yes, but if it's only a may, ebuild authors cannot rely upon it happening.
> Thus, as soon as, say, Python 3.2 starts using st_mtim throughout (rather than
> just in some places, like 3.1 does), ebuild authors will no longer be able to
> use mtime preservation.

They'll still be able to rely on it to the extent that the spec guarantees.
Comment 109 Ciaran McCreesh 2009-12-11 23:31:53 UTC
(In reply to comment #107)
> (In reply to comment #106)
> > In order to prevent this from hitting us later on, we need to guarantee that
> > tv_nsec won't be corrupted.
> 
> It's not being corrupted, it's a slight loss of precision.

No, for any application written in C, it's corruption. Any application doing, for example, what Python currently does, except with st_mtim instead of st_mtime, will break.

(In reply to comment #108)
> They'll still be able to rely on it to the extent that the spec guarantees.

In other words, not at all, since the spec does not guarantee that equality will hold true.

Seriously. What's so wrong about fixing this properly now as a precaution so we can guarantee we're not going to get bitten by hard to notice and weird bugs later on?
Comment 110 Ulrich Müller gentoo-dev 2009-12-11 23:35:49 UTC
(In reply to comment #109)
> > It's not being corrupted, it's a slight loss of precision.
> 
> No, for any application written in C, it's corruption. Any application doing,
> for example, what Python currently does, except with st_mtim instead of
> st_mtime, will break.

And if the timestamp is arbitrarily truncated to full seconds that won't happen? 
Can you predict how future programs implement time handling?
Comment 111 Ciaran McCreesh 2009-12-11 23:37:31 UTC
Programs will have to be written to deal with truncation for as long as there are truncating filesystems out there.
Comment 112 Ulrich Müller gentoo-dev 2009-12-11 23:40:06 UTC
I don't see much progress on this bug, so I suggest that we take the wording attached with comment #73 to the council for approval. Alternative wordings haven't been proposed (ignoring #95 which I believe wasn't serious).

If you have problems with it, then please escalate it to the PMS project lead.
Comment 113 Ulrich Müller gentoo-dev 2009-12-12 01:19:21 UTC
Also spreading false facts like:

  "the seconds part of mtimes may sometimes be one higher than they should be"

<http://ciaranm.wordpress.com/2009/12/11/eapi-3-to-specify-auto-space-like-word-95/>

while we are in the middle of a decision process, is not helpful at all.

You disqualify yourself by such postings.
EOD from my side.
Comment 114 Ciaran McCreesh 2009-12-12 01:31:16 UTC
I'm merely reporting what the Council voted upon. The Council did, after all, vote to document "the current behaviour of Portage", and at the time they voted that, the behaviour of Portage involved corrupting the seconds part. I've checked the log and the summary carefully, and the wording in comment #95 describes what the Council voted in.

The Council specifically did not vote to document "the behaviour of Portage, after various bugs had been fixed". The Council had already been told that Portage corrupted the seconds part of the mtime, and they *still* voted in favour of doing what Portage did back then, knowing that such corruption occurred.

If you want to go back to the Council yet again and get them to revise that decision, feel free to try. But given the number of times I've asked them to do this right, and the number of times they haven't, I honestly don't think I can face sending it back for yet another vote.
Comment 115 David Leverton 2009-12-12 01:56:44 UTC
(In reply to comment #113)
> You disqualify yourself by such postings.

Funny, I would think people disqualify themselves by doing things like thinking that floating point timestamps are a sensible idea, or demanding that a new EAPI should match existing behaviour exactly, no matter how flawed.  But each to his own.
Comment 116 Ulrich Müller gentoo-dev 2009-12-13 21:06:50 UTC
New wording, dropping the "Python clause", and corresponding to Portage 2.1.7.14 and 2.2_rc59 (please note that both Portage versions are also still covered by the wording attached with comment #73, i.e. the spec has been tightened):

   The integer seconds part of the modification time is guaranteed
   to be preserved. The package manager may also preserve sub-second
   timestamps; in this case, the full precision available with the
   underlying operating system and filesystems must be used, and
   timestamps must not be increased. Timestamps of all files installed
   in the same destination filesystem must be handled in a consistent
   way, especially their ordering must be preserved.

It's a bit verbose, but I think we can't simply say "preserve nanoseconds exactly" if we consider Prefix where there may be all sorts of strange OSs and FSs.

Happier with this?
Comment 117 Ciaran McCreesh 2009-12-13 21:13:28 UTC
Has Portage changed in the past couple of days to satisfy "must not be increased"? 754's default rounding mode is to round, not to go towards zero.

I find the last sentence to be fairly hard to read. I'm not sure I can concoct a clearer alternative just now, but I think we'd benefit from someone coming up with one.
Comment 118 Ulrich Müller gentoo-dev 2009-12-13 21:29:54 UTC
(In reply to comment #117)
> Has Portage changed in the past couple of days to satisfy "must not be
> increased"? 754's default rounding mode is to round, not to go towards zero.

Please note that I've dropped "programming language" from the second sentence, which in all normal cases implies that nanoseconds either must be exactly preserved or dropped entirely. Therefore any properties of IEEE 754 floats are not relevant any more.
Comment 119 Ciaran McCreesh 2009-12-13 21:43:01 UTC
Ah, Portage has changed, so we're now specifying what I've been after all along. Goodgood. So it's just that last sentence then -- can someone either suggest an alternative wording for it or just tell me it's only me who thinks it's really confusing please?
Comment 120 Ulrich Müller gentoo-dev 2009-12-13 21:48:47 UTC
(In reply to comment #119)
> So it's just that last sentence then -- can someone either suggest an
> alternative wording for it or just tell me it's only me who thinks it's
> really confusing please?

Probably a native English speaker is asked for, but what about:
"The package manager must treat all timestamps in the same destination filesystem consistently, and especially preserve their ordering from before the merge."
Comment 121 Ciaran McCreesh 2009-12-13 22:06:52 UTC
How about:

When preserving, the seconds part of every regular file's mtime must be preserved exactly. The sub-second part must either be set to zero, or preserved as precisely as is supported by the underlying operating system and filesystem, with any unpreservable precision being truncated (not rounded).

For any given destination filesystem, the package manager must ensure that there are no two preserved files a, b in that filesystem such that the comparison mtime(a) <= mtime(b) does not give the same result as the same comparison would have given under the original image directory.

That's really what we're trying to enforce, isn't it?
Comment 122 Ulrich Müller gentoo-dev 2009-12-13 22:20:44 UTC
(In reply to comment #121)
> When preserving,

This seems redundant?

> the seconds part of every regular file's mtime must be preserved
> exactly. The sub-second part must either be set to zero,

O.K.

> or preserved as precisely as is supported by the underlying
> operating system and filesystem, with any unpreservable precision
> being truncated (not rounded).

Replace by:
"or set to the greatest value supported by the operating system and
filesystem that is not greater than the original time."

> For any given destination filesystem, the package manager must
> ensure that there are no two preserved files a, b in that filesystem
> such that the comparison mtime(a) <= mtime(b) does not give the same
> result as the same comparison would have given under the original
> image directory.

O.K.

> That's really what we're trying to enforce, isn't it?

Yes.
Comment 123 Ciaran McCreesh 2009-12-13 22:23:21 UTC
Ok on all your changes. I think the "When preserving" is necessary to make it absolutely clear that we're excluding the stripped things we mentioned in an earlier paragraph.
Comment 124 Ulrich Müller gentoo-dev 2009-12-13 22:36:05 UTC
Created attachment 212928 [details, diff]
Patch for PMS

Consensus? I can't believe it. :-)

New patch for PMS attached.
Comment 125 Ulrich Müller gentoo-dev 2009-12-13 22:41:28 UTC
Created attachment 212930 [details, diff]
Updated patch for PMS.

(In reply to comment #124)
> New patch for PMS attached.

... with a typo, therefore another update.
Comment 126 Ciaran McCreesh 2009-12-13 22:42:26 UTC
What! Of course you get consensus when you do what I've been asking for all along!

Do we need to send this via the Council since it's not what they voted on, or do we just shove it in and tell them when we send EAPI 3 for review?

Also, you're missing a blank line after the subject in your format-patch.
Comment 127 Ulrich Müller gentoo-dev 2009-12-13 22:48:16 UTC
(In reply to comment #126)
> Do we need to send this via the Council since it's not what they voted on, or
> do we just shove it in and tell them when we send EAPI 3 for review?

I think the latter should be enough. I don't want do delay this any longer.

> Also, you're missing a blank line after the subject in your format-patch.

No, it's just a long subject. ;-) But split the second sentence off if you want.
Comment 128 Ulrich Müller gentoo-dev 2009-12-14 05:54:10 UTC
(In reply to comment #122)
> > For any given destination filesystem, the package manager must
> > ensure that there are no two preserved files a, b in that filesystem
> > such that the comparison mtime(a) <= mtime(b) does not give the same
> > result as the same comparison would have given under the original
> > image directory.

Another minor point: Do we really need the double negation here?
"For any given destination filesystem, the package manager must ensure that for any two preserved files a, b in that filesystem the comparison mtime(a) <= mtime(b) gives the same result as the same comparison would have given under the original image directory."
Comment 129 Ulrich Müller gentoo-dev 2009-12-14 06:12:29 UTC
Even worse, the definition is flawed: If mtime(a) = 1234567890.1 and mtime(b) = 1234567890.0 and both are truncated to seconds, then the <= comparison is false in ${D} but true in ${ROOT}.

So we go back to footnote [3] in comment #57:
"For any given destination filesystem, the package manager must ensure that for any two preserved files a, b in that filesystem the relation mtime(a) <= mtime(b) still holds if it would have hold under the original image directory."
Comment 130 Ulrich Müller gentoo-dev 2009-12-14 18:41:15 UTC
Created attachment 213015 [details, diff]
Patch for PMS

Updated patch with the wording from my previous comment.
Comment 131 Christian Faulhammer (RETIRED) gentoo-dev 2009-12-16 22:48:44 UTC
(In reply to comment #130)
> Created an attachment (id=213015) [details]
> Patch for PMS
> 
> Updated patch with the wording from my previous comment.

 We can go with that.
Comment 132 Ulrich Müller gentoo-dev 2010-01-10 16:56:02 UTC
Created attachment 215997 [details, diff]
Patch rebased against current PMS master
Comment 133 Christian Faulhammer (RETIRED) gentoo-dev 2010-01-11 08:30:38 UTC
(In reply to comment #132)
> Created an attachment (id=215997) [details]
> Patch rebased against current PMS master

 I think we reached consensus about this and it will be applied tonight.
Comment 134 Christian Faulhammer (RETIRED) gentoo-dev 2010-01-11 20:39:26 UTC
Committed...do we close the bug?
Comment 135 Ulrich Müller gentoo-dev 2010-01-11 22:14:52 UTC
(In reply to comment #134)
> Committed...do we close the bug?

I think the recommended procedure is to leave it open until final approval of EAPI 3.
Comment 136 Ulrich Müller gentoo-dev 2010-01-18 22:13:02 UTC
EAPI 3 approved by council.