168748 – genlop -c should take better weighted average


Status:	CONFIRMED

Alias:	None

Product:	Portage Development
Classification:	Unclassified
Component:	Unclassified
Hardware:	All Linux

Importance:	High normal
Assignee:	Portage Tools Team

URL:
Whiteboard:
Keywords:

Depends on:	297517
Blocks:

Reported:	2007-02-28 15:38 UTC by Marijn Schouten (RETIRED)
Modified:	2023-07-18 21:58 UTC
CC List:	7 users

See Also:
Package list:
Runtime testing required:	---


Description Marijn Schouten (RETIRED) gentoo-dev

2007-02-28 15:38:48 UTC

genlop -c takes the average of all previous merge times, which seems like reasonable behaviour but isn't.

$ genlop -t lilypond
 * media-sound/lilypond

     Tue Jan 23 15:16:00 2007 >>> media-sound/lilypond-2.10.13
       merge time: 5 minutes and 23 seconds.

     Tue Feb  6 15:26:01 2007 >>> media-sound/lilypond-2.10.16-r5
       merge time: 5 minutes and 22 seconds.

     Fri Feb 16 13:44:39 2007 >>> media-sound/lilypond-2.10.17
       merge time: 24 minutes and 41 seconds.

     Wed Feb 28 16:13:03 2007 >>> media-sound/lilypond-2.10.20
       merge time: 25 minutes and 16 seconds.                                                             

$ genlop -c

 * media-sound/lilypond-2.10.20

       current merge time: 12 minutes and 35 seconds.
       ETA: 2 minutes and 35 seconds.
$ genlop -c

 * media-sound/lilypond-2.10.20

       current merge time: 22 minutes and 10 seconds.
       ETA: any time now.

The large time difference is due to forcing -j1 and building docs, each of which is good for approximately a factor of 2 more merge time on my box.

The weight of an exact version match should be increased significantly.

Comment 1 Dan 2007-03-01 04:06:22 UTC

This does not really make sense to me.  You could change the USE flags on a package and reinstall the same version, as well as all sorts of other flags, which could skew the results even if the current one was weighted much higher.  It's a prediction; there really is no way for it to be correct.

Comment 2 Marijn Schouten (RETIRED) gentoo-dev

2007-03-01 10:30:21 UTC

Yes, you are correct that use flags can make a huge difference, as they do in this case. Could not the use flags be added to what goes into emerge.log so exact matches can be weighted higher?

Comment 3 Jon Malachowski 2008-01-16 23:57:05 UTC

(In reply to comment #1)
> This does not really make sense to me.  You could change the useflags on a
> package and reinstall the same version, as well as all sorts of other flags,
> which could skew the results even if the current one was weighted much higher.
> It's a prediction; there really is no way for it to be correct.

That's not the point. The point is that the best prediction of a future emerge time comes from the most recent emerge times (without using more information than genlop is currently gathering).  Whether it is a USE flag or a RAM upgrade, it will come out in the wash much faster by changing the averaging coefficient from 1 to something else.

A regressive average wouldn't be too hard and wouldn't drastically increase calculation time as you are already gathering all the necessary information except perhaps the order.
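One possible reading of a "regressive average" is an exponentially weighted mean, where each new merge time pulls the estimate towards itself. The sketch below is only an illustration of that idea, not genlop code: the function name and the smoothing factor `alpha` are assumptions, and the input is the first three merge times from the lilypond output in the description, converted to seconds.

```python
def recency_weighted_mean(times, alpha=0.5):
    """Exponentially weighted mean of merge times given oldest first.

    alpha is a hypothetical smoothing factor: the closer to 1, the more
    the newest merge dominates the estimate.
    """
    if not times:
        raise ValueError("no merge times available")
    estimate = times[0]
    for t in times[1:]:
        # Move the running estimate towards the newer merge time.
        estimate = alpha * t + (1 - alpha) * estimate
    return estimate


# Merge times in seconds for lilypond 2.10.13, 2.10.16-r5 and 2.10.17.
lilypond = [323, 322, 1481]
plain = sum(lilypond) / len(lilypond)       # plain arithmetic mean
weighted = recency_weighted_mean(lilypond)  # recency-weighted estimate
print(f"plain: {plain:.0f}s, weighted: {weighted:.0f}s")
```

With these inputs the weighted estimate lands closer to the most recent merge than the plain mean does, and only the order of the merges is needed in addition to the times.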

Comment 4 DEMAINE Benoît-Pierre, aka DoubleHP 2008-10-23 00:43:27 UTC

(In reply to comment #3)
> That's not the point. The point is the best prediction of a future emerge time
> is the most recent emerge times (without using more information than genlop is
> currently gathering).  Whether it is a use flag, or a ram upgrade, it will come
> out in the wash much faster by changing the averaging coef from 1 to something
> else.

No, this is wrong. I have had this problem both ways: genlop heavily under- and over-estimating the remaining time. It really depends on many things. One example is having several slots of the same package installed, for instance when deps require both Qt3 and Qt4.

Merge time is, as you say, affected by -j, but also by distcc and by how much CPU time is actually free. Example: when a heavy page makes my Firefox use 30 to 50% CPU, the real compile time will be way longer than estimated. If the first merge of an ebuild happens under those conditions, then the second merge will be the faster one.

In short, real life produces many cases that make prediction really complicated, sometimes towards under-estimation, other times towards over-estimation. I have been thinking about this since late 2005, and never opened a bug because I alternately meet both cases. We cannot expect genlop to be clever enough, at that level, to probe the machines available for distcc, check which processes are running, how many CPUs there are, whether they are used by other processes, whether those processes are emerges (worst case: emerging OOo at the same time as Firefox: the Firefox emerge time will double, but this will have little impact on the OOo merge time) ...

I ask for "CANTFIX". Developing AI for this feature would use more CPU time than the merge itself :) so please, just close this bug. The current lightweight approach is imprecise, but making things "a bit" more accurate would take "a lot more" CPU time.

Qt is an example of an ebuild for which genlop would need to check the version; but taking the version into account is not significant for all ebuilds => it would need to check the major version for "selected" ebuilds ... please no => forget it.

Comment 5 Luca Lesinigo 2008-11-16 23:29:57 UTC

Changing system load, USE flags, parallel builds, distributed builds, etc. are out of genlop's control and/or too difficult to use in the prediction, so I think the correct solution is the current one: ignore them. genlop predictions will be better for similar system conditions, and that's all.

On the other hand, IMHO giving more weight to exact and/or 'near' version matches is a good idea. Keeping the same hardware, 0 load, same compiler, etc., the only thing that remains is the actual source code being compiled. Large packages show wildly varying merge times between major releases, and sometimes even between minor ones.

On systems with a year or more of merge history this could give better results and I honestly don't see any possible drawbacks.

Also, I haven't actually looked into the code, but if genlop does the usual arithmetic mean sum(merge_times)/merge_count it would be very vulnerable to outliers: think about a merge during heavy system load that takes forever, or a single merge with a large distcc cluster behind it. A simple improvement would be to discard merge times that are too "far", e.g. those that are outside +/- 20% of the median.
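A minimal sketch of that outlier filter, assuming merge times are available as a list of seconds. The function name, the `tolerance` parameter and the sample numbers are made up for illustration and are not taken from genlop.

```python
from statistics import mean, median


def trimmed_mean(times, tolerance=0.20):
    """Mean of the merge times lying within +/- tolerance of the median.

    tolerance=0.20 corresponds to the 20% band suggested above.
    """
    if not times:
        raise ValueError("no merge times available")
    mid = median(times)
    kept = [t for t in times if abs(t - mid) <= tolerance * mid]
    # With an even number of samples the median may fall between two
    # distant values, so nothing survives; fall back to the median itself.
    return mean(kept) if kept else mid


# Hypothetical history: three normal merges and one under heavy load.
history = [300, 310, 320, 1500]
print(trimmed_mean(history))  # the 1500 s merge is discarded
```

Note that a history split into two clusters (such as before and after a USE flag change) can leave no sample inside the band, which is why the sketch falls back to the median.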

Just my two cents.

Comment 6 DEMAINE Benoît-Pierre, aka DoubleHP 2008-11-17 20:09:42 UTC

(In reply to comment #5)

20% is not enough for things like Qt or Tcl; stupid example: you emerge apache a first time with no flags, by mistake, then merge it again with many flags (likely to add 30% or 40% to the time). If you trigger at 20%, the estimate for the third compile will be wrong. The trigger should rather be 50 to 70%.

The good ideas I see in this comment are:
- give a bigger weight to the closest version
- if downgrading, ignore the merge times of later versions, or weight them with a very low coef. This would also avoid having to perform any kind of "profile"/major-version check I mentioned in comment 4.

Once the application gets a table containing versions and times, this kind of discrimination and these coefs become easy to implement.
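As a rough sketch of what such a table could feed into, the example below weights each past merge by how many leading version components it shares with the version being built, and skips later versions when downgrading. Everything here is an assumption for illustration: versions are simplified to integer tuples (real Gentoo versions with suffixes and -rN revisions need a proper comparison), the weight `2 ** shared components` is arbitrary, and the history values are invented.

```python
def shared_prefix(a, b):
    """Number of leading version components that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def version_weighted_mean(history, target, ignore_newer=True):
    """Weighted mean of merge times, favouring versions close to target.

    history is a list of (version_tuple, seconds) pairs.
    Returns None when no usable entry remains.
    """
    total = 0.0
    weight_sum = 0.0
    for version, seconds in history:
        if ignore_newer and version > target:
            continue  # downgrade case: skip merges of later versions
        weight = 2 ** shared_prefix(version, target)  # hypothetical weight
        total += weight * seconds
        weight_sum += weight
    return total / weight_sum if weight_sum else None


# Invented history for a package with two very different major versions.
history = [((3, 3, 8), 1200), ((4, 4, 2), 3600), ((4, 5, 0), 4000)]
print(version_weighted_mean(history, (4, 5, 1)))  # upgrade to 4.5.1
print(version_weighted_mean(history, (4, 4, 2)))  # downgrade to 4.4.2
```

In the downgrade case the 4.5.0 entry is ignored and the exact 4.4.2 match dominates the estimate.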

Comment 7 Vincent de Phily 2009-12-19 12:44:54 UTC

Taking use-flags or load-average into account is probably too much work for the improvement it could bring, but I think giving a bigger weight to more recent builds would be worth it. Taking slot numbers into account is a tougher call.

I suggest adding bug 297517 as a dependency (I haven't got permission myself), as I think it is related to this while being less controversially a bug.

Comment 8 Christer Ekholm 2016-12-27 21:23:59 UTC

A relatively simple approach would be to limit the number of old emerges to count for the mean emerge time calculation.

I have made a pull-request for that on https://github.com/gentoo-perl/genlop
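The idea of limiting the history can be sketched as follows. This is not the code from the pull request, only an illustration: the function name and the `limit` parameter are assumptions, and the input is the four lilypond merge times from the description, in seconds.

```python
def recent_mean(times, limit=3):
    """Arithmetic mean over at most the `limit` most recent merge times.

    times must be ordered oldest first; limit is a hypothetical setting.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    recent = times[-limit:]
    if not recent:
        raise ValueError("no merge times available")
    return sum(recent) / len(recent)


# lilypond merge times in seconds, oldest first.
lilypond = [323, 322, 1481, 1516]
print(recent_mean(lilypond, limit=2))  # only the two latest merges count
print(recent_mean(lilypond, limit=4))  # same as the plain mean
```

A small limit makes the estimate follow changes such as new USE flags quickly, at the cost of being more sensitive to a single unusual merge.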