Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 482170

Summary: [Future EAPI] Functions for version comparison and version component expansion
Product: Gentoo Hosted Projects Reporter: Ulrich Müller <ulm>
Component: PMS/EAPIAssignee: PMS/EAPI <pms>
Status: RESOLVED FIXED    
Severity: enhancement CC: esigra, mgorny, pacho
Priority: Highest    
Version: unspecified   
Hardware: All   
OS: Linux   
URL: https://gitweb.gentoo.org/repo/gentoo.git/tree/eclass/eapi7-ver.eclass
Whiteboard: in-eapi-7
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 174380    
Attachments: Sample implementation in bash

Description Ulrich Müller gentoo-dev 2013-08-23 07:11:11 UTC
This would replace version_compare from versionator.eclass.

   version_test [VERSION1] OP VERSION2

In the two argument form, ${PVR} would be compared with VERSION2.
OP would be the binary arithmetic operators from test(1), i.e. -eq, -ne, -lt, -le, -gt, or -ge. (Alternatively, C style ==, !=, <, <=, >, >= could be used, but they suffer from quoting issues.)
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2013-08-23 10:58:06 UTC
I'd go for two-argument only, one-argument is going to be visually confusing. Let's not make PMS any worse that it is now, while there's no real gain from not writing ${PVR}.

As for operators, I think we should stick with the ones used in dep/atom strings for the sake of consistency. We are used to having to quote has_version/best_version arguments, so I guess we'll get used to quote this.

And my +1 in general for the idea. While it's not something used widely, this is a logic that's necessary in the PM anyway, and indirectly exposed partially best_version / has_version.
Comment 2 Ulrich Müller gentoo-dev 2013-08-23 11:23:29 UTC
(In reply to Michał Górny from comment #1)
> I'd go for two-argument only, one-argument is going to be visually
> confusing. Let's not make PMS any worse that it is now, while there's no
> real gain from not writing ${PVR}.

How about putting the operator first then (i.e. Polish notation)?

   version_test OP VERSION1 VERSION2

> As for operators, I think we should stick with the ones used in dep/atom
> strings for the sake of consistency. We are used to having to quote
> has_version/best_version arguments, so I guess we'll get used to quote this.

That would be =, <, <=, >, and >= then. No != operator?

Should ~ be included too? It is asymmetric:
"Equal to the specified version, except the revision part of the matching package may be greater than the revision part of the specified version (-r0 is assumed if no revision is explicitly stated)."
Comment 3 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2013-08-23 11:48:18 UTC
(In reply to Ulrich Müller from comment #2)
> (In reply to Michał Górny from comment #1)
> > I'd go for two-argument only, one-argument is going to be visually
> > confusing. Let's not make PMS any worse that it is now, while there's no
> > real gain from not writing ${PVR}.
> 
> How about putting the operator first then (i.e. Polish notation)?
> 
>    version_test OP VERSION1 VERSION2

No, IMO that's going to be not natural to most of the people.

We should stick with something simple and explicit:

  if version_test $(getversion) '>=' 3.4; then
    ...
  fi

well, when i look at it, -ge would feel fine there too, as bash syntax.

> > As for operators, I think we should stick with the ones used in dep/atom
> > strings for the sake of consistency. We are used to having to quote
> > has_version/best_version arguments, so I guess we'll get used to quote this.
> 
> That would be =, <, <=, >, and >= then. No != operator?

Tricky, but there could !* as in blocker syntax :).

> Should ~ be included too? It is asymmetric:
> "Equal to the specified version, except the revision part of the matching
> package may be greater than the revision part of the specified version (-r0
> is assumed if no revision is explicitly stated)."

Hmm, maybe it should.

  if version_test ${llvm-version} ~ 3.4; then
    ...
  fi

which makes me also think, what about wildcards. Will:

  if version_test ${llvm-version} == '3.4*'; then
    ...
  fi

be allowed?
Comment 4 Ciaran McCreesh 2013-08-23 11:53:23 UTC
All of versionator should be in the PM really.
Comment 5 Ulrich Müller gentoo-dev 2013-08-23 12:52:17 UTC
(In reply to Michał Górny from comment #3)
> well, when i look at it, -ge would feel fine there too, as bash syntax.

That's why it was my first suggestion. ;-)

> which makes me also think, what about wildcards. Will:
> 
>   if version_test ${llvm-version} == '3.4*'; then
>     ...
>   fi
> 
> be allowed?

Please don't, or it will get complicated quickly. Next question will be if we also include slots and USE flags.


(In reply to Ciaran McCreesh from comment #4)
> All of versionator should be in the PM really.

Really, all of its two line bash wrapper functions? Let's see, what can be done better in the package manager:

- get_all_version_components / get_version_components
  (but only one function for it, with a flag like -a or --all)
- version_compare / version_is_at_least

version_sort would also be more efficient if implemented in the PM. However, only two packages (dev-util/nvidia-cuda-toolkit and dev-python/pycuda) are using this function, so presumably it's not worth it.
Comment 6 Ciaran McCreesh 2013-08-23 13:01:01 UTC
> (In reply to Ciaran McCreesh from comment #4)
> > All of versionator should be in the PM really.
> 
> Really, all of its two line bash wrapper functions?

*shrug* Would avoid people having to use an eclass for some but not all of them. But I don't really mind, so long as...

> - get_all_version_components / get_version_components
>   (but only one function for it, with a flag like -a or --all)

...at least that isn't done in bash.

> version_sort would also be more efficient if implemented in the PM. However,
> only two packages (dev-util/nvidia-cuda-toolkit and dev-python/pycuda) are
> using this function, so presumably it's not worth it.

I wrote that for eselect, rather than tree use. Oh well.
Comment 7 Ulrich Müller gentoo-dev 2013-08-23 14:44:53 UTC
(In reply to Ciaran McCreesh from comment #6)
> *shrug* Would avoid people having to use an eclass for some but not all of
> them. But I don't really mind, so long as...
> 
> > - get_all_version_components / get_version_components
> >   (but only one function for it, with a flag like -a or --all)
> 
> ...at least that isn't done in bash.

O.K., so we have:

   version_test VERSION1 OP VERSION2  # OP is -eq, -ne, etc. as in test(1)
   version_components [-a] VERSION

Or is "version_expand" a better name for the second function?
Comment 8 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2013-08-27 20:35:27 UTC
I've grepped the tree for uses of versionator.eclass functions. Here are the stats for *.ebuild (numbers are no. of packages, not ebuilds):

get_version_component_range 447
replace_version_separator 149
replace_all_version_separators 58
get_major_version 52
delete_all_version_separators 31
version_is_at_least 29
delete_version_separator 27
get_version_components 11
get_after_major_version 8
version_format_string 7
get_version_component_count 6
version_compare 4
version_sort 1
get_last_version_component_index 1
get_all_version_components 1



This way obtained using:

$ for a in $(grep '() {' eclass/versionator.eclass  | cut -d'(' -f1); do echo -n "$a "; find -name '*.ebuild' -exec grep $a {} + | cut -d/ -f2-3 | uniq | wc -l; done | sort -k2 -n -r
Comment 9 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2013-08-27 21:20:10 UTC
Ok, so let's state my point here.

First of all, I believe there could be two reasons for including a particular function/feature in PM here. Either:

1. it is used commonly, or

2. it 'strongly' belongs in PMS. For example, it is something that PM does anyway so there's no point in having two implementations around.


The version comparison function fits (2). It's not used that frequently but it's something that PM does all the time. There's no point in keeping two implementations of it in two different places.


Some of the other functions fit (1). Though, I don't feel like adding half a dozen functions is the best thing to do. If possible, I would rather prefer, well, trimming them to one or two functions that can do the most common tasks.

get_version_component_range() seems to be the most commonly used function. It is used both with ${PV} and other versions. That's probably the function we should target first. We can squash at least get_major_version() and get_after_major_version() into it.

Then there's replace_version_separator() (which can be used with pattern or index) and replace_all_version_separators(). We could supposedly squash them into another function. Then, we can also squash delete_all_version_separators() and delete_version_separator() into it.

And I think we shouldn't go further than that. This leaves get_version_components(), version_format_string(), get_version_component_count(), get_last_version_component_index() and get_all_version_components() in the field. But those are used quite rarely, to be honest.


Another thing is that most of versionator.eclass' function names don't make real sense. For example, it's not immediately clear what's the difference between get_version_component_range() and get_version_components(). We should avoid repeating the same confusion within EAPI.
Comment 10 Ulrich Müller gentoo-dev 2013-10-11 10:03:08 UTC
(In reply to Michał Górny from comment #9)
> The version comparison function fits (2). It's not used that frequently but
> it's something that PM does all the time. There's no point in keeping two
> implementations of it in two different places.

Going from there:

1. version_test VERSION1 OP VERSION2
   (as outlined in comment #7)

> Some of the other functions fit (1). Though, I don't feel like adding half a
> dozen functions is the best thing to do. If possible, I would rather prefer,
> well, trimming them to one or two functions that can do the most common
> tasks.

I agree that we shouldn't have ten or more functions.

> get_version_component_range() seems to be the most commonly used function.
> It is used both with ${PV} and other versions. That's probably the function
> we should target first. We can squash at least get_major_version() and
> get_after_major_version() into it.

2. version_part RANGE [VERSION]
   RANGE is either a single number or START-[END], with END defaulting to
   the total number of components.
   VERSION defaults to ${PV}.

> Then there's replace_version_separator() (which can be used with pattern or
> index) and replace_all_version_separators(). We could supposedly squash them
> into another function. Then, we can also squash
> delete_all_version_separators() and delete_version_separator() into it.

3. version_separator_replace RANGE REPLACEMENT [VERSION]
   RANGE is a single number or START-[END]; in the latter case all separators
   from START to END are replaced.

Pattern substitution can be done in bash (like ${PV/./-}) so maybe we need not implement it.
Comment 11 Ulrich Müller gentoo-dev 2015-05-22 10:43:44 UTC
> 2. version_part RANGE [VERSION]
>    RANGE is either a single number or START-[END], with END defaulting to
>    the total number of components.
>    VERSION defaults to ${PV}.

Coming back to this. It seems that PMS wording and versionator.eclass don't agree on what is part of components and what is a separator.

PMS says "followed by zero or more of the suffixes _alpha, _beta, _pre, _rc or _p, which themselves may be suffixed by an optional integer" and "followed by the suffix -r followed immediately by an integer". So, 3.0_p2-r1 would have components "3", "0", "_p2", and "-r1" with separators ".", "", and "".

Whereas versionator.eclass interprets it as having components "3", "0", "p2", and "r1" with separators ".", "_", and "-".

IMHO we should follow versionator.eclass here, which would also agree with ${PR} being "r1" (but _not_ "-r1").

> 3. version_separator_replace RANGE REPLACEMENT [VERSION]
>    RANGE is a single number or START-[END]; in the latter case all separators
>    from START to END are replaced.

I wonder if empty separators should be counted here? For example, "3.0c-r1" would have three separators ".", "", and "-". (This would be different from replace_version_separator() of versionator.eclass which counts only the nonempty ones.)
Comment 12 Ulrich Müller gentoo-dev 2015-10-24 22:49:29 UTC
Recently, it was clarified that a suffix and the integer following it count as separate version components:
https://gitweb.gentoo.org/proj/pms.git/diff/names.tex?id=cb99e4dcb5837626320b1fba0277d0fa7c1c9829

I wonder if the proposed replacements for get_version_component_range() and replace_version_separator() should do any validation of their version argument, i.e. should they check if it is valid version syntax under PMS?

Without such validation, things would boil down to simple pattern matching:
- Strings of digits and strings of letters are version components.
- Anything between such components (possibly including the empty string?) is
  a version separator.

For example, 12.3d_pre5 would have the five components "12", "3", "d", "pre", and "5" with the four separators ".", "", "_" and "".
Comment 13 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2015-11-21 14:01:32 UTC
Well, version_test is quite clearly useful. However, for the other things I think we should do some in-repo research on how things are used and how they could be replaced without making it PITA and thoroughly confusing ;-).
Comment 14 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2015-11-22 07:40:22 UTC
Updated stats:

get_version_component_range 485
replace_version_separator 142
replace_all_version_separators 59
get_major_version 54
version_is_at_least 37
delete_all_version_separators 29
delete_version_separator 17
get_version_components 9
get_version_component_count 9
version_format_string 8
get_after_major_version 5
version_compare 4
get_all_version_components 4
get_last_version_component_index 3
version_sort 2
Comment 15 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2015-11-22 08:21:30 UTC
So, after some brainstorming yesterday, /me and ulm came to the following conclusions (@ulm, please correct me if I recall wrong).

There will be three functions which should cover almost all of versionator.eclass (names subject to change): version_test, version_replace_sep, version_get_comp.


version_test [V1] OP V2
-----------------------

- tests whether V1 is in OP relation to V2, following PMS version comparison algorithm.

- V1 is optional and defaults to ${PVR}.

- OP is one of -gt, -ge, -lt, -le, -eq, -ne, with the 5 first being equivalent to dep spec operators >, >=, <, <=, = respectively and the last one would provide simple negation of =.

- We probably won't provide equivalent for '~' since it's easy enough to provide version without revision (${PV} or ${SOMEVERSION%-r*}).

- Though we probably should provide '*' for -eq & -ne equivalent to the one supported in PM version syntax.

- V1 & V2 must be valid PMS versions.


Version splitting
-----------------

For the two remaining functions, we introduce simplified and more flexible version splitting rules that are intended to work better with non-Gentoo versions (and transforming Gentoo versions into non-Gentoo versions).

A version string is composed of one or more version components, delimited by (possibly empty) version separators. Each component is a sequence made purely of letters [a-zA-Z] or purely of digits [0-9]. Separators are sequences of characters outside [a-zA-Z0-9]. A version string must start and end with a version component.

In other words, version like:

  1.2.3b_alpha4

would be split into:

  1  .  2  .  3  '' b  _  alpha '' 4

  ^  ^  ^  ^  ^  ^  ^  ^  ^     ^  ^
  c1 s1 c2 s2 c3 s3 c4 s4 c5    s5 c6

(c being component indexes, s being separator indexes)


replace_version_sep RANGE REPL [RANGE REPL...] [VERSION]
--------------------------------------------------------

- replaces version separators by index. Replacement by string can be accomplished using regular ${PV/_/-} syntax.

- RANGE = START[-[END]]. One digit means specific index, digit + '-' means specific index and all separators following it, full range means separators from START to END, both START and END included.

- START and END both must not be larger than the last separator index.

- If RANGE includes empty separators, they are replaced as well. IOW, $(replace_version_sep 3 - 1.2.3b) == '1.2.3-b'.

- REPL is a replacement string. It can be any string, and will be repeated for each replaced separator.

- Multiple pairs of RANGE & REPL can be provided for better replacement efficiency. The specified RANGEs must be disjunctive.

- VERSION is optional and defaults to ${PV} (I don't think we want ${PVR} here).


get_version_comp RANGE [VERSION]
--------------------------------

- gets part of version string, containing components from specified RANGE along with their separators.

- RANGE = START[-[END]]. One digit means specific index, digit + '-' means specific index and all components following it, full range means components from START to END, both START and END included.

- START and END both must not be larger than the last component index.

- All separators contained between requested version components are included. Separator preceding and following them are not.


Final notes
-----------

Other versionator.eclass functionalities can be easily obtained by specific use of one or two of the functions:

* get_version_component_range -> version_get_comp

* replace_version_separator -> version_replace_sep or ${PV/...}

* replace_all_version_separators -> version_replace_sep 1-

* get_major_version -> version_get_comp 1

* version_is_at_least -> version_test [...] -ge ...

* delete_all_version_separators -> version_replace_sep 1- ''

* delete_version_separator -> version_replace_sep N ''

* get_version_components -> version_replace_sep 1- ' '

* get_version_component_count -> above into array, ${#comps[@]}

* version_format_string -> some combination of version_replace_sep and version_get_comp.

* get_after_major_version -> version_get_comp 2-

* version_compare -> version_test

* get_all_version_components -> no replacement since it both includes components in separators for no good reason ;-)

* version_sort -> no replacement, that's just a crazy thing
Comment 16 Ulrich Müller gentoo-dev 2015-11-28 14:36:53 UTC
Created attachment 418072 [details]
Sample implementation in bash

A sample implementation is attached, which shows that this can be coded in bash in a reasonably compact form.

Some differences to what we had previously:

(In reply to Michał Górny from comment #15)
> A version string is composed of one or more version components, delimited by
> (possibly empty) version separators. Each component is a sequence made
> purely of letters [a-zA-Z] or purely of digits [0-9]. Separators are
> sequences of characters outside [a-zA-Z0-9]. A version string must start and
> end with a version component.

If it doesn't, then an implicit first (or last) empty component will be added. This is more in line with what cut(1) does. IMHO we should only error out when range syntax is malformed, but allow an arbitrary version string.

> replace_version_sep RANGE REPL [RANGE REPL...] [VERSION]
> --------------------------------------------------------

> - START and END both must not be larger than the last separator index.

Don't error out here, but silently ignore anything beyond the last separator.

> - Multiple pairs of RANGE & REPL can be provided for better replacement
> efficiency. The specified RANGEs must be disjunctive.

Last range wins.

> get_version_comp RANGE [VERSION]
> --------------------------------

> - START and END both must not be larger than the last component index.

Again, silently ignore anything beyond the last separator. This is in line with cut(1) behaviour.
Comment 17 Ulrich Müller gentoo-dev 2016-01-29 16:43:38 UTC
IMHO, all this hinges on the version comparison function being implemented in a way that is easily and efficiently accessible from ebuilds. Using IPC or duplicating the implementation in bash would render this proposal meaningless. 

The version splitting functions will have to be implemented in bash in any case, and IIUC the conclusion was _not_ to use PMS version rules there, because the functions are intended to convert from and to non-Gentoo versions. So also there the argument why this should go into the package manager (as opposed to an eclass) is weakened.

The work already done could be used as the basis of a versionator-r1.eclass, of course.
Comment 18 Ciaran McCreesh 2016-01-29 16:55:13 UTC
> IMHO, all this hinges on the version comparison function being implemented in a > way that is easily and efficiently accessible from ebuilds. Using IPC or
> duplicating the implementation in bash would render this proposal meaningless. 

IPC implemented correctly is plenty fast enough.

> The version splitting functions will have to be implemented in bash in any 
> case, and IIUC the conclusion was _not_ to use PMS version rules there, because 
> the functions are intended to convert from and to non-Gentoo versions.

I suspect this argument doesn't hold up to scrutiny: where do ebuilds take non-Gentoo versions as inputs?
Comment 19 Ulrich Müller gentoo-dev 2016-01-29 17:54:08 UTC
(In reply to Ciaran McCreesh from comment #18)
> IPC implemented correctly is plenty fast enough.

Even in global scope? Several popular eclasses call version_is_at_least in global scope.

> I suspect this argument doesn't hold up to scrutiny: where do ebuilds take
> non-Gentoo versions as inputs?

There are examples in the tree, e.g. ebuilds doing pattern substitution in PV and using the result as argument for get_version_component_range.
Comment 20 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-09-27 07:27:35 UTC
The current reference implementation is provided for testing purposes as eapi7-ver.eclass in ::gentoo.
Comment 21 Larry the Git Cow gentoo-dev 2018-03-11 11:44:11 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=08d405f72e85125b64f5074e49714e759d85eaf6

commit 08d405f72e85125b64f5074e49714e759d85eaf6
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2018-02-19 15:38:28 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2018-03-11 11:43:41 +0000

    Add EAPI 7 version comparison & manipulation functions
    
    The function code is copied from eapi7-ver.eclass which has been written
    by Ulrich Müller and me.
    
    Bug: https://bugs.gentoo.org/482170

 bin/eapi.sh                                   |   4 +
 bin/eapi7-ver-funcs.sh                        | 191 ++++++++++++++++++++
 bin/isolated-functions.sh                     |   4 +
 pym/portage/tests/bin/test_eapi7_ver_funcs.py | 240 ++++++++++++++++++++++++++
 4 files changed, 439 insertions(+)}
Comment 22 Larry the Git Cow gentoo-dev 2018-04-30 22:14:34 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/pms.git/commit/?id=c9c5a2f0cf7671d73a78a544d28c361ecd5a2b3c

commit c9c5a2f0cf7671d73a78a544d28c361ecd5a2b3c
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-09-27 21:23:22 +0000
Commit:     Ulrich Müller <ulm@gentoo.org>
CommitDate: 2018-04-05 16:48:17 +0000

    EAPI 7 has version manipulation and comparison functions
    
    Bug: https://bugs.gentoo.org/482170

 eapi-differences.tex |  4 +++
 pkg-mgr-commands.tex | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+)}