Summary: | Missing formal specification of ebuild naming policy | ||
---|---|---|---|
Product: | Documentation | Reporter: | Walter <walter> |
Component: | [OLD] Portage Documentation | Assignee: | Package Manager Specification <pms> |
Status: | RESOLVED INVALID | ||
Severity: | normal | ||
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Description
Walter
2013-02-12 04:30:58 UTC
Everything should be in app-doc/pms. Is this bug about the formal spec (PMS), or about policy (devmanual)? The latter imposes additional restrictions, like "uppercase characters are strongly discouraged" or "no integer part of the version may be longer than 18 digits".

PMS: http://dev.gentoo.org/~ulm/pms/5/pms.html#x1-160003
devmanual: http://devmanual.gentoo.org/ebuild-writing/file-format/index.html

As a general long-term Gentoo user, I wasn't even aware the app-doc/pms project existed. Maybe someone should look at putting that in the aforementioned handbook link? If the two differ, then that's ... well, it's probably best viewed as another separate bug, unless it's documented and explained clearly.

I seek only a formal spec, where I can go "is this a valid X?" (where X is package atom, version string, etc.) and system goes yay or nay. That is all.

Right now, it sounds like there are at least two standards (and the book version suggests some packages in portage actually don't conform, so make that three or more standards). Apparently, none of these has a formal, machine-parseable spec I can point at and go "yes, that's the truth version X, to which things should conform". That's all I need. This bug is about the fact that bit's missing, not the other problems.

Related food for thought (just to give you some idea of the approach): I just did some testing with various variants of the version string and found that SRC_URI interpretation requirements suddenly differ from the package version string when the '-r0' suffix is used (i.e. I can name my tarball with the same version, but not if I am using -r#, and maybe, untested as yet, _pre). This sort of thing should ideally be defined explicitly, formally and machine-readably somewhere, not just in text.

It seems formal enough. If you are left with [a-zA-Z0-9\-]+/[a-zA-Z0-9\-]+ after checking with that sed script, then the stripped version could be valid.
This is at least enough to check that there is a folder (and also the .ebuild version), e.g.:

    [[ -d /usr/portage/`stripVersion dev-php/PEAR-MDB2_Driver_mysql-1.5.0_beta3` ]] && echo could be still valid

With some effort, I can probably think of a couple of examples where this would shave off parts of the package name, but last I checked (whole tree), there were no such packages.

So perhaps what you're missing is a clear, syntactically unambiguous representation of the relatively rigorous naming specification (convention)? Here is how I interpret the naming convention:

    # Naming policy
    # http://www.gentoo.org/proj/en/devrel/handbook/handbook.xml?part=2&chap=1
    #
    # atom. pkg ver _suf
    #
    # pkg. the package name, which should only contain lowercase letters,
    # the digits 0-9, and any number of single hyphen (-), underscore (_)
    # or plus (+) characters.
    # ver. The version is normally made up of two or three (or more) numbers
    # separated by periods, such as 1.2 or 4.5.2, and may have a single letter
    # immediately following the last digit; e.g., 1.4b or 2.6h.
    # The package version is joined to the package name with a hyphen
    # _suf{#}. #.#.# < _alpha < _beta < _pre < _rc < (no suffix) < _p.
    function stripVersion {
        # Strip suffix, revision, suffix again, version, then any leading operator.
        package=`echo "$1" | sed -E '
            s/_(alpha|beta|pre|rc|p)?([0-9]*)?$//;
            s/-r([0-9]*)?$//;
            s/_(alpha|beta|pre|rc|p)?([0-9]*)?$//;
            s/-[0-9.]*([a-z]*)?$//;
            s/^(=)?([<>])?(=)?//
        '`
        echo "$package"
    }
    # Accept either a file of atoms (one per line) or a single atom.
    if [ -f "$1" ] ; then
        while read atom; do
            stripVersion "$atom"
        done < "$1"
    else
        stripVersion "$1"
    fi

Your interpretation seems great but it's still just that - an interpretation - of one of three sources of truth identified (PMS docs, portage codebase, actual tree). Without a version number. It's close though. What would be really ideal would be taking the package name format interpretation you have made, converting it so that it was defined declaratively (e.g. using IETF standard ABNF from https://tools.ietf.org/html/rfc5234) and publishing the result as the standard.
That way its interpretation is fixed and it can be viably used for code generation in arbitrary languages. In addition, versioning would make changes clear and insta-documentable (and on that note: for the love of god, please use GitHub instead of some weird private gentoo devs only repo. Right now the friction involved in contributing changes to some areas of Gentoo is ridiculous. It would be a great chance to set a good example with this standards effort!)

Thanks for your further consideration of this real issue. It's areas like this that can be very hard to address with open source efforts... but total domination is within reach!

The formal spec is here: http://dev.gentoo.org/~ulm/pms/head/pms.html#x1-160003
Nothing to fix here, therefore closing.

That page is neither machine parseable nor concise. It describes simultaneously both algorithms (in multiple) and the identifier itself. It is very measurably not a good, versioned, machine-useful specification for the identifier. But if you choose to ignore its flaws, ignore its flaws. Sometimes this community is so hard to contribute to.

Restatement of problem: "I seek only a formal spec, where I can go "is this a valid X?" (where X is package atom, version string, etc.) and system goes yay or nay. That is all." Problem NOT solved, problem VALID.

This boils down to the general question of whether PMS should use EBNF when describing syntax, which needs to be discussed on mailing lists, not in a bug report. But in fact, it _was_ discussed on gentoo-dev several times, as early as 2008.

It boils down to not having a formal (== rigidly defined, machine parseable) spec for the most basic element of the system even after over a decade of development. As a normal Gentoo user I had no idea about PMS. I do not think this bug should be considered PMS-centric. However, PMS could be the right way to solve it.
Unfortunately, it looks like you are saying PMS has chosen a non machine parseable solution written in elevated technical English by a series of Germans. Honestly, if I was a second language speaker I'd just give up. As it happens I'm part German so I can poke fun all I like :)

Really, whatever is going on right now with whatever Gentoo projects, and whatever people feel about the matter having been touched upon in the past, this bug is valid and should be resolved. It should remain open until it is resolved. It is not resolved.

There's a reason you don't find many machine-parseable specs out there...

I will have a crack at converting the interpretation here to ABNF and post the results.

I have invested some time and implemented an ABNF spec for these items. Right now there is a bug where capitalization is screwy because I haven't bothered to go through and change every string literal like "A" to %xx ... but you get the gist.

https://github.com/globalcitizen/gentoo-pms-abnf

It would be nice if this was picked up and used for testing and formal specification.
Sample output:

[repo-name]
'0_Xywgh_V6Zve_Kx-F_f_W9Piv3COJ_tpgf_-U-_Adu__5k2cno-__-a__7__-sw4-IGI-8_-_j-P1H-h_67_6Ta------' '1iZ__piy' '7-U_Ioml9-----''_R9juGk5_1_GJ780F6J_L' '5U_KW_tSPyHi_Q' 'c_Z2xoi' 'KS_4k_Gg3O_I__f6lz' 'a-------------' 'b' 'g' '5cQ-j-_-_A37_92_CQ_dW_1__rP_0__6VK_8_4-Am_-dH36-5_-_9__-av-R_JN-_-_2_-my-u_h-----------' 'o-_-------' 'v-7_pQ4x_v_o_Q_2s_-rp93_TqjlH1-u86j-_Jw5-Z-V-eP0X-_-_-__-7_SYc4-_o-Kn_ya1-_-_Gd-__m-_-9e_R-_-__3-_B_85---------' 'M--------' 'U_c6GcA4q1aZ_pd0--------' '_gd823H6A9_qg__OW--------------------' '0-ASb_CUWu7_BHEV_z-8f-EK-_n_y-__-M-O-_Y-guN-mJ-bL-_-_sI3-M-15KG6_k-U-aBF-4-ZX-' 'fb4_-kHe1_H__tG0569_8U-' '0' 'vV_2__-t_X6cRC_4E1x_7_u_3t-9-v-__-j-8-wq-uNs-__50-0-Y-cr-_op-_UG-n-E-x_I-U7_-Y_-_3-Q_0-' '_' 'B-P--------------'

[version]
'4.5213078_BETA6' '78352461905' '7a_RC' '750931842695931_beta6' '2_alpha' '4.9278350613965.0J' '01394275.86088784.3.751.043.4.7.0.5.15.47.95m' '138954267061395348I' '256390478180932078.5.70.678.66.6.57.2502.89.19138326.20.8.12.4.998.024757A_alpha' '3.750269814709922216.0.23.88.7.0.7.5770.27.99.9.10.5g' '91862734057.532931339.9.56057.42.1.39.23.2.37.5.3.9359.751.089493' '09258317644.1.49608709589952725.8063.78.5639.2.01.302.5.4.60.5.232224.0.158334.98.59.08.5' '8.1.03624.5.978.3.73526.1.663E_pre' '35721604897738.4.668328019.07.1.148.88.05.821.99.1.7170.384.2_beta7-r3' '2.49751.8.630.9X_alpha' '321540968732219878.6679720715.9.5.9109S''0U' '2.19307.6.8.543528.943.1.5.124.5.74.99.5.05.679.67.72.47.4D_BETA' '210637549862619203881.3.5135198750.9.9.60764.27.767124.67.01.7.0.2.2.4.02.8.40.387.754c' '294160583738309782164.59185721.4.98.8.2.1.86254.518.2.124.905.843_beta0-r3' '23869574013.0.009.835.2.5.9173.8.6.2638.4_beta' '7.5964018322.5.2.12.2648.12890572.27.89616.7.7.94.3.6.1.0.36.7.53991.0391.366C_BETA'

Same link now includes code to generate regex from the ABNF.
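For illustration only, here is a minimal sketch of the "is this a valid X?" check requested in this report, applied to version strings. It assumes the convention roughly as interpreted in this thread: dot-separated numbers, an optional single lowercase letter, zero or more _alpha/_beta/_pre/_rc/_p suffixes each with an optional number, and an optional -rN revision. The pattern and the names VERSION_RE and is_valid_version are assumptions made for this example; they are not taken from PMS, Portage, or the gentoo-pms-abnf repository.

```shell
#!/bin/sh
# Hypothetical pattern for a version string; an assumption based on the
# convention discussed in this report, not an official grammar.
VERSION_RE='^[0-9]+(\.[0-9]+)*[a-z]?(_(alpha|beta|pre|rc|p)[0-9]*)*(-r[0-9]+)?$'

# Hypothetical helper: prints "yay" if the argument matches VERSION_RE,
# otherwise "nay". LC_ALL=C keeps [a-z] from matching uppercase letters
# in locales with unusual collation.
is_valid_version() {
    if printf '%s\n' "$1" | LC_ALL=C grep -Eq "$VERSION_RE"; then
        echo yay
    else
        echo nay
    fi
}

# Example calls using strings that appear in this report.
is_valid_version '1.5.0_beta3'
is_valid_version '4.5213078_BETA6'
```

Under this assumed pattern, generated samples with uppercase suffixes such as '4.5213078_BETA6' or '7a_RC' are rejected, which seems consistent with the capitalization bug mentioned above, while '2_alpha' is accepted.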