Two things really: First, I'd like to get the QA_TEXTRELS filtering into the 2.0 stable branch since people have started RESTRICTing "stricter", which we'd like to discourage. Second, the QA_TEXTRELS stuff in the 2.1 branch currently fails if the QA_TEXTRELS string is written on more than one line (gawk bails). To be attached - patches against 2.0.54-r1 and 2.1_pre9-r5.
Created attachment 85819 [details, diff] Replace use of gawk with bash for loops Use of bash 'for' loops rather than gawk to process QA_TEXTRELS against the scanelf output allows QA_TEXTRELS to be specified on more than one line.
Created attachment 85820 [details, diff] Backport to 2.0.54-r1 Exact same code, but in ebuild.sh rather than misc-functions.sh.
It's in SVN both branches right now. thanks for the updates.
i dont like this patch at all ... issues: - you dont support regexes - your version makes the output ugly and really unreadable if your only gripe with the current version is that the QA vars cant be declared on one line, then i can fix that also, if you're going to change/backport QA_TEXTRELS to stable, you should have also done QA_EXECSTACKS since the code is pretty much exactly the same
(In reply to comment #4) > i dont like this patch at all ... issues: > - you dont support regexes > - your version makes the output ugly and really unreadable ha. I and I just had copied the portage-patches-2.0.54-2 to the distfiles-local and was about 2mins away from commiting. I logged in and deleted the files before it had a chance to be mirrored I think. > then i can fix that .. > QA_EXECSTACKS since the code is pretty much exactly the same SpanKY: Can you fix that like soonish?
(In reply to comment #4) > i dont like this patch at all ... issues: > - you dont support regexes I hadn't noticed regex support was there, as it hadn't occurred to me it would be necessary. It's easy enough though - replace: [[ ${t} == ${s} ]] && continue 2 with [[ ${s} =~ ${t} ]] && continue 2 > - your version makes the output ugly and really unreadable I think that's putting it a little strongly, but writing: for t in ${f[@]}; do echo "TEXTREL ${t}" done is straightforward. > if your only gripe with the current version is that the QA vars cant be > declared on one line, then i can fix that Whatever happens, I think that needs to be fixed. It's normal for portage vars to support multiple lines, and if such support is missing it gets ugly trying to maintain sensible line lengths if there are several entries in the variable. > also, if you're going to change/backport QA_TEXTRELS to stable, you should have > also done QA_EXECSTACKS since the code is pretty much exactly the same Fairy snuff. Let me know if you want me to provide updated patches; otherwise I'll assume you're going to do it your way. Either is fine by me, as long as it gets into stable and supports multi-line values.
Created attachment 85978 [details, diff] Reworked patch for 2.1 supporting regexes, old style output and QA_EXECSTACK Now supports regexes, multi-line values, displays the output as provided by scanelf, includes same for QA_EXECSTACK.
Created attachment 85979 [details, diff] Reworked backport for 2.0 in line with patch #85978
SpanKY.. Waiting on you now.. please review the follow patches
the array stuff is unnecessary, just use '#t%p' for the scanelf format you can use '<<-EOF' and then indent the code with tabs so you dont break formatting as for the while/read/EOF stuff, i dislike that ;)
(In reply to comment #10) > the array stuff is unnecessary, just use '#t%p' for the scanelf format ok; if you don't want the previous scanelf format retained, it's a lot easier. > you can use '<<-EOF' and then indent the code with tabs so you dont break > formatting useful to know, thanks. > as for the while/read/EOF stuff, i dislike that ;) heh - thought you wouldn't, you gawk fiend ;) That was the only way I could retain the output from scanelf unchanged; doing: for s in $(scanelf -qyRF '"%t %p"' ${D}); do... or s=( $(scanelf -qyRF '"%t %p"' ${D}) ) splits '"TEXTREL fred"' into two tokens, '"TEXTREL' and 'fred"', which is not what I had expected; I had hoped it would do the same as if the output of scanelf was pasted in place of the command, i.e. tokenising as "TEXTREL fred", "TEXTREL jim" etc. If you're happy with just a list of files rather than the "%t %p" and "%e %p" formats, the while loop can be dropped for a simpler for loop.
Created attachment 86077 [details, diff] rework 2.1 - regex, vertical list output, filenames with spaces Here we go. No arrays, no while read loops :) Found a way to get the shell to split the output of scanelf '"#t%p"' to support filenames with spaces, using eval to expand the variable before the for loop is parsed. Note; to solve the newline problem (which is where I started), it's enough to do: QA_TEXTRELS=$(echo ${QA_TEXTRELS}) prior to the call to gawk; so if you don't care about filenames with spaces you could just do that. I couldn't see an easy way to get gawk to parse names with spaces as it doesn't treat quotes in any special way.
Created attachment 86079 [details, diff] rework 2.0 - regex. vertical list output, filenames with spaces
2.0.54 (svn revision 3314) updated with "rework 2.0 - regex." attachment #86079 [details, diff]
Kevin, can you please svn up and repost the 2.1 patch.
Created attachment 86099 [details, diff] 2.1 patch against svn 3316 Added vprintf() to isolated-functions.sh, to provide support for printing strings containing formatting, in the style of the vecho(). I've tested vprintf() on its own, but haven't tested svn portage as such. Rest of the patch tested as part of 2.1-pre9-r5, but it's self-contained so should be ok.
yeah this just keeps getting worse what about using original gawk script but with this change: - BEGIN { split("'"${QA_TEXTRELS}"'", ignore); } + BEGIN { split("'$(set -o noglob;echo ${QA_TEXTRELS})'", ignore); }
actually, that's hackish ... what about this: -gawk ' - BEGIN { split("'"${QA_TEXTRELS}"'", ignore); } +gawk -v QA_TEXTRELS="${QA_TEXTRELS}" ' + BEGIN { split(QA_TEXTRELS, ignore); }
(In reply to comment #17) > yeah this just keeps getting worse If you say that sort of thing, you should say _why_ you think it's worse. The reasons why I think your gawk solution is worse: 1) Doesn't deal with filenames containing spaces 2) Involves a second language (gawk) when it does not provide anything that can't be done easily and effectively in the first (bash) From where I'm standing, my code is simple and clear enough - with comments explaining the two bits whose presence isn't self-documenting (the echo to remove newlines, and the eval to get bash to process quoted tokens in the variable correctly). I'll update it with the 'set -o noglob' to avoid expansion in the echo. Further, using bash instead of gawk means the reader only has to know bash, not gawk. In my opinion, using languages other than bash in shell scripts should be restricted to cases where the other language provides a clear benefit. There's no benefit to using gawk here. As I said in comment #12, if you don't care about spaces in filenames then the echo to remove the newlines is enough. However if you want to support spaces in filenames you need more.
Created attachment 86124 [details, diff] 2.1 patch against svn 3316, with noglob fix.
Comment on attachment 86124 [details, diff] 2.1 patch against svn 3316, with noglob fix. Actually the noglob isn't necessary on the scanelf results; reverting.
> 1) Doesn't deal with filenames containing spaces and ? neither does the proposed bash version, and really it isnt doable unless we specify an unintuitive separator for use in the QA variables ... then again, the gawk would be able to handle the separator just fine with '-F' whereas a bash solution would involve saving/restoring IFS > 2) Involves a second language (gawk) when it does not provide anything that > can't be done easily and effectively in the first (bash) we disagree on the "easily and effectively" part the current bash code is more complicated than needs be and imo the gawk is simpler ... plus the bash code needs to make sure the stuff in the QA var is not shell expanded whereas no such concern exists with the gawk version > Actually the noglob isn't necessary on the scanelf results; reverting. noglob is needed, otherwise you can get wrong results
(In reply to comment #22) > > 1) Doesn't deal with filenames containing spaces > > and ? neither does the proposed bash version, Yes it does. What did you try? It works for me - QA_TEXTRELS='fred "jim bob"' or QA_TEXTRELS="fred \"jim bob\"" works fine. > and really it isnt doable unless > we specify an unintuitive separator for use in the QA variables ... then again, > the gawk would be able to handle the separator just fine with '-F' whereas a > bash solution would involve saving/restoring IFS As you can see above, none of that is necessary. Simply double-quote the filename that contains spaces - as you have to do when using such filenames anyway. You can't parse that easily in gawk; it'll split on the separator but won't respect the quotes, so you'd have to change the separator as you say - ':' would be a sensible choice if you go that route, as used in path variables (and what we did for the GCC_SPECS variable). Where bash is concerned, it doesn't respect the quotes if you write: for file in ${QA_TEXTRELS}; do... but it does if you do: eval "for file in ${QA_TEXTRELS}; do..." which is why I used the eval. > > 2) Involves a second language (gawk) when it does not provide anything that > > can't be done easily and effectively in the first (bash) > > we disagree on the "easily and effectively" part Obviously. "effectively" was a reference to spaces in filenames. "easily" is more subjective, but I certainly found it easy enough to do it in bash. > the current bash code is more complicated than needs be and imo the gawk is > simpler ... As far as complexity goes, there's nothing in it. Whether one is clearer than another is somewhat subjective, but I still think it's less effort to grok the double for loop than it is to grok the pipe and gawk for loop. > plus the bash code needs to make sure the stuff in the QA var is > not shell expanded whereas no such concern exists with the gawk version > > > Actually the noglob isn't necessary on the scanelf results; reverting. > > noglob is needed, otherwise you can get wrong results scanelf reports a list of files, not a pattern match, so noglob is unnecessary on the scanelf output, which is what I did before. For some inexplicable reason I thought the eval wouldn't need it either, but of course it does. I'll update my patch and attach it.
Created attachment 86246 [details, diff] 2.1 against svn 3316 - regex fix (noglob)
Created attachment 86247 [details, diff] 2.0 - noglob fix against svn 3323 Not sure this is a real problem anyway - without this change, when QA_TEXTRELS is expanded by the shell the current working directory is ${D} anyway. So the only difference is that the inner loop has more iterations. Incidentally, this means my version could be simplified by removing the subshell, the "set -o noglob" and reverting the comparison to just [[ \${s} == \${t} ]]
Created attachment 86248 [details, diff] 2.1 simplified patch against svn 3323
Created attachment 86249 [details, diff] 2.0 simplified patch against svn 3323
So now, our discussion centers around which is better: s=$(scanelf -qyRF '"#t%p"' "${D}" | grep -v 'usr/lib/debug/') s=$(echo ${s}) # strip newlines f=$(eval " for s in ${s}; do for t in ${QA_TEXTRELS}; do [[ \${s} == \${t} ]] && continue 2 done printf \"\${s}\n\" done") with the double-quote method of coping with spaces, or f=$(scanelf -qyRF '%t %p' "${D}" | grep -v ' usr/lib/debug/' | \ gawk -v QA_TEXTRELS="${QA_TEXTRELS}" ' BEGIN { split("'"${QA_TEXTRELS}"'", ignore, ":"); } { for (idx in ignore) if ($NF ~ "^ *"ignore[idx]"$") next; print; }') with the ':'-separator method (doesn't cope with filenames that start with spaces, but although that's valid it's pathological).
you didnt include my changes ... revised gawk would be: f=$(scanelf -qyRF '#t%p' "${D}" | grep -v ' usr/lib/debug/' | \ gawk -v QA_TEXTRELS="${QA_TEXTRELS}" ' BEGIN { split(QA_TEXTRELS, ignore, ":"); } { for (idx in ignore) if ($0 ~ "^"ignore[idx]"$") next; print "TEXTREL " $0; }') once you start getting into eval and embedded quotes, things get to be a pita to maintain ... there is no question of what does and doesnt need to be escaped when using gawk
(In reply to comment #29) > you didnt include my changes ... revised gawk would be: I missed a couple of the changes; sorry about that. I tried the gawk version, but it still doesn't work. Firstly: gawk -v QA_TEXTRELS="${QA_TEXTRELS}" '... doesn't remove newlines so the matches fail (although it does stop gawk bailing), and if it's replaced with: gawk -v QA_TEXTRELS="$(set -o noglob; echo ${QA_TEXTRELS})" '... the newlines become spaces so if ($0 ~ "^"ignore[idx]"$") doesn't match because $0 has a leading space (my earlier suggestion of if ($0 ~ "^ *"ignore[idx]"$") is garbage as well, fwiw). I can get it to work by doing: q=$(printf "${QA_TEXTRELS} | sed -e 's:^[ \t]*::') q=$(printf "${q// }") f=$(scanelf -qyRF '#t%p' "${D}" | grep -v ' usr/lib/debug/' | \ gawk -v QA_TEXTRELS="${q}" ' BEGIN { split(QA_TEXTRELS, ignore, ":"); } { for (idx in ignore) if ($0 ~ "^"ignore[idx]"$") next; print "TEXTREL " $0; }') to allow the definition of QA_TEXTRELS to be: QA_TEXTRELS="usr/bin/fred*: usr/bin/bob: usr/bin/j i m: usr/lib/harry.so" although people are bound to forget the ':'s. My version accepts: QA_TEXTRELS='usr/bin/fred* usr/bin/bob "usr/bin/j i m" usr/bin/harry.so' or QA_TEXTRELS="usr/bin/fred* usr/bin/bob \"usr/bin/j i m\" usr/bin/harry.so" which I think is more natural for portage variables. > once you start getting into eval and embedded quotes, things get to be a pita > to maintain ... there is no question of what does and doesnt need to be > escaped when using gawk I still feel that when you compare the two methods, my code is clearer (and hence easier to maintain), and as for embedded quotes in QA_TEXTRELS, that seems perfectly natural to me. However, you do whichever way you think is best - you have authority legitimately derived from the volume of work you do. I really only care that the variable definition should support a sensible way of avoiding overly-long lines. If you go the ':'-separator route, it might be reasonable to forbid extra whitespace for formatting (including newlines), and require the ebuild author to write things like: QA_TEXTRELS="usr/bin/fred*:usr/bin/bob:usr/bin/j i m" QA_TEXTRELS="${QA_TEXTRELS}:usr/bin/harry.so" if they wish to split stuff over more than one line - the reason being that the no-extra-whitespace rule applies to other ':'-separated variables like PATH (and GCC_SPECS, for example), so it would be consistent. Then f=$(scanelf -qyRF '#t%p' "${D}" | grep -v ' usr/lib/debug/' | \ gawk -v QA_TEXTRELS="${QA_TEXTRELS}" ' BEGIN { split(QA_TEXTRELS, ignore, ":"); } { for (idx in ignore) if ($0 ~ "^"ignore[idx]"$") next; print "TEXTREL " $0; }') is sufficient. We'll have to bump the 2.0.54 series again to do that, of course.
this seems like it's becoming over complex. Spanky/Kevin: How about we shift the logic into scanelf itself and have it ignore internally using fnmatch() ?
Certainly could do. We would still need to decide how to pass the relevant information from the ebuild to the invocation of scanelf, of course. Other tools do this sort of thing via options; the tar '--exclude PATTERN' option seems sensible, which can be specified multiple times. The '--exclude-from FILE' option might be worth considering; it has the advantage that it doesn't have to worry about separators or spaces, although the disadvantage is that it necessitates creating more files (e.g. "${FILESDIR}/${PF}-textrels.qa"). BTW after thinking about c#30 some more, I'm more convinced that using the PATH-style format for the environment variable makes most sense in that approach. It's de-facto standard for specifying a list of filesystem entities, and whatever can't be supported in that style will fail other stuff as well anyway. The colon-separated environment variable could be used equally well by scanelf, if you want to go that way rather than adding yet another option. Actually, a bit of both would be good, in that invocation would become: scanelf -qyRF "%t %p" "${D}" --exclude="${QA_TEXTRELS}" (i.e. such that scanelf wouldn't read the environment variable itself)
(In reply to comment #32) > Certainly could do. We would still need to decide how to pass the relevant > information from the ebuild to the invocation of scanelf, of course. Heh.. To keep it real simple I was thinking the env variable QA_TEXTRELS :)
Pseudo Example: Index: scanelf.c =================================================================== RCS file: /var/cvsroot/gentoo-projects/pax-utils/scanelf.c,v retrieving revision 1.141 diff -u -b -B -w -p -r1.141 scanelf.c --- scanelf.c 23 Apr 2006 15:24:38 -0000 1.141 +++ scanelf.c 7 May 2006 14:49:52 -0000 @@ -276,8 +276,18 @@ static const char *scanelf_file_textrel( static const char *ret = "TEXTREL"; unsigned long i; + char *env = getenv("QA_TEXTRELS"); + if (!show_textrel && !show_textrels) return NULL; + if (env != NULL) { + char e[__PAX_UTILS_PATH_MAX] = ""; + FOREACH(e, env) { + if (fnmatch(e, elf->filename, 0) == 0) + return NULL; + } + } + if (elf->phdr) { #define SHOW_TEXTREL(B) \ if (elf->elf_class == ELFCLASS ## B) { \
> I tried the gawk version, but it still doesn't work. that's because you're accepting garbage from the QA_TEXTREL input variable it's either whitespace delimited or : delimited, mixing and matching just complicates things and going with a : delimited variable doesnt add enough benefit to make it worth it > I still feel that when you compare the two methods, my code is clearer as i said, using eval and trying to escape everything properly just makes things a pita, no two ways about it
(In reply to comment #35) > > I tried the gawk version, but it still doesn't work. > > that's because you're accepting garbage from the QA_TEXTREL input variable > > it's either whitespace delimited or : delimited, mixing and matching just > complicates things > > and going with a : delimited variable doesnt add enough benefit to make it > worth it So if we stick with whitespace delimited, how is your gawk going to work? As it stands it only supports single-line definitions, as I explained in my first point on c#30. > > I still feel that when you compare the two methods, my code is clearer > > as i said, using eval and trying to escape everything properly just makes > things a pita, no two ways about it Using arrays is the natural way of expressing sets of strings in bash: QA_TEXTRELS=( "usr/bin/fred*" usr/bin/bob "usr/bin/j i m" usr/lib/harry.so ) for q in "${QA_TEXTELS[@]}"; do ... so how about that?
Created attachment 86470 [details, diff] scanelf modification to do the qa stuff (re c#34) Splits by ":" - if you want whitespace separator, just replace ":" with " \t\n" in the calls to strtok_r (obviously won't support spaces in filenames, fwiw). Prepends search_path when doing fnmatch (file_matches_list()), so that the match uses the same text that scanelf reports. Tweaked scanelf_dir to stop it generating '//' in the middle of paths when concatenating the path and entry. Maybe file_matches_list() should check whether the filename starts with '/' before prepending search_path, but I'm not sure. Does the execstack filtering only on the PT_GNU_STACK check - not sure if that's sensible (depends what QA_EXECSTACK means), but it shows the filtering can be more specific if we want. The reason for moving the utility functions was to allow the inline to work; I got warnings that it couldn't be inlined because the body wasn't available (might have been with -ggdb3 only, though).
Kevin,SpanKY: Is whitespace in the QA_TEXTREL expression so vitual? I'd really rather use the expression string of QA_TEXTREL="usr/lib/*.so next_item last_item" to follow suit with the existing INSTALL_MASK and STRIP_MASK's. So can we replace the ":" with " "?
Created attachment 86474 [details, diff] scanelf patch, whitespace separator (In reply to comment #38) > So can we replace the ":" with " "? With " \t\n", yes (so people can format stuff nicely in ebuilds). Since those _MASK variables don't cater for spaces in filenames, there's no point worrying about them here. Need to twiddle things a bit; revised patch attached (I did think about switching from strtok_r to strsep as that's used elsewhere, but strsep returns empty strings for every pair of separator characters, which is not what we want).
(In reply to comment #39) > Created an attachment (id=86474) [edit] > scanelf patch, whitespace separator I like this patch. Feel free to commit it to pax-utils cvs Kevin.
(In reply to comment #40) > I like this patch. Feel free to commit it to pax-utils cvs Kevin. Done.
(In reply to comment #41) > (In reply to comment #40) > > I like this patch. Feel free to commit it to pax-utils cvs Kevin. > > Done. Great. Now feel free to attach patches here to ebuild.sh that can drop all the gawk, eval etc. I'm guessing you/we will need to export QA_FOO if set.
Created attachment 86667 [details, diff] Remove processing from misc-functions.sh that's now done in scanelf (2.1) (In reply to comment #42) > Now feel free to attach patches here to ebuild.sh that can drop all the > gawk, eval etc. I'm guessing you/we will need to export QA_FOO if set. Here we are :) export is indeed necessary - it's unconditional since I don't think it matters if it's set or not prior to the export. QA_*_${ARCH} => QA_*, and QA_STRICT_* handling is left to the scripts.
Created attachment 86668 [details, diff] Remove processing from ebuild.sh that's now done in scanelf (2.0)
One more issue arose; the -e option rejects both W|X PT_GUN_STACK headers and W|X PT_LOAD headers. The patches above only deal with the first. There are two ways to deal with this; either: 1) Widen QA_EXECSTACK back to what it was before (i.e. skip the whole -e check) 2) Add QA_WX_LOAD to filter the W|X PT_LOAD headers To follow, patches for both possibilities.
Created attachment 86696 [details, diff] Adds QA_WX_LOAD to scanelf for the PT_LOAD W|X check
Created attachment 86697 [details, diff] Alternative: make QA_EXECSTACK filter everything in scanelf -e
Created attachment 86698 [details, diff] Remove processing from ebuild.sh that's now done in scanelf (2.0), including QA_WX_LOAD support
Created attachment 86699 [details, diff] Remove processing from misc-functions.sh that's now done in scanelf (2.1), including QA_WX_LOAD support
(In reply to comment #49) > Created an attachment (id=86699) [edit] > Remove processing from misc-functions.sh that's now done in scanelf (2.1), > including QA_WX_LOAD support Kevin, The line that reads this does not match up with what you added to scanelf. + export QA_EXECSTACK QA_WXLOAD scanelf.c: qa_wx_load = get_split_env("QA_WX_LOAD"); scanelf.c: "\t- 1 per QA_TEXTRELS/QA_EXECSTACK/QA_WX_LOAD");
Created attachment 87360 [details, diff] Remove processing from ebuild.sh that's now done in scanelf (2.0), including QA_WX_LOAD support Sorry about that - also the two previous lines had WX_LOAD instead of QA_WX_LOAD.
Created attachment 87361 [details, diff] Remove processing from misc-functions.sh that's now done in scanelf (2.1), including QA_WX_LOAD support Same for 2.1
(In reply to comment #52) > Created an attachment (id=87361) [edit] > Remove processing from misc-functions.sh that's now done in scanelf (2.1), > including QA_WX_LOAD support commited to misc-functions.sh (hopefully last commit before 2.1 final) With regards to 2.0.x we might as well leave it alone now that 2.1 in is the rc phases
Kevin, Care to provide a man entry for expected usage? Maybe set some ground rules on when it should and should not be used (to avoid abuse by lazy devs)
Created attachment 87405 [details, diff] update man pages (and a little fix :/ ) How's this? Additions to make.conf(5) and ebuild(5). Adds documentation of the 'stricter' FEATURE. Also includes a little fix QA_STRICT_EXECSTACK => QA_STRICT_WX_LOAD, that I noticed while writing the docs.
It looks like solar committed this in svn r3409 and r3410.
This has been released in 2.1_pre2-r3.
I don't understand... With portage 2.1_rc4 and pax-utils 0.1.13, my QA_TEXTRELS statements seem to get ignored. I made an example ebuild, which only needs fglrx_dri.so (taken from ati-drivers) copied into files/: ============================= # Copyright 1999-2006 Gentoo Foundation # Distributed under the terms of the GNU General Public License v2 # $Header: $ DESCRIPTION="" HOMEPAGE="" SRC_URI="" LICENSE="" SLOT="0" KEYWORDS="x86" IUSE="" DEPEND="" RDEPEND="" QA_TEXTRELS="/opt/test/fglrx_dri.so" QA_TEXTREL_x86="/opt/test/fglrx_dri.so" RESTRICT="nostrip" src_install() { insinto /opt/test doexe "${FILESDIR}/fglrx_dri.so" } ======================== I still get: TEXTREL fglrx_dri.so
Ok, it seems I took something incorrect as a "reference ebuild", sorry. As I see now, ati-drivers-8.25.18 uses pathnames in "QA_" variables, and that doesn't work.
Drop the leading / The path is relative to the image directory. Also don't define both QA_FOO and QA_FOO_${ARCH}
Yes, I'm aware that QA_FOO_${ARCH} just overrides QA_FOO... but I was trying various things for a moment :-) These are working: QA_TEXTRELS="fglrx_dri.so" QA_TEXTRELS="*" But really, this doesn't work: QA_TEXTRELS="opt/test/fglrx_dri.so"