This bug report is a consequence of a comment to bug #23179 regarding the speed of rpm2targz. The short summary is: the rpm2targz package only includes a script that converts an rpm package to a tar.gz file. This behavior may be useful in Slackware, but it is not really useful in Gentoo, because what Gentoo folks need is files, not tar.gz files. The rpm2targz script wastes resources on a "gzip -9" command, which can take a really long time and is not really useful for us. I suggest the rpm2targz ebuild include an extra file or two:
    # cat /usr/bin/rpm2cpiocat
    #!/bin/sh
    dd ibs=`rpmoffset < "$1"` skip=1 if="$1" 2> /dev/null | gzip -dc
and also
    # cat /usr/bin/rpmextract
    #!/bin/sh
    dd ibs=`rpmoffset < "$1"` skip=1 if="$1" 2> /dev/null | gzip -dc | cpio --extract --make-directories --unconditional
This is just a basic suggestion. Of course these scripts can be made much prettier, but they would be good enough for ebuilds even like this, and I only wanted to make a working suggestion, not give a complete solution.
we've been doing this for a while (check out dev-lang/ccc), i know some developers have previously suggested an extension to unpack() that will recognise rpm format and use a method like this.
I really like this unpack() suggestion. What happened? Should someone submit a new bug report in the portage category? Is there already one? Searching for "rpm" finds nothing useful on bugzilla. Regarding the ccc ebuild -- great! Shouldn't we suggest that the rest of the ebuilds that use rpm2cpio adopt the same idea for the moment? It seems those ebuilds are: scim-chinese, mkl, sgi-oss-glu, icaclient. Maybe someone could warn the maintainers of those ebuilds to take a look at this bug.
interesting. regarding scim-chinese, there is a reason why i'm using rpm2cpio: rpmoffset plain doesn't work on that rpm. if you can get it to work, please let me know. otherwise, even the README for rpm2targz states that it doesn't work for all rpm2, 9.0 that i just committed even uses rpm2cpio if that is available because it works much better than rpmoffset.
What about creating a "virtual" (?) that's able to extract stuff from rpm files? Then people who've already got rpm installed can use that, and if not, it can require something more minimal etc.
no that won't work. some rpms don't work with rpm2targz and need rpm explicitly to extract.
The rpmoffset.c program has this line:
    if (*p == '\037' && p[1] == '\213' && p[2] == '\010')
Over here: http://www.rpm.org/max-rpm/s1-rpm-file-format-rpm-file-format.html it mentions that a gzipped archive starts with 1F8B (hex), and 1F happens to be 37 octal and 8B is 213 octal. No idea what the 10 octal is for...
Also, this is interesting:
    > ./rpmoffset < scim-chinese-0.2.2-1.i586.rpm
    >
It never finds an offset. However:
    > grep "BZ" scim-chinese-0.2.2-1.i586.rpm
    Binary file scim-chinese-0.2.2-1.i586.rpm matches
    >
My guess would be that it's because the archive doesn't contain a gzip-compressed payload. Some error checking should be added in the rpm eclass to make sure it returns something... "BZ" is the magic string identifier for a bzip2 compressed file.... I can't find any info on the rpm format that lists it though, so maybe it just appeared in there by chance. Will check.
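On the "10 octal" question: in the gzip file format (RFC 1952) the third header byte is the compression method, and 8 (octal 010) means deflate, so rpmoffset is matching the two magic bytes plus the method byte. A minimal sketch of the same check in shell follows. The name payload_kind is made up for illustration, and unlike rpmoffset it only looks at the first three bytes of its input instead of scanning for them.

```shell
#!/bin/sh
# payload_kind: hypothetical helper, not part of rpm2targz.
# Reads stdin and classifies it by its first three bytes.
payload_kind() {
    # od prints the bytes as hex; tr strips the padding od adds
    magic=$(od -An -tx1 -N3 | tr -d ' \t\n')
    case "$magic" in
        1f8b08) echo gzip ;;     # 037 213 010: gzip magic + deflate method
        425a68) echo bzip2 ;;    # "BZh": bzip2 magic
        *)      echo unknown ;;
    esac
}
```

It would be fed the tail of the rpm once an offset is known, e.g. `dd if=foo.rpm bs=$offset skip=1 2>/dev/null | payload_kind`, where foo.rpm and $offset are placeholders.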
Wow... looking at the rpm2cpio.c source reveals the following piece of code
    if (!strcmp(payload_compressor, "gzip"))
        t = stpcpy(t, ".gzdio");
    if (!strcmp(payload_compressor, "bzip2"))
        t = stpcpy(t, ".bzdio");
    }
Comment #6 tells us the obvious problem - rpmoffset does not check for bzip2 compression. I used the data from /usr/share/misc/file/magic to see that "BZh" is actually the magic for bzip2 and patching rpmoffset.c as follows: (see the attachment I'll attach next as well)
    --- rpmoffset.c 2003-06-21 21:25:14.000000000 +0900
    +++ rpmoffset.c 2003-06-25 01:34:53.000000000 +0900
    @@ -16,8 +16,13 @@
     {
         char *buff = malloc(RPMBUFSIZ),*eb,*p;
         for (p = buff, eb = buff + read(0,buff,RPMBUFSIZ); p < eb; p++)
    +    {
             if (*p == '\037' && p[1] == '\213' && p[2] == '\010')
                 printf("%d\n",p - buff),
                 exit(0);
    +        if (*p == 'B' && p[1] == 'Z' && p[2] == 'h' )
    +            printf("%d\n",p - buff),
    +            exit(0);
    +    }
         exit(1);
     }
actually makes rpmoffset produce 4412 for "rpmoffset < scim-chinese*"
However, this also means that we'd need to do something like:
    cmd="dd if=$rpm bs=`./rpmoffset < $rpm` skip=1"
    case $cmd 2>/dev/null | file - in
        *gzip*) cat=gzcat ;;
        *bzip*) cat=bzcat ;;
    esac
in order to uncompress it nicely. Mr. Tse, maybe you can use this in your rpm.eclass?
Created attachment 13781: rpm2targz-bzip2.diff
makes rpmoffset.c recognize bzip2 compressed data
Ooops,
    - case $cmd 2>/dev/null | file - in
    + case "`$cmd 2>/dev/null | file -`" in
Well, looks like you beat me to the patch. :)
    > dd if=scim-chinese-0.2.2-1.i586.rpm bs=4412 skip=1 | file -
    503+1 records in
    503+1 records out
    /dev/stdin: bzip2 compressed data, block size = 900k
Whaddya know.. Btw, anyone know why this damn thing keeps inserting newlines in my comments? it is kind of annoying.
[OT] Re: newlines in comments
It's the fault of <textarea wrap="physical">, it seems. A nice comment on the subject can be found at http://www.utexas.edu/learn/forms/boxes.html . I am now trying to submit this with Opera and it seems it will submit it nicely (no newlines on this line I hope). I was really annoyed when submitting bug #22722. If this goes in nicely, we'd know we'd better *not* use Mozilla with bugzillas :)
Also, I forgot to add
    $cmd 2>/dev/null | $cat
to the end of the sample script in comment #7
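Putting the sample script from comment #7, the quoting fix and that last line together gives roughly the following. This is only a sketch under a few assumptions: rpm_payload and pick_decompressor are made-up names, the offset is passed in as an argument (so it can come from a bzip2-aware rpmoffset), and `gzip -dc` / `bzip2 -dc` stand in for gzcat / bzcat, which are not installed under those names everywhere. The empty-offset check is the error checking asked for in comment #6.

```shell
#!/bin/sh
# pick_decompressor: map a description printed by file(1) to a command.
pick_decompressor() {
    case "$1" in
        *gzip*) echo "gzip -dc" ;;
        *bzip*) echo "bzip2 -dc" ;;
        *)      return 1 ;;      # unknown payload: let the caller fail
    esac
}

# rpm_payload FILE OFFSET: write the decompressed cpio stream to stdout.
# OFFSET is whatever rpmoffset printed for FILE.
rpm_payload() {
    [ -n "$2" ] || return 1      # rpmoffset found nothing
    kind=$(dd if="$1" bs="$2" skip=1 2>/dev/null | file -)
    cat=$(pick_decompressor "$kind") || return 1
    # $cat is deliberately unquoted so "gzip -dc" splits into command + flag
    dd if="$1" bs="$2" skip=1 2>/dev/null | $cat
}
```

An ebuild-style call would then look something like `rpm_payload "$rpm" "$(rpmoffset < "$rpm")" | cpio --extract --make-directories --unconditional`.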
Another approach would be to patch rpmoffset to produce output like:
    gzcat:12345
    bzcat:12345
rpm2targz would have to be patched to understand this output as well, just in case there are people who use it. We can then have the much simpler:
    offset=`rpmoffset < $rpm`
    dd if="$rpm" bs=${offset##*:} skip=1 | ${offset%%:*} | cpio ....
This would however break packages that already use rpmoffset. Maybe the patched rpmoffset can be included with a different name, because it would be much easier to use anyway.
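For anyone not used to the ${var##pattern} / ${var%%pattern} forms, here is a small sketch of how the proposed "decompressor:offset" output would be taken apart. The two function names are hypothetical, and the sample value just reuses the 4412 offset reported for scim-chinese in comment #7.

```shell
#!/bin/sh
# Split the proposed "<decompressor>:<offset>" output with POSIX
# parameter expansion. Both function names are hypothetical.
offset_of()       { printf '%s\n' "${1##*:}"; }   # drop everything through the last ':'
decompressor_of() { printf '%s\n' "${1%%:*}"; }   # drop everything from the first ':'

out="bzcat:4412"   # example value only
echo "$(decompressor_of "$out") at byte $(offset_of "$out")"
# prints: bzcat at byte 4412
```

Both expansions are plain POSIX sh, so this would not need bash in an ebuild or eclass.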
hmm, the changes seem interesting enough to propagate back to slackware, since this is their code. so the solutions we have are:
1. add bzip2 detection and change the output of rpmoffset to output both compression type and offset
2. add bzip2 detection and keep the same output format, add some further logic in the rpm.eclass to handle that case by running file against it.
3. totally rewrite rpmoffset (it's only a couple of lines of code) to do the decompression as well so it is more "rpm2cpio"-like.
i guess since george has already done (2), that is probably the easiest of them all. assigning to myself, i'll fix it within the next couple of days
ok .. the changes have been committed. thanks for your help guys. scim-chinese can now use the rpmoffset and rpm eclass happily. as for the other packages, sgi-oss-glu is p.masked and will prob be removed in the near future. mkl and icaclient are both fetch restricted packages. the URI for mkl is incorrect as well, so i can't verify that they work. i'll leave it to the maintainers of those packages to switch it over to the rpm eclass.