Bug 113630 - re-releasing snapshots can break things
Summary: re-releasing snapshots can break things
Status: RESOLVED NEEDINFO
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Tools
Hardware: All Linux
Importance: High normal
Assignee: Zac Medico
URL: http://dev.gentoo.org/~robbat2/emerge...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-11-26 07:51 UTC by Florian Steinel
Modified: 2010-08-03 22:45 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments

Description Florian Steinel 2005-11-26 07:51:04 UTC
ls -la /var/delta-webrsync/
drwxrwx---   2 root portage     4096 23. Nov 20:51 .
drwxr-xr-x  14 root root        4096  4. Aug 20:03 ..
-rw-r--r--   1 root root    21468300 23. Nov 20:05 portage-20051122.tar.bz2
-rw-r--r--   1 root root          59 23. Nov 20:51 portage-20051122.tar.bz2.md5sum
-rw-r--r--   1 root root          55 23. Nov 03:44 portage-20051122.tar.bz2.umd5sum

ls -la /usr/portage/distfiles/snapshot-2005*
-rw-r--r--  1 root root  65559 24. Nov 03:40 /usr/portage/distfiles/snapshot-20051122-20051123.patch.bz2
-rw-r--r--  1 root root     71 24. Nov 03:40 /usr/portage/distfiles/snapshot-20051122-20051123.patch.bz2.md5sum
-rw-r--r--  1 root root 122802 25. Nov 03:42 /usr/portage/distfiles/snapshot-20051123-20051124.patch.bz2
-rw-r--r--  1 root root     71 25. Nov 03:42 /usr/portage/distfiles/snapshot-20051123-20051124.patch.bz2.md5sum
-rw-r--r--  1 root root  13421 26. Nov 03:37 /usr/portage/distfiles/snapshot-20051124-20051125.patch.bz2
-rw-r--r--  1 root root     71 26. Nov 03:37 /usr/portage/distfiles/snapshot-20051124-20051125.patch.bz2.md5sum

ls -la /etc/localtime
lrwxrwxrwx  1 root root 33 26. Nov 16:45 /etc/localtime -> /usr/share/zoneinfo/Europe/Berlin

emerge-delta-webrsync
Looking for available base versions for a delta
fetching patches
failed fetching snapshot-20051125-20051126.patch.bz2.md5sum
patch_fh size=65559
patch_type=8
patch_fh size=122802
patch_type=8
patch_fh size=13421
patch_type=8
verbosity level(1)
src_fh size=21468300
disabling bufferless, patch_count(3) == 1 || forced_reorder(1)
size1=206653440, size2=206858240
reconstruction return=0, commands=9617
result was 9617 commands
versions size is 206858240
index_size = 4808
size1=206858240, size2=206981120
reconstruction return=0, commands=30061
result was 30061 commands
versions size is 206981120
index_size = 15030
size1=207134720, size2=207124480
/usr/bin/emerge-delta-webrsync: line 426: 26651 Segmentation fault  patcher -v "${dfile}" ${patches} "${TEMPDIR}/portage-${final_date}.tar"
reconstruction failed (contact the author with the error from the reconstructor please)
Fetching most recent snapshot
Attempting to fetch file dated: 20051126
 --- No md5sum present on the mirror. (Not yet available.)
Attempting to fetch file dated: 20051125
Comment 1 Florian Steinel 2005-11-26 07:52:39 UTC
emerge --info
Portage 2.0.51.22-r3 (default-linux/x86/2005.1, gcc-3.3.6, glibc-2.3.5-r1,
2.6.14-gentoo-r2 i686)
=================================================================
System uname: 2.6.14-gentoo-r2 i686 AMD Athlon(tm) Processor
Gentoo Base System version 1.6.13
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
dev-lang/python:     2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1
sys-devel/libtool:   1.5.20
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O3 -march=athlon -funroll-loops -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config
/usr/lib/X11/xkb /usr/lib/mozilla/defaults/pref /usr/share/config
/var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/splash /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -march=athlon -funroll-loops -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distcc distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.tu-clausthal.de/pub/linux/gentoo/"
LANG="de_DE.UTF-8"
LC_ALL="de_DE.UTF-8"
LINGUAS="de en"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
USE="x86 3dnow X aac acl alsa apm audiofile avi bash-completion berkdb
bitmap-fonts bonobo bzip2 bzlib cdr crypt cups curl curlwrappers dbus dts eds
emboss encode esd exif expat fam ffmpeg foomaticdb fortran ftp gdbm gif glut
gnome gpm gstreamer gtk gtk2 guile hal howl imagemagick imlib ipv6 java jpeg
lcms ldap libg++ libwww mad mikmod mmx mng motif mozilla mozsvg mp3 mpeg ncurses
nls ogg oggvorbis opengl oss pam pcre pdflib perl png python qt quicktime
readline recode sdl slang slp spell ssl svg svga tcltk tcpd tiff truetype
truetype-fonts type1-fonts udev unicode userlocales vorbis xine xml2 xmms xprint
xv zlib linguas_de linguas_en userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LDFLAGS
Comment 2 Brian Harring (RETIRED) gentoo-dev 2005-11-28 00:21:41 UTC
please paste the output of diffball -V
Also, please post md5sum of the patches involved (actual md5sum, not .md5 file),
and uncompressed md5sum of the portage tarball (bzip2 -dc the_file | md5sum - #
is an easy way to get it).

Trying to A) track down if it's a bug in diffball, B) corruption of local
basefile, C) upstream corruption of the patches (will verify tomorrow), D)
something else. :)
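
The two checksums requested above can also be computed without shell tools. This is a minimal Python sketch, not part of diffball or emerge-delta-webrsync; the function names are made up for illustration. The first function hashes a file as stored on disk (for the .patch.bz2 files), the second hashes the decompressed stream, which should match what `bzip2 -dc the_file | md5sum -` prints.

```python
import bz2
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """MD5 of the file exactly as stored on disk (e.g. a .patch.bz2)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a 200 MB tarball never sits in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_uncompressed(path, chunk_size=1 << 20):
    """MD5 of the decompressed contents of a bzip2 file."""
    digest = hashlib.md5()
    with bz2.open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the result of `md5_of_uncompressed` with the contents of the .umd5sum file is one way to rule out corruption of the local base file.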
Comment 3 Florian Steinel 2005-11-28 12:26:31 UTC
diffball version 0.6.5
2ecc03a8c0794cb6cfd04b9b8cc0fa43  snapshot-20051122-20051123.patch.bz2
21306ade8655e7831ef340a3be8eb582  snapshot-20051123-20051124.patch.bz2
a3c3ac36d547c5384b0e99769848a14f  snapshot-20051124-20051125.patch.bz2
56cce97b339584c69d69c8cf93a1889c  snapshot-20051125-20051126.patch.bz2
Output from "bzip2 -dc portage-20051122.tar.bz2 | md5sum -":
8f896e15d2ea157f04ad6b63f1bde0ff

portage-20051122.tar.bz2.umd5sum states 
8f896e15d2ea157f04ad6b63f1bde0ff 
so emerge-delta-webrsync is not the cause. I also compared the md5sum output
with the *.md5sum files, and they all match.
Maybe C) upstream corruption of the patches?
Comment 4 Yaroslav Isakov 2005-11-28 13:53:14 UTC
I also get this error with patcher -v portage-20051124.tar
snapshot-20051124-20051125.patch.bz2 portage-20051125.tar

output of this command:

patch_fh size=13421
patch_type=8
verbosity level(1)
src_fh size=206981120
disabling bufferless, patch_count(1) == 1 || forced_reorder(1)
size1=207134720, size2=207124480
reconstruction return=0, commands=1954
result was 1954 commands
versions size is 207124480
applied 1 patches
reordering commands? 1
reconstructing target file based off of dcbuff commands...
collapsing
processing src 0: 1194 commands.
apply-patch.c: x=52072, pos=206929048, len=65535
apply-patch.c: bailing, io_error 2
error detected in patcher.c:325
reconstructFile: io/bus error, exiting
error detected while reconstructing file, quitting 

Please note the line size1=207134720, size2=207124480: the file
portage-20051124.tar has size 206981120! The md5sum of portage-20051124.tar is
aece7bb7be91433f1d4bdd926e874462
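
The size mismatch pointed out here can be spotted mechanically in the patcher output. The sketch below is illustrative Python, not part of diffball. It assumes, based only on the logs in this bug, that `size1` is the source size a patch expects and `size2` is the size it produces, so each patch's `size1` should equal the previous patch's `size2`.

```python
import re

# Matches the "size1=..., size2=..." lines printed by patcher -v.
SIZE_LINE = re.compile(r"size1=(\d+), size2=(\d+)")


def find_chain_breaks(log_text):
    """Return (patch_index, expected_size1, actual_size1) for each break."""
    pairs = [(int(a), int(b)) for a, b in SIZE_LINE.findall(log_text)]
    breaks = []
    for index in range(1, len(pairs)):
        expected = pairs[index - 1][1]  # size produced by the previous patch
        actual = pairs[index][0]        # size the current patch expects
        if actual != expected:
            breaks.append((index, expected, actual))
    return breaks


# The three size lines from the log in the description of this bug.
sample_log = """\
size1=206653440, size2=206858240
size1=206858240, size2=206981120
size1=207134720, size2=207124480
"""
print(find_chain_breaks(sample_log))  # [(2, 206981120, 207134720)]
```

With the sizes from the description, the third patch expects a 207134720-byte source while the second patch produced 206981120 bytes, which fits a 20051124 snapshot that was regenerated after the earlier patch had been built.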
Comment 5 Brian Harring (RETIRED) gentoo-dev 2005-11-28 20:12:31 UTC
Friendly reminder folks that re-releasing binary snapshots without notifying me
means you've just changed the md5 on a snapshot, moreso, changed the data of
that snapshot.

Why does this matter?  The patch generation that users rely on is cronjobbed to run
shortly after the _normal_ snapshot is generated.  In other words, if you re-run it
without giving me any form of warning, you have just broken the upgrade chain for all
emerge-delta-webrsync users, meaning no more 150 KB per day patch; they have to pull
down the full 20 MB.

Klieber please notify me *prior* to doing this.  Re-running the snapshots just
to create symlinks on 11/24 just broke upgrade paths for people, and that sucks.  

Anyone who has the original 11/24 tarball, please make it available to me.  

Either you can wipe all portage snapshots you have, and suffer a full download,
or I'll generate a patch to mangle the data so y'all are back to the regenerated
md5.
Comment 6 Lance Albertson (RETIRED) gentoo-dev 2005-11-29 06:15:27 UTC
(In reply to comment #5)
> Friendly reminder folks that re-releasing binary snapshots without notifying me
> means you've just changed the md5 on a snapshot, moreso, changed the data of
> that snapshot.

Eh? I don't recall doing this unless klieber did w/o letting us know.

> Why does this matter?  The patch generation that users rely on is cronjobbed to run
> shortly after the _normal_ snapshot is generated.  In other words, if you re-run it
> without giving me any form of warning, you have just broken the upgrade chain for all
> emerge-delta-webrsync users, meaning no more 150 KB per day patch; they have to pull
> down the full 20 MB.

Perhaps you should include checks in your code so that such things don't break
like this. You can't assume that the snapshot will be generated properly every
day at the exact time if something got mixed up. Please don't blame us if you
can't handle something like this in your app. 

> Klieber please notify me *prior* to doing this.  Re-running the snapshots just
> to create symlinks on 11/24 just broke upgrade paths for people, and that sucks.  
> 
> Anyone who has the original 11/24 tarball, please make it available to me.  

No idea if that will be possible if he overwrote it.

Comment 7 Kurt Lieber (RETIRED) gentoo-dev 2005-11-29 07:36:48 UTC
I recreated the snapshots to fix the expired GPG key, not "just to create
symlinks".  As Lance said, your code needs to account for situations like this.
Comment 8 Brian Harring (RETIRED) gentoo-dev 2005-11-29 09:31:56 UTC
What you're doing when you go and re-release it without letting me know what has
occurred, is screwing any delta-webrsync user who happened to use the patch
during that time frame- yes, I can fix the upgrade path to use the *new* md5,
but anybody who happened to get the 'old' release is now stuck in an upgrade
line you've effectively created, and abandoned (one that if they upgrade during
that time frame, they're locked into).  

Frankly, this isn't my code sucking (the code *does* suck, but this isn't an
issue of the code, it's an issue of the process).

Every time you do a re-release, you are effectively rebasing that date's
snapshot to a different dataset.  Yes you can update the md5, but the patch
generation has already split a patch for that date, and users have *already*
downloaded it.  Meaning, even if I go and correct this crap, the users who hit
that window are screwed and there isn't really a damn thing I can do for them
unless I have access to the new *and* old snapshots.  Even if I have access to
the old, I have to either jam hacks into my code to correct for people doing
this stuff (need a good reason why also), or post a patch and hope the screwed
users find it.

This "your code needs to account for this" is pretty much misunderstanding how
this stuff works, or a nice attempt to tell me to get bent ;)

Further... re-releases just for signing (which you could have done without
regenerating the data, or just waited the 2 hours for the next snapshot regen to
occur), horks up any users who are pulling the tarball down but have stopped. 
Howso?  Even if they were *not* using emerge-delta-webrsync, you just
invalidated whatever data they had already pulled down.  Same base problem, same
resultant outcome- full download required.

Again, please notify me *prior* to doing this, or at least let me know when it
occurs so that I can do something.  The issues I've described above aren't shite
code problems, it's a fundamental problem involving *any* patch.  Look at the
kernel, you don't  see Linus telling people "fix your patch generation software"
if he goes and re-releases 2.6.14 with different data- no, he understands
releases (rules involved) and releases another version, rather than going back
and screwing with releases that have already been made available to others. 
Choose any other project, and you see the same thing.

I hope I've made it clear why doing the re-releasing screws things up and is
avoided wherever releases are done- further, hope I've made clear why this isn't
me being a tool, it's a fundamental issue for any deltas generated- you cannot
go changing version datasets without causing issues (shite code or not).

What I *can* do is extend the version namespace so it's not just dates-
20051124.1.tar.bz2 fex.  This however sucks, since the querying is against
distfiles mirrors, it'll have to try and pull the .1 everytime to see if someone
went and did the re-release crap.  Further, it doesn't solve anything unless I
know y'all did it, since again, there is that issue of old vs new.
Comment 9 Lance Albertson (RETIRED) gentoo-dev 2005-11-29 09:50:19 UTC
I have told you many times that you cannot rely on the snapshots being perfect
every day. If this delta project is that fragile to snapshot changes, you should
really re-think how the whole thing works. Our snapshots cannot be gaurunteed
like linux patchsets. Its a plain and simple fact. Please stop complaining about
something I told you ages ago you could not rely on 100%.

We need to work through this to find a solution that's sane instead of you
relying on us to report to you whenever we do something like this. 
Comment 10 Brian Harring (RETIRED) gentoo-dev 2005-11-29 10:03:48 UTC
>I have told you many times that you cannot rely on the snapshots being perfect
>every day.
First I've heard.

>We need to work through this to find a solution that's sane instead of you
>relying on us to report to you whenever we do something like this. 
The problem here, is that even if I pull a horrible hack out to cover up y'all
not being willing to provide consistent snapshots, it *still* is reliant on my
scripts catching it and correcting it as soon as it occurs.

Please clarify to me why you cannot provide consistent snapshots; obviously the
whatever previous explanation never registered on my end, so let's have at it
again please.

The only valid scenario I can see for when you cannot provide consistent
snapshots is when the generated snapshot (the raw tarball) is horked.  Hasn't
been the issue, issue has been snapshots screwed with after the fact.

Re-signing it (fex) doesn't require invalidating the data- merely resigning the
data with the new key. So... yeah, please clarify the "cannot provide consistent
snapshots".
Comment 11 Brian Harring (RETIRED) gentoo-dev 2005-11-29 10:56:42 UTC
bleh, pardon, sans the "the whatever" from the last posting since my editing
skills suck.

Regarding a solution that is 'sane', as long as the patch _generation_ is in the
loop when y'all go screwing with stuff, I can at least automate correcting the
upgrade path for users who haven't _yet_ upgraded to the release y'all replaced.

That still leaves anyone who upgraded during that window a bit screwed (coming
up with hack to deal with that won't be sane by my definition, offhand), which
is the  main issue here, and why avoiding regenning the _data_ unless absolutely
needed is damn important.
Comment 12 Kurt Lieber (RETIRED) gentoo-dev 2005-11-29 11:05:49 UTC
This isn't an issue of us providing consistent snapshots -- we provide at least
one snapshot per 24 hour period.  Consistently.

This is an issue of you re-purposing something in a way it was never designed to
do and then complaining to us when we continue to use it in the way that we
always have.

We are telling you we cannot guarantee one and only one snapshot per 24 hour
period.  While we will certainly *try* to notify you if and when we decide we
have to re-issue a snapshot, we are also telling you that we cannot guarantee
this and we will certainly not wait for your approval before issuing the
re-release.   

You have a dependency on an external source and need to account for reasonable
unexpected behavior in that external source.  There are any number of ways you
can accomplish this, including storing the last  snapshot that you've generated
a delta from or writing something that uses the portage tree directly.

As Lance said, we're willing to work with you to find a mutually acceptable
solution, but hamstringing us so that we cannot re-release a snapshot (or can't
release it without your approval) is not acceptable.
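
One way to read the suggestion above about remembering what a delta was generated from is sketched below in Python. This is purely illustrative and not taken from the actual patch generation scripts; the class and method names are hypothetical. It keeps the MD5 of the data last seen under each snapshot name and reports when the same name reappears with different contents. Regenerating an old-to-new patch would still require the old snapshot data itself, which this sketch does not store.

```python
import hashlib


class SnapshotRegistry:
    """Remembers the digest last seen for each snapshot filename."""

    def __init__(self):
        self._digests = {}

    def observe(self, name, data):
        """Record a snapshot; return 'new', 'unchanged' or 're-released'."""
        digest = hashlib.md5(data).hexdigest()
        previous = self._digests.get(name)
        self._digests[name] = digest
        if previous is None:
            return "new"
        return "unchanged" if previous == digest else "re-released"
```

A patch generator could call `observe` before building each delta and treat a "re-released" result as the signal that already-published patches for that date no longer apply to the file on the mirrors.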
Comment 13 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-11-29 12:47:40 UTC
reading this, and following the issue, I see a very simple solution.
Don't re-use an existing filename. If you need to re-issue a snapshot, it should
be to a file with a different name.

Just changing the file contents (and not the name) leads to the same place as
upstream packages that do the same thing, breaking the tree each time.
Comment 14 Brian Harring (RETIRED) gentoo-dev 2005-11-29 19:04:59 UTC
>This is an issue of you re-purposing something in a way it was never designed 
>to do and then complaining to us when we continue to use it in the way that we
>always have.
I'll remind you *nicely* I was running this project externally prior to infra
deigning to fold it back onto official hardware- the issues were known when it
was folded back, and y'all should *still* know the issues of versioned delta
compression versus versionless delta compression. I've also reiterated the
points again above.

>We are telling you we cannot guarantee one and only one snapshot per 24 hour
>period.  While we will certainly *try* to notify you if and when we decide we
>have to re-issue a snapshot, we are also telling you that we cannot guarantee
>this and we will certainly not wait for your approval before issuing the
>re-release. 
You've not given any reasons why you cannot, merely told me you cannot.  If
you're still convinced it can't be done, hand it off to me and I'll do it.

Use Robin's suggestion- it's best practice, it prevents screwing the
emerge-delta-webrsync users, and in combination with your symlink addition you
have the _same_ capabilities.

Hell, I'll write it myself, since I'm already looking to replace the crap
scripts in use (namely avoiding the horrid race condition y'all have built in
for rsync generation and snapshot creation).
Comment 15 Kurt Lieber (RETIRED) gentoo-dev 2005-11-29 19:56:22 UTC
Brian, thank you for sharing your opinions on how you would solve this if you
were responsible for it.  Robin's suggestion, while good, won't work since it
would break emerge-webrsync, which expects files to be named a certain way.

If there are other suggestions which do not prevent us from re-issuing a
same-named file in the event of an emergency, then please post them here.  
Comment 16 Brian Harring (RETIRED) gentoo-dev 2005-11-29 21:14:20 UTC
What's the point of the symlink addition if emerge-webrsync isn't migrated over
to it then?  We change emerge-webrsync to use the symlink, you can still do the
re-release cruft as much as you want.  emerge-webrsync's only dependency is on
the upstream name it looks for- if it were looking for a simple symlink (instead
of doing the known bad 40 day backwards search), no horkage.

No issues in converting over to it either- the *valid* reasons to force a regen
do not occur that often, the soonest I would expect this to rear its head in a
valid way would be 6 months down the line (I would still be surprised if it
occurred).  More than enough time for the change to spread, plus it's backwards
compatible with existing webrsync (symlink merely points at a file, it does not
change *normal* release names).

So... offhand, it will *not* break anything converting over and is a larger time
frame then we give for even metadata conversions for both rsync and snapshots. 
The only potential horkage is while the symlink change over is spreading, which
if capped at 6 months for waiting for people to upgrade (meaning, you can do the
evil re-release of the same name during the time frame), I'll gladly shut my
mouth about. 

Finally... the existing code that allows for unneeded horkages, I'll address
(separate bug or via email, your choice), so the chance of any race condition
rearing its head won't exist.  Sans people re-releasing without a reason, it's
pretty much a bullet proof transition plan that exceeds our normal transition
period reqs.

Considering points made above, what exact points remain that render this
solution unacceptable?
Comment 17 Kurt Lieber (RETIRED) gentoo-dev 2005-11-29 21:18:56 UTC
Brian, I'm through discussing this.  We are not going to hamstring ourselves
into not being able to re-release a snapshot for any reason we deem necessary. 
I'm sorry if you disagree with this decision, but it is final.  Again, we *will*
work with you to find another solution that works for *both* sides, but I am not
interested in further debating the pros and cons of whether or not re-issuing
snapshots with the same name is valid/necessary/whatever.  
Comment 18 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-11-29 21:25:58 UTC
if you wish webrsync to continue to point to snapshots with simply a date, then
do so (but changing it to point to 'portage-current' is probably advisable too).

I would say at this point it's easier to get the users of the delta system to
move than it is to get the users of the webrsync system to move - simply because
there are probably far fewer delta users.

1. change the snapshot filename format to portage-YYYYMMDDHHMM.
2. for the next N months, have symlinks named portage-YYYYMMDD pointing to the
newest YYYYMMDDHHMM file.
3. never reissue a snapshot more than once a minute.
4. change the delta system to use the YYYYMMDDHHMM snapshots to create deltas.

This ensures an upgrade path for the users of webrsync. It's a bit bumpy for the
delta users as they need a newer version of delta-webrsync, but I believe this
is the least invasive means of solving everybody's problems.
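
Steps 1 and 2 of the scheme above can be sketched as a small planning function. This Python is only an illustration of the naming idea, not the real gen-snapshots.sh logic; the `.tar.bz2` suffix is assumed from the existing snapshot names, and the function name is hypothetical. Given the timestamped files present on a mirror, it works out which file each portage-YYYYMMDD symlink should point to.

```python
import re

# portage-YYYYMMDDHHMM.tar.bz2: 8 date digits followed by 4 time digits.
STAMPED = re.compile(r"^portage-(\d{8})(\d{4})\.tar\.bz2$")


def plan_date_symlinks(filenames):
    """Map each date-only symlink name to the newest stamped file of that day."""
    newest = {}
    for name in filenames:
        match = STAMPED.match(name)
        if not match:
            continue  # ignore anything not in the timestamped format
        date, _time = match.groups()
        link = f"portage-{date}.tar.bz2"
        # Fixed-width digits make plain string comparison chronological.
        if link not in newest or name > newest[link]:
            newest[link] = name
    return newest
```

A re-issued snapshot then gets a new HHMM name of its own, so patches built against the earlier file keep a stable base while the date-only symlink moves to the newer file.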
Comment 19 Kurt Lieber (RETIRED) gentoo-dev 2005-11-29 21:36:03 UTC
Just for the record, we don't really care what emerge-webrsync chooses to point
to, be it a date, symlink, whatever.  That's up to the folks that maintain the
package.  (karltk, last I checked)  Historically, it has used a date, that's all.

Robin's suggestion seems to cover most of the bases.  The only problem I can
foresee is how will emerge-webrsync know what minute to look for?  Might be
easier to change it to look for the -latest symlink as long as we're going to
update the script.

So, from what I can see, this would still give us the flexibility to re-release
as often as necessary and gives the deltup stuff a consistent base.

Brian?  thoughts?
Comment 20 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-11-30 00:17:28 UTC
Attached is my updated emerge-webrsync that pulls the -latest symlink.
process as follows:
1. download -latest md5sum
2. look in md5sum for newest filename
3. download filename from #2

I've also done some cleanups and performance improvements.
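
The three steps above hinge on reading the real snapshot name out of the md5sum file. The following Python sketch shows only that parsing and the final verification; it is not the attached emerge-webrsync (a shell script), the download itself is left out, and the function names are hypothetical. It assumes the usual `md5sum` output layout of a digest, whitespace, then a filename, optionally prefixed with `*` for binary mode.

```python
import hashlib


def parse_md5sum(text):
    """Return (digest, filename) from the first entry of md5sum-style text."""
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, filename = parts
            # md5sum marks binary-mode entries with a leading '*'.
            return digest.lower(), filename.strip().lstrip("*")
    raise ValueError("no md5sum entry found")


def verify_snapshot(data, expected_digest):
    """Check downloaded bytes against the digest from the md5sum file."""
    return hashlib.md5(data).hexdigest() == expected_digest
```

Because the filename comes from the md5sum file rather than from guessing a date, the client always fetches whichever snapshot the -latest symlink currently resolves to.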
Comment 21 Brian Harring (RETIRED) gentoo-dev 2005-11-30 19:52:53 UTC
Well, Robin's suggestion is in line with what I'm arguing for above, so yeah,
works for me.  Modified emerge-webrsync is fine, I'll commit it (the little
maintenance it requires I handle) as soon as y'all give the go on this route.

If you want a patch for gen-snapshots.sh, give a yell.

Any remaining issues for the route hashed out?
Comment 22 Brian Harring (RETIRED) gentoo-dev 2005-12-17 19:08:29 UTC
*bump*

Infra, do you accept the proposal?  If not, what issues?
Comment 23 Brian Harring (RETIRED) gentoo-dev 2006-01-16 00:18:45 UTC
Been a month- comment now, or I'm moving forward.
Comment 24 Brian Harring (RETIRED) gentoo-dev 2006-03-09 15:49:31 UTC
Zac, the -latest symlink modification you should probably take a look at...
Comment 25 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2006-09-30 13:32:56 UTC
zmedico/ferringb: is this bug still needed?
Comment 26 Brian Harring (RETIRED) gentoo-dev 2006-09-30 13:39:36 UTC
don't even remember, it's been 8 months and I don't have access to verify it's safe.

offhand, pretty sure the fundamental issue still is there.
Comment 27 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2010-08-03 22:45:08 UTC
4 years since last comment, no one even remembers the issue?? Closing..