Bug 50872 - Proxy cache suited for Gentoo packages
Summary: Proxy cache suited for Gentoo packages
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Tools
Hardware: All All
Importance: High enhancement
Assignee: Maurice van der Pot (RETIRED)
URL: http://gertjan.freezope.org/replicator
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-05-12 12:01 UTC by Tom P.
Modified: 2005-06-02 01:45 UTC
CC List: 14 users

See Also:
Package list:
Runtime testing required: ---


Attachments
http-replicator ebuild (http-replicator-2.0.ebuild, 1.01 KB, text/plain)
2004-05-12 12:03 UTC, Tom P.
config file patch for gentoo (http-replicator-2.0-conf-gentoo-patch, 3.49 KB, patch)
2004-05-12 12:05 UTC, Tom P.
Gentoo specific init script (http-replicator-2.0-init, 424 bytes, text/plain)
2004-05-12 12:08 UTC, Tom P.
patch for replicators daemon code to work with my init script (http-replicator-2.0-init-gentoo-patch, 955 bytes, patch)
2004-05-12 12:10 UTC, Tom P.
config file patch for gentoo: spelling corrected (http-replicator-2.0-conf-gentoo-patch, 3.42 KB, patch)
2004-05-13 23:05 UTC, Tom P.
http-replicator ebuild -r1 (http-replicator-2.0-r1.ebuild, 1.05 KB, text/plain)
2004-05-17 03:04 UTC, Tom P.
replicator partial files fix (http-replicator-2.0-gentoo-patch, 515 bytes, patch)
2004-05-17 03:05 UTC, Tom P.
config file patch for gentoo: simplify activation (http-replicator-2.0-conf-gentoo-patch, 3.43 KB, patch)
2004-05-17 03:16 UTC, Tom P.
http-replicator ebuild -r2 (http-replicator-2.0-r2.ebuild, 1.13 KB, text/plain)
2004-06-04 20:16 UTC, Tom P.
config file patch for gentoo (http-replicator-2.0-conf-gentoo-patch, 3.56 KB, text/plain)
2004-06-04 20:17 UTC, Tom P.
Gentoo specific init script (http-replicator-2.0-init, 424 bytes, text/plain)
2004-06-04 20:17 UTC, Tom P.
patch for replicators daemon code to work with my init script (http-replicator-2.0-init-gentoo-patch, 1.20 KB, text/plain)
2004-06-04 20:18 UTC, Tom P.
My cache manager script: installs, and maintains cache, imports new files. (repcacheman, 5.75 KB, text/plain)
2004-06-04 20:23 UTC, Tom P.
http-replicator including proxy support (http-replicator-flybynite-1.5.tar.bz2, 7.38 KB, application/octet-stream)
2004-08-10 07:25 UTC, John Herdy
http-replicator ebuild 2.1 (http-replicator-2.1.ebuild, 1.29 KB, text/plain)
2004-10-31 07:27 UTC, Tom P.
standard init script (http-replicator-2.1.init, 657 bytes, text/plain)
2004-10-31 07:29 UTC, Tom P.
standard /etc/conf.d/ config file (http-replicator-2.1.conf, 1.36 KB, text/plain)
2004-10-31 07:31 UTC, Tom P.
My cache manager script: installs, and maintains cache, imports new files. Portage 2.0.51 (http-replicator-2.1-repcacheman-0.32, 6.09 KB, text/plain)
2004-10-31 07:42 UTC, Tom P.

Description Tom P. 2004-05-12 12:01:47 UTC
Http-replicator is a proxy cache written in Python.  It is best suited to maintaining a local cache of Gentoo packages on a LAN or WAN.  Requests for a package are served from the local cache if available.  If not, replicator downloads the package to the cache AND to multiple clients simultaneously.

The author designed the proxy for Debian.  I contacted him and worked with him to make sure it would work with Gentoo.  He has made most of the mods I requested, but not all.  He has promised to make some more changes in the next version :-)

I have patched the current version of replicator to work with the default Gentoo install.  By default it uses /usr/portage/distfiles as the cache directory, so all previously downloaded files prime the cache.  It listens on port 8080.  Edit /etc/http-replicator.conf to change the defaults and to see short activation help.

Minor changes to /etc/make.conf are required to make portage aware of the proxy.  I made the changes in make.conf so that only portage is aware of the proxy; it won't interfere with any other proxy in use.
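
To illustrate why a proxy setting scoped to the fetch command does not leak into anything else, here is a minimal Python sketch (not part of replicator or portage). It sets http_proxy only in a child process's environment, which is the same idea as prefixing a command with `http_proxy=...`. The host name `replicator.example.lan` and the helper `run_with_proxy` are made up for the example.

```python
import os
import subprocess
import sys

# Hypothetical proxy address; replace with the real replicator host.
PROXY_URL = "http://replicator.example.lan:8080"


def run_with_proxy(command, proxy_url=PROXY_URL):
    """Run command with http_proxy set only in the child's environment."""
    child_env = dict(os.environ)        # copy, so the parent stays untouched
    child_env["http_proxy"] = proxy_url
    return subprocess.run(command, env=child_env, capture_output=True,
                          text=True, check=True)


if __name__ == "__main__":
    # The child prints the proxy it sees; the parent's own setting follows.
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('http_proxy'))"]
    print("child sees: ", run_with_proxy(probe).stdout.strip())
    print("parent sees:", os.environ.get("http_proxy"))
```

Only the child process sees the replicator address; whatever proxy the rest of the system uses is left alone.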

I suggest the net-misc category.

This package is important to Gentoo because it will not only ease the load on mirrors, but will also transfer packages to the clients at LAN speeds.

Author's description from Freshmeat:
Replicator is a replicating HTTP proxy server. Files that are downloaded through the proxy are transparently stored in a private cache, so an exact copy of accessed remote files is created on the local machine. It is, in essence, a general purpose proxy server, but especially suited for maintaining a cache of Debian or Gentoo packages.
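
As a rough illustration of the cache-or-fetch behaviour described above, here is a minimal Python sketch. It is not replicator's code: the function names are invented for the example, `fetch` is a caller-supplied stand-in for the real download, and the sketch handles one request at a time rather than streaming to several clients at once.

```python
import os
from urllib.parse import urlsplit


def flat_cache_path(url, cache_dir):
    """Map a URL to a file in a single flat cache directory."""
    name = os.path.basename(urlsplit(url).path)
    if not name:
        raise ValueError("URL does not name a file: %r" % url)
    return os.path.join(cache_dir, name)


def serve(url, cache_dir, fetch):
    """Return (source, data): from the cache if present, else fetched and stored.

    `fetch` is a caller-supplied function taking a URL and returning bytes.
    """
    path = flat_cache_path(url, cache_dir)
    if os.path.isfile(path):
        with open(path, "rb") as cached:
            return "cache", cached.read()
    data = fetch(url)
    # Write to a temporary name first so a partial download is never
    # mistaken for a complete cached file.
    tmp_path = path + ".part"
    with open(tmp_path, "wb") as out:
        out.write(data)
    os.replace(tmp_path, path)
    return "origin", data
```

The first request for a file goes to the origin and fills the cache; later requests for the same file name are answered locally.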
Comment 1 Tom P. 2004-05-12 12:03:36 UTC
Created attachment 31285 [details]
http-replicator ebuild
Comment 2 Tom P. 2004-05-12 12:05:21 UTC
Created attachment 31286 [details, diff]
config file patch for gentoo

Changes default to gentoo specific
Comment 3 Tom P. 2004-05-12 12:08:13 UTC
Created attachment 31288 [details]
Gentoo specific init script

Written by me
Comment 4 Tom P. 2004-05-12 12:10:26 UTC
Created attachment 31289 [details, diff]
patch for replicators daemon code to work with my init script

Replicator has separate daemonizing code; this patches that code to work with
the Gentoo-specific init script I wrote.
Comment 5 Tom P. 2004-05-12 12:15:48 UTC
I should add that Python 2.3 is only needed for an experimental timeout() to try to work with cranky, overloaded mirrors.  If most users haven't upgraded yet, I can remove that experimental feature.
Comment 6 Tom P. 2004-05-13 23:00:14 UTC
Comment on attachment 31286 [details, diff]
config file patch for gentoo

>--- http-replicator-2.0/http-replicator.orig/http-replicator.conf	2004-05-01 12:16:06.000000000 -0500
>+++ http-replicator-2.0/http-replicator.conf	2004-05-07 14:56:52.000000000 -0500
>@@ -1,3 +1,29 @@
>+# ************README-Gentoo Http-Replicator *******************
>+# The defaults in Http-Replicator have been changed to work with the
>+# default Gentoo install and shouldn't have to be changed.  The only 
>+# changes required to activate Http-Replicator are in /etc/make.conf
>+# on the clients and the server itself.
>+#
>+# Find the Default fetch command section in /etc/make.conf.  If you are
>+# already using one of these alternate fetch commands apply the changes
>+# to your section.  Otherwise, make the following changes:
>+#
>+# 1.  Add the PROXY="http_proxy=http://YourProxyHere.com:8080" Line
>+#	replacing YourProxyHere.com with your proxy hostname or IP address.
>+# 2.  Uncomment (remove the leading '#') from FETCHCOMMAND and RESUMECOMMAND
>+# 3.  Remove the -c from the RESUMECOMMAND
>+# 4.  Add $PROXY to the beginning of the lines.
>+# 
>+# It will look like this when complete:
>+#
>+# Default fetch command (5 tries, passive ftp for firewall compatibility)
>+# PROXY="http_proxy=http://YourMirrorHere.com:8080"
>+# FETCHCOMMAND="$PROXY /usr/bin/wget -t 5  \${URI} -P \${DISTDIR}"
>+# RESUMECOMMAND="$PROXY /usr/bin/wget -t 5  \${URI} -P \${DISTDIR}"
>+#
>+#
>+# ************END README-Gentoo Http-Replicator *******************
>+
> #  This is the configuration file for the replicator proxy server.
> #  Settings from this file will apply to the server in daemon mode and also to the cache cleaner script, if used.
> #
>@@ -14,36 +40,36 @@
> #  * flat: save files in a single directory
> #  * debug: crash on exceptions
> 
>-FLAGS = []
>+FLAGS = ['static','flat']
> 
> #  For security reasons the hosts for which access to the proxy is granted should be specified in the [IP] list.
> #  A '?' can be used as wildcard for a single digit and a '*' for a multiple digits.
> #  For example '10.0.?.*' grants access from 10.0.1.25 but not from 10.0.15.25.
> 
>-IP = ['127.0.0.1']
>+IP = ['127.0.0.1','192.168.*.*','10.*.*.*']
> 
> #  The proxy server can be monitored via telnet on port [TELNET].
> #  This is disabled by entering a zero value.
> #  Otherwise make sure the port is available or replicator will not start.
> 
>-TELNET = 8081
>+TELNET = 0
> 
> #  The process user id is set to [USER].
> #  The daemon must be started as root because no other user can change into another.
> #  Not even [USER] can change into itself!
> 
>-USER = 'proxy'
>+USER = 'portage'
> 
> #  All cached files ar saved in directory [DIR].
> #  The [USER] should of course have write permission in this directory.
> #  Where in this directory the files are actually put depends on if the server is in flat mode.
> #  By default the entire directory structure is copied.
> 
>-DIR = '/home/cache'
>+DIR = '/usr/portage/distfiles'
> 
> #  The process id of the running process is saved in [PID].
> #  As this is done before changing into [USER], write permission for [USER] for this file is not needed.
>-
>+#  Changes here also have to be make in the init script.
> PID = '/var/run/http-replicator.pid'
> 
> #  All messages on stdout and stderr are in daemon mode written to the [LOG].
>@@ -55,5 +81,5 @@
> #  The value of [KEEP] sets the maximum number of versions of each package to be kept.
> #  For example a value of one will delete all versions but the most recent.
> #  The script is disabled by setting this value to zero.
>-
>+#  Not implemented in Gentoo yet!
> KEEP = 2
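
For anyone wondering how the [IP] wildcards in the config above behave, here is a small Python sketch of the matching rule as the config comments describe it. This is not replicator's own implementation; `pattern_to_regex` and `is_allowed` are names invented here, and treating '*' as "one or more digits" is an assumption based on the wording "multiple digits".

```python
import re


def pattern_to_regex(pattern):
    """Translate an [IP] entry such as '10.0.?.*' into a compiled regex."""
    parts = []
    for char in pattern:
        if char == "?":
            parts.append(r"\d")      # exactly one digit
        elif char == "*":
            parts.append(r"\d+")     # assumption: one or more digits
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts))


def is_allowed(client_ip, patterns):
    """Return True if client_ip fully matches any pattern in the list."""
    return any(pattern_to_regex(p).fullmatch(client_ip) for p in patterns)


ALLOWED = ['127.0.0.1', '192.168.*.*', '10.*.*.*']  # values from the patch
print(is_allowed('192.168.1.7', ALLOWED))   # True
print(is_allowed('172.16.0.1', ALLOWED))    # False
```

With the example from the config comments, `is_allowed('10.0.1.25', ['10.0.?.*'])` is true and `is_allowed('10.0.15.25', ['10.0.?.*'])` is false, because '?' stands for a single digit only.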
Comment 7 Tom P. 2004-05-13 23:05:13 UTC
Created attachment 31393 [details, diff]
config file patch for gentoo: spelling corrected

updated spelling error
Comment 8 Bret Towe 2004-05-16 13:06:33 UTC
I just tried this out and it friggin rocks.
I've been waiting for something like this for ages.

AKA works for me :)
Comment 9 Seemant Kulleen (RETIRED) gentoo-dev 2004-05-16 20:44:31 UTC
I'm going to try this in the next couple of days, this is really cool!
Comment 10 Tom P. 2004-05-17 03:04:03 UTC
Created attachment 31571 [details]
http-replicator ebuild -r1

-r1 - added patch for replicator/emerge partial files fix
Comment 11 Tom P. 2004-05-17 03:05:34 UTC
Created attachment 31572 [details, diff]
replicator partial files fix
Comment 12 Tom P. 2004-05-17 03:16:04 UTC
Created attachment 31573 [details, diff]
config file patch for gentoo: simplify activation

simplify the activation instructions
Comment 13 Tom P. 2004-05-17 03:19:27 UTC
Fixed a couple of minor bugs and simplified the activation changes.  There is more info at http://forums.gentoo.org/viewtopic.php?t=173226
Comment 14 Tom P. 2004-05-17 15:21:57 UTC
The new fix needs more testing, sorry.  Use the original ebuild/patches or the instructions on the forum for now.  The problem isn't replicator; it's just that my patch has side effects.
Comment 15 Nicholas Jones (RETIRED) gentoo-dev 2004-05-18 10:46:07 UTC
Tools-portage might consider picking this up when it's stable.
Otherwise I'd suggest the script-repo once it gets going.
Comment 16 Tom P. 2004-05-23 20:33:04 UTC
Http-Replicator install HOWTO updated to version 1.3

Http-replicator still at 2.0

I will post individual files after I get some feedback on the forum at:
http://forums.gentoo.org/viewtopic.php?t=173226
Comment 17 Bret Towe 2004-05-26 19:41:45 UTC
I've been using this for a while now and it's working great.
The only problem I've seen, aside from resuming,
is that FTP downloads aren't cached, which happens when an ebuild has an FTP SRC_URI in it
and also RESTRICT="nomirror", like alsa-libs and utils do, for example.
Setting ftp_proxy doesn't help matters, since replicator doesn't understand FTP, I guess.
Comment 18 Tom P. 2004-05-27 00:18:31 UTC
Thanks for the report, Bret,

You're right that http-replicator doesn't support FTP.  You can, of course, copy any file to replicator's cache directory and it will be available.  I've even written a tool to automatically import files to the cache, repcacheman.  The latest version is still available on the forums only.

Here is the tricky part: Portage will ignore replicator for these files UNLESS you create a file called /etc/portage/mirrors containing:

# Http-Replicator override for FTP and RESTRICT="nomirror" packages
local http://gentoo.oregonstate.edu/distfiles

With this file, portage WILL check replicator for those RESTRICTED and FTP files!!!

For now, replicator doesn't support resuming.  Resuming has dubious value on the LAN side; it's probably faster just to download the whole package at 11 MB/s.  Resuming on the internet side is in the planning stage.

Comment 19 Tom P. 2004-06-04 20:16:02 UTC
Created attachment 32673 [details]
http-replicator ebuild -r2
Comment 20 Tom P. 2004-06-04 20:17:14 UTC
Created attachment 32674 [details]
config file patch for gentoo
Comment 21 Tom P. 2004-06-04 20:17:53 UTC
Created attachment 32675 [details]
Gentoo specific init script
Comment 22 Tom P. 2004-06-04 20:18:38 UTC
Created attachment 32676 [details]
patch for replicators daemon code to work with my init script
Comment 23 Tom P. 2004-06-04 20:23:22 UTC
Created attachment 32677 [details]
My cache manager script: installs, and maintains cache, imports new files.

Not strictly necessary for replicator to work, but makes it nice for Gentoo.
On a new install this will create the cache directory with proper permissions and
import distfiles into the replicator cache, checking that they are complete and not
corrupt (md5 check).

After install, this script will delete the duplicate files on the server and
import any new files, such as the ones that were downloaded on the server using
ftp.
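
The import step described above can be sketched roughly as follows. This is not the repcacheman script itself, just a minimal Python illustration of the idea: `import_distfiles` and `md5_of` are names made up here, and `known_md5` is a hypothetical name-to-checksum mapping standing in for the digests the real script extracts from the portage tree.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_distfiles(distfiles_dir, cache_dir, known_md5):
    """Move verified files into the cache and report what happened.

    known_md5 maps file name -> expected hex MD5 (hypothetical input; the
    real script reads checksums from the Portage tree).
    """
    summary = {"duplicates": 0, "added": 0, "rejected": 0, "unknown": 0}
    os.makedirs(cache_dir, exist_ok=True)
    for name in sorted(os.listdir(distfiles_dir)):
        source = os.path.join(distfiles_dir, name)
        if not os.path.isfile(source):
            continue
        if name not in known_md5:
            summary["unknown"] += 1          # left in place for the admin
        elif os.path.exists(os.path.join(cache_dir, name)):
            os.remove(source)                # already cached: drop the dupe
            summary["duplicates"] += 1
        elif md5_of(source) != known_md5[name]:
            summary["rejected"] += 1         # corrupt or incomplete
        else:
            shutil.move(source, os.path.join(cache_dir, name))
            summary["added"] += 1
    return summary
```

Files that fail the checksum or are not listed are left where they are, so nothing is deleted unless an identical name already sits in the cache.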
Comment 24 Tom P. 2004-06-04 20:27:01 UTC
This is the latest http-replicator ebuild and files.  Has been working with no major problems for users on the forum for over 1 week.
Comment 25 Daniel Hurt 2004-06-13 23:17:19 UTC
Works seamlessly on my lan with 4 computers.  I am extremely pleased with it.
Comment 26 Tom P. 2004-07-01 00:45:28 UTC
Received reports confirming http-replicator works fine on amd64 - Added amd64 to keywords.
Comment 27 Jose Gonzalez Gomez 2004-07-21 03:02:58 UTC
Tom: epatch is not working right now... there's a missing line if you want this to work properly:

inherit eutils

I have added it locally just before DESCRIPTION and now it's working properly.

Best regards
Comment 28 Jose Gonzalez Gomez 2004-07-21 04:06:30 UTC
I've also noticed that the ebuild doesn't create the proxy user... that leads to an error when executing repcacheman for the first time just after a fresh install. Maybe it would be a good idea to create it, as the default configuration uses that user.

Best regards
Comment 29 Jose Gonzalez Gomez 2004-07-21 04:37:37 UTC
Sorry, I was checking the application without the patches. Once the patches are applied, I see that you use the portage user... forget about the last comment.
Comment 30 Tom P. 2004-07-24 15:06:59 UTC
Jose,

Http-Replicator should be working for you now?  I've 

I guess something changed with epatch as it was working before...  I'll add inherit eutils to the ebuild.

I used the user 'portage' because I assumed it is present on all gentoo systems.  
Comment 31 Jose Gonzalez Gomez 2004-07-26 04:19:27 UTC
Yes, it's working great with the added inherit eutils.

I think using the portage user is a sensible choice. Some gentoo developer should give her opinion on this. Maybe I would change the default port, as 8080 is a port typically used by generic http proxies... what do you think?
Comment 32 Tom P. 2004-07-29 15:55:36 UTC
http-replicator confirmed working on alpha
Comment 33 John Herdy 2004-08-10 07:25:18 UTC
Created attachment 37148 [details]
http-replicator including proxy support

Tarball with all the necessary files to install http-replicator.
Comment 34 John Herdy 2004-08-10 07:28:39 UTC
The previous attachment works perfectly in our environment (x86). Going through the forum I see a lot of people who greatly benefit from this package. What needs to be done to add this to the tree?
Comment 35 Olaf Fichtner 2004-09-01 23:02:05 UTC
Also works very nicely here on x86. We had long been waiting for something like this, and hope it will be officially available soon.

Great job!
Comment 36 Nicholas Jones (RETIRED) gentoo-dev 2004-10-19 21:31:40 UTC
If anyone is interested in binary packages and proxies...

From Bug #61845 Comment #2

Try this patch.
http://zarquon.twobit.net/gentoo/portage/getbinpkg.py.diff
Comment 37 Francois Guimond 2004-10-25 06:46:00 UTC
confirmed working on ppc
Comment 38 Ylosar Goer 2004-10-29 06:07:40 UTC
Hi, I am trying to set up this promising tool, and the initial run of repcacheman fails (see below). Any idea what is wrong here?


Found 16432 ebuilds.

Extracting the checksums....
Missing digest: net-mail/gml-0.5
Done!

Verifying checksum's....
/usr/portage/distfiles/mailx-support-20030215.tar.bz2
Traceback (most recent call last):
  File "/usr/bin/repcacheman", line 204, in ?
    if t[0]:
KeyError: 0
Comment 39 Ylosar Goer 2004-10-29 08:59:27 UTC
I forgot to mention that this exception occurs with http-replicator-2.1_rc3 / Python 2.3.4 / Portage 2.0.51-r2 / x86 profile.

Anyway, I know next to *nothing* about Python coding, but as far as I can tell, the problem is that '0' is not a key in 't'. Replacing t[0] with t['MD5'] (lines 204 and 205) *seems* to solve the problem.

repcacheman final output:

SUMMARY:
Found 0 duplicate file(s).
        Deleted 0 dupe(s).
Found 663 new file(s).
        Added 626 of those file(s) to the cache.
        Rejected 0 corrupt or incomplete file(s).
        37 Unknown file(s) that are not listed in portage
        You may want to delete them yourself....

I may be all wrong here because... how can I be the only one with this problem???
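
For readers following along, the failure mode can be reproduced in a few lines of Python. The sketch below is not taken from repcacheman: the dict shape (beyond the 'MD5' key mentioned above) and the older sequence shape are assumptions for illustration, and `md5_from_digest_entry` is a hypothetical helper.

```python
def md5_from_digest_entry(entry):
    """Return the MD5 string from a digest entry of either shape."""
    if isinstance(entry, dict):
        return entry.get("MD5")          # newer shape: keyed by checksum type
    return entry[0] if entry else None   # assumed older shape: MD5 first


# Assumed newer-style entry; only the 'MD5' key comes from the report above.
new_style = {"MD5": "d41d8cd98f00b204e9800998ecf8427e", "size": 0}

try:
    new_style[0]                         # what the failing line effectively did
except KeyError as error:
    print("KeyError:", error)            # KeyError: 0

print(md5_from_digest_entry(new_style))  # d41d8cd98f00b204e9800998ecf8427e
```

Indexing a dict with 0 looks for a key equal to 0 rather than the first item, which is why the lookup fails once the entry is a mapping.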
Comment 40 Tom P. 2004-10-29 10:00:28 UTC
Yoann,

You're not the only one with this problem.  You're also probably not the only one who hasn't already read the solution at the original Http-Replicator forum thread:
http://forums.gentoo.org/viewtopic.php?t=173226

You probably only want the solution at:
http://forums.gentoo.org/viewtopic.php?t=173226&start=187

But yes, you've already solved the problem :-)
Comment 41 Ylosar Goer 2004-10-29 10:26:48 UTC
Ooops... sorry, and thanks for pointing me to this thread.

So I just tried http://www.updatedlinux.com/replicator/portagefix/repcacheman (v0.31, according to its header) and got:

Extracting the checksums....
Traceback (most recent call last):
  File "/usr/bin/repcacheman", line 167, in ?
    portage_util.writemsg("Missing digest: %s\n" % mycpv)
NameError: name 'portage_util' is not defined

So I switched back to portage.writemsg(), which appears to work fine here.

(Sorry if this is a duplicate. I did not read the whole forum thread (8 pages!).)
Comment 42 Tom P. 2004-10-29 12:29:31 UTC
That was no dup!!

Thanks, I missed that.  That's what I get for blindly copying code...
Comment 43 Tom P. 2004-10-31 07:27:49 UTC
Created attachment 43011 [details]
http-replicator ebuild 2.1
Comment 44 Tom P. 2004-10-31 07:29:50 UTC
Created attachment 43012 [details]
standard init script

Standard start-stop daemon
Comment 45 Tom P. 2004-10-31 07:31:41 UTC
Created attachment 43013 [details]
standard /etc/conf.d/  config file
Comment 46 Tom P. 2004-10-31 07:42:02 UTC
Created attachment 43014 [details]
My cache manager script: installs, and maintains cache, imports new files. Portage  2.0.51
Comment 47 Tom P. 2004-10-31 07:53:03 UTC
The Http-Replicator ebuild has surpassed 15,000 downloads.  Http-Replicator has worked perfectly, saving users and Gentoo infrastructure significant bandwidth while speeding up emerges.  My cache manager script helps new users install and maintain the cache and has been upgraded for portage 2.0.51.
Comment 48 Benjamin Martin 2004-12-16 02:37:48 UTC
Is there any chance that this makes it into portage? It's working like a charm here but I don't like keeping an eye on updates myself.
Comment 49 Maurice van der Pot (RETIRED) gentoo-dev 2005-05-06 02:06:00 UTC
I'm considering picking this one up, because it doesn't seem like it will get
into the tree any time soon if I don't.

Tom, is this the most recent version of your ebuild?
Comment 50 Tom P. 2005-05-08 19:19:25 UTC
Thanks for looking into this!

I kept this bug updated for a while, but I had just about given up hope of getting this into portage.  Too many changes also make this bug look more complex than it really is.  All but one of the 18 attachments are upgrades with new features, but some people may think they are "bugs".  I think having so many attachments on this bug makes http-replicator look unstable when it is actually _very_ stable.

I have been keeping the original forum post up to date and it still has a lively following even today.  The thread has 315 posts and 30634 views.

See http://forums.gentoo.org/viewtopic-t-173226.html for the latest ebuild and HOWTO plus complete instructions.

I will update this bug with the latest ebuild if you think it will help....
Comment 51 Maurice van der Pot (RETIRED) gentoo-dev 2005-05-09 04:27:52 UTC
My 3 week vacation starts this week, but when I get back I will look into this and get back to you. Thanks.
Comment 52 Maurice van der Pot (RETIRED) gentoo-dev 2005-06-01 12:03:56 UTC
Too late, it's mine now ;)
Comment 53 Maurice van der Pot (RETIRED) gentoo-dev 2005-06-02 01:45:50 UTC
http-replicator 3.0 has been added to portage.
Please open new bug reports for any problems you may encounter.