Bug 629048 - If Gentoo is offline, portage could store in a file all the SRC_URI of the packages to be merged, in order to get them in another machine
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Enhancement/Feature Requests
Hardware: All Linux
Importance: Normal enhancement
Assignee: Portage team
URL: https://pypi.org/project/portage/
Whiteboard:
Keywords:
Depends on:
Blocks: 377365 world-domination
Reported: 2017-08-27 11:09 UTC by Petross404(Petros S)
Modified: 2021-06-26 19:18 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Links to Firefox files.txt (links.txt,6.45 KB, text/plain)
2017-09-02 23:15 UTC, Agustin Ferrari
Description Petross404(Petros S) 2017-08-27 11:09:32 UTC
I think a feature that gathers all the download links for the tarballs into a file would be useful: the user could feed that file to wget on another machine and then move the tarballs to the Gentoo machine so that emerge can continue.

Let's say that my machine is offline and I want to re-emerge @system, but some tarballs were deleted by a previous run of eclean. Portage could then calculate the dependencies, build the dependency tree, and append all the download links to a text file. I would then move the text file to another machine and, after downloading all the tarballs, move them back to my DISTDIR.



Reproducible: Always

Steps to Reproduce:
1. emerge -options --fetch-links package
2. Copy the text file to a thumb drive and feed it to wget on another machine
3. Move all the tarballs from the thumb drive into DISTDIR and continue the emerge



This way, if I want to install a package or a group of packages such as KDE while my Gentoo machine is offline, I can easily gather all the tarballs instead of downloading each one manually.
Comment 1 Agustin Ferrari 2017-09-02 23:15:34 UTC
Created attachment 491960 [details]
Links to Firefox files.txt

Do you mean something like this?:
emerge -p -f program | rev | cut -d" " -f2 | rev | grep "://" > links.txt
In the attached file you can see what the command returns if I wanted to install Firefox.
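
The pipeline above keeps the second-to-last space-delimited field of every line and then filters for anything containing "://". As a rough sketch only, the same filtering could be written in Python. The function name extract_links is made up for illustration, and the input is assumed to be the text printed by emerge -p -f.

```python
def extract_links(pretend_output: str) -> list[str]:
    """Mimic: rev | cut -d" " -f2 | rev | grep "://"."""
    links = []
    for line in pretend_output.splitlines():
        fields = line.split(" ")
        # cut prints the whole line when it contains no delimiter at all.
        candidate = fields[-2] if len(fields) > 1 else line
        # grep "://" keeps only fields that look like URLs.
        if "://" in candidate:
            links.append(candidate)
    return links
```

Like the shell pipeline, this keeps at most one URL per line, so alternative locations listed earlier on the same line are dropped.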
Comment 2 Petross404(Petros S) 2017-12-27 11:58:32 UTC
(In reply to Agustin Ferrari from comment #1)
> Created attachment 491960 [details]
> Links to Firefox files.txt
> 
> Do you mean something like this?:
> emerge -p -f program | rev | cut -d" " -f2 | rev | grep "://" > links.txt
> In the attached file you can see what the command returns if I wanted to
> install Firefox.

I mean exactly this. So: 

# emerge -p -f glibc | rev | cut -d" " -f2 | rev | grep "://" > links.txt
# wget `cat links.txt `

--2017-12-27 13:55:32--  http://ftpmirror.gnu.org/glibc/glibc-2.26.tar.xz
Resolving ftpmirror.gnu.org (ftpmirror.gnu.org)... 208.118.235.200
Connecting to ftpmirror.gnu.org (ftpmirror.gnu.org)|208.118.235.200|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://ftp.sotirov-bg.net/pub/mirrors/gnu/glibc/glibc-2.26.tar.xz [following]
--2017-12-27 13:55:33--  http://ftp.sotirov-bg.net/pub/mirrors/gnu/glibc/glibc-2.26.tar.xz
Resolving ftp.sotirov-bg.net (ftp.sotirov-bg.net)... 46.10.210.166
Connecting to ftp.sotirov-bg.net (ftp.sotirov-bg.net)|46.10.210.166|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14682748 (14M) [application/x-xz]
Saving to: ‘glibc-2.26.tar.xz’

glibc-2.26.tar.xz                                       8%[=========>                                                                                                             ]   1,24M   395KB/s    eta 33s    ^C

Can this be integrated into Portage so that the average user (or a newcomer) doesn't have to remember complicated commands like this?
Comment 3 Zac Medico gentoo-dev 2017-12-28 05:27:42 UTC
It requires support for SRC_URI arrows that are used to save the file under a different name.

For the file format, I suppose we could use a json file that maps each file name to a list of locations that the file can be fetched from.

The fetcher program could be installable via PyPI, so that people can use it on any operating system that has Python.
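
To illustrate the idea, here is a minimal sketch of turning a SRC_URI string into a JSON mapping of file name to locations, including the "uri -> name" arrow form. The function names and the JSON layout are assumptions made for this example, not an existing Portage API, and USE-conditional groups are deliberately not handled.

```python
import json
from collections import defaultdict


def src_uri_to_map(src_uri: str) -> dict[str, list[str]]:
    """Map each distfile name to the URIs it can be fetched from.

    Simplified: USE-conditional groups such as "flag? ( ... )" are not handled.
    """
    tokens = src_uri.split()
    locations = defaultdict(list)
    i = 0
    while i < len(tokens):
        uri = tokens[i]
        if i + 2 < len(tokens) and tokens[i + 1] == "->":
            # "uri -> name": the download is saved under a different name.
            name = tokens[i + 2]
            i += 3
        else:
            # Without an arrow, the name is the last path component.
            name = uri.rsplit("/", 1)[-1]
            i += 1
        locations[name].append(uri)
    return dict(locations)


def src_uri_to_json(src_uri: str) -> str:
    """Serialize the mapping in a hypothetical JSON layout."""
    return json.dumps(src_uri_to_map(src_uri), indent=2, sort_keys=True)
```

A file written this way could be carried to the online machine, where each key names the file to create and each list gives the places to try.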
Comment 4 Zac Medico gentoo-dev 2017-12-28 05:28:56 UTC
We could also include digests in the file, for purposes of verifying the downloaded files.
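
As a sketch of what verification against such digests might look like, the following uses only Python's hashlib. The function names and the shape of the expected mapping (hashlib algorithm name to hex digest) are assumptions for this example; which algorithms the file would actually carry is not decided in this bug.

```python
import hashlib


def file_digests(path, algorithms):
    """Compute hex digests of a file for the given hashlib algorithm names."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify_distfile(path, expected):
    """Return True only if every expected digest matches the file."""
    if not expected:
        # Nothing to check against, so the file cannot be called verified.
        return False
    actual = file_digests(path, tuple(expected))
    return all(actual[name] == value.lower() for name, value in expected.items())
```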
Comment 5 Zac Medico gentoo-dev 2020-05-03 22:28:01 UTC
I'm planning to create a tool named edistfetch that will have the ability to import and export SRC_URI data. This tool can also be used to implement a caching proxy that's able to fetch arbitrary files that might not be mirrored (yet).

edistfetch features that are planned:

* Ability to run on a non-gentoo system, as a means to fetch files for an offline gentoo system or to implement a caching proxy.

* A key-value store which maps a distfile name to essential metadata like SRC_URI and file digests.

* Scraping, import, and export of SRC_URI/digest data. A client of a caching proxy should be able to scrape its locally available SRC_URI/digest data and export it in a form that the caching proxy can merge into its own key-value store.

* Changes to file digests in the key-value store should result in persistent log entries, creating records of any conflicts that occur.

* Options to verify file digests, and archive or delete files with non-matching digests.
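
The key-value store and conflict logging described in the list above could be sketched roughly as follows. This is not edistfetch code: the class name DistfileStore, the in-memory dict, and the log record layout are all assumptions chosen to keep the example small.

```python
import json
import time


class DistfileStore:
    """Hypothetical store mapping a distfile name to SRC_URI and digests."""

    def __init__(self, log_path):
        self.entries = {}
        self.log_path = log_path

    def merge(self, name, src_uri, digests):
        """Merge one record; return True if a digest conflict was logged."""
        old = self.entries.get(name)
        conflict = old is not None and old["digests"] != digests
        if conflict:
            self._log_conflict(name, old["digests"], digests)
        known = set(old["src_uri"]) if old else set()
        self.entries[name] = {
            "src_uri": sorted(known | set(src_uri)),
            "digests": dict(digests),
        }
        return conflict

    def _log_conflict(self, name, old_digests, new_digests):
        record = {"time": time.time(), "name": name,
                  "old": old_digests, "new": new_digests}
        # Append-only file, so records of conflicts persist across runs.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record, sort_keys=True) + "\n")
```

In this sketch a changed digest replaces the old one after being logged; a real tool might instead refuse the change or keep both.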
Comment 6 Zac Medico gentoo-dev 2021-02-28 18:24:14 UTC
(In reply to Zac Medico from comment #5)
> I'm planning to create a tool named edistfetch that will have the ability to
> import and export SRC_URI data. This tool can also be used to implement a
> caching proxy that's able to fetch arbitrary files that might not be
> mirrored (yet).
> 
> edistfetch features that are planned:
> 
> * Ability to run on a non-gentoo system, as a means to fetch files for an
> offline gentoo system or to implement a caching proxy.

It's possible that pip could be a good way for some people to install portage on a machine with internet access, so I'm planning to publish portage on pypi:

https://wiki.gentoo.org/wiki/Project:Portage/Releases-On-PyPi

> * A key-value store which maps a distfile name to essential metadata like
> SRC_URI and file digests.
>
> * Scraping, import, and export of SRC_URI/digest data. A client of a
> caching proxy should be able to scrape its locally available SRC_URI/digest
> data and export it in a form that the caching proxy can merge into its own
> key-value store.

For normal sized fetch lists, a stream of newline delimited JSON will work. There's no need for random access. A stream like this can be dumped from or loaded into a key-value store, using a tool like the one from bug 721680.
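
A minimal sketch of such a stream, assuming one JSON object per line with a "name" field plus whatever metadata the store holds. The field names and the plain dict standing in for the key-value store are illustrative, not a defined schema.

```python
import json


def dump_stream(store, out):
    """Write one JSON object per line for every entry in the store."""
    for name in sorted(store):
        record = {"name": name, **store[name]}
        out.write(json.dumps(record, sort_keys=True) + "\n")


def load_stream(lines, store):
    """Read newline-delimited JSON sequentially into the store."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        name = record.pop("name")
        store[name] = record
    return store
```

Because each line is parsed on its own, the loader never needs the whole list in memory, which matches the point that random access is unnecessary.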
Comment 7 Zac Medico gentoo-dev 2021-04-18 00:54:13 UTC
Portage is now available via pip install from PyPI:

    https://pypi.org/project/portage/


It doesn't have a specialized fetch tool yet, but it is possible to call emerge --fetch. When installed into a venv, portage is pre-configured with an EPREFIX offset which can be queried with portageq. For the default configuration, all paths are the same as for a normal Gentoo system, except that they are relative to ${EPREFIX}.

This is an example of how to create a venv, install portage in it, query the repository configuration, and run the fetch unit tests. There's no profile configured at ${EPREFIX}/etc/portage/make.profile, so users need to create one before they perform any operations that require a profile.

>  ~ $ python -m venv ~/portage-venv
>  ~ $ . ~/portage-venv/bin/activate
> (portage-venv)  ~ $ pip install portage
> Successfully installed portage-3.0.18
> (portage-venv)  ~ $ type emerge
> emerge is hashed (/home/zmedico/portage-venv/bin/emerge)
> (portage-venv)  ~ $ type portageq
> portageq is hashed (/home/zmedico/portage-venv/bin/portageq)
> (portage-venv)  ~ $ portageq envvar EPREFIX
> /home/zmedico/portage-venv/lib/python3.8/site-packages
> (portage-venv)  ~ $ portageq envvar DISTDIR 2>/dev/null
> /home/zmedico/portage-venv/lib/python3.8/site-packages/var/cache/distfiles
> (portage-venv)  ~/portage-venv $ portageq envvar PORTAGE_REPOSITORIES 2>/dev/null
> [DEFAULT]
> auto-sync = yes
> main-repo = gentoo
> strict-misc-digests = true
> sync-allow-hardlinks = true
> sync-openpgp-key-refresh = true
> sync-rcu = false
> 
> [gentoo]
> auto-sync = yes
> location = /home/zmedico/portage-venv/lib/python3.8/site-packages/var/db/repos/gentoo
> masters = 
> priority = -1000
> strict-misc-digests = true
> sync-allow-hardlinks = true
> sync-openpgp-key-path = /home/zmedico/portage-venv/lib/python3.8/site-packages/usr/share/openpgp-keys/gentoo-release.asc
> sync-openpgp-key-refresh = true
> sync-openpgp-key-refresh-retry-count = 40
> sync-openpgp-key-refresh-retry-delay-exp-base = 2
> sync-openpgp-key-refresh-retry-delay-max = 60
> sync-openpgp-key-refresh-retry-delay-mult = 4
> sync-openpgp-key-refresh-retry-overall-timeout = 1200
> sync-openpgp-keyserver = hkps://keys.gentoo.org
> sync-rcu = false
> sync-type = rsync
> sync-uri = rsync://rsync.gentoo.org/gentoo-portage
> sync-rsync-verify-jobs = 1
> sync-rsync-verify-max-age = 24
> sync-rsync-extra-opts = 
> sync-rsync-verify-metamanifest = yes
> 
> (portage-venv)  ~ $ python ~/portage-venv/lib/python3.8/site-packages/portage/tests/{runTests.py,ebuild/test_fetch.py}
> testEbuildFetch (portage.tests.ebuild.test_fetch.EbuildFetchTestCase) ... ok
> test_content_hash_layout (portage.tests.ebuild.test_fetch.EbuildFetchTestCase) ... ok
> test_filename_hash_layout (portage.tests.ebuild.test_fetch.EbuildFetchTestCase) ... ok
> test_filename_hash_layout_get_filenames (portage.tests.ebuild.test_fetch.EbuildFetchTestCase) ... ok
> test_flat_layout (portage.tests.ebuild.test_fetch.EbuildFetchTestCase) ... ok
> test_mirror_layout_config (portage.tests.ebuild.test_fetch.EbuildFetchTestCase) ... ok
> 
> ----------------------------------------------------------------------
> Ran 6 tests in 3.691s
> 
> OK
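
In the transcript above, DISTDIR is simply the usual /var/cache/distfiles placed under the EPREFIX reported by portageq. A small sketch of that relationship, with a helper name (with_eprefix) invented for this example:

```python
def with_eprefix(eprefix: str, path: str) -> str:
    """Place a normal Gentoo path under the given EPREFIX offset."""
    # An empty EPREFIX leaves the path as it is on a normal Gentoo system.
    return eprefix.rstrip("/") + "/" + path.lstrip("/")
```

Called with the EPREFIX shown above and "/var/cache/distfiles", it yields the DISTDIR value from the transcript. In practice, querying portageq as shown is the reliable way to get these values; the helper only illustrates how they relate.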
Comment 8 Zac Medico gentoo-dev 2021-06-06 18:03:23 UTC
(In reply to Zac Medico from bug 794487 comment #1)
> We'll need some helper command(s) for working with the mirror layout. For
> example, we need a way to locate the distfiles associated with a particular
> ebuild, and those distfiles can originate from multiple local mirrors with
> different layouts. I imagine something like the portageq metadata command
> that outputs a mapping of distfiles to paths will work. A versatile format
> for the output is newline delimited json, and in addition we can support
> alternative form(s) of columnar output if there's demand.

If we have a portageq command that outputs distfiles metadata with a certain schema, then this schema can possibly also be used as an interchange format for edistfetch.

(In reply to Zac Medico from comment #5)
> * A key-value store which maps a distfile name to essential metadata like
> SRC_URI and file digests.

We need to consider that the content-hash distfiles layout makes it possible for a particular distfile name to have different identities for different ebuilds. For example, a third-party repository can have a different distfile with a different digest but the same distfile name.
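
One way to account for this, sketched here under assumptions, is to treat the pair of distfile name and digest as the identity instead of the name alone. The class DistfileIndex and its methods are hypothetical and only show the data structure.

```python
from collections import defaultdict


class DistfileIndex:
    """Hypothetical index where identity is (name, digest), not name alone."""

    def __init__(self):
        # name -> digest -> metadata for that particular identity
        self._by_name = defaultdict(dict)

    def add(self, name, digest, repo, src_uri):
        identity = self._by_name[name].setdefault(
            digest, {"repos": set(), "src_uri": set()}
        )
        identity["repos"].add(repo)
        identity["src_uri"].update(src_uri)

    def identities(self, name):
        """Return all known digests for a name with their metadata."""
        return dict(self._by_name.get(name, {}))

    def is_ambiguous(self, name):
        """True when the same name refers to more than one distinct file."""
        return len(self._by_name.get(name, {})) > 1
```

With this shape, a lookup by name alone can report ambiguity, and a fetcher can ask for the digest that the requesting ebuild's repository expects.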