Bug 721680 - sys-apps/portage: emirrordist database dump and restore
Summary: sys-apps/portage: emirrordist database dump and restore
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Unclassified
Hardware: All All
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
Depends on: 766117
Blocks:
Reported: 2020-05-08 19:26 UTC by Zac Medico
Modified: 2021-03-31 20:52 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
quick and dirty shelve dump/restore script (shelve_utils.py,3.14 KB, text/x-python)
2020-05-09 00:32 UTC, Zac Medico
quick and dirty shelve dump/restore script (shelve_utils.py,3.45 KB, text/x-python)
2020-05-09 07:24 UTC, Zac Medico

Description Zac Medico gentoo-dev 2020-05-08 19:26:28 UTC
When migrating from Python 2 to Python 3, it may be useful for emirrordist to have database dump and restore support, in order to avoid problems like this:

> # file {deletion-db.bdb,distfile-db.bdb,recycle-db.gdbm}
> deletion-db.bdb: Berkeley DB (Hash, version 9, native byte-order)
> distfile-db.bdb: Berkeley DB (Hash, version 9, native byte-order)
> recycle-db.gdbm: Berkeley DB (Hash, version 9, native byte-order)
> note the recycle-db
> running emirrordist under py3 (it's been in py2 till now), throws this:
>   File "/usr/lib64/python3.6/site-packages/portage/_emirrordist/Config.py", line 64, in __init__
>     options.recycle_db, 'recycle')
>   File "/usr/lib64/python3.6/site-packages/portage/_emirrordist/Config.py", line 121, in _open_shelve
>     db = shelve.open(db_file, flag=open_flag)
>   File "/usr/lib64/python3.6/shelve.py", line 243, in open
>     return DbfilenameShelf(filename, flag, protocol, writeback)
>   File "/usr/lib64/python3.6/shelve.py", line 227, in __init__
>     Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
>   File "/usr/lib64/python3.6/dbm/__init__.py", line 88, in open
>     raise error[0]("db type could not be determined")
> dbm.error: db type could not be determined
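To check ahead of time whether Python 3 will be able to open a given database file, something like the following could be used. This is only a minimal sketch: `describe_db` is a hypothetical helper name, not part of portage, and it relies solely on the standard library's `dbm.whichdb()`, which returns `None` for a missing or unreadable file, an empty string for an unrecognised format, and a module name otherwise.

```python
import dbm


def describe_db(path):
    """Return a short label for what dbm.whichdb() detects at path.

    Hypothetical helper for illustration; not part of portage.
    """
    kind = dbm.whichdb(path)
    if kind is None:
        # The file does not exist or cannot be read.
        return "missing"
    if kind == "":
        # The file exists but its format was not recognised. This is the
        # case in which dbm.open() raises
        # "db type could not be determined".
        return "unknown"
    # Otherwise whichdb() returns a module name such as "dbm.gnu".
    return kind


if __name__ == "__main__":
    for name in ("deletion-db.bdb", "distfile-db.bdb", "recycle-db.gdbm"):
        print(name, describe_db(name))
```

A result of "unknown" for a file would suggest that it needs to be dumped with the interpreter that created it before Python 3 can use it.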
Comment 1 Zac Medico gentoo-dev 2020-05-09 00:32:07 UTC
Created attachment 636914
quick and dirty shelve dump/restore script

> $ shelve_utils.py -h
> usage: shelve-utils [-h] {dump,restore,test} ...
> 
> positional arguments:
>   {dump,restore,test}  sub-command help
>     dump               dump shelve database
>     restore            restore shelve database
>     test               run unit tests
> 
> optional arguments:
>   -h, --help           show this help message and exit
Comment 2 Zac Medico gentoo-dev 2020-05-09 00:46:23 UTC
shelve_utils.py dump foo.shelve foo.pickle
shelve_utils.py restore foo.pickle foo.shelve
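For readers without access to the attachment, the two commands above can be approximated as follows. This is a rough sketch of the general technique (copying a shelve to a stream of pickled key/value pairs and back), not the attached script itself; the function names `dump_shelve` and `restore_shelve` are made up for illustration, and str/bytes differences between Python 2 and Python 3 keys are not handled.

```python
import pickle
import shelve


def dump_shelve(shelve_path, pickle_path):
    """Write each (key, value) pair of a shelve as one pickle record."""
    src = shelve.open(shelve_path, flag="r")
    try:
        with open(pickle_path, "wb") as dest:
            for key in src:
                # Protocol 2 is the newest pickle protocol that
                # Python 2 can also read and write.
                pickle.dump((key, src[key]), dest, protocol=2)
    finally:
        src.close()


def restore_shelve(pickle_path, shelve_path):
    """Rebuild a shelve from the records written by dump_shelve()."""
    dest = shelve.open(shelve_path, flag="c")
    try:
        with open(pickle_path, "rb") as src:
            while True:
                try:
                    key, value = pickle.load(src)
                except EOFError:
                    # End of the pickle stream.
                    break
                dest[key] = value
    finally:
        dest.close()
```

Because the intermediate file is plain pickle data rather than a dbm file, it does not depend on which dbm backend either interpreter was built with.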
Comment 3 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2020-05-09 05:08:35 UTC
Failure on one of the files:

# for i in *bdb *gdbm ; do echo $i ; python2 /home/robbat2/shelve_utils.py dump ${i} ${i%.*}.pickle ; echo $? ; done ;
deletion-db.bdb
0
distfile-db.bdb
Traceback (most recent call last):
  File "/home/robbat2/shelve_utils.py", line 136, in <module>
    main(argv=sys.argv)
  File "/home/robbat2/shelve_utils.py", line 132, in main
    args.func(args)
  File "/home/robbat2/shelve_utils.py", line 89, in dump
    for item in src.items():
  File "/usr/lib/python-exec/python2.7/../../../lib64/python2.7/UserDict.py", line 144, in iteritems
    yield (k, self[k])
  File "/usr/lib64/python2.7/shelve.py", line 121, in __getitem__
    f = StringIO(self.dict[key])
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in __getitem__
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
  File "/usr/lib64/python2.7/bsddb/dbutils.py", line 68, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in <lambda>
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
KeyError: 'Config-Tiny-2.16.tgz'
1
recycle-db.gdbm
0
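The traceback above shows a key that the database lists during iteration but then fails to return on lookup. One possible way for a dump to get past such an entry is to skip it and log it, as in the sketch below. `iter_readable_items` is a hypothetical name, and whether skipping is acceptable for the distfile database is an assumption, not something established in this report.

```python
import logging


def iter_readable_items(db):
    """Yield (key, value) pairs, skipping keys whose lookup fails.

    Hypothetical helper for illustration; db is any mapping-like
    object, such as an open shelve.
    """
    for key in list(db.keys()):
        try:
            value = db[key]
        except KeyError:
            # The key is listed by the database, but its record
            # cannot be fetched, so report it and move on.
            logging.warning("skipping unreadable key: %r", key)
            continue
        yield key, value
```

A dump loop could then iterate over `iter_readable_items(src)` instead of `src.items()`, at the cost of silently losing the unreadable records from the dump.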
Comment 4 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2020-05-09 05:11:39 UTC
The databases are available at:
dev.gentoo.org:/home/robbat2/emirrordist-snapshot-20200507T2155/
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2020-05-09 05:12:56 UTC
The two files that do dump also restore fine, but I'm surprised how much smaller the new files are.

# for i in *bdb *gdbm ; do echo $i ; python2 /home/robbat2/shelve_utils.py dump ${i} ${i%.*}.pickle.tmp && mv ${i%.*}.pickle.tmp ${i%.*}.pickle && python3 /home/robbat2/shelve_utils.py restore  ${i%.*}.pickle ${i}-py3 ; echo $? ; done ;
deletion-db.bdb
0
distfile-db.bdb
Traceback (most recent call last):
  File "/home/robbat2/shelve_utils.py", line 136, in <module>
    main(argv=sys.argv)
  File "/home/robbat2/shelve_utils.py", line 132, in main
    args.func(args)
  File "/home/robbat2/shelve_utils.py", line 89, in dump
    for item in src.items():
  File "/usr/lib/python-exec/python2.7/../../../lib64/python2.7/UserDict.py", line 144, in iteritems
    yield (k, self[k])
  File "/usr/lib64/python2.7/shelve.py", line 121, in __getitem__
    f = StringIO(self.dict[key])
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in __getitem__
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
  File "/usr/lib64/python2.7/bsddb/dbutils.py", line 68, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in <lambda>
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
KeyError: 'Config-Tiny-2.16.tgz'
1
recycle-db.gdbm
0


# ls -la *.{bdb,gdbm,pickle}*
-rw-r--r-- 1 root root  5095424 May  7 21:55 deletion-db.bdb
-rw-r--r-- 1 root root  1937408 May  9 05:09 deletion-db.bdb-py3
-rw-r--r-- 1 root root  1334380 May  9 05:09 deletion-db.pickle
-rw-r--r-- 1 root root 20832256 May  7 21:55 distfile-db.bdb
-rw-r--r-- 1 root root 10246503 May  9 05:09 distfile-db.pickle.tmp
-rw-r--r-- 1 root root  2600960 May  7 21:55 recycle-db.gdbm
-rw-r--r-- 1 root root   851968 May  9 05:09 recycle-db.gdbm-py3
-rw-r--r-- 1 root root   461908 May  9 05:09 recycle-db.pickle
Comment 6 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2020-05-09 05:13:44 UTC
Oh, and to make the python2 dump side work, I had to reinstall portage-2.3.79 before the py27 stuff was removed, because python3 fails to dump correctly on the old files.
Comment 7 Zac Medico gentoo-dev 2020-05-09 07:24:38 UTC
Created attachment 636928
quick and dirty shelve dump/restore script

With this version I was able to create dumps which I've uploaded here:

dev.gentoo.org:/home/zmedico/emirrordist-snapshot-20200507T2155/*.pickle

I used dev-python/bsddb3 in order to avoid a "No module named _bsddb" ImportError.
Comment 8 Larry the Git Cow gentoo-dev 2021-02-24 15:28:51 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=d121ea57ed5310d84328be27f10a8556b0a7d7ba

commit d121ea57ed5310d84328be27f10a8556b0a7d7ba
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2020-05-08 23:32:49 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2021-02-24 15:27:47 +0000

    Add emirrordist shelve dump/restore (bug 721680)
    
    Bug: https://bugs.gentoo.org/721680
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 bin/shelve-utils                      | 32 +++++++++++++++++++
 lib/portage/tests/util/test_shelve.py | 60 +++++++++++++++++++++++++++++++++++
 lib/portage/util/shelve.py            | 58 +++++++++++++++++++++++++++++++++
 3 files changed, 150 insertions(+)