When migrating from python2 to python3 it may be useful for emirrordist to have database dump and restore in order to avoid problems like this:

> # file {deletion-db.bdb,distfile-db.bdb,recycle-db.gdbm}
> deletion-db.bdb: Berkeley DB (Hash, version 9, native byte-order)
> distfile-db.bdb: Berkeley DB (Hash, version 9, native byte-order)
> recycle-db.gdbm: Berkeley DB (Hash, version 9, native byte-order)
> note the recycle-db
> running emirrordist under py3 (it's been in py2 till now), throws this:
>   File "/usr/lib64/python3.6/site-packages/portage/_emirrordist/Config.py", line 64, in __init__
>     options.recycle_db, 'recycle')
>   File "/usr/lib64/python3.6/site-packages/portage/_emirrordist/Config.py", line 121, in _open_shelve
>     db = shelve.open(db_file, flag=open_flag)
>   File "/usr/lib64/python3.6/shelve.py", line 243, in open
>     return DbfilenameShelf(filename, flag, protocol, writeback)
>   File "/usr/lib64/python3.6/shelve.py", line 227, in __init__
>     Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
>   File "/usr/lib64/python3.6/dbm/__init__.py", line 88, in open
>     raise error[0]("db type could not be determined")
> dbm.error: db type could not be determined
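The error in the traceback comes from the standard library's format detection: `dbm.open` calls `dbm.whichdb`, which returns an empty string when a file exists but its format is not recognized. The following is a minimal diagnostic sketch, assuming Python 3 and only the standard library; `describe_db` is a hypothetical helper name, not part of portage or of the attached script.

```python
import dbm


def describe_db(path):
    """Report which dbm backend the standard library would pick for path."""
    kind = dbm.whichdb(path)
    if kind is None:
        # The file does not exist or could not be opened for reading.
        return "missing or unreadable"
    if kind == "":
        # The file exists but no known magic number matched; dbm.open()
        # raises "db type could not be determined" in this case.
        return "unrecognized format"
    # Otherwise a module name such as "dbm.gnu" or "dbm.dumb" is returned.
    return kind
```

Running `describe_db` on each of the three database files under python3 would be one way to confirm, before starting emirrordist, whether they can be opened at all.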
Created attachment 636914
quick and dirty shelve dump/restore script

> $ shelve_utils.py -h
> usage: shelve-utils [-h] {dump,restore,test} ...
>
> positional arguments:
>   {dump,restore,test}  sub-command help
>     dump               dump shelve database
>     restore            restore shelve database
>     test               run unit tests
>
> optional arguments:
>   -h, --help           show this help message and exit
shelve_utils.py dump foo.shelve foo.pickle
shelve_utils.py restore foo.pickle foo.shelve
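For readers without access to the attachment, the dump/restore idea can be sketched as follows. This is not the attached script, only a minimal illustration under stated assumptions: the names `dump_shelve` and `restore_shelve` are hypothetical, each entry is written as one pickled `(key, value)` pair, and pickle protocol 2 is chosen because it is the highest protocol python2 can write and python3 can still read. Differences in string handling between the two interpreters are not addressed here.

```python
import pickle
import shelve
from contextlib import closing


def dump_shelve(shelve_path, pickle_path):
    """Write every (key, value) pair of a shelve database to a pickle stream."""
    # closing() is used because python2 Shelf objects are not context managers.
    with closing(shelve.open(shelve_path, flag="r")) as src:
        with open(pickle_path, "wb") as dest:
            for key in src:
                # One pickle record per entry keeps memory use flat.
                pickle.dump((key, src[key]), dest, protocol=2)


def restore_shelve(pickle_path, shelve_path):
    """Rebuild a shelve database from a stream written by dump_shelve."""
    # flag="n" always creates a new, empty database in the native format
    # of whichever interpreter runs the restore.
    with closing(shelve.open(shelve_path, flag="n")) as dest:
        with open(pickle_path, "rb") as src:
            while True:
                try:
                    key, value = pickle.load(src)
                except EOFError:
                    break  # end of the pickle stream
                dest[key] = value
```

The intended use mirrors the two commands above: run the dump under the interpreter that can still read the old file, then run the restore under the new interpreter so the database is recreated in a format it supports.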
Failure on one of the files:

# for i in *bdb *gdbm ; do echo $i ; python2 /home/robbat2/shelve_utils.py dump ${i} ${i%.*}.pickle ; echo $? ; done ;
deletion-db.bdb
0
distfile-db.bdb
Traceback (most recent call last):
  File "/home/robbat2/shelve_utils.py", line 136, in <module>
    main(argv=sys.argv)
  File "/home/robbat2/shelve_utils.py", line 132, in main
    args.func(args)
  File "/home/robbat2/shelve_utils.py", line 89, in dump
    for item in src.items():
  File "/usr/lib/python-exec/python2.7/../../../lib64/python2.7/UserDict.py", line 144, in iteritems
    yield (k, self[k])
  File "/usr/lib64/python2.7/shelve.py", line 121, in __getitem__
    f = StringIO(self.dict[key])
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in __getitem__
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
  File "/usr/lib64/python2.7/bsddb/dbutils.py", line 68, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in <lambda>
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
KeyError: 'Config-Tiny-2.16.tgz'
1
recycle-db.gdbm
0
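The traceback shows the database listing a key during iteration and then raising `KeyError` when that same key is fetched. One possible way for a dump loop to get past such an entry is sketched below; this is an assumption about a workaround, not a description of the attached script, and `iter_readable_items` is a hypothetical name. Skipping means the affected entry is lost from the dump, so whether that is acceptable depends on the database.

```python
import logging


def iter_readable_items(db):
    """Yield (key, value) pairs, skipping keys whose values cannot be fetched."""
    # Snapshot the keys first so that a failing lookup cannot disturb iteration.
    for key in list(db.keys()):
        try:
            value = db[key]
        except KeyError:
            # The key is listed but its value is not retrievable.
            logging.warning("skipping unreadable key: %r", key)
            continue
        yield key, value
```

A dump routine could iterate over `iter_readable_items(src)` instead of `src.items()` and report the skipped keys at the end.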
The databases are available at: dev.gentoo.org:/home/robbat2/emirrordist-snapshot-20200507T2155/
The two files that do dump also restore fine, but I'm surprised how much smaller the new files are.

# for i in *bdb *gdbm ; do echo $i ; python2 /home/robbat2/shelve_utils.py dump ${i} ${i%.*}.pickle.tmp && mv ${i%.*}.pickle.tmp ${i%.*}.pickle && python3 /home/robbat2/shelve_utils.py restore ${i%.*}.pickle ${i}-py3 ; echo $? ; done ;
deletion-db.bdb
0
distfile-db.bdb
Traceback (most recent call last):
  File "/home/robbat2/shelve_utils.py", line 136, in <module>
    main(argv=sys.argv)
  File "/home/robbat2/shelve_utils.py", line 132, in main
    args.func(args)
  File "/home/robbat2/shelve_utils.py", line 89, in dump
    for item in src.items():
  File "/usr/lib/python-exec/python2.7/../../../lib64/python2.7/UserDict.py", line 144, in iteritems
    yield (k, self[k])
  File "/usr/lib64/python2.7/shelve.py", line 121, in __getitem__
    f = StringIO(self.dict[key])
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in __getitem__
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
  File "/usr/lib64/python2.7/bsddb/dbutils.py", line 68, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "/usr/lib64/python2.7/bsddb/__init__.py", line 270, in <lambda>
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
KeyError: 'Config-Tiny-2.16.tgz'
1
recycle-db.gdbm
0

# ls -la *.{bdb,gdbm,pickle}*
-rw-r--r-- 1 root root  5095424 May  7 21:55 deletion-db.bdb
-rw-r--r-- 1 root root  1937408 May  9 05:09 deletion-db.bdb-py3
-rw-r--r-- 1 root root  1334380 May  9 05:09 deletion-db.pickle
-rw-r--r-- 1 root root 20832256 May  7 21:55 distfile-db.bdb
-rw-r--r-- 1 root root 10246503 May  9 05:09 distfile-db.pickle.tmp
-rw-r--r-- 1 root root  2600960 May  7 21:55 recycle-db.gdbm
-rw-r--r-- 1 root root   851968 May  9 05:09 recycle-db.gdbm-py3
-rw-r--r-- 1 root root   461908 May  9 05:09 recycle-db.pickle
Oh, and to make the python2 dump side work, I had to reinstall portage-2.3.79, a version from before the py27 support was removed, because python3 fails to dump the old files correctly.
Created attachment 636928
quick and dirty shelve dump/restore script

With this version I was able to create dumps which I've uploaded here:

dev.gentoo.org:/home/zmedico/emirrordist-snapshot-20200507T2155/*.pickle

I used dev-python/bsddb3 in order to avoid a "No module named _bsddb" ImportError.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=d121ea57ed5310d84328be27f10a8556b0a7d7ba

commit d121ea57ed5310d84328be27f10a8556b0a7d7ba
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2020-05-08 23:32:49 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2021-02-24 15:27:47 +0000

    Add emirrordist shelve dump/restore (bug 721680)

    Bug: https://bugs.gentoo.org/721680
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 bin/shelve-utils                      | 32 +++++++++++++++++++
 lib/portage/tests/util/test_shelve.py | 60 +++++++++++++++++++++++++++++++++++
 lib/portage/util/shelve.py            | 58 +++++++++++++++++++++++++++++++++
 3 files changed, 150 insertions(+)