Summary: | anydbm requires external locking | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Jason Wever (RETIRED) <weeve> |
Component: | [OLD] Core system | Assignee: | Portage team <dev-portage> |
Status: | RESOLVED FIXED | ||
Severity: | normal | Keywords: | InVCS |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | Sparc | ||
OS: | All | ||
See Also: | https://bugs.gentoo.org/show_bug.cgi?id=736473 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | |||
Bug Blocks: | 349307 |
Description
Jason Wever (RETIRED)
Running multiple emerges by chance?

Err, bah, pardon; pays to read the summary carefully :) Need try/except locks in portage_db_anydbm.database for when anydbm throws the exception. Lockfiles, possibly? Would be nice if the actual dbm instance would just block till it has access...

Jason, line 22 of portage_db_anydbm.py (roughly, dependent on version), if you could chuck this in:

    -except:
    +except Exception, e:
    +print "exception caught e(%s)" % str(e)

(obviously adjust whitespace)

Looks like the initial open fails, although I'd be curious what the specific exception is. After the open fails, portage tries to create a new db, resulting in gdbm throwing a GDBM_READER_CANT_DELETE (11) error. The initial open could be failing due to access denied, or some other crazy crap. Adding the print for the exception will help to identify the issue.

Will do once this emerge finishes (it's still going). I've noticed in the past that when I run into this, it is typically during an emerge -uvD world that has package updates that will take some time (like KDE). When the emerge first starts up, I can still use repoman, but after roughly an hour or two I cannot until the emerge finishes. Hopefully that will be of some help.

Here are the results after changing /usr/lib/portage/pym/portage_db_anydbm.py.

    excelsior webmin # repoman
    RepoMan scours the neighborhood...
    exception caught e((11, 'Resource temporarily unavailable'))
    Traceback (most recent call last):
      File "/usr/bin/repoman", line 667, in ?
        myaux=portage.db["/"]["porttree"].dbapi.aux_get(catdir+"/"+y,allvars,strict=1)
      File "/usr/lib/portage/pym/portage.py", line 4496, in aux_get
        self.auxdb[cat] = self.auxdbmodule(self.cachedir,cat,auxdbkeys,uid,portage_gid)
      File "/usr/lib/portage/pym/portage_db_anydbm.py", line 23, in __init__
        self.db = anydbm.open(self.filename, "n", 0664)
      File "/usr/lib/python2.3/anydbm.py", line 83, in open
        return mod.open(file, flag, mode)
    gdbm.error: (11, 'Resource temporarily unavailable')

Jason, haven't forgotten about this bug, although addressing it is going to be a bit of a pita. Long story short, portage_db_anydbm is rather flawed: anydbm uses either bsddb or gdbm.

gdbm flat out denies multiple writers from working on a file. Good behaviour, annoying for our needs though (this is what you're hitting, btw).

bsddb, on the other hand, is dumb. Really, *really* dumb. Multiple writers can work on a file; the problem is there isn't any coordination, so pretty much the last writer to close the file is the one that gets its changes into the file. This of course is assuming that the file hasn't been corrupted beyond belief. Changes made by writer A do not show up in reader B unless A syncs and reader B closes and reopens the db in question.

Pretty much, we ought to either A) default to bsddbm, and use locks to signal when readers are out of sync w/ what's on disk (and must close/reopen), or B) add a use flag, dbm, and have it pull in dev-python/bsddb3, a reworked bsddb module that allows for concurrent access sanely.

Gah... s/bsddbm/bsddb/ for above.

To verify what I'm saying, for gdbm:

    echo "import gdbm; db=gdbm.open('dar','n',0664);db2=gdbm.open('dar','r',0664);" | python

implodes when attempting to open a reader. You *cannot* open a writer when readers have the db open, and vice versa: no opening readers when a writer is fooling with it.
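The error code 11 in the traceback is EAGAIN, so the "block till it has access" idea mentioned earlier could be approximated by retrying the open. What follows is only a minimal sketch in Python 3 (the thread itself predates it), and `open_when_free` with its `attempts` and `delay` parameters is a hypothetical helper, not anything in Portage. It relies on `dbm.gnu.error` being a subclass of `OSError` in CPython 3.

```python
import errno
import time


def open_when_free(opener, attempts=5, delay=0.5):
    """Call opener() until it stops failing with EAGAIN (errno 11).

    Hypothetical helper, not part of Portage. `opener` is any zero-argument
    callable that opens the database, for example
    lambda: dbm.open("/path/to/cache", "c", 0o664).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return opener()
        except OSError as e:
            # Only "Resource temporarily unavailable" is worth retrying;
            # any other error, or the final failed attempt, is re-raised.
            if e.errno != errno.EAGAIN or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

Retrying only hides the contention. A long-running emerge that keeps the db open for hours would still starve repoman, which is why the thread goes on to look at locking and at the bsddb alternative.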
For bsddb, just pop open two python instances, and run this in both:

    import anydbm
    db=anydbm.open("dar.bsddb","c",0644)

Now start assigning records into the db (calling db.sync() as you please), and check the available keys via db.keys(). You'll note they differ. Annoying, but can be dealt with. Sync the dbs (noting the order they've been synced in), and close them. Reopen the db, and note they weren't merged; one writer overwrote the other. If you're lucky, you'll get to see it corrupt the db, which is *really* fun :)

*** Bug 54280 has been marked as a duplicate of this bug. ***

Can someone confirm that this is a long-since-dead bug? (Last modified: 2005-10-07)

The gdbm module still raises 'Resource temporarily unavailable' if the db is already opened for write by another process. Python's gdbm module docs show a 'u' flag that can be used to open the database without locking, which might work for our purposes. It seems that bsddb is deprecated in python-2.x and removed in python-3.x, so that's not worth supporting.

It's fixed here to use the "u" flag with gdbm, for concurrent writers, and it always creates gdbm type databases when python is built with gdbm support: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=460fa368db599a21769e1be267d19cd3a5bd9572

This is fixed in 2.1.9.27.
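For illustration, the approach described in the last comments (prefer gdbm when Python is built with it, and open with the "u" flag) might look roughly like the Python 3 sketch below. This is not the code from the linked commit. `open_cache` is a hypothetical name, and the fallback to the generic `dbm.open` is an assumption made so the sketch also runs where gdbm is missing.

```python
import dbm

try:
    import dbm.gnu as gdbm  # only importable when Python is built with gdbm
except ImportError:
    gdbm = None


def open_cache(path, mode=0o664):
    """Open (or create) a cache db, unlocked when gdbm supports it.

    Hypothetical helper for illustration; not the actual Portage code.
    """
    if gdbm is not None:
        # "c" creates the db if it does not exist; "u" asks gdbm not to
        # lock it. Not every gdbm build supports every flag, so consult
        # the module's open_flags string before using "u".
        flags = "cu" if "u" in gdbm.open_flags else "c"
        return gdbm.open(path, flags, mode)
    # Assumed fallback: let the dbm package pick whatever backend exists.
    return dbm.open(path, "c", mode)
```

Opening without locking removes the 'Resource temporarily unavailable' failure, but gdbm then no longer arbitrates between writers, so it is up to the caller to tolerate or coordinate concurrent updates.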