|Summary:||Portage module to use DJB's cdb (constant database) for performance reasons, in ebuild format|
|Product:||Gentoo Linux||Reporter:||Matan Peled <chaosite>|
|Component:||New packages||Assignee:||Portage team <dev-portage>|
|Severity:||enhancement||CC:||abraham, hoffbrinkle, leho, mstearn, pacho, wschlich|
|Package list:||Runtime testing required:||---|
Description Matan Peled 2005-02-26 06:16:18 UTC
Reading through the forum thread, I saw reports of major improvements in the time taken by metadata generation, searching, etc. In a chat with carpaski on #gentoo, he explained why the module could not be made the default (external runtime dependencies subject to a segfault), but said it might be included. So I made a (hackish, I'm new at this) ebuild out of it. The dev-db/cdb ebuild is keyworded for x86, alpha, and ~amd64, but the python-cdb ebuild is only keyworded for x86 and ~amd64. So either the ebuild(s?) need to be fixed, or this only applies to x86/amd64. Reproducible: Always
Comment 1 Matan Peled 2005-02-26 06:17:24 UTC
Created attachment 52185 [details] app-portage/portage-cdb
Comment 3 Brian Harring (RETIRED) 2005-02-27 19:30:21 UTC
Make database.sync actually do something, rather than silently acting as if it did what was requested, please :)

Re: database instance caching, I don't like the approach offhand. Binding within the class rather than within the module namespace is preferable imo, but that's just my opinion. Beyond that, why cache category db instances? That's not the slowdown (exempting repoman craziness, the caching of new instances isn't required).

Aside from that, self.modified shouldn't be set to False until after the data has actually been sync'd to disk. If an exception bails out afterwards, the module is now in an invalid state.

iirc, cPickle.HIGHEST_PROTOCOL has an issue between Python 2.2 and 2.3: the highest protocol was upped for 2.3, leading to incompatible pickle'd data. Something to note...

Meanwhile, marking it as LATER. Please also take a look at http://dev.gentoo.org/~ferringb/cache/. I'm intending on moving the cache db classes from being category specific to being repository specific; template.py and fs_template.py should be usable for your CDB backend. If the framework above isn't usable enough, or you have suggestions, please give a yell (preferably on the bug).
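The two points above about self.modified and the pickle protocol can be illustrated with a rough sketch. This is not the attached module's code: `CacheEntryStore`, `PINNED_PROTOCOL`, and the file layout are hypothetical names chosen for illustration, and a plain pickled dict stands in for the real cdb backend. It is written for a current Python 3 rather than the Python 2 of the time.

```python
import pickle

# Assumption: pin an explicit protocol number instead of HIGHEST_PROTOCOL,
# so data written by a newer interpreter stays readable by an older one.
PINNED_PROTOCOL = 2


class CacheEntryStore:
    """Hypothetical stand-in for a cache backend; not the attached module."""

    def __init__(self, path):
        self.path = path
        self.entries = {}
        self.modified = False

    def __setitem__(self, key, value):
        self.entries[key] = value
        self.modified = True

    def sync(self):
        """Write pending entries to disk; return True if a write happened."""
        if not self.modified:
            return False
        with open(self.path, "wb") as handle:
            pickle.dump(self.entries, handle, PINNED_PROTOCOL)
        # Clear the flag only after the write has completed. If the write
        # raises, the flag stays True and a later sync() will retry.
        self.modified = False
        return True
```

With this ordering, a failed write leaves the object marked as modified, so the caller can see that the in-memory state has not reached disk yet.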
Comment 4 Brian Harring (RETIRED) 2005-02-27 20:45:58 UTC
Addendum: I'm not much for setting /etc/modules by default... We also lack any form of policy on that, so suggestions are welcome.
Comment 5 Brian Harring (RETIRED) 2005-02-27 23:21:12 UTC
*** Bug 26447 has been marked as a duplicate of this bug. ***
Comment 6 Tobias Bell 2005-05-10 23:58:07 UTC
The problem with database.sync is that it's called after every key-value insert by emerge --sync or --metadata. Doing a real sync each time would make cdb unbelievably slow, because normally it's a constant database. And there is no problem with setting self.modified to True, because you can't corrupt the database. The realSync method creates a new database and renames it over the old database. This is a rather atomic operation: it either works or it doesn't, but there is no corruption. I think the whole portage db caching needs a new design. My module is this hackish only to gain a bit of performance.
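The write-new-then-rename approach described above can be sketched roughly as follows. This is an illustration only, not the module's realSync: `atomic_write` is a hypothetical helper, the payload is arbitrary bytes rather than a cdb file, and it assumes a POSIX filesystem where a rename within one directory is atomic.

```python
import os
import tempfile


def atomic_write(path, payload):
    """Replace the file at path with payload (bytes) via a temporary file.

    Hypothetical helper: readers see either the old file or the new one,
    never a partially written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure the old file is untouched; drop the temporary one.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the whole database is rewritten on each call, this only pays off when many inserts are batched into one rewrite, which matches the concern about syncing after every key-value insert.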
Comment 7 Marius Mauch (RETIRED) 2007-01-11 14:41:41 UTC
Closing due to old age (module doesn't work with portage-2.1 anyway).