I found that the portage tree most of the time (always?) has a lot of wrong MD5 checksums in its Manifest files. Sometimes files are not even listed there, or listed files do not exist. I've written a small shell script which checks the whole portage tree and reports any errors it finds. It is still under development; for instance, I plan to add an automatic report by mail to the maintainer given in metadata.xml, if one exists. For now I just want to know whether somebody is already writing such a beast and, if not, whether you have any wishes for it and, of course, whether you find it helpful. In the future I will surely rewrite this in another language, because right now a complete run takes about an hour on my system. In the following attachments you will find the current shell script and the output it gives on the current portage tree.
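For readers without the attachments, the three checks described above (bad checksum, listed but missing, present but not listed) can be sketched in Python roughly as follows. This is not the attached shell script, only an illustration. It assumes Manifest lines of the form `MD5 <checksum> <path> <size>`, and the names `md5_of` and `check_package` are made up for this example.

```python
import hashlib
import os


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def check_package(pkg_dir):
    """Return a list of (problem, relative_path) tuples for one package dir.

    Assumption: each Manifest line looks like 'MD5 <checksum> <path> <size>'.
    """
    manifest = os.path.join(pkg_dir, "Manifest")
    if not os.path.isfile(manifest):
        return [("no-manifest", "Manifest")]

    # Map each listed relative path to its expected checksum.
    listed = {}
    with open(manifest) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 3 and parts[0] == "MD5":
                listed[parts[2]] = parts[1]

    problems = []
    # Listed files must exist and match their checksum.
    for rel, expected in sorted(listed.items()):
        path = os.path.join(pkg_dir, rel)
        if not os.path.isfile(path):
            problems.append(("listed-but-missing", rel))
        elif md5_of(path) != expected:
            problems.append(("bad-checksum", rel))

    # Every file on disk (except the Manifest and CVS data) must be listed.
    for root, dirs, files in os.walk(pkg_dir):
        dirs[:] = [d for d in dirs if d != "CVS"]
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), pkg_dir)
            if rel != "Manifest" and rel not in listed:
                problems.append(("not-listed", rel))
    return problems
```

Running `check_package` over every category/package directory of the tree would give a report similar in spirit to the attached output.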
Created attachment 30246 [details] The script which checks portage tree
Created attachment 30247 [details] Current output of the script
The idea is to run this on a central portage server regularly (e.g. once a day) so that developers are informed automatically about any mistakes they made. But I would not know what to do if the maintainer email address is not given in the metadata.xml file.
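One possible way to handle the missing-address case is sketched below in Python. It assumes metadata.xml has a `<pkgmetadata>` root with `<maintainer><email>` children, and that a fallback address (for example a QA list) is supplied by whoever runs the check. The function name `maintainer_emails` and the fallback parameter are made up for this example and are not part of the attached script.

```python
import xml.etree.ElementTree as ET


def maintainer_emails(metadata_path, fallback=None):
    """Return the maintainer addresses found in a metadata.xml file.

    If the file is missing, unparsable, or names no maintainer, return
    [fallback] when a fallback address was given, otherwise an empty list.
    """
    try:
        root = ET.parse(metadata_path).getroot()
    except (OSError, ET.ParseError):
        return [fallback] if fallback else []

    # Assumed layout: <pkgmetadata><maintainer><email>...</email></maintainer>
    emails = [
        e.text.strip()
        for e in root.findall("./maintainer/email")
        if e.text and e.text.strip()
    ]
    if not emails and fallback:
        return [fallback]
    return emails
```

With this shape the report always has somewhere to go, and the choice of fallback address stays a policy decision outside the script.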
agriffis: you last touched app-admin/gkrellm, which is listed. Do you use repoman commit?
mholzer, yes, 99% of the time I use repoman. I will check into these failures.
Okay, some of these are definitely my fault. I'll go through the whole list and make sure it's cleaned up immediately. :-|
I've finished fixing these and I'm now running Andre's script for verification
Okay, it all checks out now. Andre, Aliz might be interested in what your script does. He runs http://gentoo.tamperd.net/stable which does a range of checks like this. I have added him to the cc.
i already made and run a script that does this ... been using it for a few months now ...
Even though it's fixed (I'm currently checking it), I just wanted to explain the main reason I wrote it. My brother had some strange problems with rsync mirrors (see bug 4660): files were not downloaded because they had the same size and timestamp as the local copies, yet their checksums did not match the Manifest files. OK, the whole portage tree could have been deleted and downloaded again. But since I think it is unlikely that he was the only person having these problems, I decided to write a tool to check the whole portage tree for bad checksums. It should be able to delete files which have bad checksums, so that a sync afterwards only downloads these previously deleted files. This would dramatically decrease network load compared to reloading the whole portage tree. So I will continue writing this tool. It should be possible to reduce the run time to one minute or less by using another language. If I'm able to reduce the run time to a few seconds, it could be run just before each sync. That's what the option -a is for! Then the local portage tree is free of errors if the main portage tree is too. Wouldn't that be nice?
Again there are A LOT OF bad checksums. Things were nearly OK two days ago. I've set up a service on my local system where you can look at the current state of the portage errors. It is updated once a day and is available at approx. 9am GMT. You can also find the current version of the check-portage script in this directory. http://anhi.homelinex.net/portage/ There are still some things to do in the script. E.g. I haven't started coding the parser for the maintainer mail address. And of course it will be ported to Python, as it is the standard scripting language for Gentoo and it is far better than shell script (I expect a speed advantage of about 80%). I noticed that the errors which were fixed two days ago were only the checksum failures. The checks for missing or outdated files were not done. If you have any trouble with the link, don't hesitate to send me a mail.
Re-opening for further investigation by qa team
The number of corrupted manifests is now slowly decreasing. After a high of 6572 corrupted manifests on 2004-06-08, it is down to 5731 as of 2004-06-20. Unfortunately, I didn't have time to work on the script. By Thursday I'm on vacation for ten days... if it rains there... BTW, could someone please tell me what these files in the metadata directory are good for? Is it worth checking these files for consistency with the ebuilds?
Created attachment 35664 [details] modified version Nice script. I added a -f (fix) option and have run it against the cvs tree. I got nowhere near the 1000s of broken Manifests, but most are hopefully fixed. The portage cache (metadata) helps speed up portage (believe it or not) by storing raw, easy-to-obtain ebuild data in a flat file. I don't think it's worth checking the metadata, as it is automatically generated from the ebuilds.
I'm closing this. All Manifests are fixed and I'm watching the tree for new broken Manifests.