I've written a couple of ebuilds for the FuzzyOcr SpamAssassin plugin and its one missing dependency. The plugin assigns scores based on matching spam terms in attached images. The ebuilds can be found in my devspace:
http://dev.gentoo.org/~tomk/tmp/spamassassin-fuzzyocr-2.3b.ebuild
http://dev.gentoo.org/~tomk/tmp/String-Approx-3.26.ebuild
The String::Approx module is the only dependency which isn't already in the tree. I'm going for dev-perl for String-Approx and mail-filter for spamassassin-fuzzyocr. I'll be maintaining both packages but wanted to get the OK from you guys before adding them to the tree. The plugin seems to work well: since installing it earlier today I haven't had any of those image spam messages getting through (this is the kind of spam that gets past the filters most often). Comments, questions, flames, etc. welcome :)
(In reply to comment #0)
> Comments, questions, flames, etc. welcome :)
I forgot to note that the URL for FuzzyOcr is being changed. The new site will be at http://fuzzyocr.own-hero.net/ . It is already up but needs more work in the wiki etc. But this is also the place for SVN etc. I will do more work as soon as I have my workstation back :) Best regards, Chris
Sorry to bother you, but what about the patches that are needed for the other programs? They are mentioned in the FuzzyOcr docs. For example, I think this bug should depend on bug #145939, as the patch needed for giflib appears to be attached there. The other patch is for gocr (http://users.own-hero.net/~decoder/fuzzyocr/gocr-segfault.patch). I am not sure whether it is really needed, or whether the bug is fixed in the newer version (which is not in portage at the moment, see bug #145624).
(In reply to comment #2)
> Sorry to bother you, but how is it with the patches that are needed for the
> other programs? They are mentioned in the docs of fuzzyocr?
So far it's been working fine without needing the patches. I'll have to test it out with images which cause the segfaults and see if they are needed. If so, then we'll either have to get the patches applied upstream (preferable) or in the ebuilds for those two packages.
(In reply to comment #3)
> (In reply to comment #2)
> > Sorry to bother you, but how is it with the patches that are needed for the
> > other programs? They are mentioned in the docs of fuzzyocr?
>
> So far it's been working fine without needing the patches, I'll have to test it
> out with images which cause the segfaults and see if they are needed.
>
> If so then we'll either have to get the patches applied upstream (preferable) or
> in the ebuilds for those two packages.
Using gocr 0.40 and the most current giflib, you get segfaults from both giftext and gocr when images meet certain conditions. I think I can supply some example images which cause this. As far as I know, neither bug has been fixed so far. Best regards, Chris
*** Bug 154938 has been marked as a duplicate of this bug. ***
I'm using SA 3.1.3 because my systems use the "stable" tree. I need to use a recent version of FuzzyOcr (an SA plugin) which requires at least 3.1.4. Unfortunately, there are too many reports of problems with SA versions earlier than 3.1.7, so the best thing would be to use 3.1.7. I would like to push for the inclusion of SA 3.1.7 in the portage tree. Thanks.
Ah, what a Genius! I missed the right bug... Sorry
Since we would've tried to take care of it anyway, String-Approx has been under dev-perl. Also, somehow I missed/didn't receive the announcement about the SA bump - bumped it this weekend.
Well, it segfaults sometimes, let's say in 10% of cases, but otherwise it's OK. The patches are really needed. Shall I open a request for them?
(In reply to comment #9)
> well it segfault sometime.. lets say in 10%... but otherwise its ok.
>
> Patches are realy needed, shall i open request for them ?
I've been working with upstream: a new version has been released, and he's supplied me with images which will cause the segfaults. I should get time to look at these issues some time this week, so I'll speak to the relevant maintainers and get it sorted.
I've just added the 2.3b ebuild to the tree; it should be available on the mirrors in the next 30-60 minutes. I've added a warning about the segfaults with links to the relevant bugs. I'll get working on the 3.4.2 ebuild soon.