Bug#: 158445
Status: RESOLVED
Resolution: FIXED
Assigned To: Tom Knight <tomk@gentoo.org>
Reporter: Jacob Lindberg <jni@laps.dk>

Filename | Description | Type | Creator | Created | Size
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | spamassassin-fuzzyocr-3.5.0_rc1.ebuild | text/plain | Jacob Lindberg | 2006-12-18 03:55 0000 | 1.89 KB
patchset2.patch | patchset2.patch (patch for spamassassin-fuzzyocr) | patch | Jacob Lindberg | 2006-12-18 03:55 0000 | 17.79 KB
MLDBM-Sync-0.30.ebuild | New: dev-perl/MLDBM-Sync (MLDBM-Sync-0.30.ebuild) | text/plain | Jacob Lindberg | 2006-12-18 03:56 0000 | 547 bytes
gocr-0.43.ebuild | app-text/gocr (gocr-0.43.ebuild) | text/plain | Jacob Lindberg | 2006-12-18 03:57 0000 | 1.27 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | spamassassin-fuzzyocr-3.5.0_rc1.ebuild with dev-perl/DBI | text/plain | Jacob Lindberg | 2006-12-18 04:29 0000 | 1.90 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | spamassassin-fuzzyocr-3.5.0_rc1.ebuild with dev-perl/DBI | text/plain | Jacob Lindberg | 2006-12-18 04:29 0000 | 1.90 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | spamassassin-fuzzyocr-3.5.0_rc1.ebuild with dev-perl/DBI | text/plain | Jacob Lindberg | 2006-12-18 04:29 0000 | 1.90 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | fuzzy-ocr with various USE flags | text/plain | Juan | 2006-12-25 19:49 0000 | 2.20 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | New ebuild | text/plain | Jacob Lindberg | 2006-12-26 15:24 0000 | 2.42 KB
enabletesseract.patch | Enable tesseract config patch | patch | Jacob Lindberg | 2006-12-26 15:25 0000 | 806 bytes
noocrad.patch | Disable ocrad in config if ! use patch | patch | Jacob Lindberg | 2006-12-26 15:26 0000 | 1.28 KB
enabletesseract.patch | Newest tesseract config patch | patch | Juan | 2006-12-26 16:16 0000 | 656 bytes
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | spamassassin-fuzzyocr-3.5.0_rc1.ebuild with support for 3 OCR engines | text/plain | Juan | 2006-12-26 17:01 0000 | 3.36 KB
disablegocr.patch | Disable gocr in config if ! use patch | patch | Juan | 2006-12-26 17:02 0000 | 574 bytes
Tie-Cache-0.17.ebuild | tie-cache ebuild for possible tie-cache dependency.... | text/plain | Juan | 2006-12-26 23:37 0000 | 540 bytes
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | The latest ebuild | text/plain | Jacob Lindberg | 2006-12-28 01:31 0000 | 3.98 KB
disableocrad.patch | The renamed disableocrad.patch | patch | Jacob Lindberg | 2006-12-28 01:32 0000 | 1.28 KB
fuzzyocr.logrotate | The logrotate file | text/plain | Jacob Lindberg | 2006-12-28 01:32 0000 | 194 bytes
spamassassin-fuzzyocr-3.5.0_rc1-r1.ebuild | spamassassin-fuzzyocr-3.5.0_rc1-r1.ebuild | text/plain | Jacob Lindberg | 2007-01-02 04:12 0000 | 4.33 KB
patchset1.patch | patchset1.patch | patch | Jacob Lindberg | 2007-01-02 04:29 0000 | 3.81 KB
patchset3.patch | patchset3.patch | patch | Jacob Lindberg | 2007-01-02 04:30 0000 | 17.68 KB
postgresql.patch | postgresql.patch | patch | Juan | 2007-01-04 09:56 0000 | 36.71 KB
postgresql.patch | postgresql.patch | patch | Juan | 2007-01-04 10:16 0000 | 36.71 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | ebuild for review | text/plain | Juan | 2007-01-04 10:21 0000 | 4.57 KB
spamassassin-fuzzyocr-3.5.0_rc1.ebuild | ebuild for review | text/plain | Juan | 2007-01-04 10:27 0000 | 4.60 KB
spamassassin-fuzzyocr-3.5.1.ebuild | test ebuild for sql hash storage | text/plain | Paul B. Henson | 2007-01-28 21:50 0000 | 3.22 KB
spamassassin-fuzzyocr-3.5.1.ebuild | spamassassin-fuzzyocr-3.5.1.ebuild | text/plain | Patrick McLean | 2007-02-02 03:54 0000 | 4.33 KB
spamassassin-fuzzyocr-3.5.1.ebuild | spamassassin-fuzzyocr-3.5.1.ebuild | text/plain | Patrick McLean | 2007-02-02 16:30 0000 | 4.30 KB
MLDBM-Sync-0.30.ebuild | MLDBM-Sync-0.30.ebuild | text/plain | Tom Knight | 2007-02-06 22:42 0000 | 433 bytes
postgres.tar.bz2 | Files modified to work with postgres instead of mysql | application/octet-stream | aelber@207-237-10-120.c3-0.nyw-ubr3.nyr-nyw.ny.cable.rcn.com | 2007-05-20 19:34 0000 | 15.77 KB

Bug 158445 depends on: 146390 170430



Description:   Opened: 2006-12-18 03:54 0000
I had to create all this in one bug, otherwise it wouldn't make sense.

An RC of this fine tool was released, so I decided to write an ebuild for it.

There is a new perl-module dependency called dev-perl/MLDBM-Sync. I have
created a separate ebuild for this (it doesn't exist at the moment). Please put
that in the tree also. Without MLDBM 'spamassassin --lint' will nag about
missing it.

I have changed the warning about app-text/gocr since I have created a new
ebuild for this also. This fixes the old segfaulting and heavy loading problem.

------- Comment #1 From Jacob Lindberg 2006-12-18 03:55:21 0000 -------
Created an attachment (id=104267)
spamassassin-fuzzyocr-3.5.0_rc1.ebuild

------- Comment #2 From Jacob Lindberg 2006-12-18 03:55:54 0000 -------
Created an attachment (id=104268)
patchset2.patch (patch for spamassassin-fuzzyocr)

------- Comment #3 From Jacob Lindberg 2006-12-18 03:56:41 0000 -------
Created an attachment (id=104269)
New: dev-perl/MLDBM-Sync (MLDBM-Sync-0.30.ebuild)

------- Comment #4 From Jacob Lindberg 2006-12-18 03:57:17 0000 -------
Created an attachment (id=104270)
app-text/gocr (gocr-0.43.ebuild)

------- Comment #5 From Jacob Lindberg 2006-12-18 03:58:47 0000 -------
-Without MLDBM 'spamassassin --lint' will nag about missing it.
+Without MLDBM-Sync 'spamassassin --lint' will nag about missing it.

------- Comment #6 From Jacob Lindberg 2006-12-18 04:29:23 0000 -------
Created an attachment (id=104274)
spamassassin-fuzzyocr-3.5.0_rc1.ebuild with dev-perl/DBI

I forgot dev-perl/DBI as a dependency. Here it is included.

------- Comment #7 From Jacob Lindberg 2006-12-18 04:29:35 0000 -------
Created an attachment (id=104275)
spamassassin-fuzzyocr-3.5.0_rc1.ebuild with dev-perl/DBI

I forgot dev-perl/DBI as a dependency. Here it is included.

------- Comment #8 From Jacob Lindberg 2006-12-18 04:29:46 0000 -------
Created an attachment (id=104276)
spamassassin-fuzzyocr-3.5.0_rc1.ebuild with dev-perl/DBI

I forgot dev-perl/DBI as a dependency. Here it is included.

------- Comment #9 From Jacob Lindberg 2006-12-18 04:41:29 0000 -------
(From update of attachment 104275)
Obsoleted since bugs.gentoo.org was momo..

------- Comment #10 From Michael Kefeder 2006-12-20 01:38:53 0000 -------
Thanks for all the work. I can confirm that installing with your ebuilds
successfully completed.

Now I hope the plugin does what it promises ;)

------- Comment #11 From Jacob Lindberg 2006-12-20 06:58:31 0000 -------
Good to hear.

I had some issues with the old one. Scoring was not that good. This one seems
to be doing the job alright ;)

------- Comment #12 From Juan 2006-12-25 14:52:21 0000 -------
Is it possible that spamassassin-fuzzyocr includes USE flags for the following
(optional) packages:

app-text/ocrad (already in portage)
media-gfx/gifsicle (already in portage)

There are some commented settings in FuzzyOcr.cf that would require these
packages to be installed, if uncommented.

Also, this requested ebuild should have as much attention paid to it as the
rest as this can also be an optional package:

media-gfx/tesseract-ocr

Bug report for the tesseract-ocr ebuild @
https://bugs.gentoo.org/show_bug.cgi?id=146390

Note that the source package is tesseract-<version>.**

Lastly, this is most likely an upstream bug but when used with ocrad, my log
reports the following which occurs with all 4 ocrad scansets:

2006-12-25 14:45:30 [19871] Errors in Scanset "ocrad-decolorize"
2006-12-25 14:45:30 [19871] Return code: 1, Error: /usr/bin/ocrad: invalid
option -- s
                      Try `/usr/bin/ocrad --help' for more information.

2006-12-25 14:45:30 [19871] Skipping scanset because of errors, trying next...

I looked through all the fuzzy-ocr perl modules looking for the line to patch
but couldn't find it.

------- Comment #13 From Juan 2006-12-25 14:55:52 0000 -------
I should also add that this fine SpamAssassin plugin set (fuzzy-ocr 3.5.0 /
gocr 0.43 / MLDBM-Sync) works just fine, regardless of the ocrad errors.

------- Comment #14 From Juan 2006-12-25 16:17:54 0000 -------
The ocrad errors I was receiving are documented at:

http://fuzzyocr.own-hero.net/wiki/OcradWrongParameters

------- Comment #15 From Juan 2006-12-25 19:49:46 0000 -------
Created an attachment (id=104727)
fuzzy-ocr with various USE flags

I have modified the current ebuild as it does not provide the flexibility
available with fuzzy-ocr.

This ebuild contains the following USE flags:

dbm gocr log mysql ocrad tesseract

I am not an ebuild dev but this works fine for me. The following conditions
must be met:

If you enable gocr, you must use the gocr-0.43 ebuild.
If you enable dbm, you must use the MLDBM-Sync ebuild.
If you enable tesseract, you must use this ebuild =>
https://bugs.gentoo.org/show_bug.cgi?id=146390
If you use ocrad, you can use ocrad currently in portage or use this: =>
https://bugs.gentoo.org/show_bug.cgi?id=154579

The ebuild could probably use some extra touches such as
commenting/uncommenting out scansets (gocr/ocrad/tesseract) from
FuzzyOcr.scansets.

------- Comment #16 From Jacob Lindberg 2006-12-26 15:24:10 0000 -------
Juan,

Good work. I discovered the same things just before leaving for Christmas.

Concerning the "Return code: 1, Error: /usr/bin/ocrad: invalid option -- s" the
fix on the web page has already been applied in the scanset file. It's a matter
of the version of ocrad. If you use 0.10 instead of 0.15, you will be missing "
 -s, --scale=[-]<n>       scale input image by [1/]<n>" in options.

You already fixed this with:
ocrad? ( >=app-text/ocrad-0.14 )
in the ebuild :-)
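
The version dependence described above can also be checked at runtime. A minimal sketch, assuming the help text of a sufficiently new ocrad lists --scale as quoted above; the function name ocrad_help_has_scale is made up for this illustration and is not part of FuzzyOcr or any of the attached ebuilds:

```shell
# Hypothetical helper: reads `ocrad --help` output on stdin and succeeds
# only if the --scale option (needed for the -s5 scanset call) is listed.
ocrad_help_has_scale() {
    grep -q -e '--scale'
}

# Usage (requires ocrad to be installed):
#   ocrad --help | ocrad_help_has_scale || echo "this ocrad does not know -s"
```

The ebuild-level fix (the >=app-text/ocrad-0.14 dependency) remains the simpler approach; a check like this would only help on systems installed outside Portage.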

My log now:
2006-12-26 23:33:00 [2348] Scanset Order: ocrad(0) ocrad-invert(0)
ocrad-decolorize-invert(0) ocrad-decolorize(0) gocr(0) gocr-180(0)
2006-12-26 23:33:00 [2378] Exec  : /usr/bin/ocrad -s5
/var/amavis/tmp/.spamassassin23483W6okRtmp/pic01.gif.pnm
2006-12-26 23:33:00 [2348] Saved pid: 2378
2006-12-26 23:33:00 [2378] Stdout:
>/var/amavis/tmp/.spamassassin23483W6okRtmp/scanset.ocrad.out
2006-12-26 23:33:00 [2378] Stderr:
>/var/amavis/tmp/.spamassassin23483W6okRtmp/scanset.ocrad.err

No problem here.

About the USE flags. I like the idea, but we can't make both ocrad and gocr USE
flags, since the plugin doesn't make any sense without one of them. I suggest
making gocr static, and ocrad a USE flag. Gocr was static in the earlier version.

Since MLDBM, MLDBM-Sync and DB_File are very much required, I have removed the
USE flag dbm again. The config is checking for these, and complaining in case
they don't exist. Meaning you lose functionality if they are not there.

I have added perl-core/DB_File as dependency also since it was in the config
too.

I have added ">=app-text/gocr-0.43" as dependency and removed the warning about
earlier buggy version now that we are forcing a good version.

I also did some patching when enabling tesseract and disabling ocrad.

Next step is the mysql part.

Tell me what you think so far.

------- Comment #17 From Jacob Lindberg 2006-12-26 15:24:59 0000 -------
Created an attachment (id=104771)
New ebuild

------- Comment #18 From Jacob Lindberg 2006-12-26 15:25:36 0000 -------
Created an attachment (id=104773)
Enable tesseract config patch

------- Comment #19 From Jacob Lindberg 2006-12-26 15:26:12 0000 -------
Created an attachment (id=104774)
Disable ocrad in config if ! use patch

------- Comment #20 From Juan 2006-12-26 15:41:04 0000 -------
Jacob,

Awesome, I am going to leave my production server in peace and install the
ebuilds related to this bug report on my laptop to test the hell out of it.

FYI, I am adding PostgreSQL support to FuzzyOCR myself so expect a pgsql USE
flag in the near future (hopefully for the 3.5 stable release).

Juan

------- Comment #21 From Juan 2006-12-26 15:48:19 0000 -------
gocr is not required so it doesn't make any sense to make it a dependency. One
OCR is required. There are 3 to choose from. I think it would make more sense
to nag the user when all 3 OCR flags are disabled OR simply make gocr a
dependency *IF* all 3 OCR flags are disabled.

But let's not lock people into having to install gocr. =)

------- Comment #22 From Juan 2006-12-26 16:16:45 0000 -------
Created an attachment (id=104776)
Newest tesseract config patch

For some reason, the current tesseract patch doesn't work for me... I've
attached the one I created that works (for me)...

------- Comment #23 From Jacob Lindberg 2006-12-26 16:34:29 0000 -------
Juan,

Of course you are right about the number of OCR engines. Let's do something about that.
(gocr, ocrad and tesseract).

Good to hear about PostgreSQL support. Most probably a lot of people will love
that.

I need to get some sleep now. I will look into the 3 OCR issue tomorrow. It's
1:30 am here.

About the tesseract patch, this is the only patch I didn't test! I admit that.
I just did a diff. 

Is the tesseract software any good?

------- Comment #24 From Juan 2006-12-26 16:58:04 0000 -------
Jacob,

Cool. Well, I am one step ahead of you as I have created the newest ebuild to
support all 3 OCR engines and nag if all OCR engine USE flags are disabled
(perhaps a better approach is in order here)....

I currently have Fuzzy on my production server using PgSQL. It's a nasty, quick
and super dirty hack but it works for now. Since I now have fuzzy on my laptop,
I'll be able to add PgSQL support more easily.

About tesseract.. I think it's probably just as good as ocrad. It was
open-sourced last year by HP and UNLV for what it's worth... Both ocrad and
tesseract catch my custom spam images I've made for testing....

------- Comment #25 From Juan 2006-12-26 17:01:20 0000 -------
Created an attachment (id=104778)
spamassassin-fuzzyocr-3.5.0_rc1.ebuild with support for 3 OCR engines

Newest ebuild that simply makes gocr a USE flag. If all 3 OCR engine USE flags
are disabled, the ebuild complains then dies.
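
The check described above might look roughly like the following. This is a sketch, not the attached ebuild: `use` is stubbed here so the snippet runs outside Portage (inside an ebuild the real `use` helper would be called, and `eerror`/`die` would take the place of `echo`/`return 1`), and check_ocr_engines is a made-up name.

```shell
# Stand-in for Portage's `use` helper: succeeds if the flag named in $1
# appears in the space-separated USE variable.
use() {
    case " ${USE} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Fail unless at least one of the three OCR engine flags is enabled.
check_ocr_engines() {
    if ! use gocr && ! use ocrad && ! use tesseract; then
        echo "Enable at least one of the gocr, ocrad or tesseract USE flags" >&2
        return 1
    fi
    return 0
}
```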

------- Comment #26 From Juan 2006-12-26 17:02:09 0000 -------
Created an attachment (id=104779)
Patch to disable gocr scansets

------- Comment #27 From Juan 2006-12-26 18:54:18 0000 -------
>> Jacob wrote:
    Next step is the mysql part.

Oh yea.

My only guess would be to drop the SQL schemas into fuzzy's home dir (in
/etc/mail/sa) and point users to those files with some post install message..
eh?

------- Comment #28 From Juan 2006-12-26 23:37:39 0000 -------
Created an attachment (id=104790)
tie-cache ebuild for possible tie-cache dependency....

So apparently, the Perl module Tie-Cache is required. But I don't get it.
Tie-Cache is not in portage and I never installed it on my server but Fuzzy
works without issues. Moving along to my laptop, I had to manually install
Tie-Cache to get Fuzzy to work.

Is Tie-Cache masked as some other package in portage? If not, it is required so
an ebuild for Tie-Cache will be needed.

In case it is, here is the overlay ebuild for Tie-Cache 0.17

Can anyone confirm this? FuzzyOcr source does call on tie-cache so... hmm

------- Comment #29 From Juan 2006-12-27 00:04:39 0000 -------
(From update of attachment 104790)
># Copyright 1999-2006 Gentoo Foundation
># Distributed under the terms of the GNU General Public License v2
># $Header: /var/cvsroot/gentoo-x86/perl-core/Tie-Cache/Tie-Cache-0.17.ebuild,v 1.9 2006/08/04 13:30:56 mcummings Exp $
>
>inherit perl-module
>
>DESCRIPTION="The Perl LRU Cache Memory Module"
>HOMEPAGE="http://search.cpan.org/~chamas/Tie-Cache-0.17/Cache.pm"
>SRC_URI="mirror://search.cpan.org/CPAN/authors/id/C/CH/CHAMAS/${P}.tar.gz"
>
>LICENSE="|| ( Artistic GPL-2 )"
>SLOT="0"
>KEYWORDS="~x86"
>IUSE=""
>
>SRC_TEST="do"
>
>DEPEND="dev-lang/perl"

------- Comment #30 From Jacob Lindberg 2006-12-27 06:13:25 0000 -------
Juan,

Isn't it tesseract which is depending on Tie::Cache? I'm not using tesseract at
the moment, and I don't see any warnings or any requirement of this from
Fuzzyocr. Please test that.

About the SQL files, good idea!

Please make sure that x86, ppc and ppc64 are included as KEYWORDS in your
ebuild(s). I'm using all 3 archs when testing :)

Tomorrow I will be finished with another update of the ebuild which also
includes enabling log and logrotate in USE flags. Right now the log doesn't do
much. Also I will restructure the DEPEND and RDEPEND since it's a mess right
now :)

------- Comment #31 From Juan 2006-12-27 12:18:43 0000 -------
Jacob,

FYI: I removed SA/Fuzzy and all dependencies as to start fresh. And as of now,
all is working as expected.

On Tie::Cache.. When I received this error, I must admit that I installed Fuzzy
then applied my PgSQL patches before doing ANY testing to confirm
functionality. Installing Tie::Cache solved that issue this particular time.
Now, after a fresh, squeaky clean install of SA and Fuzzy (no PgSQL patches but
using my ebuild), no Tie::Cache related errors. I then patched Fuzzy with my
PgSQL files and still no error. However, I did end up installing Storable on my
laptop (Storable is installed on the server) but I don't see why not having
Storable installed on my laptop would give me Tie::Cache related errors. I
assume Storable would be used for file based hashing(????).

I have ocrad and tesseract as the OCR engines so it appears that Tie::Cache
isn't a requirement for Tesseract since no more errors. All I can say is weird
and that I cannot reproduce the errors.

On KEYWORDS... I will be sure to remember to add more than just my arch..
hehe..

On Log-Agent... I'm not so sure where/how this comes into place. But on Fuzzy's
site, he states that Log-Agent *might* be required for MLDBM-Sync but that some
users have reported no issues without it. I have removed it on this new install
and have no errors/issues without it. It might be safe to remove that as a DEP.
You can read about it here:
http://fuzzyocr.own-hero.net/wiki/Installation-3.5.x

The log USE flag should have been log-agent and not log, my bad.

Lastly, and off topic.. Are you using MySQL for hashing? If so, do you get the
following error which occurs when repeat offending images are updated in the
hash table (of course, you'd see DBD::mysql):

warn: DBD::Pg::db do failed: ERROR: syntax error at or near "check" at
character

I find it weird that I would get a simple update query error that wasn't caught
with mysql testing since the SQL queries are very basic queries. In the case
above, this is the query which doesn't seem incorrect in any way as all cells
exist as does the image being updated:

update hash set match = '1', check = '1167249571' where key =
'255:255:255:255:173820::0:0:0:0:14680'

------- Comment #32 From Juan 2006-12-27 17:52:21 0000 -------
Jacob,

My SQL error was Pg specific. It appears 'check' is a reserved word so I
changed 'check' to 'last_seen' and it works. So no worries about errors...
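
For illustration, the renamed query could be built like this. A sketch only: build_hash_update is a hypothetical helper, the table and the other column names are taken from the failing query above, and the values are interpolated without any escaping, which real code should not do.

```shell
# Hypothetical helper: prints the UPDATE statement with the reserved
# column name `check` replaced by `last_seen`.
#   $1 = match count, $2 = epoch timestamp, $3 = image hash key
build_hash_update() {
    printf "update hash set match = '%s', last_seen = '%s' where key = '%s'\n" \
        "$1" "$2" "$3"
}

build_hash_update 1 1167249571 '255:255:255:255:173820::0:0:0:0:14680'
```

PostgreSQL also accepts a reserved word as a column name when it is written as a double-quoted identifier, but renaming the column avoids having to quote it in every query.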

In any case, I have submitted my PgSQL patchset to the devs @ FuzzyOcr.
Hopefully it'll make it for the 3.5 stable release...

http://fuzzyocr.own-hero.net/ticket/34

It works as it should with PgSQL. Can't comment on MySQL functionality. This
project kicks ass!

------- Comment #33 From Juan 2006-12-27 17:55:29 0000 -------
Since you're fine tuning the ebuild, don't forget to add DBD-mysql when USE
mysql is enabled. Same for PgSQL once that gets into source.

Storable appears to be required when hashing to files, not SQL. Can you confirm
this?

------- Comment #34 From Jacob Lindberg 2006-12-28 01:30:53 0000 -------
Juan,

Nice job there. I hope to see your patch go into stable 3.5 :)

Do you think we should add your patch to the ebuild?

I took your ebuild as reference. Now we are in 'sync'.

I have made some changes and enhancement to the ebuild now. I will list a small
ChangeLog here:

--------
- Changed dev-db/mysql to dev-perl/DBD-mysql in mysql USE FLAG
- Added perl-core/Storable to RDEPEND
- Removed dev-perl/Log-Agent from DEPEND
- Changed the eerror to a little more user friendly message
- Added USE flag log which will change "#focr_logfile /tmp/FuzzyOcr.log" to
"focr_logfile /var/log/FuzzyOcr.log" in FuzzyOcr.cf
- Renamed noocrad.patch to disableocrad.patch in files
- Changed DEPEND to only consist of dev-lang/perl and
>=mail-filter/spamassassin-3.0.0. The rest is in RDEPEND since there is no need
when building the package
- Added /var/lib/FuzzyOcr to handle all file dbs + changing
/etc/mail/spamassassin to /var/lib/FuzzyOcr in FuzzyOcr.cf
--------
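
The focr_logfile change in the list above could be done with a sed substitution along these lines. A sketch under assumptions: enable_focr_logfile is a made-up name, and the commented default line is assumed to appear in FuzzyOcr.cf exactly as quoted in the ChangeLog.

```shell
# Hypothetical helper: uncomment the log file setting in the given copy of
# FuzzyOcr.cf and point it at /var/log instead of /tmp.
#   $1 = path to FuzzyOcr.cf
enable_focr_logfile() {
    cf="$1"
    # Write to a temporary file and move it into place, so this does not
    # depend on the non-portable in-place option of sed.
    sed -e 's:^#focr_logfile /tmp/FuzzyOcr\.log:focr_logfile /var/log/FuzzyOcr.log:' \
        "$cf" > "$cf.new" && mv "$cf.new" "$cf"
}
```

In the ebuild this would presumably run only when the log USE flag is set.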

As you can see I removed the Log::Agent, and used the log USE flag for enabling
logging from FuzzyOcr. I hope this is okay with you.

I have enabled image hashing (option 2) in my test setup and it all works like
a charm. I even moved it to production now. 

I can't use mysql since my servers are too loaded for doing SQL queries at the
moment. I will have to trust you, Juan, on that one :)

I found this in Hashes.pm: "use MLDBM qw(DB_File Storable);" and this in
Config.pm: "use constant HAS_STORABLE => eval { require Storable; };". So you
are obviously right about Storable as dependency.

Are we missing something else?

------- Comment #35 From Jacob Lindberg 2006-12-28 01:31:40 0000 -------
Created an attachment (id=104837)
The latest ebuild

------- Comment #36 From Jacob Lindberg 2006-12-28 01:32:22 0000 -------
Created an attachment (id=104838)
The renamed disableocrad.patch

------- Comment #37 From Jacob Lindberg 2006-12-28 01:32:50 0000 -------
Created an attachment (id=104839)
The logrotate file

------- Comment #38 From Juan 2006-12-28 01:57:04 0000 -------
Jacob,

Ebuild looks nice. I think an ebuild could be created with my patch to at least
have it tested. I do have mine in production working fine but it would be nice
to have input from others to pass on to the Fuzzy devs if needed. USE postgre
would need to be added as well as a new DEP, DBD-Pg. I think that might be best
since I don't want to be the only Pg tester for the world to depend on... =)

Also, pkg_postinst should copy the SQL files somewhere and instruct the user to
use/import files located in X dir. Not sure where to drop those files though...

It's 2am. I'm off to bed.

Juan

------- Comment #39 From Juan 2006-12-28 10:54:54 0000 -------
Jacob,

The newest ebuild works great!

Now, I've been trying to apply my patch to the ebuild but am failing miserably.
Is there anything special that I need to do to the pgsql.patch file I drop into
${FILESDIR}?

It is failing at:

 * Applying pgsql.patch ...

 * Failed Patch: pgsql.patch !

------- Comment #40 From Jacob Lindberg 2007-01-01 22:34:25 0000 -------
Juan,

When I look through your patch it changed the logging facility in the config
file. This is something your patch should not do. It should give the ability to
use pgsql, but nothing else.

Can you create a new patch? Or provide me the one you want to use? 

I will help you make it work in the ebuild.

------- Comment #41 From Vieri 2007-01-02 02:02:06 0000 -------
Hi,
I'm new to this plugin but am really interested to try it out.
I'm resorting to this ebuild because the official website says that the
"stable" version is not recommended.

Your latest ebuild has:
epatch "${FILESDIR}"/patchset2.patch
The web site lists a patchset3.
I suppose it should be updated.

Also I saw that you are using the amavis user permissions on some files. My
system doesn't use amavis. Is it necessary?

------- Comment #42 From Jacob Lindberg 2007-01-02 04:11:53 0000 -------
Hi,

Thanks for your observations. I have created a new ebuild with all patches
(1,2,3), and a warning about the amavis user. I need to do some thinking about
this issue, since we can't make sure that the amavis user actually exists.

------- Comment #43 From Jacob Lindberg 2007-01-02 04:12:52 0000 -------
Created an attachment (id=105149)
spamassassin-fuzzyocr-3.5.0_rc1-r1.ebuild

------- Comment #44 From Jacob Lindberg 2007-01-02 04:29:30 0000 -------
Created an attachment (id=105150)
patchset1.patch

------- Comment #45 From Jacob Lindberg 2007-01-02 04:30:13 0000 -------
Created an attachment (id=105151)
patchset3.patch

------- Comment #46 From Vieri 2007-01-02 05:48:13 0000 -------
(In reply to comment #42)

thanks.
I also noticed that the ebuild requires:
>=mail-filter/spamassassin-3.0.0

however fuzzyocr-3.5 seems to require version 3.1.4 or higher
(http://fuzzyocr.own-hero.net/wiki/Installation-3.5.x).

------- Comment #47 From Vieri 2007-01-02 05:50:41 0000 -------
(In reply to comment #46)
> (In reply to comment #42)
[EDIT]: I just saw the "if has_version '<mail-filter/spamassassin-3.1.4';"

------- Comment #48 From Jacob Lindberg 2007-01-02 23:03:45 0000 -------
Hi again

Well about spamassassin, the oldest version available in portage is 3.1.3. This
will most probably disappear before this ebuild goes in the tree.

------- Comment #49 From Jacob Lindberg 2007-01-02 23:38:23 0000 -------
And by the way:

        # if we're using spamassassin < 3.1.4 we need to set this variable
        if has_version '<mail-filter/spamassassin-3.1.4'; then
            sed -i -e "s:^#focr_pre314 0.0:focr_pre314 1:" FuzzyOcr.cf
        fi

...

------- Comment #50 From Vieri 2007-01-03 00:37:21 0000 -------
(In reply to comment #42)
> a warning about the amavis user. I need to do some thinking about
> this issue, since we can't make sure that the amavis user actually exists.

How about moving fperms and fowners to pkg_config() so that the user can
specify which user spamassassin is running under? I find it tricky for the
ebuild to correctly autodetect the spamassassin system user but if you find a
way then that would be great.
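
One way to avoid hard-coding amavis is to verify the account before touching ownership. A minimal sketch, assuming the admin supplies the desired user name; resolve_owner and the SA_USER variable in the usage comment are illustrative names, not something the posted ebuilds define:

```shell
# Hypothetical helper: print the user name if the account exists,
# otherwise complain on stderr and fail so the caller can skip chown.
#   $1 = user spamassassin runs as (defaults to amavis)
resolve_owner() {
    candidate="${1:-amavis}"
    if id -u "$candidate" >/dev/null 2>&1; then
        printf '%s\n' "$candidate"
        return 0
    fi
    echo "user '$candidate' does not exist; skipping ownership change" >&2
    return 1
}

# Usage, e.g. from a pkg_config()-style step:
#   owner="$(resolve_owner "$SA_USER")" && chown "$owner" /var/lib/FuzzyOcr
```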

------- Comment #51 From Juan 2007-01-03 10:24:54 0000 -------
(In reply to comment #40)
> Juan,
> 
> When I look through your patch it changed the logging facility in the config
> file. This is something your patch should not do. It should give the ability to
> use pgsql, but nothing else.
> 
> Can you create a new patch? Or provide me the one you want to use? 
> 
> I will help you make it work in the ebuild.
> 

Jacob,

I will post my PostgreSQL patch in a bit (later today).

------- Comment #52 From Juan 2007-01-04 09:56:55 0000 -------
Created an attachment (id=105396)
postgresql.patch

------- Comment #53 From Juan 2007-01-04 10:16:00 0000 -------
Created an attachment (id=105398)
postgresql.patch

------- Comment #54 From Juan 2007-01-04 10:21:53 0000 -------
Created an attachment (id=105399)
ebuild for review

Jacob,

I've attached a new ebuild for you to look at (see below). I've also uploaded
the postgresql patch that I've been trying to get to work. 

Also, a couple of things about the ebuild. If you're going to assume everyone
uses amavis, add an amavis flag. I don't use amavis so I have no amavis user.

So you're keeping both log and logrotate flags? I think that is rather
redundant and should probably stick with logrotate.

------- Comment #55 From Juan 2007-01-04 10:27:44 0000 -------
Created an attachment (id=105400)
ebuild for review

------- Comment #56 From Paul B. Henson 2007-01-28 01:41:58 0000 -------
I'm putting together a new postfix/amavisd-new/clamav/spamassassin system, and
came across your ebuild in progress. A few initial comments:

FuzzyOCR 3.5.1 is out, so the 3 patchsets are obsolete. Looks like the tarball
has a -devel in the name now.

If you're only storing hashes to SQL, I don't think there's a need for the DBM
packages. Why not put back the dbm use flag? It seems the three choices are:

  -dbm -*sql: no hashing.
  dbm -*sql: depend on dbm packages, local file hash storage.
  -dbm *sql: depend on DBI/appropriate DBD, sql hash storage.

The
current ebuild depends on dev-perl/MLDBM-Sync and dev-perl/DBI regardless of
use flags, which will result in extra cruft installed. It looks like
perl-core/Storable is only needed for dbm support too. I'm planning on storing
hashes in mysql, and don't want to install unnecessary packages (one of the
things I like about Gentoo versus precompiled dists is that flexibility). I
guess a fourth choice would be dbm *sql, install it all...
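
The flag combinations above map naturally onto conditional dependencies. A sketch of what such an RDEPEND fragment could look like, using only package names mentioned in this bug; it is not taken from any of the attached ebuilds, and which package belongs under which flag is an assumption based on the reasoning above:

```shell
# Illustrative only: optional dependencies pulled in per USE flag.
# With both flags off, nothing extra is installed (no hashing).
RDEPEND="
	dbm? ( dev-perl/MLDBM-Sync perl-core/DB_File perl-core/Storable )
	mysql? ( dev-perl/DBI dev-perl/DBD-mysql )"
```

With this layout, enabling both flags simply installs everything, which covers the fourth choice.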

Where does the direct dependency on virtual/perl-Digest-MD5 come from? I don't
see anything in the FuzzyOCR code itself that uses it.

How about media-gfx/imagemagick? FuzzyOCR doesn't seem to depend on it
directly. It looks like it is only needed for tesseract support? If so, it
should only be included if tesseract is used.

/var/lib/FuzzyOcr is only needed when dbm is used.

Just a personal opinion, but I'm not sure why the default for the .words,
.scansets, and .preps files is /etc/spamassassin rather than
/etc/spamassassin/FuzzyOcr. FuzzyOcr.cf itself clearly needs to go into
/etc/spamassassin, but the other files seem better located in the subdir. I'll
probably install them there and update the .cf file. Actually, the files
currently going into /etc/spamassassin/FuzzyOcr seem to be perl modules, not
config files. Why shouldn't those go into ${VENDOR_LIB}/FuzzyOcr with all the
other perl modules? Makes more sense than /etc. Looks like the ebuild already
installs FuzzyOcr.pm into the spamassassin plugin dir instead of
/etc/spamassassin, might as well relocate the other modules to a more
appropriate spot.

Well, I guess I'll go see how well my tweaked ebuild works out.

Thanks...

------- Comment #57 From Jacob Lindberg 2007-01-28 09:55:23 0000 -------
Sounds good to me. I kind of ran out of time to finish this project. At least for the
moment. I still have it running in production though.

------- Comment #58 From Marco Nierlich 2007-01-28 10:04:56 0000 -------
Paul, would you mind attaching your tweaked ebuild?

------- Comment #59 From Paul B. Henson 2007-01-28 21:50:28 0000 -------
Created an attachment (id=108421)
test ebuild for sql hash storage

------- Comment #60 From Paul B. Henson 2007-01-28 21:54:03 0000 -------
Ok, I attached the ebuild I've been playing with. Note it is not meant as a
replacement for the last proposed ebuild, I've only tested it with the use
flags I wanted, and I ripped out some of the logging stuff I didn't need rather
than trying to fix it. Also, it doesn't automatically fix the paths in
FuzzyOcr.cf like it should, I edited that file by hand afterward. However, I am
running it on a test system successfully storing hashes into mysql, without any
dbm related packages, with perl code located in /usr/lib/perl, and no
complaints/problems so far.

------- Comment #61 From Patrick McLean 2007-02-02 03:54:58 0000 -------
Created an attachment (id=108904)
spamassassin-fuzzyocr-3.5.1.ebuild

Ebuild for spamassassin-fuzzyocr-3.5.1, this is fixed up a bit, mostly small
stuff from the previous ebuilds posted here.

The postgresql patch doesn't apply anymore, I have it commented out for now, if
you make up a new one for me I can add it in again. I will give this a few days
testing, and hopefully get a new postgres patch, then talk to tomk about
adding this to portage.

------- Comment #62 From Patrick McLean 2007-02-02 16:30:34 0000 -------
Created an attachment (id=108944)
spamassassin-fuzzyocr-3.5.1.ebuild

Some cleanups, change the tesseract dep from media-gfx/tesseract to
app-text/tesseract since all the other OCR apps in portage are in app-text.

------- Comment #63 From Tom Knight 2007-02-02 18:58:30 0000 -------
Sorry guys, been really busy recently. Thanks for all the work you've put into
the ebuild(s), I'll have a look this weekend.

------- Comment #64 From Jacob Lindberg 2007-02-03 21:09:47 0000 -------
Patrick, thanks for continuing this project. I will see if I get some time next
week to help you out.

------- Comment #65 From Tom Knight 2007-02-06 22:39:04 0000 -------
(From update of attachment 104270)
0.43 has been added to the tree, see bug 145624.

------- Comment #66 From Tom Knight 2007-02-06 22:42:43 0000 -------
Created an attachment (id=109385)
MLDBM-Sync-0.30.ebuild

Fixed the LICENSE and KEYWORDS (we can't add arches which we haven't tested on,
although we can request that the arch teams add their ~ARCH keywords).

------- Comment #67 From Jacob Lindberg 2007-02-07 07:46:52 0000 -------
So my ppc, ppc64 and x86 setups don't count for MLDBM-Sync?

Please fix this again, Tom. It has been tested fully and is actually running at
the moment :-)

------- Comment #68 From Tom Knight 2007-02-07 17:42:23 0000 -------
(In reply to comment #67)
> So my ppc, ppc64 and x86 setups don't count for MLDBM-Sync?
> 
> Please fix this again, Tom. It has been tested fully and is actually running at
> the moment :-)
> 

Although you've tested it on those arches, Gentoo policy is that a new package
should only include ~ARCH keywords for arches that the dev(s) adding it have
tested it on.

Once it's been added, I'll file another bug asking the arch teams to add their
~ARCH keywords if it works correctly on those arches, and I'll mention that
you've tested it there.
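
To illustrate the keywording policy described above, here is a minimal sketch
of how the KEYWORDS line in a new ebuild might look. The arch names are
assumptions chosen only for illustration; the thread does not say which arches
the committing developer tested on.

```shell
# Sketch only: arch names are hypothetical, not taken from the real ebuild.
# When the package is first added, list only the testing (~ARCH) keywords
# for arches the committing developer has tested on.
KEYWORDS="~amd64"

# After the arch teams confirm it works (via a follow-up keywording bug),
# the line might grow to something like:
# KEYWORDS="~amd64 ~ppc ~ppc64 ~x86"
```

The leading tilde marks a keyword as testing rather than stable.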

------- Comment #69 From Jacob Lindberg 2007-02-08 11:24:39 0000 -------
Tom, 
Okay; not a problem.

------- Comment #70 From Tom Knight 2007-02-23 16:22:15 0000 -------
I've tested this out and made a few modifications to the ebuild. Once the SPARC
team have added keywords for the required dependencies in bug 168060, bug 168062
and bug 168063 (which I've been told will be done by tomorrow evening), I'll add
the 3.5.1 ebuild to the tree.

Thanks for everyone's patience and hard work that's gone into this.

------- Comment #71 From Paul B. Henson 2007-02-23 22:13:36 0000 -------
Tom,

Glad to hear this is about to go into portage. It doesn't look like you posted
the final version of the ebuild you plan to add; I was just wondering if you
had the chance to incorporate any of the suggestions I made in comments #56/60.

Thanks much...

------- Comment #72 From Tom Knight 2007-03-11 15:53:50 0000 -------
(In reply to comment #71)
> I was just wondering if you
> had the chance to incorporate any of the suggestions I made in comments #56/60.

Yes, I've removed the un-needed requirements on virtual/perl-Digest-MD5 and
media-gfx/imagemagick; they were left over from the previous version and are no
longer needed.

I've also re-added the dbm USE flag to control the requirements needed for
hashing support.
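
As a rough sketch of what a USE-conditional dependency of this kind looks like
in an ebuild: the fragment below is an assumption, not the committed ebuild. It
uses dev-perl/MLDBM-Sync (the package attached earlier in this bug) as the
dbm-only dependency and omits every other dependency.

```shell
# Sketch only: not the ebuild that was committed to the tree.
# Declare the dbm USE flag so users can toggle dbm-based hash storage.
IUSE="dbm"

# Pull in the dbm-related Perl module only when the dbm flag is enabled.
# Assumption: dev-perl/MLDBM-Sync is the dependency controlled by the flag.
RDEPEND="dbm? ( dev-perl/MLDBM-Sync )"
```

With the flag disabled, a setup that stores hashes in an SQL database would not
need the dbm packages installed.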

------- Comment #73 From Jason Phillips 2007-03-12 14:55:32 0000 -------
(In reply to comment #72)
> Yes, I've removed the un-needed requirements on virtual/perl-Digest-MD5 and
> media-gfx/imagemagick; they were left over from the previous version and are no
> longer needed.
> I've also re-added the dbm USE flag to control the requirements needed for
> hashing support.

Hi Tom. If the ebuild isn't going into Portage shortly, would you mind posting
your latest version here? Thanks, Jason.

------- Comment #74 From Tom Knight 2007-03-12 19:03:02 0000 -------
(In reply to comment #73)
> Hi Tom. If the ebuild isn't going into Portage shortly, would you mind posting
> your latest version here? Thanks, Jason.
> 

Too late, I've just added it to the tree :) It will show up on the mirrors
within the next hour. Thanks to everyone who helped out.

------- Comment #75 From aelber@207-237-10-120.c3-0.nyw-ubr3.nyr-nyw.ny.cable.rcn.com 2007-05-20 19:34:34 0000 -------
Created an attachment (id=119839)
Files modified to work with postgres instead of mysql

These are a sql schema, Config.pm, Hashing.pm, and FuzzyOcr.pm based on Juan's
postgres patch.  I haven't included the cf since there's no change.  There are
two changes in these files from Juan's patchset that should be noted:

First, the sql file does not try to drop the prior schema, and it assumes the
database user will be "spamassassin" rather than FuzzyOCR.  This conforms to
the instructions used when setting spamassassin itself to use Postgres.

Second, this disables the ability to use mysql.  Why does it do that?  Because
the code to check installation of the right DBD:: class doesn't work right, and
otherwise it throws an error every time it starts.
