Bug 802189 - mail-filter/spamassassin: undeclared dependency on virtual/perl-DB_File
Summary: mail-filter/spamassassin: undeclared dependency on virtual/perl-DB_File
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Philippe Chaintreuil
URL:
Whiteboard:
Keywords: PullRequest
Depends on:
Blocks:
 
Reported: 2021-07-14 20:16 UTC by Scott Alfter
Modified: 2021-09-16 05:58 UTC

See Also:
Package list:
Runtime testing required: ---


Description Scott Alfter 2021-07-14 20:16:14 UTC
The Bayesian filter in SpamAssassin needs DB_File to work, but DB_File is currently only pulled in when the berkdb USE flag is set.  Without berkdb, DB_File is unavailable and SpamAssassin cannot be trained.

Reproducible: Always
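
One way to confirm the reported condition on a given system is to ask Perl whether it can load DB_File at all. The following is a minimal sketch, not part of the original report; the helper name have_perl_module is hypothetical, and it assumes perl is on the PATH.

```shell
# Hypothetical helper: return 0 if the named Perl module can be loaded,
# non-zero otherwise (including when perl itself is not installed).
have_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if have_perl_module DB_File; then
    echo "DB_File is available"
else
    echo "DB_File is missing; DB_File-backed Bayes storage will not work"
fi
```

If the check fails, the thread's later comments suggest the usual cause is dev-lang/perl having been built without USE="berkdb".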
Comment 1 Philippe Chaintreuil 2021-07-14 21:05:41 UTC
Hi Scott,

What solution are you looking for?


DB_File is stated as optional in SpamAssassin's INSTALL file:

====================================================================
Optional Modules
----------------

    [...]

  - DB_File (from CPAN, included in many distributions)

    Used to store data on-disk, for the Bayes-style logic, TxRep, and
    auto-whitelist.  *Much* more efficient than the other standard Perl
    database packages.  Strongly recommended.
====================================================================
https://raw.githubusercontent.com/apache/spamassassin/spamassassin_release_3_4_6/INSTALL

And it does state that you'll need it for "Bayes-style logic" (i.e., training).  However, training is not required to run SpamAssassin.


DB_File is Perl's module for accessing berkdb.

====================================================================
DB_File is a module which allows Perl programs to make use of the facilities provided by Berkeley DB [...]
====================================================================
https://perldoc.perl.org/DB_File#DESCRIPTION

(virtual/perl-DB_File just requires that dev-lang/perl have USE="berkdb" set.)
Comment 2 Fabian Groffen gentoo-dev 2021-07-21 19:00:43 UTC
Real problem is that USE=berkdb needs to be enabled on dev-lang/perl, or you need to use MySQL/PostgreSQL in order to have a Bayes DB.
Comment 3 Philippe Chaintreuil 2021-07-21 20:04:14 UTC
I'm still not following what the issue or desired solution is.

If you want berkdb storage, you set berkdb (which then propagates +berkdb to dev-lang/perl via virtual/perl-DB_File, right?).

If you want SQL storage, you set one of the SQL flags.

If you're the 1% that just wants downloaded rules -- there's nothing that needs to be stored, so you don't have to set any of them.
Comment 4 Fabian Groffen gentoo-dev 2021-07-22 06:38:42 UTC
In my case my setup just broke after upgrading perl, because berkdb is now dropped.  You could argue that that's my fault because of the berkdb flag change.  I'll buy that, but the fact remains that the Bayesian filter only works with berkdb (not gdbm) and that Bayes support unexpectedly broke.

Perhaps something like USE=bayes on spamassassin could have a required use of one or more of berkdb and sql flags.  I think the most important change is that by default it worked (because berkdb) and now it doesn't.
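
The "required use" idea above could be expressed in an ebuild roughly as follows. This is only a sketch of the suggestion, not the content of any actual pull request; the flag names bayes, mysql, and postgres are assumptions about how the ebuild might name them.

```shell
# Hypothetical ebuild fragment. Flag names are illustrative assumptions.
# "+bayes" enables the flag by default; REQUIRED_USE then forces the user
# to pick at least one storage backend whenever bayes is enabled.
IUSE="+bayes berkdb mysql postgres"
REQUIRED_USE="bayes? ( || ( berkdb mysql postgres ) )"
```

With a constraint like this, Portage would refuse to merge the package with bayes enabled and no backend selected, instead of letting Bayes support break silently.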
Comment 5 Philippe Chaintreuil 2021-07-26 21:53:12 UTC
(In reply to Fabian Groffen from comment #4)
> Perhaps something like USE=bayes on spamassassin could have a required use
> of one or more of berkdb and sql flags.  I think the most important change
> is that by default it worked (because berkdb) and now it doesn't.

@grobian, could I get any input you might have on https://github.com/gentoo/gentoo/pull/21801 ?  Thanks.
Comment 6 Fabian Groffen gentoo-dev 2021-07-27 06:34:58 UTC
I think the change suggested in the pull-request makes it explicit that some deps are necessary to enable bayes support.

Since bayes is auto-enabled, I support the +bayes construct.  I think none of the required flags are enabled by default though, so it will trigger a choice to be made by the user.

Perhaps, it would be better to also change the defaults in local.cf:
use_bayes 1/0 (based on USE=bayes)
bayes_store_module Mail::SpamAssassin::BayesStore::{MySQL,PgSQL,DBM}

and for SQL perhaps commented out suggestions for:
#bayes_sql_dsn DBI:mysql:sa_bayes:localhost:3306
#bayes_sql_username
#bayes_sql_password
#bayes_sql_override_username
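
The mapping from USE flags to local.cf defaults suggested above might be sketched like this. The function name bayes_conf_lines and the flag names are hypothetical, and preferring SQL over berkdb when both are set is an arbitrary choice made for the example.

```shell
# Hypothetical helper: print suggested local.cf Bayes lines for the
# enabled USE flags passed as arguments. Flag names are assumptions.
bayes_conf_lines() {
    flags=" $* "
    case "$flags" in
        *" bayes "*) ;;
        *) echo "use_bayes 0"; return 0 ;;
    esac
    echo "use_bayes 1"
    # Pick the first matching storage backend; SQL is checked before berkdb.
    case "$flags" in
        *" mysql "*)    echo "bayes_store_module Mail::SpamAssassin::BayesStore::MySQL" ;;
        *" postgres "*) echo "bayes_store_module Mail::SpamAssassin::BayesStore::PgSQL" ;;
        *" berkdb "*)   echo "bayes_store_module Mail::SpamAssassin::BayesStore::DBM" ;;
    esac
}

bayes_conf_lines bayes berkdb
```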

I recently migrated from db to mysql, which took hours (almost a day) so it isn't a task to be taken lightly.

I guess many people who do not use sa-learn have not noticed that their bayes setup got broken recently; perhaps this ebuild change will also prompt them to review their setup.
Comment 7 Philippe Chaintreuil 2021-09-15 19:36:10 UTC
That pull request doesn't seem to have been well received.

How would you feel instead about a post-install message that warns that bayes support is unavailable if none of the storage USE flags are set?
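
The logic of such a warning could look roughly like the sketch below. The function name bayes_storage_warning and the flag names are hypothetical; in a real ebuild the check would presumably live in pkg_postinst and use Portage's use and ewarn helpers rather than plain arguments and echo.

```shell
# Hypothetical helper: given the enabled USE flags as arguments, print a
# warning if none of them is a Bayes storage backend. Flag names are
# assumptions for illustration.
bayes_storage_warning() {
    for flag in "$@"; do
        case "$flag" in
            berkdb|mysql|postgres) return 0 ;;
        esac
    done
    echo "WARNING: no Bayes storage backend enabled; Bayes support is unavailable."
    echo "Enable one of these USE flags: berkdb mysql postgres"
}

bayes_storage_warning ssl ipv6
```

Unlike a REQUIRED_USE constraint, this approach does not block the merge; it only informs the user after installation.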
Comment 8 Fabian Groffen gentoo-dev 2021-09-16 05:58:55 UTC
It just needs a bit more work to actually enable/disable bayes in the config, such that people have to enable bayes explicitly (and then have a db option available).