Wide character in print at /usr/bin/sa-compile line 433, <$fh> line 1572.
Wide character in print at /usr/bin/sa-compile line 433, <$fh> line 1573.
Reproducible: Always
Simply running sa-compile with the current Perl triggers this problem. Is it an re2c bug?
What makes you think this is a Gentoo-specific bug as opposed to an upstream SpamAssassin bug?
Also, what are the exact USE flags you have set, and what are the steps needed to reproduce this? I ran sa-compile and don't see any errors in the output.
Please post your $(emerge --info mail-filter/spamassassin) output.
Created attachment 732527: emerge --info as requested
Today, "sa-update && sa-compile" shows the problem; no external rule sets are needed.
It's not a Gentoo-specific problem. I found that the OLEVBMacro plugin is missing dev-perl/Archive-Zip and dev-perl/IO-String in RDEPEND. I will try to build SpamAssassin trunk here with the RDEPEND fix: fetch everything with wget, make the dist tarball, and build it locally, so I can see whether that solves it. Perl can work around it with -CSDA, which makes it use Unicode (UTF-8) everywhere, but I don't know whether that can be done in sa-compile. The main problem is "use bytes;", hmm.
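The affected lines can be detected with: perl -ne 'print "$. $_" if m/[\x80-\xFF]/' <FILENAME.CF> (this prints every line that contains a byte in the range 0x80-0xFF, prefixed with its line number).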
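For anyone who would rather script the same check, here is a rough Python sketch of what the Perl one-liner above does: it reads a rule file as raw bytes and reports every line containing a byte in the range 0x80-0xFF. The function names `find_high_byte_lines` and `scan_rule_file` are made up for illustration and are not part of SpamAssassin or sa-compile.

```python
import re
import sys
from pathlib import Path

# Same byte range as the Perl pattern m/[\x80-\xFF]/ (anything outside 7-bit ASCII).
HIGH_BYTE = re.compile(rb"[\x80-\xff]")


def find_high_byte_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines containing a byte >= 0x80.

    Lines are split on "\n" only, so the numbering matches Perl's $. counter.
    """
    hits = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if HIGH_BYTE.search(line):
            hits.append((number, line))
    return hits


def scan_rule_file(path):
    """Hypothetical helper: read one rule file as raw bytes and scan it."""
    return find_high_byte_lines(Path(path).read_bytes())


if __name__ == "__main__":
    # Usage (paths are whatever .cf files you want to check):
    #   python3 scan_cf.py /path/to/file.cf [...]
    for cf in sys.argv[1:]:
        for number, line in scan_rule_file(cf):
            # Decode leniently for display only; the check itself is byte-based.
            print(f"{cf}:{number}: {line.decode('utf-8', errors='replace')}")
```

Like the one-liner, this only locates non-ASCII bytes in the rule files. It does not tell you whether sa-compile will warn about them.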
(In reply to Benny Pedersen from comment #6)
> its not gentoo specifik problem
>
> i found that OLEVBMacro plugin miss
>
> dev-perl/Archive-Zip
> dev-perl/IO-String
>
> in RDEPEND

Hi Benny. All the optional plugins, like OLEVBMacro, depend on you installing the dependencies yourself. Since they're optional, their dependencies aren't installed for everyone. And my understanding is I'm not supposed to add USE flags for them since they're runtime dependencies:

> The usage of a USE flag should not control runtime dependencies when the package does not link to it. Doing so will create extra configuration for the package and re-compilation for no underlying file change on disk. [1]

You'll note that right above where you enable OLEVBMacro in v343.pre, there's a comment about it:

> # OLEVBMacro - Detects both OLE macros and VB code inside Office documents
> #
> # It tries to discern between safe and malicious code but due to the threat
> # macros present to security, many places block these type of documents outright.
> #
> # For this plugin to work, Archive::Zip and IO::String modules are required.
> # loadplugin Mail::SpamAssassin::Plugin::OLEVBMacro

I'm unable to reproduce the bug yet on my machine. Your perl snippet above matches on 50_scores.cf (for example):

==============================================================================
$ perl -ne 'print "$. $_" if m/[\x80-\xFF]/' /var/lib/spamassassin/3.004006/updates_spamassassin_org/50_scores.cf
526 # Validity (née ReturnPath) Certified
==============================================================================

But I don't get the error from an sa-compile run: https://pastebin.com/raw/HXvV2rXL

[1] https://devmanual.gentoo.org/general-concepts/use-flags/index.html#when-not-to-use-use-flags
I noticed you're set to LANG="C". I've seen references saying that "C" only allows pure ASCII characters. I'm not 100% sure whether that's true, but it's certainly within the realm of possibility. It would also match your statement that the "main problem is that use bytes". Perhaps switching to LANG="C.utf8" or LANG="da_DK.utf8" would allow more characters?
I should have included these in that last comment: https://wiki.gentoo.org/wiki/UTF-8 https://wiki.gentoo.org/wiki/Localization/Guide
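To illustrate the character-set point behind the LANG="C" suggestion, here is a small Python sketch. It only shows that the "née" line quoted from 50_scores.cf cannot be represented in plain ASCII but can in UTF-8. It is an assumption that "ascii" is a fair stand-in for the character set of LANG="C" and "utf-8" for that of LANG="C.utf8" or LANG="da_DK.utf8". The sketch says nothing about how Perl's I/O layers actually behave inside sa-compile, and `can_encode` is a made-up helper name.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Line 526 of 50_scores.cf, as quoted earlier in this thread.
line = "# Validity (née ReturnPath) Certified"

# Assumption: "ascii" approximates the LANG="C" character set,
# "utf-8" approximates LANG="C.utf8" / LANG="da_DK.utf8".
for encoding in ("ascii", "utf-8"):
    print(encoding, can_encode(line, encoding))
```

If the locale really is the trigger, running "locale" before and after changing LANG, then re-running sa-compile, would be the direct way to confirm it.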