Gentoo's Bugzilla – Attachment 67008 Details for
Bug 103947
bfilter-0.9.4.ebuild (New Package)
bfilter.8
bfilter.8 (text/plain), 9.21 KB, created by
Alan Swanson
on 2005-08-27 12:17:11 UTC
.\" Man Page for BFILTER
.\" groff -man -Tascii bfilter.8

.TH BFILTER 8 "August 2005"

.SH NAME
bfilter \- An ad-filtering web proxy using heuristic ad-detection algorithms

.SH SYNOPSIS
.B bfilter
[-c DIRECTORY]
[-r DIRECTORY]
[-u USER]
[-g GROUP]
[-n]
[-h]
[-v]

.SH "DESCRIPTION"
.PP
.B bfilter
is a web proxy that uses effective heuristic ad-detection algorithms to remove
banner adverts, popups and webbugs from web pages. The traditional
blocklist-based approach is also implemented, but it is mostly used for dealing
with false positives. Unlike other tools that require constant updates of their
blocklists, bfilter manages to remove over 90% of adverts even with an empty
blocklist!
.P
All processing is done on the fly; it doesn't load the whole page or image
before processing. It uses heuristic and regex-based approaches to detect
adverts and webbugs. It also uses a Javascript engine to combat Javascript
generated adverts and popups.
.P
The web proxy supports the following features:
.PP
.B o
HTTP/0.9 - HTTP/1.1 support
.br
.B o
Persistent connections (HTTP/1.1 only)
.br
.B o
Pipelining (HTTP/1.1 only)
.br
.B o
HTTP compression
.br
.B o
Forwarding to another proxy
.P
However, it does
.B not
support CONNECT requests typically used for HTTPS.

.SH OPTIONS
.TP
.B -c [...]
Set custom config directory
.TP
.B -r [...]
Set chroot directory
.TP
.B -u [...]
Set unprivileged user
.TP
.B -g [...]
Set unprivileged group
.TP
.B -n
Disable background daemon mode
.TP
.B -h
Show help
.TP
.B -v
Print version

.SH RESOURCES
.HP
.B /etc/bfilter/config
.br
.I listen_address = host:port
.br
The address to bind the proxy to. If unspecified, bind to all interfaces.
.br
.I client_compression = yes | no
.br
If set to yes, all the textual data with "Content-Type: text/*" will be
compressed before sending it to the client.
This option can be useful if you
are on a slow connection and you set up bfilter somewhere on a fast connection.
In other cases, setting this option to yes will just introduce additional
latency to the loading process.
.br
.I ad_border = rrggbb | none
.br
The default behavior is to draw borders around removed adverts. You may want
to change the border color or turn the borders off.
.br
.I no_flash = yes | no
.br
This option is for people who don't want to install a Flash plugin and don't
want to be constantly prompted to do so. Setting it to yes will cause all
Flash objects to be replaced with transparent GIFs. (You can't use rules to
achieve the same effect because a Flash advert is normally replaced with a
blank Flash object that loads the original into itself when you click on it.)
.br
.I use_proxy = yes | no
.br
.I proxy_host = host
.br
.I proxy_port = port
.br
When use_proxy is set to yes, you may specify a proxy for bfilter to forward
requests to.
.br
.I no_proxy_for = host, host, host
.br
When use_proxy is set to yes, you may specify some hosts to be contacted
directly. The separator may be either a comma or a semicolon. If a host starts
or ends with a dot, it is assumed that any prefix or suffix can be appended to
it (for example, "no_proxy_for = .mydomain.com, 192.168."). Note, however,
that .mydomain.com won't cover mydomain.com itself but only its subdomains.
(When matching no_proxy_for hosts, no DNS queries are made. That means
127.0.0.1 won't act as localhost or the other way around.)

.HP
.B /etc/bfilter/rules
.br
.I filter=0|1
.br
Enable filtering.
.br
0: Serve the page as is
.br
1: (Default) Check for ads and apply the appropriate transformations
.br
.I ad=0|1|2
.br
Advert detection options.
.br
0: (Default) Standard procedure for is_ad decision
.br
1: Force negative is_ad decision
.br
2: Force positive is_ad decision
.br
.I scripts=0|1|2|3|4|5|6|7
.br
Javascript filtering options.
The default value of 3 is effective against
js-generated ads, but breaks some sites that depend heavily on
Javascript. Fortunately, the built-in Javascript engine mostly solves this
problem.
.br
0: Leave as is
.br
1: Remove 3rd party scripts except in header
.br
2: Remove 3rd party scripts from everywhere
.br
3: (Default) Only allow scripts in header and those 1st party scripts that
don't contain ".write"
.br
4: Only allow scripts in header and those 1st party scripts that contain
"function "
.br
5: Only allow scripts in header
.br
6: Only allow 1st party scripts and only in header
.br
7: Remove all scripts
.br
.br
.I jsengine=0|1
.br
Enable Javascript engine. When the Javascript engine is used, the scripts
parameter is ignored. The output of a script (generated by document.write or
writeln) is directed to the standard advert detector. If it detects an advert,
the script gets removed.
.br
0: Don't use
.br
1: (Default) Use if possible
.br
.I target_blank=0|1
.br
New-window attribute option for links. A link may be marked to be opened in a
new window if target="_blank" is specified as an attribute of an <A> tag.
.br
0: (Default) Leave as is
.br
1: Remove attribute
.br
.I [regex]
.br
For applying specific options to specific sites. Used after defaults have been
set up. See
.B RULES
section for further information.
.br
.HP
.B /etc/bfilter/rules.local
.br
For local rules and redefining the global parameters. Uses the same syntax as
for the global rules file.

.SH RULES
Rules are used for blocking ads which aren't automatically detected and/or for
dealing with false positives. The rule format is:
.P
[regex]
.br
param1=val1
.br
param2=val2
.P
The regex gets converted to "^http://"+regex+"$" and uses the POSIX extended
syntax. For those inexperienced with regular expressions, a few explanations:

.B .
means any character
.br
.B \e.
means the "." character
.br
.B \e?
means the "?"
character
.br
.B .*
means any number of any characters including none
.br
.B (this|that)
means "this" or "that"
.br
.B (something)?
means "something" or nothing
.P
You may use any of the global parameters such as filter, ad, scripts or jsengine
in rules. The parameters you don't specify are implicitly set to the
corresponding default value.
.P
It is possible to have several rules match a single URL. In this case the lowest
values for each parameter are used. That is, the values for different parameters
may be taken from different rules.

.SH RULES RELATIONSHIP
.B Question:
What is the relationship between rules and rules.local files? Do records in
rules.local override the ones in rules or supplement them?
.br
.B Answer:
It's a rather complex relationship which will be shown in the following
example.
.HP
Suppose the rules file looks like this:
.br
filter=1
.br
jsengine=1
.br
# Other parameters are omitted
.br
[regex1]
.br
filter=0
.HP
And the rules.local file looks like this:
.br
jsengine=0
.br
[regex2]
.br
filter=0
.P
First of all, the default
.I filter=1
parameter from rules is also implicitly present in rules.local as it's not
overridden there. Then, although only one parameter is associated with each
regex in this example, all of the other parameters are also implicitly
associated with them and their values are taken from defaults of the
corresponding file. So in reality the [regex1] record also contains
.I jsengine=1
and the [regex2] record also contains
.I jsengine=0.
.P
Now suppose we want to get the jsengine parameter for a URL that matches
regex1. First we look for a matching regex in rules.local. Having found none
we continue to look in rules where we find the [regex1] record that matches the
given URL. This record has an implicit
.I jsengine=1
parameter which we were looking for.
If our URL doesn't match any of the
regexes, we take the default parameter from rules.local which is
.I jsengine=0
\/.

.SH EXAMPLES
.B 1)
All images from hosts or paths with standard advert hostnames or paths are
classified as adverts and filtered.
.P
[(.*/)?banners?(/|\\.).*]
.br
ad=2
.br
[(.*/)?ad[sv]?(/|\\.).*]
.br
ad=2
.br
[(.*\\.)?ad[0-9]*\\..*]
.br
ad=2
.P
.B 2)
Allow images from the distributed content provider Akamai.
.P
[.*\\.akamai.net/.*]
.br
ad=1
.P
.B 3)
Disable the Javascript engine for the Hitweb tracker and use the scripts
setting for filtering instead.
.P
[(www\\.)?hitweb\\.info/Download\\.asp\\?\/.*]
.br
jsengine=0
.P
.B 4)
Allow images used to count page views for projects hosted on SourceForge.
.P
[(www\\.)?sourceforge.net/sflogo.php\\?.*]
.br
ad=1

.SH CONTROLLING
Restart bfilter to reload configuration files.
.P
Sending a
.B SIGUSR1
to all bfilter processes will cause the child processes only to exit after
handling their last request.

.SH NOTES
If the HTML processor is in doubt about an image or a Flash file, it defers
the decision until the browser has requested that file. The response is then
analyzed (redirects, cookies) as well as the file itself. For an image, the
analyzer checks its dimensions and whether it's animated or not. For Flash
files, the analyzer tries to find a button that covers most of the object's
area and has a getURL action associated with it. Depending on the results,
the object is either forwarded to the client, or substituted with a generated
replacement. (Unfortunately, analyzing objects that are placed with Javascript
doesn't work, as their URLs in Javascript source cannot be altered.)
.SH BUGS
Please report any bugs you may find to:
.P
.B http://sourceforge.net/projects/bfilter

.SH AUTHOR
Joseph Artsimovich <joseph_a@mail.ru>
.br
http://bfilter.sourceforge.net

.SH SEE ALSO
regex(7)
.I http://mozilla.org/js/spidermonkey/
.I http://www.iki.fi/vl/tre/
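The RULES section says that each rule regex is wrapped as "^http://"+regex+"$", that unspecified parameters take their default values, and that when several rules match one URL the lowest value of each parameter wins. The following is a minimal sketch of that description only, not bfilter's actual code. It assumes Python's re module as a stand-in for POSIX extended syntax, treats all rules as coming from a single file (the rules/rules.local lookup order is not modelled), and the names DEFAULTS, RULES and resolve are hypothetical. The default values and the two sample rules are taken from the RESOURCES and EXAMPLES sections.

```python
import re

# Defaults as documented for /etc/bfilter/rules in the RESOURCES section.
DEFAULTS = {"filter": 1, "ad": 0, "scripts": 3, "jsengine": 1, "target_blank": 0}

# Two rules from the EXAMPLES section (hypothetical in-memory representation).
RULES = [
    (r"(.*/)?banners?(/|\.).*", {"ad": 2}),  # force positive is_ad decision
    (r".*\.akamai.net/.*", {"ad": 1}),       # force negative is_ad decision
]


def resolve(url, rules=RULES, defaults=DEFAULTS):
    """Return the effective parameters for url under the documented semantics."""
    matched = []
    for regex, params in rules:
        # The man page states the regex becomes "^http://" + regex + "$".
        # Assumption: Python's re is close enough to POSIX ERE for these patterns.
        if re.search("^http://" + regex + "$", url):
            # Parameters a rule does not specify fall back to the defaults.
            matched.append({**defaults, **params})
    if not matched:
        return dict(defaults)
    # Several rules may match; the lowest value of each parameter is used.
    return {name: min(rule[name] for rule in matched) for name in defaults}


if __name__ == "__main__":
    print(resolve("http://img.akamai.net/banners/x.gif"))
```

In this sketch a URL such as http://img.akamai.net/banners/x.gif matches both sample rules, so ad resolves to the lower of 2 and 1, while every other parameter keeps its default.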