Go to:
Gentoo Home
Documentation
Forums
Lists
Bugs
Planet
Store
Wiki
Get Gentoo!
Gentoo's Bugzilla – Attachment 70496 Details for
Bug 109061
webdruid-0.5.4.ebuild (new package)
Home
|
New
–
[Ex]
|
Browse
|
Search
|
Privacy Policy
|
[?]
|
Reports
|
Requests
|
Help
|
New Account
|
Log In
[x]
|
Forgot Password
Login:
[x]
files/0.5.4/webdruid.conf
webdruid.conf (text/plain), 24.70 KB, created by
Niko
on 2005-10-12 14:59:00 UTC
(
hide
)
Description:
files/0.5.4/webdruid.conf
Filename:
MIME Type:
Creator:
Niko
Created:
2005-10-12 14:59:00 UTC
Size:
24.70 KB
patch
obsolete
># ># Sample WebDruid configuration file ># Copyright 2003, 2004 by Fabien Chevalier <fabien@juliana-multimedia.com> ># ># Distributed under the GNU General Public License. See the ># files "Copyright" and "COPYING" provided with the webalizer ># distribution for additional information. ># ># This is a sample configuration file for the WebDruid (ver 0.5.0) ># Lines starting with pound signs '#' are comment lines and are ># ignored. Blank lines are skipped as well. Other lines are considered ># as configuration lines, and have the form "ConfigOption Value" where ># ConfigOption is a valid configuration keyword, and Value is the value ># to assign that configuration option. Invalid keyword/values are ># ignored, with appropriate warnings being displayed. There must be ># at least one space or tab between the keyword and its value. ># ># The WebDruid will look for a 'default' configuration ># file named "webdruid.conf" in the current directory, and if not found ># there, will look for "/etc/webdruid/webdruid.conf". > > ># LogFile defines the web server log(s) file(s) to use. ># To use multiple logs files (for instance to generate statistics for ># a cluster of Apache servers), you can use the directive LogFile more ># than oncee. ># If no 'LogFile' is specified ># here or on the command line, input will default to STDIN. If ># the log filename ends in '.gz' (ie: a gzip compressed file), it will ># be decompressed on the fly as it is being read. > >LogFile /var/log/apache/access_log > ># LogType defines the log type being processed. Normally, the WebDruid ># expects a CLF or Combined web server log as input. Using this option, ># you can process w3c logs (IIS) as well. Values can ># be 'clf' or 'w3c', with 'clf' the default. > >#LogType clf > ># OutputDir is where you want to put the output files. This should ># should be a full path name, however relative ones might work as well. ># If no output directory is specified, the current directory will be used. > >OutputDir DIR > >###################################################################### ># Development Area ># These options will be part of a future release of the WebDruid ># DO NOT EDIT >###################################################################### > ># Turns on new XML output engine. ># This is used for development tests. ># Defaults to no. > >#XMLOutput yes > ># Many people are only interested in a subset of WebDruid's statistics, ># or do not like the way output is done. ># So The WebDruid has the ability to support more than one output format. ># They are called 'Skins', as some popular music playing software ;-) ># Currently only one skin is shipped with, the 'classic' skin. ># You can also totally disable human readable output support, by ># setting 'Skin' to 'null'. You will then have to deal manually with raw ># XML files. > >#Skin null > >###################################################################### > ># Incremental processing allows multiple partial log files to be used ># instead of one huge one. Useful for large sites that have to rotate ># their log files more than once a month. The WebDruid will save its ># internal state before exiting, and restore it the next time run, in ># order to continue processing where it left off. This mode also causes ># The WebDruid to scan for and ignore duplicate records (records already ># processed by a previous run). See the README file for additional ># information. The value may be 'yes' or 'no', with a default of 'no'. ># The file 'incremental.1' is used to store the current state data, ># and is located in the output directory of the program. Please read at ># least the section on Incremental processing in the README file before ># you enable this option. > >#Incremental no > ># FontFace is the True Type font face name. This font is used for ># drawing text inside charts. This is NOT the file name, ># this is the name that you would use in your favourite word processor ># when selecting this font. True Type fonts are searched in the path ># defined by the GDFONTPATH environment variable if it is set. ># If GDFONTPATH is not set, The WebDruid will look for fonts in the ># following directories: ># - /usr/share/fonts/truetype/freefont ># - /usr/share/fonts/truetype ># The default fontname is "FreeSerif". It is part of the FreeFonts package, ># available at http://savannah.nongnu.org/projects/freefont/ . > >#FontFace FreeSerif > ># ReportTitle is the text to display as the title. The hostname ># (unless blank) is appended to the end of this string (seperated with ># a space) to generate the final full title string. ># Default is (for english) "Usage Statistics for". > >#ReportTitle Usage Statistics for > ># HostName defines the hostname for the report. This is used in ># the title, and is prepended to the URL table items. This allows ># clicking on URL's in the report to go to the proper location in ># the event you are running the report on a 'virtual' web server, ># or for a server different than the one the report resides on. ># If not specified here, or on the command line, webalizer will ># try to get the hostname via a uname system call. If that fails, ># it will default to "localhost". > >#HostName localhost > ># HTMLExtension allows you to specify the filename extension to use ># for generated HTML pages. Normally, this defaults to "html", but ># can be changed for sites who need it (like for PHP embeded pages). > >#HTMLExtension html > ># PageType lets you tell the WebDruid what types of URL's you ># consider a 'page'. Most people consider html and cgi documents ># as pages, while not images and audio files. If no types are ># specified, defaults will be used ('htm*','php*', 'cgi' and HTMLExtension ># if different for web logs). > >PageType htm* >PageType cgi >#PageType phtml >#PageType php3 >#PageType pl >#PageType jsp > ># UseHTTPS should be used if the analysis is being run on a ># secure server, and links to urls should use 'https://' instead ># of the default 'http://'. If you need this, set it to 'yes'. ># Default is 'no'. This only changes the behaviour of the 'Top ># URL's' table. > >#UseHTTPS no > ># DNSCache specifies the DNS cache filename to use for reverse DNS lookups. ># This file must be specified if you wish to perform name lookups on any IP ># addresses found in the log file. If an absolute path is not given as ># part of the filename (ie: starts with a leading '/'), then the name is ># relative to the default output directory. See the DNS.README file for ># additional information. > >#DNSCache dns_cache.db > ># DNSChildren allows you to specify how many "children" processes are ># run to perform DNS lookups to create or update the DNS cache file. ># If a number is specified, the DNS cache file will be created/updated ># each time the WebDruid is run, immediately prior to normal processing, ># by running the specified number of "children" processes to perform ># DNS lookups. If used, the DNS cache filename MUST be specified as ># well. The default value is zero (0), which disables DNS cache file ># creation/updates at run time. The number of children processes to ># run may be anywhere from 1 to 100, however a large number may effect ># normal system operations. Reasonable values should be between 5 and ># 20. See the DNS.README file for additional information. > >#DNSChildren 0 > ># HTMLPre defines HTML code to insert at the very beginning of the ># file. Default is the DOCTYPE line shown below. Max line length ># is 80 characters, so use multiple HTMLPre lines if you need more. > >#HTMLPre <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> > ># HTMLHead defines HTML code to insert within the <HEAD></HEAD> ># block, immediately after the <TITLE> line. Maximum line length ># is 80 characters, so use multiple lines if needed. > >#HTMLHead <META NAME="author" CONTENT="The WebDruid"> > ># HTMLBody defined the HTML code to be inserted, starting with the ># <BODY> tag. If not specified, the default is shown below. If ># used, you MUST include your own <BODY> tag as the first line. ># Maximum line length is 80 char, use multiple lines if needed. > >#HTMLBody <BODY BGCOLOR="#E8E8E8" TEXT="#000000" LINK="#0000FF" VLINK="#FF0000"> > ># HTMLPost defines the HTML code to insert immediately before the ># first <HR> on the document, which is just after the title and ># "summary period"-"Generated on:" lines. If anything, this should ># be used to clean up in case an image was inserted with HTMLBody. ># As with HTMLHead, you can define as many of these as you want and ># they will be inserted in the output stream in order of apperance. ># Max string size is 80 characters. Use multiple lines if you need to. > >#HTMLPost <BR CLEAR="all"> > ># HTMLTail defines the HTML code to insert at the bottom of each ># HTML document, usually to include a link back to your home ># page or insert a small graphic. It is inserted as a table ># data element (ie: <TD> your code here </TD>) and is right ># alligned with the page. Max string size is 80 characters. > >#HTMLTail <IMG SRC="msfree.png" ALT="100% Micro$oft free!"> > ># HTMLEnd defines the HTML code to add at the very end of the ># generated files. It defaults to what is shown below. If ># used, you MUST specify the </BODY> and </HTML> closing tags ># as the last lines. Max string length is 80 characters. > >#HTMLEnd </BODY></HTML> > ># The Quiet option suppresses output messages... Useful when run ># as a cron job to prevent bogus e-mails. Values can be either ># "yes" or "no". Default is "no". Note: this does not suppress ># warnings and errors (which are printed to stderr). > >#Quiet no > ># ReallyQuiet will supress all messages including errors and ># warnings. Values can be 'yes' or 'no' with 'no' being the ># default. If 'yes' is used here, it cannot be overriden from ># the command line, so use with caution. A value of 'no' has ># no effect. > >#ReallyQuiet no > ># TimeMe allows you to force the display of timing information ># at the end of processing. A value of 'yes' will force the ># timing information to be displayed. A value of 'no' has no ># effect. > >#TimeMe no > ># GMTTime allows reports to show GMT (UTC) time instead of local ># time. Default is to display the time the report was generated ># in the timezone of the local machine, such as EDT or PST. This ># keyword allows you to have times displayed in UTC instead. Use ># only if you really have a good reason, since it will probably ># screw up the reporting periods by however many hours your local ># time zone is off of GMT. > >#GMTTime no > ># Debug prints additional information for error messages. This ># will cause webalizer to dump bad records/fields instead of just ># telling you it found a bad one. As usual, the value can be ># either "yes" or "no". The default is "no". It shouldn't be ># needed unless you start getting a lot of Warning or Error ># messages and want to see why. (Note: warning and error messages ># are printed to stderr, not stdout like normal messages). > >#Debug no > ># FoldSeqErr forces the WebDruid to ignore sequence errors. ># This is useful for Netscape and other web servers that cache ># the writing of log records and do not guarentee that they ># will be in chronological order. The use of the FoldSeqErr ># option will cause out of sequence log records to be treated ># as if they had the same time stamp as the last valid record. ># Default is to ignore out of sequence log records. > >#FoldSeqErr no > ># VisitTimeout allows you to set the default timeout for a visit ># (sometimes called a 'session'). The default is 30 minutes, ># which should be fine for most sites. ># Visits are determined by looking at the time of the current ># request, and the time of the last request from the site. If ># the time difference is greater than the VisitTimeout value, it ># is considered a new visit, and visit totals are incremented. ># Value is the number of seconds to timeout (default=1800=30min) > >#VisitTimeout 1800 > ># IgnoreHist shouldn't be used in a config file, but it is here ># just because it might be usefull in certain situations. If the ># history file is ignored, the main "index.html" file will only ># report on the current log files contents. Usefull only when you ># want to reproduce the reports from scratch. USE WITH CAUTION! ># Valid values are "yes" or "no". Default is "no". > >#IgnoreHist no > >#This keyword is used to either enable or disable the creation >#and display of the path graph. Path is...really a path, i mean >#a list of pages the user went to. What is displayed is only >#paths that were taken more than once. The drawback of this >#criteria is that if 200 users follow exactly the same path >#in your site, except for the last pages, they won't appear >#here because their path was different at the end. If anyone >#has an good algorithm to solve this issue, he should drop me >#a note. Another issue is because of client side and proxy >#caching. You can never know when a user goes back in your site. >#Default is "yes". > >#PathGraph no > >#This keyword is used to limit the number of path displayed by >#the path graph. Indicating here an inapropriate value will >#create a graph difficult to read, and dot will spend a lot >#CPU to create it. Default is 10. > >#PathGraphMaxPaths 10 > >#This keyword is used to either enable or disable the creation >#and display of the users flow graph. This graph tries to show >#you the flows of the users in your web site. >#Each node represents a page, and each edge is balanced by the >#number of hits from a page to another page. Default is "yes". > >#UsersFlow no > >#This keyword is used to limit the number of edges displayed by >#the users flow graph. Indicating here an inapropriate value will >#create a graph difficult to read, and dot will spend a lot >#CPU to create it. Default is 30. > >#UsersFlowMaxEdges 30 > >#Full path to dot utility location. Useful only if PathGraph or >#UsersFlow is set to yes. Default is /usr/bin/dot > >#DotLocation /usr/local/bin/dot > ># Country Graph allows the usage by country graph to be disabled. ># Values can be 'yes' or 'no', default is 'yes'. > >#CountryGraph yes > ># DailyGraph and DailyStats allows the daily statistics graph ># and statistics table to be disabled (not displayed). Values ># may be "yes" or "no". Default is "yes". > >#DailyGraph yes >#DailyStats yes > ># HourlyGraph and HourlyStats allows the hourly statistics graph ># and statistics table to be disabled (not displayed). Values ># may be "yes" or "no". Default is "yes". > >#HourlyGraph yes >#HourlyStats yes > ># GraphLegend allows the color coded legends to be turned on or off ># in the graphs. The default is for them to be displayed. This only ># toggles the color coded legends, the other legends are not changed. ># If you think they are hideous and ugly, say 'no' here :) > >#GraphLegend yes > ># GraphLines allows you to have index lines drawn behind the graphs. ># I personally am not crazy about them, but a lot of people requested ># them and they weren't a big deal to add. The number represents the ># number of lines you want displayed. Default is 2, you can disable ># the lines by using a value of zero ('0'). [max is 20] ># Note, due to rounding errors, some values don't work quite right. ># The lower the better, with 1,2,3,4,6 and 10 producing nice results. > >#GraphLines 2 > ># The "Top" options below define the number of entries for each table. ># Defaults are Sites=30, URL's=30, Referrers=30 and Agents=15, and ># Countries=30. TopKSites and TopKURLs (by KByte tables) both default ># to 10, as do the top entry/exit tables (TopEntry/TopExit). The top ># search strings and usernames default to 20. Tables may be disabled ># by using zero (0) for the value. > >#TopSites 30 >#TopKSites 10 >#TopURLs 30 >#TopKURLs 10 >#TopReferrers 30 >#TopAgents 15 >#TopCountries 30 >#TopEntry 10 >#TopExit 10 >#TopSearch 20 >#TopUsers 20 > ># The All* keywords allow the display of all URL's, Sites, Referrers ># User Agents, Search Strings and Usernames. If enabled, a seperate ># HTML page will be created, and a link will be added to the bottom ># of the appropriate "Top" table. There are a couple of conditions ># for this to occur.. First, there must be more items than will fit ># in the "Top" table (otherwise it would just be duplicating what is ># already displayed). Second, the listing will only show those items ># that are normally visable, which means it will not show any hidden ># items. Grouped entries will be listed first, followed by individual ># items. The value for these keywords can be either 'yes' or 'no', ># with the default being 'no'. Please be aware that these pages can ># be quite large in size, particularly the sites page, and seperate ># pages are generated for each month, which can consume quite a lot ># of disk space depending on the traffic to your site. > >#AllSites no >#AllURLs no >#AllReferrers no >#AllAgents no >#AllSearchStr no >#AllUsers no > ># The WebDruid normally strips the string 'index.' off the end of ># URL's in order to consolidate URL totals. For example, the URL ># /somedir/index.html is turned into /somedir/ which is really the ># same URL. This option allows you to specify additional strings ># to treat in the same way. You don't need to specify 'index.' as ># it is always scanned for by The WebDruid, this option is just to ># specify _additional_ strings if needed. If you don't need any, ># don't specify any as each string will be scanned for in EVERY ># log record... A bunch of them will degrade performance. Also, ># the string is scanned for anywhere in the URL, so a string of ># 'home' would turn the URL /somedir/homepages/brad/home.html into ># just /somedir/ which is probably not what was intended. > >#IndexAlias home.htm >#IndexAlias homepage.htm > ># The Hide*, Group* and Ignore* and Include* keywords allow you to ># change the way Sites, URL's, Referrers, User Agents and Usernames ># are manipulated. The Ignore* keywords will cause The WebDruid to ># completely ignore records as if they didn't exist (and thus not ># counted in the main site totals). The Hide* keywords will prevent ># things from being displayed in the 'Top' tables, but will still be ># counted in the main totals. The Group* keywords allow grouping ># similar objects as if they were one. Grouped records are displayed ># in the 'Top' tables and can optionally be displayed in BOLD and/or ># shaded. Groups cannot be hidden, and are not counted in the main ># totals. The Group* options do not, by default, hide all the items ># that it matches. If you want to hide the records that match (so just ># the grouping record is displayed), follow with an identical Hide* ># keyword with the same value. (see example below) In addition, ># Group* keywords may have an optional label which will be displayed ># instead of the keywords value. The label should be seperated from ># the value by at least one 'white-space' character, such as a space ># or tab. ># ># The value can have either a leading or trailing '*' wildcard ># character. If no wildcard is found, a match can occur anywhere ># in the string. Given a string "www.yourmama.com", the values "your", ># "*mama.com" and "www.your*" will all match. > ># Your own site should be hidden >#HideSite *mrunix.net >#HideSite localhost > ># Your own site gives most referrals >#HideReferrer mrunix.net/ > ># This one hides non-referrers ("-" Direct requests) >#HideReferrer Direct Request > ># Usually you want to hide these >HideURL *.gif >HideURL *.GIF >HideURL *.jpg >HideURL *.JPG >HideURL *.png >HideURL *.PNG >HideURL *.ra > ># Hiding agents is kind of futile >#HideAgent RealPlayer > ># You can also hide based on authenticated username >#HideUser root >#HideUser admin > ># Grouping options >#GroupURL /cgi-bin/* CGI Scripts >#GroupURL /images/* Images > >#GroupSite *.aol.com >#GroupSite *.compuserve.com > >#GroupReferrer yahoo.com/ Yahoo! >#GroupReferrer excite.com/ Excite >#GroupReferrer infoseek.com/ InfoSeek >#GroupReferrer webcrawler.com/ WebCrawler > >#GroupUser root Admin users >#GroupUser admin Admin users >#GroupUser wheel Admin users > ># The following is a great way to get an overall total ># for browsers, and not display all the detail records. ># (You should use MangleAgent to refine further...) > >#GroupAgent MSIE Micro$oft Internet Exploder >#HideAgent MSIE >#GroupAgent Mozilla Netscape >#HideAgent Mozilla >#GroupAgent Lynx* Lynx >#HideAgent Lynx* > ># HideAllSites allows forcing individual sites to be hidden in the ># report. This is particularly useful when used in conjunction ># with the "GroupDomain" feature, but could be useful in other ># situations as well, such as when you only want to display grouped ># sites (with the GroupSite keywords...). The value for this ># keyword can be either 'yes' or 'no', with 'no' the default, ># allowing individual sites to be displayed. > >#HideAllSites no > ># The GroupDomains keyword allows you to group individual hostnames ># into their respective domains. The value specifies the level of ># grouping to perform, and can be thought of as 'the number of dots' ># that will be displayed. For example, if a visiting host is named ># cust1.tnt.mia.uu.net, a domain grouping of 1 will result in just ># "uu.net" being displayed, while a 2 will result in "mia.uu.net". ># The default value of zero disable this feature. Domains will only ># be grouped if they do not match any existing "GroupSite" records, ># which allows overriding this feature with your own if desired. > >#GroupDomains 0 > ># The GroupShading allows grouped rows to be shaded in the report. ># Useful if you have lots of groups and individual records that ># intermingle in the report, and you want to diferentiate the group ># records a little more. Value can be 'yes' or 'no', with 'yes' ># being the default. > >#GroupShading yes > ># GroupHighlight allows the group record to be displayed in BOLD. ># Can be either 'yes' or 'no' with the default 'yes'. > >#GroupHighlight yes > ># The Ignore* keywords allow you to completely ignore log records based ># on hostname, URL, user agent, referrer or username. I hessitated in ># adding these, since the WebDruid was designed to generate _accurate_ ># statistics about a web servers performance. By choosing to ignore ># records, the accuracy of reports become skewed, negating why I wrote ># this program in the first place. However, due to popular demand, here ># they are. Use the same as the Hide* keywords, where the value can have ># a leading or trailing wildcard '*'. Use at your own risk ;) > >#IgnoreSite bad.site.net >#IgnoreURL /test* >#IgnoreReferrer file:/* >#IgnoreAgent RealPlayer >#IgnoreUser root > ># The Include* keywords allow you to force the inclusion of log records ># based on hostname, URL, user agent, referrer or username. They take ># precidence over the Ignore* keywords. Note: Using Ignore/Include ># combinations to selectivly process parts of a web site is _extremely ># inefficent_!!! Avoid doing so if possible (ie: grep the records to a ># seperate file if you really want that kind of report). > ># Example: Only show stats on Joe User's pages... >#IgnoreURL * >#IncludeURL ~joeuser* > ># Or based on an authenticated username >#IgnoreUser * >#IncludeUser someuser > ># The MangleAgents allows you to specify how much, if any, The WebDruid ># should mangle user agent names. This allows several levels of detail ># to be produced when reporting user agent statistics. There are six ># levels that can be specified, which define different levels of detail ># supression. Level 5 shows only the browser name (MSIE or Mozilla) ># and the major version number. Level 4 adds the minor version number ># (single decimal place). Level 3 displays the minor version to two ># decimal places. Level 2 will add any sub-level designation (such ># as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will attempt to also add ># the system type if it is specified. The default Level 0 displays the ># full user agent field without modification and produces the greatest ># amount of detail. User agent names that can't be mangled will be ># left unmodified. > >#MangleAgents 0 > ># The Dump* keywords allow the dumping of Sites, URL's, Referrers ># User Agents, Usernames and Search strings to seperate tab delimited ># text files, suitable for import into most database or spreadsheet ># programs. > ># DumpPath specifies the path to dump the files. If not specified, ># it will default to the current output directory. Do not use a ># trailing slash ('/'). > >DumpPath /var/log/apache > ># The DumpHeader keyword specifies if a header record should be ># written to the file. A header record is the first record of the ># file, and contains the labels for each field written. Normally, ># files that are intended to be imported into a database system ># will not need a header record, while spreadsheets usually do. ># Value can be either 'yes' or 'no', with 'no' being the default. > >#DumpHeader no > ># DumpExtension allow you to specify the dump filename extension ># to use. The default is "tab", but some programs are pickey about ># the filenames they use, so you may change it here (for example, ># some people may prefer to use "csv"). > >#DumpExtension tab > ># These control the dumping of each individual table. The value ># can be either 'yes' or 'no'.. the default is 'no'. > >#DumpSites no >#DumpURLs no >#DumpReferrers no >#DumpAgents no >#DumpUsers no >#DumpSearchStr no > ># End of configuration file... Have a nice day!
You cannot view the attachment while viewing its details because your browser does not support IFRAMEs.
View the attachment on a separate page
.
View Attachment As Raw
Actions:
View
Attachments on
bug 109061
:
70492
|
70493
|
70494
|
70495
| 70496