I might be using Tenshi the wrong way, but I have it set up to report/trash stuff that I expect (and which is not critical), and to send emails for everything else. This has mainly helped me determine whether a service has died or is having problems. The problem is that some services are really noisy. Barnyard, for example (part of snort), kept dumping errors to the log when it couldn't read the snort log. When I checked my mail the next day, I had almost exceeded my quota.
What I propose is an option that limits the number of emails sent out per log line. That way, if barnyard starts dumping critical information every 10 seconds, I would get only, say, 10 emails at first, and then perhaps a notification from Tenshi that it will suppress output for that line, maybe sending only a summary email every N occurrences afterwards (or perhaps every N units of time, in case a service gets REALLY noisy). I'm not sure whether this just means I'm using Tenshi the wrong way, but if anyone has ideas to help me correct this behavior, just let me know.
Reproducible: Always
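To make the proposal concrete, here is a minimal Python sketch of the limiting logic described above: the first few occurrences of a line are mailed, then one suppression notice is sent, then only a summary every N occurrences or every N seconds. This is not Tenshi code and not an existing Tenshi option; `MailThrottle`, `decide`, `burst`, `every_n` and `every_seconds` are hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class _State:
    seen: int = 0              # total occurrences of this line so far
    suppressed: int = 0        # occurrences swallowed since the last notice/summary
    last_summary: float = 0.0  # clock value at the last notice/summary


class MailThrottle:
    """Hypothetical per-line mail limiter (an illustration, not part of Tenshi)."""

    def __init__(self, burst=10, every_n=100, every_seconds=3600.0,
                 clock=time.monotonic):
        self.burst = burst                  # mails sent before suppression starts
        self.every_n = every_n              # summary after this many suppressed lines
        self.every_seconds = every_seconds  # ...or after this much time
        self._clock = clock
        self._states = {}

    def decide(self, key):
        """Return 'send', 'notice', 'summary' or 'drop' for one occurrence of key."""
        state = self._states.setdefault(key, _State())
        now = self._clock()
        state.seen += 1
        if state.seen <= self.burst:
            return "send"
        if state.seen == self.burst + 1:
            # First occurrence past the burst: announce the suppression once.
            state.suppressed = 0
            state.last_summary = now
            return "notice"
        state.suppressed += 1
        if (state.suppressed >= self.every_n
                or now - state.last_summary >= self.every_seconds):
            state.suppressed = 0
            state.last_summary = now
            return "summary"
        return "drop"


if __name__ == "__main__":
    throttle = MailThrottle(burst=3, every_n=5)
    for i in range(1, 13):
        print(i, throttle.decide("barnyard: cannot read snort log"))
```

The sketch never resets a line's counters once the daemon goes quiet again; a real implementation would probably want to expire the state after some idle period so that a later, unrelated failure is mailed immediately.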
I don't think you really need that feature. Here are 3 suggestions on how to make things work without it:
- summarize as much as possible, so that messages are consolidated
- don't use [now] in queue specifications; have periodic bursts instead (especially for the alerts that you know will annoy you) and tune the crontab specifications accordingly
- use the 'set limit' option in order to prevent huge warnings
In my experience on all kinds of servers, tuning the regexp list and the queue timing accordingly solves this kind of problem, and there's no need to complicate things with a thresholding feature. Please let us know what you think; I'll be happy to assist you in getting a good setup.
Eric, any thoughts about my comment?
closing as NEEDINFO for now.
Sorry for taking so long to get back to you. About your suggestions:
1) summarize as much as possible: I do this, it's fantastic!
2) don't use [now]: I still need to use [now], mostly because I don't know what output to expect from certain programs (the kernel also has very diverse output). Mostly, though, [now] is my catchall, because if something out of the ordinary happens, I really want to know.
3) set limit: I haven't set this yet because the length of the emails hasn't been a problem so far, just the frequency.
I haven't had any more unexpectedly noisy daemons recently, so I haven't done anything to combat this problem. I suppose I could have an hourly queue for the catchall with a limit set, but it might hamper my response time if there were a real emergency. Anyway, thanks for the ideas.