[lug] Open source tools to monitor distributed services
George Sexton
gsexton at mhsoftware.com
Fri Dec 1 12:21:11 MST 2006
Just for grins, one of the services we've been considering developing
is a monitoring service, run as an ASP. Perhaps we'd also bundle it
with secondary MX/DNS hosting.

We know there are a lot of monitoring solutions out there, but I know
from experience that they're fiddly and tough to set up.

What does everyone think? Would you pay for a monitoring service that
could verify that your DNS, SMTP, POP3/IMAP, and web services are running?
Would the additional service of secondary MX and DNS make it more useful?

Vince Dean wrote:
> I am managing a medium-sized distributed system that depends
> on services running on several Unix hosts. I've found that one
> of the best ways to monitor the health of the system is to
> run independent tests: to check that a port is open on a given
> host, a file has been modified in the last two hours, an
> HTTP URL can be retrieved, and so on. There are a few
> dozen conditions, distributed among eight machines, that I want to
> check every few minutes.
>
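
For illustration, here is a minimal sketch of three of the checks described
above (port open, file recently modified, HTTP URL retrievable), using only
the Python standard library. The function names, default timeouts, and the
two-hour default are assumptions made for this example; they are not taken
from any existing monitoring tool.

```python
import http.client
import os
import socket
import time
import urllib.request


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def file_fresh(path, max_age_seconds=2 * 60 * 60):
    """Return True if path exists and was modified within max_age_seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file counts as a failed check.
        return False
    return age <= max_age_seconds


def url_ok(url, timeout=10.0):
    """Return True if an HTTP GET of url completes with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses; ValueError
        # is raised for a malformed URL.
        return False
```

Each function returns a plain boolean and never raises for an ordinary
failure, so a cron-driven script can call them in a loop and treat False
as "test failed".
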
> I'm using ad-hoc scripts and cron jobs but I feel the
> need for a more general, configurable solution. I suspect that
> this is a well-studied problem. Are there any solutions
> that you can recommend?
>
> My ideal solution:
>   - is open source
>   - runs on Linux, but preferably is portable to other
>     Unix systems and Windows
>     (i.e. written in Java, Python, Ruby, or Perl)
>   - is easily configured for some standard types of tests:
>       - FTP server is running
>       - HTTP URL can be retrieved
>       - a given port is open on a given host
>       - a given file exists and has been recently modified
>       - a process is running with a given name
>       - etc.
>   - is easily extended by custom code to check for
>     application-specific conditions
>   - notifies by email and/or writes messages to a log file when a test fails
>   - checks at a configurable interval and suppresses redundant messages
>     (doesn't tell me the same service is down every minute)
>   - notifies me when a service is back up
>
> I'll be grateful for any suggestions.
>
> Vince
>
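
The "suppress redundant messages" and "notify when a service is back up"
requirements in the list above both come down to reporting state changes
rather than states. A rough sketch of that idea in Python follows. The
names StateTracker and run_once are hypothetical, and the sketch only
writes to a log; sending email would be a separate step.

```python
import logging


class StateTracker:
    """Remember the last result of each named check and report only changes."""

    def __init__(self):
        self._last_ok = {}

    def update(self, name, ok):
        """Record a result; return "DOWN", "UP", or None if nothing changed."""
        previous = self._last_ok.get(name)
        self._last_ok[name] = ok
        if previous is None:
            # First observation: report only if the check already fails.
            return None if ok else "DOWN"
        if previous and not ok:
            return "DOWN"
        if not previous and ok:
            return "UP"
        return None


def run_once(checks, tracker, logger=None):
    """Run every check once; log and return only the state changes.

    checks maps a name to a zero-argument callable returning True or False,
    so application-specific tests can be added as ordinary functions.
    """
    logger = logger or logging.getLogger("monitor")
    events = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that raises is treated as a failed check.
            ok = False
        event = tracker.update(name, ok)
        if event == "DOWN":
            logger.error("%s is DOWN", name)
        elif event == "UP":
            logger.info("%s is back UP", name)
        if event:
            events.append((name, event))
    return events
```

Calling run_once from a loop or a cron job at whatever interval suits
gives one message when a service goes down and one when it recovers.
Under cron, the tracker's state would have to be saved between runs (for
example in a small file), which this sketch does not do.
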
--
George Sexton
MH Software, Inc.
Voice: +1 303 438 9585
URL: http://www.mhsoftware.com/