[lug] Open source tools to monitor distributed services

durist at frii.com durist at frii.com
Thu Nov 30 19:28:29 MST 2006


On Thursday 30 November 2006 18:10, Vince Dean wrote:

> I am managing a distributed system that depends
> on services running on several Unix hosts. I've found that one
> of the best ways to monitor the health of the system is to
> run independent tests:  to check that a port is open on a given
> host, a file has been modified in the last two hours, an
> HTTP URL can be retrieved, and so on.  There are a few dozen
> conditions, distributed among eight machines, that I want to
> check every few minutes.
>
> I'm using ad-hoc scripts and cron jobs but I feel the
> need for a more general, configurable solution.  I suspect that
> this is a well-studied problem.  Are there any solutions
> that you can recommend?

I've found that nagios (http://nagios.org/) works pretty well.
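
For comparison, the kinds of independent tests described above (port open,
file modified in the last two hours, HTTP URL retrievable) can be sketched
in a few lines of Python. This is only a rough illustration using the
standard library, not Nagios code; the function names, timeouts, and the
run_checks helper are all made up for the example.

```python
import http.client
import os
import socket
import time
import urllib.request


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def file_is_fresh(path, max_age_seconds=2 * 60 * 60):
    """Return True if path was modified within the last max_age_seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        # Missing or unreadable file counts as a failed check.
        return False
    return age <= max_age_seconds


def url_is_retrievable(url, timeout=10.0):
    """Return True if fetching url yields a 2xx HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses; ValueError
        # covers malformed URLs.
        return False


def run_checks(checks):
    """Run (name, zero-argument callable) pairs; return {name: bool}."""
    return {name: bool(check()) for name, check in checks}
```

A cron job could call run_checks with a list of such conditions every few
minutes and report any that come back False. A tool like nagios adds the
scheduling, configuration, and notification on top of checks like these.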

-- 
Dan Urist
durist at frii.com
