[lug] Open source tools to monitor distributed services

Collins Richey crichey at gmail.com
Thu Nov 30 20:39:52 MST 2006

On 11/30/06, durist at frii.com <durist at frii.com> wrote:
> On Thursday 30 November 2006 18:10, Vince Dean wrote:
> > I am managing a distributed system that depends
> > on services running on several Unix hosts. I've found that one
> > of the best ways to monitor the health of the system is to
> > run independent tests:  to check that a port is open on a given
> > host, a file has been modified in the last two hours, an
> > HTTP URL can be retrieved, and so on.  There are a few dozen
> > conditions, distributed among eight machines, that I want to
> > check every few minutes.
> >
> > I'm using ad-hoc scripts and cron jobs but I feel the
> > need for a more general, configurable solution.  I suspect that
> > this is a well-studied problem.  Are there any solutions
> > that you can recommend?
> I've found that nagios (http://nagios.org/) works pretty well.
Big Sister meets most of your requirements. It's a PHP/web based
reporter with eminently customizable little agents that run on
individual servers. My associate is in the process of doing
customizations that will integrate with the HP OpenView system used at
the NOC (Network Operations Center) at corporate headquarters. The
main reporting screen is a matrix of green/yellow/red status buttons,
with drill-down for any server.
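
For anyone who wants to see what the independent tests in the original
question look like before committing to a full monitoring package, here
is a rough Python sketch of the three checks mentioned (port open, file
modified in the last two hours, HTTP URL retrievable). It is not how
nagios or Big Sister implement their agents; the function names, the
host names, the file path, and the URL are all made up for illustration,
and the OK/FAIL result strings are just an assumption about what a
status matrix might consume.

```python
import http.client
import os
import socket
import time
import urllib.request


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def file_fresh(path, max_age_seconds=2 * 60 * 60):
    """Return True if path exists and was modified within max_age_seconds."""
    try:
        return time.time() - os.path.getmtime(path) <= max_age_seconds
    except OSError:
        return False


def url_ok(url, timeout=5.0):
    """Return True if fetching url completes with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        return False


def run_checks(checks):
    """Run (name, callable) pairs and map each name to "OK" or "FAIL"."""
    results = {}
    for name, check in checks:
        results[name] = "OK" if check() else "FAIL"
    return results


if __name__ == "__main__":
    # Hypothetical hosts, path, and URL; substitute your own.
    checks = [
        ("db1 port 5432", lambda: port_open("db1.example.com", 5432)),
        ("nightly export", lambda: file_fresh("/var/spool/export/latest.dat")),
        ("web front page", lambda: url_ok("http://www.example.com/")),
    ]
    for name, status in run_checks(checks).items():
        print(f"{status:4}  {name}")
```

Run from cron every few minutes, something like this covers the simple
cases, but it has no alerting, history, or dependency handling, which is
where the packaged tools earn their keep.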


Collins Richey
     If you fill your heart with regrets of yesterday and the worries
     of tomorrow, you have no today to be thankful for.
