2007/06/04

Ganglia Monitoring System

Ganglia is a very lightweight system for monitoring computer resources. It is mainly used in cluster environments to monitor the health of each node in a cluster.
Ganglia has 3 components: gmond, an agent that collects data about the machine it is running on; gmetad, a daemon that gathers the information from the gmond agents; and a PHP web interface for viewing the data.
How does it work?
1. gmond needs a configuration file, /etc/gmond.conf, which defines the cluster name and location and which kinds of data to collect from the host. This daemon needs to run on every machine in the cluster.
2. gmetad needs a configuration file, /etc/gmetad.conf, which lists the hosts of the cluster. The daemon keeps polling those hosts for their status and various other information and stores the results in an RRD database.
3. The web interface basically reads the stored data and presents it to the system administrator in a nice way.
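
To make the polling step a bit more concrete, here is a rough Python sketch of what a poller like gmetad does with one host. It assumes gmond is listening on its default TCP port 8649 and writes an XML dump (HOST elements containing METRIC elements with NAME and VAL attributes) to any client that connects. The function names, the sample XML and the host name "node1" are made up for illustration, and this is not how gmetad itself is implemented.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Connect to a gmond and read its XML dump until it closes the socket.

    Port 8649 is assumed to be the gmond default; adjust it if your
    gmond.conf says otherwise.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    # Return raw bytes so the XML parser can honor the declared encoding.
    return b"".join(chunks)


def parse_metrics(xml_data):
    """Return {host_name: {metric_name: value_as_string}} from a gmond dump."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        metrics = {}
        for metric in host.iter("METRIC"):
            metrics[metric.get("NAME")] = metric.get("VAL")
        result[host.get("NAME")] = metrics
    return result


# Hypothetical, heavily trimmed sample of the XML layout assumed above.
SAMPLE = """<GANGLIA_XML VERSION="3.0.4" SOURCE="gmond">
  <CLUSTER NAME="demo">
    <HOST NAME="node1" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.12" TYPE="float" UNITS=""/>
      <METRIC NAME="mem_free" VAL="204800" TYPE="uint32" UNITS="KB"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    for name, metrics in parse_metrics(SAMPLE).items():
        print(name, metrics)
```

Against a real node you would call parse_metrics(fetch_gmond_xml("node1")) with one of your own host names. A real poller would repeat this on a timer and write the values into RRD files instead of printing them.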
I'll let them run for a whole night and see whether there is any useful information tomorrow.