Nagios and Munin

We finally got some server monitoring software running at A4D. Until now we’d been flying by the seat of our pants (not really; AWS has some rudimentary graphing functionality built in, but it’s not much use for finding out why a server crashed).

Now we’ve got Nagios and Munin running. Nagios is a poller that sends us emails when a server crashes, and Munin provides fancy graphs of resource usage for all our applications. Setting them up didn’t take long, and the ability to see all our servers’ health in real time on one page is priceless.
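
For a rough idea of what "poller" means here, this is a toy sketch in Python. It is not how Nagios is implemented; the names `tcp_check` and `poll_once` and the host `web1.example` are made up for illustration. It just shows the basic idea of checking each host on a schedule and only raising an alert when a host changes state.

```python
import socket
from typing import Callable, Dict, List, Tuple

Target = Tuple[str, int]


def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def poll_once(
    targets: List[Target],
    last_state: Dict[Target, bool],
    check: Callable[[str, int], bool] = tcp_check,
) -> List[str]:
    """Check every target once and return alert messages for state changes.

    last_state is updated in place so the next call can compare against it.
    """
    alerts = []
    for host, port in targets:
        up = check(host, port)
        # Assumption: a target we have never seen is treated as previously up,
        # so the first failed check produces a DOWN alert.
        was_up = last_state.get((host, port), True)
        if up != was_up:
            status = "RECOVERY" if up else "DOWN"
            alerts.append(f"{status}: {host}:{port}")
        last_state[(host, port)] = up
    return alerts
```

In a real setup you would call `poll_once` in a loop with a sleep between rounds and mail out whatever it returns. Nagios layers retries, escalation, and scheduling on top of this, which is why it is worth installing rather than rolling your own.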

These days, I’m fascinated by innovative approaches to examining logs; I’ve had Logstalgia running all week, and I’m fiddling with glTail. They’re more than just pretty pictures: they really let you visualize the hotspots that are causing headaches. Are there any other cool tools I’m missing? Someday soon, I’d like to plug in a few spare LCDs just to display pretty server stats :)
