The Rub


Poor Man’s FreeBSD gmirror Monitoring

gmirror is a simple-to-use RAID-1 implementation for FreeBSD. A common problem with RAID, in business and at home, is that nobody notices when a disk has failed, and RAID does little good if failed disks are not replaced.

When I run gmirror at work, I hook it up to our centralized monitoring (Nagios), and all is well. At home, I need a quick way to be alerted to disk failures, without spending a lot of time setting up (and maintaining) monitoring.  Here is the cron job that I use:

*/5 * * * * (gmirror status | grep -q COMPLETE) || (gmirror status | mail -s "Array on pepin is not COMPLETE" [email protected])

I like this solution because it is effective and does not require any third-party packages or external scripts. Of course, if the box is down, it doesn’t help. But if the box is down, you have a bigger problem.

Let’s dissect the cron job.

*/5 * * * *

Run every 5 minutes.

(gmirror status | grep -q COMPLETE)

‘gmirror status’ displays the status of the array. grep -q prints nothing; it searches the output for the string “COMPLETE” and exits with code 0 if it finds it. The whole command therefore returns exit code 0 if the gmirror array is COMPLETE.

One caveat here is that if you have multiple gmirrors, this will happily return success if any one of the arrays is COMPLETE, even if another is not. You probably need a cron line for each array, with an extra grep to isolate each array individually.
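As a rough sketch of one way around the multiple-array caveat, the status text can be filtered per mirror rather than grepped as a whole. The function name degraded_mirrors and the sample output are made up for illustration, and the three-column layout (name, status, components) is an assumption about what gmirror status prints on your system.

```shell
# Reads `gmirror status` text on stdin and prints the name of every
# mirror whose Status column is not COMPLETE.
# Assumes lines of the form "mirror/<name>  <STATUS>  <component> (...)",
# with any further components on continuation lines.
degraded_mirrors() {
    awk '$1 ~ /^mirror\// && $2 != "COMPLETE" { print $1 }'
}

# Hypothetical sample: gm0 is healthy, gm1 has lost a disk.
sample='      Name    Status  Components
mirror/gm0  COMPLETE  ada0 (ACTIVE)
                      ada1 (ACTIVE)
mirror/gm1  DEGRADED  ada2 (ACTIVE)'

printf '%s\n' "$sample" | degraded_mirrors   # prints mirror/gm1
```

On a real box the input would come from gmirror status instead of the sample, and an alert would be sent whenever the function prints anything. That needs a small script, though, which gives up the "no external scripts" property of the one-liner.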

|| (gmirror status | mail -s "Array on hostname is not COMPLETE" [email protected])

|| is an “or”. If the previous part is successful (returns 0), then the second part is not executed.
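To see the short-circuit behaviour without touching a real array, here is a small sketch that swaps gmirror status for canned text and mail for echo. The check function and both sample lines are stand-ins invented for this example.

```shell
# Stand-ins for `gmirror status` output; the layout is an assumption.
healthy='mirror/gm0  COMPLETE  ada0 (ACTIVE)'
degraded='mirror/gm0  DEGRADED  ada0 (ACTIVE)'

check() {
    # The right-hand side of || runs only when grep -q exits non-zero,
    # that is, when the text does not contain COMPLETE.
    printf '%s\n' "$1" | grep -q COMPLETE || echo "ALERT"
}

check "$healthy"    # prints nothing
check "$degraded"   # prints ALERT
```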

In the failure case, the output of gmirror status is simply mailed to the system administrator. I guarantee that an email every 5 minutes will be noticed.