Get Started with Nagios and NRPE

Often things are installed, configured, and they work as they are supposed to for months. We lose track of them. Well, as sysadmins our job is to configure and deploy tens of servers for various workloads every month, and it’s often not possible for us to keep track of all these servers over time.

Then out of the blue in the middle of the night we get a call — “The server is down. Fix it!”

“But what’s wrong?”

“Not sure. You’re the engineer. Just fix it, please!”

Well, if only we could get a heads up :-(

Anyway, a cool solution that I recommend — in fact, configure and deploy — nowadays for any mission-critical server at a customer’s end is Nagios.

Nagios, as many of us know, is a popular open source monitoring tool. It “watches” hosts and services, and alerts users when things go wrong, and again when things are back on track. Well, we all know or have heard about it, but if you’re as busy (or as lazy) as me, you probably haven’t taken its capabilities very seriously so far.

But, honestly, it could come in very handy if you don’t wanna be woken up, unprepared, in the middle of the night with the only notes being: “something is wrong, fix it!”

Why Nagios? Because, it gives a heads up in times of trouble :-)

First, get it off the Net

…and on to your system.

To begin with, let’s create a directory:

root@tux-amit:~# mkdir -p /opt/tuxamit/nagios

But why create a directory, and that too outside my “home” turf, you ask? Well, sorry guys, I sometimes like to put on my g33k cap and install stuff from source. This is just to impress the heck out of the ones around ;-)

Anyway, let us now download the real deal, Nagios and its associated plugins, from its official website, and save both in the directory we created earlier. By the way, the last time I downloaded them, I got two files: nagios-3.0.1.tar.gz and nagios-plugins-1.4.11.tar.gz. Your version mileage may vary.

Take care of the prerequisites

Make sure Apache is working — by pointing your browser to http://localhost.

If you get an “Unable to connect” note or something similar in your browser immediately, that means it ain’t installed. And no, I won’t tell you how to do it; it’s a prerequisite after all ;-)

Next up, ensure that gcc and gd are installed. If not, you know what to do.

Why GCC, and GD, you ask? Well, we’ll compile the Nagios thingy, remember?
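If you’d rather not eyeball the prerequisites one by one, here’s a minimal sketch of a helper that reports which commands are missing from your PATH. The function name check_prereqs is my own invention, and it only covers things that show up as commands (gcc, make and so on); gd is a library, so you’d still have to check that one through your package manager.

```shell
# Report which of the named commands are missing from PATH.
# Returns 0 if all of them are present, 1 otherwise.
# (check_prereqs is a hypothetical helper name, not part of Nagios.)
check_prereqs() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

Something like check_prereqs gcc make would then print one “missing:” line per absent command and return non-zero if anything is not there.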

So, launch a terminal, su - to become root, and create the user and groups for Nagios:

root@tux-amit:~# useradd -m nagios
root@tux-amit:~# passwd nagios
root@tux-amit:~# groupadd nagcmd
root@tux-amit:~# usermod -a -G nagcmd nagios
root@tux-amit:~# usermod -a -G nagcmd apache
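To double-check that the group memberships actually took, a small sketch like the one below could help. It assumes your id command supports the -nG options (the common Linux ones do), and user_in_group is just a name I made up for illustration.

```shell
# Succeeds if user $1 is a member of group $2 (primary or supplementary).
# (user_in_group is a hypothetical helper name.)
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

# Example use, to be run after the usermod commands:
#   for u in nagios apache; do
#       user_in_group "$u" nagcmd || echo "$u is not in nagcmd yet"
#   done
```

If either user shows up as not being in nagcmd, the Web interface will most likely be unable to submit commands to Nagios later on.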

The real deal

Yup, the real deal is, well, installing Nagios, for now at least. Go ahead and run the following set of commands, one by one, in your terminal as root. [Don’t get scared of the creepy text that keeps crawling up your screen every time you run one of these ;-) I’m telling you, people get really impressed with your geekdom when they see such scary text scrolling automatically on your screen.]

root@tux-amit:~# tar xvf nagios-3.0.1.tar.gz
root@tux-amit:~# cd nagios-3.0.1
root@tux-amit:~# ./configure --with-command-group=nagcmd
root@tux-amit:~# make all
root@tux-amit:~# make install
root@tux-amit:~# make install-config
root@tux-amit:~# make install-commandmode
root@tux-amit:~# make install-init

All done. Now what?

Well, remember how I wanted you to check whether Apache (or any other web server for that matter) is installed on your system? Now comes the part the web server will play. Nagios, after all, has a Web interface…

root@tux-amit:~# make install-webconf
root@tux-amit:~# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin

“Wait a sec! What’s with that htpasswd command?”

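Well, using that we’re simply restricting the Web interface of Nagios to authorised users. Restart Apache afterwards (service httpd restart on Red Hat-style systems) so that the new configuration takes effect.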

Time to compile and install the Nagios plugins now. Yup, go ahead and run the following commands in your terminal as root, stepping back out of the Nagios source directory first.

root@tux-amit:~# cd ..
root@tux-amit:~# tar xvf nagios-plugins-1.4.11.tar.gz
root@tux-amit:~# cd nagios-plugins-1.4.11
root@tux-amit:~# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
root@tux-amit:~# make
root@tux-amit:~# make install

If ./configure appears to hang on a Red Hat-based system, run it again with the pthread workaround, and then repeat make and make install:

root@tux-amit:~# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-redhat-pthread-workaround

Ensure that the Nagios service is started when the system boots.

root@tux-amit:~# chkconfig --add nagios
root@tux-amit:~# chkconfig nagios on

Well, now that the service is configured to start at system boot, are we sure everything is the way we want it to be?

We’d better verify that there are no errors in the Nagios configuration file:

root@tux-amit:~# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Total Warnings: 0
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
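If you want a script to make the same call you just made by eye, here’s a rough sketch that reads the pre-flight output and succeeds only when it reports zero errors. It assumes the “Total Errors:” line looks exactly like the one shown above; preflight_ok is a name I picked for illustration.

```shell
# Read the output of "nagios -v" on stdin and succeed only if it
# contains a "Total Errors:" line with a count of 0.
# (preflight_ok is a hypothetical helper name.)
preflight_ok() {
    grep -Eq '^Total Errors:[[:space:]]+0[[:space:]]*$'
}

# Example use:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_ok \
#       && echo "config looks sane"
```

This only looks at the error count; warnings are left for you to read and judge.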

Cool! It’s now time to start nagios:

root@tux-amit:~# service nagios start

Now what? Remember the Web interface?

Log in to the Web interface

Nagios Web URL: http://localhost/nagios/. Use the user ID (nagiosadmin) and password that were created earlier.

Installation and Configuration of NRPE on host to be monitored

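You need the openssl-devel package installed to compile NRPE with SSL support. The monitored host also needs the nagios user and the Nagios plugins, set up the same way as on the Nagios server. Run ./configure from inside the extracted NRPE source directory:

root@server1:~# yum install openssl-devel
root@server1:~# ./configure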

General Options:
-------------------------
NRPE port:    5666
NRPE user:    nagios
NRPE group:   nagios
Nagios user:  nagios
Nagios group: nagios

root@server1:~# make all
root@server1:~# make install-plugin
root@server1:~# make install-daemon
root@server1:~# make install-daemon-config
root@server1:~# make install-xinetd

Post NRPE Configuration

Edit the xinetd NRPE entry to add the Nagios monitoring server to the only_from directive: open the /etc/xinetd.d/nrpe file and set the directive as follows:

only_from = 127.0.0.1 <nagios_ip_address>

Now, add an entry for the nrpe daemon to the /etc/services file:

nrpe      5666/tcp    # NRPE
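If you end up doing this on more than a couple of hosts, you’ll want the edit to be safe to repeat. Here’s a minimal sketch that appends the entry only when it isn’t already there. The function name ensure_nrpe_service is hypothetical, and the file is passed in as an argument so you can try it on a copy before pointing it at the real /etc/services.

```shell
# Append the NRPE entry to a services file unless one is already present.
# $1 = path to the services file (normally /etc/services).
# (ensure_nrpe_service is a hypothetical helper name.)
ensure_nrpe_service() {
    services_file=$1
    if ! grep -Eq '^nrpe[[:space:]]+5666/tcp' "$services_file"; then
        printf 'nrpe\t\t5666/tcp\t# NRPE\n' >> "$services_file"
    fi
}

# Example use (as root): ensure_nrpe_service /etc/services
```

Running it twice leaves a single nrpe line in the file.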

Restart xinetd and set it to start at boot:

root@server1:~# chkconfig xinetd on
root@server1:~# service xinetd restart

Test NRPE Daemon Install

Check that the NRPE daemon is running and listening on port 5666:

root@server1:~# netstat -at |grep nrpe

Output should be:

tcp    0    0 *:nrpe    *:*    LISTEN

Check that the NRPE daemon is functioning:

root@server1:~# /usr/local/nagios/libexec/check_nrpe -H localhost

The output should be the NRPE version:

NRPE v2.12

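The next step is to test the connection to the remote server’s NRPE daemon from the Nagios server (the check_nrpe plugin has to be installed there too, for example by running make install-plugin from the NRPE source):

root@tux-amit:~# /usr/local/nagios/libexec/check_nrpe -H <IP of Remote Server>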
NRPE v2.12

Create NRPE Command Definition

A command definition needs to be created in order for the check_nrpe plugin to be used by Nagios.

Open the /usr/local/nagios/etc/objects/commands.cfg file and add the following:

######################################################
# NRPE CHECK COMMAND
#
# Command to use NRPE to check remote host systems
######################################################

define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
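To see what this definition buys us, it helps to picture how Nagios fills in the macros when a service says check_nrpe!check_load. The sketch below only imitates that substitution for illustration; it is not how Nagios does it internally. It assumes $USER1$ points at /usr/local/nagios/libexec (the usual value in resource.cfg for a source install), and expand_check_nrpe is a made-up name.

```shell
# Illustration only: mimic how the macros in command_line get filled in.
# $1 = check_command value (e.g. check_nrpe!check_load)
# $2 = host address ($HOSTADDRESS$)
# $3 = plugin directory ($USER1$); assumed /usr/local/nagios/libexec if omitted
# (expand_check_nrpe is a hypothetical helper name.)
expand_check_nrpe() {
    arg1=${1#*!}        # drop the command name and the first "!"
    arg1=${arg1%%!*}    # keep only the first argument, i.e. $ARG1$
    address=$2
    user1=${3:-/usr/local/nagios/libexec}
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$address" "$arg1"
}

# Example use:
#   expand_check_nrpe 'check_nrpe!check_load' 192.168.0.5
```

In other words, whatever follows the exclamation mark is the name of a command that the NRPE daemon on the remote host must know about.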

Adding host in Nagios server to monitor

In order to be able to add the remote Linux machine to Nagios we need to create an object template file and add some object definitions.

Let us first create a new station1.example.com object file: open /usr/local/nagios/etc/objects/station1.example.com.cfg for editing, add the following, and replace the host_name, alias and address values with ones that match your setup:

** The "host_name" you set in the "define host" section must match the "host_name" in each "define service" section **
################
define host{
name                  station1              ; Name of this template
use                   generic-host          ; Inherit default values
check_period          24x7
check_interval        5
retry_interval        1
max_check_attempts    10
check_command         check-host-alive
notification_period   24x7
notification_interval 30
notification_options  d,r
contact_groups        admins
register              0          ; DONT REGISTER THIS - ITS A TEMPLATE
}

define host{
use       station1     ; Inherit default values from a template
host_name station1     ; The name we're giving to this server
alias     station1 ; A longer name for the server
address   192.168.0.5   ; IP address of the server
}

define service{
use                 generic-service
host_name           station1
service_description CPU Load
check_command       check_nrpe!check_load
}
define service{
use                 generic-service
host_name           station1
service_description Current Users
check_command       check_nrpe!check_users
}
define service{
use                 generic-service
host_name           station1
service_description /dev/sda1 Free Space
check_command       check_nrpe!check_sda1
}
define service{
use                 generic-service
host_name           station1
service_description Total Processes
check_command       check_nrpe!check_total_procs
}
define service{
use                 generic-service
host_name           station1
service_description Zombie Processes
check_command       check_nrpe!check_zombie_procs
}
####
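One thing to watch: every name after check_nrpe! has to exist as a command[...] entry in nrpe.cfg on the monitored host. As far as I remember, the sample nrpe.cfg that ships with NRPE defines check_users, check_load, check_hda1, check_zombie_procs and check_total_procs, but not check_sda1, so you’d probably need to add a line along the lines of command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1 there (the thresholds are my assumption, modelled on the sample check_hda1 entry). Here’s a rough sketch for spotting such gaps, assuming you have a copy of the remote nrpe.cfg at hand; missing_nrpe_commands is a hypothetical name.

```shell
# List NRPE command names used in a Nagios object file ($1) that have
# no matching command[...] line in an nrpe.cfg ($2).
# (missing_nrpe_commands is a hypothetical helper name.)
missing_nrpe_commands() {
    objects_cfg=$1
    nrpe_cfg=$2
    sed -n 's/^[[:space:]]*check_command[[:space:]][[:space:]]*check_nrpe!\([A-Za-z0-9_-]*\).*/\1/p' "$objects_cfg" |
    sort -u |
    while read -r name; do
        # Print the name only if nrpe.cfg has no definition for it
        grep -q "^command\[$name]=" "$nrpe_cfg" || echo "$name"
    done
}

# Example use:
#   missing_nrpe_commands station1.example.com.cfg nrpe.cfg
```

An empty result means every check referenced above has a counterpart on the remote side.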

Activate the station1.example.com.cfg object file: open /usr/local/nagios/etc/nagios.cfg and add:

cfg_file=/usr/local/nagios/etc/objects/station1.example.com.cfg
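Adding one cfg_file line per machine gets old quickly once you have dozens of servers. A possible shortcut, sketched below under my own assumptions, is to generate the host definitions from a plain list of “name address” pairs and let Nagios pick up a whole directory through the cfg_dir directive in nagios.cfg. The gen_hosts name, the servers.txt list and the servers directory are all hypothetical; the generated hosts reuse the station1 template from above.

```shell
# Emit one host definition per "name address" line read from stdin.
# Each host inherits from the station1 template defined earlier.
# (gen_hosts is a hypothetical helper name.)
gen_hosts() {
    while read -r name address; do
        [ -n "$name" ] || continue      # skip blank lines
        printf 'define host{\n'
        printf 'use       station1\n'
        printf 'host_name %s\n' "$name"
        printf 'alias     %s\n' "$name"
        printf 'address   %s\n' "$address"
        printf '}\n\n'
    done
}

# Example use (file and directory names are assumptions):
#   gen_hosts < servers.txt > /usr/local/nagios/etc/servers/hosts.cfg
#   and in nagios.cfg: cfg_dir=/usr/local/nagios/etc/servers
```

You’d still need service definitions for each generated host, and the usual pre-flight check before restarting.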

Verify Nagios Configuration Files:

root@tux-amit:~# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Total Warnings: 0
Total Errors:   0

Now that everything looks fine, let’s restart Nagios:

root@tux-amit:~# service nagios restart

All done! Please log in to the Nagios Web interface and check your remote server’s status :-)

8 Comments

  1. Abhishek says:

    thanks Amit :)

  2. Tux Amit says:

    You’re most welcome, Abhishek :-)

  3. Amit Singh says:

    Hi Amit,

    I was one of the student who got the opportunity to learn at Fostering Linux under the umbrella of Mr. Varad Gupta, Praveen Thakur and Ashu sir.

    I read this article about and i would like to know one thing about the configuration of messaging part, if anything goes wrong. And what if we need to add more than 50 servers ?

    If you can help me on this section than it will help me to implement this in my organization on the demo mode, Also would help me to increase my knowledge.

    Thanks & Regards,
    Amit kumar Singh

    • Tux Amit says:

      Dear Amit Singh,

      First of all, you truly are lucky to get such mentors.

      About your queries: which part of Nagios messaging would you like to know about, SMS or email? Secondly, whether you can add more than 50 servers depends on your Nagios server version.

      Please let me know if I can help you with this.

      Regards
      Tux Amit

  4. Prince says:

    thanks

  5. Rajeev Kumar says:

    Hey,
    Amit can you tell me from where can I download NRPE and RHEVM.
    Thanx

  6. Saurabh says:

    Very knowledgeable, Thanks.
