Nagios installation and configuration - The quick and DIRTY guide
------------------------------------------------------------------

Nagios is a commonly used free tool for monitoring systems and their devices 
over the network. It's probably one of the best, simplest, most expandable, and 
cheapest network monitoring tools out there.

This guide covers FreeBSD 6.2 with Nagios 2.6.

Install nagios from ports:-

# cd /usr/ports/net-mgmt/nagios
# make install

Nagios will then create a nagios user and group for itself.
You have to ensure that nagios is enabled at boot time with the following 
rc.conf entry:-

	nagios_enable="YES"
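If you want to double check that the entry really made it into rc.conf, something like the following sh function would do. It is only a sketch: rc_enabled is a made-up helper name (not part of FreeBSD), and it only looks for a plain, uncommented YES line.

```shell
#!/bin/sh
# rc_enabled FILE NAME
# Succeeds (exit 0) if FILE contains an uncommented NAME="YES" line.
# Hypothetical helper for illustration only.
rc_enabled() {
    grep -Eiq "^[[:space:]]*$2=\"?YES\"?[[:space:]]*(#.*)?\$" "$1" 2>/dev/null
}

# Example:
#   rc_enabled /etc/rc.conf nagios_enable || echo 'nagios_enable="YES" is missing'
```

The same check works for the apache and nrpe2 enable lines mentioned elsewhere in this guide.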

Nagios relies on a webserver (commonly apache) to display its monitoring 
results.

The following configuration lines in httpd.conf are important for nagios to 
work.
Nagios should be secure and ONLY accessed by people with the proper authority to 
do so.

So first we add some extra lines to implement password protection (HTTP basic 
authentication):-

----------------------------------------------------------
DocumentRoot /usr/local/www/nagios

<Directory "/usr/local/www/nagios">
	Options None
	AllowOverride None
	Order allow,deny
	Allow from all
	AuthName "Nagios Access"
	AuthType Basic
	AuthUserFile /usr/local/etc/nagios/htpasswd.users
	Require valid-user
</Directory>

<Directory "/usr/local/www/nagios/cgi-bin">
	Options ExecCGI
	AllowOverride None
	Order allow,deny
	Allow from all
	AuthName "Nagios Access"
	AuthType Basic
	AuthUserFile /usr/local/etc/nagios/htpasswd.users
	Require valid-user
</Directory>

ScriptAlias /nagios/cgi-bin/ /usr/local/www/nagios/cgi-bin/
Alias /nagios/ /usr/local/www/nagios/

----------------------------------------------------------

We will create a password file in the nagios configuration directory.
First we create the nagiosadmin user, mainly because the default configuration 
expects this user to be available for full CGI control of nagios (should you 
allow it later on).

# htpasswd -c /usr/local/etc/nagios/htpasswd.users nagiosadmin

You can create more users with the same syntax. Just drop the -c option, 
because it creates a new password file (overwriting any existing one), and we 
already have one now. Replace USERNAME below with the name of the new user:-

# htpasswd /usr/local/etc/nagios/htpasswd.users USERNAME

Nagios is configured by default to use authentication.

Apache will need a restart. Make sure you add apache_enable="YES" (or 
apache22_enable="YES" if you are using the latest apache) to /etc/rc.conf, and 
restart apache with the start script in /usr/local/etc/rc.d.
If log files get created in /var/log, you'll be able to check that it's up 
and running. If nothing happens at all, you've probably mistyped the enable 
setting in rc.conf.

From here, a bunch of sample configuration files can be found in 
/usr/local/etc/nagios.

Most of them will need to be "de-sampled" (copied to a name without the 
-sample suffix) before nagios can work in earnest, so that is the first step in 
configuring nagios. (Don't delete the samples. We may need them if we 
accidentally delete our real configurations.)

# cd /usr/local/etc/nagios
# cp cgi.cfg-sample cgi.cfg
# cp nagios.cfg-sample nagios.cfg
# cp commands.cfg-sample commands.cfg
# cp localhost.cfg-sample localhost.cfg
# cp resource.cfg-sample resource.cfg
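If you'd rather not type each copy by hand, a loop can do the same job. The sketch below is one way to do it, not the port's own method: desample is a made-up function name, and note that it copies every *-sample file it finds, which may be more than the five listed above. It never overwrites a file that already exists.

```shell
#!/bin/sh
# desample DIR
# Copy every FILE-sample in DIR to FILE, skipping targets that already exist
# so a real configuration is never overwritten. Prints each file it creates.
# Hypothetical helper for illustration only.
desample() {
    for sample in "$1"/*-sample; do
        [ -e "$sample" ] || continue      # no samples: the glob stayed literal
        target=${sample%-sample}          # strip the -sample suffix
        if [ ! -e "$target" ]; then
            cp "$sample" "$target" && echo "created $target"
        fi
    done
}

# Example:
#   desample /usr/local/etc/nagios
```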

Nagios itself can be started with the start script in /usr/local/etc/rc.d
(make sure that the nagios_enable="YES" /etc/rc.conf line has been added first)

Of course at this point, nagios is up and working but you have no configuration 
for any servers or services yet. So that's the next job.

nagios.cfg
---------------

The main configuration file for nagios. It mainly handles the location of log 
files and configuration files. A few changes will probably need to be made here. 

Nagios' configuration files can be made incredibly complex. You can either 
have everything in a single file, or you can split up your configurations into 
separate directories with separate log files. It's all a little overwhelming to 
configure how you want to configure things (at the risk of sounding redundant) 
in the nagios.cfg file.

In older versions of Nagios, the default install gave you lots of sample 
configuration files and templates to rely on when setting up hosts, groups, 
notifications, etc. Well, those days are gone. 
Have a look at the templates in localhost.cfg to see what can be done.

You can tell nagios to read specific files with object definition templates in 
them (with the cfg_file option) or you can specify whole directories of config 
files (with the cfg_dir option). You can list multiple files and directories 
for nagios to search. This means that you can organise object templates anywhere 
you like. Nagios can read whole directories of assorted files, and if it finds a 
template it recognises, it deals with it. As long as the template is a valid 
one, nagios doesn't care what the file is called, provided the file extension is 
.cfg and nagios is configured to read the file.
More on object definitions later.
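As a rough sketch, the relevant lines in nagios.cfg might look like the following. The "objects" directory is a made-up name for illustration; the port does not create it, so you would have to make it yourself.

```
# Read specific object definition files
cfg_file=/usr/local/etc/nagios/commands.cfg
cfg_file=/usr/local/etc/nagios/localhost.cfg

# Read every .cfg file found in a directory (hypothetical path)
cfg_dir=/usr/local/etc/nagios/objects
```

Both options can be repeated as many times as you like.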

check_external_commands

Allows you to specify whether or not Nagios should check for external commands. 
This will most likely need to be set to "1" if we intend to enable users in 
nagios to execute commands. The default options for external commands are good 
enough to leave alone.

cgi.cfg
------------

nagios_check_command 

This is something you may want to uncomment because it checks the
status of the nagios process. If you don't use it, you'll see warning messages 
in the CGIs about the Nagios process not running, and you won't be able to 
execute any commands from the web interface.

The following directives give privileges to users on the system. For the most 
part, you will want to uncomment and edit these, because by default, nagios 
won't let any user see any of the host information:-

authorized_for_system_information

List of users authorised to see Nagios process information from the webpage.
By default, nobody is allowed to.

authorized_for_configuration_information

List of users who have full access to view all configuration information.
By default, nobody is allowed to.

authorized_for_all_services
authorized_for_all_hosts

List of users that can view information for all hosts and services monitored.
By default, users can only see info for hosts and services they are contacts 
for.

authorized_for_all_service_commands
authorized_for_all_host_commands

List of users who can issue all commands from the CGI.
By default users can only issue commands for hosts or services they are contacts 
for.
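To give an idea of what the edited lines could look like, here is a sketch that hands everything to the nagiosadmin user created earlier. The directive names are written out in full as they appear in cgi.cfg. Whether one user should hold all of these rights is your call; several users can be listed, separated by commas.

```
use_authentication=1
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
```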

ping_syntax

For FreeBSD ports and Linux, the ping syntax may not need to be edited, but for 
other BSDs, Solaris, and other weird UNIXes this may need to be changed.

refresh_rate

By default, the nagios CGIs refresh every 90 seconds, which is good enough. This 
option can be manhandled if you want to change it.

resource.cfg
-------------

To most users, this file will remain untouched. If you are doing some fancy 
monitoring of MySQL or something like that which requires rather sensitive 
information in order to make it all work, this is the file for it. Otherwise if 
you are just doing some basic ping and disk monitoring, ignore this file.
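For the curious, the file works by defining $USERn$ macros that command definitions can refer to. A sketch is shown below: the $USER1$ line reflects where the FreeBSD port puts the plugins (check your own sample file), while $USER3$ and its value are purely made up to show how a password could be kept out of commands.cfg.

```
# Path to the plugin directory, used as $USER1$ in command definitions
$USER1$=/usr/local/libexec/nagios

# Hypothetical: a database password referenced as $USER3$ in a command_line
$USER3$=some-secret-password
```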

commands.cfg
-----------------------

This file houses all the command definitions for all of the host and service 
checks that Nagios can carry out. The commands are defined by a command_name 
(which nagios remembers its tests by), and a command_line (which is the command 
and parameters that are run to complete a particular test).
This file holds all the default nagios commands. It's best to leave it untouched 
and create an extra config file where you can set up all your other 3rd party 
commands and plugins.
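As an illustration, the stock check_ping definition looks roughly like this (check your own commands.cfg, as the details may differ between versions):

```
define command{
	command_name	check_ping
	command_line	$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}
```

When a service uses a check_command such as check_ping!80.0,20%!500.0,50% the parts separated by "!" are handed to the command in order, so $ARG1$ becomes 80.0,20% (the warning threshold) and $ARG2$ becomes 500.0,50% (the critical threshold).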

**** Templates and object inheritance ****

Nagios allows you to define values within a specific template, and then to use 
those templates within other templates of the same type to save a lot of typing. 
Understanding this is crucial to configuring nagios.

For example, there is a "generic-service" service template in the localhost.cfg 
file in the nagios configuration directory. This service template (and 
subsequently all its values) can be used within other service definitions 
through the "use" directive.
This allows templates to be reused for multiple objects.

The generic-service template holds values such as "active_checks_enabled" and 
"notifications_enabled" and these values are automatically inherited by the 
service definitions that use it (the real services), which saves typing in these 
same values for each and every service on our system.

Note that the "generic-service" service template has the line "register 0" as a 
final line. This line exists so that Nagios KNOWS that generic-service is only a 
TEMPLATE define, and NOT an actual service! The generic-service template does 
not carry a service_description or check_command, nor other options which would 
make it a fully fledged service. Without the "register 0" option, nagios would 
produce configuration errors upon startup (unable to find the service 
description).
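A cut-down sketch of the idea follows. The template name example-service-template is made up for illustration (the real generic-service template carries more options), and the host dummy1 and contact group sysadmin are assumed to be defined elsewhere, as in the demo configuration later in this guide.

```
# Template only: "register 0" stops nagios treating it as a real service
define service{
	name				example-service-template
	active_checks_enabled		1
	notifications_enabled		1
	check_period			24x7
	max_check_attempts		3
	normal_check_interval		5
	retry_check_interval		1
	notification_interval		240
	notification_period		24x7
	notification_options		c,r
	contact_groups			sysadmin
	register			0
}

# Real service: inherits every value above through "use"
define service{
	use			example-service-template
	host_name		dummy1
	service_description	PING
	check_command		check_ping!80.0,20%!500.0,50%
}
```

Any value set in the real service overrides the one inherited from the template.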

**** Object definition files ****

Object definition files and their locations are defined in nagios.cfg.
All object definition files should have the file extension of .cfg or nagios 
will most likely skip over them.

The file localhost.cfg is an object definition file which contains many 
different examples and templates of definitions you can use to create your own 
configurations. Either you can edit this file directly, or create separate 
files to ease administration.

Already, there are generic-host, generic-service, and local-service templates 
available for use in our own configurations.

Here are the main templates we need to concern ourselves with:-

* define host - You define the names of hosts and what their IP addresses are
* define hostgroup - You create groups of hosts to better organise your 
machines on the nagios display
* define contact - You define individual administrators and their contact 
information for notifications
* define contactgroup - You define groups of administrators for notification 
(so they all get the same notification)
* define command - You define commands, based upon actual unix command line 
tools
* define service - You define services to test with, commands, and what hosts 
to run the tests on
* define servicegroup - You define groups of services so they all group 
together
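Most of these appear in the demo configuration further down. The one that doesn't is servicegroup, so here is a sketch of it. The group name is made up, and it assumes a service called PING exists on both dummy1 and dummy2. Note that members are listed as host,service pairs.

```
define servicegroup{
	servicegroup_name	pings
	alias			Ping checks
	members			dummy1,PING,dummy2,PING
}
```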

There are a bunch of other extension, dependency, and escalation defines, but 
for the most part they are all advanced settings and beyond the scope of normal 
nagios usage (so I won't cover them). Also, there is a timeperiod define, but 
a "24x7" timeperiod is already available by default - so unless we intend 
to not monitor 24 hours a day, 7 days a week... we can mostly ignore it.

In many cases, "services" are the highest ranking defines that draw all the 
other defines in to actually do the work.
To define a service, you need hosts, hostgroups, contacts and contact groups, 
timeperiods, commands, etc to carry out the service testing.

Also, as we have already seen, there are a lot of default commands for services 
set up in commands.cfg. I recommend making another separate file for customized 
commands (and other 3rd party plugins) to avoid confusion.

**** Really basic configuration - just as a demo ****

Ok, here I have a basic configuration to simply PING two hosts at a normal 
interval. Please note that I'm making use of the "generic-host" and 
"generic-service" templates from the "localhost.cfg" file in the default nagios 
configuration directory. Also note that there is a default "nagiosadmin" contact 
and "admin" contact group available by default.

(As you can guess from the config, I set up this test environment on vmware)

HOSTS

define host{
	host_name		dummy1
	alias			vmware
	address			192.168.217.129
	max_check_attempts	20
	check_period		24x7
	contact_groups		sysadmin
	notification_interval	60
	notification_period	24x7
	notification_options	d,u,r
	use			generic-host
}

define host{
	host_name		dummy2
	alias			othervmware
	address			192.168.217.132
	max_check_attempts	20
	check_period		24x7
	contact_groups		sysadmin
	notification_interval	60
	notification_period	24x7
	notification_options	d,u,r
	use			generic-host
}

HOSTGROUPS

define hostgroup{
	hostgroup_name	virtualmachines
	alias		Dummy Group!
	members		dummy1,dummy2
}

CONTACT

define contact{
	contact_name			naynay
	alias				Nathan
	service_notification_period		24x7
	host_notification_period		24x7
	service_notification_options		w,u,c,r
	host_notification_options		d,u,r
	service_notification_commands	notify-by-email
	host_notification_commands		host-notify-by-email
	email				naynay@localhost
}

CONTACTGROUPS

define contactgroup{
	contactgroup_name	sysadmin
	alias			Default admin group
	members			naynay
}

SERVICE

define service{
	host_name		dummy1,dummy2
	service_description	PING
	check_command		check_ping!80.0,20%!500.0,50%
	check_period		24x7
	max_check_attempts	3
	normal_check_interval	5
	retry_check_interval	1
	contact_groups		sysadmin
	notification_interval	240
	notification_period	24x7
	notification_options	c,r
	use			generic-service
}

Nagios should produce some nice displays about the ping status of the hosts now.
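Before restarting nagios after an edit like this, it is worth letting nagios check the configuration itself with its -v option. A small sh wrapper might look like the sketch below. verify_and_report is a made-up name, and the paths in the example assume the FreeBSD port's default locations.

```shell
#!/bin/sh
# verify_and_report NAGIOS_BIN MAIN_CFG
# Runs the nagios pre-flight check (-v) on MAIN_CFG and says whether it passed.
# Hypothetical helper for illustration only.
verify_and_report() {
    if "$1" -v "$2" >/dev/null 2>&1; then
        echo "config OK"
    else
        echo "config has errors: run '$1 -v $2' to see them"
        return 1
    fi
}

# Example (assumed paths):
#   verify_and_report /usr/local/bin/nagios /usr/local/etc/nagios/nagios.cfg
```

Running the -v check by hand prints the full list of warnings and errors, which is what you want when something is actually wrong.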

**** adding in extra-plugins ****

Nagios comes with many standard plugins that can perform numerous kinds of 
checks on the hosts on your system. Nagios is very expandable in the sense that 
anyone can design and implement plugins for Nagios, in addition to the ones that 
come with it. You can download and install the most useful plugins (which 
require compiling) with the nagios-plugins port (find it at 
/usr/ports/net-mgmt/nagios-plugins). This is strongly recommended.
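To show how little a plugin needs to do, here is a bare-bones sketch in sh. A plugin prints one line of status text and tells Nagios the result through its exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The check_threshold function and its arguments are made up for this example; a real plugin would measure something itself rather than take the value as an argument.

```shell
#!/bin/sh
# check_threshold LABEL VALUE WARN CRIT
# Prints one status line and returns the Nagios plugin exit code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
# Hypothetical example, whole numbers only.
check_threshold() {
    for n in "$2" "$3" "$4"; do
        case "$n" in
            ''|*[!0-9]*) echo "$1 UNKNOWN - expected whole numbers"; return 3 ;;
        esac
    done
    if [ "$2" -ge "$4" ]; then
        echo "$1 CRITICAL - value is $2"; return 2
    elif [ "$2" -ge "$3" ]; then
        echo "$1 WARNING - value is $2"; return 1
    fi
    echo "$1 OK - value is $2"
    return 0
}

# Example: warn at 80, go critical at 90
#   check_threshold DISK 85 80 90
```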

Over the years, people have made all kinds of new plugins for Nagios, to check 
all sorts of programs and devices. There's a plugin I made for Nagios to check 
the status of FreeBSD GEOM devices on hosts within my network.
Implementing extra 3rd party plugins is as simple as copying the plugin files to 
the /usr/local/libexec/nagios directory and making them executable (here 
"someplugin" stands in for the name of the plugin file):-

# cp someplugin /usr/local/libexec/nagios/
# chmod 555 /usr/local/libexec/nagios/someplugin

From there, you can configure and define individual commands using these plugins 
using the command templates. How you configure them depends on the plugin in 
question.

**** NRPE - Doing things remotely ****

Nagios can easily do network tests which don't require access to the machine 
(such as ping tests and http tests). However, there will be times when you need 
information from a server that can only be accessed internally with a valid
login (such as disk size, process availability, etc). When you need to do tests 
like these, you need "NRPE".

NRPE is a separate port at /usr/ports/net-mgmt/nrpe2. (The reason it is called 
nrpe2 is that nrpe is the old version that worked with "netsaint" - a precursor 
to nagios. Needless to say, nrpe and nrpe2 are not compatible. Make sure you get
nrpe2 for nagios.)

nrpe functions on a client machine as a daemon that the Nagios server can talk 
with. Note that you don't need to run a nrpe daemon on your Nagios server 
(nagios can access its own hosting server quite adequately). Your Nagios server 
then uses its own "check_nrpe2" plugin to communicate with nrpe daemons on 
client machines. On your Nagios server, you need to make sure that check_nrpe2 
is a recognized command, so you need to create a command define:-

COMMAND

define command{
	command_name	check_nrpe2
	command_line	/usr/local/libexec/nagios/check_nrpe2 -H $HOSTADDRESS$ -c $ARG1$
}

Of course, you can define other local 3rd party plugins in exactly the same way. 
Each plugin has different arguments however. check_nrpe2 expects a target host 
address (-H) and a command to run on nrpe on that host (-c).

nrpe on client machines is set up with the /usr/ports/net-mgmt/nrpe2 port, as 
well as the nagios-plugins port.

First, make sure you enable it at boot time from /etc/rc.conf

nrpe2_enable="YES"

Copy the sample config file in /usr/local/etc

# cp /usr/local/etc/nrpe.cfg-sample /usr/local/etc/nrpe.cfg

Configuring nrpe is not hard. First you should edit the allowed_hosts parameter. 
Set this so that it reflects only localhost and your Nagios server. This helps 
prevent attacks on the daemon. Replace NAGIOS_SERVER_IP below with the address 
of your own Nagios server:-

allowed_hosts=127.0.0.1,NAGIOS_SERVER_IP

Towards the bottom of the config file you get to configure commands that will 
run locally on the client, but can be called into operation by the nagios 
server. This is done using the "command" parameter. Below we register the 
vaguely named example "someplugin" to the remote command "check_someplugin". 
Note that the someplugin local command has a preset argument. It is possible to 
allow arguments from the nagios server, if you enable the aptly named 
dont_blame_nrpe option.

command[check_someplugin]=/usr/local/libexec/nagios/someplugin someargument
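If you do decide to let the nagios server pass arguments, the setup might look something like the sketch below. Treat it as an outline rather than a recipe: NRPE also has to have been built with command-argument support for this to work, the command name check_nrpe2_args is made up, and the option is named the way it is for a reason (it widens what a remote machine can make the client run).

```
# In nrpe.cfg on the client
dont_blame_nrpe=1
command[check_someplugin]=/usr/local/libexec/nagios/someplugin $ARG1$

# On the Nagios server (hypothetical command name)
define command{
	command_name	check_nrpe2_args
	command_line	/usr/local/libexec/nagios/check_nrpe2 -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
}
```

A service would then use something like check_nrpe2_args!check_someplugin!someargument as its check_command.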

Finally, you can set up service checks to this remote command by creating a 
service template on the Nagios server. All you have to do is remember to add the 
following line to your service config:-

check_command		check_nrpe2!check_someplugin

And that does the job!
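Put together, a complete service definition for the remote check might look like this sketch. It borrows the host, contact group and timings from the PING demo earlier; the service_description is made up.

```
define service{
	use			generic-service
	host_name		dummy1
	service_description	Some Plugin
	check_command		check_nrpe2!check_someplugin
	check_period		24x7
	max_check_attempts	3
	normal_check_interval	5
	retry_check_interval	1
	contact_groups		sysadmin
	notification_interval	240
	notification_period	24x7
	notification_options	c,r
}
```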

**** Manipulating sudo on the Nagios host server ****

Nagios usually executes while running as the underprivileged "nagios" user. As a
result, any commands which require root to run just won't. Of course, 
anything running with root privileges constitutes a security risk. The sudo 
command is your trade off. You can install that from /usr/ports/security/sudo

It's more of a hassle (and a larger security risk) to have nagios run as root, 
so it is far better to use sudo selectively, only for the commands that really 
need it.

What we have to do is to create a sub-directory in /usr/local/libexec/nagios
(I'll call it "sudo") and then copy all your sudo-requiring scripts there. Then 
lock the sudo directory down so that only root can access it (the directory and 
its contents need to be owned by root for this to work). You don't want an 
underprivileged user to be able to hijack your plugin commands and use them to 
run root commands, so security on this plugin directory needs to be 
tightened.

# chmod -R 700 /usr/local/libexec/nagios/sudo
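The create, copy and lock steps can be rolled into one small sh function if you have several machines to set up. This is only a sketch: install_sudo_plugin is a made-up name, and the ownership change is left as a separate step in the example because it has to be done as root.

```shell
#!/bin/sh
# install_sudo_plugin PLUGIN_FILE SUDO_DIR
# Creates SUDO_DIR if needed, copies the plugin into it, and restricts the
# whole directory (and its contents) to the owner with mode 700.
# Hypothetical helper for illustration only.
install_sudo_plugin() {
    mkdir -p "$2" &&
    cp "$1" "$2/" &&
    chmod -R 700 "$2"
}

# Example, run as root:
#   install_sudo_plugin ./check_needsroot /usr/local/libexec/nagios/sudo
#   chown -R root:wheel /usr/local/libexec/nagios/sudo
```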

Then, add the sudo rule for that special directory with visudo (don't forget the 
trailing slash on the path)

nagios	ALL=(ALL) NOPASSWD:/usr/local/libexec/nagios/sudo/

Then define your command to use sudo first on your command line:-

define command {
        command_name    check_needsroot
        command_line    /usr/local/bin/sudo /usr/local/libexec/nagios/sudo/check_needsroot
}

**** Enabling sudo for nrpe ****

In the nrpe.cfg file, there is an option for enabling sudo 
(command_prefix=/usr/local/bin/sudo), but this will execute every nrpe2 plugin 
as root, which isn't necessary. So forget about it.

Otherwise, having remote servers run nrpe and execute sudo scripts is almost 
identical to the way you would set it up on the local nagios server.
You create a special nagios sudo plugin directory, add the nagios user to sudo, 
place your sudo-requiring scripts in there, and lock it with chmod.

The only special thing you have to do is to edit nrpe.cfg, and ensure that the 
command definition requiring root calls sudo first before the plugin.

command[check_needsroot]=/usr/local/bin/sudo /usr/local/libexec/nagios/sudo/check_needsroot
