| Monitoring a Cisco Router with Nagios |
| Server - Nagios |
|
1. Create a List of MIBs

To start monitoring a device, you first need to scan its MIBs by hand for information worth monitoring. This process can be intimidating, but keep digging until you find data you can use. The MIB list created for this router contained 2692 entries. You will not use all of them, but scanning the list is how the discovery process begins. Make sure the net-snmp utilities are installed; they provide the tools you need to query the router.

yum install -y net-snmp-utils
Create a text file with this command:

snmpwalk -v2c -c public 192.168.5.230 > ciscodevice.txt

An excerpt of the resulting file:
SNMPv2-MIB::sysName.0 = STRING: MT
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (0) 0:00:00.00
IF-MIB::ifDescr.1 = STRING: ATM0
IF-MIB::ifDescr.2 = STRING: Ethernet0
IF-MIB::ifMtu.1 = INTEGER: 1500
IF-MIB::ifMtu.2 = INTEGER: 1500
IF-MIB::ifAdminStatus.1 = INTEGER: down(2)
IF-MIB::ifAdminStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.1 = INTEGER: down(2)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifLastChange.1 = Timeticks: (30680) 0:05:06.80
IF-MIB::ifLastChange.2 = Timeticks: (292242) 0:48:42.42
IF-MIB::ifInOctets.1 = Counter32: 0
IF-MIB::ifInOctets.2 = Counter32: 370239
IF-MIB::ifInUcastPkts.1 = Counter32: 0
IF-MIB::ifInUcastPkts.2 = Counter32: 3352
IF-MIB::ifInUnknownProtos.1 = Counter32: 0
IF-MIB::ifInUnknownProtos.2 = Counter32: 924
IF-MIB::ifOutOctets.1 = Counter32: 0
IF-MIB::ifOutOctets.2 = Counter32: 358481
IF-MIB::ifOutUcastPkts.1 = Counter32: 0
IF-MIB::ifOutUcastPkts.2 = Counter32: 3673
IF-MIB::ifOutErrors.1 = Counter32: 0
IF-MIB::ifOutErrors.2 = Counter32: 43
RFC1213-MIB::atIfIndex.2.1.192.168.5.33 = INTEGER: 2
RFC1213-MIB::atIfIndex.2.1.192.168.5.230 = INTEGER: 2
RFC1213-MIB::atPhysAddress.2.1.192.168.5.33 = Hex-STRING: 08 00 27 60 A0 A6
RFC1213-MIB::atPhysAddress.2.1.192.168.5.230 = Hex-STRING: 00 04 27 FC CD AD
RFC1213-MIB::atNetAddress.2.1.192.168.5.33 = Network Address: C0:A8:05:21
RFC1213-MIB::atNetAddress.2.1.192.168.5.230 = Network Address: C0:A8:05:E6
IP-MIB::ipForwarding.0 = INTEGER: notForwarding(2)
IP-MIB::ipAdEntAddr.192.168.5.230 = IpAddress: 192.168.5.230
IP-MIB::ipAdEntIfIndex.192.168.5.230 = INTEGER: 2
IP-MIB::ipAdEntNetMask.192.168.5.230 = IpAddress: 255.255.255.0
UDP-MIB::udpLocalAddress.192.168.5.230.67 = IpAddress: 192.168.5.230
UDP-MIB::udpLocalAddress.192.168.5.230.161 = IpAddress: 192.168.5.230
UDP-MIB::udpLocalAddress.192.168.5.230.162 = IpAddress: 192.168.5.230
UDP-MIB::udpLocalAddress.192.168.5.230.55224 = IpAddress: 192.168.5.230
UDP-MIB::udpLocalPort.192.168.5.230.67 = INTEGER: 67
UDP-MIB::udpLocalPort.192.168.5.230.161 = INTEGER: 161
UDP-MIB::udpLocalPort.192.168.5.230.162 = INTEGER: 162
UDP-MIB::udpLocalPort.192.168.5.230.55224 = INTEGER: 55224
SNMPv2-MIB::snmpOutTraps.0 = Counter32: 0
IF-MIB::ifName.1 = STRING: AT0
IF-MIB::ifName.2 = STRING: Et0
IF-MIB::ifLinkUpDownTrapEnable.1 = INTEGER: enabled(1)
IF-MIB::ifLinkUpDownTrapEnable.2 = INTEGER: enabled(1)
This information may also be gathered by using the grep command on the text file created.

The advantage of a text file is that you can scan the options visually or search them with a command. You can also print the document. Here is an example of using grep with the case-insensitive option “-i” to search for “ifoperstatus”. The search lists ten interfaces and shows whether each one is up or down.
grep -i ifoperstatus ciscodevice.txt

IF-MIB::ifOperStatus.1 = INTEGER: down(2)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: up(1)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
IF-MIB::ifOperStatus.5 = INTEGER: down(2)
IF-MIB::ifOperStatus.6 = INTEGER: down(2)
IF-MIB::ifOperStatus.7 = INTEGER: down(2)
IF-MIB::ifOperStatus.8 = INTEGER: down(2)
IF-MIB::ifOperStatus.9 = INTEGER: down(2)
IF-MIB::ifOperStatus.10 = INTEGER: down(2)
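The up/down tally can also be produced by a command rather than by eye. The following is only a sketch: the function name count_oper_states is hypothetical, and the temporary file stands in for ciscodevice.txt (it repeats the ten lines shown above) so that the snippet runs without a router.

```shell
# Hypothetical helper: tally the ifOperStatus values in a saved snmpwalk dump.
count_oper_states() {
    # $1 is the path to a file of snmpwalk output
    awk '/ifOperStatus\./ { n[$NF]++ } END { for (s in n) print s, n[s] }' "$1" | sort
}

# Stand-in for ciscodevice.txt, copied from the grep output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
IF-MIB::ifOperStatus.1 = INTEGER: down(2)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: up(1)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
IF-MIB::ifOperStatus.5 = INTEGER: down(2)
IF-MIB::ifOperStatus.6 = INTEGER: down(2)
IF-MIB::ifOperStatus.7 = INTEGER: down(2)
IF-MIB::ifOperStatus.8 = INTEGER: down(2)
IF-MIB::ifOperStatus.9 = INTEGER: down(2)
IF-MIB::ifOperStatus.10 = INTEGER: down(2)
EOF

count_oper_states "$sample"
# prints:
# down(2) 8
# up(1) 2
```

Against the real dump, the call would be count_oper_states ciscodevice.txt.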
This search helps determine why an interface is down, because it shows whether the interface was shut down by an administrator.
grep -i admin ciscodevice.txt

IF-MIB::ifAdminStatus.1 = INTEGER: down(2)
IF-MIB::ifAdminStatus.2 = INTEGER: up(1)
IF-MIB::ifAdminStatus.3 = INTEGER: up(1)
IF-MIB::ifAdminStatus.4 = INTEGER: down(2)
IF-MIB::ifAdminStatus.5 = INTEGER: down(2)
IF-MIB::ifAdminStatus.6 = INTEGER: down(2)
IF-MIB::ifAdminStatus.7 = INTEGER: down(2)
IF-MIB::ifAdminStatus.8 = INTEGER: down(2)
IF-MIB::ifAdminStatus.9 = INTEGER: down(2)
IF-MIB::ifAdminStatus.10 = INTEGER: down(2)
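Comparing the two searches by hand gets tedious on a router with many ports. As a rough sketch, the pairing could be automated as shown here: the function name find_faulty_interfaces is hypothetical, and the sample file reuses the first three interfaces from the output above in place of ciscodevice.txt.

```shell
# Hypothetical helper: report interfaces that are administratively up
# but operationally down, which usually points at a real fault.
find_faulty_interfaces() {
    # $1 is the path to a file of snmpwalk output
    awk '
        /ifAdminStatus\./ { split($1, a, "."); admin[a[2]] = $NF }
        /ifOperStatus\./  { split($1, a, "."); oper[a[2]]  = $NF }
        END {
            for (i in admin)
                if (admin[i] == "up(1)" && oper[i] == "down(2)")
                    print "interface " i " is enabled but down"
        }
    ' "$1"
}

# Stand-in for ciscodevice.txt, using interfaces 1 to 3 from the output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
IF-MIB::ifAdminStatus.1 = INTEGER: down(2)
IF-MIB::ifAdminStatus.2 = INTEGER: up(1)
IF-MIB::ifAdminStatus.3 = INTEGER: up(1)
IF-MIB::ifOperStatus.1 = INTEGER: down(2)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: up(1)
EOF

# Prints nothing for this data: interface 1 was shut down by an
# administrator, and every enabled interface is up.
find_faulty_interfaces "$sample"
```

An interface appears in the report only when its two status values disagree in the direction that matters, so an empty result is the healthy case.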
This search locates another helpful piece of information: how long the router has been up.
grep -i sysuptimeinstance ciscodevice.txt

DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (297239) 0:49:32.39
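The number in parentheses is the raw Timeticks value, counted in hundredths of a second. A small sketch of converting it to whole seconds follows; the function name ticks_to_seconds is hypothetical and the input is the line shown above.

```shell
# Hypothetical helper: pull the Timeticks value out of one snmpwalk line
# and convert it from hundredths of a second to whole seconds.
ticks_to_seconds() {
    # $1 is a single line of snmpwalk output containing "Timeticks: (N)"
    printf '%s\n' "$1" |
        sed -n 's/.*Timeticks: (\([0-9][0-9]*\)).*/\1/p' |
        awk '{ printf "%d\n", $1 / 100 }'
}

ticks_to_seconds 'DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (297239) 0:49:32.39'
# prints 2972, which is 49 minutes and 32 seconds
```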
If you are interested in the total number of packets requested for transmission that were not multicast or broadcast, use “ifOutUcastPkts”. This router has a number of interfaces, so several values are returned.

IF-MIB::ifOutUcastPkts.1 = Counter32: 0
IF-MIB::ifOutUcastPkts.2 = Counter32: 3673
IF-MIB::ifOutUcastPkts.3 = Counter32: 0
IF-MIB::ifOutUcastPkts.9 = Counter32: 0
IF-MIB::ifOutUcastPkts.10 = Counter32: 0
To locate all of the objects related to outgoing traffic, use this command. Depending on the administrator's needs, not all of this output will be valuable. You may have to research the Cisco site to determine what each MIB represents.
grep -i ifout ciscodevice.txt

IF-MIB::ifOutOctets.1 = Counter32: 0
IF-MIB::ifOutOctets.2 = Counter32: 358481
IF-MIB::ifOutOctets.3 = Counter32: 0
IF-MIB::ifOutOctets.9 = Counter32: 0
IF-MIB::ifOutOctets.10 = Counter32: 0
IF-MIB::ifOutUcastPkts.1 = Counter32: 0
IF-MIB::ifOutUcastPkts.2 = Counter32: 3673
IF-MIB::ifOutUcastPkts.3 = Counter32: 0
IF-MIB::ifOutUcastPkts.9 = Counter32: 0
IF-MIB::ifOutUcastPkts.10 = Counter32: 0
IF-MIB::ifOutNUcastPkts.1 = Counter32: 0
IF-MIB::ifOutNUcastPkts.2 = Counter32: 95
IF-MIB::ifOutNUcastPkts.3 = Counter32: 0
IF-MIB::ifOutDiscards.1 = Counter32: 0
IF-MIB::ifOutDiscards.2 = Counter32: 0
IF-MIB::ifOutDiscards.3 = Counter32: 0
IF-MIB::ifOutErrors.1 = Counter32: 0
IF-MIB::ifOutErrors.2 = Counter32: 43
IF-MIB::ifOutErrors.3 = Counter32: 0
IF-MIB::ifOutQLen.1 = Gauge32: 0
IF-MIB::ifOutQLen.2 = Gauge32: 0
IF-MIB::ifOutQLen.3 = Gauge32: 0
IF-MIB::ifOutMulticastPkts.9 = Counter32: 0
IF-MIB::ifOutMulticastPkts.10 = Counter32: 0
IF-MIB::ifOutBroadcastPkts.9 = Counter32: 0
IF-MIB::ifOutBroadcastPkts.10 = Counter32: 0
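Most of these values are zero, so one way to narrow the list is to keep only the entries that have recorded activity. The sketch below is an illustration only: the function name nonzero_out_counters is hypothetical, and the sample file holds a handful of the lines above in place of ciscodevice.txt.

```shell
# Hypothetical helper: show only the outgoing objects whose value is above zero.
nonzero_out_counters() {
    # $1 is the path to a file of snmpwalk output
    awk '/ifOut/ && $NF + 0 > 0 { print $1, $NF }' "$1"
}

# Stand-in for ciscodevice.txt, a subset of the grep output above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
IF-MIB::ifOutOctets.1 = Counter32: 0
IF-MIB::ifOutOctets.2 = Counter32: 358481
IF-MIB::ifOutUcastPkts.2 = Counter32: 3673
IF-MIB::ifOutNUcastPkts.2 = Counter32: 95
IF-MIB::ifOutDiscards.2 = Counter32: 0
IF-MIB::ifOutErrors.2 = Counter32: 43
IF-MIB::ifOutQLen.2 = Gauge32: 0
EOF

nonzero_out_counters "$sample"
# prints:
# IF-MIB::ifOutOctets.2 358481
# IF-MIB::ifOutUcastPkts.2 3673
# IF-MIB::ifOutNUcastPkts.2 95
# IF-MIB::ifOutErrors.2 43
```

On this router every surviving line belongs to interface 2, the Ethernet0 port, which suggests where monitoring effort is best spent.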
Here are the MIBs you would use to track errors on an interface.
IF-MIB::ifOutErrors.1 = Counter32: 0
IF-MIB::ifOutErrors.2 = Counter32: 43
IF-MIB::ifOutErrors.3 = Counter32: 0

The reverse is available as well; these are the MIBs for incoming errors.

IF-MIB::ifInErrors.1 = Counter32: 0
IF-MIB::ifInErrors.2 = Counter32: 0
IF-MIB::ifInErrors.3 = Counter32: 0
The tool snmpwalk can also be used by itself with text strings to locate MIBs. Here the interfaces are searched for.
snmpwalk 192.168.5.220 -v1 -c public mib-2.interfaces

IF-MIB::ifDescr.1 = STRING: Ethernet0
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: down(2)
IF-MIB::ifOperStatus.3 = INTEGER: down(2)
IF-MIB::ifLastChange.1 = Timeticks: (185769) 0:30:57.69
IF-MIB::ifLastChange.2 = Timeticks: (1824) 0:00:18.24
IF-MIB::ifLastChange.3 = Timeticks: (1824) 0:00:18.24
Search for system information.
snmpwalk -v2c -c public 192.168.5.230 system

SNMPv2-MIB::sysDescr.0 = STRING: Cisco Internetwork Operating System Software
IOS (tm) C820 Software (C820-Y6-M), Version 12.1(5)YB1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC:Home:SW:IOS:Specials for info
Copyright (c) 1986-2001 by cisco Systems, Inc.
Compiled Wed 14-Mar-01 16:30
SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.9.1.284
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (8276265) 22:59:22.65
SNMPv2-MIB::sysContact.0 = STRING:
SNMPv2-MIB::sysName.0 = STRING: MT
SNMPv2-MIB::sysLocation.0 = STRING:
SNMPv2-MIB::sysServices.0 = INTEGER: 6
SNMPv2-MIB::sysORLastChange.0 = Timeticks: (0) 0:00:00.00
2. Verify a Command Entry

Once you have determined which MIBs need to be monitored, convert that information into a command and then a service definition. Here is the command definition, which belongs in the commands.cfg file in the objects directory. This definition is probably already present, so you do not need to add it; just verify that it exists.

define command{
    command_name    check_snmp
    command_line    $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
    }
3. Create a Host Entry

Each device that will be monitored must have a host entry. Note that this host uses two templates: the first is generic-switch and the second, host-pnp, is the PNP4Nagios template for graphing.

define host{
    use          generic-switch,host-pnp
    host_name    cisco827
    alias        cisco router
    address      192.168.5.230
    }
The graphing provides an intuitive feature that is important to understanding the cycles, or flow of traffic, on a network.
4. Create Service Entries

A standard service check that provides valuable information, especially if you graph the output, is a simple ping. This service definition will help you get started monitoring a network device. Note that the WARNING level is 200 milliseconds or 20% packet loss and the CRITICAL level is 600 milliseconds or 60% packet loss. Adjust these values to reflect your network's needs.

define service{
    use                     generic-service
    host_name               cisco827
    service_description     PING
    check_command           check_ping!200.0,20%!600.0,60%
    }
What follows are service definitions that could be used with the information that you discover using snmpwalk. The check_snmp plugin is used for each of these service definitions. The examples show a router with a community string of “public”; adjust this to match your own community strings if needed. The “-o” indicates that an OID, or object identifier, will follow. In the first example the Ethernet port, which is interface “2”, will issue a WARNING once the error count passes 45 and a CRITICAL state once it passes 100.

define service{
    use                     generic-service
    host_name               cisco827
    service_description     Errors Eth0 - Out
    check_command           check_snmp!-C public -o ifOutErrors.2 -w 45 -c 100
    }
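It can help to see the command line that Nagios ends up running for such a service. The sketch below imitates the macro substitution by hand, and only for the check_snmp command definition from step 2. The function name build_check_snmp is hypothetical, and the plugin directory /usr/local/nagios/libexec is an assumed value for $USER1$; check resource.cfg for the real one.

```shell
# Hypothetical helper: imitate how Nagios fills in
#   $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
# from a service's check_command value.
build_check_snmp() {
    plugin_dir=$1        # stands in for $USER1$ (assumed path, see resource.cfg)
    address=$2           # stands in for $HOSTADDRESS$
    check_command=$3     # the check_command value from the service definition
    arg1=${check_command#*!}    # text after the first "!" becomes $ARG1$
    printf '%s/check_snmp -H %s %s\n' "$plugin_dir" "$address" "$arg1"
}

build_check_snmp /usr/local/nagios/libexec 192.168.5.230 \
    'check_snmp!-C public -o ifOutErrors.2 -w 45 -c 100'
# prints:
# /usr/local/nagios/libexec/check_snmp -H 192.168.5.230 -C public -o ifOutErrors.2 -w 45 -c 100
```

Running the printed command by hand from the Nagios server is a quick way to test a new OID before adding the service definition.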
This service definition checks the same interface for incoming errors.

define service{
    use                     generic-service
    host_name               cisco827
    service_description     Errors Eth0 - In
    check_command           check_snmp!-C public -o ifInErrors.2 -w 45 -c 100
    }
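The way the “-w” and “-c” values turn into a state can be sketched in a few lines of shell. This is not the plugin's code, only an illustration that assumes the standard Nagios plugin convention: a bare threshold N raises the alert when the value is above N, and the exit codes 0, 1 and 2 mean OK, WARNING and CRITICAL. The function name snmp_threshold_state is hypothetical.

```shell
# Hypothetical helper: map a counter value to a Nagios state,
# assuming a bare threshold N means "alert when the value is above N".
snmp_threshold_state() {
    value=$1    # number returned for the OID
    warn=$2     # the -w threshold
    crit=$3     # the -c threshold
    if [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# ifOutErrors.2 was 43 in the snmpwalk output, below the -w 45 threshold.
snmp_threshold_state 43 45 100
# prints OK
```

Because ifOutErrors and ifInErrors are counters that only grow, a fixed threshold like this will eventually be crossed on a busy interface, so choose the values with the expected uptime in mind.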
The “Packets Out” service definition simply lists the number of packets that have gone out this interface as a total.
define service{
    use                     generic-service
    host_name               cisco827
    service_description     Packets Out Ethernet
    check_command           check_snmp!-C public -o ifOutUcastPkts.2
    }
Uptime may be a valuable piece of information when troubleshooting. Using this service definition the uptime is constantly listed in the plugin output. Note two routers are using the same service definition.
define service{
    use                     generic-service
    host_name               cisco,cisco827
    service_description     Uptime
    check_command           check_snmp!-C public -o sysUpTimeInstance
    }
To save resources, you can spare the Nagios server the work of translating the name sysUpTimeInstance into its numeric OID. Using the number directly retrieves the same information faster and more efficiently.

define service{
    use                     generic-service
    host_name               cisco,cisco827
    service_description     OID Uptime
    check_command           check_snmp!-C public -o .1.3.6.1.2.1.1.3.0
    }
Check the status of port number one, the ATM0 port, and port number two, the Ethernet0 port.

define service{
    use                     generic-service
    host_name               cisco827
    service_description     Port 1 Status
    check_command           check_snmp!-C public -o ifOperStatus.1
    }

define service{
    use                     generic-service
    host_name               cisco827
    service_description     Port 2 Status
    check_command           check_snmp!-C public -o ifOperStatus.2
    }
Not all of these service definitions will provide the solutions you need, and some may not be that useful, but they demonstrate the process of locating the information you need and converting it into a service definition. |