Nagios 3: Responding to Known Problems

by Mike on March 28, 2009

in Nagios

As an administrator, your main interest in the web interface will be its ability to help you recognize and respond to problems. The quickest way to see all recognized problems is the “Service Problems” page, which summarizes every service-related problem that Nagios detects.

[Screenshot: service]

Here you can see a list of all the service problems that exist. If the problem affects only a service, as with the second service listed, it is grayed out.

To respond to an outage, an administrator can select the host; a response option is then available at the bottom of the page.

[Screenshot: nocomment]

Here the administrator can “Add a new comment” so that the next administrator recognizes that this problem is in the process of being resolved.

The administrator can now add a comment recording what is known about the server.

[Screenshot: confirm_down]

Once this is entered, other administrators will be able to see the situation and will not repeat steps that have already been taken.

[Screenshot: comment]

This way administrators can communicate about the situation.
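The same kind of comment can also be submitted without the web interface, through Nagios's external command file. The sketch below builds an `ADD_SVC_COMMENT` command line in the documented `[time] COMMAND;field;field` format. The host name, service name, author, and command file path are all assumptions for illustration; the path in particular depends on how Nagios was installed, and external commands must be enabled in the configuration.

```python
import time

# Assumed location of the external command file; check the
# command_file setting in your own nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def build_add_svc_comment(host, service, author, comment,
                          persistent=True, now=None):
    """Return one ADD_SVC_COMMENT external command line (no newline)."""
    timestamp = int(time.time() if now is None else now)
    # Fields are separated by semicolons, so reject them in the
    # leading fields; a newline would end the command early.
    for field in (host, service, author):
        if ";" in field or "\n" in field:
            raise ValueError("field may not contain ';' or a newline")
    if "\n" in comment:
        raise ValueError("comment may not contain a newline")
    return (f"[{timestamp}] ADD_SVC_COMMENT;{host};{service};"
            f"{int(persistent)};{author};{comment}")


def submit(line, path=COMMAND_FILE):
    """Write one command line to the Nagios command file."""
    with open(path, "w") as cmd:
        cmd.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical host, service, and author names.
    print(build_add_svc_comment("web01", "HTTP", "mike",
                                "Server confirmed down, investigating"))
```

Calling `submit()` with the generated line should make the comment appear on the service's detail page, just as if it had been entered through “Add a new comment”.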

Responsibility for Action
When an administrator is going to take responsibility for solving the problem, they can select the “Acknowledge this problem” option in Service Commands.

[Screenshot: acknowledge]

When the Command Options page opens, you have several options. When “Sticky Acknowledgement” is checked, the acknowledgement stays in place, and further notifications stay suppressed, until the service returns to an OK state; when it is unchecked, the acknowledgement is cleared as soon as the service changes state. When “Send Notification” is checked, Nagios notifies the other administrators of the acknowledgement so that they do not take action on something that is already being fixed.

[Screenshot: acknowledge2]

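When “Persistent Comment” is checked, Nagios 3 retains the comment even after a restart, and the comment must be deleted manually once the problem is fixed. If you leave the box unchecked, Nagios removes the comment automatically when the acknowledgement is cleared, which normally happens when the service recovers.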
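The three checkboxes correspond to fields of the `ACKNOWLEDGE_SVC_PROBLEM` external command, which takes the host, service, sticky, notify, and persistent flags, then the author and comment. The following is a minimal sketch of building that command line, not a definitive implementation. The host, service, and author names are hypothetical. The documentation gives 2 as the value for a sticky acknowledgement; using 1 for a non-sticky one is an assumption worth checking against your Nagios version.

```python
import time


def build_ack_svc_problem(host, service, author, comment,
                          sticky=True, notify=True, persistent=False,
                          now=None):
    """Return one ACKNOWLEDGE_SVC_PROBLEM command line (no newline)."""
    timestamp = int(time.time() if now is None else now)
    # Semicolons separate the fields, so they are not allowed in the
    # leading fields; a newline would end the command early.
    for field in (host, service, author):
        if ";" in field or "\n" in field:
            raise ValueError("field may not contain ';' or a newline")
    if "\n" in comment:
        raise ValueError("comment may not contain a newline")
    # Documented: 2 = sticky. Assumed here: 1 = non-sticky.
    sticky_value = 2 if sticky else 1
    return (f"[{timestamp}] ACKNOWLEDGE_SVC_PROBLEM;{host};{service};"
            f"{sticky_value};{int(notify)};{int(persistent)};"
            f"{author};{comment}")


if __name__ == "__main__":
    # Hypothetical values: a sticky acknowledgement that notifies
    # contacts and leaves a non-persistent comment.
    print(build_ack_svc_problem("web01", "HTTP", "mike",
                                "Working on it, restarting Apache"))
```

The resulting line would be written, followed by a newline, to the Nagios external command file, whose location is set by `command_file` in `nagios.cfg`.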
