nagios
Monitoring account creation/password resets
Tue, 11/10/2009 - 16:03 — jaklein.ncsu.edu
It's taken a while, but we are finally monitoring the Sybase tables that manage Unity account creation, renames, and password resets. In the past, if our account management code in either Novell or AD land failed for some reason, the first indication we would get would be furious screaming from the DBAs,
https://sysnews.ncsu.edu/tools-bin/server-status?info=ncsu-accounts-ad
Down to 8 servers that can't be monitored
Tue, 09/29/2009 - 12:45 — jaklein.ncsu.edu
Firewall contexts set to allow new nagios access
Fri, 09/25/2009 - 08:41 — jaklein.ncsu.edu
As we work to bring the new Nagios system online, there are a lot of firewall changes that need to be made.
ComTech has established a range of addresses for all OIT devices meant for monitoring, and the intention is to open this address range (referred to as "OIT-Monitor") for incoming access for monitoring protocols on all datacenter subnets. This will make firewall settings easier and quicker for ComTech, and will mean less back and forth for everyone when setting up firewall rules for new VLANs.
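As a rough illustration of why a single monitoring range simplifies things, the sketch below tests whether a source address falls inside one network. The real OIT-Monitor range is not stated in this post, so the block uses 192.0.2.0/24, a reserved documentation range, purely as a placeholder; `OIT_MONITOR` and `is_monitoring_source` are hypothetical names.

```python
import ipaddress

# Placeholder only: 192.0.2.0/24 is a reserved documentation range (RFC 5737),
# NOT the real OIT-Monitor range, which this post does not state.
OIT_MONITOR = ipaddress.ip_network("192.0.2.0/24")


def is_monitoring_source(addr: str) -> bool:
    """Return True if addr lies inside the (placeholder) monitoring range."""
    return ipaddress.ip_address(addr) in OIT_MONITOR
```

With one range to match, a rule for a new VLAN only needs this single membership test instead of a list of individual monitoring hosts.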
license02 software firewall adjusted for flexLM monitoring
Thu, 09/10/2009 - 15:30 — jaklein.ncsu.edu
FYI, I've adjusted the software firewall on license02 so that the new Nagii can monitor it properly.
Windows servers now checked every 90 min for backup agents
Tue, 09/01/2009 - 15:40 — jaklein.ncsu.edu
We've added a new check, "backup agent", to the "windows-hosted" servers.
This check tests whether either the NetBackup or the Avamar agent is loaded. If one is, the check returns green; if neither is found, or on any error, a yellow alert is generated.
We'll be deploying this on other Windows machines shortly.
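The decision logic of the check described above can be sketched roughly as follows. This is not the actual plugin: the service names in `BACKUP_AGENTS` are hypothetical stand-ins, and the sketch assumes that "green" and "yellow" correspond to the standard Nagios plugin exit codes 0 (OK) and 1 (WARNING).

```python
# Standard Nagios plugin exit codes: 0 is OK (green), 1 is WARNING (yellow).
OK, WARNING = 0, 1

# Hypothetical service names; the real agent names are not given in this post.
BACKUP_AGENTS = {"NetBackup Client Service", "Avamar Backup Agent"}


def check_backup_agent(running_services):
    """Return (exit_code, message) for the "backup agent" check.

    running_services: iterable of service names, or None if the query failed.
    """
    if running_services is None:
        # Any error while querying the host is reported as a yellow alert.
        return WARNING, "WARNING: could not query services"
    found = sorted(BACKUP_AGENTS.intersection(running_services))
    if found:
        return OK, "OK: backup agent loaded: " + ", ".join(found)
    return WARNING, "WARNING: no NetBackup or Avamar agent found"
```

A real plugin would print the message and exit with the returned code so that Nagios can pick up the state.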
Clearing errors on Windows 'uptime' and 'load' checks in Nagios
Wed, 08/26/2009 - 15:40 — jaklein.ncsu.edu
We had a couple of Windows hosts give the error "Could not get value" when checking uptime, and the extra mysterious "ERROR: Could not get data for 1 perhaps we don't collect data this far back?" while checking load.
It turns out the Windows performance counters had gotten sideways and needed to be rebuilt. With some help from the event log and http://eventid.net, we discovered
Nagios tests
Wed, 08/26/2009 - 08:54 — jaklein.ncsu.edu
Checks that can be done on our Nagios systems, including information about what tests ("commands") are available and how to configure them.
Nagios tests - check_mssql_health
Wed, 08/26/2009 - 08:47 — jaklein.ncsu.edu
This plugin allows you to monitor a Microsoft SQL Server by checking more than 20 metrics.
Nagios Server Side config
In the old Microsys repo there is a RHEL5 package for the Nagios server side of this plugin. Note that this plugin uses the Perl module DBD::Sybase.
Install the package check_mssql_health and/or put it in the kickstart file for the Nagios server for automatic installation.
Uninstalling the older NSClient++ plugin for Nagios (Windows Servers)
Thu, 08/20/2009 - 14:41 — jaklein.ncsu.edu
We've run into several machines that can't upgrade to the latest NSClient++ plugin (the software that implements NRPE, the Nagios Remote Plugin Executor "protocol") because the installed version doesn't appear in Add/Remove Programs.
Bizarre!
To uninstall it manually, assuming it's located in "C:\Program Files\NC State University\NSClientPP", issue the following commands from a cmd prompt:
"C:\Program Files\NC State University\NSClientPP\nsclient++.exe" /uninstall
rd /q "C:\Program Files\NC State University\NSClientPP"
Windows Terminal Server specific nagios config
Tue, 08/18/2009 - 15:20 — jaklein.ncsu.edu
To monitor Windows Terminal Server Licenses, we do the following
