PNP4Nagios
Official homepage: https://www.pnp4nagios.org/
NOTE: Nagios must be installed and configured beforehand.
Installation
# Dependency packages
yum install gcc perl-Time-HiRes rrdtool-perl make
tar xzf pnp4nagios-0.6.21.tar.gz
cd pnp4nagios-0.6.21
./configure
If the Nagios system user and group are not the default nagios, pass them to configure:
./configure --with-nagios-user=icinga --with-nagios-group=icinga
If the following summary appears, configure has completed.
*** Configuration summary for pnp4nagios-0.6.21 03-24-2013 ***
General Options:
------------------------- -------------------
Nagios user/group: nagios nagios
Install directory: /usr/local/pnp4nagios
HTML Dir: /usr/local/pnp4nagios/share
Config Dir: /usr/local/pnp4nagios/etc
Location of rrdtool binary: /usr/bin/rrdtool Version 1.3.8
RRDs Perl Modules: FOUND (Version 1.3008)
RRD Files stored in: /usr/local/pnp4nagios/var/perfdata
process_perfdata.pl Logfile: /usr/local/pnp4nagios/var/perfdata.log
Perfdata files (NPCD) stored in: /usr/local/pnp4nagios/var/spool
Web Interface Options:
------------------------- -------------------
HTML URL: http://localhost/pnp4nagios
Apache Config File: /etc/httpd/conf.d/pnp4nagios.conf
Review the options above for accuracy. If they look okay,
type 'make all' to compile.
Build and install
make all
make fullinstall
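After make fullinstall, it can be worth confirming that the files referenced later in this page actually exist. The following is a minimal sketch, not part of PNP4Nagios: the helper name missing_paths is hypothetical, and the paths are taken from the configure summary above, so they will differ if another prefix was used.

```shell
#!/bin/sh
# Print every path from the argument list that does not exist.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || echo "$p"
    done
}

# Paths assumed from the configure summary (default install prefix).
missing=$(missing_paths \
    /usr/local/pnp4nagios/libexec/process_perfdata.pl \
    /usr/local/pnp4nagios/share \
    /usr/local/pnp4nagios/etc \
    /usr/local/pnp4nagios/var/perfdata \
    /etc/httpd/conf.d/pnp4nagios.conf)

if [ -n "$missing" ]; then
    echo "Missing after install:"
    echo "$missing"
else
    echo "All expected PNP4Nagios paths are present."
fi
```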
Main configuration
Edit /etc/httpd/conf.d/pnp4nagios.conf
...
AuthUserFile /etc/nagios/htpasswd.users <-- change this line to match the Nagios setting
...
Open the home page: http://xxx.xxx.xxx.xxx/pnp4nagios/
If the page reports no errors, rename the following file:
mv /usr/local/pnp4nagios/share/install.php /usr/local/pnp4nagios/share/install.php.xxx
Nagios configuration
Edit /etc/nagios/nagios.cfg
process_performance_data=1
service_perfdata_command=process-service-perfdata
host_perfdata_command=process-host-perfdata
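To double-check that the three directives above are really active (and not commented out), a small shell helper can scan the file. This is only a sketch: the function name missing_directives and the CFG variable are hypothetical, and it matches lines exactly, so a directive written with spaces around "=" would be reported as missing.

```shell
#!/bin/sh
# Print each required "key=value" directive that is absent from the given file.
# Lines are compared exactly after stripping leading and trailing whitespace,
# so commented-out directives do not count.
missing_directives() {
    cfg=$1
    shift
    for directive in "$@"; do
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$cfg" \
            | grep -qxF -e "$directive" || echo "$directive"
    done
}

# Path assumed from this page; override with CFG=/path/to/nagios.cfg
CFG=${CFG:-/etc/nagios/nagios.cfg}
if [ -r "$CFG" ]; then
    missing_directives "$CFG" \
        process_performance_data=1 \
        service_perfdata_command=process-service-perfdata \
        host_perfdata_command=process-host-perfdata
else
    echo "Cannot read $CFG"
fi
```

No output means all three directives were found. Running nagios -v /etc/nagios/nagios.cfg afterwards is the usual way to validate the whole configuration before restarting Nagios.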
Edit /etc/nagios/objects/commands.cfg
define command {
command_name process-host-perfdata
command_line /usr/bin/perl /usr/local/pnp4nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
}
define command {
command_name process-service-perfdata
command_line /usr/bin/perl /usr/local/pnp4nagios/libexec/process_perfdata.pl
}
Edit /etc/nagios/objects/templates.cfg
Set process_perf_data to 0 in generic-host and generic-service. Otherwise, with the default value of 1, every host and service has this feature enabled automatically.
define host {
name generic-host
...
process_perf_data 0
...
}
define service {
name generic-service
...
process_perf_data 0
...
}
Enable graphing for specific hosts or services
Edit /etc/nagios/objects/MES-servers.cfg and add process_perf_data 1 to the host or service definition.
Note: MES-servers.cfg is the file name used in the author's environment.
define host {
use generic-host
host_name ap1
...
process_perf_data 1
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
...
}
define service {
use generic-service
host_name ap1
service_description PING
...
process_perf_data 1
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
...
}
Practical example: monitoring DB2 tablespace usage
MES-servers.cfg:
define service{
use generic-service
host_name bdb1
service_description MMDB_MMTBS01
contact_groups adm-alang
notifications_enabled 0
check_command check_db2_tbs_usage!-d MMDB -t MMTBS01 -u istflr -p istflr
max_check_attempts 1
normal_check_interval 60
process_perf_data 1
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
}
commands.cfg:
# 'check_db2_tbs_usage' command definition
define command{
command_name check_db2_tbs_usage
command_line sudo -u db2inst ksh -c "~/bin/db2_check_tbs_usage.sh $ARG1$ "
}
~db2inst/bin/db2_check_tbs_usage.sh:
#!/bin/ksh
##############################################################
# Author: Felipe Alkain de Souza
#
# Script Name: db2_check_tbs_usage.sh
#
# Functionality: This script checks DB2 tablespace utilization
#
# Usage: ./db2_check_tbs_usage.sh -d <database_name> -t <tbs_name> -u <db_user> -p <db_pass>
#
# Requisite settings:
# - visudo
# #Defaults requiretty <== comment out this line
# nagios ALL=(ALL) NOPASSWD: ALL
#
# - Create DB2 catalog for the DBs that are monitored.
#
#
# Update:
# 2013/9/17 by A-Lang
#
##############################################################
. $HOME/sqllib/db2profile
### Nagios RCs Variables
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
Usage() { echo "Usage: $0 [-d <database_name>] [-t <tbs_name>] [-u <db_user>] [-p <db_pass>]"; }
ToUpper() {
echo $1 | tr "[:lower:]" "[:upper:]"
}
GetOut(){
db2 terminate > /dev/null 2>&1
sleep 2
exit $1
}
while getopts ":d:t:u:p:" o; do
case "$o" in
d)
d=$OPTARG
;;
t)
t=$OPTARG
;;
u)
u=$OPTARG
;;
p)
p=$OPTARG
;;
\?)
echo "Invalid option: -$OPTARG"
Usage
;;
:)
echo "Option -$OPTARG requires an argument."
Usage
;;
esac
done
if [ $OPTIND -ne 9 ]; then
echo "Invalid options entered."
Usage
exit $STATE_UNKNOWN
fi
DB_NAME=$(ToUpper $d)
DB_TBS=$(ToUpper $t)
DB_USER=$u
DB_PASS=$p
db2 terminate > /dev/null 2>&1
db2 connect to $DB_NAME user $DB_USER using $DB_PASS > /dev/null 2>&1
if [ $? -ne 0 ]
then
echo "DB2 CRITICAL - The database $DB_NAME did not connect!"
GetOut $STATE_CRITICAL
fi
TBS_USAGE=`db2 -x "select ' ' CONCAT (SUBSTR(CHAR(DECIMAL(USED_PAGES, 10, 2)/ \
DECIMAL(TOTAL_PAGES,10,2)*100),9,5)) CONCAT '%' as PERCENT_USED \
from table (snapshot_tbs_cfg('${DB_NAME}', 0)) as t \
where TABLESPACE_TYPE=0 and TABLESPACE_NAME='${DB_TBS}'" | sed -e 's/%//g' -e 's/ //g'`
#echo $TBS_USAGE
if [ $? -ne 0 ] || [ -z "$TBS_USAGE" ]
then
echo "Unknown Tablespace $DB_TBS !"
GetOut $STATE_UNKNOWN
fi
PERF_DATA="|'Disk Utilization'=${TBS_USAGE}%;90;95;"
if [ $TBS_USAGE -lt 90 ]; then
echo "TABLESPACE OK - The database $DB_NAME is healthy; the used disk space of the tablespace $DB_TBS is ${TBS_USAGE}%. $PERF_DATA"
GetOut $STATE_OK
elif [ $TBS_USAGE -ge 90 -a $TBS_USAGE -lt 95 ]; then
echo "TABLESPACE WARNING - The used disk space of the tablespace $DB_TBS is ${TBS_USAGE}%, crossing the threshold. $PERF_DATA"
GetOut $STATE_WARNING
else
echo "TABLESPACE CRITICAL - The used disk space of the tablespace $DB_TBS is ${TBS_USAGE}%, crossing the threshold. $PERF_DATA"
GetOut $STATE_CRITICAL
fi
#db2 terminate
db2 terminate > /dev/null 2>&1
sleep 1
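The script compares TBS_USAGE with integer test operators, while the query may return a decimal value such as 45.23, which some shells reject inside [ ... -lt ... ]. Below is a minimal sketch of a decimal-safe comparison done with awk. The function name classify_usage is hypothetical and not part of the original script, and treating a value exactly equal to the warning threshold as WARNING is an assumption.

```shell
#!/bin/sh
# Map a usage percentage to a Nagios state name, comparing as decimals in awk.
# usage < warn           -> OK
# warn <= usage < crit   -> WARNING
# otherwise              -> CRITICAL
classify_usage() {
    awk -v v="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (v + 0 < w + 0)      print "OK"
        else if (v + 0 < c + 0) print "WARNING"
        else                    print "CRITICAL"
    }'
}

classify_usage 45.23 90 95   # prints OK
classify_usage 90 90 95      # prints WARNING
classify_usage 97.5 90 95    # prints CRITICAL
```

In the plugin, the printed state name could then drive a case statement that selects the message and the matching STATE_* exit code.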
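Configure popup display (optional)
Copy status-header.ssi from the pnp4nagios source package:
cp <pnp4nagios source directory>/contrib/ssi/status-header.ssi /usr/share/nagios/ssi
NOTE:
This file must not have execute permission.
The /usr/share/nagios/ssi directory may differ depending on the installed Nagios version.
Edit /etc/nagios/objects/MES-servers.cfg and change action_url: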
define host {
use generic-host
host_name ap1
...
process_perf_data 1
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=_HOST_
...
}
define service {
use generic-service
host_name ap1
service_description PING
...
process_perf_data 1
action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$' class='tips' rel='/pnp4nagios/index.php/popup?host=$HOSTNAME$&srv=$SERVICEDESC$
...
}
Customizing Performance Data
Performance Data format
label=value[UOM];[warning-range];[critical-range];[min];[max]
Example HTTP check output
HTTP OK: HTTP/1.1 200 OK - 46869 bytes in 0.294 second response time | time=0.294561s;;;0 size=46869B;;;0
Tip: everything after the | symbol is the Performance Data.
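As an illustration of the tip above, the following sketch separates the Performance Data from the human-readable part of a plugin output line. The function name extract_perfdata is hypothetical; it only splits at the first | and does not validate the individual items.

```shell
#!/bin/sh
# Print the performance data part of a plugin output line, that is, the text
# after the first "|" with leading whitespace removed.
# Prints nothing when the line contains no "|".
extract_perfdata() {
    case $1 in
        *"|"*) printf '%s\n' "${1#*"|"}" | sed -e 's/^[[:space:]]*//' ;;
    esac
}

line='HTTP OK: HTTP/1.1 200 OK - 46869 bytes in 0.294 second response time | time=0.294561s;;;0 size=46869B;;;0'
extract_perfdata "$line"
# prints: time=0.294561s;;;0 size=46869B;;;0
```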
More details on the Performance Data format:
- space separated list of label/value pairs
- label can contain any characters
- the single quotes for the label are optional. Required if spaces, = or ' are in the label
- label length is arbitrary, but ideally the first 19 characters are unique (due to a limitation in RRD). Be aware of a limitation in the amount of data that NRPE returns to Nagios
- to specify a quote character, use two single quotes
- warn, crit, min or max may be null (for example, if the threshold is not defined or min and max do not apply). Trailing unfilled semicolons can be dropped
- min and max are not required if UOM=%
- value, min and max in class [-0-9.]. Must all be the same UOM
- warn and crit are in the range format (see Section 2.5 of the Nagios plugin development guidelines). Must be the same UOM
UOM (unit of measurement) is one of:
- no unit specified - assume a number (int or float) of things (eg, users, processes, load averages)
- s - seconds (also us, ms)
- % - percentage
- B - bytes (also KB, MB, TB, GB?)
- c - a continuous counter (such as bytes transmitted on an interface)
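When writing a custom plugin, the rules above can be wrapped in a small helper that assembles one label=value[UOM];[warn];[crit];[min];[max] item. This is a sketch under stated assumptions: the function name perfdata_item and its argument order are hypothetical, and it does not check that the values are numeric.

```shell
#!/bin/sh
# Build one perfdata item: label=value[UOM];[warn];[crit];[min];[max]
# Arguments: label value [uom] [warn] [crit] [min] [max]
perfdata_item() {
    label=$1
    value=$2
    uom=${3-}
    warn=${4-}
    crit=${5-}
    min=${6-}
    max=${7-}
    # Quote the label if it contains a space, "=" or a single quote;
    # an embedded single quote is written as two single quotes.
    case $label in
        *" "* | *=* | *"'"*)
            label=$(printf '%s' "$label" | sed "s/'/''/g")
            label="'$label'"
            ;;
    esac
    item="$label=$value$uom;$warn;$crit;$min;$max"
    # Drop trailing unfilled semicolons.
    while [ "${item%;}" != "$item" ]; do
        item=${item%;}
    done
    printf '%s\n' "$item"
}

perfdata_item 'Disk Utilization' 45.23 % 90 95
# prints: 'Disk Utilization'=45.23%;90;95
perfdata_item time 0.294561 s "" "" 0
# prints: time=0.294561s;;;0
```

A plugin would append the result after a | at the end of its status line, as in the HTTP example earlier in this section.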