The Perfect Load-Balanced & High-Availability Web Cluster With 2 Servers Running Xen On Ubuntu 8.04 Hardy Heron - Page 9
15. Custom scripts for monitoring (lb1, lb2, web1, web2)
I wrote a few bash scripts to monitor the whole setup (they are a bit ugly, but they work). If you improve them, feel free to mail them to me!
15.1 Monitoring from lb1.example.com
First we must install sendmail so lb1.example.com is able to send mail:
apt-get install sendmail
The first script checks whether the backup load balancer (lb2.example.com) is still available to take over:
vi /root/lb2_check
#!/bin/bash
# Backup load balancer check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script does 1 verification ###
### 1) Check if backup load balancer failed and send mail notification ###
### To be modified ###
EMAIL="admin@example.com"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
### To restore to original when problem fixed ###
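if [ "$1" = "fix" ]; then
# -f: do not complain if the flag file is already gone
rm -f /root/lb2_problem.txt
# Empty the Heartbeat log so the old failure is not detected again
> /var/log/ha-log
exit 0
fi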
### Check if already notified ###
cd /root
if [ -f lb2_problem.txt ]; then
exit 1;
fi
### Check if Heartbeat is running on hot standby ###
tail /var/log/ha-log 2>&1 | grep "Asking other side for ping node count"
if [ "$?" -ne "1" ]; then
echo "Backup load balancer failed" > /root/lb2_problem.txt
$MAIL -s "Backup load balancer problem" $EMAIL < /root/lb2_problem.txt
fi
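A note on argument checks like the one for `fix`: inside `[ ]`, the shell needs spaces around the comparison operator. Written without them, as in `[ $1=="fix" ]`, the test receives a single non-empty word and succeeds for any argument. The sketch below shows the difference; `is_fix_loose` and `is_fix_strict` are hypothetical names used only for this illustration.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only.

# Without spaces, [ ] receives ONE word such as 'foo==fix'.
# A single non-empty word always tests true.
is_fix_loose() {
    [ $1=="fix" ]
}

# With quotes and spaces, [ ] receives three words and really compares.
is_fix_strict() {
    [ "$1" = "fix" ]
}

for arg in fix foo; do
    if is_fix_loose "$arg"; then loose=yes; else loose=no; fi
    if is_fix_strict "$arg"; then strict=yes; else strict=no; fi
    echo "arg=$arg loose=$loose strict=$strict"
done
# prints:
# arg=fix loose=yes strict=yes
# arg=foo loose=yes strict=no
```

With the loose form, running a script with any argument at all would trigger the cleanup branch, not only `fix`.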
We make this script executable:
chmod +x /root/lb2_check
If lb2.example.com fails, the script creates the file /root/lb2_problem.txt and sends a mail notification. As long as lb2_problem.txt exists, the script will not check again. The log file must also be emptied once the problem is fixed for the script to work properly. So once the problem is fixed on lb2.example.com, please run manually:
/root/lb2_check fix
The next script checks whether any ports failed on either web1 or web2 by reading the ldirectord log file. ldirectord already has a mail notification, but it sends a flood of notifications; mine sends only one until you fix the problem:
vi /root/ports_failed
and make it look like this:
#!/bin/bash
# Ldirectord ports failure check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script does 1 verification ###
### 1) Check for port failure on load balanced servers ###
### To be modified ###
EMAIL="admin@example.com"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
#to restore to original when problem fixed
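if [ "$1" = "fix" ]; then
# -f: do not complain if the flag file is already gone
rm -f /root/port_problem.txt
# Empty the ldirectord log so old failures are not detected again
> /var/log/ldirectord.log
fi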
###check if already notified###
cd /root
if [ -f port_problem.txt ]; then
cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
exit 1;
fi
### Check if port failed ###
cat /var/log/ldirectord.log 2>&1 | grep Deleted
if [ "$?" -ne "1" ]; then
cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
echo "Ports problem see logfile /var/log/port_problem.log" > /root/port_problem.txt
$MAIL -s "Some ports failed" $EMAIL < /root/port_problem.txt
fi
We make it executable:
chmod +x /root/ports_failed
This works like the first script: once the problem is fixed, you must run
/root/ports_failed fix
so that the script starts checking again. Now add both scripts to your crontab:
crontab -e
* * * * * /root/ports_failed >/dev/null 2>&1
* * * * * /root/lb2_check >/dev/null 2>&1
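Both scripts on lb1 rely on the same "notify once" idea: a flag file records that a mail was sent, and later cron runs stay quiet until the flag is removed. A minimal sketch of that pattern follows. It runs against a temporary directory instead of /root, and `check_once`, `fix_once`, `send_mail` and `STATE_DIR` are hypothetical names; `send_mail` only counts calls rather than sending anything.

```shell
#!/bin/bash
# Hypothetical sketch of the "notify once" pattern; paths and names are
# placeholders, not the ones used on the load balancers.

STATE_DIR=$(mktemp -d)          # stand-in for /root
FLAG="$STATE_DIR/problem.txt"   # stand-in for port_problem.txt
SENT=0                          # counts how many mails would be sent

# Stand-in for: $MAIL -s "subject" $EMAIL < file
send_mail() {
    SENT=$((SENT + 1))
}

# check_once LOGFILE PATTERN
# Notifies once when PATTERN shows up in LOGFILE, then stays quiet
# for as long as the flag file exists.
check_once() {
    [ -f "$FLAG" ] && return 0            # already notified
    if grep -q "$2" "$1" 2>/dev/null; then
        echo "Problem found: $2" > "$FLAG"
        send_mail
    fi
}

# fix_once LOGFILE: what the "fix" argument does
fix_once() {
    rm -f "$FLAG"
    : > "$1"                              # empty the log
}

LOG="$STATE_DIR/test.log"
echo "Deleted real server" > "$LOG"       # made-up sample line
check_once "$LOG" Deleted    # first run: notifies
check_once "$LOG" Deleted    # second run: flag exists, no mail
echo "mails sent: $SENT"     # prints "mails sent: 1"
```

Emptying the log in `fix_once` matters as much as removing the flag: if the old "Deleted" line stayed in the log, the next cron run would raise the same alert again.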
15.2 Monitoring from lb2.example.com
Monitoring from the second load balancer is important because it tells us whether the master load balancer has failed and, if it has, keeps an eye on port failures on web1 and web2. First we must install sendmail so lb2.example.com is able to send mail:
apt-get install sendmail
vi /root/ports_check
And paste this script:
#!/bin/bash
# Ldirectord ports failure check
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script does 2 verifications ###
### 1) check if master load balancer failed and send mail notification ###
### 2) If master load balancer failed, check for port failure on load balanced servers ###
### To be modified ###
EMAIL="admin@example.com"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
### Date ###
NOW=$(date)
### To restore to original when problem fixed ###
if [ "$1" = "fix" ]; then
# rm -f removes the flag files and stays quiet if they do not exist
rm -f /root/lb1_problem.txt /root/port_problem.txt /root/server_problem_notified.txt
# Empty both logs so old failures are not detected again
> /var/log/ldirectord.log
> /var/log/ha-log
exit 0
fi
#check if ldirectord is running on lb2.example.com (means that lb1.example.com failed)
#$LDIRECTORD /etc/ha.d/ldirectord.cf status 2>&1 | grep running
cat /var/log/ha-log | grep "takeover complete" > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
###check if already notified###
cd /root
if [ -f port_problem.txt ]; then
cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
exit 1;
fi
### Check if port failed ###
cat /var/log/ldirectord.log 2>&1 | grep Deleted
if [ "$?" -ne "1" ]; then
cat /var/log/ldirectord.log | grep Deleted > /var/log/port_problem.log
echo "Ports problem see logfile /var/log/port_problem.log" > /root/port_problem.txt
$MAIL -s "Some ports failed" $EMAIL < /root/port_problem.txt
fi
### Check if already notified that master load balancer failed ###
cd /root
if [ -f server_problem_notified.txt ]; then
exit 1;
fi
### Notify that master load balancer failed ###
cd /root
MESSAGE="$NOW : Master load balancer failed"
echo $MESSAGE > lb1_problem.txt
$MAIL -s "Master load balancer failed" $EMAIL < /root/lb1_problem.txt
echo "notified" > server_problem_notified.txt
fi
We make it executable:
chmod +x /root/ports_check
And we add it to our crontab:
crontab -e
* * * * * /root/ports_check >/dev/null 2>&1
When you get a notification from the script, fix the problem and then run:
/root/ports_check fix
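The log checks in these scripts pipe `cat` into `grep` and then inspect `$?`. The same test can be written more directly, since `grep -q` prints nothing and its exit status can drive the `if` itself. A small sketch, assuming a throwaway file in place of /var/log/ha-log; `log_has` is a hypothetical helper and the sample log line is made up.

```shell
#!/bin/bash
# log_has LOGFILE PATTERN: hypothetical helper.
# Returns 0 if a line of LOGFILE contains the fixed string PATTERN,
# non-zero if it does not (or if the file cannot be read).
log_has() {
    grep -qF -- "$2" "$1" 2>/dev/null
}

# Demo on a throwaway file standing in for /var/log/ha-log
DEMO_LOG=$(mktemp)
echo "sample line containing takeover complete" > "$DEMO_LOG"

if log_has "$DEMO_LOG" "takeover complete"; then
    echo "takeover detected"
else
    echo "no takeover"
fi
# prints "takeover detected"
```

Note that this form treats an unreadable log file as "no match", which is also what the `cat file | grep` pipelines do.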
15.3 Monitoring from web1 & web2
Monitoring of the web cluster is already partially done with monit and munin. The part that is not covered yet is the monitoring of MySQL replication. Please read the following article: Repair MySQL master-master replication
MySQL monitoring is optional, but on a production server problems can happen with MySQL replication, so I really recommend using those scripts or something similar to check database consistency.
15.4 Monitoring from remote server
This part adds extra security by checking the important ports (25, 53, 80, 443) from a remote server (install the dnsutils package for dig):
#!/bin/bash
# Script to check important port on remote webserver
# Copyright (c) 2008 blogama.org
# This script is licensed under GNU GPL version 2.0 or above
# ---------------------------------------------------------------------
### This script does a verification on port 25, 53, 80 and 443 ###
### After 2 failed checks it will send a mail notification ###
### To be modified ###
WEBSERVERIP="192.168.1.106"
MAILSERVERIP="192.168.1.106"
EMAIL="admin@example.com"
DNSSERVERIP="192.168.1.106"
DOMAINTOCHECKDNS="example.com"
DOMAINIP="192.168.1.106"
###### Do not make modifications below ######
### Binaries ###
MAIL=$(which mail)
TELNET=$(which telnet)
DIG=$(which dig)
### Check if already notified###
cd /root
if [ -f server_problem.txt ]; then
exit 1;
fi
### Test SMTP ###
(
echo "quit"
) | $TELNET $MAILSERVERIP 25 | grep Connected > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
echo "PORT CONNECTED"
else
if [ -f server_problem_first_time_25.txt ]; then
echo "PORT 25 NOT CONNECTED" >> /root/server_problem.txt
else
echo "NOT CONNECTED" > /root/server_problem_first_time_25.txt
fi
fi
### Test HTTP ###
(
echo "quit"
) | $TELNET $WEBSERVERIP 80 | grep Connected > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
echo "PORT CONNECTED"
else
if [ -f server_problem_first_time_80.txt ]; then
echo "PORT 80 NOT CONNECTED" >> /root/server_problem.txt
else
echo "NOT CONNECTED" > /root/server_problem_first_time_80.txt
fi
fi
### Test HTTPS###
(
echo "quit"
) | $TELNET $WEBSERVERIP 443 | grep Connected > /dev/null 2>&1
if [ "$?" -ne "1" ]; then
echo "PORT CONNECTED"
else
if [ -f server_problem_first_time_443.txt ]; then
echo "PORT 443 NOT CONNECTED" >> /root/server_problem.txt
else
echo "NOT CONNECTED" > /root/server_problem_first_time_443.txt
fi
fi
### Test DNS ###
$DIG $DOMAINTOCHECKDNS @$DNSSERVERIP | grep $DOMAINIP
if [ "$?" -ne "1" ]; then
echo "PORT CONNECTED"
else
if [ -f server_problem_first_time_53.txt ]; then
echo "PORT 53 NOT CONNECTED" >> /root/server_problem.txt
else
echo "NOT CONNECTED" > /root/server_problem_first_time_53.txt
fi
fi
### Send mail notification after 2 failed check ###
if [ -f server_problem.txt ]; then
$MAIL -s "Server problem" $EMAIL < /root/server_problem.txt
fi
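The "notify after 2 failed checks" logic is repeated once per port in the script above. It could be factored into one function, as sketched here. `record_result`, `STATE_DIR` and `REPORT` are hypothetical names, and the sketch uses a temporary directory instead of /root. Unlike the original, it also clears the first-strike file when the port answers again, which is an assumption about the intended behaviour.

```shell
#!/bin/bash
STATE_DIR=$(mktemp -d)                 # stand-in for /root
REPORT="$STATE_DIR/server_problem.txt"

# record_result PORT STATUS
# STATUS is the exit status of the port test (0 = reachable).
record_result() {
    local port=$1 status=$2
    local strike="$STATE_DIR/server_problem_first_time_$port.txt"
    if [ "$status" -eq 0 ]; then
        rm -f "$strike"                # port is back: forget first strike
    elif [ -f "$strike" ]; then
        echo "PORT $port NOT CONNECTED" >> "$REPORT"   # second failure
    else
        echo "NOT CONNECTED" > "$strike"               # first failure
    fi
}

record_result 443 1    # first failure: only the strike file is written
record_result 443 1    # second failure: the report gets a line
cat "$REPORT"          # prints "PORT 443 NOT CONNECTED"
```

In the real script the status would be the `$?` of the `telnet ... | grep Connected` pipeline (or of the `dig ... | grep` pipeline for DNS), captured right after it runs.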
Et voila! Feel free to send me private emails at admin [at] marchost.com or post comments here or on my page : blogama.org