生产环境WEB服务管理脚本之监控脚本

环境说明:

公司是做在线教育的互联网企业,WEB架构为:前端使用LVS + Heartbeat做负载均衡,后端主要是Apache/Nginx + Tomcat,缓存有redis和Memcached,数据库使用的Oracle和Mysql。

脚本实现目的:

通过在服务器本地运行脚本,当检测到服务不可用时,自动重启相关服务,并发邮件通知管理员。

通过脚本检测服务关键配置文件,若文件丢失,则根据备份自动恢复文件,并重启。

脚本思路:

通过脚本访问事先定义好的监控页面,,一共访问三次,每次访问间隔几秒钟,若三次访问都失败,则认为服务故障,重启服务并发邮件。配置文件的备份则是通过重启脚本来完成,所有的服务都是通过配置文件的方式传递给脚本,以方便批量部署。

脚本内容:

#!/bin/bash#Thisshell-scriptisuseforchecktomcatandapache/nginxalive#Createdin2012-11-01#Lastchangedin2013-05-25source~/.bash_profile&>/dev/nulldir=”dir=/tol/appdir2=/tol/scriptcd$dir2dt2=`date+”%Y-%m-%d”`ls$dir2/logs&>/dev/null||mkdir-p$dir2/logs&>/dev/nulllog=”$dir2/logs/tomcatwget-$dt2.log”host=`/sbin/ifconfig|grep”inetaddr”|cut-d’:’-f2|awk'{print$1}’|head-1`wgetfile=$dir2/tomcat_wgetfileconf=$dir2/tomcatwget.confconf3=$dir2/web.confsh_name=$0functionshijian(){dt=`date+”%Y-%m-%d-%H:%M:%S”`}functionapache_restart(){#stopuser=`ps-elf|grep-vgrep|grep-vroot|grep’/bin/httpd’|grep”$dir”|awk'{print$3}’|head-1`$dir/apache/bin/httpd-kstop&>/dev/nullsleep5opid=”opid=`ps-elf|grep-vgrep|grep”httpd”|grep”$dir”|awk'{print$4}’`iftest-n”$opid”;thenkill-9$opidfi#start/usr/bin/ipcs-s|grep”$user”|gawk'{print$2}’|xargs-n1ipcrmsem&>/dev/null$dir/apache/bin/httpd-kstart&>/dev/nullnpid=`ps-elf|grep-vgrep|grep”httpd”|greproot|grep”$dir”|awk'{print$4}’`shijianiftest-z”$npid”;thenecho”The$dt$hostapacherestartfailby$sh_name”>>$logecho”The$dt$hostapacherestartfailby$sh_name”|mail-s”check$hostapacherestartfail”jiankelseecho”The$dt$hostapacheisrestart,thenewpid=$npid”>>$logecho”The$dt$hostapacheisrestartby$sh_name,thenewpid=$npid”|mail-s”check$hostapacherestart”jiankfi}functionnginx_restart(){#stopopid=”opid=`cat$dir/nginx/logs/nginx.pid`kill-QUIT$opidsleep3opid=`ps-elf|grep-vgrep|grepprocess|grepnginx|awk'{print$4}’`iftest!-z”$opid”;thenkill-9$opidfi#start$dir/nginx/sbin/nginx-c$dir/nginx/conf/nginx.confnpid=”npid=`ps-elf|grep-vgrep|grepprocess|grepnginx|awk'{print$4}’`shijianiftest-z”$npid”;thenecho”The$dt$hostnginxrestartfailby$sh_name”>>$logecho”The$dt$hostnginxrestartfailby$sh_name”|mail-s”check$hostnginxrestartfail”jiankelseecho”The$dt$hostnginxisrestart,thenewpid=$npid”>>$logecho”The$dt$hostnginxisrestartby$sh_name,thenewpid=$npid”|mail-s”check$hostnginxrestart”jiankfi}functionhttp_restart(){iftest!-f$conf3;thenecho”The$conf3isnotexist”>>$logapache_restartsleep3nginx_restartelsewhilereadlinedodir=`echo$line|awk-F\;'{print$1}’`softname=`echo$line|awk-F\;'{print$2}’`$softname\_restartdone<$conf3fi}functionweb_reboot(){http_restartsh$dir2/tomcat-restart.sh$name&>/dev/nullexit0}#conf-checkiftest!-f$conf;thenecho”The$confisnotexist”exit0fishijianecho”$dt”>>$logecho”$host”>>$log##Apache-checkdomain=`/usr/bin/head-1$conf|awk-F\;'{print$1}’`iftest`echo$domain|grephttp`;thenhtml=”$domain”elsehtml=http://$domain/t-s.htmlficat/dev/null>$wgetfile/usr/bin/elinks-dump$html>>$wgetfile2>&1sleep3/usr/bin/elinks-dump$html>>$wgetfile2>&1sleep3/usr/bin/elinks-dump$html>>$wgetfile2>&1num=”num=`grep-i-c”ok”$wgetfile`if(($num<1));thenshijianecho”The$dt$host$htmlwgeterror”>>$logecho”The$dt$host$htmlwgeterror”|mail-s”checkthe$hostwgeterror”jiankhttp_restartelif(($num<3&&$num>=1));thenecho”The$host$htmlwgetoknumberis$num”|mail-s”$hostwgetoknumber”jiankelseecho”The$host$htmlwgetsuccess”>>$logfiwhilereadlinedoshijianfile1=”file1=`echo$line|awk-F\;'{print$2}’`file2=”file2=`echo$line|awk-F\;'{print$3}’`name=`echo$line|awk-F\;'{print$4}’`port=”port=`echo$line|awk-F\;'{print$5}’`functionpid(){pid=”pid=`ps-elf|grepjava|grep”$dir”|grep$name|awk'{print$4}’`}functiondown(){`echo$line|awk-F\;'{print$6}’`sleep3pidiftest-n”$pid”;thenkill-9$pidfirm-rf$dir/$name/work/Catalina/*rm-rf$dir/$name/temp/*}functionup(){`echo$line|awk-F\;'{print$7}’`}#web.xml-checkiftest”$file1″;theniftest!-f$file1;thenshijian\cp$file1.bak$file1&&chowntomcat.tomcat$file1&>/dev/nulliftest-f$file1;thenecho”The$dt$file1islostandwillberecovery”>>$logecho”The$dt$file1islostandwillberecovery”|mail-s”checkthe$host$file1lost”jiankweb_rebootelseecho”The$dt$file1islostandrecoveryfail”>>$logecho”The$dt$file1islostandrecoveryfail”|mail-s”checkthe$host$file1lost”jiankfielseecho”The$dt$file1isok”>>$logfifi#ROOT.xml-checkiftest”$file2″;theniftest!-f$file2;thenshijian\cp$file2.bak$file2&&chowntomcat.tomcat$file2&>/dev/nullecho”The$dt$file2donotexist”>>$logecho”The$dt$file2donotexist”|mail-s”checkthe$host$file2lost”jiankdownuppidiftest-n”$pid”;thenecho”$dt$host$file2islostandrestartby$sh_name”|mail-s”$host$file2islost”jiankelseupecho”The$dt$host$file2islostandrestartfailby$sh_name”>>$logecho”The$dt$host$file2islostandrestartfailby$sh_name”|mail-s”check$host$namerestartfail”jiankfielseecho”The$dt$file2isok”>>$logfifi#tomcat-checkiftest”$port”;theniftest`echo$port|grephttp`;thenjsp=”$port”elsejsp=http://127.0.0.1:$port/t-s.jspficat/dev/null>$wgetfile/usr/bin/elinks-dump$jsp>>$wgetfile2>&1sleep5/usr/bin/elinks-dump$jsp>>$wgetfile2>&1sleep5/usr/bin/elinks-dump$jsp>>$wgetfile2>&1num=”num=`grep-i-c”ok”$wgetfile`if(($num<1));thenshijianecho”The$dt$host$namewgeterror3timesandbegintorestart”>>$logdownuppidiftest-n”$pid”;thenecho”$dtRestart$namesuccesspid=”$pid>>$logecho”$dt$host$namewgeterrorandrestartby$sh_name”|mail-s”$host$namewgeterror”jiankelseupecho”The$dt$host$namewgeterrorandrestartfailby$sh_name”>>$logecho”The$dt$host$namewgeterrorandrestartfailby$sh_name”|mail-s”check$host$namerestartfail”jiankfielif(($num<3&&$num>=1));thenecho”The$host$namewgetoknumberis$num”|mail-s”$hosttomcatwgetoknumber”jiankelseecho”The$host$namewgetsuccess”>>$logfifidone<$conf绊脚石乃是进身之阶。

生产环境WEB服务管理脚本之监控脚本

相关文章:

你感兴趣的文章:

标签云: