您好,登錄后才能下訂單哦!
可能同學經(jīng)常會遇到生產(chǎn)環(huán)境下的某臺跑Java的服務器,在剛發(fā)布時的時候一切都很正常,在運行一段時間后就出現(xiàn)CPU占用很高或負載飆高等現(xiàn)象,好一點的負載或CPU一天比一天高,差的情況,就是隨機進行抖動,后又恢復正常,給運維及開發(fā)同學帶來了不少困擾。當然,出現(xiàn)此問題時,后續(xù)要如何改進,諸如:代碼上線前要進行review、相關(guān)強弱依賴服務隔離/降級等、單元測試、回歸測試、SQL上線審核、基礎(chǔ)及業(yè)務監(jiān)控、相關(guān)流程制度等。
若CPU使用率或負載飆高,且持續(xù)時間較長,網(wǎng)上也有大量的排查步驟
方法一
1.使用top定位占用CPU高的進程PID
top
2.獲取線程信息
ps -mp PID ?-o THREAD,tid,time | sort -rn?
3.將需要的線程ID轉(zhuǎn)換為16進制格式
printf "%x\n" tid
4.打印線程的堆棧信息
jstack pid ?|grep ?tid ?#這里的tid就是步驟3生成的 十六進制格式的tid
方法二(推薦)
可快速定位thread及thread的cpu使用率
#!/bin/bash #?@Function #?Find?out?the?most?cpu?consumed?threads?of?java,and?print?the?stack?trace?of?these?threads. # #?@Usage #???$./javacpu?-h # PROG=`basename?$0` usage(){ cat?<<EOF Usage:?${PROG}?[OPTION]?... Find?out?the?highest?cpu?consumed?threads?of?java,and?print?the?stack?of?these?threads. Example:?${PROG}?-c?10 Options: -p,--pid????find?out??highest?cpu?consumed?threads?from?the?specifed?java?process, default?from?all?java?process. -c,--count???set?the?thread?count?to?show,default?is?5 -h,--help????display?this?help?and?exit EOF exit?$1 } ARGS=`getopt?-n?"$PROG"?-a?-o?c:p:h?-l?count:,pid:,help?--?"$@"?` [?$??-ne?0?]?&&?usage?1 eval?set?--?"${ARGS}" while?true;do case?"$1"?in -c|--count) count="$2" shift?2 ;; -p|--pid) pid="$2" shift?2 ;; -h|--help) usage ;; --) shift break ;; esac done count=${count:-10} redEcho(){ [?-c?/dev/stdout?]?&&{ #?if?stdout?is?console,turn?on?color?output. echo??-ne?"\033[1;31m" echo?-n?"$@" echo?-e?"\033[0m" ? }?||?echo?"$@" } ##?check?jstack?cmd if?!?which?jstack?&>?/dev/null;?then [?-n?"$JAVA_HOME"?]?&&?[?-f?"$JAVA_HOME/bin/jstack"?]?&&?[?-x?"$JAVA_HOME/bin/jstack"?]?&&{ export?PATH="$JAVA_HOME/bin:$PATH" }?||?{ redEcho?"Error:jstack?nof?found?on?PATH?and?JAVA_HOME!" exit?1 } fi uuid=`date?+%s`_${RANDOM}_$$ cleanupWhenExit(){ rm?/tmp/${uuid}_*?&>?/dev/null } trap?"cleanupWhenExit"?EXIT printStackOfThread(){ while?read?threadLine?;?do pid=`echo?${threadLine}?|?awk?'{print?$1}'` threadId=`echo?${threadLine}?|?awk?'{print?$2}'` threadId0x=`printf?%x?${threadId}` user=`echo?${threadLine}??|?awk?'{print?$3}'` pcpu=`echo?${threadLine}???|?awk?'{print?$5}'` jstackFile=/tmp/${uuid}_${pid} [?!?-f?"${jstackFile}"??]?&&?{ jstack?${pid}?>?${jstackFile}?||{ redEcho?"Fail?to?jstack?java?process?${pid}!" rm?${jstackFile} continue } } redEcho?"The?stack?of?busy(${pcpu}%)?thread(${threadId}/0x${htreadId0x}) of?java?process(${pid})?of?user(${user}):" sed?"/nid=0x${threadId0x}/,/^$/p"?-n?${jstackFile} done } [?-z?"${pid}"?]?&&?{ ps?-Leo?pid,lwp,user,comm,pcpu?--no-headers|awk?'$4=="java"{print?$0}'?|sort?-k5?-r?-n?|head?--lines?"${count}"?|?printStackOfThread }?||?{ ps?-Leo?pid,lwp,user,comm,pcpu?--no-headers?|awk?-v?"pid=${pid}"?'$1==pid,$4=="java"{print?$0}'?|?sort?-k5?-r?-n?|head?--lines?"${count}"?|?printStackOfThread }
方法三(針對Java服務器的load負載隨機抖動情況)
#!/usr/bin/env?python import?os import?time,?datetime import?threading #?desc:?when?system?loadavg?1?min?load?lt?10,then?dump?java?jstack def?load_stat(): loadavg?=?{} f?=?open("/proc/loadavg") info?=?f.read().split() f.close() loadavg['lavg_1']?=?info[0] loadavg['lavg_5']=?info[1] loadavg['lavg_15']=?info[2] start_time?=?datetime.datetime.strptime(str(datetime.datetime.now().date())?+?'00:00',?'%Y-%m-%d%H:%M') curr_time?=?datetime.datetime.now() end_time?=?datetime.datetime.strptime(str(datetime.datetime.now().date()?+?datetime.timedelta(days=2))?+?'23:59',?'%Y-%m-%d%H:%M') if?(start_time?<=?curr_time??<=?end_time?)?: if?float(loadavg['lavg_1'])?>=?11: pid?=?os.popen("jps?|grep?-v?Jps|awk?'{print?$1}'").read() cmd?=?"jstack"?+?"?"?+?pid stack?=?os.popen(cmd).read() tm?=?time.strftime("%Y-%m-%d_%H-%M-%S",?time.localtime()) timeslog?=?'java_stack_'?+?tm?+?r'.txt' log_f?=?open(timeslog,?'w') log_f.write(stack) log_f.close() cmd_2="ps?-mp?"?+?pid.strip('\n')?+?"?-o?THREAD,tid,time?|?sort?-rn" top_tid_info=os.popen(cmd_2).read() cpu_tid_logs='tid_cpu_'?+?tm?+?r'.txt' log_f2?=?open(cpu_tid_logs,'w') log_f2.write(top_tid_info) log_f2.close() threading.Timer(5,?load_stat).start() else: threading.Timer(5,?load_stat).start() else: exit #return?loadavg load_stat()
免責聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點不代表本網(wǎng)站立場,如果涉及侵權(quán)請聯(lián)系站長郵箱:is@yisu.com進行舉報,并提供相關(guān)證據(jù),一經(jīng)查實,將立刻刪除涉嫌侵權(quán)內(nèi)容。