您好,登錄后才能下訂單哦!
親愛(ài)的讀者朋友:
為了及時(shí)共享行業(yè)案例,通知共性問(wèn)題,達(dá)成共享和提前預(yù)防,我們整理和編輯了《云和恩墨技術(shù)通訊》,通過(guò)對(duì)過(guò)去一段時(shí)間的知識(shí)回顧,故障歸納,以期提供有價(jià)值的信息供大家參考。同時(shí),我們也希望能夠?qū)狳c(diǎn)事件、新的產(chǎn)品特性及其他有價(jià)值的信息聚集起來(lái),為您提供具有前瞻性的支持信息,保持對(duì)于當(dāng)前最新的數(shù)據(jù)庫(kù)新聞和事件的了解,其中包括重要數(shù)據(jù)庫(kù)產(chǎn)品發(fā)布、警報(bào)、更新、新版本、補(bǔ)丁等。
本期目錄:
新聞:2019年11月數(shù)據(jù)庫(kù)流行度排行
經(jīng)驗(yàn):Oracle RAC跨節(jié)點(diǎn)訪問(wèn)數(shù)據(jù)塊,節(jié)點(diǎn)長(zhǎng)事務(wù)加劇gc等待
經(jīng)驗(yàn):IBM MQ通道連接數(shù)達(dá)到最大故障分析
問(wèn)題:子游標(biāo)過(guò)多導(dǎo)致數(shù)據(jù)庫(kù)HANG
頻發(fā):再談Library Cache Lock
頻發(fā):故障排除之又見(jiàn)ORA-4031
警示:強(qiáng)制關(guān)閉OGG進(jìn)程觸發(fā)bug致abended
公告:首屆墨天輪年度十大突出貢獻(xiàn)人物評(píng)選活動(dòng)
云和恩墨技術(shù)通訊集錦: https://www.modb.pro/doc/topic/5927
部分精選-頻發(fā):故障排除之又見(jiàn) ORA-4031 作者:候靜遠(yuǎn)
當(dāng)遇到ORA-4031錯(cuò)誤時(shí),你會(huì)不會(huì)內(nèi)心一緊。Oracle進(jìn)程在向SGA申請(qǐng)內(nèi)存時(shí),如果申請(qǐng)失敗,則會(huì)拋出這個(gè)錯(cuò)誤,大部分情況下是在向SGA中的 shared pool申請(qǐng)內(nèi)存時(shí)失敗。嚴(yán)重情況下,可能導(dǎo)致數(shù)據(jù)庫(kù)出現(xiàn)異常崩潰。本文分享客戶近期碰到的一起由于ORA-4031問(wèn)題導(dǎo)致數(shù)據(jù)庫(kù)異常宕機(jī)的案例,供大家參考。
問(wèn)題描述
2019年9月4日凌晨3點(diǎn)左右,接到監(jiān)控系統(tǒng)告警:數(shù)據(jù)庫(kù)出現(xiàn)異常,無(wú)法連接。登陸到數(shù)據(jù)庫(kù)1節(jié)點(diǎn)查看后臺(tái)alert日志發(fā)現(xiàn)有大量ORA-04031報(bào)錯(cuò),2節(jié)點(diǎn)有少量報(bào)錯(cuò)。為了盡快恢復(fù)業(yè)務(wù),嘗試直接重啟1節(jié)點(diǎn)數(shù)據(jù)庫(kù),重啟完成之后恢復(fù)正常。
問(wèn)題分析
1. 節(jié)點(diǎn)后臺(tái)對(duì)應(yīng)alert日志:
Wed Sep 04 03:57:50 2019
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set ...","sga heap(2,0)","kglsim object batch")
Wed Sep 04 03:58:10 2019
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set ...","sga heap(1,0)","kglsim object batch")
Wed Sep 04 03:58:26 2019
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set ...","sga heap(7,0)","kglsim object batch")
Wed Sep 04 03:58:42 2019
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set ...","sga heap(6,0)","kglsim object batch")
Wed Sep 04 03:58:57 2019
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.mon_mods$ set ins...","sga heap(5,0)","kglsim object batch")
Wed Sep 04 03:59:08 2019
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_xxx0_42548.trc:
ORA-00604: error occurred at recursive SQL level 1
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select count(*) from sys.job...","sga heap(3,0)","kglsim object batch")
Wed Sep 04 03:59:10 2019
License high water mark = 97
USER (ospid: 28750): terminating the instance
統(tǒng)計(jì)1節(jié)點(diǎn)每個(gè)子池及duration出現(xiàn)04031的次數(shù),sga heap(n,0)-n代表第幾個(gè)子池,0代表是第幾個(gè)duration:
根據(jù)alert日志可以看出,所有的ora-4031都發(fā)生在shared pool子池的第0個(gè)duration上。
Summary of resize operations history:
shared pool start 3.19 GB now 3.19 GB 0 grows 0 shrinks
large pool start 0.50 GB now 0.50 GB 0 grows 0 shrinks
java pool start 0.50 GB now 0.50 GB 0 grows 0 shrinks
SGA Target start 32.00 GB now 32.00 GB 0 grows 0 shrinks
DEFAULT buffer cache start 27.59 GB now 27.59 GB 0 grows 0 shrinks
PGA Target start 11.00 GB now 11.00 GB 0 grows 0 shrinks
發(fā)現(xiàn)shared pool并沒(méi)有進(jìn)行resize。
==============================================
TOP 20 MEMORY USES ACROSS SGA HEAP 1 - 7
----------------------------------------------
"KGLH0 " 1103 MB 19%
"SQLA " 1081 MB 18%
"free memory " 835 MB 14%
"gcs resources " 794 MB 14%
"gcs shadows " 550 MB 9%
"db_block_hash_buckets " 178 MB 3%
"ASH buffers " 160 MB 3%
"KGLHD " 157 MB 3%
"Checkpoint queue " 156 MB 3%
"kglsim object batch " 90 MB 2%
"kglsim heap " 56 MB 1%
"ges resource " 53 MB 1%
"ges enqueues " 43 MB 1%
"KGLDA " 41 MB 1%
"dbwriter coalesce buffer " 40 MB 1%
"dirty object counts array " 40 MB 1%
"object queue " 35 MB 1%
"gcs res hash bucket " 32 MB 1%
"dbktb: trace buffer " 31 MB 1%
"FileOpenBlock " 30 MB 1%
TOTALS ---------------------------------------
Total free memory 830 MB
Total memory alloc. 5026 MB
Grand total 5856 MB
==============================================
2. 節(jié)點(diǎn)后臺(tái)對(duì)應(yīng)的alert日志:
Wed Sep 04 03:23:18 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q002_35378.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(4,0)","kglsim object batch")
Wed Sep 04 03:23:23 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q003_35453.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(1,0)","kglsim object batch")
Wed Sep 04 03:23:29 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q004_35725.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(5,0)","kglsim object batch")
Wed Sep 04 03:23:34 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q001_35778.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(2,0)","kglsim object batch")
Wed Sep 04 03:23:39 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q002_36069.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(6,0)","kglsim object batch")
Wed Sep 04 03:23:45 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q003_36151.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(3,0)","kglsim object batch")
Wed Sep 04 03:23:50 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q004_36242.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(7,0)","kglsim object batch")
Wed Sep 04 03:23:55 2019
Emon ping encountered error 12801
Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q001_36305.trc:
ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)
ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(4,0)","kglsim object batch")
統(tǒng)計(jì)2節(jié)點(diǎn)每個(gè)子池及duration出現(xiàn)04031的次數(shù):
根據(jù)alert日志可以看出,所有的ora-4031同樣都發(fā)生在shared pool子池的第0個(gè)duration上,導(dǎo)致4031的根本原因是因?yàn)閟hared pool子池的第0個(gè)duration內(nèi)存不足。
通過(guò)設(shè)置sga_target的ASMM管理后,共享池(shared_pool)和流池(streams pool)每個(gè)子池都是4個(gè)duration。它們分別是:instance,session,cursor,execution,只有第四個(gè)duration,也就是execution是可以resize的,而當(dāng)?shù)?個(gè)duration內(nèi)存不足的時(shí)候不能resize,就直接會(huì)報(bào)錯(cuò)ora-4031。
問(wèn)題解決
通過(guò)禁用duration,必須設(shè)置參數(shù)"_enable_shared_pool_durations=fales",并重啟數(shù)據(jù)庫(kù)。
alter system set "_enable_shared_pool_durations"=false scope=spfile;
通過(guò)該參數(shù)設(shè)置后,把它們四個(gè)duration都合并到一個(gè)池中,不會(huì)再出現(xiàn)一個(gè)duration的內(nèi)存被耗盡,而另外一個(gè)duration仍具有空閑內(nèi)存,對(duì)于共享池和流池都是這樣;設(shè)置sga_target之后,所有池都會(huì)通過(guò)buffer cache來(lái)傳輸granules(顆粒)整數(shù)倍大小的內(nèi)存,如果shrink,則返回buffer cache,沒(méi)有從一個(gè)pool到另外一個(gè)pool的直接傳輸,所有的內(nèi)存resize都會(huì)以buffer cache作為源和目標(biāo)。
設(shè)置該參數(shù)的唯一負(fù)面影響是SGA resize的時(shí)候,不能從shared pool中取內(nèi)存到其他的pool。
免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。