溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊(cè)×
其他方式登錄
點(diǎn)擊 登錄注冊(cè) 即表示同意《億速云用戶服務(wù)條款》

頻發(fā):故障排除之又見(jiàn) ORA-4031丨云和恩墨技術(shù)通訊

發(fā)布時(shí)間:2020-08-11 01:46:39 來(lái)源:ITPUB博客 閱讀:422 作者:數(shù)據(jù)和云 欄目:數(shù)據(jù)庫(kù)

親愛(ài)的讀者朋友:

為了及時(shí)共享行業(yè)案例,通知共性問(wèn)題,達(dá)成共享和提前預(yù)防,我們整理和編輯了《云和恩墨技術(shù)通訊》,通過(guò)對(duì)過(guò)去一段時(shí)間的知識(shí)回顧,故障歸納,以期提供有價(jià)值的信息供大家參考。同時(shí),我們也希望能夠?qū)狳c(diǎn)事件、新的產(chǎn)品特性及其他有價(jià)值的信息聚集起來(lái),為您提供具有前瞻性的支持信息,保持對(duì)于當(dāng)前最新的數(shù)據(jù)庫(kù)新聞和事件的了解,其中包括重要數(shù)據(jù)庫(kù)產(chǎn)品發(fā)布、警報(bào)、更新、新版本、補(bǔ)丁等。

本期目錄:

新聞:2019年11月數(shù)據(jù)庫(kù)流行度排行

經(jīng)驗(yàn):Oracle RAC跨節(jié)點(diǎn)訪問(wèn)數(shù)據(jù)塊,節(jié)點(diǎn)長(zhǎng)事務(wù)加劇gc等待

經(jīng)驗(yàn):IBM MQ通道連接數(shù)達(dá)到最大故障分析

問(wèn)題:子游標(biāo)過(guò)多導(dǎo)致數(shù)據(jù)庫(kù)HANG

頻發(fā):再談Library Cache Lock

頻發(fā):故障排除之又見(jiàn)ORA-4031

警示:強(qiáng)制關(guān)閉OGG進(jìn)程觸發(fā)bug致abended

公告:首屆墨天輪年度十大突出貢獻(xiàn)人物評(píng)選活動(dòng)

云和恩墨技術(shù)通訊集錦: https://www.modb.pro/doc/topic/5927

部分精選-頻發(fā):故障排除之又見(jiàn) ORA-4031  作者:候靜遠(yuǎn)


當(dāng)遇到ORA-4031錯(cuò)誤時(shí),你會(huì)不會(huì)內(nèi)心一緊。Oracle進(jìn)程在向SGA申請(qǐng)內(nèi)存時(shí),如果申請(qǐng)失敗,則會(huì)拋出這個(gè)錯(cuò)誤,大部分情況下是在向SGA中的 shared pool申請(qǐng)內(nèi)存時(shí)失敗。嚴(yán)重情況下,可能導(dǎo)致數(shù)據(jù)庫(kù)出現(xiàn)異常崩潰。本文分享客戶近期碰到的一起由于ORA-4031問(wèn)題導(dǎo)致數(shù)據(jù)庫(kù)異常宕機(jī)的案例,供大家參考。

問(wèn)題描述

2019年9月4日凌晨3點(diǎn)左右,接到監(jiān)控系統(tǒng)告警:數(shù)據(jù)庫(kù)出現(xiàn)異常,無(wú)法連接。登陸到數(shù)據(jù)庫(kù)1節(jié)點(diǎn)查看后臺(tái)alert日志發(fā)現(xiàn)有大量ORA-04031報(bào)錯(cuò),2節(jié)點(diǎn)有少量報(bào)錯(cuò)。為了盡快恢復(fù)業(yè)務(wù),嘗試直接重啟1節(jié)點(diǎn)數(shù)據(jù)庫(kù),重啟完成之后恢復(fù)正常。

問(wèn)題分析

1. 節(jié)點(diǎn)后臺(tái)對(duì)應(yīng)alert日志:

Wed Sep 04 03:57:50 2019

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set   ...","sga heap(2,0)","kglsim object batch")

Wed Sep 04 03:58:10 2019

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set   ...","sga heap(1,0)","kglsim object batch")

Wed Sep 04 03:58:26 2019

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set   ...","sga heap(7,0)","kglsim object batch")

Wed Sep 04 03:58:42 2019

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.col_usage$ set   ...","sga heap(6,0)","kglsim object batch")

Wed Sep 04 03:58:57 2019

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_smon_29747.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","update sys.mon_mods$ set ins...","sga heap(5,0)","kglsim object batch")

Wed Sep 04 03:59:08 2019

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx/trace/xxxxx_xxx0_42548.trc:

ORA-00604: error occurred at recursive SQL level 1

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select count(*) from sys.job...","sga heap(3,0)","kglsim object batch")

Wed Sep 04 03:59:10 2019

License high water mark = 97

USER (ospid: 28750): terminating the instance

統(tǒng)計(jì)1節(jié)點(diǎn)每個(gè)子池及duration出現(xiàn)04031的次數(shù),sga heap(n,0)-n代表第幾個(gè)子池,0代表是第幾個(gè)duration:

頻發(fā):故障排除之又見(jiàn) ORA-4031丨云和恩墨技術(shù)通訊

根據(jù)alert日志可以看出,所有的ora-4031都發(fā)生在shared pool子池的第0個(gè)duration上。

Summary of resize operations history:

shared pool            start   3.19 GB  now   3.19 GB  0 grows   0 shrinks

large pool             start   0.50 GB  now   0.50 GB  0 grows   0 shrinks

java pool              start   0.50 GB  now   0.50 GB  0 grows   0 shrinks

SGA Target             start  32.00 GB  now  32.00 GB  0 grows   0 shrinks

DEFAULT buffer cache   start  27.59 GB  now  27.59 GB  0 grows   0 shrinks

PGA Target             start  11.00 GB  now  11.00 GB  0 grows   0 shrinks

發(fā)現(xiàn)shared pool并沒(méi)有進(jìn)行resize。

==============================================

TOP 20 MEMORY USES ACROSS SGA HEAP 1 - 7

----------------------------------------------

"KGLH0                     "       1103 MB 19%

"SQLA                      "       1081 MB 18%

"free memory               "        835 MB 14%

"gcs resources             "        794 MB 14%

"gcs shadows               "        550 MB  9%

"db_block_hash_buckets     "        178 MB  3%

"ASH buffers               "        160 MB  3%

"KGLHD                     "        157 MB  3%

"Checkpoint queue          "        156 MB  3%

"kglsim object batch       "         90 MB  2%

"kglsim heap               "         56 MB  1%

"ges resource              "         53 MB  1%

"ges enqueues              "         43 MB  1%

"KGLDA                     "         41 MB  1%

"dbwriter coalesce buffer  "         40 MB  1%

"dirty object counts array "         40 MB  1%

"object queue              "         35 MB  1%

"gcs res hash bucket       "         32 MB  1%

"dbktb: trace buffer       "         31 MB  1%

"FileOpenBlock             "         30 MB  1%

TOTALS ---------------------------------------

Total free memory                   830 MB

Total memory alloc.                5026 MB

Grand total                        5856 MB

==============================================

2. 節(jié)點(diǎn)后臺(tái)對(duì)應(yīng)的alert日志:

Wed Sep 04 03:23:18 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q002_35378.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(4,0)","kglsim object batch")

Wed Sep 04 03:23:23 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q003_35453.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(1,0)","kglsim object batch")

Wed Sep 04 03:23:29 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q004_35725.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(5,0)","kglsim object batch")

Wed Sep 04 03:23:34 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q001_35778.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(2,0)","kglsim object batch")

Wed Sep 04 03:23:39 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q002_36069.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(6,0)","kglsim object batch")

Wed Sep 04 03:23:45 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q003_36151.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(3,0)","kglsim object batch")

Wed Sep 04 03:23:50 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q004_36242.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(7,0)","kglsim object batch")

Wed Sep 04 03:23:55 2019

Emon ping encountered error 12801

Errors in file /u01/app/oracle/diag/rdbms/xxxxx/xxxxx2/trace/xxxxx2_q001_36305.trc:

ORA-12801: error signaled in parallel query server PZ99, instance dnfwglpt1:xxxxx (1)

ORA-04031: unable to allocate 3896 bytes of shared memory ("shared pool","select inst_id, reg_id, num_...","sga heap(4,0)","kglsim object batch")

統(tǒng)計(jì)2節(jié)點(diǎn)每個(gè)子池及duration出現(xiàn)04031的次數(shù):

頻發(fā):故障排除之又見(jiàn) ORA-4031丨云和恩墨技術(shù)通訊

根據(jù)alert日志可以看出,所有的ora-4031同樣都發(fā)生在shared pool子池的第0個(gè)duration上,導(dǎo)致4031的根本原因是因?yàn)閟hared pool子池的第0個(gè)duration內(nèi)存不足。

通過(guò)設(shè)置sga_target的ASMM管理后,共享池(shared_pool)和流池(streams pool)每個(gè)子池都是4個(gè)duration。它們分別是:instance,session,cursor,execution,只有第四個(gè)duration,也就是execution是可以resize的,而當(dāng)?shù)?個(gè)duration內(nèi)存不足的時(shí)候不能resize,就直接會(huì)報(bào)錯(cuò)ora-4031。

問(wèn)題解決

通過(guò)禁用duration,必須設(shè)置參數(shù)"_enable_shared_pool_durations=fales",并重啟數(shù)據(jù)庫(kù)。

alter system set "_enable_shared_pool_durations"=false scope=spfile;

通過(guò)該參數(shù)設(shè)置后,把它們四個(gè)duration都合并到一個(gè)池中,不會(huì)再出現(xiàn)一個(gè)duration的內(nèi)存被耗盡,而另外一個(gè)duration仍具有空閑內(nèi)存,對(duì)于共享池和流池都是這樣;設(shè)置sga_target之后,所有池都會(huì)通過(guò)buffer cache來(lái)傳輸granules(顆粒)整數(shù)倍大小的內(nèi)存,如果shrink,則返回buffer cache,沒(méi)有從一個(gè)pool到另外一個(gè)pool的直接傳輸,所有的內(nèi)存resize都會(huì)以buffer cache作為源和目標(biāo)。

設(shè)置該參數(shù)的唯一負(fù)面影響是SGA resize的時(shí)候,不能從shared pool中取內(nèi)存到其他的pool。

向AI問(wèn)一下細(xì)節(jié)

免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場(chǎng),如果涉及侵權(quán)請(qǐng)聯(lián)系站長(zhǎng)郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。

AI