溫馨提示×

溫馨提示×

您好,登錄后才能下訂單哦!

密碼登錄×
登錄注冊×
其他方式登錄
點(diǎn)擊 登錄注冊 即表示同意《億速云用戶服務(wù)條款》

Resource Manager關(guān)閉引起的自動(dòng)JOB無法調(diào)用

發(fā)布時(shí)間:2020-08-09 17:54:12 來源:ITPUB博客 閱讀:194 作者:ericwen2015 欄目:關(guān)系型數(shù)據(jù)庫

 

內(nèi)容介紹

本文詳細(xì)介紹了在客戶現(xiàn)場一次因?yàn)?span>resmgr:cpu quantum等待事件而導(dǎo)致CPU大量被消耗,在確認(rèn)原因并將導(dǎo)致CPU沖高的Resource Manager資源管理器特性關(guān)閉后,后續(xù)導(dǎo)致了后臺(tái)alert日志中大量的自動(dòng)定時(shí)JOB沒有執(zhí)行并報(bào)錯(cuò):

ORA-12012: error on auto execute of job "SYS"."ORA$AT_OS_OPT_SY_255"

ORA-29373: resource manager is not on

 

客戶操作系統(tǒng):RHEL6

數(shù)據(jù)庫版本:11.2.0.3.11

         以下我們將描述并說明Resource Manager資源管理器以及后續(xù)的解決處理排查等一系列的過程。

 

概念普及

Resource Manager資源管理器是oracle10g推出,并在11g中得以完善的一項(xiàng)功能。資源管理器通過控制數(shù)據(jù)庫內(nèi)部的執(zhí)行調(diào)度任務(wù)控制資源在各個(gè)會(huì)話之間的分配。通過控制所要運(yùn)行的會(huì)話以及會(huì)話運(yùn)行的時(shí)間長度,數(shù)據(jù)庫資源管理器可以確保資源利用及分配和我們計(jì)劃中的配置一樣,有效利用資源,當(dāng)然,Resource Manager主要控制資源包括如下幾塊:

Oracle會(huì)話的CPU使用率

并行度

SQL語句操作執(zhí)行時(shí)間

會(huì)話空閑時(shí)間

活躍會(huì)話(session)數(shù)

UNDO管理

當(dāng)然,在真正的生產(chǎn)環(huán)境中,Resource Manager還沒有被大量的使用起來,但是在11g(11.1.0.6 to 11.1.0.7  and 11gR2)開始,oracle默認(rèn)開啟Resource Manager計(jì)劃,如下顯示oracle后臺(tái)日志:

Setting Resource Manager plan SCHEDULER[0x51B5]:DEFAULT_MAINTENANCE_PLAN via scheduler window

Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter

Thu Feb 05 22:00:03 2009

Begin automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"

Thu Feb 05 22:00:39 2009

End automatic SQL Tuning Advisor run for special tuning task "SYS_AUTO_SQL_TUNING_TASK"

 

 

如下顯示在10g11g中,oracle對(duì)Resource Manager采用的不同策略:

Subject

10G

11G

Maintenance Window

2 windows, WEEK and WEEKEND

Each day has its own window

Resource manager

Not enabled per default

Default resource plan specified

 

 

故障詳解

客戶新系統(tǒng)上線,在上線后,發(fā)現(xiàn)數(shù)據(jù)庫出現(xiàn)大量的等待事件:resmgr:cpu quantum,這個(gè)等待事件是在11g中一個(gè)比較常見的等待事件,消耗大量CPU 系統(tǒng)性能變差,在和客戶溝通后,決定對(duì)該特性進(jìn)行關(guān)閉,以下為關(guān)閉該特性步驟:

ALTER SYSTEM SET “_resource_manager_always_on”=FALSE SCOPE=SPFILE SID='*';

 

 #1,Change the active windows to use the null resource manager plan (or other nonrestrictive plan) using:

 

   execute dbms_scheduler.set_attribute('WEEKNIGHT_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('WEEKEND_WINDOW','RESOURCE_PLAN','');

 

#2,Since in 11g there are more Maintenance Windows, we should add them too:

 

   execute dbms_scheduler.set_attribute('SATURDAY_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('SUNDAY_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('MONDAY_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('TUESDAY_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('WEDNESDAY_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('THURSDAY_WINDOW','RESOURCE_PLAN','');

   execute dbms_scheduler.set_attribute('FRIDAY_WINDOW','RESOURCE_PLAN','');

 

 #3. Then, for each window_name (WINDOW_NAME from DBA_SCHEDULER_WINDOWS), run:

 

   execute dbms_scheduler.set_attribute('<window name>','RESOURCE_PLAN','');

 

以上關(guān)掉Resource Manager后效果確實(shí)很明顯,系統(tǒng)運(yùn)行穩(wěn)定,CPU處于正常水平。但是運(yùn)行一段時(shí)間后后臺(tái)開始報(bào)錯(cuò):

ORA-12012: error on auto execute of job "SYS"."ORA$AT_SQ_SQL_SW_186"

ORA-29373: resource manager is not on

從報(bào)錯(cuò)上看,似乎和Resource Manager有一定的關(guān)系,從JOB的命名來看,這個(gè)JOB是"SQL Tuning Advisor Job", 用于診斷和監(jiān)控高負(fù)載的SQL, 并為ADDM提供SQL優(yōu)化建議的JOB,第一次,我們對(duì)該報(bào)錯(cuò)的JOB,也就是SQL Tuning Advisor進(jìn)行了關(guān)閉,關(guān)閉過程如下:

exec DBMS_AUTO_TASK_ADMIN.DISABLE('SQL TUNING ADVISOR',NULL, NULL);

 

關(guān)閉完后又有相應(yīng)新的JOB開始報(bào)錯(cuò):

ORA-12012: error on auto execute of job "SYS"."ORA$AT_OS_OPT_SY_255"

ORA-29373: resource manager is not on

我們檢查后臺(tái)最近的JOB執(zhí)行情況:

SQL> select client_name,window_name,job_name,job_status from dba_autotask_job_history ;

 

CLIENT_NAME                     WINDOW_NAME      JOB_NAME              JOB_STATUS 

----------------------------- -------------------------------------  --------

auto optimizer stats collection WEDNESDAY_WINDOW  ORA$AT_OS_OPT_SY_102  FAILED 

auto optimizer stats collection FRIDAY_WINDOW     ORA$AT_OS_OPT_SY_105  FAILED 

auto optimizer stats collection TUESDAY_WINDOW    ORA$AT_OS_OPT_SY_99   FAILED 

auto optimizer stats collection  MONDAY_WINDOW    ORA$AT_OS_OPT_SY_96   FAILED

在以上的腳本輸出中,我們發(fā)現(xiàn)大量的后臺(tái)JOB報(bào)錯(cuò),而以上JOB報(bào)錯(cuò)從告警日志上看,基本都是由于Resource Manager關(guān)閉造成。但是我們知道,Resource Manager的開啟關(guān)閉應(yīng)該不至于影響JOB的正常調(diào)度,而且在相同的數(shù)據(jù)庫版本下,我們在AIX上卻沒有發(fā)現(xiàn)問題,而在檢查過程中,我們發(fā)現(xiàn),雖然我們已經(jīng)關(guān)閉了Resource Manager,但是相關(guān)的DBRM進(jìn)程卻依舊存在,ORACLEDBRM進(jìn)程進(jìn)程即為Resource Manager的管理進(jìn)程,在某種情況下,關(guān)閉了Resource Manager卻發(fā)現(xiàn)該進(jìn)程依舊存在,這本身就存在一定的問題。這從一定程度上給予我們一定的懷疑方向,可能存在resource并沒有完全關(guān)閉,而且從詳細(xì)的trace文件中的從call stack trace里來看, 當(dāng)前進(jìn)程也沒有沒有關(guān)于resource manager的函數(shù)調(diào)用,而是一直在向另外一個(gè)進(jìn)程post message。

所以,我們懷疑錯(cuò)誤很有可能是由于DBRM沒有正常關(guān)閉造成。

#ps -ef|grep dbrm 

oracle 34920 1 0 Apr18 ? 00:00:59  ora_dbrm_nfdb1

 

在對(duì)問題進(jìn)行了詳細(xì)的跟蹤及分析后,我們向ORACLE提交了SR,ORACLE反饋給我們第二個(gè)隱含參數(shù): _resource_manager_always_on設(shè)置為FALSE

我們嘗試對(duì)該參數(shù)進(jìn)行設(shè)置,并重啟了數(shù)據(jù)庫生效該參數(shù)后,DBRM進(jìn)程消失:

# ps -ef | grep dbrm

root      4650  3647  0 21:58 pts/2    00:00:00 grep dbrm

且相關(guān)的JOB執(zhí)行正常。為了確認(rèn)問題,我們分別進(jìn)行了3個(gè)實(shí)驗(yàn),分別為:

1. 設(shè)置“_resource_manager_always_on隱含參數(shù),關(guān)閉resource manager windows 調(diào)用計(jì)劃

2. 刪除隱含參數(shù),只設(shè)置resource manager windows 調(diào)用計(jì)劃

3. 添加2個(gè)隱含參數(shù),關(guān)閉resource manager windows 調(diào)用計(jì)劃

 

關(guān)閉數(shù)據(jù)庫,調(diào)整時(shí)間,設(shè)置_resource_manager_always_on隱含參數(shù),關(guān)閉windows 計(jì)劃

關(guān)閉數(shù)據(jù)庫:

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

 

調(diào)整時(shí)間:

[root@rhel6 ~]# date -s 2014/05/14

[root@rhel6 ~]# date -s 21:57:00

[root@rhel6 ~]# clock -w

[root@rhel6 ~]# date

Wed May 14 21:57:35 CST 2014

 

添加隱含參數(shù),啟動(dòng)至open

SQL> ALTER SYSTEM SET "_resource_manager_always_on"=FALSE SCOPE=SPFILE;

SQL> startup force(測試環(huán)境,直接force啟動(dòng),生產(chǎn)環(huán)境勿如此操作)

 

設(shè)置resource manager plan:

execute dbms_scheduler.set_attribute('SATURDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('SUNDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('MONDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('TUESDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('WEDNESDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('THURSDAY_WINDOW','RESOURCE_PLAN','');

execute dbms_scheduler.set_attribute('FRIDAY_WINDOW','RESOURCE_PLAN','');

 

我們觀察22點(diǎn)的alert信息,確實(shí)開始報(bào)錯(cuò):

Wed May 14 22:00:03 2014

Errors in file /oracle/ora11g/base/diag/rdbms/ora11g/ora11g/trace/ora11g_j003_4452.trc:

ORA-12012: error on auto execute of job "SYS"."ORA$AT_OS_OPT_SY_27"

ORA-29373: resource manager is not on

Wed May 14 22:00:03 2014

Errors in file /oracle/ora11g/base/diag/rdbms/ora11g/ora11g/trace/ora11g_j004_4454.trc:

ORA-12012: error on auto execute of job "SYS"."ORA$AT_SA_SPC_SY_28"

ORA-29373: resource manager is not on

Wed May 14 22:00:03 2014

Errors in file /oracle/ora11g/base/diag/rdbms/ora11g/ora11g/trace/ora11g_j005_4456.trc:

ORA-12012: error on auto execute of job "SYS"."ORA$AT_SQ_SQL_SW_29"

ORA-29373: resource manager is not on

Wed May 14 22:00:04 2014

XDB installed.

XDB initialized.

 

檢查DBRM進(jìn)程已經(jīng)存在:

[root@rhel6 ~]# ps -ef | grep dbrm

ora11g    4346     1  0 21:57 ?        00:00:00 ora_dbrm_ora11g

 

 

檢查后臺(tái)JOB執(zhí)行記錄視圖:

SQL> select CLIENT_NAME,WINDOW_NAME,JOB_NAME,JOB_STATUS,JOB_START_TIME from DBA_AUTOTASK_JOB_HISTORY where CLIENT_NAME='auto optimizer stats collection' order by JOB_START_TIME desc;

 

CLIENT_NAME                         WINDOW_NAME                    JOB_NAME                       JOB_STATUS                     JOB_START_TIME

----------------------------------- ------------------------------ ------------------------------ ------------------------------ ----------------------------------------

auto optimizer stats collection     WEDNESDAY_WINDOW               ORA$AT_OS_OPT_SY_27            FAILED                         14-MAY-14 10.00.03.233570 PM PRC

 

模擬過程與當(dāng)前環(huán)境產(chǎn)生效果一致

接下來我們進(jìn)行第二個(gè)模擬:

關(guān)閉數(shù)據(jù)庫,調(diào)整時(shí)間,去除隱含參數(shù),關(guān)閉windows 計(jì)劃

關(guān)閉數(shù)據(jù)庫:

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

 

調(diào)整時(shí)間:

[root@rhel6 ~]# date -s 2014/05/14

[root@rhel6 ~]# date -s 21:57:00

[root@rhel6 ~]# clock -w

[root@rhel6 ~]# date

Wed May 14 21:57:35 CST 2014

 

去除隱含參數(shù),啟動(dòng)至open:

隱含參數(shù)去除采用create pfile from spfile;

刪除spfile,編輯pfile文件,刪除隱含參數(shù),以pfile啟動(dòng)數(shù)據(jù)庫

 

設(shè)置resource manager plan:(由于前面已經(jīng)設(shè)置過,無需再設(shè)置)

 

我們觀察22點(diǎn)的alert信息,發(fā)現(xiàn)沒有報(bào)錯(cuò)

Tue May 13 22:00:03 2014

Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"

End automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"

Tue May 13 22:00:05 2014

XDB installed.

XDB initialized.

 

 

檢查 DBRM進(jìn)程:

[root@rhel6 ~]# ps -ef | grep dbrm

ora11g    3844     1  0 21:55 ?        00:00:00 ora_dbrm_ora11g

 

說明:此時(shí)resource manager由于只是關(guān)閉了resource manager plan計(jì)劃,沒有真正關(guān)閉resource manager 因此該進(jìn)程依舊存在。

 

檢查后臺(tái)JOB執(zhí)行視圖信息:

SQL> select CLIENT_NAME,WINDOW_NAME,JOB_NAME,JOB_STATUS,JOB_START_TIME from DBA_AUTOTASK_JOB_HISTORY where CLIENT_NAME='auto optimizer stats collection' order by JOB_START_TIME desc;

 

CLIENT_NAME                              WINDOW_NAME                    JOB_NAME                  JOB_STATUS           JOB_START_TIME

---------------------------------------- ------------------------------ ------------------------- -------------------- ----------------------------------------

auto optimizer stats collection          TUESDAY_WINDOW                 ORA$AT_OS_OPT_SY_24       SUCCEEDED            13-MAY-14 10.00.02.102741 PM PRC

 

說明在隱含參數(shù)除掉的情況下,JOB可以正常執(zhí)行,后臺(tái)沒有報(bào)錯(cuò)。

 

模擬隱含參數(shù)及resource manager plan均存在的情況

關(guān)閉數(shù)據(jù)庫,調(diào)整時(shí)間,添加隱含參數(shù),關(guān)閉windows 計(jì)劃

關(guān)閉數(shù)據(jù)庫:

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

調(diào)整時(shí)間:

[root@rhel6 ~]# date -s 2014/05/15

[root@rhel6 ~]# date -s 21:57:00

[root@rhel6 ~]# clock -w

[root@rhel6 ~]# date

Thu May 15 21:57:35 CST 2014

添加隱含參數(shù),啟動(dòng)至open:

SQL> alter system set "_resource_manager_always_off"=true scope=spfile;

SQL> ALTER SYSTEM SET "_resource_manager_always_on"=FALSE SCOPE=SPFILE;

SQL> startup force(測試環(huán)境,直接force啟動(dòng),生產(chǎn)環(huán)境勿如此操作)

設(shè)置resource manager plan:(由于前面已經(jīng)設(shè)置過,無需再設(shè)置)

 

我們觀察22點(diǎn)的alert信息,發(fā)現(xiàn)沒有報(bào)錯(cuò)

Thu May 15 22:00:03 2014

Begin automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"

Thu May 15 22:00:05 2014

XDB installed.

XDB initialized.

End automatic SQL Tuning Advisor run for special tuning task  "SYS_AUTO_SQL_TUNING_TASK"

 

檢查 DBRM進(jìn)程:

[root@rhel6 ~]# ps -ef | grep dbrm

root      4650  3647  0 21:58 pts/2    00:00:00 grep dbrm

 

說明:此時(shí)DBRM進(jìn)程消失

 

檢查后臺(tái)JOB執(zhí)行視圖信息:

SQL> select CLIENT_NAME,WINDOW_NAME,JOB_NAME,JOB_STATUS,JOB_START_TIME from DBA_AUTOTASK_JOB_HISTORY where CLIENT_NAME='auto optimizer stats collection' order by JOB_START_TIME desc;

 

CLIENT_NAME                         WINDOW_NAME                    JOB_NAME                       JOB_STATUS                     JOB_START_TIME

----------------------------------- ------------------------------ ------------------------------ ------------------------------ ----------------------------------------

auto optimizer stats collection     THURSDAY_WINDOW                ORA$AT_OS_OPT_SY_30            SUCCEEDED                      15-MAY-14 10.00.02.115232 PM PRC

 

說明在隱含參數(shù)兩個(gè)都添加的情況下,完全屏蔽resource manager的的情況下,JOB可以正常執(zhí)行,后臺(tái)沒有報(bào)錯(cuò)。

 

技術(shù)結(jié)論

以上測試分析結(jié)果證明,后臺(tái)報(bào)錯(cuò)JOB執(zhí)行失敗原因應(yīng)該是DBRM進(jìn)程依舊活動(dòng),而DBRM進(jìn)程是管理Resource Manager當(dāng)去除"_resource_manager_always_off"=true"_resource_manager_always_on"=FALSE

或者將兩個(gè)參數(shù)全部添加,均可避免該錯(cuò)誤,統(tǒng)計(jì)信息自動(dòng)收集也可以自動(dòng)執(zhí)行

在對(duì)以為問題進(jìn)行分析及確認(rèn)后,我們向ORACLE提交了相關(guān)的SR,最終,ORACLE對(duì)此,確認(rèn)相關(guān)的一個(gè)BUG,BUG號(hào)為:Bug 18748456 : AUTO TASK JOBS FAILED WITH ORA-29373 ERROR

 ------------------------------------------------------------------------------------
<版權(quán)所有,文章允許轉(zhuǎn)載,但必須以鏈接方式注明源地址,否則追究法律責(zé)任!>
原博客地址:http://blog.itpub.net/23732248/
原作者:應(yīng)以峰 (frank-ying)
-------------------------------------------------------------------------------------
向AI問一下細(xì)節(jié)

免責(zé)聲明:本站發(fā)布的內(nèi)容(圖片、視頻和文字)以原創(chuàng)、轉(zhuǎn)載和分享為主,文章觀點(diǎn)不代表本網(wǎng)站立場,如果涉及侵權(quán)請(qǐng)聯(lián)系站長郵箱:is@yisu.com進(jìn)行舉報(bào),并提供相關(guān)證據(jù),一經(jīng)查實(shí),將立刻刪除涉嫌侵權(quán)內(nèi)容。

AI