How to Understand the lmd Process in Oracle RAC

Published: 2021-12-22 09:15:59  Source: Yisu Cloud (億速云)  Views: 401  Author: iii  Category: Relational Databases

This article looks at how to understand the lmd process in Oracle RAC, a topic that often causes confusion in day-to-day operations. It walks through a series of simple tests and summarizes what they show.

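Conclusions

1. The test environment is Oracle 10.2.0.1 RAC.
2. If the lmd process is killed abnormally, PMON terminates the RAC instance it belongs to (ORA-00482), a SYSTEMSTATE DUMP file is written before the instance shuts down, and the instance then restarts.
3. According to the official manual, lmon is the process that monitors lmd. In these tests, however, lmd was never restarted on its own: killing either lmd or lmon brought the whole instance down, and both processes came back only when the instance restarted.
4. lmd provides the Global Enqueue Service (GES); put simply, it manages resource requests across the RAC instances. That makes it critical: if LMD stalls, DML operations in the database hang,
   which in turn leads to IPC communication timeouts between the RAC nodes.
5. The IPC timeouts produce a corresponding LMD trace file.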

Tests

-- What lmd does
The lmd process provides the Global Enqueue Service (GES).
On each RAC instance it handles the resource requests that arrive from remote RAC nodes. It is a daemon process, which is usually described as being protected by a monitoring process that restarts it if it disappears; the tests below check what actually happens.


-- Killing lmd abnormally forces the RAC instance to shut down; a systemstate dump is written for analysis before the instance closes
[oracle@jingfa1 ~]$ ps -ef|grep lmd
oracle    4774     1  0 Nov09 ?        00:00:31 asm_lmd0_+ASM1
oracle   11220     1  0 02:13 ?        00:00:15 ora_lmd0_jingfa1
oracle   30706 30376  0 05:19 pts/3    00:00:00 grep lmd
[oracle@jingfa1 ~]$ kill -9 11220

Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_11212.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
PMON: terminating instance due to error 482
Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lms0_11222.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/jingfa/bdump/jingfa1_diag_11214.trc
Tue Nov 10 05:20:03 2015
Trace dumping is performing id=[cdmp_20151110052003]
Tue Nov 10 05:20:08 2015
Instance terminated by PMON, pid = 11212
-- The instance then restarts automatically
Tue Nov 10 05:21:05 2015
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
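
The alert log excerpt above can be turned into a rough outage figure. The following is a minimal sketch, assuming the 10g alert log layout shown here (a timestamp line followed by its message lines); `parse_events` and `restart_gap_seconds` are hypothetical helper names, not part of any Oracle tooling.

```python
from datetime import datetime

# Timestamp layout used by the alert log above, e.g. "Tue Nov 10 05:20:08 2015".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_events(alert_log):
    """Pair every message line with the most recent timestamp line before it."""
    events = []
    current_ts = None
    for raw in alert_log.splitlines():
        line = raw.strip()
        if not line:
            continue
        try:
            current_ts = datetime.strptime(line, TS_FORMAT)
        except ValueError:
            # Not a timestamp, so it is a message belonging to current_ts.
            if current_ts is not None:
                events.append((current_ts, line))
    return events


def restart_gap_seconds(alert_log):
    """Seconds between 'Instance terminated by PMON' and the next startup."""
    events = parse_events(alert_log)
    down = next(ts for ts, msg in events
                if msg.startswith("Instance terminated by PMON"))
    up = next(ts for ts, msg in events
              if ts >= down and msg.startswith("Starting ORACLE instance"))
    return (up - down).total_seconds()


SAMPLE = """\
Tue Nov 10 05:20:03 2015
PMON: terminating instance due to error 482
Tue Nov 10 05:20:08 2015
Instance terminated by PMON, pid = 11212
Tue Nov 10 05:21:05 2015
Starting ORACLE instance (normal)
"""

print(restart_gap_seconds(SAMPLE))  # 57.0
```

For this test the instance was down for a little under a minute between termination and the start of the next startup.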

After the restart, lmd is running again under a new PID:
[oracle@jingfa1 ~]$ ps -ef|grep lmd
oracle    3474 30376  0 05:23 pts/3    00:00:00 grep lmd
oracle    4774     1  0 Nov09 ?        00:00:31 asm_lmd0_+ASM1
oracle   32703     1  0 05:21 ?        00:00:00 ora_lmd0_jingfa1
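
The before/after comparison of the `ps -ef` listings can also be done mechanically. This is a small sketch, assuming the standard eight-column `ps -ef` layout seen above; `background_pids` and `restarted` are hypothetical names introduced for illustration.

```python
def background_pids(ps_output):
    """Map Oracle/ASM background process names to their PIDs."""
    pids = {}
    for line in ps_output.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        pid, command = fields[1], fields[7]
        # Keep only instance background processes; this drops the grep itself.
        if command.startswith(("ora_", "asm_")):
            pids[command] = int(pid)
    return pids


def restarted(before, after):
    """Processes present in both listings whose PID has changed."""
    return {name: (before[name], after[name])
            for name in before
            if name in after and before[name] != after[name]}


BEFORE = """\
oracle    4774     1  0 Nov09 ?        00:00:31 asm_lmd0_+ASM1
oracle   11220     1  0 02:13 ?        00:00:15 ora_lmd0_jingfa1
oracle   30706 30376  0 05:19 pts/3    00:00:00 grep lmd
"""

AFTER = """\
oracle    3474 30376  0 05:23 pts/3    00:00:00 grep lmd
oracle    4774     1  0 Nov09 ?        00:00:31 asm_lmd0_+ASM1
oracle   32703     1  0 05:21 ?        00:00:00 ora_lmd0_jingfa1
"""

print(restarted(background_pids(BEFORE), background_pids(AFTER)))
# {'ora_lmd0_jingfa1': (11220, 32703)}
```

Only the database instance's lmd changed PID; the ASM instance's lmd (4774) was untouched by the test.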

As noted above, lmd's health is said to be looked after by a monitoring process; the official manual identifies it as lmon. LMON manages the cross-instance (global) enqueues and resources of each RAC instance and handles recovery of global enqueue locks.

[oracle@jingfa1 bdump]$ ps -ef|grep lmon
oracle    4772     1  0 Nov09 ?        00:00:29 asm_lmon_+ASM1
oracle   19857 30376  0 05:34 pts/3    00:00:00 grep lmon
oracle   32701     1  0 05:21 ?        00:00:02 ora_lmon_jingfa1
[oracle@jingfa1 bdump]$ kill -9 32701
After LMON is killed, the instance's LMD process is gone as well:
[oracle@jingfa1 bdump]$ ps -ef|grep lmd
oracle    4774     1  0 Nov09 ?        00:00:32 asm_lmd0_+ASM1
oracle   21171 30376  0 05:34 pts/3    00:00:00 grep lmd

Killing lmon abnormally forces the database instance to restart:
Tue Nov 10 05:34:18 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_32695.trc:
ORA-00481: LMON process terminated with error
Tue Nov 10 05:34:18 2015
PMON: terminating instance due to error 481
Tue Nov 10 05:34:18 2015
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/jingfa/bdump/jingfa1_diag_32697.trc
Tue Nov 10 05:34:18 2015
Trace dumping is performing id=[cdmp_20151110053418]
Tue Nov 10 05:34:23 2015
Instance terminated by PMON, pid = 32695
Tue Nov 10 05:35:19 2015
Starting ORACLE instance (normal)
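
Both kill tests leave the same signature in the alert log: an ORA-0048x line naming the dead process, followed by PMON citing that error number. The sketch below, which assumes only the message formats visible in the excerpts above, pulls out which background process triggered the termination; `fatal_process` is a hypothetical helper name.

```python
import re

# "ORA-00481: LMON process terminated with error" or "ORA-00482: LMD* process ..."
ORA_LINE = re.compile(r"^ORA-(\d{5}): (\S+) process terminated with error")
# "PMON: terminating instance due to error 481"
PMON_LINE = re.compile(r"^PMON: terminating instance due to error (\d+)")


def fatal_process(alert_log):
    """Return (error_number, process_name) for the first PMON termination."""
    lines = [raw.strip() for raw in alert_log.splitlines()]
    # First collect the process name attached to each ORA error number.
    names = {}
    for line in lines:
        match = ORA_LINE.match(line)
        if match:
            names[int(match.group(1))] = match.group(2).rstrip("*")
    # Then look up the error number PMON gave as its reason.
    for line in lines:
        match = PMON_LINE.match(line)
        if match:
            code = int(match.group(1))
            return code, names.get(code)
    return None


LMON_KILL = """\
Tue Nov 10 05:34:18 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_32695.trc:
ORA-00481: LMON process terminated with error
Tue Nov 10 05:34:18 2015
PMON: terminating instance due to error 481
"""

LMD_KILL = """\
Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_11212.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
PMON: terminating instance due to error 482
"""

print(fatal_process(LMON_KILL))  # (481, 'LMON')
print(fatal_process(LMD_KILL))   # (482, 'LMD')
```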

Both lmon and lmd come back automatically with the restarted instance:
[oracle@jingfa1 bdump]$ ps -ef|grep lmon
oracle    4772     1  0 Nov09 ?        00:00:30 asm_lmon_+ASM1
oracle   21820     1  0 05:35 ?        00:00:01 ora_lmon_jingfa1
oracle   27926 30376  0 05:39 pts/3    00:00:00 grep lmon
[oracle@jingfa1 bdump]$ ps -ef|grep lmd
oracle    4774     1  0 Nov09 ?        00:00:33 asm_lmd0_+ASM1
oracle   21822     1  0 05:35 ?        00:00:00 ora_lmd0_jingfa1
oracle   28028 30376  0 05:39 pts/3    00:00:00 grep lmd

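This suggests that some mechanism at the operating-system level restarts lmon and lmd after they are killed. What is that mechanism?
An inspection of the OS-level processes, mainly the scripts under /etc/init.d, shows that lmon and lmd belong to the Oracle instance rather than to the cluster layer, and no dedicated process was found controlling them. What the alert logs do show is that PMON terminates the whole instance and that both processes return only when the instance starts again.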

Taking a different approach: which parameters relate to the lmd process, and what do they mean?

NAME_1                                             VALUE_1                                            DESC1
-------------------------------------------------- -------------------------------------------------- --------------------------------------------------
_lm_lmd_waittime                                   8                                                  default wait time for lmd in centiseconds
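
The description gives the unit as centiseconds, so the value 8 is a short interval. A one-line conversion makes the scale explicit; `centiseconds_to_ms` is a hypothetical helper used only for this illustration.

```python
def centiseconds_to_ms(centiseconds):
    """Convert centiseconds (1/100 s) to milliseconds (1/1000 s)."""
    return centiseconds * 10


# _lm_lmd_waittime = 8 in the listing above
print(centiseconds_to_ms(8))  # 80
```

A default wait of 80 ms is far shorter than the several minutes that pass before the IPC send timeouts later in this test, which fits the observation that this parameter is not what governs lock or timeout detection.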

---node1
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and spid=21822;

ADDR             PROGRAM                                          USERNAME               PID SPID
---------------- ------------------------------------------------ --------------- ---------- ------------
0000000083A585C8 oracle@jingfa1 (LMD0)                            oracle                   6 21822

--node2
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and spid=668;


ADDR             PROGRAM                                          USERNAME               PID SPID
---------------- ------------------------------------------------ --------------- ---------- ------------
0000000083A585C8 oracle@jingfa2 (LMD0)                            oracle                   6 668

--node2
SQL> conn tbs_zxy/system
Connected.
SQL> update t_lock set a=11 where a=1;

1 row updated.

--node1
SQL> update t_lock set a=1111 where a=1;
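-- hangs
The session on node 1 hangs simply because node 2 still holds an uncommitted lock on the same row. So the parameter above has no direct bearing on lock detection, although lmd itself is involved in global locking.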
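
The blocking behaviour itself can be pictured with an ordinary mutex. The sketch below is only an analogy in plain Python, not Oracle's enqueue mechanism: a `threading.Lock` stands in for the row lock on `t_lock`, and `try_update` and `demo` are hypothetical names.

```python
import threading

# Stands in for the row lock on the t_lock row where a=1 (analogy only).
row_lock = threading.Lock()


def try_update(timeout):
    """Return True if the 'update' obtained the row lock within the timeout."""
    return row_lock.acquire(timeout=timeout)


def demo():
    # node 2 updates the row and does not commit: it keeps the lock.
    assert row_lock.acquire(blocking=False)
    # node 1 tries to update the same row and cannot get the lock in time.
    blocked = not try_update(0.1)
    # node 2 commits, which releases the lock.
    row_lock.release()
    # node 1's update can now proceed.
    succeeded = try_update(0.1)
    if succeeded:
        row_lock.release()
    return blocked, succeeded


print(demo())  # (True, True)
```

In the real test the waiting session has no timeout, which is why it appears to hang until node 2 commits or rolls back.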

Another approach: suspend lmd with oradebug, then create the same global lock contention and see what happens.

---node1
Suspend lmd
SQL> oradebug setospid 21822
Oracle pid: 6, Unix process pid: 21822, image: oracle@jingfa1 (LMD0)
SQL> oradebug suspend
Statement processed.

Tue Nov 10 06:03:44 2015
Unix process pid: 21822, image: oracle@jingfa1 (LMD0) flash frozen

---node2
Suspend lmd
SQL> oradebug setospid 668
Oracle pid: 6, Unix process pid: 668, image: oracle@jingfa2 (LMD0)
SQL> oradebug suspend
Statement processed.

Tue Nov 10 06:06:08 2015
Unix process pid: 668, image: oracle@jingfa2 (LMD0) flash frozen

---node2
SQL> update t_lock set a=11 where a=1;

1 row updated.

--node1
SQL> update t_lock set a=1111 where a=1;
-- hangs

Now watch the alert logs on node 1 and node 2.

--node2
Tue Nov 10 06:09:42 2015
IPC Send timeout detected.Sender: ospid 682  -- the sending process is SMON
Receiver: inst 1 binc 432326879 ospid 21822  -- the receiver is node 1's LMD process
Tue Nov 10 06:09:45 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 12  -- same as above: the sender is the SMON process
Tue Nov 10 06:09:45 2015
Communications reconfiguration: instance_number 1
Tue Nov 10 06:09:45 2015
IPC Send timeout detected.Sender: ospid 696  -- the sending process is MMON
Receiver: inst 1 binc 432326879 ospid 21822  -- the receiver is again node 1's lmd process
Tue Nov 10 06:09:48 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 15  -- same as above: the sender is the MMON process (Oracle pid 15)

--node1
Tue Nov 10 06:09:23 2015
IPC Send timeout detected. Receiver ospid 21822  -- the receiver is the LMD process
Tue Nov 10 06:09:23 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lmd0_21822.trc:  -- an LMD trace file is written
IPC Send timeout detected. Receiver ospid 21822  -- same as above
Tue Nov 10 06:09:27 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lmd0_21822.trc:

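The logs confirm that lmd is involved in acquiring global locks: when the LMD process stalls, communication between the two RAC nodes breaks down.
The sender PIDs reported in node 2's alert log can be checked against the running processes: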

[oracle@jingfa2 bdump]$ ps -ef|grep 682
oracle     682     1  0 02:14 ?        00:00:01 ora_smon_jingfa2
oracle    7157 13004  0 06:15 pts/1    00:00:00 grep 682

SQL> select spid,pid,program from v$process where spid=696;

SPID                PID PROGRAM
------------ ---------- ------------------------------------------------
696                  15 oracle@jingfa2 (MMON)
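
The manual matching of sender and receiver PIDs above can be sketched as a small parser. It assumes the two-line "Sender ... / Receiver ..." layout from node 2's alert log; `ipc_timeouts` is a hypothetical helper name, and the `PROCESS_NAMES` table is filled in by hand from the `ps` and `v$process` output in this article.

```python
import re

SENDER = re.compile(r"IPC Send timeout detected\.\s*Sender: ospid (\d+)")
RECEIVER = re.compile(r"Receiver: inst (\d+) binc \d+ ospid (\d+)")


def ipc_timeouts(alert_log):
    """Return (sender_ospid, receiver_instance, receiver_ospid) tuples."""
    pairs = []
    sender = None
    for line in alert_log.splitlines():
        match = SENDER.search(line)
        if match:
            sender = int(match.group(1))
            continue
        match = RECEIVER.search(line)
        if match and sender is not None:
            pairs.append((sender, int(match.group(1)), int(match.group(2))))
            sender = None
    return pairs


NODE2_LOG = """\
Tue Nov 10 06:09:42 2015
IPC Send timeout detected.Sender: ospid 682
Receiver: inst 1 binc 432326879 ospid 21822
Tue Nov 10 06:09:45 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 12
Tue Nov 10 06:09:45 2015
IPC Send timeout detected.Sender: ospid 696
Receiver: inst 1 binc 432326879 ospid 21822
"""

# Taken by hand from the ps and v$process output shown in this article.
PROCESS_NAMES = {682: "SMON", 696: "MMON", 21822: "LMD0"}

for sender, inst, receiver in ipc_timeouts(NODE2_LOG):
    print(PROCESS_NAMES[sender], "->", "inst", inst, PROCESS_NAMES[receiver])
# SMON -> inst 1 LMD0
# MMON -> inst 1 LMD0
```

Every timeout in the excerpt has the suspended LMD0 on instance 1 as its receiver, which matches the conclusion drawn from reading the log by eye.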

That concludes this look at the lmd process in Oracle RAC. Pairing the theory with hands-on practice is the best way to learn it, so try reproducing these tests in a test environment of your own.
