This article analyzes the Oracle cluster heartbeat and its parameters misscount, disktimeout, and reboottime.
I. OCSSD and CSS
OCSSD is a Linux or Unix process that manages and provides Cluster Synchronization Services (CSS). It runs as the Oracle user and provides node membership management; if the process fails, the node is rebooted. CSS provides two heartbeat mechanisms: a network heartbeat and a disk heartbeat. Each has a maximum allowed delay. For the network heartbeat it is called MC (Misscount); for the disk heartbeat it is called IOT (I/O Timeout).
Both parameters are measured in seconds. By default, Misscount < Disktimeout.
The two heartbeat mechanisms are described in turn below.
II. Network heartbeat
As the name suggests, the network heartbeat checks node state over the private network. If a hardware or software fault on the private network prevents the cluster nodes from communicating over it for a certain period, the result is split-brain. Because the storage in a cluster is shared, the failed node must then be isolated from the cluster to avoid data corruption. The network heartbeat works as follows:
Every second, a sending thread in cssd sends a network TCP heartbeat to itself and to all other nodes. The receiving thread of ocssd.bin receives the heartbeat.
If a network packet is dropped or has an error, the error correction mechanism in TCP retransmits the packet.
Oracle itself does not retransmit. In ocssd.log you will see a WARNING message about a missing heartbeat if a node does not receive a heartbeat from another node for 15 seconds (50% of misscount). Another warning is reported in ocssd.log if the same node is still missing after 22 seconds (75% of misscount), and another after 27 seconds (90% of misscount). When the heartbeat has been missing for 100% of misscount, that is 30 seconds, the node is evicted.
The maximum delay of this network heartbeat is called misscount. It can be queried and changed with the crsctl tool.
[grid@Linux-01 ~]$ crsctl get css misscount
CRS-4678: Successful get misscount 30 for Cluster Synchronization Services.
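The query result above means that if the interconnect between the cluster nodes is delayed by more than 30 seconds, Oracle considers that split-brain has occurred between the nodes and that the failed node must be evicted from the cluster.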
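The warning schedule quoted above can be derived from misscount alone. The sketch below is only an illustration, not Oracle code: the function names `heartbeat_warning_times` and `is_evicted` are hypothetical, and rounding down to whole seconds is an assumption made so that 75% of 30 matches the 22 seconds given in the quoted text.

```python
def heartbeat_warning_times(misscount):
    """Return (percent, seconds) pairs for the 50%, 75% and 90% warnings.

    Rounding down to whole seconds is an assumption of this sketch.
    """
    return [(pct, misscount * pct // 100) for pct in (50, 75, 90)]


def is_evicted(seconds_without_heartbeat, misscount):
    """A node is evicted once the missed time reaches 100% of misscount."""
    return seconds_without_heartbeat >= misscount


if __name__ == "__main__":
    for pct, seconds in heartbeat_warning_times(30):
        print(f"{pct}% of misscount: warning after {seconds}s")
    print("evicted at 30s:", is_evicted(30, 30))
```

With the default misscount of 30, the sketch reproduces the 15, 22 and 27 second warnings and the eviction at 30 seconds.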
How is the failed node identified? Oracle decides with a voting algorithm. The following example description of the algorithm is based on the book 大話Oracle RAC.
The nodes of a cluster use the heartbeat mechanism to report their "health status" to one another. Suppose that each report received from a node counts as one vote. In a three-node cluster running normally, every node holds 3 votes. If the heartbeat of node A fails while node A itself keeps running, the cluster splits into two smaller partitions: node A is one, and the remaining two nodes are the other.
One partition must then be removed to keep the cluster healthy. In this three-node cluster, after the heartbeat of A fails, B and C form a partition with 2 votes, while A has only 1 vote. According to the voting algorithm, the cluster formed by B and C takes control and A is evicted.
With only two nodes, the voting algorithm breaks down, because each node holds just 1 vote. A third device is then needed: the Quorum Device. The Quorum Device is usually a shared disk, also called the Quorum Disk, and it also counts as one vote. When the heartbeat between the two nodes fails, both nodes compete for the Quorum Disk vote at the same time, and the request that arrives first is served first.
The node that obtains the Quorum Disk first therefore holds 2 votes, and the other node is evicted.
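The vote counting described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Clusterware implementation: `surviving_partition` and its parameters are hypothetical names, each node contributes one vote to its own partition, and the race for the quorum disk is not modeled (the winner is simply passed in).

```python
def surviving_partition(partitions, quorum_disk_winner=None):
    """Pick the partition that keeps control of the cluster.

    partitions: list of sets of node names, one set per partition.
    quorum_disk_winner: the node that reached the quorum disk first,
        or None when no quorum disk is used (hypothetical input).
    Returns the winning set of nodes, or None if the vote is tied.
    """
    votes = {frozenset(p): len(p) for p in partitions}
    # The quorum disk counts as one extra vote for the partition that won it.
    if quorum_disk_winner is not None:
        for partition in votes:
            if quorum_disk_winner in partition:
                votes[partition] += 1
    best = max(votes.values())
    winners = [p for p, v in votes.items() if v == best]
    # A tie means the vote cannot decide, as with two nodes and no quorum disk.
    return set(winners[0]) if len(winners) == 1 else None


if __name__ == "__main__":
    print(surviving_partition([{"A"}, {"B", "C"}]))
    print(surviving_partition([{"A"}, {"B"}]))
    print(surviving_partition([{"A"}, {"B"}], quorum_disk_winner="B"))
```

The three calls correspond to the three cases in the text: B and C outvote A, two nodes tie without a quorum disk, and the quorum disk breaks the tie.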
Before 11gR2, a node that had been isolated was generally rebooted.
In 11gR2, Clusterware first tries to shut down all resources on the node and to clean up the failed components in the cluster, that is, to restart the failed components.
If that cleanup does not succeed, the node is rebooted to force the cleanup.
III. Disk heartbeat
A thread in ocssd.bin updates the voting disk every second.
If a node does not update the voting disks for 200 seconds, it is evicted.
However, ocssd.bin on the local node has logic to bring the node down if it gets I/O errors on a majority of the voting disks. Also, a CRS reconfiguration happens when misscount reaches 27 seconds, and the local node is rebooted. As a result, you rarely see an eviction due to voting disk failure on 10.2.0.4 (it is more common on 10.2.0.1), because ocssd.bin aborts the node before it gets evicted by another node if writing to the voting disk is the problem.
As described above, every node updates the voting disk once per second. The shared voting disk is used to check the disk heartbeat.
If the ocssd process takes longer than 200 seconds, the value of disktimeout, to update a voting disk, Oracle considers that voting disk offline and writes a voting-disk-offline record to the Clusterware alert log. If the number of voting disks that are offline for the current node is smaller than the number that are online, the node survives. If the number of offline voting disks is greater than or equal to the number of online voting disks, Clusterware considers the disk heartbeat to have failed; the failed node is evicted from the cluster and goes through the automatic repair process.
For example, suppose there are 3 voting disks and one of them goes offline for node A. The offline disks (1) are fewer than the online disks (2), so Clusterware writes an offline record to the alert log but takes no action. If 2 or more voting disks go offline for the node, the offline disks (2) outnumber the online disks (1), and node A is evicted from the cluster.
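The offline-versus-online comparison can be written down in a few lines. The sketch below only illustrates the rule as stated in the text; `disk_is_offline` and `node_survives` are hypothetical names, and treating exactly 200 seconds as still within the limit is an assumption of this sketch.

```python
def disk_is_offline(seconds_since_last_update, disktimeout=200):
    """A voting disk counts as offline once the update takes longer than disktimeout."""
    return seconds_since_last_update > disktimeout


def node_survives(offline_disks, total_disks):
    """The node survives only while offline disks are fewer than online disks."""
    online_disks = total_disks - offline_disks
    return offline_disks < online_disks


if __name__ == "__main__":
    # Seconds since node A last updated each of its 3 voting disks (made-up values).
    delays = [1, 250, 1]
    offline = sum(disk_is_offline(d) for d in delays)
    print("offline disks:", offline, "node survives:", node_survives(offline, len(delays)))
```

With 3 voting disks, one offline disk leaves the node in the cluster, while two offline disks lead to eviction, matching the example in the text.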
IV. The RebootTime parameter
The RebootTime parameter is also important. Its default value is 3 seconds.
Default 3 seconds: the amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted.
crsctl get css reboottime
V. Adjusting the heartbeat parameters
1) Procedure for versions 10.2.0.2 to 11.1.0.7
a) Shut down CRS on all but one node. For exact steps use note 309542.1
b) Execute crsctl as root to modify the misscount:
$CRS_HOME/bin/crsctl set css misscount <n> #### where <n> is the maximum private network latency in seconds
$CRS_HOME/bin/crsctl set css reboottime <r> [-force] #### (<r> is seconds)
$CRS_HOME/bin/crsctl set css disktimeout <d> [-force] #### (<d> is seconds)
c) Reboot the node where adjustment was made
d) Start all other nodes that were shut down in step a)
e) Execute crsctl as root to confirm the change:
$CRS_HOME/bin/crsctl get css misscount
$CRS_HOME/bin/crsctl get css reboottime
$CRS_HOME/bin/crsctl get css disktimeout
2) Procedure for 11gR2
With 11gR2, these settings can be changed online without taking any node down:
a) Execute crsctl as root to modify the misscount:
$CRS_HOME/bin/crsctl set css misscount <n> #### where <n> is the maximum private network latency in seconds
$CRS_HOME/bin/crsctl set css reboottime <r> [-force] #### (<r> is seconds)
$CRS_HOME/bin/crsctl set css disktimeout <d> [-force] #### (<d> is seconds)
b) Execute crsctl as root to confirm the change:
$CRS_HOME/bin/crsctl get css misscount
$CRS_HOME/bin/crsctl get css reboottime
$CRS_HOME/bin/crsctl get css disktimeout
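To confirm the values from a script, the CRS-4678 message shown earlier for misscount can be parsed. The sketch below is a hedged example: `parse_css_get` and `get_css_parameter` are hypothetical helper names, and it assumes that reboottime and disktimeout are reported in the same CRS-4678 format as misscount, which the article only shows for misscount. `get_css_parameter` needs a real cluster with crsctl on the PATH (or a full path passed in).

```python
import re
import subprocess

# Pattern taken from the sample output shown for misscount in this article.
CSS_GET_PATTERN = re.compile(
    r"CRS-4678: Successful get (\w+) (\d+) for Cluster Synchronization Services\."
)


def parse_css_get(output):
    """Extract (parameter name, value in seconds) from a CRS-4678 message."""
    match = CSS_GET_PATTERN.search(output)
    if match is None:
        raise ValueError(f"unexpected crsctl output: {output!r}")
    return match.group(1), int(match.group(2))


def get_css_parameter(name, crsctl="crsctl"):
    """Run `crsctl get css <name>` and return its value (requires a cluster)."""
    result = subprocess.run(
        [crsctl, "get", "css", name],
        capture_output=True, text=True, check=True,
    )
    return parse_css_get(result.stdout)[1]


if __name__ == "__main__":
    sample = "CRS-4678: Successful get misscount 30 for Cluster Synchronization Services."
    print(parse_css_get(sample))
```

On a cluster node, calling `get_css_parameter` for misscount, reboottime and disktimeout before and after a change gives a quick check that the new values took effect.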