This article describes what to do when Ceph reports an unfound object in a placement group (PG).
One node in the cluster failed, and at the same time a disk failed on another node.
Checking the Ceph cluster status shows that PG 4.210 has lost one object:
# ceph health detail
HEALTH_WARN 481/5647596 objects misplaced (0.009%); 1/1882532 objects unfound (0.000%); Degraded data redundancy: 965/5647596 objects degraded (0.017%), 1 pg degraded, 1 pg undersized
OBJECT_MISPLACED 481/5647596 objects misplaced (0.009%)
OBJECT_UNFOUND 1/1882532 objects unfound (0.000%)
    pg 4.210 has 1 unfound objects
PG_DEGRADED Degraded data redundancy: 965/5647596 objects degraded (0.017%), 1 pg degraded, 1 pg undersized
    pg 4.210 is stuck undersized for 38159.843116, current state active+recovery_wait+undersized+degraded+remapped, last acting [2]
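On a cluster with many PGs it can help to pull the affected PG IDs out of the health report automatically. The following is a minimal sketch, not part of the original procedure: `unfound_pgs` is a hypothetical helper name, and it assumes the health text contains lines of the form `pg <id> has <n> unfound objects`, as in the output above.

```shell
# Print the IDs of PGs that `ceph health detail` reports as having
# unfound objects. Reads the health text on stdin and matches lines like:
#   pg 4.210 has 1 unfound objects
# (hypothetical helper; the line format is assumed from the output above)
unfound_pgs() {
    awk '$1 == "pg" && $3 == "has" && $5 == "unfound" { print $2 }'
}

# Typical use on a live cluster (not executed here):
#   ceph health detail | unfound_pgs
```

Because the helper only reads stdin, it can be tried against a saved copy of the health output before running it on a production cluster.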
Looking at PG 4.210, we can see that it now has only one replica:
# ceph pg dump_json pools |grep 4.210
dumped all
4.210 482 1 965 481 1 2013720576 3461 3461 active+recovery_wait+undersized+degraded+remapped 2019-07-10 09:34:53.693724 9027'1835435 9027:1937140 [6,17,20] 6 [2] 2 6368'1830618 2019-07-07 01:36:16.289885 6368'1830618 2019-07-07 01:36:16.289885 2
# ceph pg map 4.210
osdmap e9181 pg 4.210 (4.210) -> up [26,20,2] acting [2]
Two replicas are gone, and worst of all, the primary replica is one of them.
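The pool's min_size is 2 (the default for a pool with size 3), and PG 4.210 has only one replica left, so the vms pool that contains 4.210 cannot work normally: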
# ceph osd pool stats vms
pool vms id 4
  965/1478433 objects degraded (0.065%)
  481/1478433 objects misplaced (0.033%)
  1/492811 objects unfound (0.000%)
  client io 680 B/s rd, 399 kB/s wr, 0 op/s rd, 25 op/s wr
# ceph osd pool ls detail|grep vms
pool 4 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 10312 lfor 0/874 flags hashpspool stripe_width 0 application rbd
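The reasoning here is a comparison of two numbers: how many OSDs are in the PG's acting set, and the pool's min_size. A rough sketch of that check follows. The helper names `acting_count` and `below_min_size` are hypothetical, and the parsing assumes the single-line `ceph pg map` output format shown earlier in this article.

```shell
# Count the OSDs in the acting set from `ceph pg map <pgid>` output, e.g.
#   osdmap e9181 pg 4.210 (4.210) -> up [26,20,2] acting [2]
# Reads that line on stdin and prints the number of acting OSDs.
# (hypothetical helper; output format assumed from this article)
acting_count() {
    sed -n 's/.*acting \[\([0-9,]*\)\].*/\1/p' | awk -F, '{ print NF }'
}

# Succeed (exit 0) when the acting set is smaller than min_size.
#   $1 = number of acting OSDs, $2 = pool min_size
below_min_size() {
    [ "$1" -lt "$2" ]
}

# Typical use on a live cluster (not executed here):
#   n=$(ceph pg map 4.210 | acting_count)
#   below_min_size "$n" 2 && echo "pg 4.210 is below min_size"
```

With the values from this incident (one acting OSD, min_size 2) the check succeeds, which matches the stalled I/O observed on the pool.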
This directly affected some of the virtual machines: they hung and stopped responding to commands.
To restore service, first temporarily lower min_size of the vms pool to 1:
# ceph osd pool set vms min_size 1
set pool 4 min_size to 1
Query PG 4.210:
# ceph pg 4.210 query
"recovery_state": [
    {
        "name": "Started/Primary/Active",
        "enter_time": "2019-07-09 23:04:31.718033",
        "might_have_unfound": [
            { "osd": "4", "status": "already probed" },
            { "osd": "6", "status": "already probed" },
            { "osd": "15", "status": "already probed" },
            { "osd": "17", "status": "already probed" },
            { "osd": "20", "status": "already probed" },
            { "osd": "22", "status": "osd is down" },
            { "osd": "23", "status": "already probed" },
            { "osd": "26", "status": "osd is down" }
        ]
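Taken literally, this is the recovery state of PG 4.210: it has already probed osd 4, 6, 15, 17, 20 and 23, while osd 22 and 26 are down. In my case, osd 22 and 26 had already been removed from the cluster.
According to the official documentation, an OSD listed under might_have_unfound can be in one of four states:
already probed
querying
OSD is down
not queried (yet)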
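There are two ways to resolve an unfound object: revert it to a previous version, or delete it outright.
# ceph pg 4.210 mark_unfound_lost revert
Error EINVAL: pg has 1 unfound objects but we haven't probed all sources, not marking lost
# ceph pg 4.210 mark_unfound_lost delete
Error EINVAL: pg has 1 unfound objects but we haven't probed all sources, not marking lost
Both commands fail: the PG has not yet probed every possible source for the unfound object, so Ceph refuses to mark it lost. That means it can be neither reverted nor deleted for now.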
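To see which sources are still holding things up, the might_have_unfound list can be filtered for entries that are not yet "already probed". The snippet below is a text-based sketch with a hypothetical helper name, `unprobed_osds`. It assumes each entry lists "osd" immediately followed by "status", as in the query output in this article; a real JSON parser would be more robust if one is available.

```shell
# List might_have_unfound entries whose status is not "already probed".
# Reads `ceph pg <pgid> query` output on stdin and prints one
# "<osd-id> <status>" line per source that has not been probed.
# (hypothetical helper; relies on "osd" being directly followed by "status")
unprobed_osds() {
    tr -d '\n' |
    grep -o '"osd":[[:space:]]*"[0-9]*",[[:space:]]*"status":[[:space:]]*"[^"]*"' |
    sed 's/"osd":[[:space:]]*"\([0-9]*\)",[[:space:]]*"status":[[:space:]]*"\([^"]*\)"/\1 \2/' |
    awk '$0 !~ / already probed$/'
}

# Typical use on a live cluster (not executed here):
#   ceph pg 4.210 query | unprobed_osds
```

An empty result suggests that every listed source has been probed, which is the condition the error message above is waiting for.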
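My guess was that the down osd 22 and 26 were the sources that had not been probed. The failed node had just finished being reinstalled, so I added its OSDs back to the cluster.
Removing and adding OSDs is not covered here.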
After the OSDs were added, query PG 4.210 again:
"recovery_state": [
    {
        "name": "Started/Primary/Active",
        "enter_time": "2019-07-15 15:24:32.277667",
        "might_have_unfound": [
            { "osd": "4", "status": "already probed" },
            { "osd": "6", "status": "already probed" },
            { "osd": "15", "status": "already probed" },
            { "osd": "17", "status": "already probed" },
            { "osd": "20", "status": "already probed" },
            { "osd": "22", "status": "already probed" },
            { "osd": "23", "status": "already probed" },
            { "osd": "24", "status": "already probed" },
            { "osd": "26", "status": "already probed" }
        ],
        "recovery_progress": {
            "backfill_targets": [
                "20",
                "26"
            ],
All sources have now been probed, so run the revert command:
# ceph pg 4.210 mark_unfound_lost revert
pg has 1 objects unfound and apparently lost marking
Check the cluster status:
# ceph health detail
HEALTH_OK
Restore min_size of the vms pool to 2:
# ceph osd pool set vms min_size 2
set pool 4 min_size to 2