k8s in Practice (14): Pod Eviction and Migration, and Node Maintenance

Published: 2020-08-06 03:04:12  Source: web  Views: 5249  Author: loong576  Category: System Operations

Environment:

Hostname  OS version       IP            Docker version  Kubelet version  Spec  Notes
master    Centos 7.6.1810  172.27.9.131  Docker 18.09.6  V1.14.2          2C2G  master host
node01    Centos 7.6.1810  172.27.9.135  Docker 18.09.6  V1.14.2          2C2G  worker node
node02    Centos 7.6.1810  172.27.9.136  Docker 18.09.6  V1.14.2          2C2G  worker node

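For k8s cluster deployment, see: Deploying a k8s (v1.14.2) cluster on Centos 7.6

For k8s learning material, see: Basic concepts, kubectl commands, and shared resources

For emptyDir, see: Volumes and Persistent Storage

For highly available k8s cluster deployment, see: Deploying a k8s v1.16.4 high-availability cluster on Centos 7.6 (active/standby mode)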

I. Background

When a node needs work such as patching or an operating system upgrade, it has to be taken offline for maintenance, which means its pods must be evicted and migrated. This article walks through the whole node maintenance process.

II. Introduction to pdb

  • pdb is short for poddisruptionbudgets. It protects an application against voluntary (planned) evictions.
  • Without a pdb: during node maintenance, if several pods of one service run on the node, taking the node down can interrupt or degrade the service. For example, suppose a service has 5 pods and needs at least 3 to maintain service quality, otherwise responses become slow. If 4 of those pods run on node01 and node01 is shut down for maintenance, only 1 pod remains to serve traffic while the other 4 are being migrated, and the service cannot respond normally.
  • With a pdb: the application is guaranteed to keep at least a given number of pods running during node maintenance, which preserves service quality.
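The arithmetic behind a budget can be sketched in a few lines of shell. This is only an illustration: the function name allowed_disruptions is made up for this example, and it assumes the simple rule that the number of evictions permitted at a given moment is the number of healthy pods minus minAvailable, never less than zero.

```shell
#!/usr/bin/env bash
# allowed_disruptions HEALTHY MIN_AVAILABLE
# Hypothetical helper (not a kubectl feature): prints how many pods could be
# evicted voluntarily right now, assuming allowed = max(0, healthy - minAvailable).
allowed_disruptions() {
  local healthy=$1 min_available=$2
  local allowed=$(( healthy - min_available ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_disruptions 5 3    # the 5-pod example with a minimum of 3: prints 2
allowed_disruptions 10 9   # 10 pods with a minimum of 9: prints 1
```

Under this rule, a drain of node01 in the 5-pod example could evict at most 2 pods at a time and would have to wait for replacements to become healthy before evicting the rest.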

III. Preparation

1. Create the pods

[root@master ~]# more nginx-master.yml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-master
spec:
  replicas: 10 
  template:
    metadata:
      labels:
        app: nginx
    spec:
      restartPolicy: Always
      containers:
      - name: nginx
        image: nginx:latest
[root@master ~]# kubectl apply -f nginx-master.yml 
deployment.extensions/nginx-master created
[root@master ~]# kubectl get po -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
nginx-master-9d4cf4f77-47vfj   1/1     Running   0          28s   10.244.0.129   master   <none>           <none>
nginx-master-9d4cf4f77-69jn6   1/1     Running   0          28s   10.244.2.206   node02   <none>           <none>
nginx-master-9d4cf4f77-6drhg   1/1     Running   0          28s   10.244.1.218   node01   <none>           <none>
nginx-master-9d4cf4f77-b7zfd   1/1     Running   0          28s   10.244.1.219   node01   <none>           <none>
nginx-master-9d4cf4f77-fxsjd   1/1     Running   0          28s   10.244.2.204   node02   <none>           <none>
nginx-master-9d4cf4f77-ktnvk   1/1     Running   0          28s   10.244.0.128   master   <none>           <none>
nginx-master-9d4cf4f77-mzrx7   1/1     Running   0          28s   10.244.1.217   node01   <none>           <none>
nginx-master-9d4cf4f77-pcznk   1/1     Running   0          28s   10.244.2.203   node02   <none>           <none>
nginx-master-9d4cf4f77-px98b   1/1     Running   0          28s   10.244.2.205   node02   <none>           <none>
nginx-master-9d4cf4f77-wtcwt   1/1     Running   0          28s   10.244.1.220   node01   <none>           <none>

Create the pods from the latest nginx image. The deployment is named nginx-master and has 10 replicas. The 10 pods are spread across three hosts: node01, node02, and master.

2. Create the pdb

[root@master ~]# more pdb-nginx.yaml 
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: pdb-nginx
spec:
  minAvailable: 9
  selector:
    matchLabels:
      app: nginx
[root@master ~]# kubectl apply -f pdb-nginx.yaml 
poddisruptionbudget.policy/pdb-nginx created
[root@master ~]# kubectl get pdb
NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
pdb-nginx   9               N/A               1                     8s

Create the pdb pdb-nginx. Its label selector is app: nginx, the same as the deployment's, and minAvailable: 9 means that at least 9 nginx pods must stay available.
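The manifest above uses policy/v1beta1, which matches the v1.14.2 cluster used in this article. On newer clusters (policy/v1 is available from Kubernetes 1.21, and policy/v1beta1 PodDisruptionBudget was removed in 1.25), the same budget would be written against policy/v1. The sketch below only writes such a file; the helper name write_pdb_v1 and the output path are assumptions for illustration, not part of the original setup.

```shell
#!/usr/bin/env bash
# write_pdb_v1 FILE
# Hypothetical helper: writes the same budget as pdb-nginx.yaml, but using the
# policy/v1 API for clusters that no longer serve policy/v1beta1.
write_pdb_v1() {
  local file=$1
  cat > "$file" <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pdb-nginx
spec:
  minAvailable: 9
  selector:
    matchLabels:
      app: nginx
EOF
}

# Example usage (the path is an assumption):
#   write_pdb_v1 pdb-nginx-v1.yaml && kubectl apply -f pdb-nginx-v1.yaml
```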

IV. Node Maintenance

This article uses maintenance of node02 as the example.

1. Mark the node as unschedulable

[root@master ~]# kubectl cordon node02
node/node02 cordoned
[root@master ~]# kubectl get node
NAME     STATUS                     ROLES    AGE    VERSION
master   Ready                      master   184d   v1.14.2
node01   Ready                      <none>   183d   v1.14.2
node02   Ready,SchedulingDisabled   <none>   182d   v1.14.2
[root@master ~]# kubectl get po -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
nginx-master-9d4cf4f77-47vfj   1/1     Running   0          30m   10.244.0.129   master   <none>           <none>
nginx-master-9d4cf4f77-69jn6   1/1     Running   0          30m   10.244.2.206   node02   <none>           <none>
nginx-master-9d4cf4f77-6drhg   1/1     Running   0          30m   10.244.1.218   node01   <none>           <none>
nginx-master-9d4cf4f77-b7zfd   1/1     Running   0          30m   10.244.1.219   node01   <none>           <none>
nginx-master-9d4cf4f77-fxsjd   1/1     Running   0          30m   10.244.2.204   node02   <none>           <none>
nginx-master-9d4cf4f77-ktnvk   1/1     Running   0          30m   10.244.0.128   master   <none>           <none>
nginx-master-9d4cf4f77-mzrx7   1/1     Running   0          30m   10.244.1.217   node01   <none>           <none>
nginx-master-9d4cf4f77-pcznk   1/1     Running   0          30m   10.244.2.203   node02   <none>           <none>
nginx-master-9d4cf4f77-px98b   1/1     Running   0          30m   10.244.2.205   node02   <none>           <none>
nginx-master-9d4cf4f77-wtcwt   1/1     Running   0          30m   10.244.1.220   node01   <none>           <none>

After node02 is marked as unschedulable, the node list shows it as SchedulingDisabled. From this point the scheduler no longer places new pods on the node, but the pods already on node02 keep running normally.
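In a script, the same state can be checked without parsing the table output. The following is a minimal sketch: the function name is_cordoned is made up for this example, and it relies on cordon setting spec.unschedulable to true on the Node object.

```shell
#!/usr/bin/env bash
# is_cordoned NODE
# Hypothetical helper: succeeds (exit status 0) when the node's
# spec.unschedulable field is "true", i.e. the node has been cordoned.
is_cordoned() {
  local node=$1
  local flag
  flag=$(kubectl get node "$node" -o jsonpath='{.spec.unschedulable}')
  [ "$flag" = "true" ]
}

# Example usage:
#   if is_cordoned node02; then echo "node02 is cordoned"; fi
```

On a node that was never cordoned the field is normally absent, so the query prints an empty string and the check fails.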

2. Evict the pods on the node

[root@master ~]# kubectl drain node02 --delete-local-data --ignore-daemonsets --force 
node/node02 already cordoned

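Parameter description:

  • --delete-local-data: evict pods even if they use emptyDir; the data in those volumes is deleted. (Newer kubectl releases rename this flag to --delete-emptydir-data.)
  • --ignore-daemonsets: skip pods managed by a DaemonSet controller. Without this flag, drain refuses to proceed when such pods exist, because a deleted DaemonSet pod would be started again on the same node right away and the eviction would loop forever.
  • --force: without --force, drain only deletes pods on the node that were created by a ReplicationController, ReplicaSet, DaemonSet, StatefulSet, or Job. With it, drain also deletes "bare" pods that are not managed by any controller.

During the drain, only one pod is migrated at any given moment, so 9 pods are always available to serve traffic.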

Pod nginx-master-9d4cf4f77-pcznk is migrated to node01.

Pod nginx-master-9d4cf4f77-px98b is migrated to master. By this time the previous pod, nginx-master-9d4cf4f77-pcznk, has finished migrating.

Pod nginx-master-9d4cf4f77-69jn6 is migrated to master.

Pod nginx-master-9d4cf4f77-fxsjd is migrated to master.

This again confirms that only one pod is migrated at a time and that the nginx service always has 9 pods serving traffic.
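One way to observe this from a second terminal while the drain runs is to poll the budget's status. This is a sketch under assumptions: the helper name pdb_status is made up, and it reads the currentHealthy and disruptionsAllowed fields from the PodDisruptionBudget status.

```shell
#!/usr/bin/env bash
# pdb_status NAME
# Hypothetical helper: prints "<currentHealthy> <disruptionsAllowed>" for the
# named PodDisruptionBudget in the current namespace.
pdb_status() {
  local name=$1
  kubectl get pdb "$name" \
    -o jsonpath='{.status.currentHealthy} {.status.disruptionsAllowed}'
  echo
}

# Example usage, polling once per second until interrupted with Ctrl-C:
#   while true; do pdb_status pdb-nginx; sleep 1; done
```

With minAvailable: 9 and 10 replicas, the first number would be expected to move between 10 and 9 during the drain and not drop below 9.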

3. Finish maintenance

[root@master ~]# kubectl uncordon node02
node/node02 uncordoned
[root@master ~]# kubectl get nodes      
NAME     STATUS   ROLES    AGE    VERSION
master   Ready    master   184d   v1.14.2
node01   Ready    <none>   183d   v1.14.2
node02   Ready    <none>   183d   v1.14.2

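Once maintenance is finished, mark node02 as schedulable again.

V. Moving pods back

There does not seem to be a good built-in way to move pods back, so here they are deleted and left to the deployment to recreate.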

[root@master ~]# kubectl get po -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
nginx-master-9d4cf4f77-2vnvk   1/1     Running   0          33m   10.244.1.222   node01   <none>           <none>
nginx-master-9d4cf4f77-47vfj   1/1     Running   0          73m   10.244.0.129   master   <none>           <none>
nginx-master-9d4cf4f77-6drhg   1/1     Running   0          73m   10.244.1.218   node01   <none>           <none>
nginx-master-9d4cf4f77-7n7pt   1/1     Running   0          32m   10.244.0.131   master   <none>           <none>
nginx-master-9d4cf4f77-b7zfd   1/1     Running   0          73m   10.244.1.219   node01   <none>           <none>
nginx-master-9d4cf4f77-ktnvk   1/1     Running   0          73m   10.244.0.128   master   <none>           <none>
nginx-master-9d4cf4f77-mzrx7   1/1     Running   0          73m   10.244.1.217   node01   <none>           <none>
nginx-master-9d4cf4f77-pdkst   1/1     Running   0          32m   10.244.0.130   master   <none>           <none>
nginx-master-9d4cf4f77-pskmp   1/1     Running   0          32m   10.244.0.132   master   <none>           <none>
nginx-master-9d4cf4f77-wtcwt   1/1     Running   0          73m   10.244.1.220   node01   <none>           <none>
[root@master ~]# kubectl delete po nginx-master-9d4cf4f77-47vfj
pod "nginx-master-9d4cf4f77-47vfj" deleted
[root@master ~]# kubectl delete po nginx-master-9d4cf4f77-2vnvk
pod "nginx-master-9d4cf4f77-2vnvk" deleted
[root@master ~]# kubectl get po -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
nginx-master-9d4cf4f77-6drhg   1/1     Running   0          76m   10.244.1.218   node01   <none>           <none>
nginx-master-9d4cf4f77-7n7pt   1/1     Running   0          35m   10.244.0.131   master   <none>           <none>
nginx-master-9d4cf4f77-b7zfd   1/1     Running   0          76m   10.244.1.219   node01   <none>           <none>
nginx-master-9d4cf4f77-f92hp   1/1     Running   0          44s   10.244.2.207   node02   <none>           <none>
nginx-master-9d4cf4f77-ktnvk   1/1     Running   0          76m   10.244.0.128   master   <none>           <none>
nginx-master-9d4cf4f77-mzrx7   1/1     Running   0          76m   10.244.1.217   node01   <none>           <none>
nginx-master-9d4cf4f77-pdkst   1/1     Running   0          35m   10.244.0.130   master   <none>           <none>
nginx-master-9d4cf4f77-pskmp   1/1     Running   0          35m   10.244.0.132   master   <none>           <none>
nginx-master-9d4cf4f77-tdghn   1/1     Running   0          15s   10.244.2.208   node02   <none>           <none>
nginx-master-9d4cf4f77-wtcwt   1/1     Running   0          76m   10.244.1.220   node01   <none>           <none>

During off-peak hours, delete pods nginx-master-9d4cf4f77-47vfj and nginx-master-9d4cf4f77-2vnvk. Because every pod on node02 was evicted earlier, that node now has the lowest resource usage, so the recreated pods are scheduled onto it and the move back is complete.
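The delete-and-recreate approach can be scripted so that pods are removed one at a time and the deployment is given time to recover in between. This is only a sketch: the function name move_back and its arguments are made up for illustration, and the scheduler gives no guarantee that a recreated pod lands on node02.

```shell
#!/usr/bin/env bash
# move_back NODE SELECTOR DEPLOYMENT COUNT
# Hypothetical helper: deletes up to COUNT pods matching SELECTOR on NODE, one
# at a time, and waits for the deployment to finish rolling out after each
# deletion so that replacements are running before the next pod is removed.
move_back() {
  local node=$1 selector=$2 deployment=$3 count=$4
  local pod
  for pod in $(kubectl get pods -l "$selector" \
                 --field-selector "spec.nodeName=$node" -o name | head -n "$count"); do
    kubectl delete "$pod"
    kubectl rollout status "deployment/$deployment"
  done
}

# Example usage: move two nginx pods off master and let them be rescheduled.
#   move_back master app=nginx nginx-master 2
```

Deleting a pod directly does not go through the eviction API, so the pdb is not consulted here; waiting on the rollout between deletions is what keeps the number of missing pods at one.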

VI. Removing a Node

1. Remove the node

In day-to-day operations you may need to remove a node from the cluster. This article again uses node02 as the example to show how to remove one.

[root@master ~]# kubectl cordon node02
[root@master ~]# kubectl drain node02 --delete-local-data --ignore-daemonsets --force 
[root@master ~]# kubectl delete node node02
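The three commands run on the master can be chained so that a failed step stops the sequence, for example so that the node is not deleted if the drain did not complete. A minimal sketch, with the function name remove_node made up for this example and the flags taken from the commands above:

```shell
#!/usr/bin/env bash
# remove_node NODE
# Hypothetical helper: cordon, drain, then delete the node. Each step runs only
# if the previous one succeeded, so a failed drain leaves the node registered.
remove_node() {
  local node=$1
  kubectl cordon "$node" &&
    kubectl drain "$node" --delete-local-data --ignore-daemonsets --force &&
    kubectl delete node "$node"
}

# Example usage:
#   remove_node node02
```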

[root@node02 ~]# kubeadm reset

2. Rejoin the node

Run on the master node:

[root@master ~]# kubeadm token create --print-join-command
kubeadm join 172.27.9.131:6443 --token kpz40z.tuxb4t4m1q37vwl1     --discovery-token-ca-cert-hash sha256:5f656ae26b5e7d4641a979cbfdffeb7845cc5962bbfcd1d5435f00a25c02ea50 

node02 rejoins the cluster:

[root@node02 ~]# kubeadm join 172.27.9.131:6443 --token svrip0.lajrfl4jgal0ul6i     --discovery-token-ca-cert-hash sha256:5f656ae26b5e7d4641a979cbfdffeb7845cc5962bbfcd1d5435f00a25c02ea50 

Check the nodes (for example with kubectl get nodes) to confirm that node02 has rejoined the cluster.
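Right after a join, the node may take a short while to report Ready. Instead of re-running the listing by hand, a script can block until the condition is met. A minimal sketch, with the function name wait_node_ready and the 120s default timeout chosen for illustration:

```shell
#!/usr/bin/env bash
# wait_node_ready NODE [TIMEOUT]
# Hypothetical helper: blocks until the node reports the Ready condition or the
# timeout (default 120s, an arbitrary choice) expires; the exit status is
# whatever kubectl wait returns.
wait_node_ready() {
  local node=$1 timeout=${2:-120s}
  kubectl wait --for=condition=Ready "node/$node" --timeout="$timeout"
}

# Example usage:
#   wait_node_ready node02 && kubectl get nodes
```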

All scripts and configuration files used in this article have been uploaded: Pode Eviction and Node Manage
