Kubernetes Endpoints Controller Source Code Analysis

Published: 2021-08-30 16:12:24 Source: 億速云 Views: 109 Author: chen Category: Cloud Computing

This article walks through the source code of the Kubernetes Endpoints Controller: its configuration options, the resources it watches, its event handlers, and its core sync logic.

Endpoints Controller Configuration Options

--concurrent-endpoint-syncs int32 Default: 5 The number of endpoint syncing operations that will be done concurrently. Larger number = faster endpoint updating, but more CPU (and network) load.
--leader-elect-resource-lock endpoints Default: "endpoints" The type of resource object that is used for locking during leader election. Supported options are endpoints (default) and configmaps.

GVKs Watched by the Endpoints Controller

Core/V1/Pods
Core/V1/Services
Core/V1/Endpoints

Endpoints Controller Event Handler

Add Service Event --> enqueueService
Update Service Event --> enqueueService(new)
Delete Service Event --> enqueueService
Add Pod Event --> addPod
Update Pod Event --> updatePod
Delete Pod Event --> deletePod
Add/Update/Delete Endpoints Event --> nil

Run Endpoints Controller

Run starts two kinds of goroutines:

The first kind has as many goroutines as the value of --concurrent-endpoint-syncs (default 5). Each worker is driven by wait.Until with a period of 1s (workerLoopPeriod) and repeatedly pops a Service key from the service queue and runs syncService on it.
The second kind is a single goroutine that runs checkLeftoverEndpoints. It runs only once, at startup.

// Run will not return until stopCh is closed. workers determines how many
// endpoints will be handled in parallel.
func (e *EndpointController) Run(workers int, stopCh <-chan struct{}) {
	defer utilruntime.HandleCrash()
	defer e.queue.ShutDown()

	glog.Infof("Starting endpoint controller")
	defer glog.Infof("Shutting down endpoint controller")

	if !controller.WaitForCacheSync("endpoint", stopCh, e.podsSynced, e.servicesSynced, e.endpointsSynced) {
		return
	}

	// workers = --concurrent-endpoint-syncs's value (default 5)
	for i := 0; i < workers; i++ {
		// workerLoopPeriod = 1s
		go wait.Until(e.worker, e.workerLoopPeriod, stopCh)
	}

	go func() {
		defer utilruntime.HandleCrash()
		e.checkLeftoverEndpoints()
	}()

	<-stopCh
}
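
The worker model above can be sketched with the standard library alone. This is a simplified stand-in rather than the controller's code: a buffered channel replaces the controller's work queue, and `runWorkers`, `syncAll`, and `syncFn` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// runWorkers starts `workers` goroutines that drain keys from queue and call
// syncFn for each key. It returns once the queue is closed and fully drained.
// A plain channel stands in for the controller's work queue.
func runWorkers(workers int, queue <-chan string, syncFn func(key string)) {
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range queue {
				syncFn(key)
			}
		}()
	}
	wg.Wait()
}

// syncAll enqueues the given keys, runs the workers, and returns the keys
// that were synced, sorted so the result is deterministic.
func syncAll(workers int, keys []string) []string {
	queue := make(chan string, len(keys))
	for _, k := range keys {
		queue <- k
	}
	close(queue)

	var mu sync.Mutex
	var synced []string
	runWorkers(workers, queue, func(key string) {
		mu.Lock()
		defer mu.Unlock()
		synced = append(synced, key)
	})
	sort.Strings(synced)
	return synced
}

func main() {
	keys := []string{"default/web", "default/db", "kube-system/dns"}
	fmt.Println(syncAll(5, keys)) // [default/db default/web kube-system/dns]
}
```

More workers let more Services sync in parallel, which matches the trade-off stated for --concurrent-endpoint-syncs: faster updates at the cost of more CPU and network load.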

checkLeftoverEndpoints

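checkLeftoverEndpoints lists all Endpoints currently in the cluster and adds their corresponding Services to the queue, where the workers run syncService on them.

This guards against the case where, while controller-manager was restarting, a user deleted some Services, or some Endpoints had not yet been fully cleaned up, and the Endpoints Controller never handled them. When the Endpoints Controller starts again, checkLeftoverEndpoints detects these orphaned Endpoints (those with no corresponding Service) and puts fictitious Services back on the queue for syncService, which then cleans up the orphaned Endpoints.

The fictitious Service mentioned above is simply the Endpoints key (namespace/name) used as a Service key. This is one of the reasons why an Endpoints object and its Service must have the same name.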

func (e *EndpointController) checkLeftoverEndpoints() {
	list, err := e.endpointsLister.List(labels.Everything())
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("Unable to list endpoints (%v); orphaned endpoints will not be cleaned up. (They're pretty harmless, but you can restart this component if you want another attempt made.)", err))
		return
	}
	for _, ep := range list {
		if _, ok := ep.Annotations[resourcelock.LeaderElectionRecordAnnotationKey]; ok {
			// when there are multiple controller-manager instances,
			// we observe that it will delete leader-election endpoints after 5min
			// and cause re-election
			// so skip the delete here
			// as leader-election only have endpoints without service
			continue
		}
		key, err := keyFunc(ep)
		if err != nil {
			utilruntime.HandleError(fmt.Errorf("Unable to get key for endpoint %#v", ep))
			continue
		}
		e.queue.Add(key)
	}
}

One more point to note: when kube-controller-manager is deployed in HA mode with multiple instances, the controller-manager Endpoints have no corresponding Service. In that case we must not enqueue a fictitious Service that would trigger cleanup of these "legitimately orphaned" Endpoints, so such Endpoints carry the annotation "control-plane.alpha.kubernetes.io/leader" and are skipped.
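
The interplay between checkLeftoverEndpoints and the orphan cleanup can be sketched as follows. This is a simplified model, not the controller's code: the `endpoints` struct, `leftoverKeys`, and `orphans` are hypothetical stand-ins, and a map of keys replaces the Service indexer.

```go
package main

import (
	"fmt"
	"sort"
)

// leaderAnnotation is the annotation key named in the text above.
const leaderAnnotation = "control-plane.alpha.kubernetes.io/leader"

// endpoints is a simplified stand-in for v1.Endpoints.
type endpoints struct {
	Namespace, Name string
	Annotations     map[string]string
}

// key builds the namespace/name key shared by a Service and its Endpoints.
func key(namespace, name string) string {
	return namespace + "/" + name
}

// leftoverKeys mimics checkLeftoverEndpoints: every Endpoints object is
// enqueued under its own key, except leader-election Endpoints.
func leftoverKeys(list []endpoints) []string {
	var keys []string
	for _, ep := range list {
		if _, ok := ep.Annotations[leaderAnnotation]; ok {
			continue // legitimately has no Service; never enqueue it
		}
		keys = append(keys, key(ep.Namespace, ep.Name))
	}
	return keys
}

// orphans mimics the first step of syncService: a key with no matching
// Service means the Endpoints object with the same key should be deleted.
func orphans(keys []string, services map[string]bool) []string {
	var out []string
	for _, k := range keys {
		if !services[k] {
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	list := []endpoints{
		{Namespace: "default", Name: "web"},
		{Namespace: "default", Name: "stale"},
		{Namespace: "kube-system", Name: "kube-controller-manager",
			Annotations: map[string]string{leaderAnnotation: "{}"}},
	}
	services := map[string]bool{"default/web": true}
	fmt.Println(orphans(leftoverKeys(list), services)) // [default/stale]
}
```

Only `default/stale` is reported: `default/web` still has a Service, and the leader-election Endpoints never reach the queue.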

syncService: the Core Logic of the Endpoints Controller

The Add/Update/Delete event handlers for Services all add the Service key to the queue, where a worker picks it up and runs syncService:

Use the Service key (namespace/name) taken from the queue to fetch the Service object from the indexer. If it is not found, call the API to delete the Endpoints object with the same key (namespace/name). This is the cleanup of orphaned Endpoints described under checkLeftoverEndpoints.
A Service matches Pods through its LabelSelector, and the matching Pods are built into Endpoints Subsets that are added to the Endpoints object. Services without a LabelSelector are therefore filtered out at this point.
Next, use the Service's LabelSelector to list all matching Pods in the same namespace.
Check whether service.Spec.PublishNotReadyAddresses is true, or whether the Service annotation "service.alpha.kubernetes.io/tolerate-unready-endpoints" is true (t, T, True, TRUE, and 1 are also accepted). If so, unready endpoints are tolerated, meaning that unready Pods are also added to the Endpoints of this Service.
Note: the annotation "service.alpha.kubernetes.io/tolerate-unready-endpoints" is to be deprecated in Kubernetes 1.13, after which only the .Spec.PublishNotReadyAddresses field is used.
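
The tolerate-unready check can be sketched like this. It follows the description above rather than the controller source: the `service` struct and `tolerateUnready` are hypothetical, the precedence between the field and the annotation is an assumption, and `strconv.ParseBool` is used because it accepts exactly the spellings listed (true, t, T, True, TRUE, 1).

```go
package main

import (
	"fmt"
	"strconv"
)

// tolerateAnnotation is the annotation key named in the text above.
const tolerateAnnotation = "service.alpha.kubernetes.io/tolerate-unready-endpoints"

// service is a simplified stand-in for the v1.Service fields used here.
type service struct {
	PublishNotReadyAddresses bool
	Annotations              map[string]string
}

// tolerateUnready reports whether unready Pods should be published in the
// Endpoints of svc. Assumption: either source being true is enough.
func tolerateUnready(svc service) bool {
	if svc.PublishNotReadyAddresses {
		return true
	}
	v, ok := svc.Annotations[tolerateAnnotation]
	if !ok {
		return false
	}
	// Unparseable values such as "yes" are treated as false.
	b, err := strconv.ParseBool(v)
	return err == nil && b
}

func main() {
	fmt.Println(tolerateUnready(service{PublishNotReadyAddresses: true}))                              // true
	fmt.Println(tolerateUnready(service{Annotations: map[string]string{tolerateAnnotation: "T"}}))     // true
	fmt.Println(tolerateUnready(service{Annotations: map[string]string{tolerateAnnotation: "yes"}}))   // false
	fmt.Println(tolerateUnready(service{Annotations: map[string]string{tolerateAnnotation: "false"}})) // false
}
```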

Next, iterate over the Pods obtained above and build the Endpoints Subsets from each Pod's IP, ContainerPorts, and Hostname together with the Service's Ports. Note the following special cases:

1) Skip Pods whose pod.Status.PodIP is empty.

2) When tolerate unready endpoints is false, skip Pods that are marked for deletion (DeletionTimestamp != nil).

3) For a headless Service that defines no Service Port, the Ports of the resulting EndpointSubset are empty.

4) When tolerate unready endpoints is true (even if the Pod is not Ready) or the Pod is Ready, the Pod's EndpointAddress is added to the (ready) Addresses.

5) When tolerate unready endpoints is false and the Pod is not Ready:

 - If pod.Spec.RestartPolicy is Never, the Pod's EndpointAddress is added to NotReadyAddresses only when Pod Status.Phase is not a terminal phase (neither Failed nor Succeeded).
 - If pod.Spec.RestartPolicy is OnFailure, the Pod's EndpointAddress is added to NotReadyAddresses only when Pod Status.Phase is not Succeeded.
 - In all other cases, the Pod's EndpointAddress is added to NotReadyAddresses.
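
The per-Pod rules above can be condensed into one decision function. This is a sketch under stated assumptions: the `pod` struct flattens the relevant v1.Pod fields into plain values, and `classify` with its string results is a hypothetical helper, not a function from the controller.

```go
package main

import "fmt"

// pod is a simplified stand-in for the v1.Pod fields used in this step.
type pod struct {
	IP            string
	Ready         bool
	Deleting      bool   // true when DeletionTimestamp != nil
	RestartPolicy string // "Always", "OnFailure", or "Never"
	Phase         string // "Pending", "Running", "Succeeded", or "Failed"
}

// classify returns "ready", "notReady", or "skip" for a single Pod,
// following the special cases listed in the text.
func classify(p pod, tolerateUnready bool) string {
	if p.IP == "" {
		return "skip" // no PodIP yet
	}
	if !tolerateUnready && p.Deleting {
		return "skip" // marked for deletion
	}
	if tolerateUnready || p.Ready {
		return "ready" // goes into Addresses
	}
	// Not ready and not tolerated: decide whether it can still become ready.
	switch p.RestartPolicy {
	case "Never":
		if p.Phase == "Failed" || p.Phase == "Succeeded" {
			return "skip"
		}
	case "OnFailure":
		if p.Phase == "Succeeded" {
			return "skip"
		}
	}
	return "notReady" // goes into NotReadyAddresses
}

func main() {
	fmt.Println(classify(pod{IP: "10.0.0.1", Ready: true, RestartPolicy: "Always", Phase: "Running"}, false)) // ready
	fmt.Println(classify(pod{IP: "10.0.0.2", RestartPolicy: "Always", Phase: "Running"}, false))              // notReady
	fmt.Println(classify(pod{IP: "10.0.0.3", RestartPolicy: "Never", Phase: "Failed"}, false))                // skip
	fmt.Println(classify(pod{IP: "10.0.0.4", Deleting: true, RestartPolicy: "Always", Phase: "Running"}, true)) // ready
}
```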

Fetch the Endpoints object for the Service from the indexer (currentEndpoints). If the indexer returns no such object, build an Endpoints object (newEndpoints) with the same name and Labels as the Service.
If the ResourceVersion of currentEndpoints is not empty, compare currentEndpoints.Subsets and Labels with the Subsets built above and Service.Labels using DeepEqual. If they are equal, no update is needed and the flow ends.
Otherwise, DeepCopy currentEndpoints into newEndpoints and replace the corresponding fields of newEndpoints with the Subsets built above and Service.Labels.
If the ResourceVersion of currentEndpoints is empty, call the Create API to create the newEndpoints object from the previous step. If it is not empty, the Endpoints object already exists, so call the Update API to update it with newEndpoints.
The flow ends.
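
The create, update, or no-op decision at the end of syncService can be sketched as below. The `endpoints` struct and `decide` are hypothetical stand-ins, subsets are reduced to strings, and `reflect.DeepEqual` stands in for the DeepEqual comparison the text mentions (it distinguishes nil from empty slices and maps, which a semantic comparison may not).

```go
package main

import (
	"fmt"
	"reflect"
)

// endpoints is a simplified stand-in for v1.Endpoints.
type endpoints struct {
	ResourceVersion string
	Labels          map[string]string
	Subsets         []string // stand-in for []v1.EndpointSubset
}

// decide returns "create", "update", or "noop" for the Endpoints object
// found in the indexer, given the freshly built subsets and Service labels.
func decide(current endpoints, subsets []string, svcLabels map[string]string) string {
	if current.ResourceVersion == "" {
		return "create" // nothing stored yet
	}
	if reflect.DeepEqual(current.Subsets, subsets) &&
		reflect.DeepEqual(current.Labels, svcLabels) {
		return "noop" // stored object already matches
	}
	return "update"
}

func main() {
	labels := map[string]string{"app": "web"}
	stored := endpoints{ResourceVersion: "42", Labels: labels, Subsets: []string{"10.0.0.1:80"}}

	fmt.Println(decide(endpoints{}, []string{"10.0.0.1:80"}, labels))            // create
	fmt.Println(decide(stored, []string{"10.0.0.1:80"}, labels))                 // noop
	fmt.Println(decide(stored, []string{"10.0.0.1:80", "10.0.0.2:80"}, labels)) // update
}
```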

Pod Event Handler

Add Pod

Match each Service's LabelSelector against the Pod's Labels to find all Services the Pod belongs to, then add their keys (namespace/name) to the queue to wait for sync.

// When a pod is added, figure out what services it will be a member of and
// enqueue them. obj must have *v1.Pod type.
func (e *EndpointController) addPod(obj interface{}) {
	pod := obj.(*v1.Pod)
	services, err := e.getPodServiceMemberships(pod)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("Unable to get pod %s/%s's service memberships: %v", pod.Namespace, pod.Name, err))
		return
	}
	for key := range services {
		e.queue.Add(key)
	}
}
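
The membership lookup that addPod relies on can be illustrated with a simplified matcher. This sketch assumes equality-based selectors only; the `service` struct, `selectorMatches`, and `podServiceKeys` are hypothetical names and do not reproduce getPodServiceMemberships itself.

```go
package main

import (
	"fmt"
	"sort"
)

// service is a simplified stand-in for v1.Service.
type service struct {
	Namespace, Name string
	Selector        map[string]string
}

// selectorMatches reports whether every key/value pair in selector is
// present in labels.
func selectorMatches(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// podServiceKeys returns the sorted namespace/name keys of all Services in
// the Pod's namespace whose selector matches the Pod's labels. Services
// without a selector never match.
func podServiceKeys(podNamespace string, podLabels map[string]string, services []service) []string {
	var keys []string
	for _, svc := range services {
		if svc.Namespace != podNamespace || len(svc.Selector) == 0 {
			continue
		}
		if selectorMatches(svc.Selector, podLabels) {
			keys = append(keys, svc.Namespace+"/"+svc.Name)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	services := []service{
		{Namespace: "default", Name: "web", Selector: map[string]string{"app": "web"}},
		{Namespace: "default", Name: "web-canary", Selector: map[string]string{"app": "web", "track": "canary"}},
		{Namespace: "default", Name: "external"}, // no selector
		{Namespace: "other", Name: "web", Selector: map[string]string{"app": "web"}},
	}
	podLabels := map[string]string{"app": "web", "track": "stable"}
	fmt.Println(podServiceKeys("default", podLabels, services)) // [default/web]
}
```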

Update Pod

If newPod.ResourceVersion equals oldPod.ResourceVersion, skip; no update is performed.
Check whether the DeletionTimestamp, the Ready condition, or the EndpointAddress built from PodIP, Hostname, and so on differs between the old and new Pod. If any of them changed, podChangedFlag is true.
Check whether the Pod's Labels, Hostname, or Subdomain differ between the old and new Pod. If any of them changed, labelChangedFlag is true.
If both podChangedFlag and labelChangedFlag are false, skip; no update is performed.
Match each Service's LabelSelector against the Pod's Labels to find all Services that newPod belongs to (recorded as services). If labelChangedFlag is true, also find the Services that oldPod matched (oldServices), then:

If podChangedFlag is true, take the union of services and oldServices and add every Service key in it to the queue to wait for sync.
If podChangedFlag is false, take the union of the mutual differences of services and oldServices and add every Service key in it to the queue to wait for sync.

The "union of the mutual differences" means: services.Difference(oldServices).Union(oldServices.Difference(services))
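
The two set computations can be worked through with a small example. The `set` type below is a minimal hand-written stand-in for the string set the expression operates on, and `servicesToSync` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// set is a minimal string set.
type set map[string]struct{}

func newSet(items ...string) set {
	s := set{}
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

// Difference returns the items in s that are not in other.
func (s set) Difference(other set) set {
	out := set{}
	for k := range s {
		if _, ok := other[k]; !ok {
			out[k] = struct{}{}
		}
	}
	return out
}

// Union returns the items that are in s, in other, or in both.
func (s set) Union(other set) set {
	out := set{}
	for k := range s {
		out[k] = struct{}{}
	}
	for k := range other {
		out[k] = struct{}{}
	}
	return out
}

// sorted returns the items in lexical order for stable printing.
func (s set) sorted() []string {
	out := make([]string, 0, len(s))
	for k := range s {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

// servicesToSync picks the Service keys to enqueue after a Pod update whose
// labels changed, following the two cases in the text.
func servicesToSync(services, oldServices set, podChanged bool) set {
	if podChanged {
		// The Pod itself changed: every old and new Service is affected.
		return services.Union(oldServices)
	}
	// Only membership changed: Services in both sets are unaffected.
	return services.Difference(oldServices).Union(oldServices.Difference(services))
}

func main() {
	services := newSet("default/a", "default/b")
	oldServices := newSet("default/b", "default/c")
	fmt.Println(servicesToSync(services, oldServices, true).sorted())  // [default/a default/b default/c]
	fmt.Println(servicesToSync(services, oldServices, false).sorted()) // [default/a default/c]
}
```

In the second case `default/b` is left out because the Pod belonged to it before and after the update and nothing about the Pod's address or readiness changed.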

Delete Pod

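If the object is still a fully recorded Pod, the logic is the same as addPod: match each Service's LabelSelector against the Pod's Labels to find all Services the Pod belongs to, then add their keys (namespace/name) to the queue to wait for sync.
If the object is a tombstone (final state is unrecorded), convert it to a v1.Pod and then call addPod. Compared with the normal case, the only extra step is the conversion from tombstone to v1.Pod.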

// When a pod is deleted, enqueue the services the pod used to be a member of.
// obj could be an *v1.Pod, or a DeletionFinalStateUnknown marker item.
func (e *EndpointController) deletePod(obj interface{}) {
	if _, ok := obj.(*v1.Pod); ok {
		// Enqueue all the services that the pod used to be a member
		// of. This happens to be exactly the same thing we do when a
		// pod is added.
		e.addPod(obj)
		return
	}
	// If we reached here it means the pod was deleted but its final state is unrecorded.
	tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
	if !ok {
		utilruntime.HandleError(fmt.Errorf("Couldn't get object from tombstone %#v", obj))
		return
	}
	pod, ok := tombstone.Obj.(*v1.Pod)
	if !ok {
		utilruntime.HandleError(fmt.Errorf("Tombstone contained object that is not a Pod: %#v", obj))
		return
	}
	glog.V(4).Infof("Enqueuing services of deleted pod %s/%s having final state unrecorded", pod.Namespace, pod.Name)
	e.addPod(pod)
}

Core Structs

Several of the structs involved are easy to confuse, so a diagram is used to compare them:

[Figure: comparison of the core structs]

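Summary

Walking through the Endpoints Controller source reveals many details, such as how Service and Pod events are handled, how orphaned Endpoints are cleaned up, and what effect Pod label changes have. This is helpful when writing your own Ingress component that watches Endpoints in order to integrate with a company's internal routing component.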
