What Are Predicates Policies For?

Published: 2021-12-20 09:58:21 | Source: Yisu Cloud (億速云) | Views: 159 | Author: iii | Category: Cloud Computing

This article explains what Predicates Policies are for. Many people run into trouble with them when working through real cases, so the sections below walk through how each policy behaves.

##Predicates Policies Analysis

The following predicate (pre-selection) policies are implemented in /plugin/pkg/scheduler/algorithm/predicates.go:

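NoDiskConflict: Checks whether scheduling the Pod onto this host would cause a volume conflict. If the host has already mounted a volume, other Pods that use the same volume cannot be scheduled onto it. The rules for GCE, Amazon EBS, and Ceph RBD are as follows:

GCE allows the same volume to be mounted more than once, as long as every mount is read-only.
Amazon EBS does not allow different Pods to mount the same volume.
Ceph RBD does not allow any two pods to share the same monitor, pool, and image.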

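NoVolumeZoneConflict: Given the zone restrictions, checks whether deploying the Pod on this host would cause a volume conflict. Some volumes may carry zone scheduling constraints, and VolumeZonePredicate evaluates whether the pod fits based on the volumes' own requirements. The necessary condition is that any zone-labels on the volumes must exactly match the zone-labels on the node. The node may carry additional zone-label constraints (for example, a hypothetical replicated volume might allow region-wide access). At present this is supported only for PersistentVolumeClaims, and labels are looked up only on the PersistentVolume. Handling volumes defined directly in the Pod spec (that is, without a PersistentVolume) is likely to be harder, because determining the volume's zone during scheduling would probably require a call to the cloud provider.
PodFitsResources: Checks whether the host's resources satisfy the Pod's requirements. Scheduling is based on the amount of resources already allocated (requested), not on the amount actually in use.
PodFitsHostPorts: Checks whether any HostPort required by a container in the Pod is already taken by another container. If a required HostPort is unavailable, the Pod cannot be scheduled onto this host.
HostName: Checks whether the host's name matches the HostName specified by the Pod.
MatchNodeSelector: Checks whether the host's labels satisfy the Pod's nodeSelector.
MaxEBSVolumeCount: Ensures that the number of attached EBS volumes does not exceed the configured maximum. The default is 39. It counts volumes used directly as well as PVCs that indirectly use this storage type. It totals the distinct volumes, and if deploying the new Pod would push that total past the maximum, the Pod cannot be scheduled onto this host.
MaxGCEPDVolumeCount: Ensures that the number of attached GCE volumes does not exceed the configured maximum. The default is 16. The rules are the same as above.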
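
To make the PodFitsResources and PodFitsHostPorts descriptions above concrete, here is a minimal sketch. The `Resources` and `Node` types and the `fitsResources` and `fitsHostPorts` functions are hypothetical simplifications written for this illustration. They are not the scheduler's real types or code.

```go
package main

import "fmt"

// Resources is a simplified stand-in for the CPU and memory quantities the
// scheduler tracks. It is an assumption of this sketch, not a Kubernetes type.
type Resources struct {
	MilliCPU int64 // CPU in millicores
	Memory   int64 // memory in bytes
}

// Node is a hypothetical, stripped-down view of a node: its allocatable
// capacity, the summed requests of pods already placed on it, and the host
// ports those pods occupy.
type Node struct {
	Allocatable Resources
	Requested   Resources
	UsedPorts   map[int]bool
}

// fitsResources follows the idea behind PodFitsResources: the pod's requests
// are compared with what is still unrequested on the node. Actual usage is
// deliberately not consulted.
func fitsResources(pod Resources, n Node) bool {
	return n.Requested.MilliCPU+pod.MilliCPU <= n.Allocatable.MilliCPU &&
		n.Requested.Memory+pod.Memory <= n.Allocatable.Memory
}

// fitsHostPorts follows the idea behind PodFitsHostPorts: every host port the
// pod wants must still be free on the node.
func fitsHostPorts(wanted []int, n Node) bool {
	for _, p := range wanted {
		if n.UsedPorts[p] {
			return false
		}
	}
	return true
}

func main() {
	n := Node{
		Allocatable: Resources{MilliCPU: 4000, Memory: 8 << 30},
		Requested:   Resources{MilliCPU: 3500, Memory: 2 << 30},
		UsedPorts:   map[int]bool{8080: true},
	}
	fmt.Println(fitsResources(Resources{MilliCPU: 400, Memory: 1 << 30}, n)) // true: 3900m <= 4000m
	fmt.Println(fitsResources(Resources{MilliCPU: 600, Memory: 1 << 30}, n)) // false: 4100m > 4000m
	fmt.Println(fitsHostPorts([]int{8080}, n))                               // false: port already taken
}
```

Note that a node which is nearly idle in practice can still reject a Pod here, because only the declared requests count.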

Below is the implementation of NoDiskConflict. The other Predicates Policies are implemented in a similar way, and all of them must match the following function prototype: `type FitPredicate func(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (bool, []PredicateFailureReason, error)`

```go
func NoDiskConflict(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
	for _, v := range pod.Spec.Volumes {
		for _, ev := range nodeInfo.Pods() {
			if isVolumeConflict(v, ev) {
				return false, []algorithm.PredicateFailureReason{ErrDiskConflict}, nil
			}
		}
	}
	return true, nil, nil
}

func isVolumeConflict(volume v1.Volume, pod *v1.Pod) bool {
	// fast path if there is no conflict checking targets.
	if volume.GCEPersistentDisk == nil && volume.AWSElasticBlockStore == nil && volume.RBD == nil && volume.ISCSI == nil {
		return false
	}

	for _, existingVolume := range pod.Spec.Volumes {
		// ...

		if volume.RBD != nil && existingVolume.RBD != nil {
			mon, pool, image := volume.RBD.CephMonitors, volume.RBD.RBDPool, volume.RBD.RBDImage
			emon, epool, eimage := existingVolume.RBD.CephMonitors, existingVolume.RBD.RBDPool, existingVolume.RBD.RBDImage
			// two RBDs images are the same if they share the same Ceph monitor, are in the same RADOS Pool, and have the same image name
			// only one read-write mount is permitted for the same RBD image.
			// same RBD image mounted by multiple Pods conflicts unless all Pods mount the image read-only
			if haveSame(mon, emon) && pool == epool && image == eimage && !(volume.RBD.ReadOnly && existingVolume.RBD.ReadOnly) {
				return true
			}
		}
	}

	return false
}
```
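
Because every predicate shares one function shape, a caller can keep them in a list and run each one against every candidate node. The sketch below shows that pattern in a simplified form. The `Pod`, `NodeInfo`, and `fitPredicate` types and the `filterNodes` and `maxPods` functions are assumptions invented for this illustration; the real scheduler uses the Kubernetes types shown in the prototype above.

```go
package main

import "fmt"

// Pod and NodeInfo are simplified stand-ins for *v1.Pod and
// *schedulercache.NodeInfo. They are assumptions made for this sketch.
type Pod struct {
	Name     string
	NodeName string // requested host; empty means any host is acceptable
}

type NodeInfo struct {
	Name string
	Pods []Pod // pods already placed on the node
}

// fitPredicate follows the shape of FitPredicate, without the meta argument
// and with plain strings standing in for failure reasons.
type fitPredicate func(pod Pod, node NodeInfo) (bool, []string, error)

// hostName is a toy version of the HostName predicate.
func hostName(pod Pod, node NodeInfo) (bool, []string, error) {
	if pod.NodeName == "" || pod.NodeName == node.Name {
		return true, nil, nil
	}
	return false, []string{"HostName"}, nil
}

// maxPods builds a hypothetical predicate that rejects nodes which already
// run limit pods or more.
func maxPods(limit int) fitPredicate {
	return func(pod Pod, node NodeInfo) (bool, []string, error) {
		if len(node.Pods) >= limit {
			return false, []string{"MaxPods"}, nil
		}
		return true, nil, nil
	}
}

// filterNodes returns the names of the nodes that pass every predicate, plus
// the failure reasons collected for each rejected node.
func filterNodes(pod Pod, nodes []NodeInfo, preds []fitPredicate) ([]string, map[string][]string, error) {
	var feasible []string
	failed := make(map[string][]string)
	for _, node := range nodes {
		fits := true
		for _, pred := range preds {
			ok, reasons, err := pred(pod, node)
			if err != nil {
				return nil, nil, err
			}
			if !ok {
				fits = false
				failed[node.Name] = append(failed[node.Name], reasons...)
			}
		}
		if fits {
			feasible = append(feasible, node.Name)
		}
	}
	return feasible, failed, nil
}

func main() {
	nodes := []NodeInfo{
		{Name: "node-a", Pods: []Pod{{Name: "p1"}, {Name: "p2"}}},
		{Name: "node-b"},
	}
	preds := []fitPredicate{hostName, maxPods(2)}
	feasible, failed, err := filterNodes(Pod{Name: "web"}, nodes, preds)
	fmt.Println(feasible, failed, err)
}
```

In this run only node-b remains feasible, and node-a is rejected with the made-up reason "MaxPods".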

##Priorities Policies Analysis

The priority functions currently supported are:

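LeastRequestedPriority: When a new pod is about to be assigned to a node, the node's priority is determined by the ratio of its free capacity to its total capacity, that is, (total capacity - sum of the capacity of pods on the node - capacity of the new pod) / total capacity. CPU and memory carry equal weight, and the node with the largest ratio gets the highest score. Note that this priority function has the effect of spreading pods across nodes according to resource consumption. The formula is: (cpu((capacity - sum(requested)) * 10 / capacity) + memory((capacity - sum(requested)) * 10 / capacity)) / 2
BalancedResourceAllocation: Prefers machines whose resource usage would be more balanced after the Pod is deployed. BalancedResourceAllocation cannot be used on its own; it must be used together with LeastRequestedPriority. It computes the CPU fraction and the memory fraction on the host separately, and the host's score is determined by the "distance" between the two. The formula is: score = 10 - abs(cpuFraction-memoryFraction)*10
SelectorSpreadPriority: Spreads Pods that belong to the same service or replication controller across different hosts as far as possible. If zones are specified, Pods are spread across different hosts in different zones as far as possible. When a Pod is scheduled, the scheduler first looks up the Pod's corresponding service or replication controller, then looks up the Pods that already exist under it. The fewer of those existing Pods a host runs, the higher its score.
CalculateAntiAffinityPriority: Spreads Pods that belong to the same service across different hosts that carry a specified label, as far as possible.
ImageLocalityPriority: Scores a host by whether it already has what the Pod needs in order to run. ImageLocalityPriority checks whether the images the Pod requires already exist on the host and returns a score from 0 to 10 based on the size of those images. If none of the required images exist on the host, it returns 0. If some of them exist, the score depends on their size: the larger the images, the higher the score.
NodeAffinityPriority (a new experimental feature in Kubernetes 1.2): The affinity mechanism in Kubernetes scheduling. Node Selectors (which restrict a pod to specified nodes at scheduling time) support several operators (In, NotIn, Exists, DoesNotExist, Gt, Lt) and are not limited to exact matches on node labels. In addition, Kubernetes supports two kinds of selectors. The first is the "hard" (requiredDuringSchedulingIgnoredDuringExecution) selector, which guarantees that the chosen host satisfies all of the Pod's rules for hosts. It behaves much like the earlier nodeselector, with a more expressive syntax added on top. The second is the "soft" (preferredDuringSchedulingIgnoredDuringExecution) selector, which acts as a hint to the scheduler: the scheduler tries to satisfy all of the NodeSelector requirements but does not guarantee it.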
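
The two formulas given for LeastRequestedPriority and BalancedResourceAllocation can be checked with a small worked example. This is a sketch under stated assumptions: the function names are invented here, the "requested" figures are taken to include the new pod, integer division is assumed for the least-requested score, and the guards that return 0 for a full or zero-capacity node are an assumption about edge cases that the formulas themselves do not cover.

```go
package main

import (
	"fmt"
	"math"
)

// unusedScore computes (capacity - requested) * 10 / capacity for a single
// resource. Returning 0 for an empty or over-committed node is an assumption.
func unusedScore(requested, capacity int64) int64 {
	if capacity == 0 || requested > capacity {
		return 0
	}
	return (capacity - requested) * 10 / capacity
}

// leastRequestedScore averages the CPU and memory scores with equal weight.
func leastRequestedScore(cpuReq, cpuCap, memReq, memCap int64) int64 {
	return (unusedScore(cpuReq, cpuCap) + unusedScore(memReq, memCap)) / 2
}

// balancedScore computes 10 - abs(cpuFraction-memoryFraction)*10, where each
// fraction is requested/capacity for that resource.
func balancedScore(cpuReq, cpuCap, memReq, memCap int64) int {
	if cpuCap == 0 || memCap == 0 {
		return 0
	}
	cpuFraction := float64(cpuReq) / float64(cpuCap)
	memFraction := float64(memReq) / float64(memCap)
	if cpuFraction >= 1 || memFraction >= 1 {
		return 0 // assumed: a node that is full on either resource scores lowest
	}
	return int(10 - math.Abs(cpuFraction-memFraction)*10)
}

func main() {
	// Node with 4000m CPU and 8000 units of memory.
	// After placing the pod, 3000m CPU and 2000 memory units are requested.
	fmt.Println(leastRequestedScore(3000, 4000, 2000, 8000)) // (2 + 7) / 2 = 4
	fmt.Println(balancedScore(3000, 4000, 2000, 8000))       // 10 - |0.75-0.25|*10 = 5

	// A node where both resources end up half requested.
	fmt.Println(leastRequestedScore(2000, 4000, 4000, 8000)) // (5 + 5) / 2 = 5
	fmt.Println(balancedScore(2000, 4000, 4000, 8000))       // 10 - 0 = 10
}
```

The second node wins on both functions: it has more free capacity on average, and its CPU and memory fractions are equal.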

Below is the implementation of ImageLocalityPriority. The other Priorities Policies are implemented in a similar way, and all of them must match the following function prototype: `type PriorityMapFunction func(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (schedulerapi.HostPriority, error)`

```go
func ImageLocalityPriorityMap(pod *v1.Pod, meta interface{}, nodeInfo *schedulercache.NodeInfo) (schedulerapi.HostPriority, error) {
	node := nodeInfo.Node()
	if node == nil {
		return schedulerapi.HostPriority{}, fmt.Errorf("node not found")
	}

	var sumSize int64
	for i := range pod.Spec.Containers {
		sumSize += checkContainerImageOnNode(node, &pod.Spec.Containers[i])
	}
	return schedulerapi.HostPriority{
		Host:  node.Name,
		Score: calculateScoreFromSize(sumSize),
	}, nil
}

func calculateScoreFromSize(sumSize int64) int {
	var score int
	switch {
	case sumSize == 0 || sumSize < minImgSize:
		// score == 0 means none of the images required by this pod are present on this
		// node or the total size of the images present is too small to be taken into further consideration.
		score = 0
	// If existing images' total size is larger than max, just make it highest priority.
	case sumSize >= maxImgSize:
		score = 10
	default:
		score = int((10 * (sumSize - minImgSize) / (maxImgSize - minImgSize)) + 1)
	}
	// Return which bucket the given size belongs to
	return score
}
```

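The score for each Node is computed as: score = int((10 * (sumSize - minImgSize) / (maxImgSize - minImgSize)) + 1)

Here minImgSize int64 = 23 * mb, maxImgSize int64 = 1000 * mb, and sumSize is the total size of the container images defined in the Pod that are already present on the Node.

In other words, the larger the total size of the Pod's required container images already on a Node, the higher that Node's score and the more likely it is to be chosen as the target.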
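
The scoring function can be run on its own to see which bucket a given image size falls into. The sketch below copies the logic of calculateScoreFromSize and assumes `mb` is 1024 * 1024 bytes, since the excerpt above does not show its definition.

```go
package main

import "fmt"

const (
	mb         int64 = 1024 * 1024 // assumed value; not shown in the excerpt above
	minImgSize int64 = 23 * mb
	maxImgSize int64 = 1000 * mb
)

// calculateScoreFromSize maps the total size of the required images already
// on a node to a score between 0 and 10, following the logic shown above.
func calculateScoreFromSize(sumSize int64) int {
	switch {
	case sumSize == 0 || sumSize < minImgSize:
		// No required image is present, or the total is too small to matter.
		return 0
	case sumSize >= maxImgSize:
		// At or above the upper bound the node gets the highest score.
		return 10
	default:
		return int((10 * (sumSize - minImgSize) / (maxImgSize - minImgSize)) + 1)
	}
}

func main() {
	for _, size := range []int64{0, 10 * mb, 23 * mb, 500 * mb, 1000 * mb} {
		fmt.Printf("%4d MB -> score %d\n", size/mb, calculateScoreFromSize(size))
	}
}
```

With these constants, sizes of 0 MB and 10 MB score 0, 23 MB scores 1, 500 MB scores 5, and 1000 MB scores 10. Because of the + 1 in the default branch, any size that clears minImgSize scores at least 1.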
