This article explains how the Kubernetes Node Controller is created.
When the Controller Manager starts, it launches a series of controllers. The Node Controller is one of the controllers started in the StartControllers method, and it is created by the following code.
cmd/kube-controller-manager/app/controllermanager.go:455

    nodeController, err := nodecontroller.NewNodeController(
        sharedInformers.Core().V1().Pods(),
        sharedInformers.Core().V1().Nodes(),
        sharedInformers.Extensions().V1beta1().DaemonSets(),
        cloud,
        clientBuilder.ClientOrDie("node-controller"),
        s.PodEvictionTimeout.Duration,
        s.NodeEvictionRate,
        s.SecondaryNodeEvictionRate,
        s.LargeClusterSizeThreshold,
        s.UnhealthyZoneThreshold,
        s.NodeMonitorGracePeriod.Duration,
        s.NodeStartupGracePeriod.Duration,
        s.NodeMonitorPeriod.Duration,
        clusterCIDR,
        serviceCIDR,
        int(s.NodeCIDRMaskSize),
        s.AllocateNodeCIDRs,
        s.EnableTaintManager,
        utilfeature.DefaultFeatureGate.Enabled(features.TaintBasedEvictions),
    )
As the call shows, the Node Controller mainly lists and watches the following objects through sharedInformers:
Pods
Nodes
DaemonSets
Also note:
s.EnableTaintManager defaults to true, meaning the Taint Manager is enabled by default; it can be set with --enable-taint-manager.
DefaultFeatureGate.Enabled(features.TaintBasedEvictions) defaults to false; it can be changed to true by adding TaintBasedEvictions=true to --feature-gates. When it is true, Pod eviction on a Node is carried out through the TaintManager.
Supplement: the default Kubernetes feature gates are defined in the following code:
pkg/features/kube_features.go:100

    var defaultKubernetesFeatureGates = map[utilfeature.Feature]utilfeature.FeatureSpec{
        ExternalTrafficLocalOnly:                    {Default: true, PreRelease: utilfeature.Beta},
        AppArmor:                                    {Default: true, PreRelease: utilfeature.Beta},
        DynamicKubeletConfig:                        {Default: false, PreRelease: utilfeature.Alpha},
        DynamicVolumeProvisioning:                   {Default: true, PreRelease: utilfeature.Alpha},
        ExperimentalHostUserNamespaceDefaultingGate: {Default: false, PreRelease: utilfeature.Beta},
        ExperimentalCriticalPodAnnotation:           {Default: false, PreRelease: utilfeature.Alpha},
        AffinityInAnnotations:                       {Default: false, PreRelease: utilfeature.Alpha},
        Accelerators:                                {Default: false, PreRelease: utilfeature.Alpha},
        TaintBasedEvictions:                         {Default: false, PreRelease: utilfeature.Alpha},

        // inherited features from generic apiserver, relisted here to get a conflict if it is changed
        // unintentionally on either side:
        StreamingProxyRedirects: {Default: true, PreRelease: utilfeature.Beta},
    }
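How the two settings combine can be sketched in a few lines of Go. The NewNodeController constructor quoted in this article stores `useTaintBasedEvictions && runTaintManager`, so the feature gate only takes effect while the Taint Manager is also enabled. The helper name `effectiveTaintBasedEvictions` is hypothetical and exists only for this illustration; it is not a Kubernetes function.

```go
package main

import "fmt"

// effectiveTaintBasedEvictions mirrors the expression
// `useTaintBasedEvictions && runTaintManager` from NewNodeController.
// The function name is hypothetical; it does not exist in Kubernetes.
func effectiveTaintBasedEvictions(runTaintManager, featureGateEnabled bool) bool {
	return featureGateEnabled && runTaintManager
}

func main() {
	cases := []struct {
		runTaintManager    bool
		featureGateEnabled bool
	}{
		{true, false},  // the defaults described in the article
		{true, true},   // --feature-gates=TaintBasedEvictions=true
		{false, true},  // gate on, but --enable-taint-manager=false
		{false, false}, // both off
	}
	for _, c := range cases {
		fmt.Printf("enable-taint-manager=%t TaintBasedEvictions=%t -> useTaintBasedEvictions=%t\n",
			c.runTaintManager, c.featureGateEnabled,
			effectiveTaintBasedEvictions(c.runTaintManager, c.featureGateEnabled))
	}
}
```

With the default settings the result is false, and turning the feature gate on has no effect if the Taint Manager has been disabled.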
    func NewNodeController(
        podInformer coreinformers.PodInformer,
        nodeInformer coreinformers.NodeInformer,
        daemonSetInformer extensionsinformers.DaemonSetInformer,
        cloud cloudprovider.Interface,
        kubeClient clientset.Interface,
        podEvictionTimeout time.Duration,
        evictionLimiterQPS float32,
        secondaryEvictionLimiterQPS float32,
        largeClusterThreshold int32,
        unhealthyZoneThreshold float32,
        nodeMonitorGracePeriod time.Duration,
        nodeStartupGracePeriod time.Duration,
        nodeMonitorPeriod time.Duration,
        clusterCIDR *net.IPNet,
        serviceCIDR *net.IPNet,
        nodeCIDRMaskSize int,
        allocateNodeCIDRs bool,
        runTaintManager bool,
        useTaintBasedEvictions bool) (*NodeController, error) {
        ...
        nc := &NodeController{
            cloud:              cloud,
            knownNodeSet:       make(map[string]*v1.Node),
            kubeClient:         kubeClient,
            recorder:           recorder,
            podEvictionTimeout: podEvictionTimeout,
            // Not configurable: "The maximum duration before a pod evicted from a node can be forcefully terminated".
            maximumGracePeriod:              5 * time.Minute,
            zonePodEvictor:                  make(map[string]*RateLimitedTimedQueue),
            zoneNotReadyOrUnreachableTainer: make(map[string]*RateLimitedTimedQueue),
            nodeStatusMap:                   make(map[string]nodeStatusData),
            nodeMonitorGracePeriod:          nodeMonitorGracePeriod,
            nodeMonitorPeriod:               nodeMonitorPeriod,
            nodeStartupGracePeriod:          nodeStartupGracePeriod,
            lookupIP:                        net.LookupIP,
            now:                             metav1.Now,
            clusterCIDR:                     clusterCIDR,
            serviceCIDR:                     serviceCIDR,
            allocateNodeCIDRs:               allocateNodeCIDRs,
            forcefullyDeletePod:             func(p *v1.Pod) error { return forcefullyDeletePod(kubeClient, p) },
            nodeExistsInCloudProvider:       func(nodeName types.NodeName) (bool, error) { return nodeExistsInCloudProvider(cloud, nodeName) },
            evictionLimiterQPS:              evictionLimiterQPS,
            secondaryEvictionLimiterQPS:     secondaryEvictionLimiterQPS,
            largeClusterThreshold:           largeClusterThreshold,
            unhealthyZoneThreshold:          unhealthyZoneThreshold,
            zoneStates:                      make(map[string]zoneState),
            runTaintManager:                 runTaintManager,
            useTaintBasedEvictions:          useTaintBasedEvictions && runTaintManager,
        }
        ...

        // Register ReducedQPSFunc as enterPartialDisruptionFunc. When the zone state is
        // "PartialDisruption", ReducedQPSFunc is invoked to setLimiterInZone.
        nc.enterPartialDisruptionFunc = nc.ReducedQPSFunc

        // Register HealthyQPSFunc as enterFullDisruptionFunc. When the zone state is
        // "FullDisruption", HealthyQPSFunc is invoked to setLimiterInZone.
        nc.enterFullDisruptionFunc = nc.HealthyQPSFunc

        // Register ComputeZoneState as computeZoneStateFunc. During handleDisruption,
        // ComputeZoneState is invoked to count the unhealthy Nodes and compute the zone state.
        nc.computeZoneStateFunc = nc.ComputeZoneState

        // Register the PodInformer event handlers: Add, Update, Delete.
        podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
            // For Pod Add and Update events, the kubelet version on the Node is checked. If it is
            // lower than 1.1.0, forcefullyDeletePod calls the apiserver directly to delete the
            // Pod object from etcd.
            // For Pod Add, Update, and Delete events, if the TaintManager is running, the
            // Tolerations of the old and new Pod are compared. If they differ, the Pod change is
            // added to the podUpdateQueue of the NoExecuteTaintManager and handled by the Taint
            // Controller. For a Delete event, newPod is nil.
            AddFunc: func(obj interface{}) {
                nc.maybeDeleteTerminatingPod(obj)
                pod := obj.(*v1.Pod)
                if nc.taintManager != nil {
                    nc.taintManager.PodUpdated(nil, pod)
                }
            },
            UpdateFunc: func(prev, obj interface{}) {
                nc.maybeDeleteTerminatingPod(obj)
                prevPod := prev.(*v1.Pod)
                newPod := obj.(*v1.Pod)
                if nc.taintManager != nil {
                    nc.taintManager.PodUpdated(prevPod, newPod)
                }
            },
            DeleteFunc: func(obj interface{}) {
                pod, isPod := obj.(*v1.Pod)
                // We can get DeletedFinalStateUnknown instead of *v1.Node here and we need to handle that correctly. #34692
                if !isPod {
                    deletedState, ok := obj.(cache.DeletedFinalStateUnknown)
                    if !ok {
                        glog.Errorf("Received unexpected object: %v", obj)
                        return
                    }
                    pod, ok = deletedState.Obj.(*v1.Pod)
                    if !ok {
                        glog.Errorf("DeletedFinalStateUnknown contained non-Node object: %v", deletedState.Obj)
                        return
                    }
                }
                if nc.taintManager != nil {
                    nc.taintManager.PodUpdated(pod, nil)
                }
            },
        })

        // returns true if the shared informer's store has synced.
        nc.podInformerSynced = podInformer.Informer().HasSynced

        // Register the NodeInformer event handlers: Add, Update, Delete.
        nodeEventHandlerFuncs := cache.ResourceEventHandlerFuncs{}
        if nc.allocateNodeCIDRs {
            // --allocate-node-cidrs: "Should CIDRs for Pods be allocated and set on the cloud provider."
            ...
        } else {
            nodeEventHandlerFuncs = cache.ResourceEventHandlerFuncs{
                // For Node Add, Update, and Delete events, if the TaintManager is running, the
                // Taints of the old and new Node are compared. If they differ, the Node change is
                // added to the nodeUpdateQueue of the NoExecuteTaintManager and handled by the
                // Taint Controller. For a Delete event, newNode is nil.
                AddFunc: func(originalObj interface{}) {
                    obj, err := api.Scheme.DeepCopy(originalObj)
                    if err != nil {
                        utilruntime.HandleError(err)
                        return
                    }
                    node := obj.(*v1.Node)
                    if nc.taintManager != nil {
                        nc.taintManager.NodeUpdated(nil, node)
                    }
                },
                UpdateFunc: func(oldNode, newNode interface{}) {
                    node := newNode.(*v1.Node)
                    prevNode := oldNode.(*v1.Node)
                    if nc.taintManager != nil {
                        nc.taintManager.NodeUpdated(prevNode, node)
                    }
                },
                DeleteFunc: func(originalObj interface{}) {
                    obj, err := api.Scheme.DeepCopy(originalObj)
                    if err != nil {
                        utilruntime.HandleError(err)
                        return
                    }
                    node, isNode := obj.(*v1.Node)
                    // We can get DeletedFinalStateUnknown instead of *v1.Node here and we need to handle that correctly. #34692
                    if !isNode {
                        deletedState, ok := obj.(cache.DeletedFinalStateUnknown)
                        if !ok {
                            glog.Errorf("Received unexpected object: %v", obj)
                            return
                        }
                        node, ok = deletedState.Obj.(*v1.Node)
                        if !ok {
                            glog.Errorf("DeletedFinalStateUnknown contained non-Node object: %v", deletedState.Obj)
                            return
                        }
                    }
                    if nc.taintManager != nil {
                        nc.taintManager.NodeUpdated(node, nil)
                    }
                },
            }
        }

        // Register NoExecuteTaintManager as taintManager.
        if nc.runTaintManager {
            nc.taintManager = NewNoExecuteTaintManager(kubeClient)
        }

        nodeInformer.Informer().AddEventHandler(nodeEventHandlerFuncs)
        nc.nodeLister = nodeInformer.Lister()

        // returns true if the shared informer's nodeStore has synced.
        nc.nodeInformerSynced = nodeInformer.Informer().HasSynced

        // returns true if the shared informer's daemonSetStore has synced.
        nc.daemonSetStore = daemonSetInformer.Lister()
        nc.daemonSetInformerSynced = daemonSetInformer.Informer().HasSynced

        return nc, nil
    }
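The DeleteFunc handlers in the constructor accept either the deleted object itself or a DeletedFinalStateUnknown wrapper around it. A minimal, self-contained sketch of that unwrapping pattern follows. The `Pod` and `DeletedFinalStateUnknown` types here are simplified local stand-ins for `*v1.Pod` and `cache.DeletedFinalStateUnknown`, and `podFromDeleteEvent` is a hypothetical helper name; none of this is the actual Kubernetes code.

```go
package main

import "fmt"

// Pod is a simplified stand-in for v1.Pod (assumption for this sketch).
type Pod struct{ Name string }

// DeletedFinalStateUnknown is a local stand-in for cache.DeletedFinalStateUnknown,
// which wraps the last known state of an object whose delete event was missed.
type DeletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// podFromDeleteEvent (hypothetical helper) extracts the Pod from a delete
// notification, whether it arrives directly or wrapped in a tombstone.
func podFromDeleteEvent(obj interface{}) (*Pod, error) {
	if pod, ok := obj.(*Pod); ok {
		return pod, nil
	}
	tombstone, ok := obj.(DeletedFinalStateUnknown)
	if !ok {
		return nil, fmt.Errorf("received unexpected object: %v", obj)
	}
	pod, ok := tombstone.Obj.(*Pod)
	if !ok {
		return nil, fmt.Errorf("tombstone contained non-Pod object: %v", tombstone.Obj)
	}
	return pod, nil
}

func main() {
	events := []interface{}{
		&Pod{Name: "web-0"},
		DeletedFinalStateUnknown{Key: "default/web-1", Obj: &Pod{Name: "web-1"}},
		"not a pod",
	}
	for _, obj := range events {
		pod, err := podFromDeleteEvent(obj)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		// In the real handler this is where PodUpdated(pod, nil) would be called.
		fmt.Println("deleted pod:", pod.Name)
	}
}
```

Returning an error instead of logging inside the helper keeps the sketch easy to test; the quoted handlers log with glog and return early instead.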
In summary, creating a NodeController instance does the following:
Sets maximumGracePeriod: "The maximum duration before a pod evicted from a node can be forcefully terminated." It is not configurable and is hard-coded to 5 minutes.
Registers ReducedQPSFunc as enterPartialDisruptionFunc. When the zone state is "PartialDisruption", ReducedQPSFunc is invoked to setLimiterInZone.
Registers HealthyQPSFunc as enterFullDisruptionFunc. When the zone state is "FullDisruption", HealthyQPSFunc is invoked to setLimiterInZone.
Registers ComputeZoneState as computeZoneStateFunc. During handleDisruption, ComputeZoneState is invoked to count the unhealthy Nodes and compute the zone state.
Registers the PodInformer event handlers: Add, Update, Delete.
  For Pod Add and Update events, the kubelet version on the Node is checked. If it is lower than 1.1.0, forcefullyDeletePod calls the apiserver directly to delete the Pod object from etcd.
  For Pod Add, Update, and Delete events, if the TaintManager is running, the Tolerations of the old and new Pod are compared. If they differ, the Pod change is added to the podUpdateQueue of the NoExecuteTaintManager and handled by the Taint Controller. For a Delete event, newPod is nil.
Registers podInformerSynced, which reports whether the shared informer's Pod store has synced.
Registers the NodeInformer event handlers: Add, Update, Delete.
  For Node Add, Update, and Delete events, if the TaintManager is running, the Taints of the old and new Node are compared. If they differ, the Node change is added to the nodeUpdateQueue of the NoExecuteTaintManager and handled by the Taint Controller. For a Delete event, newNode is nil.
Registers NoExecuteTaintManager as taintManager.
Registers nodeInformerSynced, which reports whether the shared informer's Node store has synced.
Registers daemonSetInformerSynced, which reports whether the shared informer's DaemonSet store has synced.
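The bodies of ReducedQPSFunc and HealthyQPSFunc are not quoted in this article. The sketch below shows one plausible shape for them, assuming that the healthy rate is simply evictionLimiterQPS and that the reduced rate is secondaryEvictionLimiterQPS for zones larger than largeClusterThreshold and 0 otherwise. That behavior, the `evictionRates` type, its method names, and the numeric values are all assumptions made for illustration and should be checked against the Kubernetes source.

```go
package main

import "fmt"

// evictionRates groups the three constructor parameters that influence the
// per-zone eviction rate. The type and its methods are hypothetical.
type evictionRates struct {
	evictionLimiterQPS          float32
	secondaryEvictionLimiterQPS float32
	largeClusterThreshold       int32
}

// healthyQPS sketches HealthyQPSFunc: assumed to return the normal eviction
// rate regardless of zone size.
func (r evictionRates) healthyQPS(nodeNum int) float32 {
	return r.evictionLimiterQPS
}

// reducedQPS sketches ReducedQPSFunc: assumed to slow evictions down in large
// zones and to stop them entirely in small ones.
func (r evictionRates) reducedQPS(nodeNum int) float32 {
	if int32(nodeNum) > r.largeClusterThreshold {
		return r.secondaryEvictionLimiterQPS
	}
	return 0
}

func main() {
	// Illustrative values only; they are not taken from this article.
	rates := evictionRates{
		evictionLimiterQPS:          0.1,
		secondaryEvictionLimiterQPS: 0.01,
		largeClusterThreshold:       50,
	}
	for _, nodes := range []int{10, 50, 51, 200} {
		fmt.Printf("nodes=%d healthyQPS=%v reducedQPS=%v\n",
			nodes, rates.healthyQPS(nodes), rates.reducedQPS(nodes))
	}
}
```

Under these assumptions, a partially disrupted small zone would see no evictions at all, while a large one would keep evicting at the secondary rate.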
The zone state mentioned above is derived as shown in the following code:
pkg/api/v1/types.go:3277

    const (
        // NodeReady means kubelet is healthy and ready to accept pods.
        NodeReady NodeConditionType = "Ready"
        // NodeOutOfDisk means the kubelet will not accept new pods due to insufficient free disk
        // space on the node.
        NodeOutOfDisk NodeConditionType = "OutOfDisk"
        // NodeMemoryPressure means the kubelet is under pressure due to insufficient available memory.
        NodeMemoryPressure NodeConditionType = "MemoryPressure"
        // NodeDiskPressure means the kubelet is under pressure due to insufficient available disk.
        NodeDiskPressure NodeConditionType = "DiskPressure"
        // NodeNetworkUnavailable means that network for the node is not correctly configured.
        NodeNetworkUnavailable NodeConditionType = "NetworkUnavailable"
        // NodeInodePressure means the kubelet is under pressure due to insufficient available inodes.
        NodeInodePressure NodeConditionType = "InodePressure"
    )

pkg/controller/node/nodecontroller.go:1149

    // This function is expected to get a slice of NodeReadyConditions for all Nodes in a given zone.
    // The zone is considered:
    // - fullyDisrupted if there're no Ready Nodes,
    // - partiallyDisrupted if at least nc.unhealthyZoneThreshold percent of Nodes are not Ready,
    // - normal otherwise
    func (nc *NodeController) ComputeZoneState(nodeReadyConditions []*v1.NodeCondition) (int, zoneState) {
        readyNodes := 0
        notReadyNodes := 0
        for i := range nodeReadyConditions {
            if nodeReadyConditions[i] != nil && nodeReadyConditions[i].Status == v1.ConditionTrue {
                readyNodes++
            } else {
                notReadyNodes++
            }
        }
        switch {
        case readyNodes == 0 && notReadyNodes > 0:
            return notReadyNodes, stateFullDisruption
        case notReadyNodes > 2 && float32(notReadyNodes)/float32(notReadyNodes+readyNodes) >= nc.unhealthyZoneThreshold:
            return notReadyNodes, statePartialDisruption
        default:
            return notReadyNodes, stateNormal
        }
    }
There are three zone states:
FullDisruption: the number of Ready Nodes is 0 and the number of NotReady Nodes is greater than 0.
PartialDisruption: the number of NotReady Nodes is greater than 2 and notReadyNodes/(notReadyNodes+readyNodes) >= nc.unhealthyZoneThreshold, where nc.unhealthyZoneThreshold is set with --unhealthy-zone-threshold and defaults to 0.55.
Normal: any zone that matches neither of the two states above.
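The three rules can be tried out with a small standalone program. This is a simplified re-statement of ComputeZoneState for illustration: it takes a slice of booleans (true meaning Ready) instead of `[]*v1.NodeCondition`, and the threshold is passed as a parameter rather than read from the controller. The lower-case function name and the string values of the state constants are choices made for this sketch.

```go
package main

import "fmt"

type zoneState string

const (
	stateNormal            zoneState = "Normal"
	stateFullDisruption    zoneState = "FullDisruption"
	statePartialDisruption zoneState = "PartialDisruption"
)

// computeZoneState applies the same three rules as ComputeZoneState to a
// simplified input: one bool per Node in the zone, true meaning Ready.
func computeZoneState(ready []bool, unhealthyZoneThreshold float32) (int, zoneState) {
	readyNodes, notReadyNodes := 0, 0
	for _, isReady := range ready {
		if isReady {
			readyNodes++
		} else {
			notReadyNodes++
		}
	}
	switch {
	case readyNodes == 0 && notReadyNodes > 0:
		return notReadyNodes, stateFullDisruption
	case notReadyNodes > 2 && float32(notReadyNodes)/float32(notReadyNodes+readyNodes) >= unhealthyZoneThreshold:
		return notReadyNodes, statePartialDisruption
	default:
		return notReadyNodes, stateNormal
	}
}

func main() {
	const threshold = 0.55 // default of --unhealthy-zone-threshold
	zones := map[string][]bool{
		"all-down":     {false, false, false, false},
		"3-of-5-down":  {false, false, false, true, true},
		"2-of-3-down":  {false, false, true},
		"3-of-10-down": {false, false, false, true, true, true, true, true, true, true},
	}
	for name, nodes := range zones {
		notReady, state := computeZoneState(nodes, threshold)
		fmt.Printf("%s: notReady=%d state=%s\n", name, notReady, state)
	}
}
```

The "2-of-3-down" zone is worth noting: two thirds of its Nodes are NotReady, which exceeds 0.55, yet it stays Normal because the rule also requires more than 2 NotReady Nodes.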