This article explains what happens to Pod status when a Node fails in Kubernetes. The walkthrough below is straightforward and easy to follow, so let's dig into the question step by step.
成都創(chuàng)新互聯(lián)公司主營(yíng)延吉網(wǎng)站建設(shè)的網(wǎng)絡(luò)公司,主營(yíng)網(wǎng)站建設(shè)方案,成都app開(kāi)發(fā),延吉h5微信平臺(tái)小程序開(kāi)發(fā)搭建,延吉網(wǎng)站營(yíng)銷推廣歡迎延吉等地區(qū)企業(yè)咨詢
一個(gè)節(jié)點(diǎn)上運(yùn)行著pod前提下,這個(gè)時(shí)候把kubelet進(jìn)程停掉。里面的pod會(huì)被干掉嗎?會(huì)在其他節(jié)點(diǎn)recreate嗎?
結(jié)論:
(1)Node狀態(tài)變?yōu)镹otReady (2)Pod 5分鐘之內(nèi)狀態(tài)無(wú)變化,5分鐘之后的狀態(tài)變化:Daemonset的Pod狀態(tài)變?yōu)镹odelost,Deployment、Statefulset和Static Pod的狀態(tài)先變?yōu)镹odeLost,然后馬上變?yōu)閁nknown。Deployment的pod會(huì)recreate,但是Deployment如果是node selector停掉kubelet的node,則recreate的pod會(huì)一直處于Pending的狀態(tài)。Static Pod和Statefulset的Pod會(huì)一直處于Unknown狀態(tài)。
What happens to the node and its pods if the kubelet comes back up 10 minutes later?
結(jié)論:
(1)Node狀態(tài)變?yōu)镽eady。 (2)Daemonset的pod不會(huì)recreate,舊pod狀態(tài)直接變?yōu)镽unning。 (3)Deployment的則是將kubelet進(jìn)程停止的Node刪除(原因可能是因?yàn)榕fPod狀態(tài)在集群中有變化,但是Pod狀態(tài)在變化時(shí)發(fā)現(xiàn)集群中Deployment的Pod實(shí)例數(shù)已經(jīng)夠了,所以對(duì)舊Pod做了刪除處理) (4)Statefulset的Pod會(huì)重新recreate。 (5)Staic Pod沒(méi)有重啟,但是Pod的運(yùn)行時(shí)間會(huì)在kubelet起來(lái)的時(shí)候置為0。
After the kubelet stops, a StatefulSet's pods become NodeLost and then Unknown, but they are not restarted; only once the kubelet comes back up are the StatefulSet's pods recreated.
還有一個(gè)就是Static Pod在kubelet重啟以后應(yīng)該沒(méi)有重啟,但是集群中查詢Static Pod的狀態(tài)時(shí),Static Pod的運(yùn)行時(shí)間變了
Node down后,StatefulSet Pods並沒(méi)有重建,為什麼?
我們?cè)趎ode controller中發(fā)現(xiàn),除了daemonset pods外,都會(huì)調(diào)用delete pod api刪除pod。
但并不是調(diào)用了delete pod api就會(huì)從apiserver/etcd中刪除pod object,僅僅是設(shè)置pod 的deletionTimestamp,標(biāo)記該pod要被刪除。真正刪除Pod的行為是kubelet,kubelet grace terminate該pod后去真正刪除pod object。這個(gè)時(shí)候statefulset controller 發(fā)現(xiàn)某個(gè)replica缺失就會(huì)去recreate這個(gè)pod。
但此時(shí)由于kubelet掛了,無(wú)法與master通信,導(dǎo)致Pod Object一直無(wú)法從etcd中刪除。如果能成功刪除Pod Object,就可以在其他Node重建Pod。
Note also that the StatefulSet controller only deletes a Pod when it is isFailed (but here the Pods are in Unknown state, not Failed).
```go
// delete and recreate failed pods
if isFailed(replicas[i]) {
	ssc.recorder.Eventf(set, v1.EventTypeWarning, "RecreatingFailedPod",
		"StatefulSetPlus %s/%s is recreating failed Pod %s",
		set.Namespace, set.Name, replicas[i].Name)
	if err := ssc.podControl.DeleteStatefulPlusPod(set, replicas[i]); err != nil {
		return &status, err
	}
	if getPodRevision(replicas[i]) == currentRevision.Name {
		status.CurrentReplicas--
	}
	if getPodRevision(replicas[i]) == updateRevision.Name {
		status.UpdatedReplicas--
	}
	status.Replicas--
	replicas[i] = newVersionedStatefulSetPlusPod(
		currentSet,
		updateSet,
		currentRevision.Name,
		updateRevision.Name,
		i)
}
```
所以針對(duì)node異常的情況,有狀態(tài)應(yīng)用(Non-Quorum)的保障,應(yīng)該補(bǔ)充以下行為:
監(jiān)測(cè)node的網(wǎng)絡(luò)、kubelet進(jìn)程、操作系統(tǒng)等是否異常,區(qū)別對(duì)待。
比如,如果是網(wǎng)絡(luò)異常,Pod無(wú)法正常提供服務(wù),那么需要kubectl delete pod -f —grace-period=0
進(jìn)行強(qiáng)制從etcd中刪除該pod。
強(qiáng)制刪除后,statefulset controller就會(huì)自動(dòng)觸發(fā)在其他Node上recreate pod。
Alternatively, a cruder approach is to give up on GracePeriodSeconds entirely: if a StatefulSet Pod's GracePeriodSeconds is nil or 0, the object is deleted from etcd directly.
```go
// BeforeDelete tests whether the object can be gracefully deleted.
// If graceful is set, the object should be gracefully deleted. If gracefulPending
// is set, the object has already been gracefully deleted (and the provided grace
// period is longer than the time to deletion). An error is returned if the
// condition cannot be checked or the gracePeriodSeconds is invalid. The options
// argument may be updated with default values if graceful is true. Second place
// where we set deletionTimestamp is pkg/registry/generic/registry/store.go.
// This function is responsible for setting deletionTimestamp during gracefulDeletion,
// other one for cascading deletions.
func BeforeDelete(strategy RESTDeleteStrategy, ctx context.Context, obj runtime.Object, options *metav1.DeleteOptions) (graceful, gracefulPending bool, err error) {
	objectMeta, gvk, kerr := objectMetaAndKind(strategy, obj)
	if kerr != nil {
		return false, false, kerr
	}
	if errs := validation.ValidateDeleteOptions(options); len(errs) > 0 {
		return false, false, errors.NewInvalid(schema.GroupKind{Group: metav1.GroupName, Kind: "DeleteOptions"}, "", errs)
	}
	// Checking the Preconditions here to fail early. They'll be enforced later on when we actually do the deletion, too.
	if options.Preconditions != nil && options.Preconditions.UID != nil && *options.Preconditions.UID != objectMeta.GetUID() {
		return false, false, errors.NewConflict(schema.GroupResource{Group: gvk.Group, Resource: gvk.Kind}, objectMeta.GetName(), fmt.Errorf("the UID in the precondition (%s) does not match the UID in record (%s). The object might have been deleted and then recreated", *options.Preconditions.UID, objectMeta.GetUID()))
	}
	gracefulStrategy, ok := strategy.(RESTGracefulDeleteStrategy)
	if !ok {
		// If we're not deleting gracefully there's no point in updating Generation, as we won't update
		// the object before deleting it.
		return false, false, nil
	}
	// if the object is already being deleted, no need to update generation.
	if objectMeta.GetDeletionTimestamp() != nil {
		// if we are already being deleted, we may only shorten the deletion grace period
		// this means the object was gracefully deleted previously but deletionGracePeriodSeconds was not set,
		// so we force deletion immediately
		// IMPORTANT:
		// The deletion operation happens in two phases.
		// 1. Update to set DeletionGracePeriodSeconds and DeletionTimestamp
		// 2. Delete the object from storage.
		// If the update succeeds, but the delete fails (network error, internal storage error, etc.),
		// a resource was previously left in a state that was non-recoverable. We
		// check if the existing stored resource has a grace period as 0 and if so
		// attempt to delete immediately in order to recover from this scenario.
		if objectMeta.GetDeletionGracePeriodSeconds() == nil || *objectMeta.GetDeletionGracePeriodSeconds() == 0 {
			return false, false, nil
		}
		...
	}
	...
}
```
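The decisive check in the BeforeDelete code above is the nil-or-zero test on the grace period. Reduced to a self-contained helper (an illustrative function, not part of the Kubernetes API):

```go
package main

import "fmt"

// immediateDeletion mirrors the nil-or-zero check in BeforeDelete:
// when the grace period is nil or 0, graceful deletion is skipped and
// the object is removed from etcd right away. Illustrative only.
func immediateDeletion(gracePeriodSeconds *int64) bool {
	return gracePeriodSeconds == nil || *gracePeriodSeconds == 0
}

func main() {
	var zero, thirty int64 = 0, 30
	fmt.Println(immediateDeletion(nil))     // true: deleted immediately
	fmt.Println(immediateDeletion(&zero))   // true: deleted immediately
	fmt.Println(immediateDeletion(&thirty)) // false: graceful termination
}
```

This is why setting a StatefulSet Pod's grace period to 0 (or forcing it via `--grace-period=0`) lets the object be removed even though the kubelet can never confirm termination.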
Thanks for reading. That covers how Pod status behaves when a Node fails in Kubernetes; the exact behavior can vary across versions and configurations, so verify it in your own clusters.