11.2、Affinity Scheduling (Affinity)
The previous section introduced two forms of directed scheduling. They are convenient to use, but they have a limitation: if no Node satisfies the conditions, the Pod will not run at all, even if usable Nodes remain in the cluster. That restricts where they can be applied.

To address this, Kubernetes also provides affinity scheduling (Affinity). It extends nodeSelector so that, through configuration, the scheduler prefers Nodes that satisfy the given conditions but can still place the Pod on a Node that does not satisfy them if none exist, which makes scheduling more flexible.
Affinity falls into three categories:

- nodeAffinity (node affinity): targets Nodes and decides which Nodes a Pod may be scheduled onto (e.g. if a Pod prefers node1, it is scheduled onto node1 first).
- podAffinity (pod affinity): targets Pods and decides which existing Pods a new Pod may share a topology domain with (e.g. if pod1 likes pod2 running on node2, pod1 can go to node2).
- podAntiAffinity (pod anti-affinity): targets Pods and decides which existing Pods a new Pod must not share a topology domain with (e.g. if pod2 dislikes pod1, it will avoid the node where pod1 runs).
Notes on when to use affinity and anti-affinity:

- **Affinity**: if two applications interact frequently, use affinity to place them as close together as possible and reduce the performance cost of network communication.
- **Anti-affinity**: when an application is deployed with multiple replicas, use anti-affinity to spread the instances across Nodes and improve the service's availability (e.g. a highly available tomcat); see the sketch below.
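As a concrete illustration of the anti-affinity use case above, here is a minimal sketch (the Deployment name, the `app: web` label, and the image are illustrative, not from this tutorial): each replica refuses to share a Node with another Pod carrying the same label, so the three replicas end up spread across three different Nodes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: web          # do not co-locate with other app=web pods
            topologyKey: kubernetes.io/hostname
```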
一、NodeAffinity (node affinity)

1、First, let's look at the configurable options of NodeAffinity:
```
pod.spec.affinity.nodeAffinity
  requiredDuringSchedulingIgnoredDuringExecution   the Node must satisfy all of the specified rules (hard constraint)
    nodeSelectorTerms      list of node selector terms
      matchFields          list of node selector requirements by node fields
      matchExpressions     list of node selector requirements by node labels (recommended)
        key                the label key
        values             the label values
        operator           relational operator; supports Exists, DoesNotExist, In, NotIn, Gt, Lt
  preferredDuringSchedulingIgnoredDuringExecution  prefer Nodes that satisfy the rules (soft constraint / preference)
    preference             a node selector term, associated with the corresponding weight
      matchFields          list of node selector requirements by node fields
      matchExpressions     list of node selector requirements by node labels (recommended)
        key                the label key
        values             the label values
        operator           relational operator; supports In, NotIn, Exists, DoesNotExist, Gt, Lt
    weight                 preference weight, in the range 1-100
```
How the operators are used:

```yaml
- matchExpressions:
  - key: nodeenv              # the node must have a label whose key is nodeenv
    operator: Exists
  - key: nodeenv              # the node's nodeenv label value must be "xxx" or "yyy"
    operator: In
    values: ["xxx","yyy"]
  - key: nodeenv              # the node's nodeenv label value must be greater than "xxx"
    operator: Gt
    values: ["xxx"]
```
2、requiredDuringSchedulingIgnoredDuringExecution (hard constraint)

The Pod must be scheduled onto a Node that carries the required label; otherwise scheduling fails and the Pod stays Pending.

Let's first demonstrate requiredDuringSchedulingIgnoredDuringExecution (the hard constraint). Create pod-nodeaffinity-required.yaml:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeaffinity-required
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard constraint
        nodeSelectorTerms:
        - matchExpressions:          # the node's nodeenv label must be "xxx" or "yyy"
          - key: nodeenv
            operator: In
            values: ["xxx","yyy"]
```
Label the Nodes, then check whether any of them carries a matching label. As the steps below show, none of the current Nodes satisfy nodeenv in ["xxx","yyy"]:
```bash
# label the two nodes (pro / test) and confirm the labels
kubectl label nodes k8s nodeenv=pro
kubectl label nodes k8s1 nodeenv=test
kubectl get nodes --show-labels

# create the pod; no node has nodeenv in ["xxx","yyy"], so it stays Pending
kubectl create -f pod-nodeaffinity-required.yaml
kubectl get pods pod-nodeaffinity-required -n dev -o wide
kubectl describe pod pod-nodeaffinity-required -n dev

# delete the pod, change the values to ["pro","yyy"], and recreate it
kubectl delete -f pod-nodeaffinity-required.yaml
vim pod-nodeaffinity-required.yaml
#   affinity:
#     nodeAffinity:
#       requiredDuringSchedulingIgnoredDuringExecution:
#         nodeSelectorTerms:
#         - matchExpressions:
#           - key: nodeenv
#             operator: In
#             values: ["pro","yyy"]
kubectl create -f pod-nodeaffinity-required.yaml

# the pod now lands on the node labeled nodeenv=pro
kubectl get pods pod-nodeaffinity-required -n dev -o wide
# NAME                        READY   STATUS    RESTARTS   AGE   IP              NODE   NOMINATED NODE   READINESS GATES
# pod-nodeaffinity-required   1/1     Running   0          95s   192.168.77.14   k8s    <none>           <none>
```
(Screenshots omitted: the node labels, the Pod in Pending while no label matches, and the Pod Running after the change.)
3、preferredDuringSchedulingIgnoredDuringExecution (soft constraint)

The scheduler prefers Nodes that match the labels and have the higher weight, but will still place the Pod elsewhere if no Node matches.

Next, let's demonstrate preferredDuringSchedulingIgnoredDuringExecution (the soft constraint, i.e. prefer to schedule onto Nodes that satisfy the conditions). Create pod-nodeaffinity-preferred.yaml.

- weight expresses a preference: the scheduler favors Nodes that satisfy the condition; if no Node satisfies it, the Pod is simply scheduled onto whichever Node is available.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-nodeaffinity-preferred
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:   # soft constraint
      - weight: 1
        preference:
          matchExpressions:          # prefer nodes whose nodeenv label is "xxx" or "yyy" (none exist in this cluster)
          - key: nodeenv
            operator: In
            values: ["xxx","yyy"]
```
```bash
# create the pod
kubectl create -f pod-nodeaffinity-preferred.yaml

# no node has nodeenv in ["xxx","yyy"], but the soft constraint still lets the pod run
kubectl get pod pod-nodeaffinity-preferred -n dev -o wide
# NAME                         READY   STATUS    RESTARTS   AGE   IP                NODE   NOMINATED NODE   READINESS GATES
# pod-nodeaffinity-preferred   1/1     Running   0          54s   192.168.166.208   k8s1   <none>           <none>
```
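To illustrate how weight behaves when several preferences are defined, here is a minimal sketch (the label keys and values are hypothetical, not from the demo above): the scheduler adds up the weights of the terms each candidate Node satisfies and favors the Node with the highest total score.

```yaml
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 80                      # strongly prefer nodes labeled nodeenv=pro
      preference:
        matchExpressions:
        - key: nodeenv
          operator: In
          values: ["pro"]
    - weight: 20                      # weakly prefer nodes labeled disktype=ssd
      preference:
        matchExpressions:
        - key: disktype
          operator: In
          values: ["ssd"]
```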
Notes on NodeAffinity rules (see the sketch after this list):

- If nodeSelector and nodeAffinity are both defined, both must be satisfied for the Pod to run on a given Node.
- If nodeAffinity specifies multiple nodeSelectorTerms, matching any one of them is enough.
- If a single nodeSelectorTerms entry contains multiple matchExpressions (i.e. multiple keys), a Node must satisfy all of them to match.
- If the labels of the Node a Pod is running on change during the Pod's lifetime so that they no longer satisfy the Pod's node affinity, the change is ignored and the Pod keeps running.
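The following sketch (label keys and values are hypothetical) shows the second and third rules side by side: the two nodeSelectorTerms are OR'd, while the matchExpressions inside one term are AND'd.

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # Term 1 and Term 2 are OR'd: a node only needs to match one of them.
      - matchExpressions:            # Term 1: BOTH expressions must match (AND)
        - key: nodeenv
          operator: In
          values: ["pro"]
        - key: disktype
          operator: In
          values: ["ssd"]
      - matchExpressions:            # Term 2: this single expression is enough
        - key: nodeenv
          operator: In
          values: ["test"]
```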
4、PodAffinity (place a new Pod in the same topology domain as a reference Pod)

- PodAffinity uses already-running Pods as the reference and places newly created Pods in the same topology domain as the reference Pod.

First, let's look at the configurable options of PodAffinity:
```
pod.spec.affinity.podAffinity
  requiredDuringSchedulingIgnoredDuringExecution   hard constraint
    namespaces           namespace(s) of the reference pods
    topologyKey          the scheduling domain
    labelSelector        label selector
      matchExpressions   list of selector requirements by pod labels (recommended)
        key              the label key
        values           the label values
        operator         relational operator; supports In, NotIn, Exists, DoesNotExist
      matchLabels        shorthand for multiple matchExpressions entries
  preferredDuringSchedulingIgnoredDuringExecution  soft constraint
    podAffinityTerm      the affinity term
      namespaces
      topologyKey
      labelSelector
        matchExpressions
          key            the label key
          values         the label values
          operator
        matchLabels
    weight               preference weight, in the range 1-100
```
topologyKey specifies the scheduling domain, for example (see the sketch below):

- If set to kubernetes.io/hostname, each Node is its own domain.
- If set to beta.kubernetes.io/os, Nodes are grouped by operating system type.
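As a hedged sketch of how topologyKey changes the meaning of "together": with the well-known zone label `topology.kubernetes.io/zone` (your Nodes must actually carry it for this to take effect), the new Pod only has to land in the same zone as the reference Pod, not necessarily on the same Node.

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          podenv: pro                          # follow pods labeled podenv=pro
      topologyKey: topology.kubernetes.io/zone # same zone, not necessarily the same node
```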
4.1、Next, let's demonstrate requiredDuringSchedulingIgnoredDuringExecution.

1) First create a reference Pod, pod-podaffinity-target.yaml:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-podaffinity-target
  namespace: dev
  labels:
    podenv: pro          # the label the new pod's affinity will match on
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  nodeName: k8s1         # pin the reference pod to node k8s1
```
```bash
# create the reference pod
kubectl create -f pod-podaffinity-target.yaml

kubectl get pods pod-podaffinity-target -n dev
# NAME                     READY   STATUS    RESTARTS   AGE
# pod-podaffinity-target   1/1     Running   0          2m59s
```
4.2、Create pod-podaffinity-required.yaml with the following content:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-podaffinity-required
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard constraint
      - labelSelector:
          matchExpressions:        # must land on a node running a pod with podenv in ["xxx","yyy"]
          - key: podenv
            operator: In
            values: ["xxx","yyy"]
        topologyKey: kubernetes.io/hostname
```
The configuration above means: the new Pod must be placed on the same Node (topologyKey: kubernetes.io/hostname) as an existing Pod whose podenv label is xxx or yyy. No such Pod exists yet, so scheduling fails; after changing the values to ["pro","yyy"], the rule matches the reference Pod and the new Pod schedules successfully:
```bash
# create the pod; no running pod has podenv in ["xxx","yyy"], so scheduling fails
kubectl create -f pod-podaffinity-required.yaml
kubectl get pods pod-podaffinity-required -n dev
kubectl describe pods pod-podaffinity-required -n dev

# change the values to ["pro","yyy"] so they match the reference pod's podenv=pro label
vim pod-podaffinity-required.yaml
# ...
#   affinity:
#     podAffinity:
#       requiredDuringSchedulingIgnoredDuringExecution:
#       - labelSelector:
#           matchExpressions:
#           - key: podenv
#             operator: In
#             values: ["pro","yyy"]
#         topologyKey: kubernetes.io/hostname
# ...

# recreate the pod
kubectl delete -f pod-podaffinity-required.yaml
kubectl create -f pod-podaffinity-required.yaml
kubectl get pods pod-podaffinity-required -n dev
```
You can see that the new Pod ends up on the same Node as the reference Pod.

The preferredDuringSchedulingIgnoredDuringExecution form of PodAffinity is not demonstrated step by step here.
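For completeness, a minimal sketch of what that soft form could look like (the Pod name, weight, and label values are illustrative, not part of the demo): the scheduler prefers, but does not require, the Node(s) already running podenv=pro Pods.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-podaffinity-preferred
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:   # soft constraint
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: podenv
              operator: In
              values: ["pro"]
          topologyKey: kubernetes.io/hostname
```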
4.3、PodAntiAffinity (place a new Pod in a different topology domain from a reference Pod)

PodAntiAffinity uses already-running Pods as the reference and keeps newly created Pods out of the topology domain of the reference Pod.

Its configuration options are the same as PodAffinity's, so they are not explained again; let's go straight to a test case.
4.3.1、Continue using the target Pod from the previous example:
```bash
kubectl get pods -n dev -o wide --show-labels
# NAME                       READY   STATUS    RESTARTS   AGE     IP            NODE   NOMINATED NODE   READINESS GATES   LABELS
# pod-podaffinity-required   1/1     Running   0          5m53s   172.16.0.54   k8s1   <none>           <none>            <none>
# pod-podaffinity-target     1/1     Running   0          21m     172.16.0.53   k8s1   <none>           <none>            podenv=pro
```
4.3.2、Create pod-podantiaffinity-required.yaml with the following content:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-podantiaffinity-required
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard constraint
      - labelSelector:
          matchExpressions:        # must NOT land on a node running a pod with podenv=pro
          - key: podenv
            operator: In
            values: ["pro"]
        topologyKey: kubernetes.io/hostname
```
The configuration above means: the new Pod must not be on the same Node as any Pod carrying the label podenv=pro. Let's run it and test:
```bash
# create the pod
kubectl create -f pod-podantiaffinity-required.yaml

# the new pod is scheduled onto a different node than the podenv=pro reference pod
kubectl get pods pod-podantiaffinity-required -n dev -o wide
# NAME                           READY   STATUS    RESTARTS   AGE   IP            NODE  ..
# pod-podantiaffinity-required   1/1     Running   0          30s   10.244.1.96   node2 ..
```
If every Node already hosts a Pod carrying the pro label, there is nowhere left for the new Pod to go, so scheduling fails and the Pod stays Pending.
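A hedged sketch of how you might reproduce that failure (the second Pod's name and nodeName are illustrative): pin another podenv=pro Pod to the remaining node so every Node now hosts a Pod the anti-affinity rule rejects, then recreate pod-podantiaffinity-required and inspect the FailedScheduling event with `kubectl describe pod pod-podantiaffinity-required -n dev`.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-podaffinity-target2
  namespace: dev
  labels:
    podenv: pro          # now both nodes host a podenv=pro pod
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  nodeName: k8s          # the node that did not yet host a podenv=pro pod
```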