11.3 Taints and Tolerations

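I. Taints

The scheduling methods described so far all work from the Pod's point of view: attributes added to the Pod decide whether it is scheduled onto a given Node. We can also work from the Node's point of view, adding taint attributes to a Node to decide whether Pods are allowed to be scheduled onto it.

Once a Node is tainted, it repels Pods: it refuses to have Pods scheduled onto it and can even evict Pods that are already running there.

A taint has the format key=value:effect. The key and value form the taint's label, and effect describes what the taint does. Three effects are supported: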

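PreferNoSchedule: Kubernetes tries to avoid scheduling Pods onto a Node with this taint, unless no other Node is available.
NoSchedule: Kubernetes will not schedule new Pods onto a Node with this taint, but Pods already running on the Node are unaffected.
NoExecute: Kubernetes will not schedule new Pods onto a Node with this taint, and it also evicts Pods already running on the Node.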
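
For reference, the key=value:effect triple maps onto entries in the Node object's spec.taints list. The fragment below is a sketch of roughly what `kubectl get node node1 -o yaml` would show for a Node carrying the taint tag=heima:NoSchedule (the tag=heima pair is the one used in the demonstration in this section; all other Node fields are omitted here).

```yaml
apiVersion: v1
kind: Node
metadata:
  name: node1
spec:
  taints:
  - key: tag            # the taint's key
    value: heima        # the taint's value
    effect: NoSchedule  # one of PreferNoSchedule, NoSchedule, NoExecute
```

In practice the list is normally managed with kubectl taint rather than by editing the Node object by hand.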

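Example kubectl commands for setting and removing taints:

# Add a taint
kubectl taint nodes node1 key=value:effect

# Remove a taint (the given key with the given effect)
kubectl taint nodes node1 key:effect-

# Remove all taints that have the given key
kubectl taint nodes node1 key-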

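Next, a demonstration of how taints behave:

Prepare node node1 (to make the effect more obvious, temporarily shut down node2).
Set a taint on node1: tag=heima:PreferNoSchedule; then create pod1 (pod1 runs).
Change the taint on node1 to tag=heima:NoSchedule; then create pod2 (pod1 keeps running, pod2 fails to schedule).
Change the taint on node1 to tag=heima:NoExecute; then create pod3 (all three Pods fail).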

# Taint node1 (PreferNoSchedule)
kubectl taint nodes node1 tag=heima:PreferNoSchedule
# Check the taints
kubectl describe nodes | grep Taints

# Create pod1 (it runs normally)
kubectl run taint1 --image=registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest -n dev
kubectl get pods -n dev -o wide

# Change the taint on node1: remove PreferNoSchedule (note the trailing minus sign), then set NoSchedule
# Pods already on the node keep running; newly created Pods stay in the Pending state
kubectl taint nodes node1 tag:PreferNoSchedule-
kubectl taint nodes node1 tag=heima:NoSchedule

# Create pod2
kubectl run taint2 --image=registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest -n dev
kubectl get pods taint2 -n dev -o wide


# Change the taint on node1: remove NoSchedule, then set NoExecute
kubectl taint nodes node1 tag:NoSchedule-
kubectl taint nodes node1 tag=heima:NoExecute

# Create pod3
kubectl run taint3 --image=registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest -n dev
kubectl get pods -n dev -o wide

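Observed results for each taint:

tag=heima:PreferNoSchedule: Pods are scheduled and run normally.

tag=heima:NoSchedule: Pods already on the node keep running; newly created Pods cannot be scheduled and stay Pending.

tag=heima:NoExecute: newly created Pods cannot be scheduled, and Pods already on the node are evicted. With no other node available, any replacement Pods created by a controller stay Pending.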
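
Whether an evicted Pod reappears as Pending depends on how it was created: a bare Pod is simply removed, while a Pod owned by a controller is recreated and then waits for a schedulable node. A minimal sketch for observing the second case, assuming a Deployment named taint-demo with an app: taint-demo label (both hypothetical names) in the dev namespace, reusing the image from the commands above:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: taint-demo          # hypothetical name, not part of the original demo
  namespace: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: taint-demo       # hypothetical label
  template:
    metadata:
      labels:
        app: taint-demo
    spec:
      containers:
      - name: nginx
        image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
```

If this Deployment is created while node1 is untainted and the NoExecute taint is applied afterwards, the running Pod should be evicted and its replacement should stay Pending for as long as node1 is the only node.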

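Tip:

A cluster built with kubeadm adds a taint to the master node by default (node-role.kubernetes.io/master:NoSchedule, or node-role.kubernetes.io/control-plane:NoSchedule in newer releases), so ordinary Pods are not scheduled onto the master node.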

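II. Tolerations

The previous part showed how a taint on a Node keeps Pods from being scheduled onto it. But what if we do want a Pod to be scheduled onto a tainted Node? That is what tolerations are for.

A taint is a rejection and a toleration ignores it: the Node uses taints to reject Pods, and a Pod uses tolerations to ignore that rejection.

Let's first look at the effect through an example:

In the previous part, node1 was given a NoExecute taint, so Pods currently cannot be scheduled onto it.
In this part, we add a toleration to a Pod so that it can be scheduled onto node1.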

① Add a toleration

The taint currently on the node:

# Check the taints
kubectl describe nodes | grep Taints

Create pod-toleration.yaml with the following content:

apiVersion: v1
kind: Pod
metadata:
  name: pod-toleration
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
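  tolerations:        # add tolerations
  - key: "tag"        # key of the taint to tolerate
    operator: "Equal" # operator (key and value must both match)
    value: "heima"    # value of the taint to tolerate
    effect: "NoExecute"   # effect to tolerate; must be the same as the effect of the taint on the node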

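# Pods before the toleration is added
kubectl get pods -n dev -o wide
# Create the Pod that carries the toleration
kubectl create -f pod-toleration.yaml
# Pods after the toleration is added
kubectl get pods -n dev -o wide

Without the NoExecute toleration a Pod cannot be scheduled onto node1; with it, pod-toleration is scheduled onto node1.

Now let's look at the toleration fields in detail: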

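kubectl explain pod.spec.tolerations
......
FIELDS:
   key       # key of the taint to tolerate; empty matches all keys (the operator must then be Exists)
   value     # value of the taint to tolerate
   operator  # operator relating key and value; supports Equal (the default) and Exists
   effect    # effect of the taint to tolerate; empty matches all effects
   tolerationSeconds   # toleration period; only applies when effect is NoExecute, and is how long the Pod may stay on the Node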
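
To illustrate the remaining fields, here is a sketch of a Pod that uses the Exists operator together with tolerationSeconds. The Pod name pod-toleration-timed and the 60-second value are assumptions chosen for illustration; the key, namespace, and image are the ones used earlier in this section.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-toleration-timed   # hypothetical name
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest
  tolerations:
  - key: "tag"
    operator: "Exists"         # matches any value of the key, so no value field is given
    effect: "NoExecute"
    tolerationSeconds: 60      # assumed value: tolerate the taint for 60 seconds, then be evicted
```

With Exists the value field is left out, because only the presence of the key is checked. Omitting tolerationSeconds would mean the taint is tolerated indefinitely.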