10.4 Container probes (Exec, TCPSocket, HTTPGet)

I. Overview

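Container probes check whether the application instance in a container is working properly; they are a conventional mechanism for keeping a service available. If a probe finds that an instance is not in the expected state, Kubernetes takes that instance out of rotation so that it no longer serves business traffic. This section covers two of the probes Kubernetes provides:

  • liveness probes: check whether the application instance is currently running normally; if it is not, Kubernetes restarts the container
  • readiness probes: check whether the application instance can currently accept requests; if it cannot, Kubernetes does not forward traffic to it

In short, livenessProbe decides whether to restart the container, and readinessProbe decides whether to forward requests to it.

Both probes support the following three probing methods: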

  • Exec: runs a command inside the container; if the command exits with code 0 the application is considered healthy, otherwise unhealthy

    ……
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
    ……
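  • TCPSocket: tries to open a TCP connection to a port of the user container; if the connection can be established the application is considered healthy, otherwise unhealthy

    ……
    livenessProbe:
      tcpSocket:
        port: 8080
    ……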
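  • HTTPGet: sends an HTTP GET request to a URL of the web application in the container; if the response status code is at least 200 and below 400 the application is considered healthy, otherwise unhealthy

    ……
    livenessProbe:
      httpGet:
        path: /          # URI path
        port: 80         # port number
        host: 127.0.0.1  # host to connect to (defaults to the Pod IP when omitted)
        scheme: HTTP     # protocol, HTTP or HTTPS
    ……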
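The snippets above all use livenessProbe. Since readinessProbe takes the same handler fields, a readiness check can be sketched the same way. The fragment below is a minimal illustration only; the path and port values are assumptions, not taken from a real deployment.

```yaml
# Hypothetical fragment of a container spec: same handler, different probe type.
readinessProbe:
  httpGet:
    path: /        # assumed health path; replace with the application's own
    port: 80       # assumed container port
    scheme: HTTP
```

While this probe fails, the container is not restarted; the Pod is only reported as not ready and does not receive forwarded requests.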

The demos below use liveness probes as the example.

II. Probe methods in practice

1. Method 1: Exec

Create pod-liveness-exec.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest # nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        command: ["/bin/cat","/tmp/hello11.txt"] # cat a file that does not exist, so the probe fails

Create the Pod and observe the result:

# Create the Pod
kubectl create -f pod-liveness-exec.yaml

# Show the Pod details
kubectl describe pods pod-liveness-exec -n dev

......
Normal Created 20s (x2 over 50s) kubelet, node1 Created container nginx
Normal Started 20s (x2 over 50s) kubelet, node1 Started container nginx
Normal Killing 20s kubelet, node1 Container nginx failed liveness probe, will be restarted
Warning Unhealthy 0s (x5 over 40s) kubelet, node1 Liveness probe failed: cat: can't open '/tmp/hello11.txt': No such file or directory

# The events above show that the health check started right after the nginx container started.
# After the check failed, the container was killed and then restarted (this is the restart policy at work, covered later).
# Wait a moment and look at the Pod again: RESTARTS is no longer 0 and keeps increasing.
kubectl get pods pod-liveness-exec -n dev

NAME                READY   STATUS             RESTARTS   AGE
pod-liveness-exec   0/1     CrashLoopBackOff   2          3m19s

# Next, change the probe to a file that exists, for example /tmp/hello.txt, and try again; the probe then passes.

https://github.com/zznn-cloud/zznn-cloud-blog-images/raw/main/Qexo/24/12/image_808501ef065eab0e4598ef658faa491c.png
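For comparison, a passing variant of the Exec probe might look like the sketch below. It assumes the official nginx:1.17.1 image, in which /etc/nginx/nginx.conf exists; the Pod name pod-liveness-exec-ok is hypothetical and not part of the original walkthrough.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-exec-ok   # hypothetical name
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1        # assumption: official image, not the custom one above
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      exec:
        # cat exits with code 0 because the file exists, so the probe succeeds
        command: ["/bin/cat", "/etc/nginx/nginx.conf"]
```

With a probe like this, RESTARTS should stay at 0, because the command keeps exiting with code 0.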

2. Method 2: TCPSocket

Create pod-liveness-tcpsocket.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket
  namespace: dev
spec:
  containers:
  - name: nginx
    image: registry.cn-hangzhou.aliyuncs.com/zznn/mycentos:nginx-latest # nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 8080 # try to connect to port 8080, on which nothing is listening

Create the Pod and observe the result:

# Create the Pod
kubectl create -f pod-liveness-tcpsocket.yaml

# Show the Pod details
kubectl describe pods pod-liveness-tcpsocket -n dev
......
Normal Scheduled 31s default-scheduler Successfully assigned dev/pod-liveness-tcpsocket to node2
Normal Pulled <invalid> kubelet, node2 Container image "nginx:1.17.1" already present on machine
Normal Created <invalid> kubelet, node2 Created container nginx
Normal Started <invalid> kubelet, node2 Started container nginx
Warning Unhealthy <invalid> (x2 over <invalid>) kubelet, node2 Liveness probe failed: dial tcp 10.244.2.44:8080: connect: connection refused

# The events above show that the probe tried to connect to port 8080 and the connection was refused.
# Wait a moment and look at the Pod again: RESTARTS is no longer 0 and keeps increasing.
kubectl get pods pod-liveness-tcpsocket -n dev

NAME                     READY   STATUS             RESTARTS   AGE
pod-liveness-tcpsocket   0/1     CrashLoopBackOff   2          3m19s

# Next, change the probe to a port that accepts connections, for example 80, and try again; the probe then passes.
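A passing variant of the TCP probe can be sketched as follows. Instead of a number, it refers to the container port by its name (nginx-port), which the port field also accepts. The Pod name pod-liveness-tcpsocket-ok is hypothetical, and the sketch assumes the official nginx:1.17.1 image listening on port 80.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-tcpsocket-ok   # hypothetical name
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1             # assumption: official image listening on 80
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: nginx-port            # resolves to containerPort 80 via the port name
```

Referencing the port by name keeps the probe in step with the ports list if the port number is changed later.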

3. Method 3: HTTPGet

Create pod-liveness-httpget.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet:         # effectively requests http://<Pod IP>:80/hello
        scheme: HTTP   # protocol, HTTP or HTTPS
        port: 80       # port number
        path: /hello   # URI path

Create the Pod and observe the result:

# Create the Pod
kubectl create -f pod-liveness-httpget.yaml

# Show the Pod details
kubectl describe pod pod-liveness-httpget -n dev
.......
Normal Pulled 6s (x3 over 64s) kubelet, node1 Container image "nginx:1.17.1" already present on machine
Normal Created 6s (x3 over 64s) kubelet, node1 Created container nginx
Normal Started 6s (x3 over 63s) kubelet, node1 Started container nginx
Warning Unhealthy 6s (x6 over 56s) kubelet, node1 Liveness probe failed: HTTP probe failed with statuscode: 404
Normal Killing 6s (x2 over 36s) kubelet, node1 Container nginx failed liveness probe, will be restarted

# The events above show that the probe requested the path, which was not found, so it received a 404.
# Wait a moment and look at the Pod again: RESTARTS is no longer 0 and keeps increasing.
kubectl get pod pod-liveness-httpget -n dev

NAME                   READY   STATUS    RESTARTS   AGE
pod-liveness-httpget   1/1     Running   5          3m17s

# Next, change the probe to a path that exists, for example /, and try again; the probe then passes.

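At this point all three probing methods have been demonstrated with livenessProbe. The sub-fields of livenessProbe, however, include further settings besides these three methods; they are explained together here: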

[root@k8s-master01 ~]# kubectl explain pod.spec.containers.livenessProbe
FIELDS:
  exec <Object>
  tcpSocket <Object>
  httpGet <Object>
  initialDelaySeconds <integer> # seconds to wait after the container starts before the first probe
  timeoutSeconds <integer>      # probe timeout; default 1 second, minimum 1 second
  periodSeconds <integer>       # how often the probe runs; default 10 seconds, minimum 1 second
  failureThreshold <integer>    # consecutive failures needed to count as failed; default 3, minimum 1
  successThreshold <integer>    # consecutive successes needed to count as successful; default 1

The following example sets two of them to show the effect:

[root@k8s-master01 ~]# more pod-liveness-httpget.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-liveness-httpget
  namespace: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.17.1
    ports:
    - name: nginx-port
      containerPort: 80
    livenessProbe:
      httpGet:
        scheme: HTTP
        port: 80
        path: /
      initialDelaySeconds: 30 # start probing 30s after the container starts
      timeoutSeconds: 5       # each probe times out after 5s
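The remaining timing fields can be combined in the same way. The fragment below is a sketch with illustrative values, not recommendations; it spells out periodSeconds, failureThreshold and successThreshold at their documented defaults so that the interaction is easier to read.

```yaml
# Hypothetical fragment: values are illustrative only.
livenessProbe:
  httpGet:
    scheme: HTTP
    port: 80
    path: /
  initialDelaySeconds: 30   # wait 30s after the container starts before the first probe
  periodSeconds: 10         # then probe every 10s
  timeoutSeconds: 5         # a probe with no answer within 5s counts as a failure
  failureThreshold: 3       # restart only after 3 consecutive failures
  successThreshold: 1       # must be 1 for liveness probes
```

As a rough estimate under these settings, a container that stops answering would be restarted after about three consecutive failed probes, that is on the order of 20 to 30 seconds after it breaks, with probes that hang adding up to timeoutSeconds each.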