Upgrading k8s from 1.23.x to 1.24.x with kubeadm

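1. Background

The existing k8s environment is a k8s 1.23.9 cluster deployed with kubeadm. It uses flannel as the network solution, the nodes are CentOS 7.9.2009 amd64 virtual machines, and the container runtime on each node is Docker. The cluster has 3 nodes in total, shown below (the cluster was healthy before the upgrade; the screenshot was captured later, so the status it shows can be ignored):

image-20241118190306646

The goal is to upgrade to k8s 1.24.2 with kubeadm on machines that have internet access. However, Kubernetes deprecated dockershim in 1.20 and removed it in 1.24. To keep using Docker as the container runtime afterwards, you need to install cri-dockerd, an implementation of the CRI (Container Runtime Interface). As a result, kubeadm cannot upgrade this cluster to k8s 1.24.2 directly, but it can once the container runtime has been installed and configured. The steps are described below.

Also note that kubeadm does not support skipping minor versions when upgrading. For example, to go from k8s 1.23.x to k8s 1.25.x, you must first upgrade 1.23.x to 1.24.x and then 1.24.x to 1.25.x. If you want to build on this document to reach a higher version, you have to go up one minor version at a time with no skipping (perhaps kubeadm just doesn't believe in prodigies, haha ^_^)!

Reference: https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/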

2. Upgrade Steps

2.1 Install cri-dockerd (all nodes)

# Releases page: https://github.com/Mirantis/cri-dockerd/releases

[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.2/cri-dockerd-0.3.2.amd64.tgz
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# gunzip cri-dockerd-0.3.2.amd64.tgz
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# tar -xf cri-dockerd-0.3.2.amd64.tar

[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# cp cri-dockerd/cri-dockerd /usr/local/bin/
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# chmod 755 /usr/local/bin/cri-dockerd
# Remove any previously existing unit files
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# rm -f /usr/lib/systemd/system/cri-docker*
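If a different cri-dockerd release or CPU architecture is needed, the download URL used above can be built from its parts. This is only a sketch: the helper name `cri_dockerd_url` is hypothetical, and it assumes other releases follow the same asset naming as v0.3.2, which should be checked against the releases page.

```shell
# Build the download URL for a cri-dockerd release tarball.
# Hypothetical helper; assumes the asset naming seen for v0.3.2 above.
cri_dockerd_url() {
  version=$1          # release version without the leading "v", e.g. 0.3.2
  arch=${2:-amd64}    # architecture suffix, defaults to amd64
  printf 'https://github.com/Mirantis/cri-dockerd/releases/download/v%s/cri-dockerd-%s.%s.tgz\n' \
    "$version" "$version" "$arch"
}

# Example use:
#   wget "$(cri_dockerd_url 0.3.2)"
```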
# Create the cri-dockerd.service unit file
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# cat /usr/lib/systemd/system/cri-dockerd.service
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-dockerd.socket

[Service]
Type=notify

#ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
ExecStart=/usr/local/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9 --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock --cri-dockerd-root-directory=/var/lib/dockershim --docker-endpoint=unix:///var/run/docker.sock --cri-dockerd-root-directory=/var/lib/docker

ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

StartLimitBurst=3

StartLimitInterval=60s

LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

TasksMax=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
# Create the cri-dockerd.socket unit file
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# cat /usr/lib/systemd/system/cri-dockerd.socket
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-dockerd.service

[Socket]
ListenStream=/var/run/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker

[Install]
WantedBy=sockets.target
# Configure the kubelet to use cri-dockerd:
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# vi /var/lib/kubelet/kubeadm-flags.env
# Add or change the arguments so that the line reads as follows
KUBELET_KUBEADM_ARGS="--cgroup-driver=systemd --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7 --container-runtime=remote --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock"

# Start the cri-dockerd service
systemctl daemon-reload
systemctl restart cri-dockerd.service cri-dockerd.socket
systemctl status cri-dockerd.service cri-dockerd.socket
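The kubelet's `--container-runtime-endpoint`, the same flag on the cri-dockerd command line, and the socket unit's `ListenStream` all have to name the same socket, otherwise the kubelet cannot reach the runtime. A minimal sketch of a consistency check for the first two follows; the helper names `flag_value` and `same_cri_socket` are made up for illustration and are not part of kubeadm or cri-dockerd.

```shell
# Print the value of a "--flag=value" argument found in a command-line string.
# Hypothetical helper; if the flag is repeated, the last occurrence is printed.
flag_value() {
  # $1: flag name, e.g. --container-runtime-endpoint
  # $2: the full line (an ExecStart= line or a KUBELET_KUBEADM_ARGS="..." line)
  printf '%s\n' "$2" | tr '" ' '\n\n' | sed -n "s|^$1=||p" | tail -n 1
}

# Succeed when the kubelet line and the cri-dockerd line name the same endpoint.
same_cri_socket() {
  # $1: line from /var/lib/kubelet/kubeadm-flags.env
  # $2: ExecStart= line from cri-dockerd.service
  kubelet_ep=$(flag_value --container-runtime-endpoint "$1")
  cri_ep=$(flag_value --container-runtime-endpoint "$2")
  [ -n "$kubelet_ep" ] && [ "$kubelet_ep" = "$cri_ep" ]
}

# Example use on a node (paths taken from the steps above):
#   same_cri_socket "$(cat /var/lib/kubelet/kubeadm-flags.env)" \
#     "$(grep '^ExecStart=' /usr/lib/systemd/system/cri-dockerd.service)" \
#     && echo "endpoints match"
```

The check only compares strings; it does not confirm that the socket exists or that cri-dockerd is listening on it.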

2.2 Master node operations

2.2.1 Upgrade kubeadm

# The kubeadm version must match the target k8s version
[root@k8s-master01 ~]# yum install -y kubeadm-1.24.2-0 --disableexcludes=kubernetes
[root@k8s-master01 ~]# kubeadm version

2.2.2 Check the upgrade plan

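# kubeadm does not support skipping minor versions when upgrading
[root@k8s-master01 ~]# kubeadm upgrade plan
# The screenshot below was re-saved once, so it is slightly blurry. The output shows that k8s 1.23.9 can currently be upgraded to at most k8s 1.24.17, and that kubeadm itself would also need to be upgraded for that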
image-20241118184701703
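The one-minor-version-at-a-time rule behind this plan output can be sketched as a small shell check. The function names here are hypothetical, and the check looks only at the major and minor components; patch-level ordering is not validated.

```shell
# Print the major component of a version such as v1.24.2 or 1.24.2.
major_of() {
  v=${1#v}                    # drop an optional leading "v"
  printf '%s\n' "${v%%.*}"    # keep everything before the first dot
}

# Print the minor component of a version such as v1.24.2 or 1.24.2.
minor_of() {
  v=${1#v}                    # drop an optional leading "v"
  v=${v#*.}                   # drop the major component and its dot
  printf '%s\n' "${v%%.*}"    # keep everything before the next dot
}

# Succeed when going from $1 to $2 stays within the same major version and
# moves up by at most one minor version.
can_upgrade_directly() {
  [ "$(major_of "$1")" = "$(major_of "$2")" ] || return 1
  diff=$(( $(minor_of "$2") - $(minor_of "$1") ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}

# Example use:
#   can_upgrade_directly v1.23.9 v1.24.2 && echo "one hop is enough"
#   can_upgrade_directly v1.23.9 v1.25.0 || echo "go through 1.24.x first"
```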

2.2.3 Upgrade the master node

[root@k8s-master01 ~]# kubeadm upgrade apply v1.24.2
# If this fails with "Error while dialing dial unix /var/run/dockershim.sock", see section 2.4.1 under "2.4 Troubleshooting"

# If there are multiple control-plane nodes, run this on the remaining ones: kubeadm upgrade node
# Reference: https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
image-20241118185005730
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl get nodes --show-labels
image-20241118185157241
# Mark k8s-master01 as unschedulable, then evict its existing pods to other nodes
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl drain k8s-master01 --ignore-daemonsets
# Upgrade the client-side kubectl
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# yum install kubectl-1.24.2
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl version --short
image-20241118185720354
# Upgrade the kubelet
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# yum install kubelet-1.24.2
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubelet --version
# Restart the kubelet service
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# systemctl daemon-reload
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# systemctl restart kubelet
image-20241118185814268
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl uncordon k8s-master01
# Only after this node is uncordoned does its kubelet service show as Running
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# systemctl status kubelet
image-20241118185943405
# List the nodes; the cluster version shown is still 1.23.9 at this point
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl get nodes
image-20241118190029580
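To see at a glance which nodes still report the old version, the `kubectl get nodes` output can be filtered. This is a sketch with a hypothetical helper name, `nodes_not_at`; it assumes the default column layout (NAME, STATUS, ROLES, AGE, VERSION) with no extra output options.

```shell
# Read `kubectl get nodes` output on stdin and print the names of nodes whose
# VERSION column differs from the target version given as $1.
# Hypothetical helper; assumes VERSION is the fifth column.
nodes_not_at() {
  awk -v target="$1" 'NR > 1 && $5 != target { print $1 }'
}

# Example use:
#   kubectl get nodes | nodes_not_at v1.24.2
```

An empty result means every listed node already reports the target version.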

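2.2.4 Follow-up configuration

k8s 1.24 deprecates node-role.kubernetes.io/master in favor of node-role.kubernetes.io/control-plane, so the master-related label and taint need to be added manually; otherwise applications that relied on that label or taint will run into problems.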

[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl label nodes k8s-master01 node-role.kubernetes.io/master=

[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl taint nodes k8s-master01 node-role.kubernetes.io/control-plane-

[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl taint nodes k8s-master01 node-role.kubernetes.io/master:NoSchedule --overwrite

2.3 Worker node operations

2.3.1 Upgrade kubeadm

[root@k8s-worker01 ~]# yum install -y kubeadm-1.24.2-0 --disableexcludes=kubernetes
[root@k8s-worker01 ~]# kubeadm version

2.3.2 Upgrade the worker node

[root@k8s-worker01 ~]# kubeadm upgrade node
image-20241118190900487
[root@k8s-worker01 ~]# yum install kubectl-1.24.2 kubelet-1.24.2 -y
image-20241118190925463
[root@k8s-worker01 ~]# systemctl daemon-reload
[root@k8s-worker01 ~]# systemctl restart kubelet
# Only after this node is uncordoned does its kubelet service show as Running
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl uncordon k8s-worker01
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl get nodes
image-20241118191051445

2.3.3 Create the kubeconfig file on worker nodes

After the steps above, kubectl commands cannot be run on the worker node; the error is as follows:

[root@k8s-worker01 ~]# kubectl get nodes
image-20241118191753051
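# Create the directory on every worker node
[root@k8s-worker01 ~]# mkdir -p $HOME/.kube
# Copy the file from the master node to that directory on the worker node
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# scp /etc/kubernetes/admin.conf root@k8s-worker01:/root/.kube/config

# After this, kubectl commands work normally on this worker node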
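Because admin.conf carries cluster-admin credentials, it is worth restricting the copied file to its owner. The sketch below wraps the directory creation, copy and permission change for a file that is already on the worker; `install_kubeconfig` is a hypothetical helper name, and the 600 mode is a suggestion of this sketch rather than something the steps above set.

```shell
# Copy a kubeconfig into place and make it readable by the owner only.
# Hypothetical helper; the destination defaults to $HOME/.kube/config.
install_kubeconfig() {
  src=$1
  dest=${2:-$HOME/.kube/config}
  mkdir -p "$(dirname "$dest")" || return 1   # same effect as mkdir -p $HOME/.kube
  cp "$src" "$dest" || return 1
  chmod 600 "$dest"                           # owner read/write only
}

# Example use, after copying admin.conf to the worker (the /tmp path is illustrative):
#   install_kubeconfig /tmp/admin.conf
```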

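2.3.4 Notes

All other k8s worker nodes need the same steps.

Once every worker node has completed them, all k8s nodes are still in the NotReady state and the kubelet service on every node is in activating (auto-restart). See the "Troubleshooting" section below to resolve this.

After the whole upgrade is complete, all nodes of the k8s 1.24.2 cluster are in the Ready state and the cluster version shows v1.24.2: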

image-20241118224834915

2.4 Troubleshooting

2.4.1 "Error while dialing dial unix /var/run/dockershim.sock" when upgrading the control-plane node

image-20241118223816806

Solution:

# Edit every node
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl edit nodes k8s-master01
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl edit nodes k8s-worker01
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# kubectl edit nodes k8s-worker02
# Make the following change:
image-20241118224008028
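The screenshot is not reproduced in text, so the following is an assumption about what the edit is: for this error, the usual change is to point the node annotation `kubeadm.alpha.kubernetes.io/cri-socket` at the cri-dockerd socket instead of dockershim. A non-interactive sketch of that change with `kubectl annotate` is shown here; the function name `set_cri_socket` and the `KUBECTL` override variable are hypothetical.

```shell
# Command used to talk to the cluster; can be overridden for a dry run.
KUBECTL=${KUBECTL:-kubectl}

# Point a node's kubeadm CRI socket annotation at cri-dockerd.
# Assumption: this annotation is the field changed in the screenshot above.
set_cri_socket() {
  # $1: node name, e.g. k8s-master01
  "$KUBECTL" annotate node "$1" --overwrite \
    "kubeadm.alpha.kubernetes.io/cri-socket=unix:///var/run/cri-dockerd.sock"
}

# Example use for the three nodes of this cluster:
#   for n in k8s-master01 k8s-worker01 k8s-worker02; do set_cri_socket "$n"; done
```

Setting `KUBECTL=echo` before calling the function prints the command instead of running it, which is a cheap way to review it first.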

2.4.2 After the upgrade, all nodes are NotReady and the kubelet service on every node is in activating (auto-restart)

image-20241118224047709
image-20241118224208751
# Diagnosis:
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# journalctl -xeu kubelet > kubelet.log
# Inspect the kubelet logs
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# vi kubelet.log
image-20241118224327843

The log shows that the kubelet does not recognize the "--network-plugin" flag, because that option was removed in Kubernetes 1.24 together with dockershim.
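To pull such unrecognized flags out of the saved log without scrolling through it, a filter like the following may help. It is a sketch: `unknown_flags` is a hypothetical helper, and it assumes the kubelet reports the problem in the form `unknown flag: --name`, which should be confirmed against the actual log.

```shell
# Print each distinct flag reported as "unknown flag: --name".
# Reads the files given as arguments, or stdin when none are given.
unknown_flags() {
  grep -oh 'unknown flag: --[A-Za-z0-9-]*' "$@" | sed 's/^unknown flag: //' | sort -u
}

# Example use with the file written above:
#   unknown_flags kubelet.log
```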

Solution:

# Find "--network-plugin" and delete it
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# vi /var/lib/kubelet/kubeadm-flags.env
image-20241118224524590
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# source /var/lib/kubelet/kubeadm-flags.env
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# systemctl restart kubelet
[root@k8s-master01 upgrade-k8s1.23.9-to-1.24.2]# systemctl status kubelet -l
image-20241118224556425
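Since the same edit has to be repeated on every node, the manual deletion can be scripted. The filter below is a sketch with a hypothetical name, `strip_flag`; it assumes the flag is written as `--name=value` inside the KUBELET_KUBEADM_ARGS line, and it should be tried on a copy of the file first.

```shell
# Remove every "--name=value" occurrence of the given flag from text on stdin.
# Hypothetical helper; $1 is the flag name, e.g. --network-plugin.
strip_flag() {
  sed -E "s/(^|[[:space:]\"])$1=[^[:space:]\"]*[[:space:]]*/\1/g"
}

# Example use, keeping a backup of the original file:
#   cp /var/lib/kubelet/kubeadm-flags.env /var/lib/kubelet/kubeadm-flags.env.bak
#   strip_flag --network-plugin < /var/lib/kubelet/kubeadm-flags.env.bak \
#     > /var/lib/kubelet/kubeadm-flags.env
```

Writing to the original file from a backup avoids reading and truncating the same file in one pipeline.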

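Apply the same change on the other nodes and restart the kubelet service there; the kubelet service then reaches the Running state.

All nodes in the k8s cluster are then in the Ready state: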

image-20241118224702797

https://jiangsanyin.github.io/2024/11/18/使用kubeadm将k8s从1-23-x升级到1-24-x/
Author: sanyinjiang
Published: November 18, 2024