
Troubleshooting and fixing an etcd container that keeps restarting during k8s cluster deployment


Symptoms

While deploying Kubernetes 1.26, after initializing the cluster with kubeadm, every kubectl command failed with the following error:

The connection to the server localhost:8080 was refused - did you specify the right host or port?
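(As an aside, this exact localhost:8080 message usually means kubectl has no kubeconfig loaded at all; on a kubeadm control-plane node the usual first step, per kubeadm's own post-init instructions, is to point kubectl at the generated admin credentials.)

# Standard kubeadm post-init step; paths are the kubeadm defaults.
export KUBECONFIG=/etc/kubernetes/admin.conf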

Checking whether kubelet was healthy revealed that it could not connect to the apiserver on port 6443.
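kubelet's own view is available through systemd (routine commands, shown here for completeness); the journal produced the lines below:

systemctl status kubelet          # is the service active and running?
journalctl -u kubelet -f          # follow the kubelet journal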

Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015089    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"

Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015445    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"

Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015654    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"

Dec 21 09:36:03 k8s-master kubelet[7127]: E1221 09:36:03.015818    7127 kubelet_node_status.go:540] "Error updating node status, will retry" err="error getting node \"k8s-master\": Get \"https://192.168.2.200:6443/api/v1/nodes/k8s-master?timeout=10s\": dial tcp 192.168.2.200:6443: connect: connection refused"

The next step was to check the state of the apiserver container. Since the cluster uses containerd as the container runtime, and kubectl was unusable at this point, crictl ps -a can list every container:
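If crictl has not been set up yet, it may first need to be pointed at the containerd socket; a minimal sketch assuming containerd's default socket path:

# /etc/crictl.yaml — assuming containerd's default socket location
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock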

root@k8s-master:~/k8s/calico# crictl ps -a

CONTAINER           IMAGE               CREATED             STATE               NAME                      ATTEMPT             POD ID              POD

395b45b1cb733       a31e1d84401e6       50 seconds ago      Exited              kube-apiserver            28                  e87800ae06ff5       kube-apiserver-k8s-master

b5c7e2a07bf1b       5d7c5dfd3ba18       3 minutes ago       Running             kube-controller-manager   32                  6b7cc9dd07f1d       kube-controller-manager-k8s-master

944aa31862613       556768f31eb1d       4 minutes ago       Exited              kube-proxy                27                  ccb6557c6f629       kube-proxy-ctjjq

c097332b6f416       fce326961ae2d       4 minutes ago       Exited              etcd                      30                  079d491eb9925       etcd-k8s-master

b8103090322c4       dafd8ad70b156       6 minutes ago       Exited              kube-scheduler            32                  48f9544c9798c       kube-scheduler-k8s-master

a14b969e8ad05       5d7c5dfd3ba18       12 minutes ago      Exited              kube-controller-manager   31                  5576806b4e142       kube-controller-manager-k8s-master

The kube-apiserver container had exited, so its log was checked for anything abnormal. The log showed that kube-apiserver could not connect to etcd on port 2379, so the problem presumably lay with etcd:
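The container log can be read with crictl as well, using the container ID from the listing above:

# Log of the exited kube-apiserver container (ID from the crictl ps -a output)
crictl logs 395b45b1cb733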

W1221 07:00:20.392868       1 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {

  "Addr": "127.0.0.1:2379",

  "ServerName": "127.0.0.1",

  "Attributes": null,

  "BalancerAttributes": null,

  "Type": 0,

  "Metadata": null

}. Err: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused"

W1221 07:00:21.391330       1 logging.go:59] [core] [Channel #4 SubChannel #6] grpc: addrConn.createTransport failed to connect to {

  "Addr": "127.0.0.1:2379",

  "ServerName": "127.0.0.1",

  "Attributes": null,

  "BalancerAttributes": null,

  "Type": 0,

  "Metadata": null

}. Err: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused"

Meanwhile the etcd container was also restarting continuously, yet its log contained no error-level messages:
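Its log can be pulled the same way, with the etcd container ID from the listing above:

# Log of the restarting etcd container (ID from the crictl ps -a output)
crictl logs c097332b6f416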

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 is starting a new election at term 2"}

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became pre-candidate at term 2"}

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 received MsgPreVoteResp from d975d9ebc69964b3 at term 2"}

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became candidate at term 3"}

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 received MsgVoteResp from d975d9ebc69964b3 at term 3"}

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"d975d9ebc69964b3 became leader at term 3"}

{"level":"info","ts":"2022-12-21T10:29:00.740Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: d975d9ebc69964b3 elected leader d975d9ebc69964b3 at term 3"}

{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"etcdserver/server.go:2054","msg":"published local member to cluster through raft","local-member-id":"d975d9ebc69964b3","local-member-attributes":"{Name:k8s-master ClientURLs:[https://192.168.2.200:2379]}","request-path":"/0/members/d975d9ebc69964b3/attributes","cluster-id":"f88ac1c8c4bab6","publish-timeout":"7s"}

{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"embed/serve.go:100","msg":"ready to serve client requests"}

{"level":"info","ts":"2022-12-21T10:29:00.742Z","caller":"embed/serve.go:100","msg":"ready to serve client requests"}

{"level":"info","ts":"2022-12-21T10:29:00.743Z","caller":"etcdmain/main.go:44","msg":"notifying init daemon"}

{"level":"info","ts":"2022-12-21T10:29:00.743Z","caller":"etcdmain/main.go:50","msg":"successfully notified init daemon"}

{"level":"info","ts":"2022-12-21T10:29:00.744Z","caller":"embed/serve.go:198","msg":"serving client traffic securely","address":"192.168.2.200:2379"}

{"level":"info","ts":"2022-12-21T10:29:00.745Z","caller":"embed/serve.go:198","msg":"serving client traffic securely","address":"127.0.0.1:2379"}

{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}

{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"embed/etcd.go:373","msg":"closing etcd server","name":"k8s-master","data-dir":"/var/lib/etcd","advertise-peer-urls":["https://192.168.2.200:2380"],"advertise-client-urls":["https://192.168.2.200:2379"]}

{"level":"info","ts":"2022-12-21T10:30:20.636Z","caller":"etcdserver/server.go:1465","msg":"skipped leadership transfer for single voting member cluster","local-member-id":"d975d9ebc69964b3","current-leader-member-id":"d975d9ebc69964b3"}

{"level":"info","ts":"2022-12-21T10:30:20.637Z","caller":"embed/etcd.go:568","msg":"stopping serving peer traffic","address":"192.168.2.200:2380"}

{"level":"info","ts":"2022-12-21T10:30:20.639Z","caller":"embed/etcd.go:573","msg":"stopped serving peer traffic","address":"192.168.2.200:2380"}

{"level":"info","ts":"2022-12-21T10:30:20.639Z","caller":"embed/etcd.go:375","msg":"closed etcd server","name":"k8s-master","data-dir":"/var/lib/etcd","advertise-peer-urls":["https://192.168.2.200:2380"],"advertise-client-urls":["https://192.168.2.200:2379"]}

However, one log line shows that etcd received a shutdown signal; it did not exit abnormally on its own:

{"level":"info","ts":"2022-12-21T10:30:20.624Z","caller":"osutil/interrupt_unix.go:64","msg":"received signal; shutting down","signal":"terminated"}

Solution

The problem was caused by a misconfigured cgroup setting. In containerd's configuration file /etc/containerd/config.toml, change SystemdCgroup to true:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]

  BinaryName = ""

  CriuImagePath = ""

  CriuPath = ""

  CriuWorkPath = ""

  IoGid = 0

  IoUid = 0

  NoNewKeyring = false

  NoPivotRoot = false

  Root = ""

  ShimCgroup = ""

  SystemdCgroup = true
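If the file does not exist yet, containerd can regenerate its stock configuration first, and the flag can then be flipped in place; a sketch assuming the default layout:

# Regenerate the stock config only if /etc/containerd/config.toml is missing
[ -f /etc/containerd/config.toml ] || containerd config default | sudo tee /etc/containerd/config.toml
# Flip the runc cgroup flag to systemd
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml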

Then restart the containerd service:

systemctl restart containerd

The etcd container stopped restarting, the other containers recovered as well, and the problem was solved.
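As a final check (a hypothetical verification pass with the same tools used above):

# All control-plane containers should now stay in STATE Running
crictl ps
# kubectl should reach the apiserver again
kubectl get nodes
kubectl get pods -n kube-system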

