Bingo Huang


etcd in Action: A Hands-On Talk

I recommend viewing my online slides: https://talks.bingohuang.com/2018/20180521-etcd-in-action/etcd-in-action.slide

Continue reading

Troubleshooting etcd: `failed to send out heartbeat on time`

Check the etcd logs. If you see a line like this: 08:52:05.164847 W | etcdserver: failed to send out heartbeat on time, then etcd is having trouble sending heartbeats. The official FAQ explains it as follows: etcd uses a leader...

Continue reading

Troubleshooting etcd: `etcdserver apply entries took too long`

It is worth checking the etcd logs regularly. If etcd is under heavy load, or the node is under-provisioned, you will often see lines like this: 08:52:05.164847 W | etcdserver: apply entries took too long [140.696147ms for 1 entries] 0...

Continue reading
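
When scanning a long log for the `apply entries took too long` warning quoted in the excerpt above, it can help to extract the duration and entry count programmatically. The following is only a minimal sketch: the regular expression is modelled on the single sample line shown above, and the function name `parse_slow_apply` and the unit table are hypothetical helpers, not part of etcd. It assumes a simple single-unit duration such as `140.696147ms`; compound durations are not handled.

```python
import re

# Pattern modelled on the one warning line quoted in the excerpt above.
# Assumption: the duration is a single number followed by µs, ms, or s.
APPLY_SLOW = re.compile(
    r"apply entries took too long "
    r"\[(?P<value>[\d.]+)(?P<unit>µs|ms|s) for (?P<count>\d+) entries\]"
)

# Conversion factors to milliseconds (hypothetical helper table).
UNIT_TO_MS = {"µs": 0.001, "ms": 1.0, "s": 1000.0}


def parse_slow_apply(line):
    """Return (duration_ms, entry_count) for a slow-apply warning, else None."""
    m = APPLY_SLOW.search(line)
    if m is None:
        return None
    return float(m["value"]) * UNIT_TO_MS[m["unit"]], int(m["count"])


sample = (
    "08:52:05.164847 W | etcdserver: apply entries took too long "
    "[140.696147ms for 1 entries]"
)
print(parse_slow_apply(sample))
```

Applying the function to every line of a log file and keeping the non-`None` results gives a quick view of how often, and by how much, applies are running slow.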

Troubleshooting etcd: `the clock difference against peer xxx is too high [xxxs > 1s]`

While helping to troubleshoot an etcd problem, I came across the following log: 2018-05-16 12:38:59.796724 W | rafthttp: the clock difference against peer e7e21c67737845ce is too high [3.370772704s > ...

Continue reading

Troubleshooting etcd: `database file (xxx) does not match with snapshot`

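A colleague once came to me saying that etcd was acting up, restarting over and over, and asked me to help work out why. There is no need to panic over this kind of problem: software acting up (failing) is the norm, so you simply treat it (though, of course, not every failure can be diagnosed and cured...). The first troubleshooting step that comes to mind is to check the logs. My colleague started by attaching the kubelet logs, since all pods are, after all, run by the kubelet...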

Continue reading