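Installing Kubernetes Metrics Server v0.5.2 and Fixing the Errors Along the Way
While using the kubectl top command to check the CPU and memory usage of Pods in a Kubernetes cluster, I ran into the following problem: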
$ kubectl top node
W0818 03:22:46.090578 26207 top_pod.go:140] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
error: Metrics API not available
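The error "Metrics API not available" appears because the metrics-server component is not installed in this Kubernetes cluster.
To install it, follow the installation notes in the project's GitHub repository, https://github.com/kubernetes-sigs/metrics-server, and create the resources from components.yaml: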
[root@sdw88 glj_tmp]# kubectl create -f components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
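The components.yaml used above has to be downloaded first, a step not shown here. A minimal sketch of fetching it, assuming the manifest is published as a release asset under the URL pattern below (the function name is hypothetical, and the exact link is worth confirming on the repository's releases page):

```shell
# Build the download URL for a given Metrics Server release tag.
# Assumption: manifests are published as release assets at this path.
metrics_server_manifest_url() {
  printf 'https://github.com/kubernetes-sigs/metrics-server/releases/download/%s/components.yaml\n' "$1"
}

# Usage (needs network access):
#   curl -fsSL -o components.yaml "$(metrics_server_manifest_url v0.5.2)"
#   kubectl create -f components.yaml
```

Pinning the version in the URL keeps the manifest and the image tag discussed in this post in step with each other.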
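If the nodes can reach k8s.gcr.io, the image k8s.gcr.io/metrics-server/metrics-server:v0.5.2 is pulled automatically and the installation is complete.
Our company servers cannot reach that registry, so the only option was this workaround:
# 1. Pull the image from the Alibaba Cloud mirror: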
docker pull registry.aliyuncs.com/google_containers/metrics-server:v0.5.2
# 2. Retag it with the image name the manifest expects:
docker tag registry.aliyuncs.com/google_containers/metrics-server:v0.5.2 k8s.gcr.io/metrics-server/metrics-server:v0.5.2
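An alternative to retagging on each node is to point the manifest itself at the mirror. The following is only a sketch: the function name is hypothetical, and it assumes components.yaml references the image exactly as k8s.gcr.io/metrics-server/metrics-server.

```shell
# Rewrite the image reference in a manifest to point at a mirror registry.
# Prints the modified manifest to stdout; the input file is left untouched.
use_mirror_image() {
  src='k8s.gcr.io/metrics-server/metrics-server'
  dst='registry.aliyuncs.com/google_containers/metrics-server'
  sed "s#${src}#${dst}#g" "$1"
}

# Usage:
#   use_mirror_image components.yaml > components-mirror.yaml
#   kubectl create -f components-mirror.yaml
```

With this approach the kubelet pulls from the mirror on whichever node the Pod lands on, so no manual docker pull is needed per node.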
Run both commands on every node that may host the Pod, then wait a moment and check whether the Pod is healthy:
[root@sdw88 glj_tmp]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7f6cbbb7b8-44nfm 1/1 Running 0 7d19h
coredns-7f6cbbb7b8-pjdt7 1/1 Running 0 7d19h
etcd-sdw88 1/1 Running 0 7d19h
kube-apiserver-sdw88 1/1 Running 0 7d19h
kube-controller-manager-sdw88 1/1 Running 0 7d19h
kube-flannel-ds-bndj6 1/1 Running 0 7d18h
kube-flannel-ds-cnxxc 1/1 Running 0 7d18h
kube-flannel-ds-l28g9 1/1 Running 0 7d18h
kube-proxy-8lppt 1/1 Running 0 28h
kube-proxy-ns4tv 1/1 Running 0 28h
kube-proxy-pnmm8 1/1 Running 0 28h
kube-scheduler-sdw88 1/1 Running 0 7d19h
metrics-server-5b6dd75459-7wzvr 0/1 Running 0 12s
The metrics-server Pod is Running but not yet Ready (0/1), so describe it to see the details:
[root@sdw88 glj_tmp]# kubectl describe pod metrics-server-5b6dd75459-7wzvr -n kube-system
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 31s default-scheduler Successfully assigned kube-system/metrics-server-5b6dd75459-7wzvr to sdw34
Normal Pulled <invalid> kubelet Container image "k8s.gcr.io/metrics-server/metrics-server:v0.5.2" already present on machine
Normal Created <invalid> kubelet Created container metrics-server
Normal Started <invalid> kubelet Started container metrics-server
Warning Unhealthy 9s (x5 over 49s) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
The events state the problem directly: Readiness probe failed: HTTP probe failed with statuscode: 500
The container log is full of errors like these:
E1216 10:54:51.417065 1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.33:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 192.168.1.33 because it doesn't contain any IP SANs" node="sdw33"
I1216 10:54:53.739687 1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
I1216 10:55:03.739164 1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
E1216 10:55:06.408232 1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.33:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 192.168.1.33 because it doesn't contain any IP SANs" node="sdw33"
E1216 10:55:06.409199 1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.34:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 192.168.1.34 because it doesn't contain any IP SANs" node="sdw34"
E1216 10:55:06.411028 1 scraper.go:139] "Failed to scrape node" err="Get \"https://192.168.1.88:10250/stats/summary?only_cpu_and_memory=true\": x509: cannot validate certificate for 192.168.1.88 because it doesn't contain any IP SANs" node="sdw88"
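To see at a glance which nodes are affected, the log can be filtered. This is a rough sketch: the function name is hypothetical, and it assumes the log lines end with node="..." as in the excerpt above.

```shell
# Read metrics-server log lines on stdin and print each node whose scrape
# failed with an x509 error, one per line, without duplicates.
x509_failed_nodes() {
  grep 'x509:' | sed -n 's/.* node="\([^"]*\)"$/\1/p' | sort -u
}

# Usage (assumes the Deployment is named metrics-server in kube-system):
#   kubectl logs -n kube-system deployment/metrics-server | x509_failed_nodes
```

If every node in the cluster shows up, as it does here, the cause is cluster-wide kubelet certificate setup rather than one misconfigured machine.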
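To resolve this error, I went back through the installation notes in the GitHub repository (English only): https://github.com/kubernetes-sigs/metrics-server
1. Requirements: kubelet certificates must be signed by the cluster certificate authority (alternatively, certificate validation can be disabled by passing --kubelet-insecure-tls to Metrics Server, which is insecure).
2. Configuration: with --kubelet-insecure-tls set, Metrics Server does not verify the CA of the serving certificates presented by kubelets. This is insecure and meant for test environments only.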
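Edit the components.yaml that was applied earlier and add the --kubelet-insecure-tls argument to the metrics-server container.
--kubelet-insecure-tls: the kubelet serves port 10250 over HTTPS, so a connection normally requires its TLS certificate to be verified. With this flag, Metrics Server skips verification of the kubelet's serving certificate, which is why the x509 "IP SANs" errors above go away.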
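The edit can also be scripted instead of made by hand. This is a minimal sketch rather than the only way to do it: the function name is hypothetical, and it assumes the container args in components.yaml contain a "- --cert-dir=..." entry to anchor on.

```shell
# Print a copy of the manifest with --kubelet-insecure-tls added right after
# the --cert-dir argument, reusing the indentation of that line.
# Assumption: the container args include a "- --cert-dir=..." entry.
add_insecure_tls_flag() {
  awk '
    { print }
    /^ *- --cert-dir=/ {
      indent = $0
      sub(/-.*/, "", indent)          # keep only the leading whitespace
      print indent "- --kubelet-insecure-tls"
    }
  ' "$1"
}

# Usage (run once; a second run would add the flag again):
#   add_insecure_tls_flag components.yaml > components.new
#   mv components.new components.yaml
```

Copying the indentation from the neighbouring argument keeps the new entry aligned with the rest of the args list, which YAML requires.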
Re-apply components.yaml and check the Pod again:
[root@sdw88 glj_tmp]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
metrics-server-5b6dd75459-7wzvr 1/1 Running 0 33s
Verify that top now works:
[root@sdw88 glj_tmp]# kubectl top pod -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-7f6cbbb7b8-44nfm 3m 23Mi
coredns-7f6cbbb7b8-pjdt7 3m 21Mi
etcd-sdw88 34m 56Mi
kube-apiserver-sdw88 102m 290Mi
kube-controller-manager-sdw88 30m 58Mi
kube-flannel-ds-bndj6 3m 23Mi
kube-flannel-ds-cnxxc 3m 21Mi
kube-flannel-ds-l28g9 4m 26Mi
kube-proxy-8lppt 11m 20Mi
kube-proxy-ns4tv 1m 19Mi
kube-proxy-pnmm8 9m 19Mi
kube-scheduler-sdw88 6m 28Mi
metrics-server-5b6dd75459-7wzvr 4m 22Mi