Monitoring Tomcat Services with kube-prometheus
Preface
Recently, one of our Tomcat cluster services kept having problems. When I asked colleagues whether there was any monitoring of Tomcat or application state, they told me these legacy services had never been hooked up to monitoring.
Deployment
Download jmx_exporter
The version used here is 0.20.0.
Download link: https://repo.maven.apache.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.20.0/jmx_prometheus_javaagent-0.20.0.jar
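The jar can be fetched straight from Maven Central. A minimal sketch, assuming you keep the agent under /home/tomcat as in the rest of this article:

```shell
# Pin the agent version used throughout this article
VERSION=0.20.0
URL="https://repo.maven.apache.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/${VERSION}/jmx_prometheus_javaagent-${VERSION}.jar"
echo "$URL"
# Download into the directory referenced later by catalina.sh:
# wget -O "/home/tomcat/jmx_prometheus_javaagent-${VERSION}.jar" "$URL"
```

Pinning the version in a variable keeps the filename and the `-javaagent` path in catalina.sh consistent when you upgrade.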
Configure jmx_exporter
Create tomcat.yml, based on https://github.com/prometheus/jmx_exporter/blob/release-0.20.0/example_configs/tomcat.yml. Remember to switch the repository to the release tag matching your jmx_prometheus_javaagent version; here that is 0.20.0.
```yaml
# https://grafana.com/grafana/dashboards/8704-tomcat-dashboard/
---
lowercaseOutputLabelNames: true
lowercaseOutputName: true
rules:
- pattern: 'Catalina<type=GlobalRequestProcessor, name=\"(\w+-\w+)-(\d+)\"><>(\w+):'
  name: tomcat_$3_total
  labels:
    port: "$2"
    protocol: "$1"
  help: Tomcat global $3
  type: COUNTER
- pattern: 'Catalina<j2eeType=Servlet, WebModule=//([-a-zA-Z0-9+&@#/%?=~_|!:.,;]*[-a-zA-Z0-9+&@#/%=~_|]), name=([-a-zA-Z0-9+/$%~_-|!.]*), J2EEApplication=none, J2EEServer=none><>(requestCount|maxTime|processingTime|errorCount):'
  name: tomcat_servlet_$3_total
  labels:
    module: "$1"
    servlet: "$2"
  help: Tomcat servlet $3 total
  type: COUNTER
- pattern: 'Catalina<type=ThreadPool, name="(\w+-\w+)-(\d+)"><>(currentThreadCount|currentThreadsBusy|keepAliveCount|pollerThreadCount|connectionCount):'
  name: tomcat_threadpool_$3
  labels:
    port: "$2"
    protocol: "$1"
  help: Tomcat threadpool $3
  type: GAUGE
- pattern: 'Catalina<type=Manager, host=([-a-zA-Z0-9+&@#/%?=~_|!:.,;]*[-a-zA-Z0-9+&@#/%=~_|]), context=([-a-zA-Z0-9+/$%~_-|!.]*)><>(processingTime|sessionCounter|rejectedSessions|expiredSessions):'
  name: tomcat_session_$3_total
  labels:
    context: "$2"
    host: "$1"
  help: Tomcat session $3 total
  type: COUNTER
```
Configure Tomcat
Edit tomcat/bin/catalina.sh and add the agent to JAVA_OPTS. The path after the colon must point to the rule file created above; here it is saved as /home/tomcat/jmx-exporter.yaml:

```shell
JAVA_OPTS="$JAVA_OPTS -javaagent:/home/tomcat/jmx_prometheus_javaagent-0.20.0.jar=29400:/home/tomcat/jmx-exporter.yaml"
```
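As an alternative to editing catalina.sh directly, Tomcat's startup scripts also source bin/setenv.sh when it exists, so the same setting can live there and survive Tomcat upgrades. A sketch, assuming the same paths as above:

```shell
# tomcat/bin/setenv.sh -- sourced automatically by catalina.sh at startup
JAVA_OPTS="$JAVA_OPTS -javaagent:/home/tomcat/jmx_prometheus_javaagent-0.20.0.jar=29400:/home/tomcat/jmx-exporter.yaml"
```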
After restarting the service, verify that it came up correctly.
Run ps -ef | grep java
and check whether the running Java process includes -javaagent. If it does, the agent is active; if not, troubleshoot on your own, e.g. check that the jar and config paths are correct.
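Beyond checking the process list, you can scrape the agent's HTTP endpoint itself. The helper below counts Tomcat-specific metric lines in a scrape; a non-zero count means the rules in the config are matching. The sample payload and its values are made up for illustration — against a live agent you would feed it `curl -s http://localhost:29400/metrics`:

```shell
# Count metric lines produced by the Tomcat rules (their names all start with "tomcat_")
count_tomcat_metrics() {
  printf '%s\n' "$1" | grep -c '^tomcat_'
}

# Demo with a hypothetical scrape payload:
sample='tomcat_threadpool_currentthreadsbusy{port="8080",protocol="http-nio"} 3
jvm_threads_current 42
tomcat_session_sessioncounter_total{host="localhost",context="/app"} 7'
count_tomcat_metrics "$sample"   # prints 2: two of the three lines are tomcat_ metrics
```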
kube-prometheus configuration
Configure Service and Endpoints
```yaml
# http://10.194.106.16:29400/metrics
apiVersion: v1
kind: Service
metadata:
  name: monitoring-external-jvm-exporter
  labels:
    app: java
    type: jvm-exporter
  namespace: monitoring
spec:
  type: ClusterIP
  ports:
  - name: http-metrics
    port: 29400
    protocol: TCP
    targetPort: 29400
---
apiVersion: v1
kind: Endpoints
metadata:
  name: monitoring-external-jvm-exporter
  labels:
    app: java
    type: jvm-exporter
  namespace: monitoring
subsets:
- addresses:
  - ip: 10.194.106.16
  ports:
  - name: http-metrics
    port: 29400
    protocol: TCP
```
Configure ServiceMonitor

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: jvm-monitor
    release: monitoring
  name: jvm-monitor
  namespace: monitoring
spec:
  endpoints:
  - port: http-metrics
    interval: 30s
    path: /metrics
    honorLabels: true
  selector:
    matchLabels:
      app: java
      type: jvm-exporter
  namespaceSelector:
    matchNames:
    - monitoring
```
After applying the YAML, check the Endpoints status with kubectl get endpoints -n monitoring monitoring-external-jvm-exporter
and confirm that the ENDPOINTS column shows 10.194.106.16:29400.
Verify the Service association with kubectl describe svc -n monitoring monitoring-external-jvm-exporter
and check that the Endpoints section points at the correct IP.
Configure Grafana
The Grafana dashboard ID is 8563; import it into Grafana and you are done. I made some minor adjustments to it for my setup.
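Because the config sets lowercaseOutputName: true, the exported series use lowercased names such as tomcat_threadpool_currentthreadsbusy. If you adjust dashboard panels, queries along these lines are a starting point; the names below are derived from the rules earlier in this article, so verify them against your own /metrics output:

```promql
# Busy vs. configured threads per connector (ThreadPool rule, GAUGE)
tomcat_threadpool_currentthreadsbusy
tomcat_threadpool_currentthreadcount

# Per-connector request rate over 5 minutes (GlobalRequestProcessor rule, COUNTER)
rate(tomcat_requestcount_total[5m])
```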