Monitoring Redis with kube-prometheus


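Redis is deployed on K8s in this setup, so I use the sidecar pattern and add a redis-exporter monitoring container to the same pod. For a Redis instance outside the cluster, see my earlier notes on monitoring services outside the K8s cluster with kube-prometheus.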

Steps

The steps for deploying standalone Redis itself are omitted; this deployment runs the exporter as a sidecar. The manifest is redis-deploy.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: redis-single-node
  name: redis-single-node
  namespace: tools
spec:
  progressDeadlineSeconds: 600    # Maximum time allowed for a Deployment rollout to make progress.
  replicas: 1
  revisionHistoryLimit: 2   # Number of old rollout revisions to keep.
  selector:
    matchLabels:
      app: redis-single-node
  template:
    metadata:
      labels:
        app: redis-single-node
    spec:
      imagePullSecrets:
        - name: hub
      containers:
        - command:
            - sh
            - -c
            - redis-server "/mnt/redis.conf"
          env:
            - name: TZ
              value: Asia/Shanghai
            - name: LANG
              value: C.UTF-8
          image: 10.194.24.53/tools/redis:6.2.13-alpine
          imagePullPolicy: IfNotPresent
          name: redis-single-node
          ports:
            - containerPort: 6379
              name: addr
              protocol: TCP
          resources:
            limits:
              cpu: '1'
              memory: '2Gi'
            requests:
              cpu: 100m
              memory: 10Mi
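          securityContext: # Container security context
            privileged: false  # Privileged mode (highest privileges); disabled here
            runAsNonRoot: false # When true, the container is not allowed to start as root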
          volumeMounts:
            - mountPath: /mnt
              name: redis-conf
              readOnly: true
            - mountPath: /data
              name: redis-data
        - name: redis-exporter
          image: 10.194.24.53/k8s-component/oliver006/redis_exporter:v1.54.0
          env:
#            - name: REDIS_ADDR
#              value: "redis-single-node:6379"
            - name: REDIS_PASSWORD
              value: "<redis-password>"  # Redis password; this variable can be omitted if no password is set
          securityContext:
            runAsUser: 59000
            runAsGroup: 59000
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 250m
              memory: 180Mi
          ports:
            - containerPort: 9121
              name: redis-exporter
      restartPolicy: Always
      volumes:
        - configMap:
            defaultMode: 420
            name: redis-config
          name: redis-conf

        - name: redis-data
          persistentVolumeClaim:
            claimName: redis-pvc

redis-exporter-svc.yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis-exporter
  name: redis-exporter-svc
  namespace: tools
spec:
  ports:
    - name: http-metrics
      port: 9121
      protocol: TCP
      targetPort: 9121
  type: ClusterIP
  selector:
    app: redis-single-node
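
Before wiring up Prometheus, it can help to confirm that the sidecar serves metrics through this Service. Because REDIS_ADDR is left commented out in the Deployment, the exporter falls back to its default address on localhost:6379, which works here since both containers share the pod's network namespace. The sketch below is a minimal check, assuming kubectl access to the tools namespace and curl on the local machine; redis_up_value is a hypothetical helper name, not something shipped with redis_exporter.

```shell
# Hypothetical helper: read Prometheus text-format metrics on stdin and
# print the value of the redis_up gauge (1 means the exporter can reach Redis).
redis_up_value() {
  awk '$1 == "redis_up" { print $2 }'
}

# Usage sketch (run manually; requires access to the cluster):
#   kubectl -n tools port-forward svc/redis-exporter-svc 9121:9121 &
#   curl -s http://localhost:9121/metrics | redis_up_value
```

If the helper prints 0 or nothing at all, checking the redis-exporter container logs and the REDIS_PASSWORD value is probably the first thing to try.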

Create the ServiceMonitor custom resource

redis-exporter-sm.yaml

# ServiceMonitor: service auto-discovery rule
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor # CRD defined by prometheus-operator
metadata:
  labels:
    app: redis-exporter
    release: monitoring
  name: redis-exporter
  namespace: monitoring
spec:
  endpoints:
    - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      port: http-metrics # Port to scrape metrics from; this is the Service port name, i.e. spec.ports[].name in the Service YAML
      interval: 30s
      path: /metrics
  jobLabel: app # Name of the Service label whose value becomes the job label; the Service carries app=redis-exporter, so series get job="redis-exporter"
  namespaceSelector:
#    matchNames: # Namespaces to discover Services in; several can be listed
#      - default
    any: true
  selector:
    matchLabels:
      app: redis-exporter

Monitoring a redis_exporter outside the Kubernetes cluster

cat > redis-monitor.yaml << 'EOF'
apiVersion: v1
kind: Endpoints
metadata:
  name: redis-metrics
  namespace: monitoring
  labels:
    k8s-app: redis-metrics
subsets:
- addresses:
    - ip: 172.16.3.225
  ports:
  - name: redis-exporter
    port: 9121
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: redis-metrics
  namespace: monitoring
  labels:
    k8s-app: redis-metrics
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: redis-exporter
    port: 9121
    protocol: TCP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: redis-metrics
  namespace: monitoring
  labels:
    app: redis-metrics
    k8s-app: redis-metrics
    prometheus: kube-prometheus
    release: kube-prometheus
spec:
  endpoints:
  - port: redis-exporter
    interval: 15s
  selector:
    matchLabels:
      k8s-app: redis-metrics
  namespaceSelector:
    matchNames:
    - monitoring
EOF

Check the targets

Prometheus has automatically discovered Redis.


Log in to Grafana and import the dashboard

Dashboard URL: https://grafana.com/grafana/dashboards/11835

Fixing the incorrect memory usage display


Cause:

No maximum memory was set for Redis, so redis_memory_max_bytes is 0 and the division yields infinity. The panel uses this formula: 100 * (redis_memory_used_bytes / redis_memory_max_bytes)
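
To make the arithmetic concrete, here is a small sketch that applies the same formula to two sample values. The function name mem_used_pct and the sample byte counts are made up for illustration; the zero branch mirrors how PromQL returns +Inf when a positive value is divided by 0.

```shell
# Hypothetical helper: compute 100 * used / max the way the dashboard panel does.
# $1 = redis_memory_used_bytes, $2 = redis_memory_max_bytes
mem_used_pct() {
  awk -v used="$1" -v max="$2" 'BEGIN {
    # maxmemory unset => max is 0 => the panel value is infinite
    if (max == 0) { print "+Inf"; exit }
    printf "%.1f\n", 100 * used / max
  }'
}

mem_used_pct 536870912 1073741824   # 512 MiB used of a 1 GiB limit
mem_used_pct 536870912 0            # no maxmemory configured
```

The first call prints a usable percentage, while the second shows why the panel cannot render a sensible value until maxmemory is set.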

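Fix:
1. Set the maximum memory from the command line. This takes effect immediately but is lost when Redis restarts unless it is also written to redis.conf.
10.211.11.110:6379> CONFIG SET maxmemory 1024mb
10.211.11.110:6379> CONFIG SET maxmemory-policy volatile-ttl

2. Add the settings to the redis.conf configuration file

maxmemory 2048mb
maxmemory-policy volatile-ttl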
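
After changing the setting, redis_memory_max_bytes should report the configured limit in bytes. Redis interprets the mb suffix as 1024 * 1024 bytes, so the expected value can be worked out as in the sketch below; redis_mb_to_bytes is a hypothetical helper name used only for this illustration.

```shell
# Hypothetical helper: convert a Redis size written with the "mb" suffix to bytes.
# Redis treats 1mb as 1024 * 1024 bytes (the plain "m" suffix would be 1000000).
redis_mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

redis_mb_to_bytes 1024   # value set with CONFIG SET above
redis_mb_to_bytes 2048   # value written to redis.conf above

# Usage sketch (run manually against the instance):
#   redis-cli -h 10.211.11.110 CONFIG GET maxmemory
```

Note that 2048mb works out to the same number of bytes as the 2Gi container memory limit in the Deployment above, so it may be worth choosing a somewhat lower maxmemory to leave headroom for Redis overhead; treat that as a suggestion rather than a tested recommendation.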


Configure the PrometheusRule

redis-rules.yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: redis-rules
  namespace: monitoring
spec:
  groups:
    - name: redis.rules
      rules:
        - alert: RedisDown
          expr: redis_up == 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis down (instance {{ $labels.instance }})
            description: "Redis instance is down\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisMissingMaster
          expr: (count(redis_instance_info{role="master"}) or vector(0)) < 1
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis missing master (instance {{ $labels.instance }})
            description: "Redis cluster has no node marked as master.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisTooManyMasters
          expr: count(redis_instance_info{role="master"}) > 1
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis too many masters (instance {{ $labels.instance }})
            description: "Redis cluster has too many nodes marked as master.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisDisconnectedSlaves
          expr: count without (instance, job) (redis_connected_slaves) - sum without (instance, job) (redis_connected_slaves) - 1 > 1
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis disconnected slaves (instance {{ $labels.instance }})
            description: "Redis not replicating for all slaves. Consider reviewing the redis replication status.\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisReplicationBroken
          expr: delta(redis_connected_slaves[1m]) < 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis replication broken (instance {{ $labels.instance }})
            description: "Redis instance lost a slave\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisClusterFlapping
          expr: changes(redis_connected_slaves[1m]) > 1
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: Redis cluster flapping (instance {{ $labels.instance }})
            description: "Changes have been detected in Redis replica connection. This can occur when replica nodes lose connection to the master and reconnect (a.k.a flapping).\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisMissingBackup
          expr: time() - redis_rdb_last_save_timestamp_seconds > 60 * 60 * 24
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis missing backup (instance {{ $labels.instance }})
            description: "Redis has not been backed up for 24 hours\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        # The exporter must be started with --include-system-metrics flag or REDIS_EXPORTER_INCL_SYSTEM_METRICS=true environment variable.
        - alert: RedisOutOfSystemMemory
          expr: redis_memory_used_bytes / redis_total_system_memory_bytes * 100 > 90
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: Redis out of system memory (instance {{ $labels.instance }})
            description: "Redis is running out of system memory (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisOutOfConfiguredMaxmemory
          expr: redis_memory_used_bytes / redis_memory_max_bytes * 100 > 90 and on(instance) redis_memory_max_bytes > 0
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: Redis out of configured maxmemory (instance {{ $labels.instance }})
            description: "Redis is running out of configured maxmemory (> 90%)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisTooManyConnections
          expr: redis_connected_clients > 500
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: Redis too many connections (instance {{ $labels.instance }})
            description: "Redis instance has too many connections\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
#        - alert: RedisNotEnoughConnections
#          expr: redis_connected_clients < 5
#          for: 2m
#          labels:
#            severity: warning
#          annotations:
#            summary: Redis not enough connections (instance {{ $labels.instance }})
#            description: "Redis instance should have more connections (> 5)\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
        - alert: RedisRejectedConnections
          expr: increase(redis_rejected_connections_total[1m]) > 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Redis rejected connections (instance {{ $labels.instance }})
            description: "Some connections to Redis have been rejected\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
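
Since one alert in the file is commented out, a quick listing of the active alert names can serve as a sanity check before applying it. The sketch below assumes the manifest is saved as redis-rules.yaml and that the Prometheus resource's ruleSelector matches the prometheus: k8s and role: alert-rules labels; list_alerts is a hypothetical helper name.

```shell
# Hypothetical helper: print the names of active (not commented-out) alerts
# defined in a PrometheusRule manifest given as $1.
list_alerts() {
  grep -E '^[[:space:]]*- alert:' "$1" | awk '{ print $3 }'
}

# Usage sketch (run manually; requires access to the cluster):
#   list_alerts redis-rules.yaml
#   kubectl apply -f redis-rules.yaml
#   kubectl -n monitoring get prometheusrule redis-rules
```

Lines starting with # do not match the pattern, so the commented-out RedisNotEnoughConnections rule is left out of the listing.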

References

1. Official redis_exporter repository: https://github.com/oliver006/redis_exporter

2. https://blog.51cto.com/wutengfei/5997105

3. https://blog.51cto.com/u_14440843/5759684
