This article (prometheus13) explains step by step how to deploy Alertmanager on k8s and connect it to Prometheus alerting rules.
1. The four configuration files
[root@kubemaster01 alertmanager]# ls -l
-rw-r--r-- 1 root root  676 Oct 28 15:43 alertmanager-configmap.yaml
-rw-r--r-- 1 root root 2183 Oct 28 15:36 alertmanager-deployment.yaml
-rw-r--r-- 1 root root  331 Oct 28 15:36 alertmanager-pvc.yaml
-rw-r--r-- 1 root root  372 Oct 28 15:36 alertmanager-service.yaml
2. Modify the PV settings and the addresses in the config
[root@kubemaster01 alertmanager]# cat alertmanager-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: alertmanager
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
spec:
  storageClassName: managed-nfs-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: "2Gi"
[root@kubemaster01 alertmanager]# cat alertmanager-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-config
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  alertmanager.yml: |
    global:
      resolve_timeout: 5m
      smtp_smarthost: 'smtp.163.com:25'
      smtp_from: 'ww763004768@163.com'
      smtp_auth_username: 'ww763004768@163.com'
      smtp_auth_password: '123456'
      smtp_require_tls: false
    receivers:
    - name: default-receiver
      email_configs:
      - to: "w673004768@163.com"
    route:
      group_interval: 1m
      group_wait: 10s
      receiver: default-receiver
      repeat_interval: 1m
[root@kubemaster01 alertmanager]#
3. Deploy
kubectl apply -f alertmanager-configmap.yaml
kubectl apply -f alertmanager-pvc.yaml
kubectl apply -f alertmanager-deployment.yaml
kubectl apply -f alertmanager-service.yaml
4. Configure communication between Prometheus and Alertmanager
Edit the Prometheus ConfigMap so that Prometheus knows where Alertmanager is, then re-apply it.
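The article does not show the change itself. As a rough sketch, the addition to prometheus.yml inside the Prometheus ConfigMap could look like the fragment below. The target `alertmanager:80` is an assumption: it supposes that alertmanager-service.yaml (not shown here) defines a Service named `alertmanager` in the same `kube-system` namespace and exposes port 80. Adjust the name and port to match your own Service.

```yaml
# Fragment of prometheus.yml (inside the Prometheus ConfigMap).
# Assumption: a Service named "alertmanager" listening on port 80
# exists in the same namespace as Prometheus.
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:80
```

With a static target like this, Prometheus sends every firing alert to that address. If the Service lives in another namespace, a fully qualified name of the form `<service>.<namespace>.svc` would be needed instead.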
5. Check that the change took effect
6. Modify the ConfigMap to change the Prometheus alerting rules
(kubectl apply -f prometheus-configmap.yaml)
Create the rules ConfigMap
kubectl apply -f prometheus-rules.yaml
[root@kubemaster01 prometheus]# cat prometheus-rules.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: kube-system
data:
  general.rules: |
    groups:
    - name: general.rules
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: error
        annotations:
          summary: "Instance {{ $labels.instance }} stopped working"
          description: "{{ $labels.instance }} job {{ $labels.job }} has been down for more than 1 minute."
  node.rules: |
    groups:
    - name: node.rules
      rules:
      - alert: NodeFilesystemUsage
        expr: 100 - (node_filesystem_free_bytes{fstype=~"ext4|xfs"} / node_filesystem_size_bytes{fstype=~"ext4|xfs"} * 100) > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }} : {{ $labels.mountpoint }} partition usage is too high"
          description: "{{ $labels.instance }}: {{ $labels.mountpoint }} partition usage is above 80% (current value: {{ $value }})"
      - alert: NodeMemoryUsage
        expr: 100 - (node_memory_MemFree_bytes+node_memory_Cached_bytes+node_memory_Buffers_bytes) / node_memory_MemTotal_bytes * 100 > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }} memory usage is too high"
          description: "{{ $labels.instance }} memory usage is above 80% (current value: {{ $value }})"
      - alert: NodeCPUUsage
        expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100) > 60
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }} CPU usage is too high"
          description: "{{ $labels.instance }} CPU usage is above 60% (current value: {{ $value }})"
[root@kubemaster01 prometheus]#
Mount the rules ConfigMap into the Prometheus service
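The mount itself is not shown in the article, so the following is only a sketch of how it might be wired up. The container name `prometheus-server` and the mount path `/etc/config/rules` are assumptions chosen for illustration. Only the ConfigMap name `prometheus-rules` and the `.rules` file suffix come from the listing above. The first document is a fragment of the Prometheus pod template, and the second is the matching entry in prometheus.yml.

```yaml
# Fragment of the Prometheus workload's pod template.
# Assumptions: container name and mountPath are illustrative only.
spec:
  template:
    spec:
      containers:
        - name: prometheus-server
          volumeMounts:
            - name: prometheus-rules
              mountPath: /etc/config/rules
      volumes:
        - name: prometheus-rules
          configMap:
            name: prometheus-rules   # the ConfigMap created above
---
# Fragment of prometheus.yml: load every rule file from the mounted
# directory. The glob matches general.rules and node.rules.
rule_files:
  - /etc/config/rules/*.rules
```

Each key in the ConfigMap's `data` section becomes a file in the mounted directory, which is why the glob picks up both rule groups without listing them one by one.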