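Introduction

Metricbeat is a lightweight shipper that runs on your servers and periodically collects monitoring metrics (including events) from hosts and services.

Metricbeat collects system metrics by default, and it also ships with a large number of other modules for collecting metrics from services such as Nginx, Kafka, MySQL, and Redis. The full list of supported modules is available on the Elastic website: https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-modules.html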

kube-state-metrics

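Official repository: https://github.com/kubernetes/kube-state-metrics

Note: for users of prometheus-operator/kube-prometheus, kube-prometheus already includes kube-state-metrics as one of its components.

If kube-prometheus is already installed, you do not need to install kube-state-metrics separately.

First, we need to install kube-state-metrics, a service that listens to the Kubernetes API and exposes metrics about the state of each resource object.

Installing kube-state-metrics is straightforward: the GitHub repository provides ready-made installation manifests, which can be applied as follows: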

$ git clone https://github.com/kubernetes/kube-state-metrics.git
$ cd kube-state-metrics
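# If no proxy registry for k8s.gcr.io is configured, the image pull may be blocked; in that case you can use the image published at `https://hub.docker.com/r/dyrnq/kube-state-metrics`
# Replace the image with: image: dyrnq/kube-state-metrics:v2.5.0
# Run the install command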
$ kubectl apply -f examples/standard/
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics configured
clusterrole.rbac.authorization.k8s.io/kube-state-metrics configured
deployment.apps/kube-state-metrics configured
serviceaccount/kube-state-metrics configured
service/kube-state-metrics configured
$ kubectl get pods -n kube-system -l app.kubernetes.io/name=kube-state-metrics
NAME READY STATUS RESTARTS AGE
kube-state-metrics-6d7449fc78-mgf4f 1/1 Running 0 88s

The installation has succeeded once the Pod is Running.
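
To make "exposes metrics about the state of each resource object" more concrete, here is a minimal Python sketch that reads a few lines in the Prometheus text format that kube-state-metrics serves and counts Pods per phase. The sample text, the pod names, and the helper names `parse_line` and `pods_per_phase` are assumptions made for illustration. The parser is deliberately simple and does not handle label values that contain commas or escaped quotes.

```python
from collections import Counter

# Hypothetical sample of the text kube-state-metrics serves on /metrics.
SAMPLE = """\
# HELP kube_pod_status_phase The pods current phase.
# TYPE kube_pod_status_phase gauge
kube_pod_status_phase{namespace="kube-system",pod="coredns-abc",phase="Running"} 1
kube_pod_status_phase{namespace="kube-system",pod="coredns-abc",phase="Pending"} 0
kube_pod_status_phase{namespace="default",pod="web-1",phase="Running"} 1
kube_pod_status_phase{namespace="default",pod="job-1",phase="Pending"} 1
"""


def parse_line(line):
    """Split one sample line into (metric name, labels dict, float value)."""
    head, value = line.rsplit(" ", 1)
    name, _, raw_labels = head.partition("{")
    labels = {}
    for pair in raw_labels.rstrip("}").split(","):
        if pair:
            key, _, val = pair.partition("=")
            labels[key] = val.strip('"')
    return name, labels, float(value)


def pods_per_phase(text):
    """Count Pods per phase; a sample value of 1 marks the Pod's current phase."""
    counts = Counter()
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, labels, value = parse_line(line)
        if name == "kube_pod_status_phase" and value == 1:
            counts[labels["phase"]] += 1
    return counts


print(dict(pods_per_phase(SAMPLE)))  # {'Running': 2, 'Pending': 1}
```

In a real cluster the text would come from the kube-state-metrics Service on port 8080 rather than from a hard-coded string.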

Metricbeat

Official source YAML: https://github.com/elastic/beats/blob/main/deploy/kubernetes/metricbeat-kubernetes.yaml

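# Configure the lifecycle of the Metricbeat indices. An Elasticsearch index lifecycle policy is a set of rules applied to your indices based on their size or age.
# For example, an index can be rolled over every day or whenever it exceeds 1 GB, and different phases can be configured by rule. Monitoring produces a lot of data, quite possibly tens of GB per day, so an index lifecycle policy lets us limit how long that data is retained.
# In the file below, the index is rolled over every day or whenever it exceeds 5 GB, and all indices older than 3 days are deleted.
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: kube-system
  name: metricbeat-indice-lifecycle
  labels:
    k8s-app: metricbeat
data:
  indice-lifecycle.json: |-
    {
      "policy": {
        "phases": {
          "hot": {
            "actions": {
              "rollover": {
                "max_size": "5GB",
                "max_age": "1d"
              }
            }
          },
          "delete": {
            "min_age": "3d",
            "actions": {
              "delete": {}
            }
          }
        }
      }
    }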
---
apiVersion: v1
kind: Secret
metadata:
name: metricbeat-ca
namespace: kube-system
data:
# If you have a CA certificate, set it here (base64-encoded)
ca.crt: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURTVENDQWpHZ0F3SUJBZ0lVWTFIQ3Qyc2RWVU9aeEVVYjhvdzRua1R4K1JBd0RRWUpLb1pJaHZjTkFRRUwKQlFBd05ERXlNREFHQTFVRUF4TXBSV3hoYzNScFl5QkRaWEowYVdacFkyRjBaU0JVYjI5c0lFRjFkRzluWlc1bApjbUYwWldRZ1EwRXdIaGNOTWpFd09ERTVNRFl4TXpNNVdoY05NalF3T0RFNE1EWXhNek01V2pBME1USXdNQVlEClZRUURFeWxGYkdGemRHbGpJRU5sY25ScFptbGpZWFJsSUZSdmIyd2dRWFYwYjJkbGJtVnlZWFJsWkNCRFFUQ0MKQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFPQzZWUUJoeFBKUUhYMzQrZDdxY21QbgoyMkRvaVV0NUdERjFycEJ3Zm96Nlo4aVp5VHdNUVlncXZJRUJsUGt3MlF6WThObVVIZGg4RSt3c1c1dlFWdEg4CjFqMHFwaTFxZWNuWXpmTUNzVWVlRmVWamtTRjRMK3JYZXl2RFUyWDgvK1Y1YVhxQ2xOeVJXWUpLRFQxejFVZ00KUEQ3enFPWnFjTFVITE54bG9TMW1vYjl6N29Rekw3clRkTW11WEFBTHFwRW4zLzdnTXJHWFhqV0ZNcHNadTFxYQpSWVBqSU9oUzFKY2tyUmVydmV3RWN2bTBud3lCQnpiQWhHRFFoRCszdjdUS0xRQTBNRytwTXlTWmVSQlJPZnJDCkhWcjh4b04yYVF5bCszWWJKMDBKeGJjYlVkNUl6dWtScmVFeFJnWGRtamRKbEd5N1NMY0NwcVBXQVBXaDUya0MKQXdFQUFhTlRNRkV3SFFZRFZSME9CQllFRkhRUk1oWmJZVlB4WjlVSUxWU3NGS3dzZ3lUZk1COEdBMVVkSXdRWQpNQmFBRkhRUk1oWmJZVlB4WjlVSUxWU3NGS3dzZ3lUZk1BOEdBMVVkRXdFQi93UUZNQU1CQWY4d0RRWUpLb1pJCmh2Y05BUUVMQlFBRGdnRUJBREFRbTNhaEIxSEVTVWU0dEhiTENKaGN5TG1BZFEzdnNLcSs4Lzk3RGo3Q2cyWUEKam5ucHZJUEpVcWdtalcrYm9JL0N1WmRIN1hjZlNKWlVKdG9TbXFmMk1RQU9QOVEwZXd3aEdlKzhSNmRjYllzNwpHc3crV3drYWRoZ2R1N3JYdTBTSFNBZ0lndWVNVmNyY20xL094cEdhbzlGYWsvWEQyUDV1S3F0N0tsR0dNZVJHCjN6STduc0tQblFMTjRlcjhTR3I5SkJYTEszTGhrNHlGN0Znb05UeXBKVFVlWTRTMEN1bFpsTEhMT3MySXlWTGEKenZ0YUJ4T2l0ZjYycm5ZdjBTWWgxYWkvQXBVZ1NMSjZYd2VMK0xMUmgxSGhQR2hDSnlUUUtobFFxTnRuNmJQdQp1YnVMc2pmcm5VOEZJVHdxZDRiYkFMS3RSY1lNamRZcXYwQkQ4Z2M9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K"
---
# Configure Metricbeat with a ConfigMap: the Elasticsearch address, username and password, plus the Kibana settings, the modules to enable, and the scrape period
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: kube-system
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    metricbeat.autodiscover:
      providers:
        - type: kubernetes
          scope: cluster
          node: ${NODE_NAME}
          unique: true
          templates:
            - config:
                # Scrape kube-state-metrics
                - module: kubernetes
                  hosts: ["kube-state-metrics.kube-system.svc.cluster.local:8080"]
                  period: 10s
                  add_metadata: true
                  metricsets:
                    - state_node
                    - state_deployment
                    - state_daemonset
                    - state_replicaset
                    - state_pod
                    - state_container
                    - state_cronjob
                    - state_resourcequota
                    - state_statefulset
                    - state_service
                - module: kubernetes
                  metricsets:
                    - apiserver
                  hosts: ["https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}"]
                  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
                  ssl.certificate_authorities:
                    - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  period: 30s
                # Uncomment the lines below to collect Kubernetes events:
                #- module: kubernetes
                #  metricsets:
                #    - event
        # To enable hints based autodiscover uncomment this:
        #- type: kubernetes
        #  node: ${NODE_NAME}
        #  hints.enabled: true

    processors:
      - add_cloud_metadata:

    #cloud.id: ${ELASTIC_CLOUD_ID}
    #cloud.auth: ${ELASTIC_CLOUD_AUTH}

    output.elasticsearch:
      hosts: ['https://${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      # SSL: remove the three settings below if SSL is not needed
      protocol: "https"
      ssl.enabled: true
      ssl.certificate_authorities:
        - /usr/share/metricbeat/ssl/ca.crt
    setup.kibana:
      host: 'https://${KIBANA_HOST:kibana}:${KIBANA_PORT:5601}'
      username: ${ELASTICSEARCH_USERNAME}
      password: ${ELASTICSEARCH_PASSWORD}
      # SSL: remove the three settings below if SSL is not needed
      protocol: "https"
      ssl.enabled: true
      ssl.certificate_authorities:
        - /usr/share/metricbeat/ssl/ca.crt
    # Import the prebuilt dashboards
    setup.dashboards:
      enabled: true
    # Configure the index lifecycle
    setup.ilm:
      policy_file: /etc/indice-lifecycle.json
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: kube-system
  labels:
    k8s-app: metricbeat
data:
  system.yml: |-
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        #- core
        #- diskio
        #- socket
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory

    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
  kubernetes.yml: |-
    # Scrape kubelet metrics
    - module: kubernetes
      metricsets:
        - node
        - system
        - pod
        - container
        - volume
      period: 10s
      host: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: "none"
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      #ssl.certificate_authorities:
        #- /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    # Currently `proxy` metricset is not supported on Openshift, comment out section
    - module: kubernetes
      metricsets:
        - proxy
      period: 10s
      host: ${NODE_NAME}
      hosts: ["localhost:10249"]
---
# Deploy a Metricbeat instance per node for node metrics retrieval
# DaemonSet resource manifest
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: kube-system
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:7.14.2
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: KIBANA_HOST
          value: "xxxx"
        - name: KIBANA_PORT
          value: "5601"
        - name: ELASTICSEARCH_HOST
          value: "xxxx"
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: "elastic"
        - name: ELASTICSEARCH_PASSWORD
          value: "xxxx"
        # - name: ELASTIC_CLOUD_ID
        #   value:
        # - name: ELASTIC_CLOUD_AUTH
        #   value:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
          # If using Red Hat OpenShift uncomment this:
          #privileged: true
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: indice-lifecycle
          mountPath: /etc/indice-lifecycle.json
          readOnly: true
          subPath: indice-lifecycle.json
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: secret
          mountPath: /usr/share/metricbeat/ssl/ca.crt
          # subPath is required; without it the Beat exits with: Exiting: error initializing publisher: 1 error: read /usr/share/filebeat/ssl/ca.crt: is a directory reading /usr/share/filebeat/ssl/ca.crt
          subPath: ca.crt
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: indice-lifecycle
        configMap:
          defaultMode: 0600
          name: metricbeat-indice-lifecycle
      - name: secret
        secret:
          defaultMode: 0640
          secretName: metricbeat-ca
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: kube-system
roleRef:
  kind: Role
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metricbeat-kubeadm-config
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: metricbeat
  namespace: kube-system
roleRef:
  kind: Role
  name: metricbeat-kubeadm-config
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - nodes
  - namespaces
  - events
  - pods
  - services
  verbs: ["get", "list", "watch"]
# Enable this rule only if planning to use Kubernetes keystore
#- apiGroups: [""]
#  resources:
#  - secrets
#  verbs: ["get"]
- apiGroups: ["extensions"]
  resources:
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources:
  - statefulsets
  - deployments
  - replicasets
  verbs: ["get", "list", "watch"]
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat
  # should be the namespace where metricbeat is running
  namespace: kube-system
  labels:
    k8s-app: metricbeat
rules:
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metricbeat-kubeadm-config
  namespace: kube-system
  labels:
    k8s-app: metricbeat
rules:
- apiGroups: [""]
  resources:
  - configmaps
  resourceNames:
  - kubeadm-config
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: kube-system
  labels:
    k8s-app: metricbeat
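
The lifecycle policy in the metricbeat-indice-lifecycle ConfigMap of the manifest can be read as two simple checks: roll over when the index is large enough or old enough, and delete once enough time has passed. The Python sketch below is a simplified model of that logic under stated assumptions, not how Elasticsearch implements it. The helper names (`to_bytes`, `to_hours`, `should_rollover`, `should_delete`) are made up for illustration, and the delete phase's `min_age` is taken to be measured from the time of rollover.

```python
import json

# Same policy as in the ConfigMap, inlined so the sketch is self-contained.
POLICY = json.loads("""
{"policy": {"phases": {
  "hot": {"actions": {"rollover": {"max_size": "5GB", "max_age": "1d"}}},
  "delete": {"min_age": "3d", "actions": {"delete": {}}}
}}}
""")

UNIT_BYTES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
UNIT_HOURS = {"h": 1, "d": 24}


def to_bytes(size):
    """Convert a size such as '5GB' to bytes (binary units assumed)."""
    return int(size[:-2]) * UNIT_BYTES[size[-2:].upper()]


def to_hours(age):
    """Convert an age such as '1d' or '12h' to hours."""
    return int(age[:-1]) * UNIT_HOURS[age[-1]]


def should_rollover(policy, size_bytes, age_hours):
    """Roll over when either the size or the age condition is met."""
    cond = policy["policy"]["phases"]["hot"]["actions"]["rollover"]
    return (size_bytes >= to_bytes(cond["max_size"])
            or age_hours >= to_hours(cond["max_age"]))


def should_delete(policy, hours_since_rollover):
    """Delete once min_age has elapsed (assumed to count from rollover)."""
    min_age = policy["policy"]["phases"]["delete"]["min_age"]
    return hours_since_rollover >= to_hours(min_age)


print(should_rollover(POLICY, 6 * 1024 ** 3, 2))  # True: larger than 5GB
print(should_rollover(POLICY, 1024 ** 3, 2))      # False: small and young
print(should_delete(POLICY, 72))                  # True: 3 days have passed
```

With roughly one rollover per day and deletion three days after rollover, only a few days of indices exist at any time, which is what keeps the storage footprint bounded.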

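Viewing dashboards

Once the Metricbeat Pods are Running, the corresponding monitoring data can be viewed in Kibana.

In the Kibana left-hand menu, go to Observability → Metrics to open the metrics monitoring page, where some monitoring data should already be visible.

You can also filter as needed; for example, you can group by Kubernetes Namespace as the view for the monitoring data.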
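
Conceptually, grouping by namespace is an aggregation over the `kubernetes.namespace` field of the collected events. The Python sketch below shows that idea on a handful of made-up documents shaped like events from the `pod` metricset. The sample values and the function name `cpu_by_namespace` are assumptions for illustration, and Kibana performs the equivalent grouping inside Elasticsearch rather than in client code.

```python
from collections import defaultdict

# Hypothetical events, reduced to the fields used in this sketch.
events = [
    {"kubernetes": {"namespace": "kube-system",
                    "pod": {"name": "coredns-abc",
                            "cpu": {"usage": {"nanocores": 2_000_000}}}}},
    {"kubernetes": {"namespace": "kube-system",
                    "pod": {"name": "kube-proxy-xyz",
                            "cpu": {"usage": {"nanocores": 1_000_000}}}}},
    {"kubernetes": {"namespace": "default",
                    "pod": {"name": "web-1",
                            "cpu": {"usage": {"nanocores": 5_000_000}}}}},
]


def cpu_by_namespace(docs):
    """Sum Pod CPU usage (nanocores) per Kubernetes namespace."""
    totals = defaultdict(int)
    for doc in docs:
        k8s = doc["kubernetes"]
        totals[k8s["namespace"]] += k8s["pod"]["cpu"]["usage"]["nanocores"]
    return dict(totals)


print(cpu_by_namespace(events))
# {'kube-system': 3000000, 'default': 5000000}
```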

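If setup.dashboards.enabled is set to true in the configuration file, Metricbeat loads a set of prebuilt dashboards into Kibana.

For example, to see information about the cluster nodes, open the [Metricbeat Kubernetes] Overview ECS dashboard.

Common kube-state-metrics metrics

See the official documentation.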