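Requirements

Use Filebeat to collect JSON-formatted logs from a log file, send them to Elasticsearch (ES), and have ES show each field of the JSON log together with its value.

Problem 1: How to collect JSON-formatted logs

Add the corresponding configuration to filebeat.yml: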

- type: log
  enabled: true

  paths:
    - E:\testjson.log
  processors:
    - script:
        lang: javascript
        source: >
          function process(event) {
              var message = event.Get("message");
              message = message.replace(/\\x22/g,'"');
              message = message.replace(/\,-/g,'');
              event.Put("message", message.trim());
          }
    - decode_json_fields:
        fields: ["message"]
        # Whether to also decode arrays
        process_array: false
        max_depth: 1
        # Key under which the decoded fields are placed (an empty string puts them at the root)
        target: ""
        overwrite_keys: false
        add_error_key: true
  # Use either `decode_json_fields` or `json`, not both
  #json:
  #  # Defaults to false, in which case the decoded JSON is placed under the "json" key. Set to true to put all keys at the root
  #  keys_under_root: true
  #  # Whether to overwrite existing keys. This is the key setting: with keys_under_root set to true, setting overwrite_keys to true as well lets the decoded values replace Filebeat's default keys on conflict
  #  overwrite_keys: true
  #  # On a JSON decoding failure, add "error.message" and "error.type: json" keys to the event
  #  add_error_key: true
  #  # JSON key on which line filtering and multiline settings are applied. It must be at the top level of the JSON object and its value must be a string
  #  message_key: message

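For the details of each option, see the json section of the Filebeat documentation.
Note that each JSON log entry must occupy exactly one line. An entry that is broken across several lines fails to parse!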
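To make the processor chain above easier to follow, here is a minimal Python sketch of what it does to a single event. It is only an approximation for reasoning about the config, not Filebeat's implementation: the function names `clean_message` and `decode_into_root` and the sample line are hypothetical, and the merge rule (existing keys are kept when `overwrite_keys` is false) is my reading of the option.

```python
import json


def clean_message(message: str) -> str:
    """Mirror the script processor: turn literal \\x22 into quotes, drop ',-'."""
    message = message.replace("\\x22", '"')  # the four characters \x22 -> "
    message = message.replace(",-", "")
    return message.strip()


def decode_into_root(event: dict, field: str = "message",
                     overwrite_keys: bool = False) -> dict:
    """Approximate decode_json_fields with target: "" (assumed behaviour)."""
    out = dict(event)
    try:
        decoded = json.loads(event[field])
    except json.JSONDecodeError as exc:
        # Comparable in spirit to add_error_key: true
        out["error"] = {"message": str(exc), "type": "json"}
        return out
    if not isinstance(decoded, dict):
        return out  # only JSON objects can be merged into the root
    for key, value in decoded.items():
        if overwrite_keys or key not in out:
            out[key] = value
    return out


if __name__ == "__main__":
    # Hypothetical raw line with escaped quotes and a trailing ",-"
    raw = r'{\x22level\x22: \x22INFO\x22, \x22type\x22: \x22CsbTest\x22},-'
    event = {"message": clean_message(raw), "type": "log"}
    print(decode_into_root(event))
```

With `overwrite_keys: false`, a key that already exists on the event (here `type`) keeps its original value, which is why the `json` variant in the config sets `overwrite_keys: true`.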

Problem: JSON parsing fails with "Error decoding JSON: EOF" and "key not found"

The errors look like this:

2020-12-14T16:01:50.789+0800    ERROR   [reader_json]   readjson/json.go:57     Error decoding JSON: EOF
2020-12-14T16:01:50.789+0800 DEBUG [processors] processing/processors.go:128 Fail to apply processor global{timestamp=[field=start_time, target_field=@timestamp, timezone=Asia/Shanghai, layouts=[2006-01-02 15:04:05.999]], drop_fields={"Fields":["log","host","input","agent","ecs","start_time"],"IgnoreMissing":false}}: failed to get time field start_time: key not found

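Cause
At first I got the errors above: only the first log entry was shipped, and every later entry failed with Error decoding JSON: EOF. With debug logging enabled, the first field of the JSON log was also reported as "key not found".
In my case the cause was that each JSON log entry had to end with a comma ",", whereas my log only separated entries with a line break.
Solution
Add a comma after the closing brace of each JSON log entry.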

{JSON log entry},
{JSON log entry},
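Before pointing Filebeat at a file, it can help to check the file itself. The sketch below is a hypothetical helper (`find_bad_lines` is not part of Filebeat) that reports lines which are empty, are not valid JSON, or are not JSON objects. It tolerates the trailing comma used in this article. Treating blank lines as a problem is my own assumption: a blank line contains no JSON value at all, so it is a plausible trigger for an EOF-style decoding error.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, reason) for every line that is not one JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            problems.append((number, "empty line"))
            continue
        # The article's format ends each entry with a comma; strip it before parsing.
        candidate = text[:-1] if text.endswith(",") else text
        try:
            value = json.loads(candidate)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(value, dict):
            problems.append((number, "not a JSON object"))
    return problems


if __name__ == "__main__":
    sample = [
        '{"start_time": "2020-12-13 10:37:01.072", "level": "INFO"},',
        "",
        '{"start_time": "2020-12-13 10:37:01.072",',
    ]
    for number, reason in find_bad_lines(sample):
        print(f"line {number}: {reason}")
```

To check a real file, pass an open file object instead of the sample list, since iterating over a file yields its lines.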

Problem 2: How to send the logs to Elasticsearch

Set output.elasticsearch in filebeat.yml:

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["ip:9200"]

With this configuration, logs are stored in ES under the default index "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}".

The target index can be changed with the index option:

output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["IP:9200"]
  # Custom index; the %{+yyyy.MM.dd} part is taken from the @timestamp field
  index: "test-%{[agent.version]}-normal-%{+yyyy.MM.dd}"
setup.template.overwrite: true
# The following two settings are required when the index is changed
setup.template.name: "test"
setup.template.pattern: "test-*"
setup.ilm.enabled: false
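To check by hand which index an event should land in, the index pattern can be expanded as in the sketch below. The function `expand_index` is hypothetical, and it assumes that the date part is formatted from @timestamp in UTC, so an event logged shortly after midnight in UTC+8 goes to the previous day's index.

```python
from datetime import datetime, timedelta, timezone


def expand_index(agent_version: str, timestamp: datetime,
                 prefix: str = "test", suffix: str = "normal") -> str:
    """Expand "<prefix>-%{[agent.version]}-<suffix>-%{+yyyy.MM.dd}" for one event."""
    utc = timestamp.astimezone(timezone.utc)  # assumption: the date is taken in UTC
    return f"{prefix}-{agent_version}-{suffix}-{utc:%Y.%m.%d}"


if __name__ == "__main__":
    cst = timezone(timedelta(hours=8))  # fixed-offset stand-in for Asia/Shanghai
    early_morning = datetime(2020, 12, 14, 7, 0, 0, tzinfo=cst)
    print(expand_index("7.10.0", early_morning))
```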

The index can also be configured dynamically, as described in the official documentation.

# The following error may occur
failed to publish events: temporary bulk send failure

Problem 3: How to replace @timestamp with my own timestamp

Use the timestamp processor together with the drop_fields processor:

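processors:
  - timestamp:
      # Parse the value of this field and write it to @timestamp
      field: start_time
      # Interpret the log time as China Standard Time (UTC+8). Filebeat defaults to UTC, so with this setting the stored @timestamp is the log time minus 8 hours
      timezone: Asia/Shanghai
      layouts:
        # See the documentation for the date layout format
        - '2006-01-02 15:04:05.999'
      test:
        # test: sample date values that the layouts must parse successfully
        - '2019-06-22 16:33:51.111'
  - drop_fields:
      fields: ["log","host","input","agent","ecs","start_time"]
      ignore_missing: false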

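The log time is extracted from the start_time field and written to @timestamp, and the start_time field is then removed by drop_fields.

The complete configuration file is as follows: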
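The time conversion can be reproduced outside Filebeat to see what value to expect in @timestamp. The sketch below is an approximation under stated assumptions: `to_es_timestamp` and `apply_timestamp_and_drop` are hypothetical names, a fixed UTC+8 offset stands in for Asia/Shanghai, and Python's `%f` requires the fractional part to be present, whereas the `.999` in the Go layout makes it optional.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for Asia/Shanghai (UTC+8, no daylight saving time).
CST = timezone(timedelta(hours=8))

DROP_FIELDS = ("log", "host", "input", "agent", "ecs", "start_time")


def to_es_timestamp(start_time: str) -> str:
    """Parse 'YYYY-MM-DD HH:MM:SS.mmm' as UTC+8 and render it as UTC."""
    local = datetime.strptime(start_time, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=CST)
    utc = local.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def apply_timestamp_and_drop(event: dict) -> dict:
    """Set @timestamp from start_time, then remove the dropped fields."""
    out = {key: value for key, value in event.items() if key not in DROP_FIELDS}
    out["@timestamp"] = to_es_timestamp(event["start_time"])
    return out


if __name__ == "__main__":
    event = {
        "start_time": "2020-12-13 10:37:01.072",
        "level": "INFO",
        "host": {"name": "example-host"},  # hypothetical metadata field
    }
    print(apply_timestamp_and_drop(event))
```

The 8-hour shift is expected: ES stores @timestamp in UTC, and the time is converted back to the viewer's zone only when it is displayed.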

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - E:\testjson.log
  json:
    keys_under_root: true
    overwrite_keys: true
    add_error_key: true
    message_key: message
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["192.168.73.101:9200"]
processors:
  - timestamp:
      # Parse the value of this field and write it to @timestamp
      field: start_time
      # Interpret the log time as China Standard Time (UTC+8)
      timezone: Asia/Shanghai
      layouts:
        - '2006-01-02 15:04:05.999'
      test:
        - '2019-06-22 16:33:51.111'

  - drop_fields:
      fields: ["log","host","input","agent","ecs","start_time"]
      ignore_missing: false

Final result

Original log:

{"start_time": "2020-12-13 10:37:01.072","type": "CsbTest","level": "INFO","message":"数据1","parameter":["6677"]},
{"start_time": "2020-12-13 10:37:01.072","type": "CsbTest","level": "INFO","message":"数据2","parameter":["12121"]},

Log as stored in ES:

{
  "_index" : "filebeat-7.10.0-2020.12.14-000001",
  "_type" : "_doc",
  "_id" : "8d1FYHYBjbaP5OPrGcZL",
  "_score" : 1.0,
  "_source" : {
    "@timestamp" : "2020-12-14T02:37:01.072Z",
    "message" : "数据2",
    "parameter" : [
      "6677"
    ],
    "type" : "CsbTest",
    "level" : "INFO",
    "fields" : {
      "log_type" : "normal"
    }
  }
}

References

https://www.elastic.co/guide/index.html
https://www.elastic.co/guide/en/beats/filebeat/7.9/processor-timestamp.html
https://blog.csdn.net/yk20091201/article/details/90756738
https://blog.csdn.net/qq_27818541/article/details/108063235
https://www.cnblogs.com/manhoo/archive/2009/06/25/1511066.html