Filebeat Log Collection in Practice


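Introduction to Filebeat

What is Filebeat

Filebeat is a lightweight shipper for forwarding and centralizing log data.
Filebeat monitors the log file paths you specify, collects log events, and forwards them to a destination such as Elasticsearch, Logstash, Redis, or Kafka.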


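Main components of Filebeat

Filebeat has two main components, inputs and harvesters, which work together to tail files and send the newest data to the output.
- 1.Input: manages the harvesters and finds all readable sources under the configured paths.
- 2.Harvester: reads the content of a single file line by line and sends it to the output.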

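How Filebeat works

When Filebeat starts, each input looks at the log paths you specified and starts one harvester for every log file it finds. Each harvester reads the new content of a single log file and sends the new log data to the spooler. The spooler aggregates these events, and Filebeat finally sends the aggregated data to the output you configured.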
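The flow described above can be sketched in a few lines of Python. This is a simplified illustration and not Filebeat's actual implementation: the function names `harvest` and `spool`, the batch size, and the temporary file used in the demo are all assumptions made for this example.

```python
import os
import tempfile


def harvest(path, offset=0):
    """Read the complete lines added to `path` after byte `offset`.

    Returns (events, new_offset). Each event mimics the `message` and
    `log.offset` fields shown in the console output of this article.
    """
    events = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            start = f.tell()
            line = f.readline()
            if not line.endswith(b"\n"):
                # End of file or an unfinished line: stop here and let the
                # next call resume from `start`.
                return events, start
            events.append({
                "message": line.rstrip(b"\r\n").decode("utf-8"),
                "log": {"offset": start, "file": {"path": path}},
            })


def spool(events, batch_size=2):
    """Group events into batches before handing them to the output."""
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "test.log")
        with open(log, "wb") as f:
            f.write(b"hello world\nhello thursday\n")
        events, offset = harvest(log)
        for batch in spool(events):
            print([event["message"] for event in batch])
```

Returning the new offset is what lets a second call pick up only the lines appended since the previous read, which is the "tail" behaviour the harvester provides.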


Filebeat configuration notes


Basic usage of Filebeat

Installing Filebeat

rpm -ivh filebeat-7.8.1-x86_64.rpm
systemctl enable filebeat
systemctl start filebeat

Configuring Filebeat

Configure Filebeat to read from the terminal (stdin) and write to the terminal (console):

cat /etc/filebeat/test.yml
filebeat.inputs:
- type: stdin
  enabled: true
output.console:
  pretty: true
  enabled: true

Enter a test line:

filebeat -e -c test.yml
hello world

# Output
{
  "@timestamp": "2024-08-01T00:34:36.868Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.8.1"
  },
  "message": "hello world",
  "log": {
    "offset": 0,
    "file": {
      "path": ""
    }
  },
  "input": {
    "type": "stdin"
  },
  "agent": {
    "hostname": "yj",
    "ephemeral_id": "00dc0753-d9e0-467c-a518-faec28803329",
    "id": "66d89ba9-f03d-4350-a06a-6d836d20b525",
    "name": "yj",
    "type": "filebeat",
    "version": "7.8.1"
  },
  "ecs": {
    "version": "1.5.0"
  },
  "host": {
    "name": "yj"
  }
}

Reading from a file with Filebeat

Configure Filebeat to read data from a file:

cat /etc/filebeat/test2.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/test.log
output.console:
  pretty: true
  enabled: true

# Create the log file and append data to it
touch /var/log/test.log
echo "hello thursday" >> /var/log/test.log

# Output
{
  "@timestamp": "2024-08-01T00:44:21.269Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.8.1"
  },
  "agent": {
    "ephemeral_id": "6435c813-fd58-401c-997e-7b274f4ab81c",
    "id": "66d89ba9-f03d-4350-a06a-6d836d20b525",
    "name": "yj",
    "type": "filebeat",
    "version": "7.8.1",
    "hostname": "yj"
  },
  "log": {
    "offset": 0,
    "file": {
      "path": "/var/log/test.log"
    }
  },
  "message": "hello thursday",
  "input": {
    "type": "log"
  },
  "ecs": {
    "version": "1.5.0"
  },
  "host": {
    "name": "yj"
  }
}

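Sending Filebeat output to an ES cluster

Send the collected content to Elasticsearch:

cat /etc/filebeat/test3.yml
filebeat.inputs:
- type: log  # input type
  enabled: true  # enable this input
  paths:  # paths of the log files to collect
    - /var/log/test.log
output.elasticsearch:  # send events to Elasticsearch
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]  # IPs and ports of the ES cluster nodes

# If index is not set, the index name defaults to filebeat-*

# Append data to the monitored log file
echo "hello thursday" >> /var/log/test.log

# Check the result in Kibana
GET /filebeat-7.8.1-2024.08.01-000001/_search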

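Customizing the Filebeat index name

By default, Filebeat writes to ES indices named filebeat-*. To change the index name:
- 1.Edit the Filebeat configuration file.
- 2.Delete the existing index in ES and the matching index pattern in Kibana.
- 3.Restart the Filebeat service so that a new index is created.

cat /etc/filebeat/test3.yml
filebeat.inputs:
- type: log  # input type
  enabled: true  # enable this input
  paths:  # paths of the log files to collect
    - /var/log/test.log

output.elasticsearch:  # send events to Elasticsearch
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]  # IPs and ports of the ES cluster nodes
  index: "wordpress-system-%{[agent.version]}-%{+yyyy.MM.dd}"  # custom index name

setup.ilm.enabled: false  # ILM (index lifecycle management) is on by default; while it is on, the custom index setting is ignored and the index name stays filebeat-*
setup.template.name: "wordpress-system"  # template name
setup.template.pattern: "wordpress-system-*"  # index pattern the template applies to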

systemctl restart filebeat
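To see which index name the `index` setting above produces, here is a minimal Python sketch of the expansion. It only illustrates the format string and is not Filebeat's code: `expand_index` is a hypothetical helper, it handles just the two placeholders used in this article, and it assumes the date part is taken from the event timestamp in UTC.

```python
from datetime import datetime, timezone


def expand_index(pattern, agent_version, timestamp):
    """Expand the two placeholders used in this article's index setting."""
    # Assumption: the date is derived from the event timestamp in UTC.
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return (pattern
            .replace("%{[agent.version]}", agent_version)
            .replace("%{+yyyy.MM.dd}", day))


if __name__ == "__main__":
    ts = datetime(2024, 8, 1, 0, 44, 21, tzinfo=timezone.utc)
    pattern = "wordpress-system-%{[agent.version]}-%{+yyyy.MM.dd}"
    print(expand_index(pattern, "7.8.1", ts))
    # prints: wordpress-system-7.8.1-2024.08.01
```

Because the date is part of the name, one index is created per day, and all of them match the template pattern wordpress-system-*.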

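By default, an index that Filebeat writes to ES has 1 primary shard. The shard count can be changed in either of two ways:

Edit the Filebeat configuration file and add the following settings; then delete the index template and the index in cerebro and generate the data again.
```
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 1
```
Or use the cerebro web page:
- 1.Edit the template's settings and adjust the number of shards and replicas.
- 2.Delete the indices associated with the template.
- 3.Restart Filebeat so that a new index is created.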

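Collecting system logs with Filebeat

What counts as system logs

"System logs" is a broad term. It usually refers to logs such as messages, secure, cron, dmesg, ssh, and boot.

Approach to collecting system logs

A system has many log files, and configuring collection for each one separately is tedious.
It is easier to manage them in one central place:
use rsyslog to write all local log types into the single file /var/log/system.log,
and then collect that one file with Filebeat.

System log collection architecture

rsyslog+filebeat --> elasticsearch <-- kibana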


System log collection walkthrough

Environment

Hostname	IP address
wordpress(rsyslog)	192.168.99.1
es-node1	192.168.99.11
es-node2	192.168.99.12
es-node3	192.168.99.13

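Configuring rsyslog

Install rsyslog:
yum install rsyslog

Configure rsyslog:

vim /etc/rsyslog.conf
# How the logs are collected
# Option 1 (commented out): send all local logs over the network to a remote server
#*.* @@remote-host:514
# Option 2: write all local logs to the local file /var/log/system.log
*.*     /var/log/system.log

Restart rsyslog, then test:

systemctl restart rsyslog.service

# Test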
logger "rsyslog test from me"
grep "test" /var/log/system.log
Aug  4 21:26:10 wordpress root: rsyslog test from me

Configuring Filebeat

Edit the Filebeat configuration file so that the local /var/log/system.log is shipped to the Elasticsearch cluster:

vim /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/system.log

output.elasticsearch:
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]
  index: "wordpress-system-%{[agent.version]}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "wordpress-system"
setup.template.pattern: "wordpress-system-*"

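Restart Filebeat:
systemctl restart filebeat

Configuring Kibana

Configure Kibana to read the data from the Elasticsearch index and display it:

In Kibana, create the index pattern wordpress-system*
Use wordpress-system* as the index filter name
Open Discover in Kibana to view the log data in the index.

Tuning Filebeat

The results in Kibana contain many DEBUG and INFO messages that do not need to be collected. We can narrow the collection to log lines related to warnings (WARN), errors (ERR), and sshd.

Change the Filebeat configuration file as follows: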

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/system.log
  include_lines: ["ERR","WARN","sshd"]  # keep lines matching ERR, WARN, or sshd
  exclude_lines: ["DEBUG"]  # drop lines matching DEBUG

output.elasticsearch:
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]
  index: "wordpress-system-%{[agent.version]}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "wordpress-system"
setup.template.pattern: "wordpress-system-*"

# Restart Filebeat
systemctl restart filebeat

Delete the index in ES and the index pattern in Kibana, then let the index be generated again.
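The effect of include_lines and exclude_lines can be approximated with the short Python sketch below. It is an illustration only: `keep_line` is a hypothetical helper, the sample log lines are made up, and Python's `re` stands in for the regular expression engine Filebeat uses. It follows the documented order in which include_lines is applied first and exclude_lines second.

```python
import re


def keep_line(line, include_lines=(), exclude_lines=()):
    """Return True if the line would be kept.

    include_lines is applied first, then exclude_lines. Each entry is a
    regular expression that may match anywhere in the line.
    """
    if include_lines and not any(re.search(p, line) for p in include_lines):
        return False
    return not any(re.search(p, line) for p in exclude_lines)


if __name__ == "__main__":
    include = ["ERR", "WARN", "sshd"]
    exclude = ["DEBUG"]
    samples = [  # hypothetical log lines
        "Aug  4 21:26:10 wordpress sshd[1021]: Accepted password for root",
        "Aug  4 21:26:11 wordpress app: INFO started",
        "Aug  4 21:26:12 wordpress sshd[1021]: DEBUG session opened",
    ]
    for line in samples:
        print(keep_line(line, include, exclude), line)
```

Note that the patterns are unanchored, so "ERR" also matches lines containing "ERROR".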


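Collecting nginx logs with Filebeat

Why collect nginx logs

We want information about our users, for example the region a source IP comes from, the site's PV and UV, status codes, and access times.
The nginx logs contain this information, so we collect them.

nginx log collection architecture

nginx+filebeat --> elasticsearch <-- kibana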


nginx log collection walkthrough

Configuring Filebeat
Configure Filebeat to collect the nginx logs of the WordPress site hosted on Tencent Cloud:

cat /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log

output.elasticsearch:
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]
  index: "tencentnginx-access-%{[agent.version]}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "tencentnginx"
setup.template.pattern: "tencentnginx-*"

# Restart Filebeat
systemctl restart filebeat

Displaying the data in Kibana
Add an index pattern in Kibana, then view the data.

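Collecting nginx logs in JSON format

The problem

The nginx logs are now collected, but every entry ends up entirely in the message field, which does not support analysis. For example, we may want to:
- count requests per status code;
- sum the traffic generated by all requests;
- break requests down by client (user agent), and so on.
None of these can be done while the whole line sits in a single field.

The solution

Every item in the log line has to be split out into key-value form, which is what JSON provides.
With JSON, the value of each metric can be extracted directly when filtering, which also makes later analysis easier.

Configuring JSON output

Switch the nginx log format to JSON: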

cat /etc/nginx/nginx.conf

http {
    log_format json '{"time_local": "$time_local", '
         '"remote_addr": "$remote_addr", '
         '"referer": "$http_referer", '
         '"request": "$request", '
         '"status": $status, '
         '"bytes": $body_bytes_sent, '
         '"user_agent": "$http_user_agent",'
         '"x_forwarded": "$http_x_forwarded_for",'
         '"up_addr": "$upstream_addr",'
         '"up_host": "$upstream_http_host", '
         '"upstream_time": "$upstream_response_time",'
         '"request_time": "$request_time"'
        ' }';

    access_log  /var/log/nginx/access.log  json;
}
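Once each access-log entry is a JSON object, the statistics listed earlier become simple field lookups. The Python sketch below illustrates this with the `status` and `bytes` keys from the log format above. The helper name `summarize` and the sample lines are hypothetical, and real entries would carry all the other keys as well.

```python
import json
from collections import Counter


def summarize(lines):
    """Count status codes and sum response bytes from JSON access-log lines."""
    statuses = Counter()
    total_bytes = 0
    for line in lines:
        entry = json.loads(line)
        statuses[entry["status"]] += 1
        total_bytes += entry["bytes"]
    return statuses, total_bytes


if __name__ == "__main__":
    sample = [  # hypothetical entries, reduced to two keys
        '{"status": 200, "bytes": 612}',
        '{"status": 404, "bytes": 153}',
        '{"status": 200, "bytes": 612}',
    ]
    statuses, total = summarize(sample)
    print(dict(statuses), total)
```

This works without any string parsing because `$status` and `$body_bytes_sent` are written unquoted in the format, so they decode as numbers.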

Configuring Filebeat
After nginx writes its log in JSON format, the Filebeat configuration file has to be changed as well:

cat /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
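  json.keys_under_root: true  # false (the default) nests the decoded JSON under a "json" key; true puts the decoded keys at the top level of the event
  json.overwrite_keys: true  # on a name conflict, the decoded JSON keys overwrite the fields Filebeat adds by default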

output.elasticsearch:
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]
  index: "tencentnginx-access-%{[agent.version]}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "tencentnginx"
setup.template.pattern: "tencentnginx-*"

# Restart Filebeat and nginx, then empty the log file so that new entries are written in JSON format
systemctl restart filebeat
systemctl restart nginx
> /var/log/nginx/access.log
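The two json.* options used above can be pictured with the following Python sketch. It is a simplified model rather than Filebeat's implementation: `merge_json` is a hypothetical helper, the sample event and log line are made up, and special cases such as `@timestamp` handling and decoding errors are left out.

```python
import json


def merge_json(event, line, keys_under_root=False, overwrite_keys=False):
    """Merge the JSON object decoded from `line` into a copy of `event`."""
    decoded = json.loads(line)
    merged = dict(event)
    if not keys_under_root:
        # Default behaviour: decoded keys are nested under "json".
        merged["json"] = decoded
        return merged
    for key, value in decoded.items():
        if key in merged and not overwrite_keys:
            continue  # keep the field that was already on the event
        merged[key] = value
    return merged


if __name__ == "__main__":
    event = {"host": {"name": "yj"}}              # field added by the shipper
    line = '{"status": 200, "host": "example"}'   # hypothetical log line
    print(merge_json(event, line))
    print(merge_json(event, line, keys_under_root=True))
    print(merge_json(event, line, keys_under_root=True, overwrite_keys=True))
```

With keys_under_root enabled, fields such as `status` can be queried directly in Kibana instead of as `json.status`.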

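Collecting multiple nginx logs

nginx writes both an access log and an error log. How can Filebeat collect both at the same time?
nginx access log --stored in--> nginx-access index
nginx error log --stored in--> nginx-error index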


To collect several logs with Filebeat, mark each input with tags so that the logs can be told apart:

cat /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  json.keys_under_root: true
  json.overwrite_keys: true
  tags: ["nginx-access"]

- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log
  tags: ["nginx-error"]

output.elasticsearch:
  hosts: ["192.168.99.11:9200","192.168.99.12:9200","192.168.99.13:9200"]
  indices:
    - index: "tencentnginx-access-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-access"
    - index: "tencentnginx-error-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        tags: "nginx-error"

setup.ilm.enabled: false  # ILM is on by default; while it is on, the custom index setting is ignored and the index name stays filebeat-*
setup.template.name: "tencentnginx"
setup.template.pattern: "tencentnginx-*"

Restart Filebeat:
systemctl restart filebeat
In Kibana, add an index pattern for the nginx error log, then view the data.
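The tag-based routing configured under indices can be sketched in Python as follows. This is a simplified model under stated assumptions: `pick_index` is a hypothetical helper, the rules are checked in order and the first match wins, `when.contains` is modelled as a substring check against each tag, and the fallback value stands in for the output's default index.

```python
def pick_index(event, rules, default):
    """Return the index of the first rule whose value is contained in a tag."""
    tags = event.get("tags", [])
    for needle, index in rules:
        if any(needle in tag for tag in tags):
            return index
    return default


if __name__ == "__main__":
    rules = [
        ("nginx-access", "tencentnginx-access-%{[agent.version]}-%{+yyyy.MM.dd}"),
        ("nginx-error", "tencentnginx-error-%{[agent.version]}-%{+yyyy.MM.dd}"),
    ]
    fallback = "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"  # assumed default
    print(pick_index({"tags": ["nginx-error"]}, rules, fallback))
    print(pick_index({"tags": []}, rules, fallback))
```

Events that match no rule fall through to the default index, so an input left without tags would not end up in either nginx index.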
