# logs-docker

**Repository Path**: dabai2015/logs-docker

## Basic Information

- **Project Name**: logs-docker
- **Description**: Kafka+Zookeeper+Logstash+ElasticSearch+Kibana+Filebeat
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 1
- **Created**: 2019-01-23
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

Log Processing Framework
==

### Component Overview

- Filebeat: ![Filebeat](https://www.elastic.co/assets/blt121ead33d4ed1f55/icon-beats-bb.svg)

  > Beats is the platform for single-purpose data shippers. They send data from hundreds or thousands of machines and systems to Logstash or Elasticsearch.
  > Filebeat is a Lightweight Shipper for Logs.
  > Forget using SSH when you have tens, hundreds, or even thousands of servers, virtual machines, and containers generating logs. Filebeat helps you keep the simple things simple by offering a lightweight way to forward and centralize logs and files.

- Logstash: ![Logstash](https://www.elastic.co/assets/blt946bc636d34a70eb/icon-logstash-bb.svg)

  > Centralize, Transform & Stash Your Data.
  > Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite “stash.” (Ours is Elasticsearch, naturally.)

- ElasticSearch: ![ElasticSearch](https://www.elastic.co/assets/blt9a26f88bfbd20eb5/icon-elasticsearch-bb.svg)

  > Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected.
- Kibana: ![Kibana](https://www.elastic.co/assets/blt282ae2420e32fc38/icon-kibana-bb.svg)

  > Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack, so you can do anything from learning why you're getting paged at 2:00 a.m. to understanding the impact rain might have on your quarterly numbers.

- Kafka: ![Kafka](http://kafka.apache.org/images/logo.png)

  > Apache Kafka® is a distributed streaming platform. What exactly does that mean?
  > A streaming platform has three key capabilities:
  > - Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
  > - Store streams of records in a fault-tolerant durable way.
  > - Process streams of records as they occur.
  >
  > Kafka is generally used for two broad classes of applications:
  > - Building real-time streaming data pipelines that reliably get data between systems or applications
  > - Building real-time streaming applications that transform or react to the streams of data

### Solution

For systems that are not too complex, the setup below is enough for log collection. Two points are worth noting:

1. Logstash can both collect logs and clean data; in other words, it already covers what Filebeat does. So why use Filebeat at all? Because Filebeat is very lightweight: when you have many web servers, you would rather not install a heavyweight tool that does both collection and cleansing (Logstash) on every one of them. Instead, you install a small agent that only collects (Filebeat).
2. Logstash can receive data from many endpoints at once. When concurrency is moderate and the cluster has only a few nodes, you do not need a message queue in the middle to act as a relay.

![log](doc/log.png)

The setup below is for high concurrency and many nodes. The only difference from the one above is the message queue in the middle, which buffers messages so the pipeline is not overwhelmed and the system stays resilient.

![log](doc/logs.png)

### Installation Steps

- ### Filebeat

[Official guide](https://www.elastic.co/guide/en/beats/filebeat/6.4/filebeat-getting-started.html)

- Download and extract; no further installation is needed:

```
wget -O filebeat-6.5.4-linux-x86_64.tar.gz https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.5.4-linux-x86_64.tar.gz
tar -zxf filebeat-6.5.4-linux-x86_64.tar.gz
cd filebeat-6.5.4-linux-x86_64
```

- Configuration

Filebeat's default configuration file is filebeat.yml. We mainly care about the inputs section and the output section.
Edit the inputs section following the example below:

```
#=========================== Filebeat inputs =============================
filebeat.inputs:
- type: log
  enabled: true           ## Toggle; set to false to stop monitoring without having to add/remove the whole entry.
  paths:
    - /root/setup-estandard/estandard/app-log/app.log  ## Path of the logs to collect; wildcards are supported.
  fields:
    document_type: tomcat ## Custom field (any name works) used to distinguish log types.

- type: log
  enabled: true
  paths:
    - /var/log/nginx/*log ## Path of the logs to collect; wildcards are supported.
  fields:
    document_type: nginx  ## Custom field (any name works) used to distinguish log types.
```

Edit the output section following the example below. As the configuration shows, Filebeat supports several output channels, but only one of them may be enabled at a time.

```
#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]
  enabled: false ## Only one output may be enabled at a time.

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]
  enabled: false ## Only one output may be enabled at a time.

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#------------------------------- Kafka output ----------------------------------
output.kafka:
  enabled: true ## Only one output may be enabled at a time.
  hosts: ["localhost:9092"]
  topic: test
```

- Start Filebeat

Do not start Filebeat until the components that receive its data are up and running.

```
./filebeat -e -c filebeat.yml -d "publish"

# Start in the background
nohup ./filebeat -e -c filebeat.yml -d "publish" >log.log &
```

- ### Kafka

- See the [Kafka® guide](kafka.md). (That guide only guarantees a single-machine setup works; accessing Kafka on one machine from another will not, which is why we ultimately did not use Kafka.)
- See also the [official guide](http://kafka.apache.org/quickstart); Apache products all share a similar style (much like Tomcat).

- ### ElasticSearch + Kibana + Logstash

- Clone the full contents of this guide with git.
- Run

```
docker-compose up -d
```

to bring up ElasticSearch + Kibana + Logstash automatically.

- Run

```
docker-compose down
```

to stop it.

- Verify at [http://localhost:5601](http://localhost:5601)

> Note: when starting ElasticSearch we map its data directory onto the host, which raises read/write permission issues. Before starting, you must open up the permissions with chmod 777.

- ### Logstash

Edit logstash.conf following the example below.

```
input {
    # When collecting from Filebeat directly, enable this block and comment out the kafka block below.
    # beats {
    #     port => 5044
    # }
    kafka {
        bootstrap_servers => "192.168.1.207:9092"
        topics => ["test"]
        codec => "json"
        consumer_threads => 2
        enable_auto_commit => true
        auto_commit_interval_ms => "1000"
    }
}

output {
    # When reading directly from Beats, the index name can instead be built
    # from Beat metadata:
    # elasticsearch {
    #     hosts => ["http://elasticsearch:9200"]
    #     index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
    # }
    elasticsearch {
        hosts => ["http://elasticsearch:9200"]
        index => "%{[fields][document_type]}-%{+YYYY.MM.dd}"
    }
}
```

- ### Logstash installation

- Extract and use:

```
wget -O logstash-6.4.3.tar.gz https://artifacts.elastic.co/downloads/logstash/logstash-6.4.3.tar.gz
tar -zxf logstash-6.4.3.tar.gz
```

- Start with a specified config file:

```
bin/logstash -f logstash.conf
```

- ### ZooKeeper installation (optional; Kafka ships with a built-in ZooKeeper)

[Installation guide](zookeeper.md)
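The repository ships its own docker-compose.yml, which `docker-compose up -d` uses to wire ElasticSearch, Kibana, and Logstash together. The sketch below is only an illustration of how such a file could look; the image versions, port mappings, and the `./esdata` and `./logstash.conf` host paths are assumptions, not the repository's actual file:

```yaml
version: "2"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.4.3
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
    volumes:
      - ./esdata:/usr/share/elasticsearch/data   # host-mapped data dir; needs write permission
  kibana:
    image: docker.elastic.co/kibana/kibana:6.4.3
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
  logstash:
    image: docker.elastic.co/logstash/logstash:6.4.3
    ports:
      - "5044:5044"
    volumes:
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    depends_on:
      - elasticsearch
```

The `hosts => ["http://elasticsearch:9200"]` entry in logstash.conf resolves because Compose puts all three services on one network where each is reachable by its service name.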
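As the note above says, the host-mapped ElasticSearch data directory must be writable before the containers start. A minimal sketch, assuming the data volume is mapped to a `./esdata` directory (substitute whatever host path your docker-compose.yml actually maps):

```shell
# Create the host directory that will be mapped into the Elasticsearch
# container and open up its permissions, as required by the note above.
mkdir -p ./esdata
chmod 777 ./esdata
ls -ld ./esdata
```

Run this once before the first `docker-compose up -d`; otherwise Elasticsearch exits on startup with an access-denied error on its data path.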
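To smoke-test the whole pipeline, append a recognizable line to one of the log files Filebeat watches and then look for it in Kibana (or in the ElasticSearch indices). The `./app.log` path here is a stand-in; in a real run, point it at one of the paths configured in filebeat.yml:

```shell
# Append a recognizable test line to a watched log file (stand-in path;
# use a path from the filebeat.inputs section in a real run).
LOG=./app.log
echo "$(date '+%Y-%m-%d %H:%M:%S') INFO pipeline smoke-test line" >> "$LOG"
tail -n 1 "$LOG"
```

Once Filebeat ships the line through Kafka and Logstash, it should land in an index named after its `document_type` field, e.g. `tomcat-YYYY.MM.dd`, per the index pattern in logstash.conf.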