What is the role of Logstash Shipper and Logstash Indexer in ELK stack?

I have been studying the ELK stack online for my new project.

Most of the tech blogs are about how to set ELK up, but I need more background information to begin with.

  1. What is Logstash? And what are the Logstash Shipper and Indexer?
  2. What is Elasticsearch's role?

Any leads would be appreciated too, if not a proper answer.

Dearr answered 6/6, 2017 at 11:14 Comment(1)
Elasticsearch is a glorified database with an effective search mechanism. Logstash is one of many data front ends that can deliver data in an Elasticsearch-friendly way - consequently, Logstash's indexer indexes the data (extracting fields, deciding which index to store the data in, etc.), and its shipper ships the data to Elasticsearch...Damon

I will try to explain the ELK stack to you with an example.

Applications generate logs that all have the same format ( timestamp | loglevel | message ) on every machine in our cluster and write those logs to a file.

Filebeat (a log shipper from the Elastic stack) tracks that file, gathers any updates to it periodically, and forwards them to Logstash over the network. Unlike Logstash, Filebeat is a lightweight application that uses very few resources, so I don't mind running it on every machine in the cluster. It notices when Logstash is down and holds off transferring data until Logstash is running again (no logs are lost).
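As a rough sketch of that setup, a minimal Filebeat configuration might look like the following (the log path, Logstash host, and port are assumptions; the exact input syntax varies slightly between Filebeat versions):

```yaml
# filebeat.yml - minimal sketch; path and host are assumptions
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log        # wherever the application writes its logs

output.logstash:
  hosts: ["logstash.internal:5044"] # Filebeat backs off and retries while Logstash is down
```

Filebeat keeps track of how far it has read in each file, which is what allows it to resume without losing logs after Logstash comes back.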

Logstash receives messages from all log shippers over the network and applies filters to them. In our case it splits each entry into timestamp, loglevel, and message. These become separate fields that can later be searched easily. Any message that does not conform to that format gets tagged as having an invalid log format. The messages with their fields are then forwarded to Elasticsearch at a rate that Elasticsearch can handle.
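A Logstash pipeline for that filtering step could be sketched like this (the port, field names, and index destination are assumptions, not something from the question):

```conf
# logstash pipeline sketch - port, pattern, and hosts are assumptions
input {
  beats { port => 5044 }
}

filter {
  grok {
    # split "timestamp | loglevel | message" into separate fields
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \| %{LOGLEVEL:loglevel} \| %{GREEDYDATA:msg}" }
    # mark entries that do not conform to the format
    tag_on_failure => ["invalid_logformat"]
  }
  date {
    # use the parsed time as the event's @timestamp
    match => ["timestamp", "ISO8601"]
  }
}

output {
  elasticsearch { hosts => ["http://localhost:9200"] }
}
```

The grok filter's `tag_on_failure` is what implements the "invalid logformat" marking described above.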

Elasticsearch stores all messages and indexes (prepares for quick search) all the fields in the messages. It is our database.

We then use Kibana (also part of ELK) as a GUI for accessing the logs. In Kibana I can do something like: show me all logs from between 3 and 5 pm today with loglevel ERROR whose message contains MyClass. Kibana will ask Elasticsearch for the results and display them.
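Under the hood, Kibana translates such a query into Elasticsearch's query DSL. A hand-written equivalent might look like this (the index pattern, field names, and timestamps are illustrative assumptions):

```json
GET /logs-*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "@timestamp": { "gte": "2017-06-06T15:00:00", "lte": "2017-06-06T17:00:00" } } },
        { "term": { "loglevel": "ERROR" } }
      ],
      "must": [
        { "match": { "msg": "MyClass" } }
      ]
    }
  }
}
```

The `filter` clauses narrow by exact time range and log level, while the `match` clause does a full-text search for "MyClass" in the message field.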

Enrollment answered 6/6, 2017 at 12:22 Comment(6)
I believe when you say Logstash in this case, you mean the Logstash Indexer, because the Logstash Shipper only ships logs. Am I correct?Dearr
As far as I know, the Logstash shipper was replaced by Beats. There are different Beats which all serve as shippers to Logstash, but for different kinds of data (logs in files, performance data, network activity...). And yes, when I talk about Logstash, I mean the Logstash indexer.Enrollment
Beats is installed on the machines from which the lightweight Beat agent sends data to the Logstash shipper. My doubt is (if I am right): does the Logstash shipper fetch logs from the various Beats and send them to the indexer by consolidating them somehow? Just a doubt.Dearr
Basically, forget all about the Logstash shipper. Even its GitHub page refers to another solution which is better.Enrollment
In a clustered system you can have the following setups, as far as I understand: Shipper --> Logstash --> Elastic ||| Logstash --> Logstash --> Elastic ||| Shipper --> Elastic ||| Logstash --> Elastic. In case you use Logstash on each node, you would then differentiate between the Logstash shipper (the Logstash that runs on each node and simply forwards messages) and the Logstash indexer (the one applying filters and forwarding to Elasticsearch).Enrollment
Meaning Filebeat sits on the server where my logs are, and in case I have Logstash installed on multiple boxes, all others will behave as shippers and one will act as the indexer. Correct?Dearr

I don't know if this helps, but... whatever. Let's take a really stupid example: I want to do statistics about squirrels in my neighborhood. Every squirrel has a name and we know what they look like. Each neighbor makes a log entry whenever he sees a squirrel eating a nut.

ElasticSearch is a document database that structures data in so-called indices. It is able to save pieces (shards) of those indices redundantly on multiple servers and gives you great search functionality, so you can access huge amounts of data very quickly.

Here we might have finished events that look like this:

{
  "_index": "squirrels-2018",
  "_id": "zr7zejfhs7fzfud",
  "_version": 1,
  "_source": {
    "squirrel": "Bethany",
    "neighbor": "A",
    "@timestamp": "2018-10-26T15:22:35.613Z",
    "meal": "hazelnut"
  }
}

Logstash is the data collector and transformer. It is able to accept data from many different sources (files, databases, transport protocols, ...) through its input plugins. After one of those input plugins receives the data, it is stored in an Event object that can be manipulated with filters (add data, remove data, load additional data from other sources). When the data has the desired format, it can be distributed to many different outputs.

If neighbor A provides a MySQL database with the columns 'squirrel', 'time' and 'ate', but neighbor B likes to write CSVs with the columns 'name', 'nut' and 'when', we can use Logstash to accept both inputs. Then we rename the fields and parse the different datetime formats those neighbors might be using. If one of them likes to call Bethany 'Beth' we can change the data here to make it consistent. Eventually we send the result to ElasticSearch (and maybe other outputs as well).
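For neighbor B's CSV side of that example, a Logstash pipeline could be sketched as follows (the file path, column names, datetime format, and index name are all assumptions made up for illustration):

```conf
# sketch for neighbor B's CSVs - path, columns, and formats are assumptions
input {
  file { path => "/data/neighbor-b/*.csv" }
}

filter {
  # neighbor B's columns: name, nut, when
  csv { columns => ["name", "nut", "when"] }
  mutate {
    # rename fields to the consistent schema used in the index
    rename => { "name" => "squirrel"
                "nut"  => "meal" }
    # normalize the nickname 'Beth' to 'Bethany'
    gsub   => [ "squirrel", "^Beth$", "Bethany" ]
  }
  date {
    # parse neighbor B's datetime format into @timestamp
    match  => ["when", "dd.MM.yyyy HH:mm"]
    target => "@timestamp"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "squirrels-%{+YYYY}"
  }
}
```

Neighbor A's MySQL database would get its own input (e.g. the jdbc input plugin) feeding the same rename/date logic, so both sources end up in one consistent index.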

Kibana is a visualization tool. It allows you to get an overview of your index structures and server status and to create diagrams for your ElasticSearch data.

Here we can make fun diagrams like 'Squirrel Sightings Per Minute' or 'Fattest Squirrel (based on nut intake)'.

Ebullient answered 27/10, 2018 at 7:44 Comment(0)