What is an index in Elasticsearch

What is an index in Elasticsearch? Does one application have multiple indexes or just one?

Let's say you built a system for some car manufacturer. It deals with people, cars, spare parts, etc. Do you have one index named manufacturer, or do you have one index for people, one for cars and a third for spare parts? Could someone explain?

Lida answered 22/2, 2013 at 13:59 Comment(0)
88

Good question, and the answer is a lot more nuanced than one might expect. You can use indices for several different purposes.

Indices for Relations

The easiest and most familiar layout clones what you would expect from a relational database. You can (very roughly) think of an index like a database.

  • MySQL => Databases => Tables => Rows/Columns
  • ElasticSearch => Indices => Types => Documents with Properties

An ElasticSearch cluster can contain multiple Indices (databases), which in turn contain multiple Types (tables). These types hold multiple Documents (rows), and each document has Properties (columns).

So in your car manufacturing scenario, you may have a SubaruFactory index. Within this index, you have three different types:

  • People
  • Cars
  • Spare_Parts

Each type then contains documents that correspond to that type (e.g. a Subaru Imprezza doc lives inside of the Cars type. This doc contains all the details about that particular car).

Searching and querying take the format: http://localhost:9200/[index]/[type]/[operation]

So to retrieve the Subaru document, I may do this (index names must be lowercase, so the SubaruFactory index is addressed as subarufactory):

  $ curl -XGET localhost:9200/subarufactory/Cars/SubaruImprezza
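As a rough sketch of the [index]/[type]/[id] layout described above, the helper below only builds the document URL and prints it, so it can be tried without a running cluster. The function name doc_url is hypothetical, and the host, index, type and ID are simply the illustrative values from this answer. It assumes an older cluster that still supports types.

```shell
#!/bin/sh
# Hypothetical helper (not part of Elasticsearch): builds the URL of one
# document in the index/type/id layout used by older versions.
doc_url() {
  # Index names must be lowercase, so normalise only the index part.
  index=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf 'http://localhost:9200/%s/%s/%s\n' "$index" "$2" "$3"
}

# Print the URL for the Subaru document from the example above.
doc_url SubaruFactory Cars SubaruImprezza

# Against a real cluster you could then run:
#   curl -XGET "$(doc_url SubaruFactory Cars SubaruImprezza)"
```

Only the index part is lowercased here; the type and document ID are passed through unchanged.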

Indices for Logging

Now, the reality is that Indices/Types are much more flexible than the Database/Table abstractions we are used to in RDBMSs. They can be considered convenient data organization mechanisms, with added performance benefits depending on how you set up your data.

To demonstrate a radically different approach, a lot of people use ElasticSearch for logging. A standard format is to assign a new index for each day. Your list of indices may look like this:

  • logs-2013-02-22
  • logs-2013-02-21
  • logs-2013-02-20

ElasticSearch allows you to query multiple indices at the same time, so it isn't a problem to do:

  $ curl -G 'localhost:9200/logs-2013-02-22,logs-2013-02-21/Errors/_search' --data-urlencode 'q="Error Message"'

This searches the logs from the last two days at the same time. The format has advantages due to the nature of logs: most logs are never looked at, and they are organized in a linear flow of time. Making an index per day is more logical and offers better performance for searching.
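To make the multi-index search concrete, here is a small sketch that joins daily index names into the comma-separated target Elasticsearch accepts in the URL. The function name logs_indices is hypothetical, the logs- prefix follows the naming used above, and the dates are passed in explicitly rather than computed.

```shell
#!/bin/sh
# Hypothetical helper: turns a list of dates into a comma-separated list of
# daily index names, e.g. logs-2013-02-22,logs-2013-02-21.
logs_indices() {
  out=""
  for day in "$@"; do
    # Add a comma separator only when the list is already non-empty.
    out="${out:+$out,}logs-$day"
  done
  printf '%s\n' "$out"
}

# Build the search target for the last two days of logs.
target=$(logs_indices 2013-02-22 2013-02-21)
printf 'http://localhost:9200/%s/_search\n' "$target"

# Against a real cluster you could then run:
#   curl -G "http://localhost:9200/$target/_search" --data-urlencode 'q="Error Message"'
```

Elasticsearch also accepts wildcards in the index part of the URL, so a pattern such as logs-2013-02-* would cover a whole month without listing each day.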


Indices for Users

Another radically different approach is to create an index per user. Imagine you have some social networking site, and each user has a large amount of random data. You can create a single index for each user. Your structure may look like:

  • Zach's Index
    • Hobbies Type
    • Friends Type
    • Pictures Type
  • Fred's Index
    • Hobbies Type
    • Friends Type
    • Pictures Type

Notice how this setup could easily be done in a traditional RDBMS fashion (e.g. a "Users" index, with hobbies/friends/pictures as types). All users would then be thrown into a single, giant index.

Instead, it sometimes makes sense to split data apart for data organization and performance reasons. In this scenario, we are assuming each user has a lot of data, and we want them separate. ElasticSearch has no problem letting us create an index per user.
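One practical detail of an index per user is deriving a valid index name from each user name, since index names must be lowercase and cannot contain spaces. The sketch below shows one possible convention; the function name user_index and the user- prefix are assumptions made for illustration, not anything Elasticsearch prescribes.

```shell
#!/bin/sh
# Hypothetical naming convention: one index per user, named user-<slug>.
user_index() {
  # Lowercase the name, then replace every character that is not a
  # lowercase letter or digit with a hyphen.
  slug=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-')
  printf 'user-%s\n' "$slug"
}

user_index Zach
user_index "Fred Smith"
```

The fixed prefix also makes it possible to address every user index at once with a wildcard such as user-*.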

Guitarist answered 22/2, 2013 at 14:29 Comment(6)
Cleared all my doubts. Thanks. - Mortie
This is valid for older versions of Elasticsearch. It is not a valid answer for the current version. - Columbia
@NitinSaxena Agreed, but it would be better if you could explain why it's no longer valid, such as the removal of types. - Cw
There will be no Type in ES 6.0.0: ElasticSearch => Indices => Documents with Properties. elastic.co/guide/en/elasticsearch/reference/6.1/… - Glaze
If you're going to cut and paste your answer from somewhere else, you should at least give credit: elastic.co/blog/what-is-an-elasticsearch-index - Hispidulous
@Hispidulous Check the author of that blog ;) But yes to everyone else, this is old and no longer valid for modern versions of Elasticsearch. - Guitarist
24

@Zach's answer is valid for Elasticsearch 5.x and below. Since Elasticsearch 6.x, an index can hold only a single mapping type; types are deprecated in 7.x and removed entirely in 8.0. Quoting the Elasticsearch docs:

Initially, we spoke about an “index” being similar to a “database” in an SQL database, and a “type” being equivalent to a “table”. This was a bad analogy that led to incorrect assumptions.

To explain further: in SQL, two columns with the same name in two different tables are independent of each other. In an Elasticsearch index that is not possible, because fields with the same name in different types are backed by the same Lucene field. Thus an "index" in Elasticsearch is not quite the same as a "database" in SQL. If two types in one index define a field with the same name but different data types, the mappings conflict. To avoid this, the Elasticsearch documentation recommends using one index per document type.

Refer: Removal of mapping types
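Applied to the car manufacturer from the question, one index per document type might look like the sketch below. The index names people, cars and spare_parts and the document ID 1 are illustrative assumptions, and typeless_doc_url is a hypothetical helper. The _doc segment is the typeless document endpoint used by newer Elasticsearch versions in place of a type name.

```shell
#!/bin/sh
# Hypothetical helper: builds a typeless document URL of the form
# /<index>/_doc/<id>, assuming a cluster on localhost:9200.
typeless_doc_url() {
  printf 'http://localhost:9200/%s/_doc/%s\n' "$1" "$2"
}

# One index per document type instead of one index with three types.
for index in people cars spare_parts; do
  typeless_doc_url "$index" 1
done

# Against a real cluster you could then run:
#   curl -XGET "$(typeless_doc_url cars 1)"
```

Because each kind of document lives in its own index, a field such as name can be mapped differently for people and for spare parts without conflict.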

Scathe answered 16/1, 2018 at 17:56 Comment(0)
1

An index is a data structure for storing the mapping of fields to the corresponding documents. The objective is to allow faster searches, often at the expense of increased memory usage and preprocessing time.

The number of indexes you create is a design decision that you should make according to your application requirements. You can have an index for each business concept... You can have an index for each month of the year...

You should invest some time getting acquainted with Lucene and Elasticsearch concepts.

Take a look at the introductory video and at this one covering some data design patterns

Phraseograph answered 22/2, 2013 at 14:27 Comment(0)
0

The answer above is very detailed; put briefly, an index can be defined as follows.

Index: a collection of different types of documents and their properties. An index also uses the concept of shards to improve performance. For example, a set of documents might contain the data of a social networking application. (Definition from tutorialpoints.com.)

Since an index is a collection of different types of documents, the answer to the question depends on how you want to categorize your data.

Do you have one index named manufacturer? Yes, we would keep the manufacturer data together there.

Do you have one index for people, one for cars and a third for spare parts? Think of a car model supplied by the same manufacturer to many people who drive it on the road. There could be many indices, depending on how the data is used.

If we think about it carefully, we find that all the questions except the first are invalid. Elasticsearch documents are quite different from SQL rows, CSV files or spreadsheets: from a single index, and with a powerful query language, you can produce millions of differently categorised sets of documents, CSV-style.

Because Elasticsearch is very fast at indexed lookups, we create only one index per customer, and from it we create many types of documents as needed. For example:

All old people using the same model, or one old person using all models.

The permutations are infinite.

Backblocks answered 16/5, 2017 at 7:24 Comment(0)
