Something "Materialized view"-like in ElasticSearch

2

9

I have a query that runs every time a website is loaded. The query aggregates over three different term fields and around 3 million documents, and therefore needs 6-7 seconds to complete. The data does not change that frequently, and the freshness of the result is not critical.

I know that I can use an alias to create something like a "view" in the RDBMS world. Is it also possible to populate it, so that the query result gets cached? Is there any other way caching might help in this scenario, or do I have to create an additional index for the aggregated data and update it from time to time?

Claimant answered 3/1, 2017 at 10:10 Comment(4)
Aggregating 3 million docs is not really a lot, so it would be nice to know why your query takes 6-7 seconds to run, and maybe optimize it first. - Platto
Okay, how would I do this? I enabled slowlogs, but they don't give me much information. The query is quite simple, doing aggs for three fields like this: "aggFieldName": { "terms": { "field": "myField", "size": 0 } }, - Claimant
First you can use explain=true in your query to get some insights. Then you can also use the Profile API to profile your aggregations. - Platto
Okay, I will try this and see if I gain some insights. - Claimant
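
As a rough illustration of the profiling suggestion in the comments, the sketch below only builds the request for a profiled terms aggregation and prints it; it does not contact a cluster. The index name `live_index`, the helper name, and the bucket limit of 1000 are assumptions made for the example. Note that `"size": 0` inside a terms aggregation, as used in the comment above, is rejected by Elasticsearch 5.0 and later, so an explicit limit is used here instead.

```python
import json


def profiled_terms_request(index, fields, bucket_limit=1000):
    """Build (method, path, body) for a profiled terms-aggregation search.

    `index`, `fields` and `bucket_limit` are illustrative placeholders.
    """
    body = {
        "size": 0,        # top-level size 0: return no hits, only aggregations
        "profile": True,  # ask Elasticsearch to include timing details
        "aggs": {
            f"{field}_terms": {"terms": {"field": field, "size": bucket_limit}}
            for field in fields
        },
    }
    return "POST", f"/{index}/_search", body


method, path, body = profiled_terms_request("live_index", ["myField"])
print(method, path)
print(json.dumps(body, indent=2))
```

The response to such a request carries a `profile` section with per-shard timings, which should show whether the time is spent collecting documents or building buckets.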
2

I know this post is old, but regarding views: Elastic added data frames in 7.3.0. You could also use the _reindex API:

POST /_reindex
{
  "source": {
    "index": "live_index"
  },
  "dest": {
    "index": "caching_index"
  }
}

Reindexing will not fix your ingestion problem, though. For that, I think the solution is to shard your index: with two or more shards spread over several nodes, Elasticsearch will be able to parallelize the work.

An easier thing to test is to disable refresh_interval while indexing and re-enable it afterwards. This generally improves ingestion time a lot.

There is a full article on this use case at https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html
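
A minimal sketch of the refresh_interval toggle, assuming an index called `live_index` and a restored interval of `1s` (the Elasticsearch default). It only builds the two index-settings requests and prints them; sending them is left to whatever HTTP client you use.

```python
import json


def refresh_interval_requests(index, restored="1s"):
    """Return the two settings updates that bracket a bulk indexing run.

    `index` and `restored` are illustrative placeholders.
    """
    path = f"/{index}/_settings"
    return [
        # Before the bulk load: "-1" turns periodic refreshes off.
        ("PUT", path, {"index": {"refresh_interval": "-1"}}),
        # After the bulk load: switch periodic refreshes back on.
        ("PUT", path, {"index": {"refresh_interval": restored}}),
    ]


for method, path, body in refresh_interval_requests("live_index"):
    print(method, path, json.dumps(body))
```

Documents indexed while refreshes are off only become searchable after the interval is restored or a refresh is triggered explicitly.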

Landwehr answered 4/12, 2019 at 7:40 Comment(3)
Thanks for the late answer. Actually I cannot remember most of the problem myself, so it is difficult for me to say whether it answers the question. However, I hope it will help someone else. - Claimant
I hope so too :) - Landwehr
Thanks, it helps :) Details about data frame transforms are here: elastic.co/guide/en/elasticsearch/reference/master/… - Clarhe
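
Unlike _reindex, which copies documents unchanged, a data frame transform writes aggregated results into the destination index, which is closer to a materialized view. The sketch below builds the create and start requests for a simple pivot on one field. The transform id `my_field_counts`, the helper name, and the `value_count` aggregation are assumptions for illustration, while the index and field names are reused from this thread. The `/_transform` path applies to Elasticsearch 7.5 and later; versions 7.2 to 7.4 used `/_data_frame/transforms` instead.

```python
import json


def pivot_transform_requests(transform_id, source_index, dest_index, field):
    """Build the requests that create and start a batch pivot transform.

    All argument values are illustrative placeholders.
    """
    create_body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            # One output document per distinct value of `field`.
            "group_by": {field: {"terms": {"field": field}}},
            # Store how many source documents carry each value.
            "aggregations": {"docs": {"value_count": {"field": field}}},
        },
    }
    return [
        ("PUT", f"/_transform/{transform_id}", create_body),
        ("POST", f"/_transform/{transform_id}/_start", None),
    ]


for method, path, body in pivot_transform_requests(
    "my_field_counts", "live_index", "caching_index", "myField"
):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

The page can then query `caching_index` directly. A batch transform like this one runs once; keeping the destination up to date needs either a rerun or a continuous transform, which requires a timestamp field that this thread does not describe.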
-4

You can create a materialized view. It is ultimately a table that holds the results of the aggregate functions. Because the aggregated data has already been inserted, querying it is faster, and I feel there is no need for caching either. I have created MVs myself, and they improved performance tremendously. That said, you can also go with Elasticsearch, where you can cache the aggregated queries if your data does not change frequently. I feel MVs and Elasticsearch give the same performance.
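
For the caching option on the Elasticsearch side, one possibility is the shard request cache, which stores the results of searches that return no hits (top-level `"size": 0`), aggregations included, until the underlying shard changes. The sketch below only builds such a request and prints it. The index name, field name, bucket limit, and helper name are assumptions, and `request_cache=true` simply makes the cache use explicit for this one request.

```python
import json
from urllib.parse import urlencode


def cached_aggregation_request(index, fields, bucket_limit=1000):
    """Build (method, path, body) for an aggregation-only, cacheable search.

    `index`, `fields` and `bucket_limit` are illustrative placeholders.
    """
    query = urlencode({"request_cache": "true"})
    body = {
        "size": 0,  # no hits, only aggregations
        "aggs": {
            f"{field}_terms": {"terms": {"field": field, "size": bucket_limit}}
            for field in fields
        },
    }
    return "POST", f"/{index}/_search?{query}", body


method, path, body = cached_aggregation_request("live_index", ["myField"])
print(method, path)
print(json.dumps(body, indent=2))
```

The cache key is derived from the request body, so the page should send an identical body on every load to get cache hits.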

Alcorn answered 14/8, 2017 at 8:27 Comment(0)
