Apache Ignite: How does the indexing work?

How does Apache Ignite's indexing work? I haven't found those technical details in the documentation.

  1. Is it using a B-tree?
  2. Where is the index stored?
  3. How is it stored?
  4. What performance (in Big-O notation) does the index provide once it has been built?
  5. How fast does it build, when does it build?
  6. Ignite can store arbitrary serializable Java objects. How does it deal with composites when I want to index a field of a sub-sub-object?
  7. Ignite Cache is a key-value store. Am I able to have different classes (=types as objects) as values? In other words, is Ignite Cache Schemaless? If yes, how does this fit with my SQL-queries?
  8. Ignite Cache is a key-value store. How do the keys come into play if I SQL-query for my values? What am I querying for?
  9. The keys can be arbitrary, serializable Java objects - am I able to query for the keys or only the values?
Monahon answered 27/11, 2015 at 9:6 Comment(0)

This information is not covered in much detail in the docs because it is mostly an implementation detail and can change from version to version. The source code is available if you are interested in the details. To be specific, I'm talking about Ignite 1.5, which is about to be released.

  1. Before 1.5 the default data structure was a snap tree (a variant of the AVL tree). Since 1.5 a skip-list option has been added as well, and it is now the default.
  2. In the Java heap or in off-heap memory, depending on the configuration.
  3. Reliably :) I don't understand this question.
  4. O(log N) on update and lookup.
  5. The index is updated on each transaction commit (or on each cache update in the case of an atomic cache); there is no separate build phase. You can expect your indexes to be in a correct state after each update.
  6. Ignite has two options (since 1.5): either store objects in binary format, which allows individual field values to be read, or keep the whole object deserialized and use reflection.
  7. etc.
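
To make points 1, 4 and 5 concrete, here is a minimal sketch using only the JDK. It is not Ignite code: `IndexedCacheSketch`, `put`, `keysByName` and `keysByNameRange` are hypothetical names, and `ConcurrentSkipListMap` merely stands in for a skip-list index. It shows a key-value store whose secondary index is maintained on every update, so there is no separate build phase, and lookups go through a sorted structure with expected O(log N) cost.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical toy cache for illustration only; this is not how Ignite is implemented. */
class IndexedCacheSketch {
    // Primary key-value storage (the "cache"): key -> indexed field value.
    private final Map<Long, String> store = new HashMap<>();

    // Secondary index: field value -> keys of the entries holding that value.
    // A skip list keeps the values sorted, with expected O(log N) insert and lookup.
    private final ConcurrentSkipListMap<String, Set<Long>> nameIndex =
            new ConcurrentSkipListMap<>();

    /** Every update maintains the index immediately; nothing is rebuilt later. */
    void put(long key, String name) {
        String old = store.put(key, name);
        if (old != null) {
            // Drop the stale index entry for the previous value of this key.
            Set<Long> keys = nameIndex.get(old);
            keys.remove(key);
            if (keys.isEmpty()) {
                nameIndex.remove(old);
            }
        }
        nameIndex.computeIfAbsent(name, n -> new ConcurrentSkipListSet<>()).add(key);
    }

    /** Equality lookup through the index; returns matching keys in ascending order. */
    List<Long> keysByName(String name) {
        Set<Long> keys = nameIndex.get(name);
        return keys == null ? new ArrayList<>() : new ArrayList<>(keys);
    }

    /** Range lookup over names in [from, to), in index (sorted name) order. */
    List<Long> keysByNameRange(String from, String to) {
        List<Long> out = new ArrayList<>();
        for (Set<Long> keys : nameIndex.subMap(from, to).values()) {
            out.addAll(keys);
        }
        return out;
    }
}
```

Because the index is sorted, the same structure serves both equality and range conditions. In this sketch a query reads the index first and only then uses the returned keys to fetch values from the store.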

Have fun!

Covenantee answered 27/11, 2015 at 15:42 Comment(8)
Regarding 6: So does Ignite look through the entire object tree of composites to find the one field that has the same name as mentioned in my SQL statement? - Monahon
Since the SQL schema for a cache is defined before any queries run, a property access object is prepared beforehand, so no lookup by name happens. - Covenantee
What do you mean by lookup by name? Am I not able to write something like SELECT firstname FROM mytable? - Monahon
You probably did not set up indexing correctly; please read the docs and examples once more. - Covenantee
As for lookup by name, let me explain. Take a look at the example here github.com/apache/ignite/tree/master/examples/src/main/java/org/… . We have a DimStore type which has fields annotated with @QuerySqlField. Ignite creates the SQL table from these fields, so it knows in advance which fields need to be accessed and prepares an access object for each of them. After query parsing it can use those access objects without any lookups by name. - Covenantee
Let's see if I got it: When I write something into my Ignite Cache (which is a KEY-VALUE store), Ignite inspects the inserted VALUE objects and creates a look-up table for the @QuerySqlField fields, while additionally indexing those fields that are to be indexed. In this look-up table Ignite also saves "access objects", which are copies of KEY objects of my Ignite Cache. Ignite only makes copies of those KEY objects whose corresponding VALUE objects are relevant to my SQL queries. Here, the look-up table would be | id | name | KEY |, while the Ignite Cache would be | KEY | VALUE |. - Monahon
When making a SQL query, Ignite does not fish in my Ignite Cache, but instead goes to the look-up table. There it finds the rows I searched for with my SQL query (here by id and name). From those rows Ignite takes the KEYs = "access objects" that are relevant to my SQL query. With those KEYs Ignite goes to my Ignite Cache and fetches the VALUE objects I've been looking for. Did I get it right? (Thanks for your patience.) - Monahon
Ignite inspects the object type at schema initialization; it does not need to inspect each and every object stored in the cache. Thus a field access object is initialized only once per type and can be used for any object of that type. - Covenantee
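
The "access object prepared once per type" idea from the comments can be sketched with plain Java reflection. This is only an illustration under assumptions, not Ignite's implementation: `FieldAccessorSketch`, `PathAccessor`, `Person`, `Address` and the dotted path `"address.city"` are all hypothetical. The point is that the lookup by field name happens once, when the accessor is built, and not once per stored object or per query.

```java
import java.lang.reflect.Field;

/** Hypothetical sketch, not Ignite code: a field accessor resolved once per type. */
class FieldAccessorSketch {
    static class Address {
        String city;
        Address(String city) { this.city = city; }
    }

    static class Person {
        String name;
        Address address;
        Person(String name, Address address) { this.name = name; this.address = address; }
    }

    /** Resolves a dotted path such as "address.city" once, at "schema initialization". */
    static class PathAccessor {
        private final Field[] chain;

        PathAccessor(Class<?> type, String path) {
            String[] names = path.split("\\.");
            chain = new Field[names.length];
            Class<?> current = type;
            try {
                for (int i = 0; i < names.length; i++) {
                    // The only place where a field is looked up by its name.
                    Field f = current.getDeclaredField(names[i]);
                    f.setAccessible(true);
                    chain[i] = f;
                    current = f.getType();
                }
            } catch (NoSuchFieldException e) {
                throw new IllegalArgumentException("Unknown field in path: " + path, e);
            }
        }

        /** Reads the value from any object of the prepared type; no name lookups here. */
        Object get(Object root) {
            Object value = root;
            try {
                for (Field f : chain) {
                    if (value == null) {
                        return null; // a missing sub-object yields null
                    }
                    value = f.get(value);
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            return value;
        }
    }
}
```

A single `new FieldAccessorSketch.PathAccessor(FieldAccessorSketch.Person.class, "address.city")` can then be reused for every `Person` that is inserted or queried, which also covers the sub-sub-object case from question 6.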
