How can Datomic users cope without composite indexes?
Y

3

6

In Datomic, how do you efficiently perform queries such as 'find all people living in Washington older than 50' (city and age may vary)? In relational databases and most NoSQL databases you use composite indexes for this purpose; Datomic, as far as I'm aware, does not support anything like this.

I have built several medium-sized web apps, and not a single one would have performed quickly enough without composite indexes. How are Datomic users dealing with this? Or are they just working with datasets small enough not to suffer from this? Am I missing something?

Yellowgreen answered 3/7, 2014 at 19:0 Comment(3)
Same question here. Did you find a solution to your problem? Thanks.Illona
I'm merely playing with Datomic, so I don't have an actual problem :) However, I would like to know what its limitations are and whether I can use it in a real project.Yellowgreen
One (quite ugly) approach that comes to mind is to create a special 'indexing' attribute that concatenates several other attributes (for the example above, a value such as 'washington-1983-01-10'). You can then query for entities in the range between 'washington-startdate' and 'washington-enddate'. It works, but it smells a lot.Yellowgreen
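A minimal sketch of the concatenated-key idea from the comment above, written in plain Python and not against the Datomic API. A sorted list stands in for an index on the 'indexing' attribute; the sample people, the `index_key` helper and the `born_between` function are hypothetical names made up for illustration. It relies on ISO dates being fixed-width, so string order matches date order.

```python
from bisect import bisect_left, bisect_right


def index_key(city: str, birth_date: str) -> str:
    """Build a sortable key such as 'washington-1983-01-10'."""
    return f"{city.lower()}-{birth_date}"


# Hypothetical sample data: (city, ISO birth date, name).
people = [
    ("Washington", "1983-01-10", "Ann"),
    ("Washington", "1950-06-02", "Bob"),
    ("Boston", "1949-12-31", "Cid"),
    ("Washington", "1971-03-15", "Dee"),
]

# Stand-in for a sorted index over the concatenated attribute.
index = sorted((index_key(city, born), name) for city, born, name in people)
keys = [key for key, _ in index]


def born_between(city: str, start: str, end: str) -> list[str]:
    """Range scan: everyone in `city` born between `start` and `end` inclusive."""
    lo = bisect_left(keys, index_key(city, start))
    hi = bisect_right(keys, index_key(city, end))
    return [name for _, name in index[lo:hi]]


result = born_between("Washington", "1900-01-01", "1974-12-31")
print(result)
```

The scan touches only the contiguous slice of keys for one city and date range, which is the behaviour a composite index gives you. The cost is that every component has to be encoded so that string order matches the intended order.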
J
0

Update 2019-06-28: Since 0.9.5927 (Datomic On-Prem) / 480-8770 (Datomic Cloud), Datomic supports Tuples as a new Attribute Type, which allows you to have compound indexes.
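To illustrate why a tuple-valued attribute behaves like a compound index, here is a rough Python sketch rather than Datomic code. Python tuples compare element by element, so a sorted list of `(city, age, entity id)` tuples can serve as a stand-in for an index over a composite tuple. The entity data and the `older_than` helper are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical entities: entity id -> attributes.
entities = {
    1: {"city": "Washington", "age": 62},
    2: {"city": "Washington", "age": 35},
    3: {"city": "Boston", "age": 70},
    4: {"city": "Washington", "age": 51},
}

# Stand-in for a compound index: tuples sort by city first, then by age.
composite_index = sorted(
    (attrs["city"], attrs["age"], eid) for eid, attrs in entities.items()
)


def older_than(city: str, min_age: int) -> list[int]:
    """Return ids of entities in `city` with age strictly greater than `min_age`."""
    # Skip every (city, min_age, *) entry, then stop at the end of the city's run.
    lo = bisect_right(composite_index, (city, min_age, float("inf")))
    hi = bisect_left(composite_index, (city, float("inf")))
    return [eid for _, _, eid in composite_index[lo:hi]]


washington_over_50 = older_than("Washington", 50)
print(washington_over_50)
```

Unlike the string-concatenation workaround, no padding or parsing is needed, because each component keeps its own type and ordering.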

Jampack answered 28/6, 2019 at 16:42 Comment(3)
While tuples are really cool, does this address the problem? Say I want to build a composite index on two columns: integer and date. Without tuples, I can encode values such as left_padded_number-YYYY-MM-DD strings and benefit from lexicographical indexing of such a field. Sure, with tuples, I don't have to bother with parsing strings (great!) but I still have to organize my data in quite a strange manner only to gain the indexing capabilities..? Or do I get it wrong?Yellowgreen
@TomasKulich I haven't used Tuples yet, but I think Composite Tuples are what you are looking for: docs.datomic.com/cloud/schema/…Jampack
Ah, I see, this is awesome! This seems to be exactly the composite index semantics! From the brief skimming of the docs, even range queries should work just fine.Yellowgreen
P
3

Neither this problem nor its solution looks quite the same in Datomic, because of how data (datoms) is structured there. Two performance characteristics/strategies may add some nuance:

(1) When you fetch data in Datomic, you fetch an entire leaf segment from the index tree (not an individual item) - with segments being composed of potentially many thousands of datoms. This is then cached automatically so that you don't have to reach out over the network to get more datoms.

If you're querying a single person (i.e., a single entity) for their age and where they live, it's very likely that the query's navigation of the EAVT or AEVT indexes has already cached everything you need. You've effectively cached the datom, the path to navigate to it, and related datoms (by locality in the index).

(2) Partitions can provide a manual means to specify locality of reference. Partitions impact the entity ID's value (it's encoded in the high bits) and ensure that related entities are sorted near each other. So for an alternative implementation of the above problem, if you needed information from the city and person entities both, you could include them in the same partition.
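The effect of encoding the partition in the high bits can be sketched in Python as follows. This is an illustration under assumptions, not Datomic's actual ID layout: the 42-bit shift, the partition numbers and the `entity_id` / `partition_of` helpers are all hypothetical.

```python
PARTITION_SHIFT = 42  # assumed width of the low (per-entity) part; illustrative only


def entity_id(partition: int, seq: int) -> int:
    """Pack a partition number into the high bits and a sequence number into the low bits."""
    assert 0 <= seq < (1 << PARTITION_SHIFT)
    return (partition << PARTITION_SHIFT) | seq


def partition_of(eid: int) -> int:
    """Recover the partition number from an entity id."""
    return eid >> PARTITION_SHIFT


# Hypothetical partitions, with entities created in interleaved order.
PEOPLE_AND_CITIES = 5
BILLING = 6
ids = [
    entity_id(PEOPLE_AND_CITIES, 1001),
    entity_id(BILLING, 1002),
    entity_id(PEOPLE_AND_CITIES, 1003),
    entity_id(BILLING, 1004),
]

# Sorting by entity id groups entities by partition, regardless of creation order.
sorted_partitions = [partition_of(eid) for eid in sorted(ids)]
print(sorted_partitions)
```

Because an entity-ordered index sorts by entity ID, entities that share a partition end up next to each other and are therefore more likely to land in the same cached segment.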

Pacifa answered 16/12, 2014 at 23:51 Comment(0)
P
2

I've written a library to handle this: https://github.com/arohner/datomic-compound-index

Planography answered 31/3, 2015 at 23:10 Comment(0)