boost::multi_index composite keys efficiency
Asked Answered
E

1

2

Long time reader first time poster! I'm playing around with the boost::multi_index container stuff and have a rather in-depth question that hopefully a boost or C++ container expert might know (my knowledge in C++ containers is pretty basic). For reference, the boost documentation on composite keys can be found here: boost::multi_index composite keys.

When using a composite key, the documentation states that "Composite keys are sorted by lexicographical order, i.e. sorting is performed by the first key, then the second key if the first one is equal, etc". Does this mean that the structure is stored such that a lookup for a specific 2-part composite key will take O(n=1) time, i.e. is the container sorted such that there is a pointer directly to each item, or does the boost container retrieve a list that matches the first part of the composite key and then need to perform a search for items matching the second part of the key and thus is slower?

For example, if I was to maintain two containers manually using two different indices and wanted to find items that matched a specific 2-part query I would probably filter the first container for all items matching the 1st part of the query, and then filter the result for items that match the 2nd part of the query. So this manual method would effectively involve two searches. Does boost effectively do this or does it improve on efficiency somehow via the use of composite keys?

Hopefully I've explained myself here but please ask questions and I will try my best to clarify exactly what I mean!

Enchiridion answered 1/12, 2013 at 14:18 Comment(2)
An example says 1000 words. It also gives us a quick chance to do some measurements and to delve into the library implementation.Bigener
I figured that since there is an example in the boost::multi_index documentation (via the hyperlink I put in) it wasn't worth repeating it here?Enchiridion
V
4

Lookups involving composite keys do not go through any two-stage process as you describe. composite_key-induced orderings are normal orderings, the only special thing about it being its dependence on two or more element keys rather than one.

Maybe an example will clarify. Consider this use of composite_key:

struct element
{
  int x,y,z;
};

typedef multi_index_container<
  element,
  indexed_by<
    ordered_unique<
      composite_key<
        element,
        member<element,int,&element::x>,
        member<element,int,&element::y>,
        member<element,int,&element::z>
      >
    >
  >
> multi_t;

The resulting container is in a sense equivalent to this:

struct element_cmp
{
  bool operator()(const element& v1, const element& v2)const
  {
    if(v1.x<v2.x)return true;
    if(v2.x<v1.x)return false;
    if(v1.y<v2.y)return true;
    if(v2.y<v1.y)return false;
    return v1.z<v2.z;
  }
};

typedef std::set<element,element_cmp> set_t;

composite_key automatically generates equivalent code to that in element_cmp::operator(), and additionally allows for lookup on just the first n keys, but the underlying data structure does not change with respect to the case using std::set.

Vagarious answered 1/12, 2013 at 19:49 Comment(6)
So in terms of lookup time, how does a lookup compare for each of the three indices? i.e. can you retrieve all keys with say x=1 quicker than all keys with say x=1, y=2, z=3? What if you perform a lookup for just z=3? Thanks for the response so far btw, I think we're honing in on my answer :) !Enchiridion
The example provided does not have three indices but just one, which, by virtue of the composite key, allows for lookups on x, (x,y) and (x,y,z), all of them (roughly) with the same performance. To efficiently look up for z values you'd need an additional index on it.Pyonephritis
Ah ok so if I had three individual indices AND the composite with all three in it would be slower to use the composite than just using the z index? This is the point in my question, how efficient is the boost composite key searching, is it actually any faster than getting an iterator for one of the individual indices, followed by another filter outside of the multi_index container using that iterator you've retrieved. Or is it effectively doing all that for you, and therefore is just a nicer, cleaner way? Sorry, I know I'm asking a weird question.Enchiridion
If you want to look up for say x=1, y=2, z=3, using composite_key is much faster than looking up for x=1 and postfiltering the result (by linear scanning). Moreover, there is no way to use three different indices on x, y and z, respectively, to do the (x=1,y=2,z=3) lookup any more efficiently. I think you're confusing terms: an index with a composite key on (x,y,z) is not the same as three individual indices on x, y and z.Pyonephritis
Ok thanks, that's exactly what I wanted to know :)! So it does create some sort of pointer structure that means a good speed up over three individual indices. Thank you!Enchiridion
Using composite_key does not create any special pointer structure beyond what an individual index provides: it is just a smart way to use a regular index for doing multi-field lookups.Pyonephritis

© 2022 - 2024 — McMap. All rights reserved.