What does the author of nedtries mean by "in-place"?
P

4

6


I just implemented a kind of bitwise trie (based on nedtries), but my code does a lot of memory allocation (one allocation per node). In contrast to my implementation, nedtries is claimed to be fast, among other things, because of its small number of memory allocations (if any). The author claims his implementation is "in-place", but what does that really mean in this context? And how does nedtries achieve such a small number of dynamic memory allocations?

PS: I know that the sources are available, but the code is pretty hard to follow and I cannot figure out how it works.

Phionna answered 14/1, 2011 at 14:29 Comment(0)
D
4

I took a look at the nedtrie.h source code. It seems that the reason it is "in-place" is that you have to add the trie bookkeeping data to the items that you want to store.

You use the NEDTRIE_ENTRY macro to add parent/child/next/prev links to your data structure, and you can then pass that data structure to the various trie routines, which will extract and use those added members.

So it is "in-place" in the sense that you augment your existing data structures and the trie code piggybacks on that.

At least that's what it looks like. There's lots of macro goodness in that code so I could have gotten myself confused (:
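
For illustration, here is a rough sketch of that pattern in plain C (this is not nedtries' actual macro expansion; the field names are just illustrative):

struct my_item {
  /* bookkeeping links the trie needs, embedded directly in the user's item */
  struct my_item *parent;
  struct my_item *child[2];
  struct my_item *next, *prev;
  /* the user's own payload */
  size_t key;
};

Because the links live inside the item itself, inserting an item only wires up pointers; the container never has to allocate a separate node.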

Dworman answered 20/1, 2011 at 17:32 Comment(2)
So, it works just like the Linux kernel's linked list implementation. Thanks for having deciphered the code; I had a hard time trying to figure out what all these macros did.Phionna
Yes, that's commonly referred to as an "intrusive" linked list.Seismology
C
10

I'm the author, so this is for the benefit of the many people who, according to Google, are similarly having difficulties using nedtries. I would like to thank the people here on Stack Overflow for not making unpleasant comments about me personally, as some other discussions about nedtries do.

  1. I am afraid I don't understand the difficulties with knowing how to use it. Usage is exceptionally easy - simply copy the example in the Readme.html file:

#include <assert.h>
#include <stdio.h>
#include "nedtrie.h"

typedef struct foo_s foo_t;
struct foo_s {
  NEDTRIE_ENTRY(foo_t) link;
  size_t key;
};
typedef struct foo_tree_s foo_tree_t;
NEDTRIE_HEAD(foo_tree_s, foo_t);
static foo_tree_t footree;

static size_t fookeyfunct(const foo_t *RESTRICT r) { return r->key; }

NEDTRIE_GENERATE(static, foo_tree_s, foo_s, link, fookeyfunct, NEDTRIE_NOBBLEZEROS(foo_tree_s));

int main(void) {
  foo_t a, b, c, *r;
  NEDTRIE_INIT(&footree);
  a.key=2;
  NEDTRIE_INSERT(foo_tree_s, &footree, &a);
  b.key=6;
  NEDTRIE_INSERT(foo_tree_s, &footree, &b);
  r=NEDTRIE_FIND(foo_tree_s, &footree, &b);
  assert(r==&b);
  c.key=5;
  r=NEDTRIE_NFIND(foo_tree_s, &footree, &c);
  assert(r==&b); /* NFIND finds next largest. Invert the key function to invert this */
  NEDTRIE_REMOVE(foo_tree_s, &footree, &a);
  NEDTRIE_FOREACH(r, foo_tree_s, &footree) {
    printf("%p, %zu\n", (void *) r, r->key);
  }
  NEDTRIE_PREV(foo_tree_s, &footree, &a);
  return 0;
}

You declare your item type - here it's struct foo_s. You need the NEDTRIE_ENTRY() inside it; otherwise it can contain whatever you like. You also need a key-generating function. Other than that, it's pretty boilerplate.

I wouldn't have chosen this system of macro-based initialisation myself! But it's for compatibility with the BSD rbtree.h, so nedtries is very easy to swap into anything using BSD rbtree.h.

  2. Regarding my usage of "in place" algorithms, well, I guess my lack of computer science training shows here. What I would call "in place" is when you only use the memory passed into a piece of code, so if you hand 64 bytes to an in-place algorithm it will only touch those 64 bytes, i.e. it won't make use of extra metadata, allocate extra memory, or indeed write to global state. A good example is an "in-place" sort implementation, where only the collection being sorted (and, I suppose, the thread stack) gets touched.

    Hence no, nedtries doesn't need a memory allocator. It stores all the data it needs in the NEDTRIE_ENTRY and NEDTRIE_HEAD macro expansions. In other words, when you allocate your struct foo_s, you do all the memory allocation for nedtries.

  3. Regarding understanding the "macro goodness", it's far easier to understand the logic if you compile it as C++ and then debug it :). The C++ build uses templates and the debugger will cleanly show you state at any given time. In fact, all debugging from my end happens in a C++ build and I meticulously transcribe the C++ changes into macroised C.

  4. Lastly, before a new release, I search Google for people having problems with my software to see if I can fix things, and I am typically amazed at what some people say about me and my free software. Firstly, why didn't those people having difficulties ask me directly for help? If I know that there is something wrong with the docs, then I can fix them - equally, asking on Stack Overflow doesn't let me know immediately that there is a docs problem but rather relies on me finding it before the next release. So all I would say is that if anyone finds a problem with my docs, please do email me and say so, even if there is a discussion like this one here on Stack Overflow.

Niall

Campman answered 19/6, 2011 at 18:9 Comment(2)
I just wanted to point out that "in place" is pretty ambiguous terminology. I think the term people are looking for here is "intrusive", which is unambiguous and which I've seen used with many other data structures that expect their bookkeeping pointers to be allocated inside their data.Seismology
Point taken. I've also noticed that different parts of the world like to use different nomenclature for various things, and engineers of different ages and generations also have particular preferences for what to call things.Campman
E
4

In-place means you operate on the original (input) data, so the input data becomes the output data. Not-in-place means that you have separate input and output data, and the input data is not modified. In-place operations have a number of advantages - smaller cache/memory footprint, lower memory bandwidth, hence typically better performance, etc. - but they have the disadvantage that they are destructive, i.e. you lose the original input data (which may or may not matter, depending on the use case).
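
As a small illustration of the distinction (a hypothetical sketch, not code from nedtries):

#include <stddef.h>

/* In place: only the caller's buffer is touched and the input is overwritten. */
static void reverse_in_place(int *a, size_t n) {
  for (size_t i = 0, j = n ? n - 1 : 0; i < j; ++i, --j) {
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
}

/* Not in place: the input stays intact, but a separate output buffer
   (extra memory) is required. */
static void reverse_out_of_place(const int *in, int *out, size_t n) {
  for (size_t i = 0; i < n; ++i)
    out[i] = in[n - 1 - i];
}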

Embracery answered 14/1, 2011 at 14:34 Comment(1)
Your definition corresponds to what I mean by "in-place", but it seems like the author of nedtries has a slightly different one; in his presentation of nedtries he says "nedtries uses an in-place algorithm i.e. it does NOT use dynamic memory". Plus, there is no malloc/calloc/mmap call in his code. How is this possible? Here is the link to his page: nedprod.com/programs/portable/nedtriesPhionna
D
-1

In-place means to operate on the input data and (possibly) update it. The implication is that there is no copying and/or moving of the input data. This may result in losing the input data's original values, which you will need to consider if that is relevant for your particular case.

Doolittle answered 14/1, 2011 at 14:53 Comment(0)
