Simple Set implementation in D?
Asked Answered
C

4

7

I was fishing around in D's standard library looking for a Set implementation, and I only found these:

  • BinaryHeap
  • RedBlackTree

These both would work fine, if I could only figure out how to use them. I started with the RedBlackTree (because I'm already familiar with how they work), and this is what I came up with:

auto rbt = redBlackTree!string();
foreach(s; setOfStrings) {
    rbt.insert(s);
}

foreach(s; rbt) {
    if (s[0 .. 3] == "sth") {
        rbt.removeKey(s);
    }
}

I know I could have done the condition in the first foreach, but it's just an example showing that I need to add and remove elements from the Set. This would work, but I get compile errors:

Error: template std.container.RedBlackTree!(string).RedBlackTree.removeKey(U) if (isImplicitlyConvertible!(U,Elem)) does not match any function template declaration

Error: template std.container.RedBlackTree!(string).RedBlackTree.removeKey(U) if (isImplicitlyConvertible!(U,Elem)) cannot deduce template function from argument types !()(string

I don't need a red-black tree (anything without duplicates), and speed isn't too important. I could do something like this:

string[] arr;
foreach(s; setOfStrings) {
    // check for duplicate code here...

    arr ~= s;
}
for(int i = 0; i < arr.length; i++) {
    if (s[0 .. 3] == "sth") {
        arr = arr[0 .. i] ~ arr[i + 1 .. $];
        i++;
    }
}

Is there anything in the standard library for a simple Set?

Calumnious answered 11/12, 2011 at 1:41 Comment(0)
I
8

RedBlackTree is Phobos' set implementation. The problem that you're running into with removeKey is the fact that it's variadic. It will take either an array of keys to remove or multiple keys (e.g. removeKey(arr) or removeKey(key1, key2, key3)). string is an array of immmutable chars, so it tries to instantiate removeKey with char instead of string, which doesn't work, because your tree holds strings, not chars. You wouldn't have this sort of problem if you were dealing with a RedBlackTree of ints or any other non-array type.

What you need to do is either give it an array of strings or to instantiate it directly i.e. removeKey([s]) or removeKey!string(s).

By the way, std.container has gone the route of naming its container types based on their data structure rather than what they're used for. So, when you say that you don't need a red-black tree, that's not quite right. You want a set. You just don't care how it's implemented. The two typical ways to implement sets involve using either a red-black tree or a hash table. So, RedBlackTree gives you one way to have a set. It's just that its named after its data structure, not how you might use it, so if you're looking for a container name Set in std.container, you're not going to find it.

EDIT: A bug report exists for this, and a fix has been submitted. So, in a future release of dmd, it should be possible to pass a string to removeKey without having to directly instantiate it or pass the string in inside an array.

Intermezzo answered 11/12, 2011 at 2:30 Comment(2)
I tried the first suggestion, removeKey([s]), but that didn't work so well. I totally spaced about the second, which should work (removeKey!string(s)). Thanks for the info!Calumnious
BTW, I ended up just using an Associative Array (I found the in keyword very handy), but I'll probably end up using the RedBlackTree sometime soon.Calumnious
C
4

Not that I know of.

Your best bet is to just use a hashtable by the keys (bool[key] yourTable;) and ignore the values.

Colpitis answered 11/12, 2011 at 1:54 Comment(1)
That's a good idea, especially since they have a remove function. Thanks!Calumnious
P
1

I know of at least one: http://www.dsource.org/projects/dcollections

Plumbing answered 11/12, 2011 at 2:2 Comment(5)
std.container.RedBlackTree's implementation comes from dcollections, though its API does not.Intermezzo
Does that mean that other parts may have a chance at getting into the standard library? std.container looks a little sparse...Calumnious
IIRC there is a bit of a disagreement between Steven and Andrei as to what the API should be like, so it didn't get merged in. I'm not familiar with specifics though.Ziagos
The reason that std.container is so sparse is that the custom allocator API hasn't been fully fleshed out yet, and Andrei doesn't want to have to change all of the containers after it's been completed. Once that's been sorted out, many more containers will be added. Their implementations more or may not come from dcollections, but their API will not, because std.container has a different API that it's using. The primary disagreement between Andrei and Steven has to do with cursors. dcollections has cursors, std.container does not and will not. It uses ranges exclusively.Intermezzo
@tjameson, to add on what Jonathan said. Yes, both parties are willing to have such migrated. Steven however may not be the one to do it, others are welcome to.Rubdown
C
1

Improving on Mehrdad's suggestion of using bool[key], you can avoid some space wastage by using byte[0][key]. byte[0] is a static array of size zero, so it's a type that uses no space. Usage:

byte[0][string] mySet;

// Insert an element.
mySet["foo"] = (byte[0]).init;

// Lookup
assert("foo" in mySet);

// Remove
mySet.remove("foo");
Camaraderie answered 11/12, 2011 at 17:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.