Help me understand mnesia (NoSQL) modeling

Asked 6/11, 2010 at 13:41 Answered 24/11, 2010 at 7:36

In my quest to understand Mnesia, I still struggle with thinking in relational terms. So I will put my struggles up here and ask for the best way to solve them.

One-to-many relations

Say I have a bunch of people,

-record(contact, {name, phone}).

Now, I know that I can define phone to always be saved as a list, so people can have multiple phone numbers, and I suppose that's the way to do it (is it? How would I then look this up the other way around, say, finding the name for a number?).

Many-to-many relations

Now let's suppose I have multiple groups I can put people in. The group names don't have any significance, they are just names; the concept is "unix system groups" or "labels". Naively, I would model this membership as a proplist, like

{groups, [{friends, bool()}, {family, bool()}, {work, bool()}]} %% and so on...

as a field within the "contact" record from above, for example. What is the best way to model this within Mnesia if I want to be able to look up all members of a group quickly by group name, and also look up all groups an individual is registered in? I could also just model this as a list containing only the group identifiers, of course. For use with Mnesia, what is the best way to model this?

I apologize if this question is dumb. There's plenty of documentation on mnesia, but it's lacking (IMO) some good examples for the overall use.

Grave answered 6/11, 2010 at 13:41 Comment(1)

No need to apologize IMHO the question is not dumb at all +1 for that – Organizer 6/11, 2010 at 17:2

For the first example, consider this record:

-record(contact, {name, phone = []}). %% phone :: [PhoneNumber]

contact is a record with two fields, name and phone, where phone is a list of phone numbers. As user425720 said, it could make sense to store these as something other than strings (binaries, for example) if you have extreme requirements for a small storage footprint.

Now here comes the part that is hard to "get" with key-value stores: you need to also store the inverse relationship. In other words, you need something similar to the following:

-record(phone, {phonenumber, contactname}).

If you have a layer in your application to abstract away database handling, you could make it always add/change the phone records when adding/changing a contact.
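As a rough sketch of such a layer (the module name contacts_sketch and the function names are made up for illustration, and the tables are plain RAM tables created on the local node), both records can be written in one transaction so the forward and inverse entries stay together:

```erlang
-module(contacts_sketch).
-export([init/0, add_contact/2, name_for_number/1]).

-record(contact, {name, phone = []}).
-record(phone, {phonenumber, contactname}).

%% Start Mnesia and create both tables (RAM only, local node).
init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(contact,
        [{attributes, record_info(fields, contact)}]),
    {atomic, ok} = mnesia:create_table(phone,
        [{attributes, record_info(fields, phone)}]),
    ok.

%% Write the contact and one inverse phone record per number,
%% all inside a single transaction.
add_contact(Name, Numbers) ->
    mnesia:transaction(fun() ->
        ok = mnesia:write(#contact{name = Name, phone = Numbers}),
        lists:foreach(
            fun(N) ->
                ok = mnesia:write(#phone{phonenumber = N, contactname = Name})
            end,
            Numbers)
    end).

%% Reverse lookup: phone number -> contact name.
name_for_number(Number) ->
    case mnesia:transaction(fun() -> mnesia:read(phone, Number) end) of
        {atomic, [#phone{contactname = Name}]} -> {ok, Name};
        {atomic, []} -> not_found
    end.
```

Note that this sketch only covers adding. When a contact's numbers change, the same layer would also have to delete the phone records for numbers that were removed.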

For the second example, consider these two records:

-record(contact, {uuid, name, group_ids = []}).   %% group_ids :: [GroupUuid]
-record(group, {uuid, name, contact_ids = []}).   %% contact_ids :: [ContactUuid]

The easiest way is to just store ids pointing to the related records. As Mnesia has no concept of referential integrity, the two sides can get out of sync, for example if you delete a group without removing that group's id from every contact that references it.

If you need to store the type of group on the contact record, you could use the following:

-record(contact, {name, groups = []}). %% groups, e.g. [{family, [GroupId, GroupId]}, {work, [...]}]

Your second problem could also be solved by using an intermediate record, which you can think of as "membership".

-record(contact, {uuid, name, ...}).
-record(group, {uuid, name, ...}).
-record(membership, {contact_uuid, group_uuid}). %% must use 'bag' table type

There can be any number of "membership" records. There will be one record for every group a user belongs to, that is, one per contact/group pair.
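A minimal sketch of the membership table, assuming a RAM-only table on the local node (the module name membership_sketch and the function names are hypothetical). The secondary index on group_uuid is my addition here: it is what lets you go from a group to its members without scanning the whole table, while the lookup by contact uses the primary key.

```erlang
-module(membership_sketch).
-export([init/0, join/2, groups_of/1, members_of/1]).

-record(membership, {contact_uuid, group_uuid}).

%% A bag table allows several records with the same key (contact_uuid).
%% The index on group_uuid supports the lookup in the other direction.
init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(membership,
        [{type, bag},
         {attributes, record_info(fields, membership)},
         {index, [group_uuid]}]),
    ok.

%% Record that a contact belongs to a group.
join(ContactId, GroupId) ->
    mnesia:transaction(fun() ->
        mnesia:write(#membership{contact_uuid = ContactId,
                                 group_uuid = GroupId})
    end).

%% All groups of one contact (primary key lookup).
groups_of(ContactId) ->
    {atomic, Rows} = mnesia:transaction(fun() ->
        mnesia:read(membership, ContactId)
    end),
    [G || #membership{group_uuid = G} <- Rows].

%% All members of one group (secondary index lookup).
members_of(GroupId) ->
    {atomic, Rows} = mnesia:transaction(fun() ->
        mnesia:index_read(membership, GroupId, #membership.group_uuid)
    end),
    [C || #membership{contact_uuid = C} <- Rows].
```

The order of the returned lists is not something I would rely on, so sort them if order matters.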

Bombastic answered 24/11, 2010 at 7:36 Comment(0)

-1

First of all, you ask for key-value store design patterns. Perfectly fine. Before I try to answer your question, let's make it clear what Mnesia is. It is a k-v DB which is included in OTP. Because it is native, it is very comfortable to use from Erlang. But be careful. This is an old database with very old assumptions (e.g. data distribution with linear hashing). So go ahead, learn and play with it, but for production take your time and browse the NoSQL shop to find the best fit for your needs.

@telephone example. Do not store stuff as strings (list()) - it is very heavy for GC. I would make a couple of fields like phone_1 :: <<binary>>, phone_2 :: <<binary>>, phone_extra :: [<<binary>>] and build an index on the most frequently queried field. Also, Mnesia indices are tricky - when a node crashes and comes back up, they need to rebuild themselves (which can take an awful lot of time).

@family example. It is quite hard with a flat namespace. You may play with more complex keys. Maybe create a separate table for TheGroup and keep the identifiers of its members? Or each member could hold the ids of the groups he belongs to (hard to maintain). If you want to recognize friends, I would implement some sort of contract before presenting data (A is B's friend iff B is A's friend) - this approach would cope with eventual consistency and conflicts in data.

Mcinnis answered 6/11, 2010 at 23:13 Comment(2)

While I didn't vote you down, it seems clear to me that you didn't get what I was asking in the second question (or I didn't make it clear enough). I want to model many-to-many-relationships; "family" and "friends" didn't refer to any concept in particular. I also already pointed out that lists with members are a way, but asked for a better way. – Kibbutznik 7/11, 2010 at 17:1

yeah, nev. However, your tuple example seems to be what I wrote in the second part. The first part says to model many-to-many as a separate table holding the keys of members. It depends on the DB. In Riak there is a 2-level key structure (bucket, key)-(value), so you may have all group members in one bucket. By definition, KV stores are not good at representing relations. – Mcinnis 7/11, 2010 at 17:49
