How to implement user defined fields and grouping for the multi-tenant application: EAV, fixed tables pattern, NoSQL
Asked Answered
S

1

12

I'm working at a SaaS, where any tenant can have several lists of contacts, each list can have any number of custom fields contacts of this list can store and any number of groups the contancts of the list can be includes (groups are used for segmenting contacts of the list). Each contact has one required fields: email_address and any number of userdefined fields that are defined for the list where it is as I mentioned about. We must be able to find contacts of the lists base on the groups they're in and the values of user defined values. We must provise up to 30 user defined fields. I see now three ways of solving this problem:

  1. Using kind of EAV (we try to do it like this) but it looks rather complex. We have a table lists (lists of tenants), a related tables custom_fields, a related tables subscribers that stored email_addreses of subscribers of the list, table subscribers_custom_data which is related with subscribers and custom_fields tables (stored values of the custom fields of the subscribers).

  2. Field tables pattern. The descriptions of it is here http://blog.springsource.com/arjen/archives/2008/01/24/storing-custom-fields-in-the-database/. In thise case we'd use a field related to custom fields, that would store in columns all custom fields for example have 30 columns to store values of each possible custom fields and a table that stored mapping of columns name and name of the user defined field. It looks complex too. We'd have to have at least 30 indexes at least to search by the values of custom fields, there other problems too,

  3. To use some sort of NoSQL database at least for storing of user defined fields and maybe groups of the list. Do you think such databases can help here and if so how to design for storing custom fields and groups. I try to look at different types NoSQL, for example, document oriented like MongoDb, but right away I can't see how it can help solve this problem. We can store arbitrary attributes here but to search for the values of the custom fields we need to index them in advance so we have to know what custom fields we'll have.

Thank you for any information about it.

Sobersided answered 21/11, 2010 at 8:38 Comment(0)
J
12

If you want all fields to be indexed all the time, try a technology like Apache Solr that indexes everything. The main purpose of Solr is to be a fulltext search engine, but it's basically a document-oriented database.

Here are comments about other options:

  1. EAV is no good, and I'm against use it. It breaks many rules of relational database design, and it won't scale. I've written about this on Stack Overflow a lot, so search for my answers under the eav tag.

  2. You don't need just 30 indexes -- you need up to 30-factorial indexes to handle any possible combination of indexes. Keep in mind you can make multi-column indexes, and these types of indexes are important to support certain queries. Of course this is totally impractical to create so many indexes; you need to create indexes to match the queries you want to optimize for. If you don't know what fields you'll have and what queries you'll have against them, you can't optimize.

  3. Document-oriented databases like MongoDB/CouchDB aren't magic, no matter how much their advocates try to claim that they are. They requires that you index documents for fast searches, and that means you need to know the indexable fields of a document.

    Creating an index at runtime is a problem, because it can take a long time, depending on how much data there is to index. You'll have to find a way to run the index creation "offline" (i.e. don't make the user wait for it during a single http request) and then notify them when it's complete.

  4. You should read about How FriendFeed uses MySQL to store schema-less data. They use a Serialized LOB, basically combine all the custom attributes together into one XML or JSON blob. So users can create any number of additional custom fields anytime they want. But before a given custom field can be made searchable, you would create a child table that references rows where that field contains a given value. Thus you get an index that is only as large as the number of instances of a given user-defined custom field. And you don't need to make every field searchable.

Jurassic answered 23/1, 2012 at 7:23 Comment(9)
With my usecase, I will be able to know the indexable fields (but only at runtime). The tenants are able to define their own field set (by chosing from a set of pre-defined and/or adding their own field descriptors). So at that time they add a new field, I should be able to fire a (sparse) index creation. So for this particular scenario, would a document oriented store fit best ?Wegner
Yes, a document store could work in this situation. See my edit above.Jurassic
As for using blob it's not clear how to delete/edit custom fields. For example, the user can delete a field in it's container and the field should be deleted in all entities of this container. Can you please tell me how to delete/edit custom fields and reflect it in all entities? Besides the for example adding/deleting custom fields by use should make it available using and filtering. In case EAV cascade deleted happens. The size of one field is limited and it's hard to predict what it will hit limit of blob size. But It's difficult to say if nosql can give any benefits.Sobersided
Bill, have you ever met using NoSQL as alternative of EAV?Sobersided
The problem is that custom fields should be added/edited/deleted online by user and the changes should be reflected to all entities at once. I'd rather say different rows can have different sets of attributes.Sobersided
@user366534: Most NoSQL databases don't solve that requirement either. That is, delete a custom attribute and have it change all entries that have that attribute. You'd have to manually update each document. NoSQL databases also share with EAV the problems that you can't have constraints or any assurance that any particular set of attributes exist. But at least when you use NoSQL, those failings aren't breaking any rules. :-)Jurassic
In case of custom fields delete data associated with custom fields and effective filtering seem big issues not matter what approach to try to use.Sobersided
Yes -- when the requirements are complex (e.g. support custom fields per user), then the solution is bound to be complex. See TANSTAAFL.Jurassic
There is also a problem that in other method than EAV (it fields of different type are stored in different tables) we can;t guarantee type of custom fields both in blob and perhaps in nosql solutions (to make sure that integer field is proccessed as integer and can't be anything else). It look like it's more complex taks I that I thought..Sobersided

© 2022 - 2024 — McMap. All rights reserved.