Array, EAV, Serialized LOB for custom fields?
Asked Answered
U

2

8

I've been trying to answer a complex Mysql data structure problem for custom fields for an online app. I'm fairly new to Mysql so any input is appreciated.

The current database is a relational database and each user of the service will share the same database and tables.

Here is an example of what I'm trying to do.

Let's say I'm trying to create a list. This list can contain up to 30 custom fields. The user can choose between 12 unique elements and each element can have up to 15 user defined attributes.

Each list can be unique within an account as well as between accounts. Accounts can have numerous lists and each list could have different quantities of elements as well as different attributes per element.

An element can be many things, for example: multiple choice, radio button, phone field, address, single line text, multi-line text, etc.

An example of attributes for a multiple choice (checkbox) element could be: red, green, blue, orange, white, black

An example of a single line text element could be: First Name input field.

Each element must also have a user defined title field and tag field which can be referenced and used in other features of the app.

Segmentation is very important as well. A user needs to be able to segment a list based on any element. For example, a user may want to segment list "ABC" based on all records where "red" is present in multiple choice element #1 (they may have more than 1 multiple choice element for a list).

In this example I would assume that arrays, EAV, Serialized LOB would work fine. However, I'm not sure what would be the best structure for my needs at my scale.

In reality, there will most likely be up to 50,000 records per list and there is a real possibility of 20,000+ accounts - each with numerous lists. Therefore, I'm looking for the most efficient and flexible structure.

To make matters even more complex I also need to ensure an efficient way to add/ delete elements to any particular list at any given time. For example, if a user creates a list with the maximum allow number of custom fields (30) and then three months later decides they want to delete a field, I need a way to find that list and all associated values for that custom field and then delete all the values, element type and its attributes. The user would then be allowed to add a new element to this list.

I've reviewed many of the EAV posts on this site, as well as this http://www.martinfowler.com/eaaCatalog/serializedLOB.html It doesn't seem that EAV would be very efficient for my needs due to the data retrieval downsides.

I was also wondering how well a multi-dimensional array would work at this scale? I believe wordpress uses this for their custom fields.

Any input would be greatly appreciated as to how best to structure the database for this situation. Thank you!

Undergarment answered 11/10, 2010 at 19:26 Comment(1)
I'm also facing the same challenge - what solution did you go with? I'd be very interested in your experiences.Playsuit
R
3

You can read about how FriendFeed implements custom fields: http://bret.appspot.com/entry/how-friendfeed-uses-mysql

They use a combination of Serialized LOB, with extra tables containing inverted indexes. You don't need an extra table for every possible attribute in your LOB, only the ones you want to search for with assistance from an index.

Rioux answered 16/11, 2011 at 19:10 Comment(2)
@wangyiran, not surprising, that blog is over 6 years old. I describe the same technique of inverted indexing in my presentation here: slideshare.net/billkarwin/extensible-data-modeling.Rioux
@BillKarwin I recommend everyone to check that link with slidesNorton
L
0

You can use json enconding and decoding (i'm assuming you're using PHP) to store the input info in a table with a collumn to store the user and other to store this data as text. The answers have to be stored in another table (with a FK to use CASCADE ON DELETE).

If you can specify the max size of the input specification, use a varchar field.

This can't be the best aprouch (need some profiling tests to make sure it's robust enough) but can sure be used.

Lobate answered 15/11, 2011 at 19:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.