Key-Value Stores vs. RDBMs vs. "Cloud" DBs (SDB) [closed]

M

2

6

I'm comfortable in the MySQL space having designed several apps over the past few years, and then continuously refining performance and scalability aspects. I also have some experience working with memcached to provide application side speed-ups on frequently queried result sets. And recently I implemented the Amazon SDB as my primary "database" for an ecommerce experiment.

To oversimplify, a quick justification I went through in my mind for using the SDB service was that using a schema-less database structure would allow me to focus on the logical problem of my project and rapidly accumulate content in my data-store. That is, don't worry about setting up and normalize all possible permutations of a product's attributes before hand; simply start loading in the products and the SDB will simply remember everything that is available.

Now that I have managed to get through the first few iterations of my project and I need to setup simple interfaces to the data, I am running to issues that I had taken for granted working with MySQL. Ex: grouping in select statements and limit syntax to query "items 50 to 100". The ease advantage I gained using schema free architecture of SDB, I lost to a performance hit of querying/looping a resultset with just over 1800 items.

Now I'm reading about projects like Tokyo Cabinet that are extending the concept of in-memory key-value stores to provide pseudo-relational functionality at ridiculously faster speeds (14x i read somewhere).

My question: Are there some rudimentary guidelines or heuristics that I as an application designer/developer can go through to evaluate which DB tech is the most appropriate at each stage of my project.

Ex: At a prototyping stage where logical/technical unknowns of the application make data structure fluid: use SDB. At a more mature stage where user deliverables are a priority, use traditional tools where you don't have to spend dev time writing sorting, grouping or pagination logic.

Practical experience with these tools would be very much appreciated.

Thanks SO!

Shaheeb R.

Martinemartineau answered 13/3, 2009 at 16:49 Comment(0)

P

4

The new solutions are not silver bullets.

Compared to traditional RDBMS, these systems make improvements in some aspect (scalability, availability or simplicity) by trading-off other aspects (reduced query capability, eventual consistency, horrible performance for certain operations).

Think of these not as replacements of the traditional database, but they are specialized tools for a known, specific need.

Take Amazon Simple DB for example, SDB is basically a huge spreadsheet, if that is what your data looks like, then it probably works well and the superb scalability and simplicity will save you a lot of time and money.

If your system requires very structured and complex queries but you insist with one of these cool new solution, you will soon find yourself in the middle of re-implementing a amateurish, ill-designed RDBMS, with all of its inherent problems.

In this respect, if you do not know whether these will suit your need, I think it is actually better to do your first few iterations in a traditional RDBMS because they give you the best flexibility and capability especially in a single server deployment and under modest load. (see CAP Theorem).

Once you have a better idea about what your data will look like and how will they be used, then you can match your need with an alternative solution.

If you want the simplicity of a cloud hosted solution, but needs a relational database, you can check out: Amazon Relational Database Service

Preshrunk answered 25/10, 2011 at 2:34 Comment(0)

F

5

The problems you are finding are why RDBMS specialists view some of the alternative systems with a jaundiced eye. Yes, the alternative systems handle certain specific requirements extremely fast, but as soon as you want to do something else with the same data, the fleetest suddenly becomes the laggard. By contrast, an RDBMS typically manages the variations with greater aplomb; it may not be quite as fast as the fleetest for the specialized workload which the fleetest is micro-optimized to handle, but it seldom deteriorates as fast when called upon to deal with other queries.

Furgeson answered 13/4, 2009 at 5:12 Comment(1)

So true, people are hyping no-SQL solutions as a magic hammer replacement instead of a specialized surgeon's tool for specific situations. – Preshrunk 1/11, 2011 at 23:9

P

4