The Next-gen Databases [closed]
K

8

55

I'm learning traditional relational databases (with PostgreSQL), and while doing some research I've come across some new types of databases: CouchDB, Drizzle, and Scalaris, to name a few. What are the next database technologies I should expect to deal with?

Kellie answered 12/11, 2008 at 2:2 Comment(5)
Could someone please update this question to refer to "databases" instead of "SQL"?Flexure
Even though randin is using the term SQL incorrectly, I think that change would be against the spirit of peer editing.World
too late.. sorry Bill. Feel free to roll back my edit if you feel strongly. I made my change before you posted your comment. I think rephrasing it the way I did is both educational to the OP and more useful to the community.Gibe
Well, it's good to be correct. A tech writer friend of mine said, "you can't get the right answers unless you ask the right questions."World
Ah, sorry about the misleading question, my knowledge of SQL and databases was non-existent when I asked the question.Kellie
W
105

I would say next-gen database, not next-gen SQL.

SQL is a language for querying and manipulating relational databases. SQL is dictated by an international standard. While the standard is revised, it seems to always work within the relational database paradigm.

Here are a few new data storage technologies that are getting attention (circa 2008 when I wrote this answer):

  • CouchDB is a non-relational database. They call it a document-oriented database.
  • Amazon SimpleDB is also a non-relational database accessed in a distributed manner through a web service. Amazon also has a distributed key-value store called Dynamo, which powers some of its S3 services.
  • Dynomite and Kai are open source solutions inspired by Amazon Dynamo.
  • BigTable is a proprietary data storage solution used by Google, and implemented using their Google File System technology. Google's MapReduce framework uses BigTable.
  • Hadoop is an open-source technology inspired by Google's MapReduce, and serving a similar need, to distribute the work of very large scale data stores.
  • Scalaris is a distributed transactional key/value store. Also not relational, and does not use SQL. It's a research project from the Zuse Institute in Berlin, Germany.
  • RDF is a standard for storing semantic data, in which data and metadata are interchangeable. It has its own query language SPARQL, which resembles SQL superficially, but is actually totally different.
  • Vertica is a highly scalable column-oriented analytic database designed for distributed (grid) architecture. It does claim to be relational and SQL-compliant. It can be used through Amazon's Elastic Compute Cloud.
  • Greenplum is a high-scale data warehousing DBMS, which implements both MapReduce and SQL.
  • XML isn't a DBMS at all, it's an interchange format. But some DBMS products work with data in XML format.
  • ODBMS, or Object Databases, are for managing complex data. There don't seem to be any dominant ODBMS products in the mainstream, perhaps because of lack of standardization. Standard SQL is gradually gaining some OO features (e.g. extensible data types and tables).
  • Drizzle is a relational database, drawing a lot of its code from MySQL. It includes various architectural changes designed to manage data in a scalable "cloud computing" system architecture. Presumably it will continue to use standard SQL with some MySQL enhancements.
  • Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store, developed at Facebook by one of the authors of Amazon Dynamo, and contributed to the Apache project.
  • Project Voldemort is a non-relational, distributed, key-value storage system. It is used at LinkedIn.com.
  • Berkeley DB deserves some mention too. It's not "next-gen" because it dates back to the early 1990s. It's a popular key-value store that is easy to embed in a variety of applications. The technology is currently owned by Oracle Corp.
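
To make the terms in the list above more concrete, here is a rough sketch of the same record handled in three ways: relational, key-value, and document-oriented. It uses only Python's standard library (`sqlite3`, `json`, and plain dicts) as stand-ins. The table name, keys, and field names are made up for illustration, and none of this is the API of any product named above.

```python
import json
import sqlite3

# Relational: a fixed set of columns, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)", (1, "Ada", "Berlin")
)
row = conn.execute("SELECT name, city FROM users WHERE id = ?", (1,)).fetchone()

# Key-value: the store only knows the key; the value is an opaque blob
# that the application has to encode and decode itself.
kv_store = {}
kv_store["user:1"] = json.dumps({"name": "Ada", "city": "Berlin"})
value = json.loads(kv_store["user:1"])

# Document-oriented: each record describes itself, and different
# documents may carry different fields.
documents = [
    {"_id": "user:1", "name": "Ada", "city": "Berlin"},
    {"_id": "user:2", "name": "Bob", "languages": ["SQL", "Python"]},
]
berliners = [doc["name"] for doc in documents if doc.get("city") == "Berlin"]
```

In the relational version the database enforces the schema and answers ad hoc queries. In the other two, more of that responsibility moves into the application.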

Also see this nice article by Richard Jones: "Anti-RDBMS: A list of distributed key-value stores." He goes into more detail describing some of these technologies.

Relational databases have weaknesses, to be sure. People have been arguing that they don't handle all data modeling requirements since the day the relational model was first introduced.

Year after year, researchers come up with new ways of managing data to satisfy special requirements: either requirements to handle data relationships that don't fit into the relational model, or else requirements of high-scale volume or speed that demand data processing be done on distributed collections of servers, instead of central database servers.
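
As a rough illustration of spreading data over a collection of servers instead of one central server, here is a minimal sketch of hash partitioning in Python. The server names and the `PartitionedStore` class are hypothetical, and the "servers" are just in-memory dicts. It is a simplification: Dynamo-style systems use consistent hashing rather than a plain modulo, because a modulo scheme moves most keys whenever a server is added or removed.

```python
import hashlib

SERVERS = ["node-a", "node-b", "node-c"]  # hypothetical server names


def server_for(key, servers=SERVERS):
    """Pick a server for a key by hashing the key (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


class PartitionedStore:
    """Toy key-value store whose data is split across several 'servers'."""

    def __init__(self, servers=SERVERS):
        self.servers = list(servers)
        # One in-memory dict per server stands in for a remote machine.
        self.shards = {name: {} for name in self.servers}

    def put(self, key, value):
        self.shards[server_for(key, self.servers)][key] = value

    def get(self, key):
        return self.shards[server_for(key, self.servers)].get(key)


store = PartitionedStore()
store.put("user:1", "Ada")
store.put("user:2", "Bob")
```

Every client that hashes the same key arrives at the same server, so no central index is needed to find a record.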

Even though these advanced technologies do great things to solve the specialized problem they were designed for, relational databases are still a good general-purpose solution for most business needs. SQL isn't going away.


I've written an article in php|Architect magazine about the innovation of non-relational databases, and data modeling in relational vs. non-relational databases. http://www.phparch.com/magazine/2010-2/september/

World answered 12/11, 2008 at 2:24 Comment(3)
Hey Bill, we do tend to answer the same questions a lot.. your answer here is thorough enough I don't feel writing my own would be of much use -- want to add some info about Vertica et al, and Greenplum and friends, to make it more complete?Gibe
Thank you Bill for the thorough answer, I'll just stick with PostgreSQL for the time being.Kellie
PostgreSQL is a fine choice for RDBMS. Have fun!World
M
25

I'm missing graph databases in the answers so far. A graph or network of objects is common in programming and can be useful in databases as well. It can handle semi-structured and interconnected information in an efficient way. Among the areas where graph databases have gained a lot of interest are the semantic web and bioinformatics. RDF was mentioned, and it is in fact a language that represents a graph. Here are some pointers to what's happening in the graph database area:

I'm part of the Neo4j project, which is written in Java but has bindings to Python, Ruby and Scala as well. Some people use it with Clojure or Groovy/Grails. There is also a GUI tool evolving.
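
To give a feel for the graph model, here is a minimal sketch in plain Python. It is not Neo4j's API, and the names and `KNOWS` relationships are invented for illustration. Relationships are stored as data in their own right, and a breadth-first search finds a shortest path between two nodes.

```python
from collections import deque

# Each relationship is its own record: (start node, type, end node).
relationships = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("carol", "KNOWS", "dave"),
    ("alice", "KNOWS", "erin"),
    ("erin", "KNOWS", "dave"),
]


def neighbors(node):
    """Yield nodes connected to `node`, following relationships both ways."""
    for start, _rel_type, end in relationships:
        if start == node:
            yield end
        elif end == node:
            yield start


def shortest_path(source, target):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path("alice", "dave")  # ['alice', 'erin', 'dave']
```

A real graph database indexes the relationships so that traversals like this do not have to scan every record.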

Maricruzmaridel answered 26/3, 2009 at 22:26 Comment(2)
How about db4o.com? It's an object database, but it's designed around managing object graphs.Denise
Object databases (OODB) are different from graph databases. Simply put, a graphdb won't tie your data directly to your object model. In a graphdb, relationships are first-class citizens, while you'd have to implement that on your own in an OODB. In a graphdb you can have different object types represent different views on the same data. Graphdbs typically support things like finding shortest paths.Maricruzmaridel
C
10

This might not be the best place for it, but I'd like to share this taxonomy of the NoSQL world created by Steve Yen (available at http://de.slideshare.net/northscale/nosqloakland-200911021):

  1. key‐value‐cache

    • memcached
    • repcached
    • coherence
    • infinispan
    • eXtreme scale
    • jboss cache
    • velocity
    • terracotta

  2. key‐value‐store

    • keyspace
    • flare
    • schema‐free
    • RAMCloud

  3. eventually‐consistent key‐value‐store

    • dynamo
    • voldemort
    • Dynomite
    • SubRecord
    • MongoDb
    • Dovetaildb

  4. ordered‐key‐value‐store

    • tokyo tyrant
    • lightcloud
    • NMDB
    • luxio
    • memcachedb
    • actord

  5. data‐structures server

    • redis

  6. tuple‐store

    • gigaspaces
    • coord
    • apache river

  7. object database

    • ZopeDB
    • db4o
    • Shoal

  8. document store

    • CouchDB
    • Mongo
    • Jackrabbit
    • XML Databases
    • ThruDB
    • CloudKit
    • Perservere
    • Riak Basho
    • Scalaris

  9. wide columnar store

    • BigTable
    • Hbase
    • Cassandra
    • Hypertable
    • KAI
    • OpenNep

Conakry answered 19/3, 2011 at 10:26 Comment(0)
D
2

For a look at the academic research being done in the area of next-gen databases, see: http://www.thethirdmanifesto.com/

In regard to the SQL language as a proper implementation of the relational model, I quote from Wikipedia: "SQL, initially pushed as the standard language for relational databases, deviates from the relational model in several places. The current ISO SQL standard doesn't mention the relational model or use relational terms or concepts. However, it is possible to create a database conforming to the relational model using SQL if one does not use certain SQL features."

http://en.wikipedia.org/wiki/Relational_model (referenced in the section "SQL and the relational model" on March 28, 2010)

Denise answered 28/3, 2010 at 11:15 Comment(0)
A
1

Not to be pedantic, but I would like to point out that at least CouchDB isn't SQL-based. And I would hope that the next-gen SQL would make SQL a lot less... fugly and non-intuitive.

Adolfo answered 12/11, 2008 at 2:5 Comment(2)
A friend of mine said, "It's supposed to be hard to read! It's called code for a reason!" :-)World
My brain is broken: I like SQL. Look at it too much and it grows on you :)Disoperation
U
1

There are special databases for XML, like MarkLogic and Berkeley DB XML. They can index XML documents, which can then be queried with XQuery. I'd expect JSON databases too; maybe they already exist. I did some googling but couldn't find one.

Unteach answered 22/3, 2009 at 17:30 Comment(1)
There are a few that provide a JSON interface to the data. Terrastore is one example.Anklet
L
0

SQL has been around since the early 1970s so I don't think that it's going to go away any time soon.

Maybe the 'new(-ish) SQL' will be OQL (see http://en.wikipedia.org/wiki/ODBMS).

Lengel answered 12/11, 2008 at 2:15 Comment(0)
D
0

I have also heard about NimbusDB, by Jim Starkey.

Jim Starkey is the man who "created" InterBase,

who worked on Vulcan (a Firebird fork),

and who was there at the beginning of Falcon for MySQL.

Dunghill answered 8/4, 2009 at 22:0 Comment(0)
