Is SimpleDB similar to MongoDB?
The most substantial similarity is that they both avoid the relational model. Other than that, they differ in almost every respect. Here is a breakdown of a dozen or so ways to compare them.
SimpleDB
- An Amazon service hosted, maintained and scaled by Amazon. You are billed for what you use each month beyond the free usage tier.
- All data is replicated live in the background across multiple data centers
- All replicas are able to service live requests
- After a network or server failure any out of sync nodes will resync automatically
- Background replication results in eventual consistency but higher (theoretical) availability
- All data is stored as String name / String value pairs, each associated with an ItemName
- Each item is limited to half a megabyte (each name or value can only be 1024 bytes long, each item holds 256 name / value pairs) and each domain can hold 10GB
- These limits make it suitable for data sets that can be broken down into small pieces.
- SimpleDB is optimized for many small requests executed in parallel
- Throughput limits are in place for each domain of data
- Horizontal scalability is achieved by spreading your data across more domains
- All attribute values are indexed automatically; compound indexes don't exist (but can be simulated)
- Queries are performed using a (stripped down) SQL Select-like query language
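As a rough illustration of the limits and the domain-based scaling listed above, the sketch below checks an item against the per-item limits quoted here (256 name/value pairs, 1024 bytes per name or value) and picks a domain by hashing the item name. The function names, the `users_N` domain names and the hashing scheme are assumptions made up for this example; they are not part of any SimpleDB client library.

```python
import hashlib

MAX_PAIRS_PER_ITEM = 256         # limit quoted in the list above
MAX_NAME_OR_VALUE_BYTES = 1024   # limit quoted in the list above


def fits_item_limits(attributes):
    """Return True if a list of (name, value) string pairs fits the quoted limits."""
    if len(attributes) > MAX_PAIRS_PER_ITEM:
        return False
    for name, value in attributes:
        if len(name.encode("utf-8")) > MAX_NAME_OR_VALUE_BYTES:
            return False
        if len(value.encode("utf-8")) > MAX_NAME_OR_VALUE_BYTES:
            return False
    return True


def domain_for(item_name, domains):
    """Map an item name to one domain using a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(item_name.encode("utf-8")).digest()
    return domains[int.from_bytes(digest[:4], "big") % len(domains)]


# Hypothetical domain names; each domain has its own size and throughput limits.
DOMAINS = ["users_0", "users_1", "users_2", "users_3"]

# Worst-case item size: 256 pairs * (1024-byte name + 1024-byte value) = 512 KiB,
# which is the "half a megabyte" mentioned above.
MAX_ITEM_BYTES = MAX_PAIRS_PER_ITEM * 2 * MAX_NAME_OR_VALUE_BYTES
```

With a scheme like this, a lookup by item name goes straight to one domain, while a query over all items has to be sent to every domain and merged by the application.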
MongoDB
- An open source product that you install and maintain on your own servers.
- Data can be replicated in master-slave mode
- Only the master can service live write requests; slaves can service queries (except in the non-recommended limited master-master mode)
- After a network or server failure or when a replica falls too far behind, operator intervention will always be required.
- The single master is strongly consistent.
- All data is stored as serialized JSON documents, allowing a large set of data types
- Each document is limited to 4MB; larger documents can be stored using a special document chunking system
- Most suitable for small and medium-sized data, and small binary objects
- Throughput limits are dictated by MongoDB and your hardware
- Vertical scalability via a bigger server, potential for future horizontal scalability across your own server cluster via a sharding module currently in development.
- The document id is indexed automatically. Indexes can be created and deleted as needed. Indexes can be for a single key or compound.
- Queries are performed using a JSON style query language.
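To give a feel for the JSON-style query language, here is a filter document written as a Python dict, together with a toy matcher that evaluates a small subset of operators (`$gt`, `$lt`, `$in`) against in-memory documents. The matcher only illustrates the shape of such queries and is not MongoDB's implementation; the field names and sample documents are made up.

```python
def matches(document, query):
    """Evaluate a tiny subset of MongoDB-style filters against one dict."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            # A dict of operators, e.g. {"$gt": 18, "$lt": 40}; all must hold.
            for op, operand in condition.items():
                if op == "$gt":
                    ok = value is not None and value > operand
                elif op == "$lt":
                    ok = value is not None and value < operand
                elif op == "$in":
                    ok = value in operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif value != condition:
            # A plain value means equality.
            return False
    return True


people = [
    {"name": "Ann", "age": 31, "city": "Oslo"},
    {"name": "Bob", "age": 17, "city": "Oslo"},
    {"name": "Cid", "age": 45, "city": "Rome"},
]

# "city equals Oslo AND age greater than 18", expressed as a document.
query = {"city": "Oslo", "age": {"$gt": 18}}
adults_in_oslo = [p["name"] for p in people if matches(p, query)]
```

With a driver such as PyMongo, a dict of this shape would be passed to `collection.find(query)` and evaluated by the server instead.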
SimpleDB is described as:
The data model is simply:
- Large collections of items organized into domains.
- Items are little hash tables containing attributes of key, value pairs.
- Attributes can be searched with various lexicographical queries.
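Because attribute values are strings and comparisons are lexicographical, numbers are usually encoded so that string order matches numeric order. A minimal sketch, assuming non-negative integers and a fixed width of 10 digits (both choices, and the function names, are assumptions for illustration):

```python
WIDTH = 10  # assumed fixed width; must cover the largest value you expect


def encode_int(n):
    """Zero-pad a non-negative integer so string order equals numeric order."""
    if n < 0 or n >= 10 ** WIDTH:
        raise ValueError("value out of range for this encoding")
    return str(n).zfill(WIDTH)


def decode_int(s):
    """Turn a zero-padded string back into an integer."""
    return int(s)


raw = ["9", "10", "200"]
padded = [encode_int(int(x)) for x in raw]

# Unpadded strings sort as text: "10" < "200" < "9".
# Padded strings sort in numeric order: 9 < 10 < 200.
```

Dates can be handled the same way by storing them in a fixed-width format such as ISO 8601 timestamps, which sort correctly as plain strings when they share a format and time zone.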
MongoDB is a bit simpler:
The database manages collections of JSON-like documents which are stored in a binary format referred to as BSON.
I have a decent knowledge of MongoDB and have just started to work with SimpleDB. First of all, neither of them is a key-value store. MongoDB and SimpleDB are both schema-free, document-based NoSQL databases. This means that you do not need to create a schema for a 'table' before entering data into it (basically, you can store whatever you want there).
That is basically where the similarity ends. I will use S for SimpleDB and M for Mongo.
- M is written in C++, S is written in Erlang (not the fastest language)
- M is open source and can be installed anywhere; S is proprietary and runs only on Amazon AWS. You also have to pay for a whole bunch of stuff with S
- S has a whole bunch of strange limitations. M's limitations are way more reasonable. The strangest limitations are:
  - maximum size of a domain (table) is 10 GB
  - attribute value length (size of a field) is 1024 bytes
  - maximum items in a Select response: 2500
  - maximum response size for Select (the maximum amount of data S can return to you): 1 MB
- S supports only a few languages (Java, PHP, Python, Ruby, .NET); M supports way more
- both support REST
- S has a query syntax very similar to SQL (but way less powerful). With M you need to learn a new syntax which looks like JSON (although the basics are straightforward to learn)
- with M you have to learn how to architect your database. Many people think that schemaless means you can throw any junk into the database and extract it with ease, so they might be surprised that the "junk in, junk out" maxim still applies. I assume the same is true of S, but cannot claim it with certainty.
- neither allows case-insensitive search. In M you can use a regex to work around this limitation (ugly, no index) without introducing an additional lowercase field or application logic.
- in S sorting can be done only on one field
- because of the 5-second time limit, count in S can behave strangely. If 5 seconds pass and the query has not finished, you end up with a partial count and a token which allows you to continue the query. Application logic is responsible for collecting all these partial counts and summing them up.
- everything is a UTF-8 string, which makes it a pain in the ass to work with non-string values (like numbers and dates) in S. M's type support is way richer.
- neither has transactions or joins
- M supports compression which is really helpful for nosql stores, where the same field name is stored all-over again.
- S supports just a single index; M has single, compound, multi-key, geospatial, etc.
- both support replication and sharding
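The partial-count behaviour described in the list above means the application has to loop until no continuation token comes back. A minimal sketch of that loop follows; `run_count` is a hypothetical callable standing in for whatever client call you use, and the fake service only simulates the behaviour so the example can run on its own.

```python
def total_count(run_count):
    """Sum partial counts until the service stops returning a token.

    run_count is a hypothetical callable: it takes the previous token
    (or None on the first call) and returns (partial_count, next_token).
    """
    total = 0
    token = None
    while True:
        partial, token = run_count(token)
        total += partial
        if token is None:
            return total


def make_fake_service(chunks):
    """Simulate a service that returns one partial count per request."""
    def run_count(token):
        index = 0 if token is None else int(token)
        next_index = index + 1
        next_token = str(next_index) if next_index < len(chunks) else None
        return chunks[index], next_token
    return run_count


# Three requests are needed here before the count is complete.
fake = make_fake_service([120000, 95000, 3100])
result = total_count(fake)
```

Keeping the loop separate from the client call makes it easy to swap in a real request function later without touching the summing logic.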
One of the most important things you should consider is that SimpleDB has a very rudimentary query language. Even basic things like group by, sum, average and distinct, as well as data manipulation, are not supported, so the functionality is not really much richer than Redis/Memcached. On the other hand, Mongo supports a rich query language.
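In practice this means aggregation over SimpleDB results ends up in application code. A small sketch of what that can look like, assuming the rows have already been fetched and that values arrive as strings (the sample rows, field names and function name are made up for illustration):

```python
from collections import defaultdict

# Rows as they might come back from a Select: every value is a string.
rows = [
    {"city": "Oslo", "amount": "10"},
    {"city": "Rome", "amount": "5"},
    {"city": "Oslo", "amount": "30"},
]


def group_sum_avg(rows, key, value):
    """Group rows by `key`; return {group: (sum, average)} for `value`."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(int(row[value]))
    return {k: (sum(v), sum(v) / len(v)) for k, v in buckets.items()}


summary = group_sum_avg(rows, "city", "amount")           # group by + sum + average
distinct_cities = sorted({row["city"] for row in rows})   # distinct
```

This works for small result sets; for large ones, the response size and item limits on Select mean the rows have to be fetched page by page first.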