Transactions in NoSQL?
H

11

83

I'm looking into NoSQL as a scaling alternative to a relational database. What do I do if I need transaction-based operations that are sensitive to these kinds of things?

Hagood answered 6/2, 2010 at 5:55 Comment(4)
FYI... NoSQL databases are still DBs; they are just not relational. As for transactions, a transaction is simply a logical grouping of queries and updates. Non-relational DBs still provide both of those functions. What kind of things are sensitive to what things?Kindergarten
Well, I want to do money transactions, or at least think about them, but I still want some integrity in that sense.Hagood
How many terabytes of data do you have that you can't use a standard, mainstream RDBMS that has built-in transaction support?Mosora
@Mosora Number of TB of data has nothing to do with necessity to use NoSQL DBs. Maybe he wants to get rid of EAV model in his relational DB.Phosphorite
S
49

Generally speaking, NoSQL solutions have lighter-weight transactional semantics than relational databases, but still have facilities for atomic operations at some level.

Generally, the ones which do master-master replication provide less in the way of consistency, and more availability. So one should choose the right tool for the right problem.

Many offer transactions at the single-document (or row, etc.) level. For example, MongoDB provides atomicity at the level of a single document, but documents can be fairly rich, so this usually works pretty well -- more info here.

Sternutatory answered 6/2, 2010 at 19:35 Comment(7)
Some NoSQL databases, like MarkLogic, actually provide real ACID transactions.Disobey
RavenDB also provides real ACID transactions.Antinomy
FoundationDB also provides multi-key ACID transaction in multi-node cluster.Underwear
Neo4j is a NoSQL store and does provide ACID properties.Nobby
RavenDB does not provide true ACID transactions. It uses a weak form of isolation called "snapshot isolation". It provides global transactions via an external coordinator, but their use is discouraged (see foundationdb.com/acid-claims).Recitativo
OrientDB seems to support ACID transactions as wellBushed
Couchbase now supports ACID transactions as wellBaeyer
B
18

This is the closest answer I found which would apply to any NoSQL database. It's on a 2007 blog post from Adam Wiggins of Heroku.com:

The old example of using a database transaction to wrap the transfer of money from one bank account to another is total bull. The correct solution is to store a list of ledger events (transfers between accounts) and show the current balance as a sum of the ledger. If you’re programming in a functional language (or thinking that way), this is obvious.

From: A World Without SQL (His website is great for ideas on scalability.)

I interpreted the above paragraph as:

  1. Create a database for member accounts.
  2. Create a messaging queue. Nickname it "ledger".
  3. Add in background workers to fulfill each request in the queue.

More info. on queues/background workers: Building a Queue-Backed Feed Reader, Part 1

The client (aka member or customer) follows these steps to take out money:

  1. Submit a request to take out money.
  2. Request is sent to server.
  3. Server places it in a queue. The message is: "Take out $5,000."
  4. Client is shown: "Please wait as request is being fulfilled..."
  5. Client machine polls the server every 2 seconds asking, "Has the request been fulfilled?"
  6. On server, background workers are fulfilling previous requests from other members in first-in/first-out fashion. Eventually, they get to your client's request to take out money.
  7. Once request has been fulfilled, client is given a message with their new balance.
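The steps above can be sketched in a few lines of Python. This is only a rough, in-process illustration under stated assumptions: the names `ledger`, `requests`, `status`, `submit_withdrawal` and `run_demo` are hypothetical, the "queue" is Python's standard `queue.Queue` rather than a real message broker, and a single worker is assumed so that requests for the same account are never processed concurrently.

```python
import queue
import threading

# Append-only ledger of (account, amount) events; all names are hypothetical.
ledger = [("alice", 20_000)]          # opening deposit
requests = queue.Queue()              # pending requests, first-in/first-out
status = {}                           # request_id -> "pending" | "done" | "rejected"
_lock = threading.Lock()


def balance(account):
    """Current balance is the sum of all ledger events for the account."""
    return sum(amount for acct, amount in ledger if acct == account)


def submit_withdrawal(request_id, account, amount):
    """Steps 1-3: the server records the request and places it in the queue."""
    status[request_id] = "pending"
    requests.put((request_id, account, amount))


def worker():
    """Step 6: a single worker fulfils requests in first-in/first-out order."""
    while True:
        item = requests.get()
        if item is None:              # sentinel: stop the worker
            break
        request_id, account, amount = item
        with _lock:
            if balance(account) >= amount:
                ledger.append((account, -amount))
                status[request_id] = "done"
            else:
                status[request_id] = "rejected"


def run_demo():
    t = threading.Thread(target=worker)
    t.start()
    submit_withdrawal("r1", "alice", 5_000)
    submit_withdrawal("r2", "alice", 18_000)   # exceeds the remaining balance
    requests.put(None)
    t.join()                          # a real client would poll status instead
    return status["r1"], status["r2"], balance("alice")
```

Note that the sketch sidesteps the hard parts: it says nothing about what happens if the worker dies between reading a request and appending to the ledger, which is exactly the case a database transaction would cover.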

You can use Heroku.com to create a small mock-up quickly if you are comfortable with Node.js or Ruby/Rack.

The general idea seems pretty easy and much better than using transactions baked into the database that make it super-hard to scale.

Disclaimer: I haven't implemented this in any way yet. I read about these things for curiosity even though I have no practical need for them. Yes, @gbn is right that a RDBMS with transactions would probably be sufficient for the needs of Timmy and me. Nevertheless, it would be fun to see how far you can take NoSQL databases with open-source tools and a how-to website called, "A Tornado of Razorblades".

Barthold answered 15/8, 2010 at 18:2 Comment(4)
Seems to be a strange criticism of the "hello world" example for transactions. What happens if something fails during the creation of one of the "ledger events"? Then the balance for that account would be wrong. This does not sound like a workable replacement for transactions to me.Tousle
The linked webpage shows a stunning degree of ignorance about the necessity for ACID in virtually all financial systems. Firstly, the article argues for 'performance' whilst it ignores the performance cost of having to read EVERY SINGLE TRANSACTION from the history in order to process a new transaction. Secondly, and more importantly, how does this solution work in a case when there are CONCURRENT requests happening on the same account, and when a business transaction consists of updates to several entities? What happens if the server dies in the middle of the processing?Quarters
This is all about two-phase commits. Google around and you'll see that you can get consistency without transactions.Orizaba
Andrew, what happens if your card transaction fails half way thru. Have you ever seen a bank statement with a reverse transaction?Berchtesgaden
S
17

NoSQL covers a diverse set of tools and services, including key-value, document, graph and wide-column stores. They usually try to improve the scalability of the data store, typically by distributing data processing. Transactions require the DB to provide ACID properties for user operations. ACID restricts how scalability can be improved: most NoSQL tools relax the consistency criteria of operations to gain fault tolerance and availability for scaling, which makes implementing ACID transactions very hard.

A commonly cited theoretical result about distributed data stores is the CAP theorem: consistency, availability and partition tolerance cannot all be achieved at the same time. SQL, NoSQL and NewSQL tools can be classified according to what they give up; a good figure might be found here.

A newer, weaker set of requirements replacing ACID is BASE ("basically available, soft state, eventual consistency"). However, eventually consistent tools ("eventually all accesses to an item will return the last updated value") are hardly acceptable in transactional applications like banking. Here a good idea would be to use in-memory, column-oriented and distributed SQL/ACID databases, for example VoltDB; I suggest looking at these "NewSQL" solutions.

Swim answered 23/8, 2012 at 16:39 Comment(2)
"most of these tools give up consistency and therefore ACID" It seems you confuse consistency as in ACID with consistency as in CAP. C in CAP means all replicas of the data are equal, while C in ACID is a vague and ambiguous term... generally speaking, availability does not contradict ACID. The example of Google Spanner proves it.V1
ACID consistency requires that transactions, as a series of client operations can only originate from and end in valid database states. It is only similar to C in CAP, so that's right, these are not the same and do not contradict. It is only very hard to implement ACID transactions in an AP system, which is usually assumed for scalability. I rephrase my answer. Looking back now, I find that CAP theorem and CAP categories were too vague, not providing real help in categorizing these tools. I think CAP only remains an interesting theoretical example of distributed system design compromises.Swim
P
14

Just wanted to comment on the money transaction advice in this thread. Transactions are something you really want to use with money transfers.

The example given of how to queue the transfers is very nice and tidy.

But in real life, transferring money may include fees or payments to other accounts. People get bonuses for using certain cards that come from another account, or they may have fees taken from their account and paid to another account in the same system. The fees or payments can vary by financial transaction, and you may need to maintain a bookkeeping system that shows the credit and debit of each transaction as it comes.

This means you want to update more than one row at the same time, since a credit on one account can be a debit on one or more other accounts. First you lock the rows so nothing can change before the update, then you make sure the data written is consistent with the transaction.

That's why you really want to use transactions. If anything goes wrong writing to one row, you can roll back the whole bunch of updates without the financial transaction data ending up inconsistent.
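As a minimal sketch of the rollback behaviour described here, the following uses Python's standard `sqlite3` module. The `accounts` table, the account names and the `transfer` helper are assumptions made up for illustration. SQLite locks the whole database rather than individual rows, so the row locking mentioned above is not shown; the point is only that all three updates are applied together or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [("sender", 100), ("receiver", 0), ("fees", 0)],
)
conn.commit()


def transfer(conn, amount, fee):
    """Credit the receiver and the fee account and debit the sender atomically."""
    try:
        with conn:  # commits on success, rolls back if any statement raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'receiver'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'fees'",
                (fee,),
            )
            # The CHECK constraint fails here if the sender cannot cover it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'sender'",
                (amount + fee,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

In this sketch a second `transfer(conn, 50, 5)` fails on the final debit, and the credits already made to the receiver and fee accounts within that transaction are rolled back with it.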

Paronym answered 7/11, 2011 at 22:58 Comment(1)
There are other, arguably better ways to handle the side effects of the transaction. The transaction is the original event and as long as it is recorded atomically, any other error or problem can be traced back to that event.Impale
C
5

The problem with one transaction and two operations (for example, one pays $5,000 and the second receives $5,000) is that you have two accounts with the same priority. You cannot use one account to confirm the second (or the reverse). In this case you can guarantee that only one account will be correct (the one that is confirmed); the second (the one that confirms) may fail. Let's look at why it can fail (using the message approach, where the sender is confirmed by the receiver):

  1. Write +$5,000 to the receiver account
  2. If it succeeds, write -$5,000 to the sender account
  3. If it fails, try again, cancel, or show a message

This guarantees that #1 is saved. But what guarantees anything if #2 fails? The same applies to the reverse order.

But it is possible to implement this safely without transactions and with NoSQL. You can always use a third entity that is confirmed from both the sender and the receiver side and guarantees that your operation was performed:

  1. Generate a unique transaction id and create a transaction entity
  2. Write +$5,000 to the receiver account (with a reference to the transaction id)
  3. If it succeeds, set the state of the transaction to send
  4. Write -$5,000 to the sender account (with a reference to the transaction id)
  5. If it succeeds, set the state of the transaction to receive

This transaction record guarantees that the send/receive messages went through. Now you can check every message by transaction id, and if it has the state received or completed, you count it towards the user's balance.
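A rough in-memory sketch of the five steps might look like the following. Everything here is an assumption for illustration: the `transactions` and `entries` structures stand in for two NoSQL collections, the state names follow steps 3 and 5, and the `fail_before_debit` flag simulates a crash between steps 3 and 4.

```python
import uuid

transactions = {}   # transaction id -> {"state": ...}
entries = []        # account entries: (account, amount, transaction id)


def transfer(sender, receiver, amount, fail_before_debit=False):
    """Follow the five steps; optionally simulate a crash before step 4."""
    tx_id = str(uuid.uuid4())                       # step 1
    transactions[tx_id] = {"state": "created"}
    entries.append((receiver, amount, tx_id))       # step 2
    transactions[tx_id]["state"] = "send"           # step 3
    if fail_before_debit:
        return tx_id                                # step 4 never happens
    entries.append((sender, -amount, tx_id))        # step 4
    transactions[tx_id]["state"] = "receive"        # step 5
    return tx_id


def balance(account):
    """Count only entries whose transaction reached its final state."""
    return sum(
        amount
        for acct, amount, tx_id in entries
        if acct == account and transactions[tx_id]["state"] == "receive"
    )
```

Because `balance` ignores entries whose transaction never reached the final state, a half-finished transfer does not credit the receiver. A real system would still need a background job to retry or cancel transactions stuck in an intermediate state.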

Camillacamille answered 23/4, 2012 at 7:50 Comment(3)
What if steps 3 and 5 were to fail? This adds a lot of complexity that is the reason db transactions are so useful.Tuckerbag
Typically such a system never relies on just sql ability to validate a transaction. And also in real scenario credit and debit are mostly happening across time and bank - which is beyond sql or nosql - capabilities...such thing can only be taken care by a well designed architecture - which works smoothly for transactions within a system or across the systems.Semiautomatic
I think this approach is good. However, we also must think of having distributed execution of the transaction parts (one part running in, say, micro-service 1 and another part in, say, micro-service 2 that is running on a server in a different domain in the cloud). Without some kind of a background job that handles these transactions by appropriately setting the statuses of associated records residing in multiple servers, the distributed transactions in NoSQL are hard to do (but inevitable).Tuckie
M
2

Depends on your DB, but ... I would say in general, you can use 'Optimistic transactions' to achieve this but I imagine one should make sure to understand the database implementation's atomicity guarantees (e.g. what kind of write and read operations are atomic).

There seem to be some discussions on the net about HBase transactions, if that's any help.

Merri answered 6/2, 2010 at 18:26 Comment(0)
S
1

You can always use a NoSQL approach in a SQL DB. NoSQL seems to generally use "key/value data stores": you can always implement this in your preferred RDBMS and hence keep the good stuff like transactions, ACID properties, support from your friendly DBA, etc, while realising the NoSQL performance and flexibility benefits, e.g. via a table such as

CREATE TABLE MY_KEY_VALUE_DATA
(
    id_content INTEGER PRIMARY KEY,
    b_content  BLOB
);

Bonus is you can add extra fields here to link your content into other, properly relational tables, while still keeping your bulky content in the main BLOB (or TEXT if apt) field.

Personally I favour a TEXT representation so you're not tied to one language for working with the data, e.g. using serialized Java means you can't easily access the content from Perl for reporting, say. TEXT is also easier to debug and generally work with as a developer.
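As one possible illustration of this idea, here is a small sketch using Python's standard `sqlite3` and `json` modules. The column name `t_content` and the `put`/`get` helpers are assumptions, and JSON is just one language-neutral choice of TEXT encoding.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_key_value_data ("
    " id_content INTEGER PRIMARY KEY,"
    " t_content  TEXT NOT NULL)"
)


def put(conn, key, document):
    """Store a document as JSON text, replacing any previous value."""
    with conn:  # each write is its own committed transaction
        conn.execute(
            "INSERT OR REPLACE INTO my_key_value_data VALUES (?, ?)",
            (key, json.dumps(document)),
        )


def get(conn, key):
    """Return the decoded document, or None if the key is absent."""
    row = conn.execute(
        "SELECT t_content FROM my_key_value_data WHERE id_content = ?", (key,)
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

Several `put` calls could equally be grouped inside a single `with conn:` block, which is where keeping the RDBMS's transactions pays off.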

Shier answered 23/2, 2010 at 9:32 Comment(0)
I
1

Have a look at Scalaris; it's a NoSQL DB with strong consistency and implemented transactions.

Interplay answered 5/6, 2012 at 19:44 Comment(0)
S
1

That's why I'm creating a NoSQL Document store solution to be able to use "real" transactions on Enterprise applications with the power of unstructured data approach. Take a look at http://djondb.com and feel free to add any feature you think could be useful.

Stook answered 6/7, 2012 at 20:31 Comment(0)
E
1

surely there are others

Evetteevey answered 16/1, 2013 at 8:51 Comment(0)
S
0

You can implement optimistic transactions on top of a NoSQL solution if it supports compare-and-set. I wrote an example, with some explanation, on a GitHub page showing how to do it in MongoDB, but you can repeat it in any suitable NoSQL solution.
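This is not the GitHub example mentioned above, just a generic sketch of the compare-and-set pattern. The `VersionedStore` class is a hypothetical in-memory stand-in for a NoSQL store that exposes an atomic compare-and-set; a real database would provide that primitive itself.

```python
import threading


class VersionedStore:
    """Hypothetical in-memory store offering an atomic compare-and-set."""

    def __init__(self):
        self._data = {}               # key -> (version, value)
        self._lock = threading.Lock()

    def get(self, key):
        """Return (version, value); missing keys read as (0, None)."""
        with self._lock:
            return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, value):
        """Write only if nobody else has written since expected_version."""
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False
            self._data[key] = (version + 1, value)
            return True


def add_to_balance(store, key, delta, max_retries=10):
    """Optimistic read-modify-write: retry when the version has moved on."""
    for _ in range(max_retries):
        version, value = store.get(key)
        if store.compare_and_set(key, version, (value or 0) + delta):
            return True
    return False
```

The update never holds a lock across the read and the write; a conflicting writer simply causes the compare-and-set to fail and the loop to re-read and try again.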

Saying answered 6/10, 2012 at 7:51 Comment(0)
