How should I make a Clojure STM program persistent?

4

17

I am writing a Clojure program which uses the STM. At the moment I am populating the STM (using refs) at startup from a database, and then asynchronously updating the database whenever a dosync transaction succeeds. I have no idea whether I am doing this the right way, though, or whether there is a better standard technique. Could anyone explain to me how they make the ACI properties of STM into ACID in their Clojure programs?

Sanborn answered 23/12, 2010 at 22:6 Comment(0)
13

In general, adding the 'D' in ACID to any program is not trivial and depends on the program's requirements. One important question needs to be answered before choosing an implementation:

Is there multi-threaded/multi-process access to the database?

From the question body, your program appears to read only at startup and to write after each change in the STM, so the database lags the values in the STM by a small amount of time. However, if the database is accessed by other programs, including other instances of your program, then you will need locks: lock access to the database right before the transaction, and unlock after the write to the database. (As a side note, the database in your case can be anything, including a simple file in the filesystem.) There is no way around this when you have multiple readers and writers, because reads and writes are both side effects that involve the database.
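
The lock-around-transaction-and-write idea can be sketched as follows. This is only a minimal illustration under stated assumptions: every name in it (db-lock, fake-db, write-to-db!, update-and-persist!) is hypothetical, an in-memory vector stands in for the database, and a JVM monitor only coordinates threads inside one process. Separate processes or program instances would need a lock that the database or filesystem itself provides.

```clojure
;; Hypothetical names throughout; fake-db stands in for a real database.
(def account (ref 0))     ; in-memory STM state
(def fake-db (atom []))   ; log of every value "written to the database"
(def db-lock (Object.))   ; monitor guarding transaction + write as one unit

(defn write-to-db!
  "Placeholder for a real database write; here it appends to fake-db."
  [value]
  (swap! fake-db conj value))

(defn update-and-persist!
  "Applies f to the ref in a transaction and writes the result, all while
  holding db-lock, so no other thread in this JVM can interleave its own
  transaction and write."
  [r f]
  (locking db-lock
    (let [new-value (dosync (alter r f))]
      (write-to-db! new-value)
      new-value)))

;; Eight threads each increment the ref 50 times.
(let [workers (doall (for [_ (range 8)]
                       (future (dotimes [_ 50]
                                 (update-and-persist! account inc)))))]
  (doseq [w workers] @w))
```

Holding the lock across both steps trades concurrency for a simple guarantee: the stand-in database sees the values in the same order the ref took them.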

If there is no concurrent access, then asynchronous writing is fine, because the writes are guaranteed to happen in order: your program is effectively single-threaded when it comes to database access.

If you only have multiple write threads and no reads after startup, with only a single instance, then you only need to ensure correct write order. You can do this with agents, where the agent is basically a queue of write operations to the database. You wrap the dosync around both the ref updates and the send to the agent. Sends made inside a transaction are held until the transaction commits, so each committed change is queued for writing exactly once, which gives you durability on top of the in-memory state.
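
A small runnable sketch of the agent-as-write-queue idea, under stated assumptions: fake-db, write-to-db!, db-write, counter and increment! are hypothetical names, and an in-memory vector stands in for the database. It only checks that every committed value is written exactly once and that the agent runs the writes one at a time; it does not check that the queue order matches the commit order.

```clojure
;; Hypothetical names throughout; fake-db stands in for a real database.
(def fake-db (atom []))     ; log of every value "written to the database"
(def db-agent (agent nil))  ; state is unused; the agent only queues the writes
(def counter (ref 0))       ; in-memory STM state

(defn write-to-db!
  "Placeholder for a real database write; here it appends to fake-db."
  [value]
  (swap! fake-db conj value))

(defn db-write
  "Agent action: performs the write and returns the agent state unchanged."
  [state value]
  (write-to-db! value)
  state)

(defn increment! []
  (dosync
    (let [new-value (alter counter inc)]
      ;; A send made inside dosync is held until the transaction commits,
      ;; so a retried transaction does not queue duplicate writes.
      (send-off db-agent db-write new-value))))

;; Eight threads each run 50 transactions.
(let [workers (doall (for [_ (range 8)]
                       (future (dotimes [_ 50] (increment!)))))]
  (doseq [w workers] @w))

(await db-agent)  ; block until every queued write has run
```

After `await` returns, `fake-db` should hold one entry per committed transaction, even though the transactions themselves may have been retried under contention.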

In general the more complicated the requirements that involve side effects, the more tricks you will have to do to ensure ACID. If you have additional requirements, then the implementations I gave might have to change.

EDIT:

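(def db-agent (agent nil)) ;; the agent's state is unused; it only serializes writes

(defn db-write
  "Agent action: writes data to the db and returns the agent state unchanged."
  [state data] ;; make this intelligent to handle when db is not up
  (try
    (write-to-db data) ;; write-to-db is your own persistence function
    (catch Exception e
      ;; database write failed: do a retry or let the user know of the problem
      (println "Database write failed:" (.getMessage e))))
  state)

;; in the transaction code
(dosync
  (alter my-ref ...)
  ;; sends made inside dosync are held until the transaction commits
  (send-off db-agent db-write @my-ref)) ;; ensure db gets written to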
Cayuse answered 29/12, 2010 at 20:14 Comment(6)
Ah, finally someone tries to answer the actual question, thanks :) There is no multiprocess access to the database (H2 or Oracle). The only process that accesses the database is the Clojure program itself. - Sanborn
Does your program have multi-threaded reads or multi-threaded writes? If multi-threaded reads, do they ever happen after startup? - Cayuse
After startup, are the multi-threaded reads from the persistent store, or from the in-memory reference? - Cayuse
OK, then my suggested use of agents should suffice. I'll edit the post to add some code. - Cayuse
Thanks, a useful way of saving with agents. It makes sure writes are serial, I guess? - Sanborn
Yup, to preserve write order, aka serialize. - Cayuse
3

You might be interested in:

  1. Alyssa Kwan's modified Clojure core that adds persistence to refs, see: ANN: Durable refs with ACID guarantees - Phase I, ANN: Durable Clojure - Phase II - Persistent Data Structures, ANN: Durable Clojure - Functions and Closures

  2. Sergey Didenko's library, that does not give a strong durability guarantee but is quite close to that: Simple-Persistence-for-Clojure

Other approaches may be less transparent to the programmer.

Martian answered 30/12, 2010 at 23:45 Comment(1)
FleetDB link goes to a blog about wine. - Kwei
2

The STM model is very well suited to coordinating concurrent access to in-memory state as it changes. It is less directly suited to data persistence, where the changes need to be accessible beyond the life of the threads that access them.

It's generally good to think about the 'D' in ACID separately from the STM.

Fetial answered 24/12, 2010 at 0:6 Comment(0)
1

If you want a database that has fast in-memory access, and persists behind the scenes from time to time, then use a real data store, rather than trying to build your own, which would be quite a big job.

Redis and MongoDB are two good options, but there are many others. You can find Clojure libraries at https://github.com/ragnard/redis-clojure and https://github.com/somnium/congomongo for Redis and Mongo, respectively.

Chokebore answered 29/12, 2010 at 20:58 Comment(0)
