Lock-free programming in Haskell

Asked 4/8, 2011 at 16:44 Answered 4/8, 2011 at 17:30

Does anyone know if it is possible to do lock-free programming in Haskell? I'm interested both in the question of whether the appropriate low-level primitives are available, and (if they are) on any information on what works in terms of using these to build working larger-scale systems in the pure functional context. (I've never done lock-free programming in a pure functional context before.) For instance, as I understand it the Control.Concurrent.Chan channels are built on top of MVars, which (as I understand things) use locks---could one in principle build versions of the Chan primitive which are lock free internally? How much performance gain might one hope to get?

I shoudl also say that I'm familiar with the existence of TVars, but don't understand their internal implementation---I've been given to understand that they are mostly lock free, but I'm not sure if they're entirely lock free. So any information on the internal implementation of TVars would also be helpful!

(This thread provides some discussion, but I wonder if there's anything more up to date/more comprehensive.)

Guyot answered 4/8, 2011 at 16:44 Comment(0)

Not only does an MVar use locks, it is a lock abstraction. And, as I recall, individual STM primitives are optimistic, but there are locks used in various places in the STM implementation. Just remember the handy rhyme: "If it can block, then beware of locks".

For real lock-free programming you want to use IORefs directly, along with atomicModifyIORef.

Edit: regarding black holes, as I recall the implementation is lock free, but I can't vouch for the details. The mechanism is described in "Runtime Support for Multicore Haskell": http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/multicore-ghc.pdf

But that implementation underwent some tweaks, I think, as described in Simon Marlow's 2010 Haskell Implementors Workshop talk "Scheduling Lazy Evaluation on Multicore": http://haskell.org/haskellwiki/HaskellImplementorsWorkshop/2010. The slides are unfortunately offline, but the video should still work.

Outtalk answered 4/8, 2011 at 17:3 Comment(9)

My impression was that, if anything, STM tends to fail at the other extreme--with high contention and no locking, the boundless optimism results in transactions cheerfully invalidating each other left and right until nobody can get anything done. – Soutor 4/8, 2011 at 17:15

Right! I agree, MVars are just simply locks; the question was worded non-optimally. I agree that I want to use IORefs and atomic modification. But atomicModifyIORef isn't the only lock-free primitive you might want---critically, one might want a compare-and-set! primitive which modifies an IORef only if its current value is still what you expected it to be... – Guyot 4/8, 2011 at 17:41

Also: is the black hole mechanism implemented using locks? (In GHC.) – Guyot 4/8, 2011 at 17:42

Yeah, no compare-and-swap as far as I know. But thunk delay semantics should be enough to give you anything you need wiht atomicModifyIORef. The downside, of course, is that you're paying, in exchange, the typical constant-factor overhead of a lazy garbage collected language. – Outtalk 4/8, 2011 at 18:17

@C. A. McCann I don't know about STM, but I recall Herlithy designed some algorithms which guarantee progress in his "Art of Multicore Programming" book; I'd be surprised if those didn't influence STM. – Agronomy 5/8, 2011 at 1:54

There is no need for compare and swap with atomicModifyIORef, all you need to do is in the function you use to do the modification, you compare is to what you want it to be, and if it is, swap it, if it isn't, leave the original value there. You can do whatever you want inside your function, doesn't matter how complex it is, because all that atomicModifyIORef does is place a thunk inside itself to the computation, which can be evaluated in the future by threads that need it. See my reply to the question and the referenced slides for details. – Hartzke 5/8, 2011 at 2:51

@AxMan6, @sclv. I see how this works, but I do have one worry: is the 'black hole' mechanism implemented using locks? If it is, then your solution will (it seems to me) cause you to end up using the implicit locks inside the black holes. So for this to be a 'lock free' solution we need to know the black holes are lock free. (Contrariwise, if the B.H.s aren't lock free, to write lock free code you'll need to evaluate to WHNF---or probably even NF---before the atomicModifyIORef, and suggests you still need a compare/set primitive.) – Guyot 5/8, 2011 at 12:54

@Guyot see my references above re bhs. – Outtalk 5/8, 2011 at 16:48

@C.A.McCann The STM implementation is guaranteed to make progress. One specific transaction can get stuck, not all transactions simultaneously. Somebody has to be making progress. – Hickey 6/7, 2012 at 13:31

Lock free programming is trivial in Haskell. The easiest way to have a shared piece of data that needs to be modified by many threads is to start with any normal haskell type (list, Map, Maybe, whatever you need), and place it in an IORef. Once you've done this, you have the ability to use atomicModifyIORef to perform modifications in place, which are guaranteed to take next to no time.

type MyDataStructure = [Int]
type ConcMyData = IORef MyDataStructure

main = do
    sharedData <- newIORef []
    ...
    atomicModifyIORef sharedData (\xs -> (1:xs,()))

The reason this works is that a pointer to the think that will eventually evaluate the result inside the IORef is stored, and whenever a thread reads from the IORef, they get the thunk, and evaluate as much of the structure as it needs. Since all threads could read this same thunk, it will only be evaluated once (and if it's evaluated more than once, it's guaranteed to always end up with the same result, so concurrent evaluations are ok). I believe this is correct, I'm happy to be corrected though.

The take home message from this is that this sort of abstraction is only easily implemented in a pure language, where the value of things never change (except of course when they do, with types like IORef, MVars, and the STM types). The copy on write nature of Haskell's data structures means that modified structures can share a lot of data with the original structure, while only allocating anything that's new to the structure.

I don't think i've done a very good explaining how this works, but I'll come back tomorrow and clarify my answer.

For more information, see the slides for the talk Multicore programming in Haskell by Simon Marlow of Microsoft Research (and one of the main GHC implementors).

Hartzke answered 4/8, 2011 at 17:30 Comment(0)

Look into stm, specifically its TChan type.

Richelieu answered 4/8, 2011 at 16:49 Comment(5)

Thanks---I knew of STM (see edit above) but didn't know so much about how it's implemented internally. Is it lock free? Mostly lock free? Also, are any lower-level lock-free primitives exposed? – Guyot 4/8, 2011 at 16:52

@Guyot I'm pretty sure STM is lock-free, but if its not, its the best you are going to get. – Richelieu 4/8, 2011 at 16:55

OK, fair enough! Although I'm still interested if anyone has more detailed information (or can point to such...) – Guyot 4/8, 2011 at 16:58

STM is optimistic, but not completely lock free as I recall. – Outtalk 4/8, 2011 at 16:58

I'd also agree that STM isn't lock free. It lets you think in a lock free manor, but it's more of an abstraction over something that almost certainly has to use locking somewhere. – Hartzke 5/8, 2011 at 2:53

Recommended topics

Hot tags