Non-blocking IO with Haskell [duplicate]

2

5

Possible Duplicate:
What is the Haskell response to Node.js?
How can I watch multiple files/socket to become readable/writable in Haskell?

Is it possible to write a Haskell program that performs IO in a non-blocking way like in nodejs?

For example, I would like to get 10 records from a database that is far away, so I would like to fire 10 requests concurrently and return the collection once all the results are available. The IO monad is not going to help, because the monad explicitly serializes the computations with bind. I think continuation passing style, where you pass around the computation you want to run next, has the same problem: again it serializes the computation. I do not want to work with threads; I am looking for another solution. Is this possible?

Obligation answered 10/12, 2012 at 23:34 Comment(12)
When you say you don't want to work with threads, would it be acceptable to use a library implemented with threads so long as you don't have to manage them yourself?Extension
You should say what it is about threads that you don't like, rather than just that you don't want to use them.Carcass
Why the artificial exclusion of threads? That would be the natural solution in Haskell.Athey
Incidentally, there is at least one way to accomplish this using just IO without any extra threads or libraries, just an unsafe function. I can't really recommend it though.Extension
I think you mean you want event driven IO. Is it for web server programming?Franzen
(I should point out that Haskell's take on threads is extraordinarily lightweight compared with OS threads, and that's one of the reasons the existing web server frameworks scale up very well.)Franzen
Well, actually I would like to develop a future library for Haskell where unfinished computations would be captured with a future object. You will have essentially one future object for each function call, so using threads would be heavyweight (you might have hundreds of thousands of outstanding requests). If Haskell can express all kinds of new control structures, then how can it express this one?Obligation
Haskell's threads are lightweight enough to have many many of them. How exactly do you think this can work without some form of "run multiple things at once" capability anyway? If you've sent off the request, either you wait for it to respond, or you arrange for something else (such as a thread) to wait for it to respond so that you can later ask the "something else" if it has received a response. If you do neither of those things, then nobody will be listening when the response comes in.Hued
5 seconds of googling hints to me that node.js' non-blocking operations are implemented using threads. So if you want to implement something like that, you need threads to be involved.Hued
@Obligation Right, I thought that might be your complaint. Haskell threads scale to the hundreds of thousands range, so just use them! (This is also why I suggested the other StackOverflow question I did, which had the same concerns as you.)Carcass
You say, that Haskell threads scale to the hundreds of thousands, and I do not doubt it, but would like to know how it is actually implemented? Is there an event loop for each machine thread? Do they use work stealing? Does it use epoll internally?Obligation
@Hued No, node.js does it via an event loop system. You could argue there are threads involved under your own definition, but I am not sure it would count as one by any accepted definition of threads. There are not really two execution paths, basically. There's a listener which listens for interrupts from IO.Signification
21

Haskell threads are exceedingly lightweight. What is more, GHC's runtime schedules IO in an event-driven way much of the time, meaning ordinary Haskell code behaves like continuation-passing-style node.js code (only compiled to native code and able to run on multiple CPUs...)

Your example is trivial:

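import Control.Concurrent.Async (mapConcurrently)

-- Placeholder result type; substitute whatever your requests return
data Foo = Foo

-- given a list of requests (left empty here; fill in your own IO actions)
requests :: [IO Foo]
requests = []

-- you can run them concurrently; results come back in list order
getRequests :: IO [Foo]
getRequests = mapConcurrently id requests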

Control.Concurrent.Async is probably exactly what you are looking for with respect to a library for futures. Haskell should never choke on mere thousands of (ordinary) threads. I haven't ever written code that uses millions of IO threads, but I would guess your only problems would be memory related.
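
To make the "10 records from a far-away database" case concrete, here is a minimal sketch that only simulates the latency. `fetchRecord` and `fetchAll` are hypothetical names invented for illustration, and `threadDelay` stands in for the real network round trip; only `mapConcurrently` comes from the async package.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (mapConcurrently)

-- Hypothetical stand-in for a remote lookup: waits 0.5 s, then returns a record.
fetchRecord :: Int -> IO String
fetchRecord n = do
    threadDelay 500000
    return ("record " ++ show n)

-- Fire all lookups at once; results come back in the order of the input list.
fetchAll :: [Int] -> IO [String]
fetchAll = mapConcurrently fetchRecord

main :: IO ()
main = do
    records <- fetchAll [1 .. 10]
    mapM_ putStrLn records
```

Because the ten delays overlap, the whole run should take roughly as long as one lookup rather than ten in sequence.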

Ankeny answered 11/12, 2012 at 1:9 Comment(3)
I accept the solution, but this does not really answer the question I had. Is it possible to devise a control structure that allows a "single thread" to perform non-blocking concurrent programming (with some underlying thread pool)? By a "single thread" I mean to make sure that only one thread is executing concurrently (all others are blocked waiting for IO), so one can use regular IORefs with no synchronization/blocking.Obligation
@Obligation Hm, one way to do this would be to build a simple monad with both references (implemented with IORefs) and FFI actions/system calls, separated into two universes by the type system (not so hard to do). Then, any time you want to perform an FFI call in your monad with references, the code you write is the equivalent of async foo >>= unsafeInterleaveIO . wait. This would give you a straight-ahead style of programming and perform all IO asynchronously, but it has all the downsides of lazy IO.Ankeny
Hmm, I have to read up on this. I have no problem with the "downsides" of lazy IO, since most of what I want to do is "functional": read immutable data from a database (e.g. git)Obligation
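
The `async foo >>= unsafeInterleaveIO . wait` idea from the comment above can be sketched roughly as follows. `lazyFuture` and `slowSquare` are hypothetical names used only for illustration, and the usual lazy IO caveats apply: the wait (and any exception from the action) happens whenever the value is first forced.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (async, wait)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Start the action immediately, but defer the blocking wait
-- until the returned value is actually demanded.
lazyFuture :: IO a -> IO a
lazyFuture act = async act >>= unsafeInterleaveIO . wait

-- Hypothetical slow computation standing in for a remote read.
slowSquare :: Int -> IO Int
slowSquare n = threadDelay 100000 >> return (n * n)

main :: IO ()
main = do
    -- All ten actions are started here; none of them is waited on yet.
    xs <- mapM (lazyFuture . slowSquare) [1 .. 10]
    -- Forcing the sum is what blocks until the results are in.
    print (sum xs)
```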
8

To flesh out the comments on Control.Concurrent.Async, here is an example using the async package.

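import Network.HTTP.Conduit (simpleHttp)
import Control.Concurrent.Async (async, wait)
import qualified Data.ByteString.Lazy as L

main :: IO ()
main = do
    -- start all three requests; simpleHttp needs a full URL including the scheme
    xs <- mapM (async . simpleHttp) [ "http://www.stackoverflow.com"
                                    , "http://www.lwn.net"
                                    , "http://www.reddit.com/r/linux_gaming"]
    -- block until every response has arrived
    [so,lwn,lg] <- mapM wait xs
    -- parse these however you'd like; here we just print the body sizes
    mapM_ (print . L.length) [so,lwn,lg]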

So in the above we define three HTTP GET requests for three different websites, launch those requests asynchronously, and wait for all three to finish before proceeding.
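
For intuition about what `async` and `wait` amount to, here is a rough sketch of a future built from base alone: a lightweight thread that fills an MVar. The names `Future`, `spawn` and `await` are made up for this illustration and are not the async package's API. The sketch also ignores exceptions, which the real package handles; if the action throws, `await` here would never get a result.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)

-- A future is just a box that a lightweight thread fills in later.
newtype Future a = Future (MVar a)

-- Start the action on a new Haskell thread and return immediately.
spawn :: IO a -> IO (Future a)
spawn act = do
    box <- newEmptyMVar
    _ <- forkIO (act >>= putMVar box)
    return (Future box)

-- Block the calling thread until the result is available.
await :: Future a -> IO a
await (Future box) = readMVar box

main :: IO ()
main = do
    futures <- mapM (\n -> spawn (threadDelay 100000 >> return (n * 2))) [1 .. 5 :: Int]
    results <- mapM await futures
    print results
```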

Doscher answered 11/12, 2012 at 4:24 Comment(1)
Threaded code is not the same thing as single-threaded, event-driven, non-blocking code. (I know that nodejs uses a thread pool internally; that is not the point.) You have to use MVars to communicate between the threads, which involves synchronization and, on a larger scale, transactions. With events, however, I know that nothing else is modifying the program state as long as I do not call anything that needs a callback. Your solution uses "wait", which will block, so any caller of such an async method needs to be put in a separate thread to be able to continue: one thread per method call, no?Obligation
