Getting to know the basics of Asynchronous programming on *nix

For some time now I have been googling a lot to learn about the various ways to achieve asynchronous programming/behavior on *nix machines and (as I had known earlier) got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows (IOCP).

Below are the few alternatives available on Linux:

  1. select/poll/epoll :: Cannot be done using a single thread, as epoll is still a blocking call. Also, the monitored file descriptors must be opened in non-blocking mode.
  2. libaio :: What I have come to know is that its implementation sucks and it's still notification based instead of being completion based, as with Windows I/O completion ports.
  3. Boost ASIO :: It uses epoll under Linux and is thus not a true async pattern, as it spawns threads which are completely abstracted from user code to achieve the proactor design pattern.
  4. libevent :: Any reason to go for it if I prefer ASIO?

Now here come the questions :)

  1. What would be the best design pattern for writing a fast, scalable network server using epoll? (Of course, I will have to use threads here :( )
  2. I had read somewhere that "only sockets can be opened in non-blocking mode", hence epoll supports only sockets and cannot be used for disk I/O. How true is that statement, and why can't async programming be done on disk I/O using epoll?
  3. Boost ASIO uses one big lock around the epoll call. I didn't actually understand what its implications can be and how to overcome it using ASIO itself. Similar question
  4. How can I modify the ASIO pattern to work with disk files? Is there any recommended design pattern?

I hope somebody will be able to answer all the questions with nice explanations as well. Any link to a source where the implementation details of epoll and AIO design patterns are explained is also appreciated.

Fullerton answered 8/1, 2012 at 8:57 Comment(8)
All of select, poll and epoll have a timeout parameter that can be zero, which makes the functions return immediately. - Apocynaceous
Also, having notifications, as in the case of the aio_* functions, is asynchronous. You ask to be notified when an event occurs, and then go about your business doing other things while the kernel handles your I/O. - Apocynaceous
epoll_wait is not necessarily a blocking call; it depends on the timeout parameter passed to it ("man epoll_wait" for details). - Etem
Any reason to avoid threads? Do you know about performance issues? - Misogamy
@Joachim: Yes, I am aware of that fact. Does that mean I can achieve a true async design pattern in a single thread using epoll with a zero timeout? - Fullerton
@stefaanv: Avoiding threads to avoid race issues, maybe. - Fullerton
@Joachim: But that is still not as good as a completion handler, which takes care of the reading and writing for me, and which I expect to be more efficient than a notification-based approach. - Fullerton
The majority of server software uses select in a single thread. Most probably with non-zero timeouts, but using these functions in a polling way with a zero timeout and in a single thread is far from uncommon. However, it's not really asynchronous, but polling. - Apocynaceous
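
The zero-timeout behavior discussed in these comments can be sketched in a few lines. This is only an illustration, assuming Linux and Python's `select.epoll` wrapper (its `poll(timeout)` method corresponds to `epoll_wait`); the socket pair and the variable names are made up for the example.

```python
import select
import socket

# A connected socket pair stands in for a real client/server connection.
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)
b_fd = b.fileno()

ep = select.epoll()
ep.register(b_fd, select.EPOLLIN)

# Nothing has been written yet: with a timeout of 0 the call returns
# immediately with an empty list instead of blocking.
idle_events = ep.poll(0)

a.send(b"ping")

# Now b is readable. A short positive timeout is used here only so the
# demo does not depend on exactly when readiness becomes visible.
ready_events = ep.poll(1.0)
data = b.recv(4096) if ready_events else b""

ep.close()
a.close()
b.close()

print("events before send:", idle_events)
print("events after send:", ready_events, "data:", data)
```

A single thread can therefore interleave its own work with such calls, although, as the last comment notes, calling it repeatedly with a zero timeout is polling rather than being notified.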

Boost ASIO :: It uses epoll under Linux and is thus not a true async pattern, as it spawns threads which are completely abstracted from user code to achieve the proactor design pattern

This is not correct. The Asio library uses epoll() by default on most recent Linux kernel versions. However, it is the threads invoking io_service::run() that invoke callback handlers as needed. There is only one place in the Asio library where a thread is used to emulate an asynchronous interface, and it is well described in the documentation:

An additional thread per io_service is used to emulate asynchronous host resolution. This thread is created on the first call to either ip::tcp::resolver::async_resolve() or ip::udp::resolver::async_resolve().

This does not make the library "not a true async pattern" as you claim; in fact, its name would disagree with you by definition.

1) What would be the best design pattern for writing fast scalable network server using epoll (of course, will have to use threads here :( )

I suggest using Boost Asio; it uses the proactor design pattern.

3) Boost ASIO uses one big lock around the epoll call. I didn't actually understand what its implications can be and how to overcome it using ASIO itself

The epoll reactor uses a mutex to dispatch handlers, though in practice this is not a big concern for most applications. There are application specific ways to mitigate this behavior, such as an io_service per CPU to exploit data locality. See my answer to a similar question on this topic. It is also discussed on the Asio mailing list frequently.

4) How can I modify ASIO pattern to work with disk files? Is there any recommended design pattern?

The Asio library does not natively support file I/O, as you noted. There have been several attempts to add it to the library; I'd suggest discussing it on the mailing list.

Gilbart answered 8/1, 2012 at 13:6 Comment(1)
Sam, thanks for the super answer. Can you shed some light on question number 2 also? I did not get the asio lock around epoll part clearly, but I will try to delve more into it and get the answer. - Fullerton

First of all:

got confirmed on the fact that there is still no TRULY async pattern (concurrency using single thread) for Linux as available for Windows(IOCP).

You probably have a small misconception: asynchronous I/O can be built on top of a "polling" API.

More than that, a "reactor" (epoll-like) API is more powerful than a "proactor" API (IOCP), as the second can be implemented in terms of the first (but not the other way around).
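
To make that claim concrete, here is a minimal sketch of completion-style reads layered on a readiness-style API. It assumes Python's `selectors` module (whose `DefaultSelector` picks epoll on Linux); `TinyProactor`, `async_read` and the handler signature are hypothetical names invented for this example, not part of any library, and error handling is left out.

```python
import selectors
import socket


class TinyProactor:
    """Completion-style reads layered on a readiness-style selector."""

    def __init__(self):
        self._sel = selectors.DefaultSelector()  # epoll on Linux
        self._pending = 0

    def async_read(self, sock, max_bytes, on_complete):
        # Start an operation: remember how much to read and whom to tell.
        sock.setblocking(False)
        self._sel.register(sock, selectors.EVENT_READ, (max_bytes, on_complete))
        self._pending += 1

    def run(self):
        # Reactor part: wait until a socket is ready.
        # Proactor part: perform the read on the caller's behalf and
        # hand the completed result to the handler.
        while self._pending:
            for key, _events in self._sel.select():
                max_bytes, on_complete = key.data
                sock = key.fileobj
                self._sel.unregister(sock)
                self._pending -= 1
                on_complete(sock.recv(max_bytes))

    def close(self):
        self._sel.close()


# Usage: the handler only ever sees finished data, never "readiness".
client, server = socket.socketpair()
results = []
proactor = TinyProactor()
proactor.async_read(server, 1024, results.append)
client.send(b"hello")
proactor.run()
proactor.close()
client.close()
server.close()
print(results)
```

The handler receives the data rather than a readiness event, which is the proactor-style interface. The copy out of the kernel still happens in user space inside `run()`, which is the difference from IOCP raised in the comments on this answer.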

Also, some operations are "truly" asynchronous, for example disk I/O, and some other tools, such as signals combined with the Linux-specific signalfd, can provide full coverage of some other cases.
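
On the disk I/O point, it may help to see that epoll itself refuses regular files, so file I/O has to go through some other mechanism, such as the AIO interfaces mentioned in the question. A minimal sketch, assuming Linux and Python's `select.epoll` wrapper (its `register` method corresponds to `epoll_ctl` with `EPOLL_CTL_ADD`); the temporary file and the pipe are only illustrative:

```python
import errno
import os
import select
import tempfile

ep = select.epoll()

# Regular file: epoll_ctl is documented to fail with EPERM for file
# types that do not support epoll, such as regular files.
with tempfile.TemporaryFile() as f:
    try:
        ep.register(f.fileno(), select.EPOLLIN)
        regular_file_errno = None
    except OSError as exc:
        regular_file_errno = exc.errno

# Pipe: a descriptor type that does support readiness notification.
read_fd, write_fd = os.pipe()
ep.register(read_fd, select.EPOLLIN)
os.write(write_fd, b"x")
pipe_events = ep.poll(1.0)

ep.close()
os.close(read_fd)
os.close(write_fd)

print("regular file:", errno.errorcode.get(regular_file_errno))
print("pipe events:", pipe_events)
```

So the restriction is not limited to sockets (pipes and similar descriptors work too); it is regular files that fall outside the readiness model.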

Bottom line: epoll is truly asynchronous I/O.

Earwax answered 8/1, 2012 at 13:26 Comment(3)
Thanks for the answer, but can you check the last comment by Joachim on my question? His view is otherwise. - Fullerton
I think he means that the memory copy when reading from a socket is not async, hence epoll is not fully async. IOCP allows you to specify the buffer you want the data in when you fire off the async operation, which you can't do with epoll. - Feeder
A decade late, but epoll (like ppoll, poll, and select before it) is fundamentally a synchronous I/O demultiplexer, roughly fitting the reactor pattern. ASIO uses it effectively inside a proactor, which is an asynchronous pattern, but that doesn't make epoll asynchronous. - Outboard
