With a single file descriptor, is there any performance difference between select, poll and epoll and ...?

The title really says it all.

The "and ..." means pselect and ppoll are included as well.

The server project I'm working on is basically structured around multiple threads. Each thread handles one or more sessions. All the threads are identical. The protocol takes care of which thread will host the session.

I'm using an in-house socket class that wraps things up. The point of interest is a checkread call, which calls either poll (Linux) or select (Windows).

In summary, each thread currently calls poll on a single socket. From what I can tell, using epoll would only be of benefit if the thread were watching multiple sockets, as you'd get in, say, an HTTP server. That's not what I'm doing in my case, and the class only handles a single socket at a time.

There is some brief discussion about edge and level triggering in the man pages for epoll. I'm not really sure what it means. In the socket class I see an optimization in the Windows part of the code that shortcuts the select call with an ioctlsocket & FIONREAD to check whether there is any data. I'm wondering whether that would return > 0 even if a complete UDP packet hadn't arrived at the time of the call. Is this what edge triggering is in epoll?

In some rudimentary testing, I'm also seeing no noticeable difference between using select and poll.

I can see that using ppoll might be of benefit, though, due to the greater precision of its timeout. Any thoughts?

And yes, I am trying to optimize throughput for a session that is receiving lots of data. The server is more network and disk bound than CPU bound.

Festination answered 13/4, 2011 at 10:9

The main difference between epoll and select or poll is that epoll scales much better as the number of descriptors watched by a single thread grows. I don't know how this would compare to a multithreaded server using select or poll. See this benchmark graph: http://monkey.org/~provos/libevent/libevent-benchmark2.jpg

The reason for this (as far as I can tell) is that with select or poll you must loop through all the connected sockets after every call to determine which ones have data to be read. With epoll, the kernel keeps a separate list that references only the sockets that are ready, and epoll_wait returns just those. This saves a lot of loop cycles, and the difference becomes more noticeable as more sockets are connected.
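A minimal sketch of that difference, assuming Linux and Python's standard select module (the function name and the socketpair setup are made up for illustration). Python's wrappers hide the per-descriptor scan that C code does after select or poll, so this only shows the shape of the two APIs: select is handed the full descriptor list on every call, while epoll is told about the descriptors once and then hands back only the ready ones.

```python
import select
import socket

def ready_descriptors(num_pairs=50, active_index=7):
    """Make one of many idle sockets readable and ask select and epoll about it."""
    pairs = [socket.socketpair() for _ in range(num_pairs)]
    readers = [reader.fileno() for reader, _writer in pairs]
    ep = select.epoll()
    try:
        # epoll: each descriptor is registered once, up front.
        for fd in readers:
            ep.register(fd, select.EPOLLIN)

        # Only one peer sends data, so only one reader becomes readable.
        pairs[active_index][1].send(b"x")

        # select: the whole descriptor list is passed in on every call.
        via_select, _, _ = select.select(readers, [], [], 1.0)

        # epoll: returns (fd, event mask) pairs for ready descriptors only.
        via_epoll = [fd for fd, _mask in ep.poll(1.0)]
        return via_select, via_epoll, readers[active_index]
    finally:
        ep.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()

via_select, via_epoll, expected_fd = ready_descriptors()
print(via_select == [expected_fd], via_epoll == [expected_fd])
```

With a single descriptor there is nothing to skip, which is why this advantage disappears in the asker's case.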

Other things to look into if performance ever becomes a major issue are I/O completion ports (Windows only) and kqueue (FreeBSD and the other BSDs, plus macOS). It's also important to remember that epoll is Linux only. In most cases select or poll will work just fine.

In the case of a single file descriptor, select and poll are more efficient than epoll because they are much simpler. (epoll has some setup and bookkeeping overhead that doesn't pay off with only a single socket.)
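If you want to check this on your own machine, a rough micro-benchmark along these lines may help. It is only a sketch, assuming Linux and Python's select module; the helper names are invented here, and Python's interpreter overhead will blur the small differences between the underlying system calls, so treat the numbers as indicative at best. A C version of the same loop would be closer to what the socket class actually does.

```python
import select
import socket
import time

def time_calls(wait_once, iterations):
    """Return the average number of seconds per call of wait_once()."""
    start = time.perf_counter()
    for _ in range(iterations):
        wait_once()
    return (time.perf_counter() - start) / iterations

def benchmark_single_fd(iterations=20000):
    reader, writer = socket.socketpair()
    writer.send(b"x")  # leave data unread so every call returns immediately
    fd = reader.fileno()

    poller = select.poll()
    poller.register(fd, select.POLLIN)
    ep = select.epoll()
    ep.register(fd, select.EPOLLIN)
    try:
        return {
            "select": time_calls(lambda: select.select([fd], [], [], 0), iterations),
            "poll": time_calls(lambda: poller.poll(0), iterations),
            "epoll": time_calls(lambda: ep.poll(0), iterations),
        }
    finally:
        ep.close()
        reader.close()
        writer.close()

results = benchmark_single_fd()
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.2f} us per call")
```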

Fry answered 25/4, 2011 at 23:16
Comments (2):
Thanks, that basically confirmed what I found by experimentation. - Festination
You don't want to use select if the single file descriptor has a large numeric value, since select is O(max fd number), whereas poll is O(number of fds in the request). - Paramedic

According to the benchmark at http://www.intelliproject.net/articles/showArticle/index/io_multiplexing, if you use only one descriptor:

  • select: 201 microseconds.
  • poll: 159 microseconds.
  • epoll: 176 microseconds.

So poll seems to be the better choice in this situation.

Experiential answered 3/2, 2015 at 8:40

If you have only a single socket, what's the point of polling in the first place? Wouldn't you get the best performance by simply using blocking read/write?

As for performance, with only a single file descriptor I don't think there is much, if any, difference between the various approaches. If you really care, I suppose you could measure, but I find it hard to believe that this would matter much for the overall performance of your program.

Level/edge triggering: suppose you're monitoring a signal, for simplicity say the voltage on a line. Edge triggering means that something triggers when the voltage goes over or under some specific limit. Level triggering means that something is considered to be in a triggered state as long as the voltage is over/under the limit. That is, edge triggering fires when some event happens (crossing a threshold), while level triggering reflects the state of some "thing" (in this case, the voltage).

To get back to network programming, an edge-triggered system might be one where you get some kind of signal when a packet is received. If you don't handle the event, the signal is lost. A level-triggered system, OTOH, is something like asking "is there data waiting in the buffer for me?"; if you don't handle the event and ask again, the data will still be there waiting for you.
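To make the distinction concrete for epoll, here is a small sketch, assuming Linux and Python's select module (poll_twice is a made-up helper). It sends one byte and then asks epoll twice without reading anything in between. In level-triggered mode (the default) the second call reports the socket again because the data is still in the buffer. With EPOLLET the second call reports nothing, because no new data has arrived; the unread byte is still there, only the notification is not repeated.

```python
import select
import socket

def poll_twice(extra_flags=0):
    """Send one byte, then ask epoll twice without reading the data."""
    reader, writer = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(reader.fileno(), select.EPOLLIN | extra_flags)
        writer.send(b"x")
        first = ep.poll(0.5)   # the new data is reported in both modes
        second = ep.poll(0)    # nothing was read or sent in between
        return len(first), len(second)
    finally:
        ep.close()
        reader.close()
        writer.close()

level = poll_twice()                # default: level-triggered
edge = poll_twice(select.EPOLLET)   # edge-triggered

print("level-triggered:", level)    # (1, 1)
print("edge-triggered:", edge)      # (1, 0)
```

select and poll are always level-triggered, so code written around them behaves like the first case.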

Teets answered 13/4, 2011 at 11:26
Comments (1):
By using poll I can time out and do other work within the same thread. I can also use poll to block when there is nothing else to do and I'm waiting for data from the network. I could create another thread, but that creates other problems, with shared data among other things. More threads = more complexity and no benefit in my case. I've tried lots of approaches. - Festination
