Is there any benefit to using epoll with a very small number of file descriptors?

Would the following single threaded UDP client application see a performance benefit from using epoll over simply calling recvfrom/sendto on non-blocking sockets?

Let me explain the client.

I am writing a single-threaded, UDP-based client (custom protocol) that both sends and receives data using non-blocking I/O, and my colleague suggested I use epoll for this. The client sends and receives multiple packets that are all associated with a unique session id, and multiple sessions can run simultaneously.

If I use epoll, there will be a limited number of file descriptors, maybe 10-20, for epoll_wait to wait on. Each file descriptor would be associated with one session, so that's a maximum of 10-20 sessions, and this limit will be enforced.

Each session has its own state machine. From a single thread I need to run each state machine reasonably frequently and poll the associated socket as well.

In my case, I'd have to use epoll_wait with a timeout of zero or some very small value so that I can give CPU time to run the state machines for each session. If there is data for a session then it needs to be directed to the associated state machine.

However, I can't really see much benefit of this design with such a small number of file descriptors.

The way I see it, I have two design options. Option 1: in my main loop I poll the descriptors using epoll_wait with either a small timeout or a timeout of zero.

How to handle the data at this point is where I'm getting a bit stuck. Either I read it right away and put it into a queue for each state machine to pick up when it next runs, or I set a flag on the state machine to tell it that data is waiting, and when the state machine runs it picks the data up with a call to recvfrom. Or I read the data, handle it right away, and run the state machine for it.
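For reference, option 1 could look roughly like the sketch below. It is only an illustration under assumptions: `struct session`, `watch_session`, `poll_sessions_once`, and the `bytes_seen` counter are hypothetical names standing in for the real state machine, and the "read right away" variant is the one shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_SESSIONS 20

/* Hypothetical per-session state; a real client keeps its state machine here. */
struct session {
    int fd;              /* non-blocking UDP socket for this session */
    size_t bytes_seen;   /* stand-in for "feed the datagram to the state machine" */
};

/* Register a session's socket once; the session index comes back in the event. */
static int watch_session(int epfd, struct session *sessions, uint32_t idx)
{
    assert(idx < MAX_SESSIONS);
    struct epoll_event ev = { .events = EPOLLIN, .data.u32 = idx };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, sessions[idx].fd, &ev);
}

/* One pass of the main loop: a timeout of 0 makes epoll_wait return immediately. */
static int poll_sessions_once(int epfd, struct session *sessions)
{
    struct epoll_event events[MAX_SESSIONS];
    char buf[2048];
    int n = epoll_wait(epfd, events, MAX_SESSIONS, 0);

    for (int i = 0; i < n; i++) {
        struct session *s = &sessions[events[i].data.u32];
        ssize_t len = recvfrom(s->fd, buf, sizeof buf, 0, NULL, NULL);
        if (len >= 0)
            s->bytes_seen += (size_t)len;   /* hand off to the state machine here */
    }
    return n;   /* number of ready sockets, or -1 on error */
}
```

After each call to `poll_sessions_once`, the main loop would still step every state machine, whether or not its socket had data.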

Option 2: just run each state machine from the main loop and call recvfrom. If I get some data, handle it. If I don't, do whatever else the state machine requires. Is there a huge overhead in calling recvfrom when there is no data?

Going the epoll route means coding in some extra complexity. If there is a strong likelihood that it will be faster in my case, I will do it. However, if the second way, which is really simple, works just as well, then I would not use epoll.
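A minimal sketch of option 2, assuming each session owns one non-blocking socket. The helper names `set_nonblocking` and `try_recv` and the return-code convention are made up for illustration. When no datagram is queued, recvfrom fails right away with EAGAIN (or EWOULDBLOCK), so the cost of an empty check is one system call per session per loop pass.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put a socket into non-blocking mode so recvfrom never stalls the loop. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Try to read one datagram without blocking.
 * Returns the datagram length (>= 0), -1 if nothing is queued,
 * or -2 on a real error (errno is left set by recvfrom).
 */
static ssize_t try_recv(int fd, void *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    ssize_t len = recvfrom(fd, buf, cap, 0, NULL, NULL);
    if (len >= 0)
        return len;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return -1;   /* no data right now: carry on with the state machine */
    return -2;
}
```

Each state machine step would call `try_recv` on its own socket and treat -1 as "nothing new this time".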

Any thoughts?

Genova answered 21/12, 2011 at 22:50 Comment(2)
While interesting, your question seems more like an invitation to a discussion. – Burne
Maybe I should do a trial and error approach and just start simple. – Genova

No, and in fact performance will be much worse with epoll if adding and removing file descriptors from the polled set is anything but an extremely rare event. With poll, a single syscall performs the entire operation. With epoll, every change to the set is a separate epoll_ctl call, and you still need epoll_wait to wait on it.
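To make the single-syscall point concrete, here is a rough sketch of one loop pass using poll. The function name `poll_and_read` and the per-session `bytes` counters are hypothetical; the counters stand in for handing each datagram to its session's state machine. The descriptor set is rebuilt from a plain array on every call, so sessions can come and go without any extra syscalls.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

#define MAX_SESSIONS 20

/*
 * Check all session sockets with one poll() call, then read one datagram
 * from each socket that is readable. timeout_ms = 0 returns immediately.
 * Returns the number of ready sockets, 0 if none, or -1 on error.
 */
static int poll_and_read(const int *fds, int count, int timeout_ms, size_t *bytes)
{
    struct pollfd pfds[MAX_SESSIONS];
    char buf[2048];

    assert(count >= 0 && count <= MAX_SESSIONS);
    for (int i = 0; i < count; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    int n = poll(pfds, (nfds_t)count, timeout_ms);
    if (n <= 0)
        return n;

    for (int i = 0; i < count; i++) {
        if (pfds[i].revents & POLLIN) {
            ssize_t len = recvfrom(fds[i], buf, sizeof buf, 0, NULL, NULL);
            if (len >= 0)
                bytes[i] += (size_t)len;   /* hand off to session i here */
        }
    }
    return n;
}
```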

Unless you're writing a server that's intended to scale to tens or hundreds of thousands of long-lived connections, epoll is not just a premature optimization but an actual pessimization. It is also Linux-specific, so it is nonstandard and non-portable.

Partition answered 22/12, 2011 at 1:24 Comment(0)
