Does epoll(), do its job in O(1)?

Asked 24/6, 2011 at 23:5 Answered 13/1, 2023 at 22:28

Wikipedia says

unlike the older system calls, which operate at O(n), epoll operates in O(1) [2]).

http://en.wikipedia.org/wiki/Epoll

However, the source code at fs/eventpoll.c on Linux-2.6.38, seems it is implemented with an RB tree for searching, which has O(logN)

/*
 * Search the file inside the eventpoll tree. The RB tree operations
 * are protected by the "mtx" mutex, and ep_find() must be called with
 * "mtx" held.
 */
static struct epitem *ep_find(struct eventpoll *ep, struct file *file, int fd)
{

In fact, I couldn't see any man page saying the complexity of epoll() is O(1). Why is it known as O(1)?

Mesognathous answered 24/6, 2011 at 23:5 Comment(11)

I would vote this up because you really did your homework, but I'm maxed out on daily votes :P – Zeralda 24/6, 2011 at 23:13

@ddoman: What Keoki Zee meant was that you did prior research on your own before asking this question. A lot of new (and old!) users of Stack Overflow can't even be bothered to do that. – Arioso 24/6, 2011 at 23:15

Research, for doing research :) – Zeralda 24/6, 2011 at 23:15

Well, google seems to agree – Consideration 24/6, 2011 at 23:17

There's no way epoll executes in O(1), as the number of file descriptors it needs to watch isn't constant. There has to be some kind of repetition construct. – Arioso 24/6, 2011 at 23:21

Would someone explain this epoll() function people keep talking about? I can't find any function with that name. epoll is a whole family of functions, not all of which have the same complexity. – Sustentation 24/6, 2011 at 23:24

@Insilico, you're making the assumption that epoll() must scan the file descriptors to find ready ones. epoll() can also wait on the descriptors to become ready... – Criminate 24/6, 2011 at 23:25

@sarnold: I made no assumption that epoll is scanning anything. What I meant was that if you're waiting on events that can happen at anytime then O(1) operation is not guaranteed by definition. Either that or everyone talking about the epoll API meant something else by O(1). – Arioso 24/6, 2011 at 23:40

@Insilico, hehe, good point. O(asleep). :) – Criminate 24/6, 2011 at 23:42

It is somewhat inaccurate to say that epoll operates at O(1). epoll operates at O(log(n)) in respect to adding and removing descriptors, at O(n) in respect of ready descriptors, and at O(1) in respect of monitored descriptors -- which is the whole point of epoll. Adding/removing descriptors is a rare thing, waiting for readiness is frequent, and you usually have many more descriptors in your set than are actually ready. Insofar, while epoll does not really operate at O(1), it does so where it matters. The common case is fast, the uncommon case is not. – Labialized 25/6, 2011 at 12:55

What does n in O(n) for ready descriptors signify? I thought it would be different from the input set. O(m) where m is the number of ready events. – Gramicidin 10/8, 2023 at 5:37

This makes sense once you look for ep_find. I only spent a few minutes with it and I see ep_find is only called by epoll_ctl.

So indeed, when you add the descriptors (EPOLL_CTL_ADD) that costly operation is performed. BUT when doing the real work (epoll_wait) it isn't. You only add the descriptors in the beginning.

In conclusion, it's not enough to ask the complexity of epoll, since there is no epoll system call. You want the individual complexities of epoll_ctl, epoll_wait etc.

Other stuff

There are other reasons to avoid select and use epoll. When using select, you don't know how many descriptors need attention. So you must keep track of the biggest and loop to it.

rc = select(...);
/* check rc */
for (s = 0; s <= maxfd; s++) {
    if (FD_ISSET(s)) {
        /* ... */
    }
}

Now with epoll it's a lot cleaner:

nfds = epoll_wait(epollfd, events, MAX_EVENTS, -1);
/* check nfds */
for (n = 0; n < nfds; ++n) {
    /* events[n].data.fd needs attention */
}

Beograd answered 25/6, 2011 at 9:55 Comment(4)

select returns the number of ready fd's too, in the worst case you need to loop through them all though (if maxfd has an event). – Abbevillian 25/6, 2011 at 10:5

@Abbevillian And how would you use that number to replace the loop ? – Beograd 25/6, 2011 at 10:6

Somewhere you need to have a list of all your file descripors. I often use a linked list if I'm using select.

int handled = 0; while(client) { if(FD_ISSET(client->fd,read_set) {handle(client); handled++; } if(handled == rc) break; client = clients->next;}

, no need to check from 0 to rc, only check the FDs you actually are monitoring, and no need to check more fds when you handled all the events select reported. – Abbevillian 25/6, 2011 at 10:10

Yes, you're right, you can shorten the loop. But, as you correctly pointed in your comment, you still have to process all your list (in the unfortunate case a descriptor near "the end" needs attention). And if that list is large and the descriptors are the first and the last.. – Beograd 25/6, 2011 at 10:13

I think epoll wait is O(1) with epollet if you ask for 1 event. And upd and add could be amortized O(1) if they used a descent hashtbl implementation.

This needs checking and man pages should mention complexity!

Somerset answered 13/1, 2023 at 22:28 Comment(0)

Other stuff

Recommended topics

Hot tags