Multiple threads doing poll() or select() on a single socket or pipe
C

4

27

What do POSIX and other standards say about the situation where multiple threads are doing poll() or select() calls on a single socket or pipe handle at the same time?

If any data arrives, does only one of the waiting threads get woken up or do all of the waiting threads get woken up?

Correggio answered 19/9, 2013 at 9:51 Comment(0)
M
23

Interesting question. I read through the current POSIX specification and did not find a specific answer; it says nothing explicit about concurrent invocations. So I'll explain why I think the standard implies that all waiting threads will wake up.

The relevant part of the text for select / pselect is:

Upon successful completion, the pselect() or select() function shall modify the objects pointed to by the readfds, writefds, and errorfds arguments to indicate which file descriptors are ready for reading, ready for writing, or have an error condition pending, respectively, [...]

and later

A descriptor shall be considered ready for reading when a call to an input function with O_NONBLOCK clear would not block, whether or not the function would transfer data successfully. (The function might return data, an end-of-file indication, or an error other than one indicating that it is blocked, and in each of these cases the descriptor shall be considered ready for reading.)

In short (the reading case only), we can understand this as:

If select does not block, this means that the next call to an input function with O_NONBLOCK set would not fail with errno == EWOULDBLOCK. [Note that "next" is my interpretation of the above.]

If one accepts this interpretation, then two concurrent select calls could both return the same FD as readable. In fact, they need not even be concurrent: if a first thread calls select, sees some FD as readable, and only later calls e.g. read, a second thread calling select between those two calls could also be told that the FD is readable.

Now the relevant part for the "waking up" part of the question is this:

If none of the selected descriptors are ready for the requested operation, the pselect() or select() function shall block until at least one of the requested operations becomes ready, until the timeout occurs, or until interrupted by a signal.

Here the above interpretation clearly suggests that concurrently waiting calls will all return.
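The reasoning above can be checked with a small experiment. The following is only a sketch, assuming a POSIX system with pthreads; the names `count_woken_waiters`, `wait_for_input`, and `struct waiter` are made up for illustration. Several threads call poll() on the same pipe read end, one byte is written, and nobody reads it, so the descriptor stays readable for every waiter.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

#define NUM_WAITERS 2

/* Hypothetical per-thread record: which fd to poll and what poll() said. */
struct waiter {
    int fd;        /* shared read end of the pipe */
    int readable;  /* set to 1 if poll() reported POLLIN */
};

static void *wait_for_input(void *arg)
{
    struct waiter *w = arg;
    assert(w != NULL);

    struct pollfd pfd = { .fd = w->fd, .events = POLLIN };

    /* Block for up to 5 seconds waiting for the fd to become readable. */
    int n = poll(&pfd, 1, 5000);
    w->readable = (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
    return NULL;
}

/* Returns how many of the NUM_WAITERS threads saw the pipe as readable,
   or -1 if the setup failed. */
int count_woken_waiters(void)
{
    int fds[2];
    pthread_t tid[NUM_WAITERS];
    struct waiter w[NUM_WAITERS];
    int started = 0;
    int woken = 0;

    if (pipe(fds) != 0)
        return -1;

    for (; started < NUM_WAITERS; started++) {
        w[started] = (struct waiter){ .fd = fds[0], .readable = 0 };
        if (pthread_create(&tid[started], NULL, wait_for_input, &w[started]) != 0)
            break;
    }

    /* One byte makes the pipe readable; no thread ever consumes it. */
    ssize_t written = write(fds[1], "x", 1);

    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        woken += w[i].readable;
    }

    close(fds[0]);
    close(fds[1]);

    if (started != NUM_WAITERS || written != 1)
        return -1;
    return woken;
}
```

Because the readiness condition is a property of the descriptor's state and not of any particular caller, every waiter should report the pipe as readable here, whether it was already blocked in poll() or called it after the write. This shows the behaviour on one system; it does not prove what the standard requires.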

Monadism answered 24/9, 2013 at 14:8 Comment(0)
C
8

I just found a bug because of this question: I have two threads calling select() on the same listening socket, and each calls accept() when FD_ISSET() reports the fd as set. In fact select() returns in both threads, FD_ISSET() is true for that fd in both threads, and both threads call accept(); one wins and the other blocks waiting for another connection to come in.

So in fact select() returns in every thread that is blocked on the same fd.

Consecutive answered 25/5, 2016 at 14:49 Comment(2)
This surely isn't surprising. For example, thread A might come off the select() and then before it has a chance to call accept(), thread B may be scheduled to run and come off select() for the same reason as the descriptor is in the same state. Thus you now have both threads about to call accept(). You should always assume that there will be an infinitely large amount of time between one piece of code running and the next, keep in mind that any other process or thread might run during that time, and whether doing so could be a problem for your code.Corroborant
Gravity isn't surprising if you know how it works. But Newton found it surprising. :-)Consecutive
A
5

To avoid overloading the system, I did the kernel work to implement EPOLLEXCLUSIVE for Linux epoll. Setting this flag when registering a file descriptor ensures that only one listener (strictly, per the epoll_ctl(2) man page, one or more) receives the event, even if several threads or processes are waiting on the given file descriptor via epoll (the Linux-specific counterpart of poll/select). This is a very useful feature. For example, Enduro/X is a multi-process middleware in which several load-balanced executables monitor the same set of file descriptors (queues) using epoll. Without EPOLLEXCLUSIVE, when an event arrives many processes get false wake-ups: the first process to wake has already consumed the event from the FD, and the others receive empty notifications. This is the thundering herd problem. That empty processing costs CPU time, which adds up with, say, 500 binaries waiting for events.

IBM AIX 7.3 introduced a new poll() flag, POLLEXCL, which works in a similar way to EPOLLEXCLUSIVE.
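For the Linux case, registering a shared descriptor with EPOLLEXCLUSIVE might look roughly like the sketch below. It assumes Linux 4.5 or later and a libc that defines EPOLLEXCLUSIVE; `make_exclusive_waiter` is a hypothetical helper name, and each worker thread or process is assumed to create its own epoll instance this way.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: creates a private epoll instance that watches
   shared_fd for input using exclusive wakeups.
   Returns the epoll fd, or -1 on error. */
int make_exclusive_waiter(int shared_fd)
{
    assert(shared_fd >= 0);

    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };
    ev.data.fd = shared_fd;

    /* EPOLLEXCLUSIVE may only be used with EPOLL_CTL_ADD. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, shared_fd, &ev) == -1) {
        close(epfd);
        return -1;
    }
    return epfd;
}
```

Each worker then calls epoll_wait() on its own epoll fd. Since a worker can still be woken for an event that another worker has already consumed, it seems prudent to keep the shared descriptor non-blocking and to handle EAGAIN.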

Amphiarthrosis answered 22/5, 2020 at 9:2 Comment(0)
P
3

They should all wake up, all return the same result value, and all do the same thing to the FD sets. They are all asking the same question, so they should all get the same answer.

What select() is supposed to do, according to the POSIX documentation which has been cited here, and my mere 25 years' experience with it, is to return the number of FDs that are readable, writable, etc., at that instant. It would therefore be completely incorrect for the concurrent select() calls not to all return the same thing.

The select() function can't predict the future, i.e. which thread is actually going to do a read or write, and therefore which thread will succeed in that. They contend. It's a thundering-herd problem.

Phyllida answered 19/9, 2013 at 10:3 Comment(13)
Since you have no intention of answering the question, why didn't you make this a comment?Daddylonglegs
@EJP: Me testing on, say, Linux does not say anything about, say, HP-UX. I want some reasoned answer, not an empiric test, if I can have it.Correggio
@Daddylonglegs It seems to me that I have answered the question actually, so the second part of your question is a non sequitur.Phyllida
@Correggio This is a 'reasoned answer'. Clearly there is no pleasing everybody. One person considers a reasoned answer not to be an answer at all, and another doesn't recognize a reasoned answer when he sees it.Phyllida
I agree the answer is reasoned. It's just not an answer to the question. I appreciate the reasoning.Daddylonglegs
@Daddylonglegs The OP has specifically asked here for a 'reasoned answer'. You can't have it both ways.Phyllida
Well, the question asks what POSIX says. If POSIX doesn't really say anything, any observed behavior on a platform could be purely accidental, unless that particular platform has its own guarantees about how this is handled.Polaroid
@Polaroid I agree, but that evidently wouldn't satisfy the OP, who has expressed a preference here for a 'reasoned answer'. I've amended mine above.Phyllida
@Daddylonglegs I cannot understand you. First you say you wanted a reasoned answer, and now when you get one you state it isn't an answer to the question in some other unspecified way.Phyllida
@EJP You realize this is roughly 4 years ago right. I don't know what "you're coma" means. My last comment here is from 10:13, which means that I saw nothing but this answer. I don't think it's surprising that people would be a bit skeptical of an answer that just says "sure, why not. strange question". I like the current answer though. Enjoy the upvote.Daddylonglegs
@Daddylonglegs I don't know what "you're coma" means either. I didn't write it. It's not in the revision history. Perhaps you could make yourself more clear? and specifically why you are quoting other words in 2017 that were deleted four years earlier?Phyllida
Oh well. You're replying to a message of 2654 days ago. I also don't know where "you're coma" came from. FWIW comments do not have revision history (but you can see when they've been edited). Your last question was answered in 2017, which ironically was already 4 years after the fact. I admire your exhaustive attention to historical posts, but I honestly don't think it's worth the time. Interestingly, these days I would almost concur with your assessment that the question was strange/the answer seemed obvious. However, this dropped in 2020...:)Daddylonglegs
@Daddylonglegs EPOLLEXCLUSIVE proves my point completely. "when event arrives without EPOLLEXCLUSIVE, lot of processes gets false wake ups (where some first wake up did already remove the event from FD - thundering herd issue)", exactly as I said above.Phyllida
