How to set socket timeout in C when making multiple connections?

Asked 15/11, 2010 at 5:55 Answered 22/1, 2013 at 12:54

I'm writing a simple program that makes multiple connections to different servers for status checks. All these connections are created on demand; up to 10 connections can be open simultaneously. I don't like the idea of one thread per socket, so I made all these client sockets non-blocking and put them into a select() pool.

It worked great, until my client complained that it takes too long to get the error report when a target server has stopped responding.

I've checked several topics in the forum. Some suggested using the alarm() signal or setting a timeout in the select() function call. But I'm dealing with multiple connections instead of one. When a process-wide timeout signal fires, I have no way to tell which connection timed out among all the others.

Is there any way to change the system-default timeout duration?

Ellga answered 15/11, 2010 at 5:55 Comment(2)

Do you mean connect() takes too long to timeout or you are already connected and go through a long period when there is nothing to read? – Headley 15/11, 2010 at 6:40

@Duck: My problem is that connect() takes too long to time out. Each connection in my program is temporary; it's supposed to be closed immediately after a status-check handshake is performed. There is no need to adjust TCP_KEEP_ALIVE duration individually in my case. – Ellga 15/11, 2010 at 6:59

You can use the SO_RCVTIMEO and SO_SNDTIMEO socket options to set timeouts for any socket operations, like so:

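    #include <stdio.h>       /* perror */
    #include <sys/socket.h>  /* setsockopt, SOL_SOCKET, SO_RCVTIMEO, SO_SNDTIMEO */
    #include <sys/time.h>    /* struct timeval */

    /* sockfd is an already-created socket descriptor */
    struct timeval timeout;
    timeout.tv_sec = 10;
    timeout.tv_usec = 0;

    /* Limit how long a blocking receive may wait */
    if (setsockopt (sockfd, SOL_SOCKET, SO_RCVTIMEO, &timeout,
                sizeof timeout) < 0)
        perror("setsockopt failed");

    /* Limit how long a blocking send may wait */
    if (setsockopt (sockfd, SOL_SOCKET, SO_SNDTIMEO, &timeout,
                sizeof timeout) < 0)
        perror("setsockopt failed");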

Edit: from the setsockopt man page:

SO_SNDTIMEO is an option to set a timeout value for output operations. It accepts a struct timeval parameter with the number of seconds and microseconds used to limit waits for output operations to complete. If a send operation has blocked for this much time, it returns with a partial count or with the error EWOULDBLOCK if no data were sent. In the current implementation, this timer is restarted each time additional data are delivered to the protocol, implying that the limit applies to output portions ranging in size from the low-water mark to the high-water mark for output.

SO_RCVTIMEO is an option to set a timeout value for input operations. It accepts a struct timeval parameter with the number of seconds and microseconds used to limit waits for input operations to complete. In the current implementation, this timer is restarted each time additional data are received by the protocol, and thus the limit is in effect an inactivity timer. If a receive operation has been blocked for this much time without receiving additional data, it returns with a short count or with the error EWOULDBLOCK if no data were received. The struct timeval parameter must represent a positive time interval; otherwise, setsockopt() returns with the error EDOM.
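The inactivity-timer behaviour described in the excerpt can be sketched as follows. This is a minimal sketch assuming a POSIX system; the helper names `set_recv_timeout_ms` and `recv_timed_out` are hypothetical and exist only for illustration. It covers a blocking recv() only and says nothing about connect().

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: set a receive timeout of `ms` milliseconds on fd.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Hypothetical helper: perform one blocking recv() on fd.
 * Returns 1 if the call gave up because the receive timeout expired
 * with no data, 0 if data or end-of-stream arrived, -1 on any other error. */
int recv_timed_out(int fd)
{
    char buf[64];
    ssize_t n = recv(fd, buf, sizeof buf, 0);

    if (n >= 0)
        return 0;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 1;
    return -1;
}
```

With a 100 ms timeout set on an idle connected socket, `recv_timed_out` would be expected to return 1 after roughly that interval instead of blocking indefinitely.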

Denoting answered 15/11, 2010 at 8:21 Comment(21)

Are you certain this works for connect()? I don't believe so. – Headley 15/11, 2010 at 16:5

What makes you think it doesn't work for connect()? I am sure that I have used them to set timeouts on connect() calls. – Denoting 15/11, 2010 at 21:57

@Toby: I've tried, both blocking mode and non-blocking mode, but it doesn't work. These two parameters apparently only work with send() and recv(), not connect(). – Ellga 16/11, 2010 at 2:8

@Ellga - I threw together a sloppy test program and it did seem to work for blocking - returns -1, EINPROGRESS after the timeout. I didn't test non-blocking but I don't see how it saves anything for non-blocking. Connect() will return immediately and you still have to wait for the timeout in select(), test the FD, check error code, etc. Basically you are just setting two timers – Headley 16/11, 2010 at 2:25

@Headley That's weird. I got a different result. First I wrote a piece of code that opens a TCP socket and calls bind() and listen(), with backlog set to 1, but does not call accept() until I press Ctrl+C. Then I telnet to it. The telnet session is connected but not accepted, so subsequent connections stay pending until they time out. Then I make a second connection with my test code, to see whether the timeout duration is affected. I've tried both blocking and non-blocking mode, with and without setting SO_RCVTIMEO and SO_SNDTIMEO, but all of these tests gave an identical timeout duration. – Ellga 16/11, 2010 at 6:37

@Ellga - It seems SO_WHATEVER doesn't interrupt TCP operations under the covers, it just makes the connect/send/recv calls return when the timer expires. The 3-way hand shake keeps going like it always does until it times out on its own. The SO_WHATEVER or select() methods just give you the opportunity to say "the heck with it" and close the socket. – Headley 16/11, 2010 at 17:42

It definitely breaks a connect() for me and, just as @Ellga mentions, it returns a EINPROGRESS when the timeout expires. – Denoting 17/11, 2010 at 5:8

@Denoting So this may be kernel-dependent. My code runs on an ARM-Linux box with kernel 2.6.17 or so, and I write, test and cross-compile it on a Mac OS X 10.6.4 machine. Both platforms had a fixed timeout duration and gave only an ETIMEDOUT error from connect(), even with SO_WHATEVER applied. May I ask your kernel version? – Ellga 17/11, 2010 at 6:16

My dev box is x86 2.6.31 and my target platform is powerpc running 2.6.24, both running with glibc (not sure what versions). According to W. Richard Stevens's Unix Network Programming those two socket options aren't meant to work with connect(), so maybe it's best to not rely on this behaviour ;) On other (Mac OSX) platforms, I've had luck using a separate pthread monitoring the connect() and then calling shutdown() and close() on the socket if the connect() hasn't returned after a certain period. – Denoting 17/11, 2010 at 23:11

According to this pubs.opengroup.org/onlinepubs/009695399/functions/connect.html, "If the connection cannot be established immediately and O_NONBLOCK is not set for the file descriptor for the socket, connect() shall block for up to an unspecified timeout interval until the connection is established." It looks like those timeout intervals are for read or write operations only. – Koss 8/11, 2011 at 23:56

1. This does not work for connect(), or in general for 'any socket operations'. It works for reading and writing respectively. 2. I can't see how this answer can possibly be correct in non-blocking mode, which is what the OP is asking about. The correct answer is to use a select() timeout and keep track of which sockets had connections initiated at which times. – Digestant 17/6, 2013 at 10:18

This does work for connect(). See SO_RCVTIMEO and SO_SNDTIMEO in socket(7): Linux manpage, which says "if no data has been transferred and the timeout has been reached then -1 is returned with errno set to EAGAIN or EWOULDBLOCK, or EINPROGRESS (for connect(2)) just as if the socket was specified to be nonblocking" – Struble 24/5, 2016 at 5:11

I may upvote it if you also mentioned the #include's that have to be included ;) That's harder to find on the internet... will take me another 5min... – Elsieelsinore 1/6, 2016 at 18:10

Does not work under "Darwin Kernel Version 14.5.0", OS X 10.10.5. Should use select. – Cachou 21/7, 2016 at 4:56

@CraigMcQueen The sentence from which your selective quotation is taken starts 'If an input or output function blocks for this period of time, and data has been sent or received, the return value of that function will be the amount of data transferred; ...' connect() is not an input or output function and does not return a count. The remark about connect() appears to be an irrelevant digression. You need select() for a connect timeout shorter than the platform default. – Digestant 10/3, 2017 at 1:24

on Linux 4.13 only SO_SNDTIMEO sets a connect() timeout, SO_RCVTIMEO seems to be ignored. – Frambesia 6/4, 2018 at 17:29

Why do you need to cast the timeval struct address to a char pointer? – Jolson 22/12, 2018 at 20:36

@CraigMcQueen 'Just as if the socket was specified to be non-blocking': in other words this is for sockets that have not been specified to be non-blocking, which is not what the question is about. – Digestant 14/4, 2019 at 10:26

Is it just me? The socket also waits for the timeout AFTER it has read the whole message! – Grouchy 23/9, 2021 at 21:17

The "setsockopt" is invoked one after another. How do you guarantee that the latest call does not overwrite the value(s) from the previous? – Lake 29/10, 2022 at 19:30

@Digestant I'm also confused about why the Linux man page mentions connect(); SO_SNDTIMEO and SO_RCVTIMEO seem to have nothing to do with connect(). – Reface 13/1 at 15:48

I'm not sure I fully understand the issue, but I guess it's related to one I had. I'm using Qt with TCP socket communication, all non-blocking, on both Windows and Linux.

I wanted to get a quick notification when an already connected client failed or completely disappeared, rather than waiting the default 900+ seconds until the disconnect signal was raised. The trick to get this working was to set the TCP_USER_TIMEOUT socket option of the SOL_TCP layer to the required value, given in milliseconds.

This is a comparatively new option (see https://www.rfc-editor.org/rfc/rfc5482), but apparently it works fine. I tried it with WinXP, Win7/x64 and Kubuntu 12.04/x64; my choice of 10 s turned out to take a bit longer in practice, but it was much better than anything else I'd tried before ;-)

The only issue I came across was finding the proper includes, as apparently this isn't in the standard socket includes (yet), so in the end I defined the constants myself as follows:

#ifdef WIN32
    #include <winsock2.h>
#else
    #include <sys/socket.h>
#endif

#ifndef SOL_TCP
    #define SOL_TCP 6  // socket options TCP level
#endif
#ifndef TCP_USER_TIMEOUT
    #define TCP_USER_TIMEOUT 18  // how long for loss retry before timeout [ms]
#endif

Setting this socket option only works when the client is already connected. The lines of code look like:

int timeout = 10000;  // user timeout in milliseconds [ms]
setsockopt (fd, SOL_TCP, TCP_USER_TIMEOUT, (char*) &timeout, sizeof (timeout));

The failure of an initial connect is caught by a timer started when calling connect(), because Qt will not signal it: the connect signal will not be raised, as there will be no connection, and the disconnect signal will also not be raised, as there hasn't been a connection yet.
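For Linux specifically, a minimal sketch of setting and reading back the option might look like this. It assumes a glibc recent enough that `<netinet/tcp.h>` defines `TCP_USER_TIMEOUT`, so no manual `#define` is needed; `IPPROTO_TCP` has the same value (6) as the `SOL_TCP` used above. The helper names are hypothetical, and the Windows side is not covered here.

```c
#include <netinet/in.h>    /* IPPROTO_TCP */
#include <netinet/tcp.h>   /* TCP_USER_TIMEOUT (assumed available, Linux/glibc) */
#include <sys/socket.h>

/* Hypothetical helper: ask TCP to give up on unacknowledged data after
 * `ms` milliseconds. Returns 0 on success, -1 on failure (errno set). */
int set_tcp_user_timeout(int fd, unsigned int ms)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &ms, sizeof ms);
}

/* Hypothetical helper: read the current value back into *ms.
 * Returns 0 on success, -1 on failure (errno set). */
int get_tcp_user_timeout(int fd, unsigned int *ms)
{
    socklen_t len = sizeof *ms;

    return getsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, ms, &len);
}
```

Following the answer, `set_tcp_user_timeout(fd, 10000)` would be called once the connection is established; reading the value back is a cheap way to confirm the platform accepted it.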

Protecting answered 18/10, 2012 at 6:4 Comment(2)

This answer helped me after I didn't work it out with KEEPALIVE settings. Thanks! – Allerus 13/4, 2015 at 9:9

Thanks! this helped me to tackle the 15 min delay in disconnect caused by the TCP retransmit timer. – Finale 9/12, 2015 at 0:18

Can't you implement your own timeout system?

Keep a sorted list, or better yet a priority heap as Heath suggests, of timeout events. In your select or poll calls, use the time remaining until the earliest entry in the timeout list as the timeout value. When that timeout expires, perform the action attached to it.

That action could be closing a socket that hasn't connected yet.

Bejarano answered 15/11, 2010 at 6:39 Comment(4)

Hmm...that would be a good idea, but it would take some effort to do it. I was hoping something like setsockopt() function call that could set connection timeout duration individually. BTW, what would happen to select() if I close a connection-pending socket in another thread ? Will it cause some thread chasing situation ? – Ellga 15/11, 2010 at 7:7

This solution is the cleanest and most robust one, and it has no upvotes? Here's mine. – Yangyangtze 18/10, 2012 at 6:28

This is the way I've always done it, and it works well. I suspect it's more portable than the other solutions proposed above also. – Ary 31/1, 2013 at 16:26

This is a great way to do that, +1. I like to do this with a priority heap -- saves a little O() time over keeping a fully sorted list when we always only read the highest-priority element. – Patriliny 2/3, 2013 at 0:24

A connect timeout has to be handled with a non-blocking socket (see the GNU LibC documentation on connect). You make connect() return immediately and then use select() to wait, with a timeout, for the connection to complete.

This is also explained here: "Operation now in progress" error on connect() function.

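#define _GNU_SOURCE        /* for TEMP_FAILURE_RETRY (glibc-specific) */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait up to `timeout` seconds for sock to become readable (r != 0)
 * and/or writable (w != 0). Returns 0 on success, -errno on failure. */
int wait_on_sock(int sock, long timeout, int r, int w)
{
    struct timeval tv = {0,0};
    fd_set fdset;
    fd_set *rfds, *wfds;
    int n, so_error;
    socklen_t so_len;

    FD_ZERO (&fdset);
    FD_SET  (sock, &fdset);
    tv.tv_sec = timeout;
    tv.tv_usec = 0;

    fprintf (stderr, "wait in progress tv={%ld,%ld} ...\n",
            (long) tv.tv_sec, (long) tv.tv_usec);

    if (r) rfds = &fdset; else rfds = NULL;
    if (w) wfds = &fdset; else wfds = NULL;

    /* Retry select() if it is interrupted by a signal */
    TEMP_FAILURE_RETRY (n = select (sock+1, rfds, wfds, NULL, &tv));
    switch (n) {
    case 0:
        /* select() does not set errno on timeout, so report one explicitly */
        fprintf (stderr, "wait timed out\n");
        return -ETIMEDOUT;
    case -1:
        so_error = errno;   /* save errno before any further library call */
        fprintf (stderr, "error during wait: %s\n", strerror (so_error));
        return -so_error;
    default:
        // select tells us that sock is ready; check the pending socket error
        so_len = sizeof(so_error);
        so_error = 0;
        getsockopt (sock, SOL_SOCKET, SO_ERROR, &so_error, &so_len);
        if (so_error == 0)
            return 0;
        errno = so_error;
        fprintf (stderr, "wait failed: %s\n", strerror (so_error));
        return -so_error;
    }
}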
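For context, the whole sequence (switch the socket to non-blocking, start connect(), wait with a timeout, then read SO_ERROR) might be put together as in the sketch below. It is self-contained rather than built on the function above, and it uses poll() in place of select(), which is a substitution made here for brevity. Both helper names are hypothetical.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect fd to addr, giving up after timeout_ms.
 * Returns 0 on success or a negative errno value (-ETIMEDOUT on timeout).
 * The socket is left in non-blocking mode. */
int connect_with_timeout(int fd, const struct sockaddr *addr,
                         socklen_t addrlen, int timeout_ms)
{
    struct pollfd pfd;
    int flags, n, so_error = 0;
    socklen_t len = sizeof so_error;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -errno;

    if (connect(fd, addr, addrlen) == 0)
        return 0;                    /* connected immediately */
    if (errno != EINPROGRESS)
        return -errno;               /* failed immediately */

    /* The socket becomes writable once the handshake succeeds or fails.
     * Note: a retry after EINTR restarts the full timeout in this sketch. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);
    if (n == 0)
        return -ETIMEDOUT;
    if (n == -1)
        return -errno;

    /* Writable does not mean connected: SO_ERROR holds the real outcome. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) == -1)
        return -errno;
    return -so_error;
}

/* Hypothetical convenience wrapper for a dotted-quad IPv4 address.
 * Returns a connected descriptor, or a negative errno value. */
int connect_ipv4_with_timeout(const char *ip, unsigned short port,
                              int timeout_ms)
{
    struct sockaddr_in sa;
    int fd, rc;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return -EINVAL;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -errno;
    rc = connect_with_timeout(fd, (struct sockaddr *)&sa, sizeof sa,
                              timeout_ms);
    if (rc != 0) {
        close(fd);                   /* give up on this attempt */
        return rc;
    }
    return fd;
}
```

A status check with a 3 second limit could then be written as `connect_ipv4_with_timeout("192.0.2.10", 80, 3000)`, where the address and port are placeholders. When many connections are in flight at once, the same steps apply per socket inside a single select() or poll() loop.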

Octahedrite answered 22/1, 2013 at 12:54 Comment(1)

Great answer! May i know what is the recommended value for timeout here i.e for connection timeout? – Rag 22/1, 2016 at 9:40
