Go client program generates a lot of sockets in TIME_WAIT state
I have a Go program that generates a lot of HTTP requests from multiple goroutines. After running for a while, the program fails with the error: connect: cannot assign requested address.

When checking with netstat, I get a high number (28229) of connections in TIME_WAIT.

The number of TIME_WAIT sockets is already high when the number of goroutines is 3, and it is severe enough to cause a crash when it is 5.

I run Ubuntu 14.04 under Docker and Go version 1.7.

This is the Go program.

package main

import (
        "io/ioutil"
        "log"
        "net/http"
        "sync"
)

var wg sync.WaitGroup
var url = "http://172.17.0.9:3000/"

const num_coroutines = 5
const num_request_per_coroutine = 100000

// get_page fetches the URL once and reads the whole response body.
func get_page() {
        response, err := http.Get(url)
        if err != nil {
                log.Fatal(err)
        }
        defer response.Body.Close()
        _, err = ioutil.ReadAll(response.Body)
        if err != nil {
                log.Fatal(err)
        }
}

// get_pages issues the requests of one goroutine sequentially.
func get_pages() {
        defer wg.Done()
        for i := 0; i < num_request_per_coroutine; i++ {
                get_page()
        }
}

func main() {
        for i := 0; i < num_coroutines; i++ {
                wg.Add(1)
                go get_pages()
        }
        wg.Wait()
}

This is the server program:

package main

import (
    "fmt"
    "log"
    "net/http"
)

var count int

func sayhelloName(w http.ResponseWriter, r *http.Request) {
    count++
    fmt.Fprintf(w, "Hello World, count is %d", count) // send data to client side
}

func main() {
    http.HandleFunc("/", sayhelloName)       // set router
    err := http.ListenAndServe(":3000", nil) // set listen port
    if err != nil {
        log.Fatal("ListenAndServe: ", err)
    }
}
Tiflis answered 2/10, 2016 at 3:30 Comment(6)
TIME_WAIT is the normal TCP state after closing a connection. What exactly are you trying to test here? - Indraft
JimB, I am trying to stress test the web server 172.17.0.9:3000 and I want to do it using just one client machine. I know that this is possible because there are no problems if I set num_coroutines to 2, but I want to use many coroutines. - Tiflis
You're opening and closing connections too fast for your server. Is the server you're testing expected to reuse http/1.1 connections, or does it close the connection on every request? - Indraft
JimB, the server program is very simple; I added it to the question. I don't think it is using keep-alive connections. - Tiflis
No, the server is using http/1.1 by default. The problem is partly because the server is too simple and not really doing any work, and benchmarking a "hello world" doesn't prove anything since the client is being tested just as much as the server, with conflating issues from the OS and network stack. (also see stackoverflow.com/questions/30352725). - Indraft
Adding a bit more context to "TIME_WAIT is the normal TCP state after closing a connection": serverfault.com/a/23395/117206 - Washboard

The default http.Transport is opening and closing connections too quickly. Since all connections are to the same host:port combination, you need to increase MaxIdleConnsPerHost to match your value for num_coroutines. Otherwise, the transport will frequently close the extra connections, only to have them reopened immediately. The default limit is 2 (http.DefaultMaxIdleConnsPerHost), which is consistent with the problem not appearing at two goroutines.
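One way to check whether connections are actually being reused is the net/http/httptrace package. The following is a minimal sketch, not part of the original program: it starts a local httptest server as a stand-in for the server at 172.17.0.9:3000, and the measure function and its parameters are hypothetical names chosen for illustration. It assumes Go 1.16 or later for io.Discard.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"sync"
	"sync/atomic"
)

// measure (hypothetical helper) starts a throwaway local server, sends
// workers*perWorker GET requests through a transport with the given
// MaxIdleConnsPerHost, and reports how many times a request was handed a
// fresh connection versus a reused one.
func measure(maxIdlePerHost, workers, perWorker int) (fresh, reused int64) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "Hello World")
	}))
	defer srv.Close()

	tr := &http.Transport{MaxIdleConnsPerHost: maxIdlePerHost}
	defer tr.CloseIdleConnections()
	client := &http.Client{Transport: tr}

	// GotConn is called after a connection has been obtained for a request.
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) {
			if info.Reused {
				atomic.AddInt64(&reused, 1)
			} else {
				atomic.AddInt64(&fresh, 1)
			}
		},
	}
	ctx := httptrace.WithClientTrace(context.Background(), trace)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
				if err != nil {
					return
				}
				resp, err := client.Do(req)
				if err != nil {
					return
				}
				// Drain and close the body so the connection can go back to the idle pool.
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	return fresh, reused
}

func main() {
	const workers, perWorker = 5, 200
	// Compare the default per-host idle limit with one that matches the worker count.
	for _, n := range []int{http.DefaultMaxIdleConnsPerHost, workers} {
		fresh, reused := measure(n, workers, perWorker)
		fmt.Printf("MaxIdleConnsPerHost=%d: %d fresh, %d reused\n", n, fresh, reused)
	}
}
```

With the limit matched to the number of workers, the fresh count should stay close to the worker count, while the default limit would be expected to show noticeably more fresh connections. Exact numbers depend on timing and are not guaranteed.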

You can set this globally on the default transport:

http.DefaultTransport.(*http.Transport).MaxIdleConnsPerHost = numCoroutines

Or when creating your own transport:

t := &http.Transport{
    Proxy: http.ProxyFromEnvironment,
    DialContext: (&net.Dialer{
        Timeout:   30 * time.Second,
        KeepAlive: 30 * time.Second,
    }).DialContext,
    MaxIdleConnsPerHost:   numCoroutines,
    MaxIdleConns:          100,
    IdleConnTimeout:       90 * time.Second,
    TLSHandshakeTimeout:   10 * time.Second,
    ExpectContinueTimeout: 1 * time.Second,
}
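A custom transport only takes effect once it is attached to an http.Client that all goroutines share. As a rough sketch of how the client from the question could be wired up this way: newClient and getPage are hypothetical helper names, the request count is reduced for illustration, and io.Discard assumes Go 1.16 or later (ioutil.Discard on older releases).

```go
package main

import (
	"io"
	"log"
	"net"
	"net/http"
	"sync"
	"time"
)

// newClient (hypothetical helper) wraps the transport settings shown above
// in a client that can be shared by all goroutines.
func newClient(maxIdlePerHost int) *http.Client {
	t := &http.Transport{
		Proxy: http.ProxyFromEnvironment,
		DialContext: (&net.Dialer{
			Timeout:   30 * time.Second,
			KeepAlive: 30 * time.Second,
		}).DialContext,
		MaxIdleConnsPerHost:   maxIdlePerHost,
		MaxIdleConns:          100, // cap across all hosts; raise it if maxIdlePerHost exceeds 100
		IdleConnTimeout:       90 * time.Second,
		TLSHandshakeTimeout:   10 * time.Second,
		ExpectContinueTimeout: 1 * time.Second,
	}
	return &http.Client{Transport: t}
}

// getPage fetches url, then drains and closes the body so the connection
// can be returned to the idle pool instead of being closed.
func getPage(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	const numCoroutines = 5
	const requestsPerCoroutine = 1000     // reduced from the question for illustration
	const url = "http://172.17.0.9:3000/" // server address from the question

	client := newClient(numCoroutines)
	var wg sync.WaitGroup
	for i := 0; i < numCoroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < requestsPerCoroutine; j++ {
				if err := getPage(client, url); err != nil {
					log.Printf("request failed: %v", err)
					return // stop this worker on the first error
				}
			}
		}()
	}
	wg.Wait()
}
```

Returning errors from getPage instead of calling log.Fatal lets each worker stop on its own, so one failed request does not end the whole run.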

Similar question: Go http.Get, concurrency, and "Connection reset by peer"

Indraft answered 3/10, 2016 at 14:40 Comment(5)
JimB, I used the first option above and it greatly improved the behavior of the program. Now it does not crash for low values of num_coroutines, but it does break for high values (for example 10000). I'll try the more verbose option and see if it helps more. - Tiflis
@yigal: Of course it will break if you raise concurrency high enough. What is the point of testing 10000 concurrent connections with a single HTTP client and server over loopback? You only have so many file descriptors and ephemeral ports you can make use of without some system tuning and better configuration. - Indraft
The idea is to stress test our system using just one client machine. The advantage of a single client machine over multiple client machines is that it should be simpler to develop and test the stress test code. I am trying out Go for this purpose as it is a fast language with low overhead for spawning threads/coroutines. I am not completely versed in Linux optimizations, but my logic says that 10000 concurrent connections should be achievable with stock Linux. I just need to handle the TIME_WAIT problem more efficiently. - Tiflis
Is there any difference when we do a POST request? - Gamber
How can I set a different proxy for every request this way? Is it possible? - Skydive
