Why does a simple Thin server stop responding at 16500 requests when benchmarking? [duplicate]

Possible Duplicate:
'ab' program freezes after lots of requests, why?

Here's a simple test server:

require 'rubygems'
require 'rack'
require 'thin'

class HelloWorld

  def call(env)
    # The Rack body must respond to #each, so wrap the string in an array.
    [200, {"Content-Type" => "text/plain"}, ["OK"]]
  end
end

Rack::Handler::Thin.run HelloWorld.new, :Port => 9294 
# I've tried with these added too: 'rack.multithread' => true, 'rack.multiprocess' => true

Here's a test run:

$ ab -n 20000 http://0.0.0.0:9294/sdf
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 0.0.0.0 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
apr_poll: The timeout specified has expired (70007)
Total of 16347 requests completed

It breaks down at around 16500 requests. Why, and how can I find out what's going on? Is it Ruby's GC, or is it something to do with the number of available network sockets on an OS X machine? I have a 2.5 GHz MBP with 6 GB of memory.


Edit

After some discussion here and testing various things, it seems that changing net.inet.tcp.msl from 15000 ms to 1000 ms makes the problem of benchmarking high-frequency web servers with ab go away.

sudo sysctl -w net.inet.tcp.msl=1000 # this is only good for local development

See the referenced question, which has the answer to this problem: 'ab' program freezes after lots of requests, why?

Imperial answered 6/2, 2012 at 6:49 Comment(6)
Did you find the reason? A potential explanation could be that the OS keeps a socket in a "recently used" state and doesn't reuse it for a few minutes. Apparently one can reconfigure the OS's IP layer to not do that. - Peristome
If it helps, I can reproduce this exact behaviour on my MBP. 16359 requests completed. No idea what causes it. - Fisc
Hmm, thinking out loud, this number is suspiciously close to 16384... - Fisc
This HN comment also notices the problem: news.ycombinator.com/item?id=820694 - Fisc
I don't know, maybe it's something with ab. It'd be interesting to try ab from another computer. - Imperial
And this issue shows it on a different HTTP server, using JMeter instead of ab: github.com/robbiehanson/CocoaHTTPServer/issues/31. It seems that OS X is the common thread here... - Fisc

I'll add the solution here for clarity's sake. The way to run high-frequency tests with ab on OS X is to change the net.inet.tcp.msl setting from 15000 ms to 1000 ms. This should only be done on development boxes.

 sudo sysctl -w net.inet.tcp.msl=1000 # this is only good for local development

This answer was found through the good detective work in the comments here, and it comes from an answer to a very similar question: https://mcmap.net/q/331976/-39-ab-39-program-freezes-after-lots-of-requests-why

Imperial answered 19/9, 2012 at 7:44 Comment(0)

I think I've got it.

When ab makes connections to your test server, it opens a source port (say, 50134) and makes a connection to the destination port (9294).

The range that ab's source ports are drawn from is determined by the sysctl settings net.inet.ip.portrange.first and net.inet.ip.portrange.last. For example, on my machine:

philippotter ~ $ sysctl -a | grep ip.portrange
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 49152
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 49152
net.inet.ip.portrange.hilast: 65535

This means that ab's source ports will be in the range from 49152 to 65535 inclusive, which is a total of 16384 ports.
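To check the same arithmetic on another machine, here is a minimal Ruby sketch. The helper names parse_sysctl and ephemeral_port_count are made up for illustration, and the sample text is just the two relevant lines from the output above; on a real machine you would feed it the output of sysctl instead.

```ruby
# Hypothetical helper: parse `sysctl` output of the form "name: value"
# into a Hash, keeping only entries whose value is a plain integer.
def parse_sysctl(text)
  text.each_line.with_object({}) do |line, settings|
    name, value = line.strip.split(": ", 2)
    settings[name] = Integer(value) if value&.match?(/\A\d+\z/)
  end
end

# Hypothetical helper: size of the ephemeral source port range,
# counting both the first and the last port.
def ephemeral_port_count(settings)
  settings["net.inet.ip.portrange.last"] - settings["net.inet.ip.portrange.first"] + 1
end

# Sample input copied from the sysctl output quoted in this answer.
SAMPLE = <<~OUT
  net.inet.ip.portrange.first: 49152
  net.inet.ip.portrange.last: 65535
OUT

puts ephemeral_port_count(parse_sysctl(SAMPLE)) # prints 16384
```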

HTTP runs over TCP. When a TCP connection is closed, the side that closes it first goes into the TIME_WAIT state while it waits for any remaining in-transit packets to reach their destinations. This means that the source port cannot be reused for a new connection to the same server until the timeout is reached.

So, putting all of this together, ab uses up all available source ports very quickly; they go into the TIME_WAIT state; they can't be reused; ab is unable to create any more connections.
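As a back-of-the-envelope check of this theory, the Ruby sketch below estimates how many new connections per second one client can sustain to a single server port before the source ports run out. It assumes TIME_WAIT lasts twice net.inet.tcp.msl (the usual 2 x MSL rule from the TCP specification) and that every closed connection holds its source port for that long; the function names are made up for illustration.

```ruby
# Assumption: TIME_WAIT lasts 2 * MSL, with MSL given in milliseconds
# (the unit used by net.inet.tcp.msl).
def time_wait_seconds(msl_ms)
  2 * msl_ms / 1000.0
end

# Rough ceiling on new connections per second from one client to one
# server port, if every source port is held for the full TIME_WAIT period.
def max_sustained_rate(port_count, msl_ms)
  port_count / time_wait_seconds(msl_ms)
end

PORTS = 65_535 - 49_152 + 1 # ephemeral range from the sysctl output: 16384

[15_000, 1_000].each do |msl_ms|
  puts format("msl=%dms: TIME_WAIT=%.0fs, about %.0f new connections/s",
              msl_ms, time_wait_seconds(msl_ms), max_sustained_rate(PORTS, msl_ms))
end
```

Under these assumptions, an msl of 15000 ms gives a ceiling of roughly 546 new connections per second, so a faster benchmark would stall once about 16384 connections have been made. An msl of 1000 ms raises the ceiling to roughly 8192 per second.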

You can see this if you kill ab when it hangs, and run it again -- it won't be able to create any connections!

Fisc answered 16/9, 2012 at 9:3 Comment(7)
Sounds like we're closer to the issue! But why isn't the output of netstat -p tcp filled with TIME_WAITs, and why can other programs still open connections? - Imperial
Starting a fresh server and running ab, it seems that the last request hangs on tcp4 0 0 localhost.52892 localhost.http SYN_SENT and then after a while the request times out. (I tried running the server on port 80 instead of a higher port.) - Imperial
@Imperial hmm, good questions. Perhaps it isn't TIME_WAIT after all. - Fisc
It seems that the ab client on OS X is faulty, as is frequently mentioned. I've tried installing a new ab client, but even though I frequently compile C code, I cannot ./configure my Apache project and I don't have the time to debug it right now. This looks interesting but might not be the problem: #1216767. Maybe this is on target, but I fail to compile it: #7939369 - Imperial
Did anyone try changing the TIME_WAIT setting as per #1216767? And did you see any difference? - Peristome
Just adding to what @Peristome said: the command to try is $ sudo sysctl -w net.inet.tcp.msl=1000 - Soundless
OK, I tried it and it works like a charm: 3159.56 r/sec, no hiccups. - Imperial
