Socket buffers the data it receives

I have a client .NET application and a server .NET application, connected through sockets.

The client sends a string of 20 or so characters every 500 milliseconds.

On my local development machine this works perfectly, but once the client and the server are on two different servers, the server does not receive the string immediately when it is sent. The client still sends correctly; I've confirmed this with Wireshark. I have also confirmed that the strings do arrive at the server machine every 500 milliseconds.

The problem is that my server application that is waiting for the message only actually receives the message every 20 seconds or so - and then it receives all the content from those 20 seconds.

I use asynchronous sockets and for some reason the callback is just not invoked more than once every 20 seconds.

AcceptCallback establishes the connection and calls BeginReceive:

handler.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0, new AsyncCallback(ReadCallback), state);

This works on my local machine, but on my production server the ReadCallback doesn't happen immediately.

The BufferSize is set to 1024. I also tried setting it to 10. It makes a difference in how much data it will read from the socket at one time once the ReadCallback is invoked, but that's not really the problem here. Once it invokes ReadCallback, the rest works fine.

I'm using Microsoft's Asynchronous Server Socket Example, so you can see there what my ReadCallback method looks like.

How can I get the BeginReceive callback immediately when data arrives at the server?

--

UPDATE

This has been solved. The cause was that the server had a single processor with a single core. After adding another core, the problem was instantly solved. ReadCallback is now called immediately when the data arrives at the server.

Thank you all for your suggestions!

Gardas answered 24/8, 2013 at 12:33 Comment(16)
Can you easily write a little test program that does synchronous receives, just to see if it gets the data in a more timely manner? That might help isolate the problem.Buckra
I'll try doing that. Another question said it would be the same result, but it could be worth a try anyway.Gardas
@JimMischel: I changed it to synchronous and now it receives data immediately. So the issue continues to be that BeginReceive doesn't callback immediately when data arrives. I do not want to continue with synchronous, because then I have to do the threading myself to handle multiple connections at the same time.Gardas
Can you try the same with a test application that uses asynchronous receives, to see whether it works there? That may help to identify the problem.Inlet
I actually didn't write a test application, I adjusted my current, very simple, application to use synchronous and then it worked. I guess it anyway could be relevant to create a test application with asynchronous just to, kind of, confirm the problem is related to that.Gardas
Do you use the sync or async API on the server side to send?Quinton
I don't send from the server. Could this be a problem?Gardas
Is your code exactly the Asynchronous Server Socket Example that you linked? If not, can you post your exact code somewhere so that I can test it?Buckra
Do you have 1-core CPU, by chance?Protestation
Please post your code.Abutment
@DarkWanderer, yes I do actually. But I have multiple CPUs in my development machine. Interesting observation. I'll add another CPU in production and see what happens.Gardas
@AlexandreVinçon, my code is almost 100% identical to the Microsoft sample. I can post it, but it takes a lot of space in the question and people might not be able to understand the circumstances.Gardas
@Niels Brinch: That's just a guess, but maybe the second thread just doesn't get time to execute because the OS (almost) always schedules the main thread... This is the case where adding Sleep(1) in several places may actually speed up the program :)Protestation
@NielsBrinch if everything works flawlessly on the same machine, the cause is most probably a firewall or port issue. Try opening the ports you use on both ends; I hope this resolves your problem. Another question: when you say the client and server are on different servers, are they in the same network, or connected through the internet?Roentgen
@NielsBrinch from your client try to telnet your server and your port and see what happens.Roentgen
@DarkWanderer: That was it! After adding another (virtual) core to the server, it started calling ReadCallback immediately! Please post it as an answer.Gardas

By the request of OP, duplicating my "comment/answer" here.

My guess was, the problem appeared because of thread scheduling on a single-core machine. This is an old problem, almost extinct in the modern age of hyper-threading/multi-core processors. When a thread is spawned in the course of execution of the program, it needs scheduled time to run.

On a single-core machine, if one thread continues to execute without explicitly passing control to the OS scheduler (by waiting on a mutex/signal or by calling Sleep), the execution of any other thread (in the same process and with lower priority) may be postponed indefinitely by the scheduler. Hence, in the case described, the asynchronous network thread was (most likely) just starved of execution time, getting only small slices from time to time.

Adding a second CPU/core fixed that by letting the threads run in parallel.
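
A quick way to check whether a server might be exposed to this kind of starvation is to print the core count and the thread pool limits at startup. The following is only a minimal diagnostic sketch, assuming a plain .NET console application; the class name `SchedulingDiagnostics` is hypothetical and not part of the Microsoft sample.

```csharp
using System;
using System.Threading;

public static class SchedulingDiagnostics
{
    // Returns a one-line summary of the CPU and thread pool configuration.
    public static string Describe()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        return $"cores={Environment.ProcessorCount}, " +
               $"worker threads={minWorker}..{maxWorker}, " +
               $"I/O completion threads={minIo}..{maxIo}";
    }

    public static void Main()
    {
        // A reported core count of 1 matches the situation described above.
        Console.WriteLine(Describe());
    }
}
```

If the reported core count is 1 and another thread in the process is busy without ever blocking, delayed callbacks of the kind described in the question become plausible.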

Protestation answered 11/9, 2013 at 9:21 Comment(1)
Nope. In my case I have a 7-core machine, yet ReadCallback is fired only after ~15 seconds. So I am now starting the "listener.BeginAccept()" method inside the worker thread in the main thread, and the time taken has dropped to ~5 seconds. Is this the correct approach?Litho

One approach might be to adjust the SO_SNDBUF option on the send side. Since you are not running into this problem when both server and client are on the same box, it is possible that a small buffer is throttling the send side because of a (possibly) slower sending rate between the servers. If the sender cannot send fast enough, the send-side buffer might fill up sooner.

Update: we did some debugging, and it turns out that the issue is with the application being slow.

Cormac answered 24/8, 2013 at 13:16 Comment(18)
I set the send buffer size to the size of the message I am sending. Is it better if I set it higher or lower?Gardas
byte[] msg = Encoding.ASCII.GetBytes(strMsg); socket.SendBufferSize = msg.Length; socket.Send(msg);Gardas
SO_SNDBUF is different from msg.Length. SO_SNDBUF represents the transport buffer beneath the app and is set on a per-socket basis. msg.Length applies on a per-packet basis. Basically, if SO_SNDBUF is big (let us say around 20 packets), then it would not throttle the application. Otherwise, the send call might block (if the socket is blocking).Cormac
If SO_SNDBUF does not work, then I would recommend running tcpdump on both the server and client machines. Looking at the timestamps of the messages sent and received should give us some more insight into why the latency is occurring.Cormac
I am not sure what throttle means and whether this is what is happening in my application. There is not much load on either the client or the server, if that's what you mean.Gardas
Yep, that is the tool! The term throttle can be used to describe a performance bottleneck.Cormac
What will I be looking for if I use this tool?Gardas
let us continue this discussion in chatGardas
OK, I used Wireshark (Winpcap) to confirm the packets ARE sent from the client every 500 milliseconds.Gardas
I also confirmed that the packets are arriving steadily on the server side, every 500 milliseconds. So the transfer goes well. The issue then seems to be with my server application OR some mechanism in the operating system.Gardas
I agree. The next point of investigation should be why your application is reading the data slowly. Is the server doing other blocking calls as well, like accept?Cormac
Yes, it has an accept method that waits for connections, accepts a connection and then waits for the next one. However, this SEEMS to run perfectly, in that it receives the connection immediately when the client connects and then the connection stays open indefinitely, as it's supposed to. It doesn't reconnect each time; it re-uses the same connection. Also, it should be noted that currently there is only ONE client connecting.Gardas
Assuming that the process is behaving correctly, we can try bumping up the process priority ( prnwatch.com/prio ). Also, what happens if we start a second client: does the second client also see a similar delay of 20 seconds?Cormac
Yes, the second and third clients see the same kind of delay. It suggests that some sort of internal buffer fills up before it reacts. I would like to repeat that there is no performance load of any kind on the server. This connection takes up far, far less than 1% of CPU resources, so we are not looking at a performance-related problem.Gardas
You may also want to instrument (put time-stamps in a global string and print them later) and see why the receive call is taking so much time..Cormac
@Niels Brinch, can you please share your application code? Or at least a snippet of it.Cormac
I removed some of the irrelevant code because people were not understanding which code was important. The problem is that the ReadCallback method is NOT called when new data arrives. So it's likely there is no problem INSIDE ReadCallback, which is why I removed it.Gardas
Three things. First, coming back to the accept() call: are we calling accept() again for future connections? I know that you have only one connection in this example. If you are, can you please disable the accept() call after you receive the first client. Second, can you please instrument your code: add current-time statements around the recv() call and see if there is anything that might be adding the delay. Third, did you try bumping up the process priority?Cormac
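
The distinction made in the comments above, between the per-socket send buffer (SO_SNDBUF) and the length of a single message, can be sketched as follows. This is only an illustration under stated assumptions: the class name `SendBufferExample` and the 8192-byte value are hypothetical, and the operating system is free to round or adjust the size it actually applies.

```csharp
using System;
using System.Net.Sockets;

public static class SendBufferExample
{
    // Hypothetical size: room for many 20-byte messages, not just one.
    public const int RequestedSendBufferBytes = 8192;

    // Sets SO_SNDBUF on the socket and returns the size the OS reports back.
    public static int ConfigureSendBuffer(Socket socket)
    {
        socket.SendBufferSize = RequestedSendBufferBytes;
        return socket.SendBufferSize;
    }

    public static void Main()
    {
        using (var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            Console.WriteLine($"Effective SO_SNDBUF: {ConfigureSendBuffer(socket)} bytes");
        }
    }
}
```

In contrast, `socket.SendBufferSize = msg.Length` sizes the transport buffer to exactly one message, which is the setting the comments advise against.
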

It might be that the Nagle algorithm is waiting on the sender side for more data. If you are sending small chunks of data, they are merged into one segment so that you don't pay a large TCP header overhead for each small piece. You can disable it using StreamSocketControl.NoDelay. See: http://msdn.microsoft.com/en-us/library/windows/apps/windows.networking.sockets.streamsocketcontrol.nodelay

The Nagle algorithm might be disabled for loopback and this is a possible explanation of why it works when you have both the sender and the receiver on the same machine.
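
For a desktop or server .NET application that uses `System.Net.Sockets.Socket` rather than the Windows Runtime `StreamSocket`, the equivalent setting is the `Socket.NoDelay` property. A minimal sketch, assuming an IPv4 TCP socket; the class and method names are hypothetical:

```csharp
using System;
using System.Net.Sockets;

public static class NagleExample
{
    // Creates an unconnected TCP socket with the Nagle algorithm disabled.
    public static Socket CreateNoDelaySocket()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        socket.NoDelay = true; // send small writes immediately instead of coalescing them
        return socket;
    }

    public static void Main()
    {
        using (Socket socket = CreateNoDelaySocket())
        {
            Console.WriteLine($"NoDelay = {socket.NoDelay}");
        }
    }
}
```

The option has to be set on the sending side, since that is where Nagle coalescing happens.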

Calise answered 24/8, 2013 at 13:52 Comment(4)
That sounded very promising. Unfortunately it doesn't make any difference. I set this on the sender side like this: socket.NoDelay = true;Gardas
I agree. Since NoDelay works at a timescale of 200 milli-seconds, it should not create delay worth 10 seconds. That is why I did not propose that earlier -- but it is good to know that NoDelay is not the culprit.Cormac
Gabi, if you have any other suggestions like this I would very much be interested in pursuing them.Gardas
@Gabi he said that it used to work fine when client and server were in the same machine.Roentgen
