How does the socket API accept() function work?
C

4

144

The socket API is the de facto standard for TCP/IP and UDP/IP communications (that is, networking code as we know it). However, one of its core functions, accept(), is a bit magical.

To borrow a semi-formal definition:

accept() is used on the server side. It accepts a received incoming attempt to create a new TCP connection from the remote client, and creates a new socket associated with the socket address pair of this connection.

In other words, accept returns a new socket through which the server can communicate with the newly connected client. The old socket (on which accept was called) stays open, on the same port, listening for new connections.

How does accept work? How is it implemented? There's a lot of confusion on this topic. Many people claim accept opens a new port and you communicate with the client through it. But this obviously isn't true, as no new port is opened. You actually can communicate through the same port with different clients, but how? When several threads call recv on the same port, how does the data know where to go?

I guess it's something along the lines of the client's address being associated with a socket descriptor, and whenever data comes through recv it's routed to the correct socket, but I'm not sure.

It'd be great to get a thorough explanation of the inner-workings of this mechanism.

Catalonia answered 28/1, 2009 at 19:47 Comment(3)
so for every client request, a brand NEW socket connection at server end is opened. The server must be open at 80 always to listen for incoming calls. If it receives a call, it then immediately creates a NEW socket with the four tuples as mentioned below, which will make a TCP connection between client and server. Is my understanding correct?Toinette
This is a very fundamental question and I was recently tested on this in an interview: #24872327 If you have any comments on this, please postToinette
@brainstorm Only if you completely ignore the existence of HTTP keep-alive.Incidental
M
163

Your confusion lies in thinking that a socket is identified by Server IP : Server Port. In actuality, sockets are uniquely identified by a quartet of information:

Client IP : Client Port and Server IP : Server Port

So while the Server IP and Server Port are constant in all accepted connections, the client side information is what allows it to keep track of where everything is going.

Example to clarify things:

Say we have a server at 192.168.1.1:80 and two clients, 10.0.0.1 and 10.0.0.2.

10.0.0.1 opens a connection on local port 1234 and connects to the server. Now the server has one socket identified as follows:

10.0.0.1:1234 - 192.168.1.1:80  

Now 10.0.0.2 opens a connection on local port 5678 and connects to the server. Now the server has two sockets identified as follows:

10.0.0.1:1234 - 192.168.1.1:80  
10.0.0.2:5678 - 192.168.1.1:80
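
The example above can be reproduced on one machine. What follows is only a minimal sketch, assuming a POSIX system with a working loopback interface. The names `demo_two_clients`, `connect_client`, `port_of` and `loopback_addr` are hypothetical helpers made up for this illustration, not part of the sockets API. Both clients run on 127.0.0.1, so only the client port differs between the two connections, and the server port is picked by the kernel instead of being 80.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill in a 127.0.0.1:port address (port given in host byte order). */
static void loopback_addr(struct sockaddr_in *a, int port) {
    memset(a, 0, sizeof *a);
    a->sin_family = AF_INET;
    a->sin_port = htons((unsigned short)port);
    a->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
}

/* Port of the local end (want_peer == 0) or remote end (want_peer != 0), or -1. */
static int port_of(int fd, int want_peer) {
    struct sockaddr_in a;
    socklen_t len = sizeof a;
    int rc;
    assert(fd >= 0);
    rc = want_peer ? getpeername(fd, (struct sockaddr *)&a, &len)
                   : getsockname(fd, (struct sockaddr *)&a, &len);
    return rc < 0 ? -1 : ntohs(a.sin_port);
}

/* Open a TCP connection to 127.0.0.1:port; returns the descriptor or -1. */
static int connect_client(int port) {
    struct sockaddr_in srv;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    loopback_addr(&srv, port);
    if (connect(fd, (struct sockaddr *)&srv, sizeof srv) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 0 if both accepted sockets use the listening port locally, have
 * different peer ports, and each receives only its own client's byte. */
int demo_two_clients(void) {
    struct sockaddr_in addr;
    int c1 = -1, c2 = -1, a1 = -1, a2 = -1, port, result = -1;
    char b1 = 0, b2 = 0, want1, want2;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) return -1;

    loopback_addr(&addr, 0); /* port 0: let the kernel pick a free port */
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 || listen(lfd, 4) < 0)
        goto done;
    port = port_of(lfd, 0);

    /* The kernel completes both handshakes and queues the connections. */
    c1 = connect_client(port);
    c2 = connect_client(port);
    if (c1 < 0 || c2 < 0) goto done;

    /* Each accept() returns a new descriptor; lfd keeps listening. */
    a1 = accept(lfd, NULL, NULL);
    a2 = accept(lfd, NULL, NULL);
    if (a1 < 0 || a2 < 0) goto done;

    printf("127.0.0.1:%d - 127.0.0.1:%d\n", port_of(a1, 1), port_of(a1, 0));
    printf("127.0.0.1:%d - 127.0.0.1:%d\n", port_of(a2, 1), port_of(a2, 0));

    /* Each client sends a distinct byte over its own connection. */
    if (send(c1, "1", 1, 0) != 1 || send(c2, "2", 1, 0) != 1) goto done;
    if (recv(a1, &b1, 1, 0) != 1 || recv(a2, &b2, 1, 0) != 1) goto done;

    /* Match each accepted socket to its client by comparing port numbers. */
    want1 = port_of(a1, 1) == port_of(c1, 0) ? '1' : '2';
    want2 = port_of(a2, 1) == port_of(c2, 0) ? '2' : '1';
    if (port_of(a1, 0) == port && port_of(a2, 0) == port &&
        port_of(a1, 1) != port_of(a2, 1) && b1 == want1 && b2 == want2)
        result = 0;

done:
    if (a1 >= 0) close(a1);
    if (a2 >= 0) close(a2);
    if (c1 >= 0) close(c1);
    if (c2 >= 0) close(c2);
    close(lfd);
    return result;
}
```

Calling `demo_two_clients()` from a `main` prints the two address pairs in the same "client - server" form used above. The port numbers are chosen by the kernel, so they change from run to run, but the server-side port is the same on both lines.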
Mustachio answered 28/1, 2009 at 19:51 Comment(13)
Does the TCP/IP (which is it, btw?) tag the address-port pair in some way, hashing it into the socket to which to transfer the data for it?Catalonia
I don't know the implementation details (which probably vary from platform to platform), I just know that conceptually the sockets are identified by the quartet of information I described.Mustachio
Do you have any reference on this?Wryneck
Random question: What happens if NAT is being used, and two clients on the same network attempt to use the same local port when connecting to the server? For instance, if 10.0.0.1 and 10.0.0.2 are both connected to a router with an external IP of 192.168.0.1, so the server at 192.168.1.1 sees two connections from 192.168.0.1. What happens in that case if by some fluke of the random-number-generator both 10.0.0.1 and 10.0.0.2 choose the same local port?Dray
The NAT support in the router takes care of the details there. The network traffic is actually going over two connections - client to router, and router to server. The router makes the outgoing connections on two different ports 192.168.0.1:1234 and 192.168.0.1:5678. The incoming traffic is then redirected by the router to the correct client.Mustachio
@EliBendersky - "Which is it?" Both. Transfer Control Protocol is an application of Internet Protocol. They are at different levels of abstraction. When written as "TCP-IP", TCP is customarily interpreted as referring to the socket based connection protocol. Strictly, TCP encompasses all the transfer control protocols such as FTP, HTTP, SMTP, ICMP, UDP etc, even though most of these are applications of socket connections. Any of the TCPs could be implemented on some other packet protocol, and Netware in fact implemented several over IPX.Opalescent
@PeterWone: by "which is it", I meant to ask whether the address-port tagging is done on the TCP level or on the IP level.Catalonia
@EliBendersky TCP ports only exist at the TCP level.Incidental
If a socket is identified by the quartet, what is quartet information of a listening socket?Waldrop
@EricZheng That is likely implementation dependent. If I run 'netstat -a' on my Windows box, the foreign addresses for sockets in the listening state are all localMachineName:0Mustachio
@17of26 For me, it is a wildcard address. So for a listening socket, it seems the dst addr and dst port are not quite important, it can be any address and any port. Once accept return and set up a connection, a new socket is created, with a very specific dst addr and dst port, maybe src addr becomes specific as well if the previous socket is listening on any address.Waldrop
@EricZheng The remote address and port of a listening socket are meaningless and undefined. And it certainly isn't 'implementation-dependent'.Incidental
@EricZheng Sockets are not identified by the quartet. TCP connections are.Handshake
H
87

Just to add to the answer given by user "17 of 26"

The socket is actually described by a 5-tuple: (source IP, source port, destination IP, destination port, protocol). Here the protocol could be TCP, UDP, or any other transport-layer protocol. It is identified by the 'protocol' field in the header of the IP datagram.

Thus it is possible to have two different applications on the server communicating with the same client on exactly the same 4-tuple, differing only in the protocol field. For example:

Apache at server side talking on (server1.com:880-client1:1234 on TCP) and World of Warcraft talking on (server1.com:880-client1:1234 on UDP)

Both the client and the server can tell the two apart because the protocol field in the IP packet differs between the two cases, even though the other four fields are the same.
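
One way to see that TCP and UDP port numbers do not collide is to bind one socket of each kind to the same address and port. This is a minimal sketch, assuming a POSIX system. `bound_socket` and `bind_tcp_and_udp_same_port` are hypothetical names introduced for illustration, and the port is whatever the kernel hands out, not the 880 from the example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a socket of the given type (SOCK_STREAM or SOCK_DGRAM) and bind it
 * to 127.0.0.1:port. Returns the descriptor, or -1 on failure. */
static int bound_socket(int type, int port) {
    struct sockaddr_in a;
    int fd;
    assert(port >= 0 && port <= 65535);
    fd = socket(AF_INET, type, 0);
    if (fd < 0) return -1;
    memset(&a, 0, sizeof a);
    a.sin_family = AF_INET;
    a.sin_port = htons((unsigned short)port);
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&a, sizeof a) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Binds a TCP and a UDP socket to the same loopback port.
 * Returns the shared port number on success, or -1. */
int bind_tcp_and_udp_same_port(void) {
    for (int attempt = 0; attempt < 5; attempt++) {
        struct sockaddr_in a;
        socklen_t len = sizeof a;
        int port, udp;
        int tcp = bound_socket(SOCK_STREAM, 0); /* port 0: kernel picks one */
        if (tcp < 0) return -1;
        if (getsockname(tcp, (struct sockaddr *)&a, &len) < 0) {
            close(tcp);
            return -1;
        }
        port = ntohs(a.sin_port);

        /* Same address, same port number, different protocol. */
        udp = bound_socket(SOCK_DGRAM, port);
        if (udp >= 0) {
            printf("TCP and UDP sockets both bound to 127.0.0.1:%d\n", port);
            close(udp);
            close(tcp);
            return port;
        }
        close(tcp); /* some other process holds that UDP port; try again */
    }
    return -1;
}
```

The second `bind` succeeds even though the TCP socket already holds the same address and port, which is what you would expect if the protocol is part of what identifies the endpoint.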

Hourihan answered 5/3, 2009 at 16:35 Comment(0)
C
18

What confused me when I was learning this was that the terms socket and port suggest something physical, when in fact they're just data structures the kernel uses to abstract the details of networking.

As such, the data structures are implemented so that they can distinguish connections from different clients. As to how they're implemented, the answer is either (a) it doesn't matter, since the purpose of the sockets API is precisely that the implementation shouldn't matter, or (b) just have a look. Apart from the highly recommended Stevens books, which provide a detailed description of one implementation, check out the source of Linux, Solaris, or one of the BSDs.
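
To make the idea of "a data structure that distinguishes connections" concrete, here is a toy sketch in C. It is not taken from any real kernel: `four_tuple`, `conn_table`, `table_add`, `table_lookup` and `IP4` are all hypothetical names, and a real stack would use something much faster than a linear scan. It only shows that a lookup keyed on (client IP, client port, server IP, server port) is enough to tell apart two clients talking to the same server port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack a dotted-quad address into a 32-bit value (illustration only). */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

struct four_tuple {
    uint32_t client_ip;
    uint16_t client_port;
    uint32_t server_ip;
    uint16_t server_port;
};

struct conn_entry {
    struct four_tuple key;
    int socket_id; /* stands in for whatever per-connection state exists */
};

#define MAX_CONNS 16

struct conn_table {
    struct conn_entry entries[MAX_CONNS];
    size_t count;
};

static int same_tuple(const struct four_tuple *a, const struct four_tuple *b) {
    return a->client_ip == b->client_ip && a->client_port == b->client_port &&
           a->server_ip == b->server_ip && a->server_port == b->server_port;
}

/* Record a new connection. Returns 0 on success, -1 if the table is full. */
int table_add(struct conn_table *t, struct four_tuple key, int socket_id) {
    assert(t != NULL);
    if (t->count == MAX_CONNS) return -1;
    t->entries[t->count].key = key;
    t->entries[t->count].socket_id = socket_id;
    t->count++;
    return 0;
}

/* Find the socket an incoming segment belongs to, or -1 if none matches. */
int table_lookup(const struct conn_table *t, struct four_tuple key) {
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (same_tuple(&t->entries[i].key, &key))
            return t->entries[i].socket_id;
    }
    return -1;
}
```

With two entries whose server IP and port are identical, a lookup still returns a different `socket_id` for each client, because the client IP and port are part of the key.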

Coster answered 30/1, 2009 at 11:4 Comment(1)
Yes, most of the networking terminology is just assigning names to certain collections of bits and to decisions taken based on their values ("protocol identifier", "routing", "binding", "socket" etc.). All your network card's hardware is designed to receive is a stream of bits. What happens to them in relation to programs on your computer is decided by the driver and OS. We could get rid of all of that terminology tomorrow if we wanted, but the principle of delivering a stream of bits seems fundamental...Bezonian
D
-1

As the other guy said, a socket is uniquely identified by a 4-tuple (Client IP, Client Port, Server IP, Server Port).

The server process running on the Server IP maintains a database (meaning I don't care what kind of table/list/tree/array/magic data structure it uses) of active sockets and listens on the Server Port. When it receives a message (via the server's TCP/IP stack), it checks the Client IP and Port against the database. If the Client IP and Client Port are found in a database entry, the message is handed off to an existing handler, else a new database entry is created and a new handler spawned to handle that socket.

In the early days of the ARPAnet, certain protocols (FTP for one) would listen to a specified port for connection requests, and reply with a handoff port. Further communications for that connection would go over the handoff port. This was done to improve per-packet performance: computers were several orders of magnitude slower in those days.

Dineric answered 28/1, 2009 at 20:1 Comment(5)
can you elaborate on the 'handoff port' part?Catalonia
This is either a description of some pre-TCP protocol, or overly simplified. A client attempting to connect to a listening socket sends a special packet to establish the connection (SYN bit set). There's a clear distinction between a packet creating a new socket and one using an existing socket.Yashmak
...sends a special packet to establish the connection (SYN bit set). Which (as I understand it) causes the protocol stack to give it to 'the' listener (if any) which is why there can be only one listening port per address/port/protocol combination. I'm not sure if this is in the spec or merely implementation convention though.Opalescent
The second paragraph does not correctly describe what happens either at the TCP layer or within a server process. Server processes don't need to maintain data structures of sockets of any kind, or to check incoming IP:port pairs against anything whatsoever. That's what sockets are there for. FTP uses a separate port for data, not for all 'further communications', and that's done to simplify the protocol, not for performance reasons. Using a new port will not improve performance in any way whatsoever.Incidental
"maintains a database (meaning I don't care what kind of table/list/tree/array/magic data structure it uses)" :) I usually call this a "Table" (or maybe "Graph" or "Decision tree"). "Database" suggests some implementation to me.Bezonian
