Creating a WebSocket Client in Python

I am trying to learn about socket programming as well as the WebSocket protocol. I know that Python WebSocket clients already exist, but I am hoping to build a toy version for my own learning. To do this I have created an extremely simple Tornado WebSocket server that I am running on localhost:8888. All it does is print a message when a client connects.

This is the entire server, and it works (I have tested it with a small JavaScript snippet in my browser):

import tornado.httpserver
import tornado.websocket
import tornado.ioloop
import tornado.web


class WSHandler(tornado.websocket.WebSocketHandler):
    def open(self):
        print('new connection')
        self.write_message("Hello World")

    def on_message(self, message):
        print('message received %s' % message)

    def on_close(self):
        print('connection closed')

application = tornado.web.Application([
    (r'/ws', WSHandler),
])


if __name__ == "__main__":
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    tornado.ioloop.IOLoop.instance().start()

So once I start up the server I try to run the following script

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((socket.gethostbyname('localhost'), 8888))

msg = '''GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://example.com
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13'''.encode('ascii')
print(len(msg))

sent_count = sock.send(msg)
print('sent this many bytes:', sent_count)
recv_value = sock.recv(1)
print('received:', recv_value)

What I am hoping is that the server will send back the response header specified in the RFC. Instead, sock.recv hangs, which leads me to believe the server isn't acknowledging the initial WebSocket handshake. The handshake itself is taken from the RFC as well. I know that the WebSocket key should be random, but I don't think that would cause the server to ignore the handshake (the key is valid). I think I can figure the rest out once I can initiate the handshake, so I am hoping there is just some misunderstanding on my part about either how WebSockets work or how to send the initial handshake.

Esse answered 8/6, 2013 at 6:32 Comment(0)

1) When you send a message over a socket, you have no idea how many chunks it will be divided into. It may all get sent at once; or the first 3 letters may be sent, then the rest of the message; or the message may be split into 10 pieces.

2) Given 1), how is the server supposed to know when it has received all the chunks sent by the client? For instance, suppose the server receives 1 chunk of the client's message. How does the server know whether that was the whole message or whether there are 9 more chunks coming?

3) I suggest you read this:

http://docs.python.org/2/howto/sockets.html

(Plus the links in the comments)

4) Now, why aren't you using python to create an HTTP server?

python3:

import http.server
import socketserver

PORT = 8000
handler = http.server.SimpleHTTPRequestHandler

httpd = socketserver.TCPServer(("", PORT), handler)

print("serving at port", PORT)
httpd.serve_forever()

python2:

import SimpleHTTPServer
import SocketServer

PORT = 8000
handler = SimpleHTTPServer.SimpleHTTPRequestHandler

httpd = SocketServer.TCPServer(("", PORT), handler)

print "serving at port", PORT
httpd.serve_forever()

The SimpleHTTPRequestHandler serves files out of the server program's directory and below, matching the request url to the directory structure you create. If you request '/', the server will serve up an index.html file out of the same directory the server is in. Here is an example of a client socket for python 3 (python 2 example below):

import socket   
import sys

try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error:
    print('Failed to create socket')
    sys.exit()

print('Socket Created')

#To allow you to immediately reuse the same port after 
#killing your server:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

host = 'localhost'
port = 8000

s.connect((host , port))

print('Socket Connected to ' + host + ' on port ', port)


#Send some data to server
message = "GET / HTTP/1.1\r\n\r\n"

try :
    #Send the whole string(sendall() handles the looping for you)
    s.sendall(message.encode('utf8') )
except socket.error:
    print('Send failed')
    sys.exit()

print('Message sent successfully')

#Now receive data
data = []

while True:
    chunk = s.recv(4096)  #blocks while waiting for data
    if chunk: data.append(chunk)
    #If recv() returns an empty bytes object, then the other side
    #closed the socket, and no more data will be sent:
    else: break

#Decode once at the end, so a multi-byte character split across
#two chunks cannot break the decoding:
print(b"".join(data).decode("utf8"))

--output:--
Socket Created
Socket Connected to localhost on port  8000
Message sent successfully
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/3.2.3
Date: Sat, 08 Jun 2013 09:15:18 GMT
Content-type: text/html
Content-Length: 23
Last-Modified: Sat, 08 Jun 2013 08:29:01 GMT

<div>hello world</div>
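
The exchange above can also be reproduced in one script, which may help when experimenting. This is only a sketch under a few assumptions: it needs Python 3.7 or later (for the directory argument of SimpleHTTPRequestHandler), it serves a throwaway temporary folder instead of the current directory, and the names root, handler, httpd and response are chosen here for illustration.

```python
import functools
import http.server
import os
import socket
import tempfile
import threading

with tempfile.TemporaryDirectory() as root:
    # The file that a request for '/' is mapped to.
    with open(os.path.join(root, "index.html"), "wb") as f:
        f.write(b"<div>hello world</div>\n")

    # directory= (Python 3.7+) sets the folder that URLs are resolved against.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=root
    )
    # Port 0 asks the operating system for any free port.
    httpd = http.server.HTTPServer(("127.0.0.1", 0), handler)
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()

    chunks = []
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(b"GET / HTTP/1.1\r\n\r\n")
        while True:
            chunk = s.recv(4096)
            if not chunk:  # server closed the connection: response complete
                break
            chunks.append(chunk)

    httpd.shutdown()
    httpd.server_close()

response = b"".join(chunks).decode("utf8")
print(response)
```

Because the handler answers with HTTP/1.0 and then closes the connection, the receive loop ends on its own once the whole response has arrived.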

In python 3, you have to use byte strings with sockets, otherwise you will get the dreaded:

TypeError: 'str' does not support the buffer interface
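
A minimal sketch of that point, using socket.socketpair() so that no server is needed (that choice, and the variable names, are assumptions made for the illustration). Newer Python 3 releases word the TypeError differently, but the cause and the fix are the same: encode the text before sending it.

```python
import socket

# socketpair() returns two sockets that are already connected to each other.
left, right = socket.socketpair()

try:
    left.sendall("GET / HTTP/1.1\r\n\r\n")  # a str: rejected in Python 3
    str_rejected = False
except TypeError:
    str_rejected = True

# Encoding the text first produces the bytes object that sockets expect.
left.sendall("GET / HTTP/1.1\r\n\r\n".encode("ascii"))
received = right.recv(4096)

left.close()
right.close()

print(str_rejected)
print(received)
```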

Here is the same client socket in python 2.x:

import socket   
import sys

try:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error:
    print 'Failed to create socket'
    sys.exit()

print 'Socket Created'

#To allow you to immediately reuse the same port after 
#killing your server:
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

host = 'localhost'
port = 8000

s.connect((host , port))

#With the python 2 print statement, parentheses around two
#values would print a tuple, so they are left out here:
print 'Socket Connected to ' + host + ' on port', port

#Send some data to server
message = "GET / HTTP/1.1\r\n\r\n"

try :
    #Send the whole string(handles the looping for you)
    s.sendall(message)
except socket.error:
    print 'Send failed'
    sys.exit()

print 'Message sent successfully'

#Now receive data
data = [] 

while True:
    chunk = s.recv(4096)  #blocks while waiting for data
    if chunk: data.append(chunk)
    #If recv() returns a blank string, then the other side
    #closed the socket, and no more data will be sent:
    else: break  

print("".join(data))

--output:--
Message sent successfully
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.3
Date: Sat, 08 Jun 2013 10:06:04 GMT
Content-type: text/html
Content-Length: 23
Last-Modified: Sat, 08 Jun 2013 08:29:01 GMT

<div>hello world</div>

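Note that the request line of the GET request tells the server that HTTP 1.1 will be the protocol, i.e. the rules governing the conversation. As the RFC for HTTP 1.1 describes, the request has to end with two consecutive '\r\n' sequences: one that ends the request line (or the last header line) and one more for the empty line that marks the end of the headers. So the server is looking for that second '\r\n' sequence. If you delete one of the '\r\n' sequences from the request, the client will hang on the recv(), because the server has not read that second '\r\n' sequence and is still waiting for more data. The handshake in the question stops right after 'Sec-WebSocket-Version: 13', with no empty line after it, which is why its recv() hangs.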
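
To see the same rule from the receiving side, here is a rough sketch of how a reader can keep calling recv() until the empty line arrives, no matter how the request was split into chunks. recv_until and HEADER_END are hypothetical names, and socket.socketpair() stands in for a real client/server connection.

```python
import socket

HEADER_END = b"\r\n\r\n"  # CRLF ending the last line + CRLF for the empty line


def recv_until(sock, delimiter=HEADER_END, bufsize=4096):
    """Call recv() repeatedly until `delimiter` shows up or the peer closes."""
    buffer = b""
    while delimiter not in buffer:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty bytes: the other side closed the connection
            break
        buffer += chunk
    return buffer


# Demonstration: the "client" sends one request in three separate pieces.
client, server = socket.socketpair()
for piece in (b"GET / HT", b"TP/1.1\r\nHost: localhost\r\n", b"\r\n"):
    client.sendall(piece)

request = recv_until(server)
client.close()
server.close()
print(request)
```

If the final b"\r\n" piece were never sent, recv_until would block in recv(), which is the hang described above. Note that this simple version may also return bytes that arrived after the delimiter.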

Also note that you will be sending the data as bytes (in python 3), so no automatic '\n' conversions will take place, and the server will be expecting the sequence '\r\n'.
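
Applied to the handshake from the question, one way to get the line endings right is to join the header lines with '\r\n' explicitly instead of relying on a triple-quoted string. This is a sketch, not a complete client: build_handshake is a hypothetical helper, the path '/ws' is taken from the route in the question's server, and the key is the sample value the question already uses.

```python
def build_handshake(host, port, path, key):
    """Return the opening handshake as bytes with explicit CRLF line endings."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}:{}".format(host, port),
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Key: {}".format(key),
        "Sec-WebSocket-Version: 13",
    ]
    # Join with CRLF, then add the CRLF that ends the last header line and
    # the CRLF for the empty line that terminates the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# A triple-quoted string, by contrast, contains bare "\n" line breaks
# and no terminating empty line.
triple_quoted = '''GET /ws HTTP/1.1
Host: localhost:8888'''.encode("ascii")

request = build_handshake("localhost", 8888, "/ws", "dGhlIHNhbXBsZSBub25jZQ==")
print(request.endswith(b"\r\n\r\n"))
print(b"\r\n" in triple_quoted)
```

The resulting bytes can be passed to sendall(), and the reply can then be read until the server's own empty line arrives.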

Searchlight answered 8/6, 2013 at 6:38 Comment(8)
Is there a control byte that you can send to tell the server that it has received the entire header? Maybe like a 0x00 or something? Just saw your edit, thanks for the link, I will read it. - Esse
Okay, now you are on to something. Is the server expecting a 0x00 byte to signal the end of the message? Or is the server expecting to see "END OF MESSAGE"? Or a newline? Each of those regimes is known as a protocol, and you have to agree on the protocol ahead of time so that a socket and a server know how to talk to each other. Read the link I posted. - Searchlight
Yeah, I had posted that before your link. But what you said makes sense. What I need to do is let the server know that my message is over so it can process/respond. I'm in the middle of reading the HOWTO now, so hopefully that gets me on the right track. - Esse
Next read: 1) The definition of an HTTP request: w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5. Note that the definition of the Request-Line includes a CRLF (which is '\r\n' on every OS) at the end of the line. In addition, the general Request definition requires another CRLF, which comes after the headers, if any are present. 2) An example of a python client socket connecting to an http server: binarytides.com/python-socket-programming-tutorial - Searchlight
I am going to read each of those resources. They are definitely what I was looking for and hoping to learn more about (the networks class I took was completely worthless and didn't talk about any of this at all). The http part will help in seeing how a client works and expanding upon it. - Esse
@Bear, I posted an example for you. See 4) above. - Searchlight
@Bear, I had to make some adjustments for python3, so now there is a python3 and a python2 example for you to examine. - Searchlight
That is really useful. I will look at that tonight. Sorry about the delay in response, I was working all day. This all gets me on the right track, which is going to make a huge difference. - Esse
