Here's a quick summary of the situation for the weary traveller stumbling upon this in 2024, and probably for a very long time after that.
Nothing has changed in the 12(!) years since this question was opened. The JavaScript WebSocket API is abandoned by all browser vendors (although the implementations do occasionally get updates), and the new specs (WebSocketStream and WebTransport) are nowhere close to materialization. What this all means is that WebSockets are still widely used, no replacement for the broken API exists despite it being called "legacy" 7 years ago, and the problems outlined in the question are as annoying as ever, if not more so.
The options for dealing with the situation (spoiler, #5 or #6 is what you want):
1. Implement authentication externally
Described in https://devcenter.heroku.com/articles/websocket-security. The client makes an authenticated request to a dedicated endpoint, which generates and persists a short-lived token and returns it to the client. The client then passes this token as a URL param when opening a WebSocket. The server validates it and accepts or rejects the protocol upgrade. This requires the server to implement a completely custom and stateful authentication mechanism specifically for WebSockets, which is a bridge too far in many scenarios.
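To make the "generate and persist a short-lived token" part concrete, here is a minimal sketch of the server-side bookkeeping. Everything in it is an assumption for illustration: the name createTicketStore, the 30 second lifetime, the in-memory Map (a multi-instance deployment would need shared storage), and the default ID generator, which assumes a runtime that exposes Web Crypto globally.

```javascript
// Hypothetical in-memory store for single-use WebSocket tickets.
function createTicketStore({
  ttlMs = 30000,                                 // assumed lifetime of a ticket
  now = () => Date.now(),                        // injectable clock
  newId = () => globalThis.crypto.randomUUID(),  // assumes global Web Crypto
} = {}) {
  const tickets = new Map(); // ticket -> { userId, expiresAt }

  return {
    // Called by the authenticated HTTP endpoint that hands out tickets.
    issue(userId) {
      const ticket = newId();
      tickets.set(ticket, { userId, expiresAt: now() + ttlMs });
      return ticket;
    },
    // Called while handling the upgrade request; returns the user id or null.
    redeem(ticket) {
      const entry = tickets.get(ticket);
      if (!entry) return null;
      tickets.delete(ticket); // single use, even when it turns out to be expired
      return now() <= entry.expiresAt ? entry.userId : null;
    },
  };
}
```

On the client this would pair with something like fetching the ticket from the dedicated endpoint and then opening the socket with it as a URL param, e.g. "wss://example.com/path?ticket=" + encodeURIComponent(ticket); both the endpoint and the parameter name are placeholders.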
2. Send auth information over WebSocket
You open a WebSocket without authenticating, then you send your auth information over WebSocket prior to doing anything else. This in theory sounds logical (and is advised by the browser vendors), but falls apart given just a cursory thought. The server is made to implement an awkward, highly stateful and entirely custom authentication mechanism that doesn't play well with anything else, on top of either having to maintain a persistent connection with a client who refuses to authenticate, leaving a door wide open for denial of service attacks, or getting into a whole new rabbit hole of enforcing rigorous timeouts to prevent malicious behavior.
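For a sense of how much state this drags in, here is a rough sketch of the per-connection gate the server ends up owning. The names createAuthGate, verify, close and onMessage are hypothetical callbacks supplied by the caller, not part of any library; the JSON message shape and the 5 second timeout are assumptions too. Close code 4401 is an arbitrary pick from the 4000-4999 range that RFC 6455 leaves to applications.

```javascript
// Hypothetical gate for one connection: first message must authenticate,
// and clients that stay silent are dropped after a timeout.
function createAuthGate({ verify, close, onMessage, timeoutMs = 5000,
                          setTimer = setTimeout, clearTimer = clearTimeout }) {
  let state = "pending"; // "pending" -> "open" | "closed"
  let user = null;
  let timer = null;

  const reject = (reason) => {
    state = "closed";
    clearTimer(timer);
    close(4401, reason); // 4401 is an application-defined code (assumption)
  };

  // Guard against clients that connect and never authenticate.
  timer = setTimer(() => {
    if (state === "pending") reject("auth timeout");
  }, timeoutMs);

  // Feed every incoming text message through this function.
  return function handle(raw) {
    if (state === "closed") return;
    if (state === "open") return onMessage(user, raw);

    let msg = null;
    try { msg = JSON.parse(raw); } catch { /* not JSON: treated as unauthenticated */ }
    user = msg && msg.type === "auth" ? verify(msg.token) : null;
    if (!user) return reject("unauthorized");

    clearTimer(timer);
    state = "open";
  };
}
```

Note that the timer, the state flag and the custom message format all exist only because the handshake itself could not carry credentials.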
3. Send auth info (e.g. an access token) via a URL param
Not as terrible as it sounds, as long as SSL is enforced (wss:// not ws://) because WebSocket URLs are special and don't get saved in browser history or similar. On top of that, access tokens are normally short lived, so that also mitigates the danger. But. The server will very likely log the URL anyway at some point. Even if your server application doesn't, the framework or the (cloud) host probably will. Additionally, if you have to pass ID tokens around (like Firebase is wont to do), you might trip up on various URL length limitations as ID tokens get huge.
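A small sketch of both halves, under assumptions: the parameter name access_token and the helper names are made up for illustration. The redaction helper only protects logs your own application writes; anything a proxy, framework or host logs before the request reaches your code is out of its reach, which is exactly the caveat above.

```javascript
// Client side: attach the token as a query parameter, refusing plain ws://.
function buildSocketUrl(base, token) {
  const url = new URL(base);
  if (url.protocol !== "wss:") {
    throw new Error("refusing to send a token over " + url.protocol);
  }
  url.searchParams.set("access_token", token); // hypothetical parameter name
  return url.toString();
}

// Server side: mask the token before the request URL reaches application logs.
function redactForLog(requestUrl) {
  // The base is only there so that relative request paths can be parsed.
  const url = new URL(requestUrl, "wss://localhost");
  if (url.searchParams.has("access_token")) {
    url.searchParams.set("access_token", "REDACTED");
  }
  return url.pathname + url.search;
}
```

Usage on the client would then be new WebSocket(buildSocketUrl("wss://example.com/path", token)).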
4. Auth via a good old cookie
Don't. WebSockets are not subject to same-origin policy (because apparently every little thing about WebSockets has to be awful) and allowing cookies would leave you wide open to CSRF attacks. Fixing this using CSRF tokens is described e.g. here but it is more difficult than taking any other approach from this list, so it is simply not even worth considering.
5. Smuggle access tokens inside Sec-WebSocket-Protocol
Since the only header a browser will let you control is Sec-WebSocket-Protocol, you can abuse it to emulate any other header. Interestingly (or rather comically), this is what Kubernetes is doing. In short, you append whatever you need for authentication as an extra supported subprotocol inside Sec-WebSocket-Protocol:
const ws = new WebSocket("wss://example.com/path", ["realProtocol", "yourAccessTokenOrSimilar"]);
Then, on the server, you add some sort of middleware that transforms the request back to its saner form before passing it further into the system. Terrible, yes, but so far the best solution. No tokens in the URL, no custom authentication save for the little middleware, no extra state on the server needed. Do not forget to include the real subprotocol, as various tools will reject a connection without one.
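The "little middleware" could look roughly like the sketch below. It is an illustration under assumptions, not what Kubernetes or any particular framework does: the "bearer." prefix is a made-up marker that client and server would agree on, and the function expects lower-cased header names, the way Node's http module exposes them. Keep in mind that subprotocol values are restricted to HTTP token characters, so a token containing characters such as "/", "=" or spaces would need an encoding like unpadded base64url first.

```javascript
const TOKEN_PREFIX = "bearer."; // hypothetical marker, agreed between client and server

// Client side, for reference:
//   new WebSocket(url, ["realProtocol", TOKEN_PREFIX + accessToken]);

// Server side: pull the token out of Sec-WebSocket-Protocol and rewrite the
// headers into the shape the rest of the stack already understands.
function unwrapProtocolHeader(headers) {
  const offered = (headers["sec-websocket-protocol"] || "")
    .split(",")
    .map((p) => p.trim())
    .filter(Boolean);

  const carrier = offered.find((p) => p.startsWith(TOKEN_PREFIX));
  const protocols = offered.filter((p) => p !== carrier);

  const rewritten = { ...headers };
  if (protocols.length > 0) {
    rewritten["sec-websocket-protocol"] = protocols.join(", ");
  } else {
    delete rewritten["sec-websocket-protocol"];
  }
  if (carrier) {
    rewritten["authorization"] = "Bearer " + carrier.slice(TOKEN_PREFIX.length);
  }
  return rewritten;
}
```

After this step the ordinary Authorization handling can run unchanged, and the server picks its response subprotocol from the real ones only, never echoing the token entry back.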
6. Switch to SSE (or RSocket?), if applicable
For a good number of cases, SSE might be a decent replacement. The browser EventSource API is as horribly broken as WebSocket (it can't send anything but GET requests, can't send headers either despite being regular HTTP), but! it can be easily replaced by fetch which is, for a change, a saner API. This approach works well as an alternative to WebSocket in e.g. GraphQL subscriptions, or really anywhere where full duplex isn't mandatory. And that likely covers most scenarios. RSocket could theoretically also be an option, but seeing how it's implemented via WebSockets in the browser, I don't think it actually resolves anything, though I haven't looked into it deeply enough to say with absolute certainty.
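Replacing EventSource with fetch mostly means parsing the text/event-stream format yourself. Below is a simplified sketch: createSseParser and subscribe are hypothetical names, the parser handles only the event and data fields with \n or \r\n line endings (no id, retry or bare \r support), and unlike EventSource nothing here reconnects automatically. It assumes a runtime with global fetch and TextDecoderStream.

```javascript
// Incremental parser: feed it decoded text chunks, get back completed events.
function createSseParser() {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    const blocks = buffer.split(/\r?\n\r?\n/); // a blank line ends an event
    buffer = blocks.pop();                     // last piece may be incomplete
    const events = [];
    for (const block of blocks) {
      let type = "message";
      const data = [];
      for (const line of block.split(/\r?\n/)) {
        if (line === "" || line.startsWith(":")) continue; // comment / keep-alive
        const idx = line.indexOf(":");
        const field = idx === -1 ? line : line.slice(0, idx);
        let value = idx === -1 ? "" : line.slice(idx + 1);
        if (value.startsWith(" ")) value = value.slice(1); // one optional space
        if (field === "event") type = value;
        else if (field === "data") data.push(value);
      }
      if (data.length > 0) events.push({ type, data: data.join("\n") });
    }
    return events;
  };
}

// The point of the exercise: fetch lets you send an Authorization header.
async function subscribe(url, accessToken, onEvent) {
  const response = await fetch(url, {
    headers: { Accept: "text/event-stream", Authorization: "Bearer " + accessToken },
  });
  if (!response.ok || !response.body) {
    throw new Error("stream failed with status " + response.status);
  }
  const feed = createSseParser();
  const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    feed(value).forEach(onEvent);
  }
}
```

Since fetch is not limited to GET, the same approach also works with a POST body, which is handy for things like GraphQL subscriptions.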
connect request. I'm using Django channels on the back end and I've designed it to accept the connection on the connect event. It then sets an "is_auth" flag in the receive event (if it sees a valid auth message). If the is_auth flag isn't set and it's not an auth message then it closes the connection. – Juieta