I'm working on replacing a REST-based data pipe with a WebSocket-based one, and I'm having trouble finding all the places things can go wrong. The system is in production, so very bad things happen if it fails and doesn't recover. Here's what I've got so far:
Client-side
- `let server = new Websocket(path, opts)`
  - Wrapping this in `try-catch` will find programmer errors, like incorrect URLs, but operational errors like the server not responding correctly don't seem catchable as they're asynchronous and there's no callback
- `server.send(data, cb)`
  - Wrapping this in `try-catch` will catch type errors, typically programmer errors
  - Adding a callback here (`function (err) { handleErr(err); }`) is a great catch-all on operational errors, as the callback will have a non-null `err` if the send fails for any reason, so that's handled
- `server.on('error', cb)`
  - Adding a callback here seems to be a good idea, as the `error` event is part of the `EventEmitter` spec, but I haven't actually caught anything with it yet
- Heartbeat Checks (verbose, but described here)
  - This is recommended by the `ws` readme as a way of catching silent connection failures
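The client-side checks listed above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a vetted implementation: `openClient`, `safeSend`, and `watchHeartbeat` are hypothetical helper names, `WebSocketCtor` is assumed to be the class exported by `ws`, and `handleErr` is the same handler used in the snippets above. The heartbeat part assumes the server pings on a fixed interval, in the style of the `ws` readme example.

```javascript
// Sketch only. `WebSocketCtor` is assumed to be the class exported by ws,
// `handleErr` is the question's own error handler, and the helper names
// below are hypothetical.

// Open a client so that both synchronous and asynchronous failures reach handleErr.
function openClient(WebSocketCtor, url, opts, handleErr) {
  let socket;
  try {
    // Synchronous throws here are programmer errors such as a malformed URL.
    socket = new WebSocketCtor(url, opts);
  } catch (err) {
    handleErr(err);
    return null;
  }
  // Asynchronous failures surface as 'error' events. As far as I can tell,
  // this includes failed connection attempts, but treat that as an assumption.
  socket.on('error', handleErr);
  return socket;
}

// Send with a try-catch (synchronous throws) and a callback (asynchronous failures).
function safeSend(socket, data, handleErr) {
  try {
    socket.send(data, function (err) {
      if (err) handleErr(err);
    });
  } catch (err) {
    handleErr(err);
  }
}

// Client-side heartbeat: if no ping arrives from the server within timeoutMs,
// assume the connection died silently and tear it down.
function watchHeartbeat(socket, timeoutMs) {
  let timer = null;
  function reset() {
    clearTimeout(timer);
    timer = setTimeout(function () {
      socket.terminate();
    }, timeoutMs);
  }
  socket.on('open', reset);
  socket.on('ping', reset);
  socket.on('close', function () {
    clearTimeout(timer);
  });
}
```

With `ws` installed, usage would look something like `const server = openClient(require('ws'), path, opts, handleErr)` followed by `watchHeartbeat(server, 31000)`. The 31 second timeout is an assumed value (a 30 second server ping interval plus some slack) and would need to match whatever the server actually does.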
Server-side
- `server.on('connection', function(connection) {...})`
  - Trying `connection.send('test', function(err) { handleErr(err); });` is a nice way of making sure the connection didn't fail somehow getting set up, before trying to use it, but it may not be necessary. Also, that should be wrapped in a `try-catch` for the reasons above
- `server.on('error', cb)`
  - Seems like a good idea for the same reasons I do it on the client side above
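Put together, the server-side checks might look something like the sketch below. It is only an illustration under assumptions: `hardenServer` and `sweep` are hypothetical names, `wss` is assumed to be a `ws` server instance exposing a `clients` set, `handleErr` is the handler from the snippets above, and the ping/pong sweep follows the general shape of the heartbeat example in the `ws` readme rather than anything verified in production.

```javascript
// Sketch only. `wss` is assumed to be a ws server instance, `handleErr` is
// the question's own error handler, and the helper names are hypothetical.

// One heartbeat pass: terminate connections that never answered the previous
// ping, then ping the rest and wait for their pong.
function sweep(wss) {
  wss.clients.forEach(function (connection) {
    if (connection.isAlive === false) {
      connection.terminate();
      return;
    }
    connection.isAlive = false;
    connection.ping();
  });
}

function hardenServer(wss, handleErr, intervalMs) {
  // Server-level errors.
  wss.on('error', handleErr);

  wss.on('connection', function (connection) {
    // Per-connection errors.
    connection.on('error', handleErr);

    // A pong proves the peer is still there.
    connection.isAlive = true;
    connection.on('pong', function () {
      connection.isAlive = true;
    });

    // Optional probe send, wrapped in try-catch for synchronous throws and
    // given a callback for asynchronous failures.
    try {
      connection.send('test', function (err) {
        if (err) handleErr(err);
      });
    } catch (err) {
      handleErr(err);
    }
  });

  // Periodically look for connections that failed silently.
  const timer = setInterval(function () {
    sweep(wss);
  }, intervalMs);

  // Stop sweeping once the server itself closes.
  wss.on('close', function () {
    clearInterval(timer);
  });

  return timer;
}
```

Keeping `sweep` separate from the timer makes it possible to exercise the dead-connection logic directly, without waiting for an interval to fire.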
It just seems like building on top of `ws` with production in mind is difficult because the different things that can go wrong aren't documented anywhere, and going with a more user-friendly library like Socket.io would remove many of the performance advantages sought by using `ws`. Is there documentation anywhere on all the different things that can go wrong when using `ws`, or at least a guide to battle-hardening it? I feel like deploying this thing is just a gamble where any second I could get angrily called into an office to fix things.