Socket.io disconnected unexpectedly
I have a Node.js service and an Angular client that use socket.io to transport messages during a long-running HTTP request.

Service:

export const socketArray: SocketIO.Socket[] = [];
export let socketMapping: {[socketId: string]: number} = {};

const socketRegister: hapi.Plugin<any> = {
    register: (server) => {
        const io: SocketIO.Server = socket(server.listener);

        // Whenever a session connected to socket, create a socket object and add it to socket array
        io.on("connection", (socket) => {
            console.log(`socket ${socket.id} connected`);
            logger.info(`socket ${socket.id} connected`);

            // Only put socket object into array if init message received
            socket.on("init", msg => {
                logger.info(`socket ${socket.id} initialized`);
                socketArray.push(socket);
                socketMapping[socket.id] = msg;
            });

            // Remove socket object from socket array when disconnected
            socket.on("disconnect", (reason) => {
                console.log(`socket ${socket.id} disconnected because: ${reason}`);
                logger.info(`socket ${socket.id} disconnected because: ${reason}`);
                for (let i = 0; i < socketArray.length; i++) {
                    if (socketArray[i] === socket) {
                        socketArray.splice(i, 1);
                        return;
                    }
                }
            });
        });
    },
    name: "socketRegister",
    version: "1.0"
}

export const socketSender = async (socketId: string, channel: string, content: SocketMessage) => {
    try {
        // Add message to db here
        // await storeMessage(socketMapping[socketId], content);
        // Find corresponding socket and send message
        logger.info(`trying sending message to ${socketId}`);
        for (let i = 0; i < socketArray.length; i++) {
            if (socketArray[i].id === socketId) {
                socketArray[i].emit(channel, JSON.stringify(content));
                logger.info(`socket ${socketId} send message to ${channel}`);
                if (content.isFinal === true) {
                    // TODO: delete all messages of the process if isFinal is true
                    await deleteProcess(content.processId);
                }
                return;
            }
        }
    } catch (err) {
        logger.error("Socket sender error: ", err.message);
    }

};

Client:

connectSocket() {
   if (!this.socket) {
       try {
           this.socket = io(socketUrl);
           this.socket.emit('init', 'some-data');
       } catch (err) {
           console.log(err);
       }
   } else if (this.socket.disconnected) {
       this.socket.connect();
       this.socket.emit('init', 'some-data');
   }
   this.socket.on('some-channel', (data) => {
       // Do something
   });
   this.socket.on('disconnect', (data) => {
       console.log(data);
   });

}

They usually work fine but randomly produce disconnection errors. From my log file, we can see this:

2018-07-21T00:20:28.209Z[x]INFO: socket 8jBh7YC4A1btDTo_AAAN connected

2018-07-21T00:20:28.324Z[x]INFO: socket 8jBh7YC4A1btDTo_AAAN initialized

2018-07-21T00:21:48.314Z[x]INFO: socket 8jBh7YC4A1btDTo_AAAN disconnected because: ping timeout

2018-07-21T00:21:50.849Z[x]INFO: socket C6O7Vq38ygNiwGHcAAAO connected

2018-07-21T00:23:09.345Z[x]INFO: trying sending message to C6O7Vq38ygNiwGHcAAAO

At the same time as the disconnect on the server, the front end also received a disconnect event whose reason was transport close.

From the log, we can deduce the following workflow:

  1. The front end started a socket connection and sent an init message to the back end. It also saved the socket.
  2. The back end detected the connection and received the init message.
  3. The back end put the socket into the array so that it could be used anytime, anywhere.
  4. The first socket was disconnected unexpectedly, and another connection was established without the front end's awareness, so the front end never sent an init message for it.
  5. Since the front end's saved socket was unchanged, it used the old socket id when making the HTTP request. As a result, the back end tried to send the message through the old socket, which had already been removed from the socket array.

The situation doesn't happen frequently. Does anyone know what could cause the unexpected disconnect and the unnoticed new connection?

Josephus answered 21/7, 2018 at 0:57 Comment(0)
It really depends what "long time http request" is doing. node.js runs your Javascript as a single thread. That means it can literally only do one thing at a time. But, since many things that servers do are I/O related (read from a database, get data from a file, get data from another server, etc...) and node.js uses event-driven asynchronous I/O, it can often have many balls in the air at the same time so it appears to be working on lots of requests at once.

But, if your complex http request is CPU-intensive, using lots of CPU, then it's hogging the single Javascript thread and nothing else can get done while it is hogging the CPU. That means that all incoming HTTP or socket.io requests have to wait in a queue until the one node.js Javascript thread is free so it can grab the next event from the event queue and start to process that incoming request.
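To make the single-thread point concrete, here is a minimal, self-contained sketch (not taken from the question's code) showing that a CPU-bound loop prevents even a 10 ms timer from firing on time:

```typescript
// A 10 ms timer is scheduled, then the thread is kept busy for
// ~200 ms. The timer callback cannot run until the busy loop ends,
// because Node runs all JavaScript on a single thread.
const start = Date.now();
let timerDelay = -1;

setTimeout(() => {
  timerDelay = Date.now() - start; // will be >= 200, not ~10
}, 10);

while (Date.now() - start < 200) {
  // busy-wait: stands in for a CPU-intensive request handler
}
```

The same starvation is what can make socket.io miss its ping/pong exchange and drop the connection with ping timeout.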

We could only really help you more specifically if we could see the code for this "very complex http request".

The usual way around CPU-hogging things in node.js is to offload CPU-intensive stuff to other processes. If it's mostly just this one piece of code that causes the problem, you can spin up several child processes (perhaps as many as the number of CPUs you have in your server) and then feed them the CPU-intensive work and leave your main node.js process free to handle incoming (non-CPU-intensive) requests with very low latency.
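As a sketch of that idea (the names and the trivial "work" are illustrative, not from the question), the heavy computation can be forked into a child process and awaited from the main process, leaving the main event loop free:

```typescript
import { fork } from "node:child_process";
import { writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical worker: does the CPU-heavy part and reports back.
// Written to a temp file here only so this sketch is self-contained;
// in a real project it would be its own module.
const workerPath = join(tmpdir(), "cpu-worker.js");
writeFileSync(workerPath, `
  process.on("message", (n) => {
    let sum = 0;
    for (let i = 0; i < n; i++) sum += i; // stand-in for real work
    process.send(sum);
    process.exit(0);
  });
`);

function runHeavyTask(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const child = fork(workerPath);
    child.once("message", (result) => resolve(result as number));
    child.once("error", reject);
    child.send(n); // Node queues this until the child's IPC channel is ready
  });
}
```

While the child crunches numbers, the parent process stays responsive to incoming HTTP and socket.io traffic.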

If you have multiple operations that might hog the CPU, then you either have to farm them all out to child processes (probably via some sort of work queue) or you can deploy clustering. The challenge with clustering is that a given socket.io connection will be to one particular server in your cluster and if it's that process that just happens to be executing a CPU-hogging operation, then all the socket.io connections assigned to that server would have bad latency. So, regular clustering is probably not so good for this type of issue. The work-queue and multiple specialized child processes to handle CPU-intensive work are probably better because those processes won't have any outside socket.io connections that they are responsible for.
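A minimal in-memory version of such a work queue (a sketch; a real deployment would more likely use a library or an external queue) caps how many heavy jobs run at once and makes the rest wait their turn:

```typescript
// Illustrative work queue: at most `limit` jobs run concurrently.
type Job<T> = () => Promise<T>;

class WorkQueue {
  private running = 0;
  private waiting: Array<() => void> = [];

  constructor(private limit: number) {}

  async run<T>(job: Job<T>): Promise<T> {
    if (this.running >= this.limit) {
      // Park this caller until a running job finishes.
      await new Promise<void>((release) => this.waiting.push(release));
    }
    this.running++;
    try {
      return await job();
    } finally {
      this.running--;
      this.waiting.shift()?.(); // wake the next waiter, if any
    }
  }
}
```

Each job slot would typically hand its work to one of the specialized child processes described above, so the CPU-heavy code never runs on the main thread at all.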


Also, you should know that if you're using synchronous file I/O, it blocks the entire node.js Javascript thread. node.js cannot run any other Javascript during a synchronous file I/O operation. node.js gets its scalability and its ability to have many operations in flight at the same time from its asynchronous I/O model. If you use synchronous I/O, you completely break that and ruin scalability and responsiveness.

Synchronous file I/O belongs only in server startup code or in a single purpose script (not a server). It should never be used while processing a request in a server.

Two ways to make asynchronous file I/O a little more tolerable are by using streams or by using async/await with promisified fs methods.
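For example, the promise-based fs API keeps file reads off the JavaScript thread; this sketch assumes nothing from the question's code beyond Node's standard library:

```typescript
import { promises as fs } from "node:fs";

// Hypothetical request handler helper: reads a file without blocking
// the event loop. While the OS performs the read, Node can serve
// other requests (and keep socket.io ping/pong exchanges alive).
async function readConfig(path: string): Promise<string> {
  return fs.readFile(path, "utf8");
}
```

Contrast this with `fs.readFileSync(path)`, which would stall every other connection on the server for the duration of the read.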

Verleneverlie answered 25/7, 2018 at 17:40 Comment(1)
@Josephus - I added some more info about synchronous file I/O. – Verleneverlie
