libuv worker threads or work queue health check?

Asked 12/12, 2013 at 23:25 Answered 30/11, 2014 at 4:31

In libuv, you can end up tying up the worker threads with too much work or buggy code. Is there a simple function that can check the health of the worker threads or the work queue? It doesn't have to be 100% deterministic; after all, it would be impossible to determine whether a worker thread is hanging on slow code or stuck in an infinite loop.

So any of the following heuristics would be good:

Number of queued items not yet worked on. If this is too large, it could mean the worker threads are busy or hung.
Does libuv have any thread killing mechanism where if the worker thread doesn't check back in n seconds, it gets terminated?

Tripoli answered 12/12, 2013 at 23:25 Comment(1)

are you using libuv as part of a node.js app or standalone? – Faeroese 10/4, 2014 at 19:17

That function does not exist in libuv itself, and I am not aware of any OSS that provides something like that.
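Since the queue depth is not exposed, one workaround is to count the work items your own program submits. The following is only a rough sketch for the Node.js case, not a libuv feature: `createPendingCounter` and `track` are hypothetical helpers made up for this example, and the count covers items that are queued or currently executing, so it is an upper bound on the real queue length.

```javascript
// Sketch only: count work that this program has handed to the thread pool.
// createPendingCounter and track are hypothetical helpers, not libuv or Node APIs.
const crypto = require('crypto');

function createPendingCounter() {
    let pending = 0;   // submitted but not yet completed
    return {
        // Wrap a Node-style async function (callback as the last argument).
        track(fn) {
            return function (...args) {
                const callback = args.pop();
                pending++;
                return fn(...args, function (...results) {
                    pending--;
                    callback(...results);
                });
            };
        },
        pending() { return pending; }
    };
}

// Usage: crypto.pbkdf2 runs on the libuv thread pool (4 threads by default).
const counter = createPendingCounter();
const pbkdf2 = counter.track(crypto.pbkdf2);
for (let i = 0; i < 8; i++) {
    pbkdf2('secret', 'salt', 10000, 32, 'sha256', function (err) {
        if (err) throw err;
    });
}
console.log('in flight right after submitting:', counter.pending());
```

A count that stays well above the pool size for a long time would suggest the workers are saturated or stuck, but it cannot tell those two cases apart.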

In terms of a killing mechanism, there is none baked into libuv, but http://nikhilm.github.io/uvbook/threads.html#core-thread-operations suggests:

A well designed program would have a way to terminate long running workers that have already started executing. Such a worker could periodically check for a variable that only the main process sets to signal termination.
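To illustrate the quoted advice, here is a minimal sketch using Node's `worker_threads` rather than libuv's C API. The names `busyWork` and `startCancellableWorker` and the 50 ms timeout are assumptions made up for this example; the shared flag plays the role of the variable that only the main thread sets.

```javascript
// Sketch only: cooperative cancellation with a flag in shared memory.
// busyWork and startCancellableWorker are hypothetical names for this example.
const { Worker } = require('worker_threads');

// Long-running job that polls the stop flag between units of work.
// Returns how many iterations it completed before stopping.
function busyWork(stopFlag, maxIterations) {
    let done = 0;
    while (done < maxIterations && Atomics.load(stopFlag, 0) === 0) {
        done++;   // stand-in for one unit of real work
    }
    return done;
}

function startCancellableWorker() {
    const shared = new SharedArrayBuffer(4);
    const stopFlag = new Int32Array(shared);
    // The worker runs the same busyWork function against the shared flag.
    const source = `
        const { workerData, parentPort } = require('worker_threads');
        ${busyWork.toString()}
        parentPort.postMessage(busyWork(new Int32Array(workerData), Infinity));
    `;
    const worker = new Worker(source, { eval: true, workerData: shared });
    return {
        worker: worker,
        // Only the main thread writes the flag; the worker only reads it.
        cancel: function () { Atomics.store(stopFlag, 0, 1); }
    };
}

// Usage: let the worker spin for about 50 ms, then ask it to stop.
const job = startCancellableWorker();
job.worker.on('message', function (n) {
    console.log('worker stopped after', n, 'iterations');
});
setTimeout(job.cancel, 50);
```

This only helps when the worker actually reaches the check; code stuck inside a single blocking call never sees the flag.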

Autotomize answered 18/4, 2014 at 0:45 Comment(0)

-1

If this is for nodejs, would a simple monitor task do? I don't know of a way to get information about the event queue internals, but you can inject a tracer into the event queue to check that queued callbacks are being run in a timely manner. (This measures load not by the number of callbacks not yet run, but by whether they are getting run on time. Same thing, kind of.)

A monitor task could re-queue itself and check that it gets called at least every 10 milliseconds (or whatever max cumulative blocking ms is allowed). Since nodejs runs queued callbacks one after another on a single event loop, a monitor that ran on time tells us that nothing else blocked the loop for longer than that 10 ms window. Note that this watches the event loop, not the libuv worker threads directly. Something like (in node):

// like Date.now(), but with higher precision
// the extra precision is needed to be able to track small delays
function dateNow() {
    var t = process.hrtime();
    return (t[0] + t[1] * 1e-9) * 1000;
}

var _lastTimestamp = dateNow();   // when healthMonitor ran last, in ms
var _maxAllowedDelay = 10.0;      // max ms delay we allow for our task to run
function healthMonitor() {
    var now = dateNow();
    var delay = now - _lastTimestamp;
    if (delay > _maxAllowedDelay) {
        console.log("healthMonitor was late:", delay, " > ", _maxAllowedDelay);
    }
    _lastTimestamp = now;
    setTimeout(healthMonitor, 1);
}

// launch the health monitor and run it forever
// note: the node process will never exit, it will have to be killed
healthMonitor();

Throttling the alert messages and supporting a clean shutdown is an exercise left to the reader.
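For what it's worth, here is one way that exercise might look. This is only a sketch under assumptions: `createLatenessChecker`, `startHealthMonitor` and the `minAlertGap` parameter are names invented for this example, and the one-alert-per-second setting is arbitrary.

```javascript
// Sketch only: createLatenessChecker, startHealthMonitor and minAlertGap
// are hypothetical names introduced for this example.

// Millisecond clock with sub-millisecond precision.
function dateNow() {
    const t = process.hrtime();
    return (t[0] + t[1] * 1e-9) * 1000;
}

// Pure bookkeeping: decides whether a tick was late and whether to alert.
function createLatenessChecker(maxAllowedDelay, minAlertGap, startTime) {
    let last = startTime;
    let lastAlert = -Infinity;
    return {
        // Returns true when an alert should be emitted for this tick.
        check: function (now) {
            const delay = now - last;
            last = now;
            if (delay > maxAllowedDelay && now - lastAlert >= minAlertGap) {
                lastAlert = now;
                return true;
            }
            return false;
        }
    };
}

function startHealthMonitor(maxAllowedDelay, minAlertGap) {
    const checker = createLatenessChecker(maxAllowedDelay, minAlertGap, dateNow());
    let timer = null;
    let stopped = false;
    function tick() {
        if (stopped) return;
        if (checker.check(dateNow())) {
            console.log('healthMonitor was late by more than', maxAllowedDelay, 'ms');
        }
        timer = setTimeout(tick, 1);
        timer.unref();   // the monitor alone does not keep the process alive
    }
    tick();
    return {
        stop: function () { stopped = true; clearTimeout(timer); }
    };
}

// Usage: at most one alert per second; stop the monitor during shutdown.
const monitor = startHealthMonitor(10, 1000);
setTimeout(function () { monitor.stop(); }, 100);
```

Calling `unref()` on the timer means the process can exit once the rest of the program has finished, and `stop()` gives shutdown code an explicit way to end the monitor.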

Skipp answered 30/11, 2014 at 4:31 Comment(0)
