How to communicate between Web and Worker dynos with Node.js on Heroku?
Web dynos can handle HTTP requests, and while they do, worker dynos can process background jobs for them. But I don't know how to make web dynos and worker dynos communicate with each other.

For example, I want to receive an HTTP request on a web dyno, send it to a worker dyno, process the job there, send the result back to the web dyno, and show the result on the web page.

Is this possible in Node.js (with RabbitMQ, Kue, etc.)? I could not find an example in the Heroku documentation.

Or should I implement all the code in web dynos and scale web dynos only?

Countermark answered 11/7, 2012 at 9:39 Comment(0)

As the high-level article on background jobs and queuing suggests, your web dynos will need to communicate with your worker dynos via an intermediate mechanism (often a queue).

To accomplish what it sounds like you're hoping to do, follow this general approach:

  • Web request is received by the web dyno
  • Web dyno adds a job to the queue
  • Worker dyno receives job off the queue
  • Worker dyno executes job, writing incremental progress to a shared component
  • Browser-side polling requests status of job from the web dyno
    • Web dyno queries shared component for progress of background job and sends state back to browser
  • Worker dyno completes execution of the job and marks it as complete in shared component
  • Browser-side polling requests status of job from the web dyno
    • Web dyno queries shared component for progress of background job and sends completed state back to browser
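The steps above can be sketched end to end. In this sketch the queue and the shared component are plain in-memory objects standing in for the real add-ons, and every name is illustrative:

```javascript
var queue = [];     // stands in for the message queue add-on
var jobState = {};  // stands in for the shared state store (Redis/Postgres)

// Web dyno: enqueue a job and hand its id back to the browser
function enqueueJob(payload) {
  var id = 'job-' + (Object.keys(jobState).length + 1);
  jobState[id] = { status: 'queued', progress: 0 };
  queue.push({ id: id, payload: payload });
  return id;
}

// Worker dyno: pull one job, write incremental progress, mark complete
function workOnce() {
  var job = queue.shift();
  if (!job) return;
  jobState[job.id].status = 'running';
  jobState[job.id].progress = 50;                      // incremental progress
  jobState[job.id].result = job.payload.toUpperCase(); // the "work"
  jobState[job.id].status = 'complete';
  jobState[job.id].progress = 100;
}

// Web dyno: answer the browser's polling request from shared state
function pollJob(id) {
  return jobState[id] || { status: 'unknown' };
}
```

In a real app `enqueueJob` runs in the web process, `workOnce` loops in the worker process, and `pollJob` backs the status endpoint; the two in-memory objects are then replaced by the broker and datastore add-ons described below.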

As far as actual implementation goes, I'm not too familiar with the best libraries in Node.js, but the components that glue this process together are available on Heroku as add-ons.

Queue: AMQP is a well-supported queue protocol and the CloudAMQP add-on can serve as the message queue between your web and worker dynos.
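If you go the AMQP route, the two sides might look like the following sketch using the amqplib client (untested against a broker; the `jobs` queue name and JSON payloads are assumptions, and `CLOUDAMQP_URL` is the config var the add-on sets):

```javascript
// Web dyno side: publish a job to a durable queue.
// require() is done lazily so this file loads even without the dependency installed.
function publishJob(payload) {
  var amqp = require('amqplib');
  return amqp.connect(process.env.CLOUDAMQP_URL).then(function (conn) {
    return conn.createChannel().then(function (ch) {
      return ch.assertQueue('jobs', { durable: true }).then(function () {
        // persistent messages survive broker restarts, paired with the durable queue
        ch.sendToQueue('jobs', Buffer.from(JSON.stringify(payload)), { persistent: true });
        return ch.close().then(function () { return conn.close(); });
      });
    });
  });
}

// Worker dyno side: consume jobs and ack only after processing, so an
// unacknowledged job is redelivered if the dyno dies mid-job.
function consumeJobs(handler) {
  var amqp = require('amqplib');
  return amqp.connect(process.env.CLOUDAMQP_URL).then(function (conn) {
    return conn.createChannel().then(function (ch) {
      ch.prefetch(1); // at most one unacknowledged job per worker at a time
      return ch.assertQueue('jobs', { durable: true }).then(function () {
        return ch.consume('jobs', function (msg) {
          handler(JSON.parse(msg.content.toString()));
          ch.ack(msg);
        });
      });
    });
  });
}
```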

Shared state: You can use one of the Postgres add-ons to share the state of a job being processed, or something more performant such as Memcached or Redis.
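The shared-state side could be sketched like this with the node_redis client; the `job:<id>` hash key schema and field names are illustrative, not prescribed, and the client is passed in so the helpers stay connection-agnostic:

```javascript
// Worker dyno: record incremental progress, e.g. under key "job:42"
function writeProgress(client, jobId, progress, cb) {
  client.hset('job:' + jobId, 'progress', String(progress), cb);
}

// Worker dyno: mark the job complete and store its result
function markComplete(client, jobId, result, cb) {
  client.hset('job:' + jobId, 'status', 'complete', function (err) {
    if (err) return cb(err);
    client.hset('job:' + jobId, 'result', result, cb);
  });
}

// Web dyno: read the job's current state to answer a polling request
function readState(client, jobId, cb) {
  client.hgetall('job:' + jobId, cb);
}
```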

So, to summarize, you must use an intermediate add-on component to communicate between dynos on Heroku. While this approach involves a little more engineering, the result is a properly decoupled and scalable architecture.

Lyso answered 11/7, 2012 at 17:43 Comment(4)
I have one more question on this. When I use AMQP, how would you guarantee that each job is processed by exactly one worker dyno, without duplication? To me, AMQP seems similar to a TCP socket: broadcast an event, listen for it, and react. If an "enqueue" event happened, multiple worker dynos would react to it and try to dequeue at the same time. How can I handle this problem?Countermark
While queue behavior varies between queues and client libraries, the default behavior is usually not to broadcast. So by default, when a message is consumed off the queue, it goes to the first receiver to get there and is then removed from the queue.Lyso
In AMQP you have Exchanges to which you publish messages, and you have Queues from which you get messages, then you have "bindings" between them which routes the messages from an Exchange to one or more Queues. If you only have one binding between an Exchange and a Queue (which is the default), you're guaranteed to only get unique messages to each subscriber of that Queue.Taveras
Also, AMQP has other nice benefits like in-order guarantees, and features like message persistence, high availability (mirrored) queues, etc. (Disclosure: I own CloudAMQP.)Taveras

From what I can tell, Heroku does not supply a way of communicating for you, so you will have to build that yourself. In order to communicate with another process using Node, you will probably have to deal with the process's stdin/out/err manually, something like this:

var fs = require('fs');

var attachToProcess = function(pid) {
    return {
        stdin: fs.createWriteStream('/proc/' + pid + '/fd/0'),
        stdout: fs.createReadStream('/proc/' + pid + '/fd/1'),
        stderr: fs.createReadStream('/proc/' + pid + '/fd/2')
    };
};

fs.readFile('/path/to/worker.pid', 'utf8', function(err, pid) {
    if (err) { throw err; }
    var worker = attachToProcess(Number(pid));
    worker.stdin.write(...);
});

Then, in your worker process, you will have to store the pid in that pid file:

var fs = require('fs');

fs.writeFile('/path/to/worker.pid', String(process.pid), function(err) {
    if (err) { throw err; }
});

I haven't actually tested any of this, so it will likely take some work to build on, but I think the basic idea is clear.

Edit

I just noticed that you tagged this with "redis" as well, and thought I should add that you can also use redis pub/sub to communicate between your various processes as explained in the node_redis readme.
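The pub/sub variant could be sketched like this with a node_redis-style client (the `jobs` channel name is illustrative; the clients are passed in because a connection in subscriber mode cannot also publish, so you create two):

```javascript
// Worker dyno: subscribe to the channel and hand each message to a handler
function listenForJobs(subscriber, handler) {
  subscriber.on('message', function (channel, message) {
    if (channel === 'jobs') handler(JSON.parse(message));
  });
  subscriber.subscribe('jobs');
}

// Web dyno: announce a new job on the channel
function announceJob(publisher, payload) {
  publisher.publish('jobs', JSON.stringify(payload));
}
```

Note that plain pub/sub broadcasts to every subscriber and drops messages when no one is listening, so for one-job-one-worker semantics a queue is still the better fit.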

Fenelia answered 11/7, 2012 at 10:33 Comment(3)
Heroku dynos are each virtualized, meaning they do not share the same filesystem, even within the same app. So communicating via process ids from one dyno to another won't work.Lyso
@RyanDaigle Yeah, I thought there might be some issue there. The idea about Redis is still valid, though.Fenelia
Definitely. Using Redis as an intermediary (or some other queue library) is the right approach.Lyso
