I have a few questions about RabbitMQ v3.1.5 clustering. I have a two-node cluster, and rabbitmq.config looks like this on both nodes:
[
{rabbit, [
{cluster_nodes, {['rabbit@rmq01', 'rabbit@rmq02'], ram}},
{tcp_listeners, [5674]}
]}
].
I have seen this issue before, and now I'm seeing it again: when the whole cluster is shut down and the second node (rmq02) then starts before the first (rmq01), rmq02 'forgets' about rmq01:
[root@rmq2 rabbitmq]# rabbitmqctl cluster_status
Cluster status of node 'rabbit@rmq2' ...
[{nodes,[{disc,['rabbit@rmq2']}]},
{running_nodes,['rabbit@rmq2']},
{partitions,[]}]
...done.
After this, the first node (rmq01) cannot start because rmq02 disagrees about clustering:
{"init terminating in do_boot",{rabbit,failure_during_boot,{error,{inconsistent_cluster,"Node 'rabbit@rmq1' thinks it's clustered with node 'rabbit@rmq2', but 'rabbit@rmq2' disagrees"}}}}
I tried to cluster rmq02 with rmq01 again, but it seems I have to run stop_app before doing this:
[root@rmq2 rabbitmq]# rabbitmqctl join_cluster rabbit@rmq1
Clustering node 'rabbit@rmq2' with 'rabbit@rmq1' ...
Error: mnesia_unexpectedly_running
Here I can see that rmq02 has forgotten about rmq01:
[root@rmq2 ~]# cat /var/lib/rabbitmq/mnesia/rabbit\@rmq2/cluster_nodes.config
{['rabbit@rmq2'],['rabbit@rmq2']}.
Meanwhile, on rmq01 (the correct configuration):
[root@rmq1 ~]# cat /var/lib/rabbitmq/mnesia/rabbit\@rmq1/cluster_nodes.config
{['rabbit@rmq1','rabbit@rmq2'],['rabbit@rmq1']}.
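The disagreement between the two files can also be checked mechanically. This is only a minimal sketch, assuming each cluster_nodes.config holds a single Erlang term made of two node lists, exactly as printed above, and that the first list is the set of nodes the node believes are in the cluster. The helper names `parse_cluster_nodes` and `missing_peers` are hypothetical and not part of RabbitMQ; Python is used purely for illustration.

```python
import re


def parse_cluster_nodes(text):
    """Parse a term like {['a@h','b@h'],['a@h']}. into two lists of node names."""
    # Each [...] group is one Erlang list; exactly two are expected.
    groups = re.findall(r"\[([^\]]*)\]", text)
    if len(groups) != 2:
        raise ValueError("expected two node lists, got %d" % len(groups))
    # Node names are single-quoted atoms inside each list.
    return tuple(re.findall(r"'([^']+)'", group) for group in groups)


def missing_peers(local_view, remote_view):
    """Nodes in the local file's first list that are absent from the remote file's first list."""
    local_all, _ = local_view
    remote_all, _ = remote_view
    return sorted(set(local_all) - set(remote_all))


# File contents copied from the two nodes above.
rmq1 = parse_cluster_nodes("{['rabbit@rmq1','rabbit@rmq2'],['rabbit@rmq1']}.")
rmq2 = parse_cluster_nodes("{['rabbit@rmq2'],['rabbit@rmq2']}.")

print(missing_peers(rmq1, rmq2))  # ['rabbit@rmq1']: rmq2's file no longer lists rmq1
print(missing_peers(rmq2, rmq1))  # []: rmq1's file still lists everything rmq2 does
```

A non-empty result in one direction matches the inconsistent_cluster error: one node's file still lists a peer whose own file has dropped it.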
Questions:
- Is it normal for rmq02 to forget about rmq01, or do I have a misconfiguration? Why is this happening?
- If this is expected behaviour, is it possible to restore the cluster's health without rmq02 downtime (I mean without stop_app)?
C:\Users\<username>\AppData\Roaming\RabbitMQ\db. I deleted that folder on a node I couldn't get back up and it worked. Thanks! – Ferryman