Reproduce RabbitMQ network partition scenario

I would like to reproduce the network partition scenario with all three partition handling modes: ignore, autoheal and pause_minority. How can I achieve this? I tried rebooting one of the cluster nodes (/sbin/service reboot), but this didn't cause a network partition. I also tried deleting the Mnesia database on one node to make Mnesia inconsistent across the cluster, but that didn't help either.

Timberwork answered 4/3, 2016 at 11:49 Comment(0)

To simulate a network partition, you can block outgoing connections with iptables.

Suppose you have 3 nodes:

node1 - ip : 10.10.0.1
node2 - ip : 10.10.0.2
node3 - ip : 10.10.0.3

After creating the cluster, go to node2, for example, and run:

iptables -A OUTPUT -d 10.10.0.1 -j DROP

This blocks node2's outgoing connections to node1, so the node ends up in a network partition.

Then run

iptables -F

to restore the network. Note that -F flushes all rules in the filter table, not only the one added above.
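The block, wait and unblock steps can be sketched as one small helper. This is only an illustration, not part of the original answer: the run and partition_cycle names, the DRY_RUN switch and the 75-second default wait are assumptions (the wait is chosen to exceed the 60-second net_ticktime default mentioned elsewhere on this page). It also removes just the one rule with -D instead of flushing everything with -F.

```shell
#!/bin/sh
# Hypothetical helper: block traffic to one peer, wait, then remove that rule.
# With DRY_RUN=1 the commands are printed instead of executed (no root needed).

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# partition_cycle PEER_IP [WAIT_SECONDS]
partition_cycle() {
    peer_ip=$1
    wait_seconds=${2:-75}   # assumption: slightly longer than net_ticktime (60 s)
    run iptables -A OUTPUT -d "$peer_ip" -j DROP   # block outgoing traffic to the peer
    run sleep "$wait_seconds"                      # give the nodes time to notice
    run iptables -D OUTPUT -d "$peer_ip" -j DROP   # delete only the rule added above
}

# Example, as root on node2:
#   partition_cycle 10.10.0.1
```

Setting DRY_RUN=1 first shows what would be executed without touching the firewall.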

Bevvy answered 4/3, 2016 at 13:12 Comment(2)
How to simulate this locally would be helpful too. - Reluctant
This doesn't work when trying to simulate a network partition in a RabbitMQ cluster running as a Docker Swarm service with 3 replicas. - Upsurge

If you are using Docker, disconnecting the container from its network will trigger the partition.

docker network disconnect network_name rabbitmq_container_name
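
Since other answers on this page report that the partition is only detected once connectivity returns, a disconnect followed by a reconnect may be needed. A minimal sketch, assuming the hypothetical helper names run and docker_partition_cycle, a DRY_RUN switch and a 75-second default wait; depending on the setup, the container may also need its network aliases or IP address restored when reconnecting.

```shell
#!/bin/sh
# Hypothetical helper: detach a container from its network, wait, reattach it.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# docker_partition_cycle NETWORK CONTAINER [WAIT_SECONDS]
docker_partition_cycle() {
    network=$1
    container=$2
    wait_seconds=${3:-75}   # assumption: slightly longer than net_ticktime (60 s)
    run docker network disconnect "$network" "$container"
    run sleep "$wait_seconds"
    run docker network connect "$network" "$container"
}

# Example:
#   docker_partition_cycle network_name rabbitmq_container_name
```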
Aubigny answered 23/2, 2022 at 7:48 Comment(1)
It doesn't work for me. I have a 7-node cluster. If I do this on 3 nodes, the cluster keeps working fine and rabbitmqctl cluster_status shows no sign of a partition, but if I disconnect 4 nodes, the entire cluster stops working. - Lake

Adding more details to the answer above:

Run the command below on either node2 or node3 to block incoming connections from node1:

sudo iptables -A INPUT -s 10.10.0.1 -j DROP

To allow connections from node1 again, delete the firewall rule created earlier:

sudo iptables -D INPUT -s 10.10.0.1 -j DROP

To view the existing firewall rules:

iptables --list

Note: In some cluster setups, the network partition is detected only once the nodes whose connections were blocked (via the iptables commands) can communicate with each other again. So block the connections, wait at least 60 seconds (the default net_ticktime value), and then unblock them.

Toms answered 27/8, 2017 at 14:31 Comment(0)

I managed to reproduce a network partition in RabbitMQ by blocking port 25672.

25672: used for inter-node and CLI tools communication

I had two RabbitMQ nodes on different AWS instances.

To simulate a network partition, I dropped TCP packets for that port, waited 60 seconds (this may differ depending on the net_ticktime setting) and then removed the port blocking, which is required for the partition to be detected.

Adding the rules for port blocking (with the highest priority):

sudo iptables -I INPUT 1 -p tcp --dport 25672 -j DROP
sudo iptables -I OUTPUT 1 -p tcp --dport 25672 -j DROP

Removing the rules (after 60+ seconds):

sudo iptables -D INPUT -p tcp --dport 25672 -j DROP
sudo iptables -D OUTPUT -p tcp --dport 25672 -j DROP

To check that a network partition has occurred:

sudo rabbitmqctl cluster_status

The partitions property will list nodes in its array, like this:

{partitions,[{'rabbit@ip-163-10-1-10',['rabbit@ip-163-10-0-15']}]}
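
This check can be scripted. The sketch below assumes the Erlang-term output format shown above, where an empty list appears as {partitions,[]}; newer RabbitMQ releases format cluster_status differently, so the pattern would need adjusting there. The function name has_partition is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads cluster_status output on stdin and succeeds
# (exit status 0) only if the partitions list contains at least one entry.
has_partition() {
    grep -q '{partitions,\[{'
}

# Example:
#   if sudo rabbitmqctl cluster_status | has_partition; then
#       echo "partition detected"
#   fi
```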

RabbitMQ network partition docs

Danseuse answered 12/11, 2020 at 7:24 Comment(0)
