We have a test running with RabbitMQ, but somehow hit a "threshold" which makes RabbitMQ slow.
What we did, we setup a RabbitMQ server and made a small PHP script that uses pecl-amqp to connect (no pconnect) to the MQ server, sends 1 random message and disconnects immediately after that. We fork this process 200 times (single php execution, so no threading), which means every second +/- 2000 connections are opened to the MQ server and closed.
After that I start the same script on a different server to measure the connect time + read time. We noticed that it takes +/- 1.5 seconds until the "Connection.Start" message is received by the client. I've added a screenshot of the tcpdump for this single client to show you the situation. Another problem we see, there are a lot of TCP Out-Of-Order and Retransmissions.
Server specs
- Bare Metal
- CPU: 2x Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
- Memory: 24GB
TCPdump https://i.sstatic.net/l1UoM.png
sysctl settings:
fs.file-max = 11479756
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.conf.default.rp_filter = 0
net.ipv4.tcp_syncookies=1
net.core.somaxconn=2048
net.ipv4.ip_local_port_range = 10240 65535
net.nf_conntrack_max = 1000000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
rabbitmq.conf
%% -*- mode: erlang -*-
[{
rabbit,
[{
tcp_listeners, [{
"10.0.1.69", 5672
}, {
"::1", 5672
}]
},
{tcp_listen_options, [binary,
{packet, raw},
{reuseaddr, true},
{backlog, 65535},
{nodelay, true},
{exit_on_close, false},
{keepalive, true},
{sndbuf, 32768},
{recbuf,32768}]
},
{ vm_memory_high_watermark, 0.6 },
{
log_levels,
[{
connection,
debug
}, {
channel,
debug
}]
},
{
loopback_users,
[]
},
{
default_user,
<< "code_mucRfHX79xvil" >>
}, {
default_pass,
<< "xxxx" >>
}
]
},
{
kernel,
[{
inet_default_connect_options,
[{
nodelay,
true
}]
}, {
inet_default_listen_options,
[{
nodelay,
true
}]
}]
},
{
rabbitmq_management,
[]
},
{
rabbitmq_shovel,
[{
shovels,
[]
}]
},
{
rabbitmq_stomp,
[]
},
{
rabbitmq_mqtt,
[{
default_user,
<< "code_mucRfHX79xvil" >>
}, {
default_pass,
<< "xxxxx" >>
}]
},
{
rabbitmq_amqp1_0,
[{
default_user,
<< "code_mucRfHX79xvil" >>
}]
},
{
rabbitmq_auth_backend_ldap,
[]
}
].
rabbitmqctl report (during "heavy" load test)
Reporting server status on {{2015,12,29},{14,43,15}}
...
Status of node rabbit@web1 ...
[{pid,7730},
{running_applications,
[{rabbitmq_shovel_management,"Shovel Status","3.6.0"},
{rabbitmq_management,"RabbitMQ Management Console","3.6.0"},
{rabbitmq_web_dispatch,"RabbitMQ Web Dispatcher","3.6.0"},
{webmachine,"webmachine","git"},
{mochiweb,"MochiMedia Web Server","2.13.0"},
{rabbitmq_shovel,"Data Shovel for RabbitMQ","3.6.0"},
{rabbitmq_management_agent,"RabbitMQ Management Agent","3.6.0"},
{rabbit,"RabbitMQ","3.6.0"},
{os_mon,"CPO CXC 138 46","2.2.14"},
{ssl,"Erlang/OTP SSL application","5.3.2"},
{public_key,"Public key infrastructure","0.21"},
{crypto,"CRYPTO version 2","3.2"},
{amqp_client,"RabbitMQ AMQP Client","3.6.0"},
{rabbit_common,[],"3.6.0"},
{inets,"INETS CXC 138 49","5.9.7"},
{mnesia,"MNESIA CXC 138 12","4.11"},
{compiler,"ERTS CXC 138 10","4.9.4"},
{xmerl,"XML parser","1.3.5"},
{syntax_tools,"Syntax tools","1.6.12"},
{asn1,"The Erlang ASN1 compiler version 2.0.4","2.0.4"},
{ranch,"Socket acceptor pool for TCP protocols.","1.2.1"},
{sasl,"SASL CXC 138 11","2.3.4"},
{stdlib,"ERTS CXC 138 10","1.19.4"},
{kernel,"ERTS CXC 138 10","2.16.4"}]},
{os,{unix,linux}},
{erlang_version,
"Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:24:24] [async-threads:64] [kernel-poll:true]\n"},
{memory,
[{total,74421680},
{connection_readers,0},
{connection_writers,0},
{connection_channels,0},
{connection_other,2704},
{queue_procs,2704},
{queue_slave_procs,0},
{plugins,1630200},
{other_proc,20907680},
{mnesia,61736},
{mgmt_db,1294648},
{msg_index,53056},
{other_ets,1387864},
{binary,8651592},
{code,26770847},
{atom,951441},
{other_system,12707208}]},
{alarms,[]},
{listeners,
[{clustering,25672,"::"},{amqp,5672,"10.0.1.69"},{amqp,5672,"::1"}]},
{vm_memory_high_watermark,0.6},
{vm_memory_limit,15139282944},
{disk_free_limit,50000000},
{disk_free,34608447488},
{file_descriptors,
[{total_limit,65435},
{total_used,8},
{sockets_limit,58889},
{sockets_used,6}]},
{processes,[{limit,1048576},{used,276}]},
{run_queue,2},
{uptime,3756},
{kernel,{net_ticktime,60}}]
Cluster status of node rabbit@web1 ...
[{nodes,[{disc,[rabbit@web1]}]},
{running_nodes,[rabbit@web1]},
{cluster_name,<<"[email protected]">>},
{partitions,[]}]
Application environment of node rabbit@web1 ...
[{amqp_client,[{prefer_ipv6,false},{ssl_options,[]}]},
{asn1,[]},
{compiler,[]},
{crypto,[]},
{inets,[]},
{kernel,
[{error_logger,tty},
{inet_default_connect_options,[{nodelay,true}]},
{inet_default_listen_options,[{nodelay,true}]},
{inet_dist_listen_max,25672},
{inet_dist_listen_min,25672}]},
{mnesia,[{dir,"/var/lib/rabbitmq/mnesia/rabbit@web1"}]},
{mochiweb,[]},
{os_mon,
[{start_cpu_sup,false},
{start_disksup,false},
{start_memsup,false},
{start_os_sup,false}]},
{public_key,[]},
{rabbit,
[{auth_backends,[rabbit_auth_backend_internal]},
{auth_mechanisms,['PLAIN','AMQPLAIN']},
{backing_queue_module,rabbit_priority_queue},
{channel_max,0},
{cluster_keepalive_interval,10000},
{cluster_nodes,{[],disc}},
{cluster_partition_handling,ignore},
{collect_statistics,fine},
{collect_statistics_interval,5000},
{credit_flow_default_credit,{200,50}},
{default_permissions,[<<".*">>,<<".*">>,<<".*">>]},
{default_user,<<"username">>},
{default_user_tags,[administrator]},
{default_vhost,<<"/">>},
{delegate_count,16},
{disk_free_limit,50000000},
{enabled_plugins_file,"/etc/rabbitmq/enabled_plugins"},
{error_logger,{file,"/var/log/rabbitmq/[email protected]"}},
{fhc_read_buffering,false},
{fhc_write_buffering,true},
{frame_max,131072},
{halt_on_upgrade_failure,true},
{handshake_timeout,10000},
{heartbeat,60},
{hipe_compile,false},
{hipe_modules,
[rabbit_reader,rabbit_channel,gen_server2,rabbit_exchange,
rabbit_command_assembler,rabbit_framing_amqp_0_9_1,rabbit_basic,
rabbit_event,lists,queue,priority_queue,rabbit_router,rabbit_trace,
rabbit_misc,rabbit_binary_parser,rabbit_exchange_type_direct,
rabbit_guid,rabbit_net,rabbit_amqqueue_process,
rabbit_variable_queue,rabbit_binary_generator,rabbit_writer,
delegate,gb_sets,lqueue,sets,orddict,rabbit_amqqueue,
rabbit_limiter,gb_trees,rabbit_queue_index,
rabbit_exchange_decorator,gen,dict,ordsets,file_handle_cache,
rabbit_msg_store,array,rabbit_msg_store_ets_index,rabbit_msg_file,
rabbit_exchange_type_fanout,rabbit_exchange_type_topic,mnesia,
mnesia_lib,rpc,mnesia_tm,qlc,sofs,proplists,credit_flow,pmon,
ssl_connection,tls_connection,ssl_record,tls_record,gen_fsm,ssl]},
{log_levels,[{connection,debug},{channel,debug}]},
{loopback_users,[]},
{memory_monitor_interval,2500},
{mirroring_flow_control,true},
{mirroring_sync_batch_size,4096},
{mnesia_table_loading_timeout,30000},
{msg_store_credit_disc_bound,{2000,500}},
{msg_store_file_size_limit,16777216},
{msg_store_index_module,rabbit_msg_store_ets_index},
{msg_store_io_batch_size,2048},
{password_hashing_module,rabbit_password_hashing_sha256},
{plugins_dir,"/usr/lib/rabbitmq/lib/rabbitmq_server-3.6.0/plugins"},
{plugins_expand_dir,
"/var/lib/rabbitmq/mnesia/rabbit@web1-plugins-expand"},
{queue_index_embed_msgs_below,4096},
{queue_index_max_journal_entries,32768},
{reverse_dns_lookups,false},
{sasl_error_logger,{file,"/var/log/rabbitmq/[email protected]"}},
{server_properties,[]},
{ssl_allow_poodle_attack,false},
{ssl_apps,[asn1,crypto,public_key,ssl]},
{ssl_cert_login_from,distinguished_name},
{ssl_handshake_timeout,5000},
{ssl_listeners,[]},
{ssl_options,[]},
{tcp_listen_options,
[binary,
{packet,raw},
{reuseaddr,true},
{backlog,65535},
{nodelay,true},
{exit_on_close,false},
{keepalive,true},
{sndbuf,32768},
{recbuf,32768}]},
{tcp_listeners,[{"10.0.1.69",5672},{"::1",5672}]},
{trace_vhosts,[]},
{vm_memory_high_watermark,0.6},
{vm_memory_high_watermark_paging_ratio,0.5}]},
{rabbit_common,[]},
{rabbitmq_management,
[{http_log_dir,none},
{listener,[{port,15672}]},
{load_definitions,none},
{rates_mode,basic},
{sample_retention_policies,
[{global,[{605,5},{3660,60},{29400,600},{86400,1800}]},
{basic,[{605,5},{3600,60}]},
{detailed,[{10,5}]}]}]},
{rabbitmq_management_agent,[]},
{rabbitmq_shovel,
[{defaults,
[{prefetch_count,1000},
{ack_mode,on_confirm},
{publish_fields,[]},
{publish_properties,[]},
{reconnect_delay,5}]},
{shovels,[]}]},
{rabbitmq_shovel_management,[]},
{rabbitmq_web_dispatch,[]},
{ranch,[]},
{sasl,[{errlog_type,error},{sasl_error_logger,false}]},
{ssl,[]},
{stdlib,[]},
{syntax_tools,[]},
{webmachine,[{error_handler,rabbit_webmachine_error_handler}]},
{xmerl,[]}]
Connections:
pid name port peer_port host peer_host ssl peer_cert_subject peer_cert_issuer peer_cert_validity auth_mechanism ssl_protocol ssl_key_exchange ssl_cipher ssl_hash protocol user vhost timeout frame_max channel_max client_properties connected_at recv_oct recv_cnt send_oct send_cnt send_pend state channels
<[email protected]> 10.0.1.210:23801 -> 10.0.1.69:5672 5672 23801 10.0.1.69 10.0.1.210 false PLAIN {0,9,1} username / 0 131072 256 [{"product","rabbitmq-c"},{"version","0.5.2"},{"platform","Linux-3.2.0-37-generic"},{"copyright","Copyright (c) 2007-2014 VMWare Inc, Tony Garnock-Jones, and Alan Antonuk."},{"information","See https://github.com/alanxz/rabbitmq-c"},{"client",[{"type","php-amqp extension"},{"version","1.6.1"},{"revision","release"},{"connection type","transient"},{"connection started","Tue, 29 Dec 2015 14:43:13 GMT"}]}] 1451400195480 542 5 530 4 0 running 1
<[email protected]> 10.0.1.210:44872 -> 10.0.1.69:5672 5672 44872 10.0.1.69 10.0.1.210 false none {0,9,1} none 3 4096 undefined none 1451400195483 8 1 481 1 0 starting 0
<[email protected]> 10.0.1.210:22339 -> 10.0.1.69:5672 5672 22339 10.0.1.69 10.0.1.210 false PLAIN {0,9,1} username / 0 131072 256 [{"product","rabbitmq-c"},{"version","0.5.2"},{"platform","Linux-3.2.0-37-generic"},{"copyright","Copyright (c) 2007-2014 VMWare Inc, Tony Garnock-Jones, and Alan Antonuk."},{"information","See https://github.com/alanxz/rabbitmq-c"},{"client",[{"type","php-amqp extension"},{"version","1.6.1"},{"revision","release"},{"connection type","transient"},{"connection started","Tue, 29 Dec 2015 14:43:13 GMT"}]}] 1451400195480 651 7 554 6 0 running 1
<[email protected]> 10.0.1.210:23806 -> 10.0.1.69:5672 5672 23806 10.0.1.69 10.0.1.210 false PLAIN {0,9,1} username / 0 131072 256 [{"product","rabbitmq-c"},{"version","0.5.2"},{"platform","Linux-3.2.0-37-generic"},{"copyright","Copyright (c) 2007-2014 VMWare Inc, Tony Garnock-Jones, and Alan Antonuk."},{"information","See https://github.com/alanxz/rabbitmq-c"},{"client",[{"type","php-amqp extension"},{"version","1.6.1"},{"revision","release"},{"connection type","transient"},{"connection started","Tue, 29 Dec 2015 14:43:13 GMT"}]}] 1451400195482 510 3 514 3 0 running 0
<[email protected]> 10.0.1.210:23990 -> 10.0.1.69:5672 5672 23990 10.0.1.69 10.0.1.210 false PLAIN {0,9,1} username / 0 131072 256 [{"product","rabbitmq-c"},{"version","0.5.2"},{"platform","Linux-3.2.0-37-generic"},{"copyright","Copyright (c) 2007-2014 VMWare Inc, Tony Garnock-Jones, and Alan Antonuk."},{"information","See https://github.com/alanxz/rabbitmq-c"},{"client",[{"type","php-amqp extension"},{"version","1.6.1"},{"revision","release"},{"connection type","transient"},{"connection started","Tue, 29 Dec 2015 14:43:13 GMT"}]}] 1451400195482 510 4 514 3 0 running 0
<[email protected]> 10.0.1.210:22506 -> 10.0.1.69:5672 5672 22506 10.0.1.69 10.0.1.210 false PLAIN {0,9,1} username / 0 131072 256 [{"product","rabbitmq-c"},{"version","0.5.2"},{"platform","Linux-3.2.0-37-generic"},{"copyright","Copyright (c) 2007-2014 VMWare Inc, Tony Garnock-Jones, and Alan Antonuk."},{"information","See https://github.com/alanxz/rabbitmq-c"},{"client",[{"type","php-amqp extension"},{"version","1.6.1"},{"revision","release"},{"connection type","transient"},{"connection started","Tue, 29 Dec 2015 14:43:13 GMT"}]}] 1451400195481 510 3 514 3 0 running 0
Channels:
pid name connection number user vhost transactional confirm consumer_count messages_unacknowledged messages_unconfirmed messages_uncommitted acks_uncommitted prefetch_count global_prefetch_count state
<[email protected]> 10.0.1.210:23811 -> 10.0.1.69:5672 (1) <[email protected]> 1 username / false false 0 0 0 0 0 3 0 running
Queues on /:
Exchanges on /:
name type durable auto_delete internal arguments policy
direct true false false []
ack topic false false false []
amq.direct direct true false false []
amq.fanout fanout true false false []
amq.headers headers true false false []
amq.match headers true false false []
amq.rabbitmq.log topic true false true []
amq.rabbitmq.trace topic true false true []
amq.topic topic true false false []
direct direct false false false []
Bindings on /:
Consumers on /:
Permissions on /:
user configure write read
username .* .* .*
Policies on /:
Parameters on /:
php code
<?php
$connection = new AMQPConnection([
'host' => 'host',
'port' => 5672,
'login' => 'username',
'password' => 'password',
]);
$connection->connect();
$channel = new AMQPChannel($connection);
$exchange = new AMQPExchange($channel);
$exchange->setName('ack');
$exchange->setType(AMQP_EX_TYPE_TOPIC);
$exchange->declareExchange();
echo $exchange->publish('test data ' . mt_rand(1,100), 'ack.call');
$connection->disconnect();
tcp_tw_reuse,recycle = 1
. Looks like complete mess in connections. I'm not sure it is safe to have them set to 1, check Networking and RabbitMQ, OS Level Tuning section RabbitMQ official doc. Meanwhile, can you try to run withtcp_tw_{reuse,recycle} = 0
to check whether problem gone which impact it has on performance. Note, that there are Flow Control which when hit (may) have negative impact on throughput. – Lyse