Twemproxy Lag Forces a Restart
B

2

7

We are running a PHP stack on our app servers, which use twemproxy locally (via a Unix socket) to connect to multiple upstream memcached servers (EC2 small instances) for our caching layer.

Every so often I get an alert from our app monitor that a page load took more than 5 seconds. When this occurs, the immediate fix is to restart the twemproxy service on each app server, which is a hassle.

The only workaround I have now is a crontab entry that restarts the service every minute, but as you can imagine, nothing gets written for a few seconds every minute, so it is not an acceptable permanent solution.

Has anyone encountered this before? If so, what was the fix? I tried switching to AWS ElastiCache, but it didn't perform as well as our current twemproxy setup.

Here is my twemproxy config.

default:
  auto_eject_hosts: true
  distribution: ketama
  hash: fnv1a_64
  listen: /var/run/nutcracker/nutcracker.sock 0666
  server_failure_limit: 1
  server_retry_timeout: 600000 # 600sec, 10m
  timeout: 100
  servers:
    - vcache-1:11211:1
    - vcache-2:11211:1
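
For readers comparing these settings against the twemproxy documentation: `timeout` and `server_retry_timeout` are in milliseconds, and `server_failure_limit` is the number of consecutive failures before a host is ejected. The sketch below only restates that arithmetic for the values above. The helper name `ejection_summary` is hypothetical, the fallback defaults are the ones the twemproxy README documents, and nothing here establishes that ejection is the cause of the lag.

```python
def ejection_summary(pool):
    """Convert a twemproxy pool's millisecond settings into readable values."""
    timeout_ms = pool.get("timeout")  # unset means twemproxy waits indefinitely
    return {
        "auto_eject": bool(pool.get("auto_eject_hosts", False)),
        # Documented default is 2 consecutive failures.
        "failures_before_eject": pool.get("server_failure_limit", 2),
        "request_timeout_s": None if timeout_ms is None else timeout_ms / 1000,
        # Documented default is 30000 msec.
        "eject_duration_s": pool.get("server_retry_timeout", 30000) / 1000,
    }


# Values copied from the "default" pool in the config above.
pool = {
    "auto_eject_hosts": True,
    "server_failure_limit": 1,
    "server_retry_timeout": 600000,
    "timeout": 100,
}
summary = ejection_summary(pool)
print(summary)
```

Read this way, a single request that exceeds 100 ms is enough to eject that memcached host for 10 minutes, which may be worth keeping in mind while investigating.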

And here is the connection config for the php layer:

# Note: We are using HA / twemproxy (nutcracker) / memcached proxy
# So this isn't a default memcache(d) port
# Each webapp will host the cache proxy, which allows us to connect via socket
#   which should be faster, as no tcp overhead
# Hash has been manually overridden from the default (jenkins) to FNV1A_64, which directly aligns with the proxy
port: 0
<?php echo Hobis_Api_Cache::TYPE_VOLATILE; ?>:
  options:
    - <?php echo Memcached::OPT_HASH; ?>: <?php echo Memcached::HASH_FNV1A_64; ?><?php echo PHP_EOL; ?>
    - <?php echo Memcached::OPT_SERIALIZER; ?>: <?php echo Memcached::SERIALIZER_IGBINARY; ?><?php echo PHP_EOL; ?>
  servers:
    - /var/run/nutcracker/nutcracker.sock

We are running twemproxy 0.4.1 and memcached 1.4.25.

Thanks.

Buckeye answered 1/2, 2017 at 21:15 Comment(1)
It is the problem of setting crontab. - Hydraulic
B
0

I ended up switching from the Unix socket to a TCP port on localhost, and that seems to have resolved the restart problem. However, I did notice an uptick in response time after making the switch, due to the overhead associated with TCP. I'm not accepting this answer, in the hope that someone down the road will post a more authoritative answer about the sockets...
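
To put a number on that uptick, one option is to time the same request over both transports. The following is a minimal sketch, not what was actually run here: `time_round_trips` and `median_ms` are hypothetical names, and it assumes the proxy answers a memcached ASCII `get` for a missing key with `END`.

```python
import socket
import statistics
import time


def time_round_trips(sock, n=100, key="latency_probe"):
    """Send n memcached ASCII `get` requests; return per-request seconds.

    Assumes the reply ends with END (true for a hit or a miss). Set a
    socket timeout first so an error reply cannot block forever.
    """
    request = f"get {key}\r\n".encode()
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        sock.sendall(request)
        reply = b""
        while not reply.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("peer closed the connection")
            reply += chunk
        samples.append(time.perf_counter() - start)
    return samples


def median_ms(samples):
    """Median latency in milliseconds."""
    return statistics.median(samples) * 1000
```

To compare, connect one socket with `socket.socket(socket.AF_UNIX)` to `/var/run/nutcracker/nutcracker.sock` and another with `socket.create_connection(("127.0.0.1", 22121))`, then call `median_ms(time_round_trips(sock))` on each. Port 22121 is only a placeholder for whatever TCP port the proxy listens on.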

Buckeye answered 15/2, 2017 at 17:55 Comment(0)
M
3

The number of open or stale socket connections may be the issue.
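
One way to check this on Linux is to count the sockets bound to the proxy's path in `/proc/net/unix`, grouped by state, and watch whether the number keeps growing until the lag appears. This is a rough sketch under that assumption. The function name `count_unix_sockets` is hypothetical, and the state codes are the kernel's socket states as printed in the `St` column.

```python
from collections import Counter

# Socket states as printed in the "St" column of /proc/net/unix (Linux).
STATES = {
    "01": "unconnected",  # includes the listening socket itself
    "02": "connecting",
    "03": "connected",
    "04": "disconnecting",
}


def count_unix_sockets(proc_net_unix_text, path):
    """Count sockets bound to `path`, grouped by state name."""
    counts = Counter()
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: Num RefCount Protocol Flags Type St Inode Path
        if len(fields) < 8 or fields[7] != path:
            continue
        counts[STATES.get(fields[5], fields[5])] += 1
    return counts
```

Calling it with the contents of `/proc/net/unix` and `/var/run/nutcracker/nutcracker.sock` once a minute would show whether connections accumulate between restarts. Accepted connections carry the listener's path, so the "connected" count approximates the number of PHP clients attached to the proxy.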

Marra answered 10/2, 2017 at 13:19 Comment(2)
Hmm, they may be stacking up; I'll look into this. - Buckeye
So he shouldn't have tried to help OP? He has something that may direct OP towards a solution, yet you'd rather have him shut up? I swear Stack Overflow is full of narcissistic people who just want to ruin other people's experience... OP looks happy to receive this comment, and Kiran did what he could to give his opinion, since Stack won't let him comment... So please, Jorn, leave and let people be. This is a site where we help people. Kiran's post was helpful. Yours is everything but helpful. Upvoting you, Kiran. Dunno if it'll solve his issue, but you tried to help, as opposed to others. - Nazar
