CakeSession::_startSession - Slow on Elasticache

We're running CakePHP 2.9 and using an ElastiCache cluster for session storage (sessions are stored via Memcached).

We've disabled PHP's built-in session garbage collection, as recommended here: https://tideways.io/profiler/blog/php-session-garbage-collection-the-unknown-performance-bottleneck

session.gc_probability = 0

We have also set the probability setting to 0 within CakePHP's Cache config.

However, we still occasionally see major slowdowns in CakeSession::_startSession, as reported by New Relic:

[Screenshot: Slow CakeSession::_startSession]

The ElastiCache cluster is not showing any metrics that would suggest a problem (unless there's some metric I'm not understanding correctly).

Any suggestions on how to diagnose the cause?

Dennis answered 20/2, 2017 at 0:2 Comment(11)
Are the web servers on the same VPC as ElastiCache? - Saleswoman
@Saleswoman Yes - all within the same Security Group - is that what you meant? - Dennis
No, a VPC is not the same as a security group. A VPC is like a LAN for the services. Check out the FAQ pages. - Saleswoman
Yeah, it's called "VPC Security Group". The cluster is on the same VPC Security Group as the EC2 instances. - Dennis
If your instances are on the same VPC (which is what's implied by using the same VPC security group), then the only other reason I can think of is that they're t-type instances and the burst quota is regularly being exceeded. - Saleswoman
Sorry, they're all c4.large. About a month ago we moved off t2-type instances because we were having issues with credits running out. This issue has persisted since switching instance sizes. - Dennis
@Dennis How many memcache servers are you running? - Cyclorama
@srayhunter 2 within the cluster, spread over 2 availability zones. - Dennis
@Dennis I ran into a ton of issues caused by having 2 nodes in the cluster. I wonder whether changing it to 1 would fix the issue. - Cyclorama
In the graph I can't see the problem with CakeSession::_startSession. The whole execution time for Dispatcher::dispatch is only 5ms, including CakeSession::_startSession. - Pyo
@pbacterio: Perhaps I'm misreading the graph, but my understanding is that it shows total execution time was 0.026s up until it hit CakeSession::_startSession, and then it took 5.7s to complete that before carrying on with TenantAuthorizeComponent::initialize at timestamp 5.787? - Dennis

This issue appears to have been caused by session locking, something I wasn't even aware existed.

This article explains how and why Session Locking exists: https://ma.ttias.be/php-session-locking-prevent-sessions-blocking-in-requests/

What's important is that PHP's Memcached session handler has session locking turned on by default.

In our case, we don't use sessions for much other than authentication. Our application doesn't use the session for storing user state (like a shopping cart would), so we simply disabled session locking with the php.ini setting:

memcached.sess_locking = 0

Since making this change, we've seen a huge improvement in response times (from ~200ms average to ~160ms). This is especially noticeable on AJAX-heavy pages which load a lot of data concurrently. Previously, it seems these requests were being handled sequentially; now they're all serviced simultaneously, and the difference in speed is incredible.

While there are likely some edge cases we'll uncover over the coming weeks/months as a result of turning off session locking, this appears to be the cause of the issue, and this change seems to have stopped the problem from occurring.
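
For applications that do write to the session and can't safely turn locking off, a commonly used alternative is to keep locking enabled and release the lock as soon as the request has finished with the session. A minimal sketch in plain PHP (not CakePHP-specific; the `user_id` key is a hypothetical example and not something from our app):

```php
<?php
// Sketch only: hold the session lock for as short a time as possible.
session_start();   // may block while another request holds the lock

// Copy out whatever this request needs ('user_id' is a hypothetical key).
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;

// Persist the session and release the lock, so concurrent requests
// from the same user (e.g. parallel AJAX calls) are no longer queued.
session_write_close();

// ... slow work goes here. $_SESSION can still be read from memory,
// but anything written to it after this point is not saved.
```

The trade-off is that any session writes have to happen before `session_write_close()` is called.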

Dennis answered 18/3, 2017 at 12:41 Comment(0)

You need to debug each layer in isolation to find out which one is causing the problem.

It can be Cake, AWS infrastructure, network latency...

Run this small PHP script and tell us the time it took.

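<?php
// Memcache extension (skipped if only Memcached is installed)
if (class_exists('Memcache')) {
    $start = microtime(true);
    $memcache = new Memcache;
    $memcache->connect('myhost.cache.amazonaws.com', 11211);
    printf("Memcache connect: %.5f\n", microtime(true) - $start);
}

// Memcached extension: time a connect plus one set/get round trip
$start = microtime(true);
$memcached = new Memcached();
$memcached->addServer('<elasticache node endpoint>', 11211);

$memcached->set('foo', 100);
var_dump($memcached->get('foo'));
printf("Memcached set/get: %.5f\n", microtime(true) - $start);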

If the time is OK, the problem is likely in Cake.

However, to be honest, I'm fairly certain the problem is the ElastiCache cluster.

Try pointing to the endpoint of a single node rather than the endpoint of the ElastiCache cluster, and let me know how it goes.
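
Because the slowdown is only occasional, a single timing may miss it. The sketch below repeats an operation many times and reports the minimum, average, and maximum duration. `timeRepeated` is a hypothetical helper name, the iteration count is arbitrary, and the endpoint in the usage comment is a placeholder.

```php
<?php
/**
 * Time $iterations calls of $operation and return min/avg/max in seconds.
 * $iterations must be at least 1.
 */
function timeRepeated(callable $operation, $iterations)
{
    $samples = array();
    for ($i = 0; $i < $iterations; $i++) {
        $start = microtime(true);
        $operation();
        $samples[] = microtime(true) - $start;
    }

    return array(
        'min' => min($samples),
        'avg' => array_sum($samples) / count($samples),
        'max' => max($samples),
    );
}

// Usage against a single node (requires the Memcached extension):
// $m = new Memcached();
// $m->addServer('<elasticache node endpoint>', 11211);
// print_r(timeRepeated(function () use ($m) {
//     $m->set('foo', 100);
//     $m->get('foo');
// }, 1000));
```

A large gap between 'avg' and 'max' would point at intermittent latency between the web server and the node rather than at Cake.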

Icon answered 17/3, 2017 at 11:52 Comment(1)
"Memcache" is not installed, only "Memcached" - do you know how to perform this with Memcached? - Dennis


We had a similar problem: the site became slow after moving sessions to Memcached on AWS (EC2 and ElastiCache/Memcached). The following changes fixed the problem.

php.ini - session.lazy_write = Off
memcached.ini - memcached.sess_locking = Off

Now the site is working fine, at the expected speed.

But I am wondering whether there are any adverse effects of turning off these settings?

Damalis answered 2/1, 2020 at 13:51 Comment(0)
