How to arrive at an Apdex threshold value based on the SLA?

We have a REST API. For each endpoint that this API offers, we have defined an SLA based on internal testing. New Relic provides an option to define the Apdex T value on a per-application basis. Consider the following scenario:

  • Endpoint A: SLA is 200ms
  • Endpoint B: SLA is 800ms
  • Average SLA: 500ms

    Case 1: Use the average SLA as the Apdex threshold. The problem with this approach is that even though endpoint A is expected to complete in 200ms, it wouldn't be flagged even if it took twice the time defined in its SLA, since that would still be less than the average value. The reverse holds for endpoint B, which would be flagged even when it responded in under 800ms.

    Case 2: Use the maximum SLA (800ms) across all endpoints as the Apdex T value. Again, the problem here is with endpoint A: a delay in its response wouldn't be flagged even if it took 4 times the expected time.
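To make the two cases concrete, here is a minimal sketch of how a single response time falls into the standard Apdex zones (satisfied up to T, tolerating up to 4T, frustrated beyond that). The function name `apdex_zone` is made up for illustration and is not part of any New Relic API.

```python
def apdex_zone(response_ms: float, t_ms: float) -> str:
    """Classify one response time against an Apdex threshold T (both in ms)."""
    if response_ms <= t_ms:
        return "satisfied"
    if response_ms <= 4 * t_ms:
        return "tolerating"
    return "frustrated"


# Endpoint A (SLA 200 ms) responding in 400 ms, i.e. twice its SLA:
print(apdex_zone(400, t_ms=500))  # Case 1: T = average SLA
print(apdex_zone(400, t_ms=800))  # Case 2: T = maximum SLA

# Endpoint B (SLA 800 ms) responding in 700 ms, i.e. within its SLA:
print(apdex_zone(700, t_ms=500))  # Case 1: T = average SLA
```

In both cases the slow endpoint A request still counts as satisfied, while the in-SLA endpoint B request is downgraded to tolerating under Case 1.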

So, how do we arrive at an Apdex threshold value in such scenarios? I went through the following article from New Relic: LINK. It makes sense when we look at the service as a whole, but not when we look at each endpoint individually.

Cyclopropane answered 24/7, 2019 at 9:24 Comment(0)

Are you sure you want to set Apdex based on your SLA?

I would suggest that the typical performance of the application is the better metric to look at. You could, say, take your application's average response time over the last 7 days. However, the "How to set an Apdex T" article suggests using a percentile as the measure of typical performance instead.

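If you set Apdex T to the 90th percentile, it should typically result in an Apdex score near 0.95: 90% of requests are satisfied (1 point each) and most of the remaining 10% are tolerating (0.5 points each). An Apdex of 1 is obviously useless, as it means you're not holding your application to a strict enough standard. So I would query Insights for each application individually: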

select percentile(duration, 90) from Transaction where appName="AppA" since 7 days ago

select percentile(duration, 90) from Transaction where appName="AppB" since 7 days ago

This gives you a response time that 90% of your customers' requests beat, so it should be a good rough guide for your Apdex T value.
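As a rough check of the "90th percentile gives about 0.95" rule of thumb, the sketch below computes an Apdex score from a list of durations. The sample data is synthetic, and the nearest-rank percentile is an assumption chosen for simplicity; the `percentile()` function in Insights may use a different (approximate) method.

```python
import math


def apdex(durations_ms, t_ms):
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    satisfied = sum(1 for d in durations_ms if d <= t_ms)
    tolerating = sum(1 for d in durations_ms if t_ms < d <= 4 * t_ms)
    return (satisfied + tolerating / 2) / len(durations_ms)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (an assumption, not New Relic's algorithm)."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Synthetic sample: 100 requests taking 10 ms, 20 ms, ..., 1000 ms.
samples_ms = [10 * i for i in range(1, 101)]

t_ms = percentile_nearest_rank(samples_ms, 90)
score = apdex(samples_ms, t_ms)
print(t_ms, score)  # 900 0.95
```

The score only lands on 0.95 because none of the slowest 10% exceed 4T here. With a long tail of very slow requests, some would count as frustrated and the score would drop below 0.95.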

If, however, your goal for App A (where the SLA is 200ms) is that ANY transaction over the SLA should score 0 points towards Apdex, then quite simply your Apdex T should be 50ms. Anything faster than Apdex T (50ms) gets 1 point, and anything between Apdex T and 4 x Apdex T gets 0.5 points, so it is at least still scoring. Anything slower than 4 x Apdex T (200ms in this scenario) gets 0 points. That way, transactions that violate the SLA are marked as Frustrated for Apdex.
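The same arithmetic applied to both endpoints from the question, as a small sketch. It assumes you want the frustrated boundary (4T) to sit exactly on the SLA; `apdex_t_for_sla` is a hypothetical helper, not a New Relic function.

```python
def apdex_t_for_sla(sla_ms: float) -> float:
    """Pick T so that the frustrated boundary (4 * T) equals the SLA."""
    return sla_ms / 4


# SLAs taken from the question; the names are only labels.
slas_ms = {"Endpoint A": 200, "Endpoint B": 800}

for name, sla_ms in slas_ms.items():
    t_ms = apdex_t_for_sla(sla_ms)
    print(f"{name}: satisfied <= {t_ms:g} ms, "
          f"tolerating <= {4 * t_ms:g} ms, frustrated above that")
```

Since the two endpoints end up with different T values (50ms and 200ms), this approach still needs a threshold per endpoint rather than one value for the whole application.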

Apdex is a bit of an art but you can definitely get to where you need with either of the above. I hope I covered off the two scenarios I see as being likely in this case.

Engedi answered 24/7, 2019 at 14:22 Comment(4)
I understand the reason for leveraging the percentile here. However, why is it incorrect, or rather a bad idea, to use a service's SLA to set the Apdex T value? My understanding was a little different: the SLA is the maximum response time below which the service is considered good. Anything between T and 4T would then be tolerating, and anything beyond that would be frustrating. - Cyclopropane
I guess it depends on how you want to look at your SLA and how stringent it is. If the SLA is 1 second and you have a request that takes 2 seconds, you are in breach of your SLA. Could you be losing money under a contract for every request over 1 second? - Engedi
If that were the case, I wouldn't EVER want a request over 1 second, so I would say my transaction time needs to be WAY faster, and my Apdex should treat anything faster than 250ms as Satisfied, anything between 250ms and 1000ms as Tolerating, and anything over 1000ms as Frustrated. That is because we are deeming 1 second to be the point at which we lose revenue, and we don't want to go over it. - Engedi
Yes, the services are expected to adhere to the defined SLA limits. However, the money factor is a good point to consider. I would probably need to figure out how much margin we want to keep (if any), or whether we want to be very stringent with the defined SLA. After that, it would be much easier to arrive at an Apdex T value. Thanks for the clarification! - Cyclopropane
