Spring Boot Actuator - MAX property
I am using the Spring Boot Actuator dependency to get insight into my application, with Spring Boot Admin as the UI. The client-server configuration is working fine. I need to measure the count, total time, and max for the endpoints that get executed.

uri:/user/asset/getAllAssets
TOTAL_TIME: 831ms
MAX: 0ms 

uri:/user/getEmployee/{employeeId}
TOTAL_TIME: 98ms
MAX: 0ms

Why is MAX (time) 0 while TOTAL_TIME is non-zero?

Spring Boot Admin Image

When I request the general (untagged) form, localhost:8889/actuator/metrics/http.server.requests, I get a MAX of about 3.00.

I have also read the production-ready-features documentation, but I could not find any description of how MAX is calculated or what it represents.

Note: as the number of requests increases, COUNT and TOTAL_TIME increase as well, but MAX sometimes decreases (see Request 1 and Request 2 for details).

Request 1: http.server.requests

 {
        "name": "http.server.requests",
        "description": null,
        "baseUnit": "seconds",
        "measurements": [
            {
                "statistic": "COUNT",
                "value": 597
            },
            {
                "statistic": "TOTAL_TIME",
                "value": 144.9057076
            },
            {
                "statistic": "MAX",
                "value": 3.0002913
            }
        ],
        "availableTags": [
            {
                "tag": "exception",
                "values": [
                    "None"
                ]
            },
            {
                "tag": "method",
                "values": [
                    "GET"
                ]
            },
            {
                "tag": "uri",
                "values": [
                    "/actuator/metrics/{requiredMetricName}",
                    "/**/favicon.ico",
                    "/actuator",
                    "/user/getEmployee/{employeeId}",
                    "/user/asset/getAllAssets",
                    "/actuator/health",
                    "/actuator/info",
                    "/actuator/env/{toMatch}",
                    "/actuator/metrics",
                    "/**"
                ]
            },
            {
                "tag": "outcome",
                "values": [
                    "CLIENT_ERROR",
                    "SUCCESS"
                ]
            },
            {
                "tag": "status",
                "values": [
                    "404",
                    "200"
                ]
            }
        ]
    }

UPDATE

localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/2

Response 404 (I had called /user/getEmployee/2 before making this actuator request)


localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/{employeeId}

Response 400


localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/asset/getAllAssets

{
    "name": "http.server.requests",
    "description": null,
    "baseUnit": "seconds",
    "measurements": [
        {
            "statistic": "COUNT",
            "value": 1
        },
        {
            "statistic": "TOTAL_TIME",
            "value": 0.8311609
        },
        {
            "statistic": "MAX",
            "value": 0
        }
    ],
    "availableTags": [
        {
            "tag": "exception",
            "values": [
                "None"
            ]
        },
        {
            "tag": "method",
            "values": [
                "GET"
            ]
        },
        {
            "tag": "outcome",
            "values": [
                "SUCCESS"
            ]
        },
        {
            "tag": "status",
            "values": [
                "200"
            ]
        }
    ]
}

Request 2: http.server.requests

localhost:8889/actuator/metrics/http.server.requests

{
    "name": "http.server.requests",
    "description": null,
    "baseUnit": "seconds",
    "measurements": [
        {
            "statistic": "COUNT",
            "value": 3346
        },
        {
            "statistic": "TOTAL_TIME",
            "value": 559.7992767999998
        },
        {
            "statistic": "MAX",
            "value": 2.3612968
        }
    ],
Studio answered 29/7, 2019 at 4:25 Comment(3)
Can you check localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/{employeeId} and localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/asset/getAllAssets? They ought to match that screen, with MAX = 0 again. Otherwise, what you are seeing with the root call is an aggregation of the metrics of all endpoints. – Reflect
Thanks for the response @buræquete. For /user/getEmployee/{employeeId} I get 404 and 400, and for uri:/user/asset/getAllAssets I get the same result, MAX as 0. – Studio
@buræquete as the number of requests increases, COUNT and TOTAL_TIME increase as well, but MAX sometimes decreases (see Request 1 and Request 2 for details). Can you please give an idea of how MAX is calculated or what it represents? Thank you. – Studio
1
  • What does MAX represent

MAX represents the maximum time taken to execute the endpoint.

Analysis for /user/asset/getAllAssets

COUNT  TOTAL_TIME  MAX
5      115         17
6      122         17  (Execution Time = 122 - 115 = 7)
7      131         17  (Execution Time = 131 - 122 = 9)
8      187         56  (Execution Time = 187 - 131 = 56; MAX becomes 56)
9      204         56  (Execution Time = 204 - 187 = 17; MAX stays at 56)

  • Will MAX be 0 if we have a small number of requests (or 1 request) to a particular endpoint?

No. The number of requests to a particular endpoint does not affect MAX.


  • When will MAX be 0?

There is a timer that resets the value to 0. When the endpoint has not been called for some time, the timer sets MAX to 0. In my tests the timer value was approximately 2 minutes 30 seconds (150 seconds).


  • How did I determine the timer value?

I took 6 samples (executed the same endpoint 6 times). For each sample, I measured the difference between the time the endpoint was called and the time MAX was set back to zero.

DistributionStatisticConfig has .expiry(Duration.ofMinutes(2)).bufferLength(3), which sets some measurements to 0 if no request has been made within the expiry (rotation) time.
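If a longer MAX window is wanted, one possible approach is a Micrometer MeterFilter bean that overrides the distribution config for this one metric. This is only a sketch: the class name, the bean name, and the 5-minute / 5-bucket values are made up for illustration, and it assumes the application uses Spring Boot's auto-configured meter registry, which picks up MeterFilter beans.

```java
import java.time.Duration;

import io.micrometer.core.instrument.Meter;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.distribution.DistributionStatisticConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical configuration class; the name and the values are illustrative only.
@Configuration
public class MaxWindowConfig {

    @Bean
    public MeterFilter httpServerRequestsMaxWindow() {
        return new MeterFilter() {
            @Override
            public DistributionStatisticConfig configure(Meter.Id id, DistributionStatisticConfig config) {
                // Leave every other meter untouched.
                if (!"http.server.requests".equals(id.getName())) {
                    return config;
                }
                // Assumed values: keep samples for 5 minutes, split into 5 buckets.
                return DistributionStatisticConfig.builder()
                        .expiry(Duration.ofMinutes(5))
                        .bufferLength(5)
                        .build()
                        .merge(config); // settings not set here fall back to the existing config
            }
        };
    }
}
```

I have not measured how this changes the observed reset time, so treat the numbers as a starting point for experimentation.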


The MAX property belongs to the enum Statistic, which is used by Measurement (in a Measurement we get COUNT, TOTAL_TIME, and MAX):

public static final Statistic MAX

The maximum amount recorded. When this represents a time, it is reported in the monitoring system's base unit of time.


Note: these are the cases for the metric of a particular endpoint (here /actuator/metrics/http.server.requests?tag=uri:/user/asset/getAllAssets).

For the general metric, actuator/metrics/http.server.requests:

As you can see from Request 1 and Request 2 (in the question), MAX decreased (from 3.0002913 to 2.3612968), possibly because MAX for some endpoint was set back to 0 by the timer. In my view, MAX for /http.server.requests behaves the same way as for a particular endpoint (but I am not sure about that; still investigating).

Studio answered 29/7, 2019 at 9:14 Comment(0)
12

The MAX metric is a rolling max: it represents the maximum measurement within a rolling window.

For example if you were to scrape your metrics every minute:

          Total    Count   Max
Minute 1    100        1   100  
Minute 2    500      101    90
Minute 3   4500     1000    10
Minute 4   4500     1000     0

In minute 1 you had 1 request and a total of 100ms, so the average duration was 100ms, and the slowest request (the max) was 100ms.

In minute 2 the total has increased by 400 (since the total is cumulative) and the count has increased by 100, so the average is 4ms. However, since the max is 90ms, you know that while most of your requests in that minute were fast, some were still slower.

In minute 3 you had 899 more requests (count) and 4000ms added to the total (4000/899 = ~4.4ms), so your average measurement was 4.4ms and the max was 10ms.

So the purpose of the MAX is to measure the worst outlier so you know how consistent the code is performing.

Looking at minute 4, the total and count haven't increased because there were no requests. Since there were no requests, then there couldn't be a 'slowest' request for the MAX, and that is why the MAX is 0.
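The per-minute averages worked out above can be reproduced from the cumulative Total and Count columns. The following is a minimal sketch in plain Java; the class and method names are invented for illustration and are not part of Actuator or Micrometer.

```java
public class IntervalAverage {

    /** Average duration (ms) of the requests that arrived between two scrapes. */
    public static double averageBetween(long prevTotalMs, long prevCount, long totalMs, long count) {
        long requests = count - prevCount;
        if (requests == 0) {
            return 0.0; // no traffic in this interval, so there is nothing to average
        }
        return (double) (totalMs - prevTotalMs) / requests;
    }

    public static void main(String[] args) {
        // Cumulative {total ms, count} per scrape, taken from the table above.
        long[][] scrapes = { {100, 1}, {500, 101}, {4500, 1000}, {4500, 1000} };
        long prevTotal = 0;
        long prevCount = 0;
        for (int i = 0; i < scrapes.length; i++) {
            double avg = averageBetween(prevTotal, prevCount, scrapes[i][0], scrapes[i][1]);
            System.out.println("Minute " + (i + 1) + ": average " + avg + " ms");
            prevTotal = scrapes[i][0];
            prevCount = scrapes[i][1];
        }
    }
}
```

The average alone hides outliers (4ms in minute 2 despite a 90ms request), which is why the rolling MAX is reported alongside it.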

Handgun answered 16/9, 2019 at 14:28 Comment(3)
Good to see your answer. Yes, you are right: if we do not make any request within the expiry (rotation) time, MAX will be 0 after that duration. – Studio
What I have found is that DistributionStatisticConfig has .expiry(Duration.ofMinutes(2)).bufferLength(3), which sets some measurements to 0 if no request has been made within the expiry (rotation) time. Is that the code that rotates the MAX window? – Studio
I'm not certain, since I haven't tried tweaking that property myself. I believe it is configurable, so I think you've found the correct one, though I'm not certain of the bufferLength(3) part. – Handgun
2

You can see the individual metrics by using ?tag=uri:{endpoint_tag}, with the tag values listed in the response of the root /actuator/metrics/http.server.requests call. The measurement values are:

  • COUNT: Rate per second for calls. (This is the Javadoc wording; in the responses shown in the question the value is the cumulative number of calls.)
  • TOTAL_TIME: The sum of the times recorded. Reported in the monitoring system's base unit of time.
  • MAX: The maximum amount recorded. When this represents a time, it is reported in the monitoring system's base unit of time.

As given here, also here.


The discrepancy you are seeing is due to a timer: after some time, the current MAX value for any tagged metric can be reset to 0. Can you make some new calls to /user/asset/getAllAssets and then immediately call /actuator/metrics/http.server.requests to see whether MAX is non-zero for the given tag?

The idea is to report a MAX for each shorter period. When you collect these metrics over time, you get a series of MAX values rather than a single value for one long period.

You can see this in action in the Micrometer source code: there is a rotate() method that resets the MAX value to produce the behaviour described above.

It is called on every poll() call, which is triggered periodically for metric gathering.
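To make the rotate/poll idea concrete, here is a simplified model of a time-windowed max backed by a ring buffer. It is a sketch written for this answer, not Micrometer's actual code: the class name is invented, time is passed in explicitly so the behaviour is deterministic, and the 2-minute / 3-bucket values simply mirror the defaults mentioned elsewhere in this thread.

```java
/** Simplified model of a rolling max; NOT Micrometer's actual implementation. */
public class RollingMax {
    private final long[] buckets;          // ring buffer of maxima
    private final long rotateEveryMillis;  // expiry divided by buffer length
    private int current = 0;               // bucket that poll() reads
    private long lastRotateMillis;

    public RollingMax(long expiryMillis, int bufferLength, long startMillis) {
        this.buckets = new long[bufferLength];
        this.rotateEveryMillis = expiryMillis / bufferLength;
        this.lastRotateMillis = startMillis;
    }

    /** Write the sample into every bucket so it survives until each one is reset. */
    public void record(long sample, long nowMillis) {
        rotate(nowMillis);
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = Math.max(buckets[i], sample);
        }
    }

    /** Read the current bucket; it is 0 once all earlier samples have aged out. */
    public long poll(long nowMillis) {
        rotate(nowMillis);
        return buckets[current];
    }

    /** For each elapsed rotation interval, reset the current bucket and advance. */
    private void rotate(long nowMillis) {
        while (nowMillis - lastRotateMillis >= rotateEveryMillis) {
            buckets[current] = 0;
            current = (current + 1) % buckets.length;
            lastRotateMillis += rotateEveryMillis;
        }
    }

    public static void main(String[] args) {
        RollingMax max = new RollingMax(120_000, 3, 0); // 2 min expiry, 3 buckets
        max.record(56, 1_000);                      // slow request at t = 1s
        System.out.println(max.poll(100_000));      // 56: still inside the window
        System.out.println(max.poll(121_000));      // 0: the sample has rotated out
        max.record(17, 130_000);                    // a faster request later on
        System.out.println(max.poll(130_000));      // 17: MAX is now lower than before
    }
}
```

In this model MAX can drop from 56 to 0 to 17 while COUNT and TOTAL_TIME would only ever grow, which matches the pattern described in the question.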

Millennial answered 29/7, 2019 at 5:23 Comment(13)
Yes, there is no proper documentation for it. Apart from that, can you please clarify why I am getting 404 for localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/2, or how we can see the metric for an endpoint that has a path variable? – Studio
@PatelRomil you should use the URI as it is defined in the controller, e.g. /user/getEmployee/{employeeId}, not with hardcoded values like 2. Not sure why you are getting 400 with that; you can see it in the availableTags of the default /http.server.requests response. – Reflect
@RequestMapping(value="/user/getEmployee/{employeeId}",method=RequestMethod.GET) public ResponseEntity<User> getUser(@PathVariable("employeeId") BigInteger employeeId ). To get profile views by id, COUNT might be helpful. – Studio
@PatelRomil you cannot do that with the /metrics endpoint unless you add some custom logic, I think. You should be able to use localhost:8889/actuator/metrics/http.server.requests?tag=uri:/user/getEmployee/{employeeId}. Try calling it from different environments; maybe you added some unaccepted char in it? 400 is not a normal error response for a valid URL. – Reflect
The 400 was my bad, as I had pasted it directly with {employeeId}. When I replaced it with an employee id that had been executed previously, the result was 404. I am sure that no extra char is appended to it. – Studio
@PatelRomil can you just copy directly from the response of /http.server.requests? Find that endpoint as a tag there, under availableTags, and use exactly that. Maybe it is some weird char issue; if not, you can ask that as a separate question, I think! – Reflect
Yes, I have copied from /http.server.requests, but with no luck. Thank you for your time. – Studio
@PatelRomil btw, the screen you added to your question must have called that endpoint with a correct URL, since it can list its values there. You can check the page's network activity while it loads to confirm which endpoints it is calling. – Reflect
Let us continue this discussion in chat. – Studio
I feel a bit confused about how to configure the tag. {url}/actuator/metrics/http.server.requests works for me, but when trying a specific tag, say {url}/actuator/metrics/http.server.requests?tag=uri:/test/1, I just get the "site can't be reached" error. – Noctambulism
@Vicky you don't put the value 1; just keep the {paramName}, whatever name you have there, like /user/getEmployee/{employeeId}. – Reflect
No, I think I'm correct. The test controller is like this: @RestController @RequestMapping("/test/") public class testController { @GetMapping("1") public String create() { return "helloworld"; } } – Noctambulism
The reason why requests with {employeeId} give you 400 is that it needs to be URL-escaped to %7BemployeeId%7D. – Isom
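As an illustration of the URL-escaping point in the last comment, the tag value can be percent-encoded before it is placed in the query string. This is a sketch using only the JDK (Java 10 or later); the class and method names are made up, and it encodes the whole tag value, which also escapes ':' and '/' in addition to the braces.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class MetricsTagUrl {

    /** Build an actuator metrics URL with the tag (key:value) percent-encoded. */
    public static String metricsUrl(String base, String metric, String tagKey, String tagValue) {
        String tag = URLEncoder.encode(tagKey + ":" + tagValue, StandardCharsets.UTF_8);
        return base + "/actuator/metrics/" + metric + "?tag=" + tag;
    }

    public static void main(String[] args) {
        String url = metricsUrl("http://localhost:8889", "http.server.requests",
                "uri", "/user/getEmployee/{employeeId}");
        // Braces become %7B and %7D, so the request line no longer contains raw '{' or '}'.
        System.out.println(url);
        // prints http://localhost:8889/actuator/metrics/http.server.requests?tag=uri%3A%2Fuser%2FgetEmployee%2F%7BemployeeId%7D
    }
}
```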
