Grafana expression for prometheus histogram

Can anyone help me with visualising a Prometheus histogram as both a chart and an apdex score, please?

Ignoring any secondary labelling (for now), I'd just like to visualise the buckets as a histogram in Grafana (a stacked bar chart is fine). It would also be really useful to show the apdex in Grafana.

Examples of the buckets from the Prometheus web console:

someoperation_duration_seconds_bucket{
    labelOne="some_consistent_label",
    exported_instance="foo",
    exported_job="my_job",
    instance="10.0.0.0:9091",
    job="kubernetes-service-endpoints",
    kubernetes_name="prometheus-push-gateway",
    kubernetes_namespace="monitoring",
    le="+Inf",
    labelTwo="some_label_that_changes1"
}

someoperation_duration_seconds_bucket{
    labelOne="some_consistent_label",
    exported_instance="foo",
    exported_job="my_job",
    instance="10.0.0.0:9091",
    job="kubernetes-service-endpoints",
    kubernetes_name="prometheus-push-gateway",
    kubernetes_namespace="monitoring",
    le="120",
    labelTwo="some_label_that_changes1"
} etc etc

I've read the post "How can I visualize a histogram with promdash or grafana" and got the chart showing as a stacked bar, with the series being the 'le' (bucket) values. However, every bucket shows exactly the same value on the Y axis.

Because of the nature of the operation the metrics are collected via a PushGateway. Not sure if that has an impact.

Many thanks all

Equally sized buckets are best for the panel type histogram. Three carefully chosen buckets are perfect for apdex calculation.

Histogram

Select Histogram as panel type and use the following query:

someoperation_duration_seconds_bucket{
    labelOne="some_consistent_label",
    exported_instance="foo",
    exported_job="my_job",
    instance="10.0.0.0:9091",
    job="kubernetes-service-endpoints",
    kubernetes_name="prometheus-push-gateway",
    kubernetes_namespace="monitoring",
    labelTwo="some_label_that_changes1"
}

Note that I just copied your own query but omitted the label le.

I suppose you are interested in how the values change over time, so it might be more useful to visualize the following query instead:

increase(someoperation_duration_seconds_bucket{ ... }[TIME])

TIME could be 10m, 1h or 1d, depending on your requirements.
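
As a concrete sketch of the query above, the following sums the increase per bucket and keeps only the le label, which matches the wish to ignore secondary labels for now. The 1h window and the exported_job selector are illustrative assumptions; adjust both to your setup. Keep in mind that Prometheus histogram buckets are cumulative, so each series counts all observations less than or equal to its le bound.

```promql
# Increase of each cumulative bucket over the last hour,
# aggregated across every label except le.
# The 1h window and the exported_job filter are assumptions for illustration.
sum by (le) (
  increase(someoperation_duration_seconds_bucket{exported_job="my_job"}[1h])
)
```

This returns one series per bucket boundary, which is usually easier to read in a bar chart than one series per label combination.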

The X-axis will not use the same bucket sizes as the ones you defined in your application. You can set a bucket size as a panel option, but this might not be what you are looking for.

Apdex

You can only calculate an apdex if you know which performance is acceptable to your customer. Let us assume that 80 to 120 seconds is tolerated, everything faster is fine, and everything slower is not acceptable. Let's now assume that you defined only three buckets: b1=[0,80], b2=[0,120], b3=[0,+Inf]. The apdex will be calculated as follows:

((count(b1) + (count(b2)-count(b1)) * 0.5)) / count(b3)

You've already given the query that returns the count of a bucket:

someoperation_duration_seconds_bucket{
    ...,
    le="BUCKET",
    ...
}

For my example, BUCKET will be 80 and 120. The +Inf bucket gives you the total count. It might again be useful to calculate the increase over time:

increase(someoperation_duration_seconds_bucket{
    ...,
    le="BUCKET",
    ...
}[TIME])

The only problem here is that you need to know the bucket sizes defined by your application, and those sizes must match the performance boundaries your customer tolerates. We could not calculate the apdex in the example above if 80 and 120 were not bucket boundaries. More than three buckets would be fine, but you'll need at least those three buckets that match the tolerated performance boundaries.
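
Putting the pieces together, a single expression for the apdex could look like the sketch below. It assumes the 80 and 120 second boundaries from the example and an illustrative 1h window; neither is taken from your actual configuration. Each term is wrapped in sum() so that the differing le labels are dropped and the three results can be combined. Since (b1 + (b2 - b1) * 0.5) equals (b1 + b2) / 2, the expression uses the shorter form.

```promql
# Apdex over the last hour, assuming buckets with le="80" and le="120" exist.
# The thresholds and the 1h window are assumptions for illustration.
(
  sum(increase(someoperation_duration_seconds_bucket{le="80"}[1h]))
+
  sum(increase(someoperation_duration_seconds_bucket{le="120"}[1h]))
) / 2
/
sum(increase(someoperation_duration_seconds_bucket{le="+Inf"}[1h]))
```

The le values must match the label text exactly as it is exposed (for example "80" versus "80.0", depending on the client library), otherwise the selectors return nothing. Add your own label matchers inside the braces if you want the apdex for a single label combination.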

I found this answer quite interesting as it calculates an apdex threshold that corresponds to some SLA agreement but takes a different approach than I did.
