Prometheus query to count unique label values
Asked Answered
G

4

63

I want to count number of unique label values. Kind of like

select count (distinct a) from hello_info

For example if my metric 'hello_info' has labels a and b. I want to count number of unique a's. Here the count would be 3 for a = "1", "2", "3".

hello_info(a="1", b="ddd")
hello_info(a="2", b="eee")
hello_info(a="1", b="fff")
hello_info(a="3", b="ggg")
Garden answered 16/8, 2018 at 17:6 Comment(0)
B
110
count(count by (a) (hello_info))

First you want an aggregator with a result per value of a, and then you can count them.

Boyceboycey answered 16/8, 2018 at 17:32 Comment(2)
Doesn't seem to work for me, the first count returns a matrix of metrics (where each value is a vector too), and running count on it returns a nonsensical value such as "1.3" ... any ideas?Concussion
I think this query returns a time series rather than a number of all values ever recorded for a particular label.Freeload
A
15

Other example: If you want to count the number of apps deployed in a kubernetes cluster based on different values of a label( ex:app):

count(count(kube_pod_labels{app=~".*"}) by (app))
Accompanist answered 26/2, 2019 at 11:59 Comment(0)
B
10

The count(count(hello_info) by (a)) is equivalent to the following SQL:

SELECT
  time_bucket('5 minutes', timestamp) AS t,
  COUNT(DISTINCT a)
FROM hello_info
GROUP BY t 

See time_bucket() function description.

E.g. it returns the number of distinct values for a label per each 5-minute interval by default - see staleness docs for details about 5-minute interval.

If you need to calculate the number of unique values for a label over custom interval (for example, over the last day), then the following PromQL query must be used instead:

count(count(last_over_time(hello_info[1d])) by (a))

The custom interval - 1d in the case above - can be changed to an arbitrary value - see these docs for possible values, which can be used there.

This query uses last_over_time() function for selecting all the time series, which were active during the last day. Time series can stop receiving new samples and become inactive at any time. Such time series aren't captured with simple count(...) by (a) after 5 minutes of inactivity. New deployments in Kubernetes and horizontal pod autoscaling are the most frequent source of big number of inactive time series (aka high churn rate).

Barby answered 29/3, 2022 at 15:40 Comment(0)
G
0

What worked for me was the following:

count(count by (a)(delta(hello_info[10m]) > 0)) OR on() vector(0)

It counts the number of distinct non-zero metrics.

Gathers answered 29/3, 2023 at 12:4 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.