I am trying to configure a recording rule in Loki, but the documentation does not make clear how to set it up.
I placed a rules.yml file in the /loki/rules directory. Following the Recording rules doc, I wrote my own rule:
name: MyRules
interval: 1m
rules:
  - record: generator:requests:rate2m
    expr: |
      sum(
        rate({service="generator_generator"}[2m])
      )
    labels:
      cluster: "something"
At first this did nothing: no logs in Loki about a wrong format, and no metrics in Prometheus (remote write). I then copied the file to the rules-temp directory and to the /loki/rules/fake/ directory, based on the Ruler storage doc. The doc does not make clear where the file should be located, so I copied it everywhere. The result was the same: no logs in Loki, nothing in Prometheus.
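For reference, this is how I understand the Ruler storage doc to wire a local setup. It is only a sketch, not my verified config: the `storage` and `rule_path` values are assumptions based on the paths mentioned above, and `fake` is the tenant ID Loki uses when multi-tenancy is disabled.

```yaml
ruler:
  storage:
    type: local
    local:
      # Assumed source directory; rule files would be read from
      # <directory>/<tenant-id>/, e.g. /loki/rules/fake/rules.yml
      directory: /loki/rules
  # Assumed scratch directory where the ruler keeps temporary rule files
  rule_path: /loki/rules-temp
```

If this reading is right, only the file under the tenant directory is actually loaded, and the copy in rules-temp is managed by Loki itself.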
After a day off, I started Loki and found this log entry:
2022-11-03T08:24:24.062210590Z level=error ts=2022-11-03T08:24:24.061854756Z caller=ruler.go:497 msg="unable to list rules" err="failed to list rule groups for user fake: failed to list rule group for user fake and namespace rules.yml: error parsing /loki/rules/fake/rules.yml: /loki/rules/fake/rules.yml: yaml: unmarshal errors:\n line 1: field name not found in type rulefmt.RuleGroups\n line 2: field interval not found in type rulefmt.RuleGroups\n line 3: field rules not found in type rulefmt.RuleGroups"
This entry was not there before, and it does not reappear even when I restart Loki; I do not understand why. I assume it means Loki cannot parse my rules file. I found cortextool, which can validate Loki rules. After a few runs, I ended up with a new rules.yml file:
namespace: rules
groups:
  - name: MyRules
    interval: 1m
    rules:
      - record: generator:requests:rate1m
        expr: |-
          sum(rate({service="generator_generator"}[2m]))
        labels:
          cluster: something
It is quite different from the one in the docs, but it looks OK:
$ cortextool rules lint --backend=loki rules.yml
INFO[0000] SUCCESS: 1 rules found, 0 linted expressions
After this small success I ran Loki again, but still nothing appeared in the Loki logs or in Prometheus. I even tried setting a wrong Prometheus remote write address, but Loki does not log any error about it.
My current configuration of Loki ruler:
ruler:
  alertmanager_url: http://localhost:9093
  remote_write:
    enabled: true
    client:
      url: http://prometheus:9090/api/v1/write
Prometheus runs in default configuration.
Versions: Loki: 2.6.1 Prometheus: v2.39.1
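One thing I have not ruled out: as far as I know, Prometheus only accepts writes on /api/v1/write when it is started with the `--web.enable-remote-write-receiver` flag, which a default setup does not pass. Below is a sketch of how that could look, assuming Prometheus runs from a Compose/stack file. The service name `prometheus` comes from the URL above; the image tag and config path are assumptions.

```yaml
services:
  prometheus:
    # Assumed image tag, matching the version listed above
    image: prom/prometheus:v2.39.1
    command:
      # Overriding the command replaces the image defaults,
      # so the config file path (assumed) is passed again
      - --config.file=/etc/prometheus/prometheus.yml
      # Lets Prometheus act as a remote-write target
      - --web.enable-remote-write-receiver
```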
Questions:
- Where should the rule file be located, and what is the difference between /rules, /rules-temp and /rules/<tenant-id>?
- What is the format of rules and rule files? Can there be multiple files?
- Why do no messages about rules appear in the Loki logs (wrong Prometheus URL, wrong rules.yml format)?
- How do I properly configure rules (both recording and alerting) in Loki? The documentation is very unclear.
- How do I debug this configuration and setup? With no logs or other information, I do not know where to check whether something is wrong.
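To make the alerting part of the question concrete, this is the kind of alerting group I would expect to load next to the recording rule. It is a sketch in the same `groups` layout that passed the lint above. The group name, alert name, threshold, `for` duration, label and annotation are all hypothetical placeholders.

```yaml
groups:
  - name: MyAlerts  # hypothetical group name
    rules:
      - alert: GeneratorHighLogRate  # hypothetical alert name
        # Fires when the log rate of the service exceeds an assumed threshold
        expr: |-
          sum(rate({service="generator_generator"}[2m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: High log rate for generator_generator
```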
Thanks for any tips.