Which JMX metric should be used to monitor the status of a connector in Kafka Connect?

I'm using the following JMX metrics for Kafka Connect.

Chisel asked 11/5, 2018 at 11:18 Comment(0)

Have a look at the Connect Monitoring section in the Kafka docs; it lists all the Kafka Connect-specific metrics.

For example, there are overall metrics for each connector:

  • kafka.connect:type=connector-metrics,connector="{connector}", which contains the connector status (running, failed, etc.); see the sketch after this list
  • kafka.connect:type=connector-task-metrics,connector="{connector}",task="{task}", which contains the status of individual tasks
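
A minimal sketch of reading that status with the standard javax.management client, assuming remote JMX is enabled on the Connect worker at localhost:9999 and a hypothetical connector named my-connector (both are placeholders for your own setup):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ConnectorStatusCheck {
        public static void main(String[] args) throws Exception {
            // Assumes the worker was started with remote JMX enabled,
            // e.g. via KAFKA_JMX_OPTS / JMX_PORT pointing at port 9999.
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                // "my-connector" is a placeholder; use your connector's name.
                ObjectName bean = new ObjectName(
                        "kafka.connect:type=connector-metrics,connector=my-connector");
                Object status = mbsc.getAttribute(bean, "status");
                System.out.println("status = " + status); // e.g. running
            }
        }
    }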

If you want more than just the status, there are also additional metrics for both sink and source tasks (see the sketch after this list):

  • kafka.connect:type=source-task-metrics,connector="{connector}",task="{task}"
  • kafka.connect:type=sink-task-metrics,connector="{connector}",task="{task}"
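
Per-task status can be enumerated with an ObjectName wildcard; a short sketch under the same assumptions as above (JMX on localhost:9999, hypothetical connector my-connector):

    import java.util.Set;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class TaskStatusCheck {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                // task=* matches every task of the connector.
                ObjectName pattern = new ObjectName(
                        "kafka.connect:type=connector-task-metrics,connector=my-connector,task=*");
                Set<ObjectName> tasks = mbsc.queryNames(pattern, null);
                for (ObjectName task : tasks) {
                    Object status = mbsc.getAttribute(task, "status");
                    System.out.println("task " + task.getKeyProperty("task") + " = " + status);
                }
            }
        }
    }
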
Weig answered 11/5, 2018 at 11:48 Comment(4)
Thanks @Mickael! I checked the Connect monitoring docs before posting the question but couldn't find those metrics among the JMX metrics being emitted. Can you check this gist (gist.github.com/jigar1101/ba02fef8c8a8ccb7550b75e95db2abcd) and let me know where the above metrics would be in this output? – Chisel
There are no connector metrics in your output. Do you have any connectors in your Connect cluster? Which version of Connect are you running? – Weig
Yes, I have connectors deployed in my Connect cluster. The version is 1.1.0-cp1. Any example of what connector metrics would be emitted? The list I uploaded contains the metrics emitted when I make a curl request to the metrics API (without the numbers). – Chisel
After your response stating there were no connector metrics in my output, I checked the metrics again (I have updated the gist). Of the examples you mentioned above, connector-task-metrics and sink-task-metrics are now visible, but connector-metrics, which contains the connector status I was looking for, is still not present. Any suggestions on what I could be missing? – Chisel

I still don't have enough rep to comment, but I can answer...

Elaborating on Mickael's answer, be careful: currently, task metrics disappear when a task is in a failed state rather than showing up with a FAILED status. A Jira can be found here and a PR can be found here.
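
Until that fix, one hedged workaround is to treat a vanished task MBean as a failure signal: compare the number of task MBeans visible over JMX against the task count you expect. Everything below is a sketch with placeholders (localhost:9999, my-connector, and the expected count, which in practice could come from tasks.max or the Connect REST API):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class MissingTaskCheck {
        public static void main(String[] args) throws Exception {
            int expectedTasks = 3; // hypothetical; e.g. tasks.max from the connector config
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
            try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection mbsc = jmxc.getMBeanServerConnection();
                ObjectName pattern = new ObjectName(
                        "kafka.connect:type=connector-task-metrics,connector=my-connector,task=*");
                int visible = mbsc.queryNames(pattern, null).size();
                if (visible < expectedTasks) {
                    // Pre-2.6, a failed task's metrics vanish, so a missing
                    // MBean is the only JMX-side hint that the task failed.
                    System.out.println((expectedTasks - visible)
                            + " task(s) missing; treat as FAILED");
                }
            }
        }
    }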

Stencil answered 30/4, 2020 at 20:14 Comment(1)
This has been fixed since Kafka 2.6. – Weig

Connector status is available under kafka.connect:type=connector-metrics. With jmxterm, you may notice that the attributes are described as doubles even though their values are strings:

$>info
#mbean = kafka.connect:connector=dev-kafka-connect-mssql,type=connector-metrics
#class name = org.apache.kafka.common.metrics.JmxReporter$KafkaMbean
# attributes
  %0   - connector-class (double, r)
  %1   - connector-type (double, r)
  %2   - connector-version (double, r)
  %3   - status (double, r)

$>get status
#mbean = kafka.connect:connector=dev-kafka-connect-mssql,type=connector-metrics:
status = running;

This resulted in WARN logs from my monitoring agent:

    2018-05-23 14:35:53,966 | WARN | JMXAttribute | Unable to get metrics from kafka.connect:type=connector-metrics,connector=dev-kafka-connect-rabbitmq-orders - status
java.lang.NumberFormatException: For input string: "running"
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
        at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
        at java.lang.Double.parseDouble(Double.java:538)
        at org.datadog.jmxfetch.JMXAttribute.castToDouble(JMXAttribute.java:270)
        at org.datadog.jmxfetch.JMXSimpleAttribute.getMetrics(JMXSimpleAttribute.java:32)
        at org.datadog.jmxfetch.JMXAttribute.getMetricsCount(JMXAttribute.java:226)
        at org.datadog.jmxfetch.Instance.getMatchingAttributes(Instance.java:332)
        at org.datadog.jmxfetch.Instance.init(Instance.java:193)
        at org.datadog.jmxfetch.App.instantiate(App.java:604)
        at org.datadog.jmxfetch.App.init(App.java:658)
        at org.datadog.jmxfetch.App.main(App.java:140)

Each monitoring system may need a different fix, but I suspect the root cause is the same: the agent tries to cast the string-valued status attribute to a double.
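
One generic way around it, sketched below, is to read the attribute as a plain Object and map the status string to a numeric gauge yourself before handing it to the collector. The numeric encoding here is arbitrary, not a Kafka or Datadog convention:

    public class StatusGauge {
        // Map the string-valued "status" attribute to a number so that
        // collectors that cast every attribute to double don't choke.
        static double statusToGauge(Object status) {
            switch (String.valueOf(status).toLowerCase()) {
                case "running": return 1.0;
                case "paused":  return 0.5;
                case "failed":  return 0.0;
                default:        return -1.0; // unknown/unassigned
            }
        }

        public static void main(String[] args) {
            System.out.println(statusToGauge("running")); // 1.0
        }
    }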

Abranchiate answered 24/5, 2018 at 10:05 Comment(1)
Thanks for your input, but I couldn't find any logs related to the ones you mentioned. – Chisel
