What is a simple, effective way to debug custom Kafka connectors?

I'm working on a couple of Kafka connectors and I don't see any errors in their creation/deployment in the console output; however, I am not getting the result that I'm looking for (no results whatsoever, for that matter, desired or otherwise). I made these connectors based on Kafka's example FileStream connectors, so my debugging technique was based on the SLF4J Logger that is used in the examples. I've searched the console output for the log messages that I thought would be produced, but to no avail. Am I looking in the wrong place for these messages? Or perhaps is there a better way of going about debugging these connectors?

Example uses of the SLF4J Logger that I referenced for my implementation:

Kafka FileStreamSinkTask

Kafka FileStreamSourceTask
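
The pattern in those examples boils down to roughly the following (a simplified sketch; MySourceTask and the log messages are illustrative, not my actual connector code):

    import java.util.List;
    import java.util.Map;

    import org.apache.kafka.connect.source.SourceRecord;
    import org.apache.kafka.connect.source.SourceTask;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Illustrative task; the logging setup mirrors Kafka's FileStreamSourceTask.
    public class MySourceTask extends SourceTask {
        private static final Logger log = LoggerFactory.getLogger(MySourceTask.class);

        @Override
        public String version() {
            return "0.1";
        }

        @Override
        public void start(Map<String, String> props) {
            log.info("Starting task with props {}", props);
        }

        @Override
        public List<SourceRecord> poll() throws InterruptedException {
            // Only shows up if the Connect worker's log4j config allows DEBUG for this package.
            log.debug("Polling for records");
            return null; // null tells the framework there is nothing to read right now
        }

        @Override
        public void stop() {
            log.info("Stopping task");
        }
    }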

Borrell answered 16/8, 2017 at 15:32 Comment(0)

I will try to reply to your question in a broad way. A simple way to do Connector development could be as follows:

  • Structure and build your connector source code by looking at one of the many Kafka Connectors available publicly (you'll find an extensive list available here: https://www.confluent.io/product/connectors/ )
  • Download the latest Confluent Open Source edition (>= 3.3.0) from https://www.confluent.io/download/
  • Make your connector package available to Kafka Connect in one of the following ways:

    1. Store all your connector jar files (connector jar plus dependency jars, excluding Connect API jars) in a location on your filesystem and enable plugin isolation by adding this location to the plugin.path property in the Connect worker properties. For instance, if your connector jars are stored in /opt/connectors/my-first-connector, you will set plugin.path=/opt/connectors in your worker's properties (see below).
    2. Store all your connector jar files in a folder under ${CONFLUENT_HOME}/share/java. For example: ${CONFLUENT_HOME}/share/java/kafka-connect-my-first-connector. (Needs to start with kafka-connect- prefix to be picked up by the startup scripts). $CONFLUENT_HOME is where you've installed Confluent Platform.
  • Optionally, increase your logging by changing the log level for Connect in ${CONFLUENT_HOME}/etc/kafka/connect-log4j.properties to DEBUG or even TRACE.

  • Use Confluent CLI to start all the services, including Kafka Connect. Details here: http://docs.confluent.io/current/connect/quickstart.html

    Briefly: confluent start

Note: The Connect worker's properties file currently loaded by the CLI is ${CONFLUENT_HOME}/etc/schema-registry/connect-avro-distributed.properties. That's the file you should edit if you choose to enable classloading isolation but also if you need to change your Connect worker's properties.
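
For instance, with connector jars stored under /opt/connectors/my-first-connector as in the example above, the relevant additions would look roughly like this (the path and the package name are illustrative):

    # etc/schema-registry/connect-avro-distributed.properties (worker properties)
    plugin.path=/opt/connectors

    # etc/kafka/connect-log4j.properties (optional: more verbose logging for your own classes)
    log4j.logger.com.example.myconnector=DEBUG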

  • Once you have Connect worker running, start your connector by running:

    confluent load <connector_name> -d <connector_config.properties>

    or

    confluent load <connector_name> -d <connector_config.json>

    The connector configuration can be either in Java properties or JSON format; a minimal properties example appears right after this list.

  • Run confluent log connect to open the Connect worker's log file, or navigate directly to where your logs and data are stored by running

    cd "$( confluent current )"

Note: you can change where your logs and data are stored during a session of the Confluent CLI by setting the environment variable CONFLUENT_CURRENT appropriately. E.g., given that /opt/confluent exists and is where you want to store your data, run:

export CONFLUENT_CURRENT=/opt/confluent
confluent current

  • Finally, to interactively debug your connector, one possible way is to apply the following before starting Connect with the Confluent CLI:

    confluent stop connect
    export CONNECT_DEBUG=y; export DEBUG_SUSPEND_FLAG=y;
    confluent start connect

    and then connect with your debugger, for instance remotely to the Connect worker (default port: 5005). To stop running Connect in debug mode, just run unset CONNECT_DEBUG; unset DEBUG_SUSPEND_FLAG; when you are done.
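
For reference, a minimal connector configuration in properties format looks roughly like the stock FileStream quickstart config (all values below are illustrative):

    name=my-first-connector
    connector.class=FileStreamSource
    tasks.max=1
    file=/tmp/input.txt
    topic=connect-test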

I hope the above will make your connector development easier and ... more fun!

Antebi answered 16/8, 2017 at 17:25 Comment(3)
Hey @Konstantine Karantasis, thank you for the answer. I've gotten a chance to try out Confluent, but I've found myself stuck trying to load my connectors. Is there any more detail you could provide on that step? My connectors are packaged in a jar in the location you instructed: neo4k.filestream.source.Neo4jFileStreamSourceConnector and neo4k.sink.Neo4jSinkConnector; and have the following config files respectively located at ${CONFLUENT_HOME}/etc/kafka-connect-neo4j: connect-neo4j-file-source.properties and connect-neo4j-sink.properties.Borrell
Never mind, I got the load command to find the .properties file. It was just a matter of correct filepathing.Borrell
export KAFKA_DEBUG=y; Setting KAFKA_DEBUG worked for me instead of CONNECT_DEBUG.Lorenzen

I love the accepted answer. One thing: the environment variables didn't work for me. I'm using Confluent Community Edition 5.3.1.

Here's what I did that worked.

I installed the Confluent CLI from here: https://docs.confluent.io/current/cli/installing.html#tarball-installation

I ran Confluent using the command confluent local start

I got the Connect process details using the command ps -ef | grep connect

I copied the resulting command into an editor and added this argument (right after java):

-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005

Then I stopped Connect using the command confluent local stop connect

Then I ran the connect command with the added argument.

Brief intermission ---

VS Code development is led by Erich Gamma, of Gang of Four fame, who also worked on Eclipse. VS Code is becoming a first-class Java IDE; see https://en.wikipedia.org/wiki/Erich_Gamma

Intermission over ---

Next I launched VS Code and opened the Debezium Oracle connector folder (cloned from https://github.com/debezium/debezium-incubator).

Then I chose Debug - Open Configurations and entered a debugging configuration that attaches to the Connect JVM (a sketch follows below).

Then I ran the debugger, and it hit my breakpoints!
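
The attach configuration from the original screenshots isn't preserved here, but a standard VS Code Java attach configuration for this setup would look roughly like the following entry in launch.json (the port matches the -agentlib argument above):

    {
        "type": "java",
        "name": "Attach to Kafka Connect",
        "request": "attach",
        "hostName": "localhost",
        "port": 5005
    }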

The connect command should look something like this:

/Library/Java/JavaVirtualMachines/jdk1.8.0_221.jdk/Contents/Home/bin/java -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005 -Xms256M -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/var/folders/yn/4k6t1qzn5kg3zwgbnf9qq_v40000gn/T/confluent.CYZjfRLm/connect/logs -Dlog4j.configuration=file:/Users/myuserid/confluent-5.3.1/bin/../etc/kafka/connect-log4j.properties -cp /Users/myuserid/confluent-5.3.1/share/java/kafka/*:/Users/myuserid/confluent-5.3.1/share/java/confluent-common/*:/Users/myuserid/confluent-5.3.1/share/java/kafka-serde-tools/*:/Users/myuserid/confluent-5.3.1/bin/../share/java/kafka/*:/Users/myuserid/confluent-5.3.1/bin/../support-metrics-client/build/dependant-libs-2.12.8/*:/Users/myuserid/confluent-5.3.1/bin/../support-metrics-client/build/libs/*:/usr/share/java/support-metrics-client/* org.apache.kafka.connect.cli.ConnectDistributed /var/folders/yn/4k6t1qzn5kg3zwgbnf9qq_v40000gn/T/confluent.CYZjfRLm/connect/connect.properties
Compander answered 25/10, 2019 at 17:39 Comment(1)
Environment variable should work but it's KAFKA_DEBUG github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh#L220Disseminule

The connector module is executed by the Kafka Connect framework. For debugging, we can use standalone mode and configure the IDE to use the ConnectStandalone main class as the entry point.

  1. Create a debug configuration as follows (a sketch of what it amounts to appears after this list). Remember to tick "Include dependencies with 'Provided' scope" if it is a Maven project.

  2. The connector properties file needs to specify the connector class name via "connector.class" for debugging.

  3. The worker properties file can be copied from the Kafka folder, e.g. /usr/local/etc/kafka/connect-standalone.properties.
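
The IDE run configuration amounts to launching the standalone entry point with the two properties files as program arguments (worker properties first; file names below are illustrative):

    Main class:        org.apache.kafka.connect.cli.ConnectStandalone
    Program arguments: connect-standalone.properties my-connector.properties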
Spoofery answered 26/3, 2020 at 7:6 Comment(0)

Looking through the Kafka startup scripts led me to this command line:

KAFKA_DEBUG=1 DEBUG_SUSPEND_FLAG=y bin/connect-standalone.sh config/connect-standalone.properties config/my-connector.properties

The KAFKA_DEBUG and DEBUG_SUSPEND_FLAG env vars were enough to get the JVM to pause on start and allow me to connect from IntelliJ.

The version of IntelliJ I'm using found the process and offered it in a list, so I didn't need to configure anything.
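
If the default debug port (5005) is already taken, kafka-run-class.sh also honors JAVA_DEBUG_PORT (and JAVA_DEBUG_OPTS to replace the agent line entirely), so something like this should work as well:

    KAFKA_DEBUG=1 JAVA_DEBUG_PORT=5006 DEBUG_SUSPEND_FLAG=y bin/connect-standalone.sh config/connect-standalone.properties config/my-connector.properties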

Peashooter answered 22/3 at 2:29 Comment(0)

This may help you: KAFKA_DEBUG=Y DEBUG_SUSPEND_FLAG=y doesn't work when Kafka runs inside Docker Compose. See my answer in Kafka Connect in debug mode.
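
In a Docker Compose setup, one common workaround (not necessarily the approach in the linked answer) is to pass the JDWP agent to the container's JVM via the generic JAVA_TOOL_OPTIONS variable and publish the debug port; a rough sketch of the relevant service fragment:

    connect:
      environment:
        # On Java 9+ the wildcard is needed to listen on all interfaces; on Java 8 use address=5005.
        JAVA_TOOL_OPTIONS: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:5005"
      ports:
        - "5005:5005"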

Lancelle answered 25/4 at 21:45 Comment(0)
