flink - adding instrumentation

2

7

I want to add NewRelic instrumentation to my Flink jobs. I don't see a way to pass an additional classpath or other JVM parameters to the bin/flink run <job> command.

The NewRelic Java agent needs -javaagent:<path to jar> added to the JVM command line. Passing in the path to its config file is advisable as well.


Edit:

I added this line to my conf/flink-conf.yaml on all (3) cluster machines:

env.java.opts: "-javaagent:/opt/newrelic/newrelic.jar -Dnewrelic.config.file=/opt/newrelic/newrelic.yml"

When I start the cluster, only the job manager starts. The task manager doesn't start on any of the machines.

The only way I've found so far to add instrumentation is to change the command line at the end of bin/flink to include the above parameters. This works, except that the shell session that ran the command has to stay open. If you exit, the Flink job keeps going but the NewRelic agent quits.

Debug answered 22/11, 2015 at 13:57 Comment(8)
What does the JobManager log say when you start it with the modified flink-conf.yaml? - Thorley
No error messages. The UI shows no task slots or task managers available. - Debug
My apologies - I had the task and job managers backwards in my edit. - Debug
OK, what does the taskmanager.log say on the different machines? - Thorley
No log file is created. - Debug
Could you try starting a TaskManager manually, with env.java.opts set, on the TaskManager nodes via taskmanager.sh start? You'll find the script in the bin/ directory. If there are no logs, this means the JVM has trouble starting, which is really odd. - Thorley
Let us continue this discussion in chat. - Debug
@Debug trying to do something similar to find memory issues in Flink? Did adding the NewRelic agent work for you? - Undeniable
4

You can pass additional JVM start-up parameters via the env.java.opts config value which you can set in Flink's configuration file flink-conf.yaml.
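A minimal sketch of applying that setting from a shell, assuming the agent paths from the question. The helper name `set_java_opts` is hypothetical, not part of Flink. It replaces any existing `env.java.opts` line instead of appending a duplicate, and writes the value unquoted, as a later answer suggests:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink): set env.java.opts in a flink-conf.yaml.
# Usage: set_java_opts <path/to/flink-conf.yaml> <jvm options...>
set_java_opts() {
  local conf="$1"; shift
  local opts="$*"
  local tmp
  tmp="$(mktemp)"
  # Create the file if it does not exist yet.
  [ -f "$conf" ] || touch "$conf"
  # Keep every line except a previous env.java.opts entry.
  grep -v '^env\.java\.opts:' "$conf" > "$tmp" || true
  # Append the new entry, unquoted.
  printf 'env.java.opts: %s\n' "$opts" >> "$tmp"
  # Overwrite in place so the original file's permissions are kept.
  cat "$tmp" > "$conf"
  rm -f "$tmp"
}
```

It could be called as `set_java_opts conf/flink-conf.yaml -javaagent:/opt/newrelic/newrelic.jar -Dnewrelic.config.file=/opt/newrelic/newrelic.yml` on each machine. The cluster has to be restarted afterwards, since the options only take effect when the JVMs are launched.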

Thorley answered 22/11, 2015 at 14:19 Comment(2)
Is there a way to set this on a per-job basis? Each topology will have a distinct newrelic.yml file. - Debug
If you run the jobs sequentially on the same Flink cluster, then this is not possible. The reason is that the TaskManager and the JobManager are reused for the different jobs. However, if you use YARN, you can start the jobs in single-job mode, which starts up a YARN cluster for each job and terminates it when the job is done. For each of these sessions you can define the configuration settings via dynamic properties, -Denv.java.opts=xyz (see ci.apache.org/projects/flink/flink-docs-release-0.10/setup/…) - Thorley
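As a rough sketch of the single-job YARN idea in the comment above: the helper below only prints a submission command, it does not run it. It assumes the `-m yarn-cluster` mode of `bin/flink run` and the `-yD` form of the dynamic-property flag from the 0.10-era CLI. The helper name `yarn_job_cmd`, the container count `-yn 2`, the jar name, and the per-topology config file name are all hypothetical. Whether a value containing spaces survives dynamic-property parsing should be verified on your Flink version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a per-job YARN submission command
# that carries job-specific agent settings as a dynamic property.
# Assumption: "flink run -m yarn-cluster" accepts -yD key=value.
# Usage: yarn_job_cmd <agent jar> <agent config> <job jar>
yarn_job_cmd() {
  local agent_jar="$1" agent_conf="$2" job_jar="$3"
  local opts="-javaagent:${agent_jar} -Dnewrelic.config.file=${agent_conf}"
  # -yn 2 is an arbitrary container count chosen for illustration.
  printf '%s\n' "bin/flink run -m yarn-cluster -yn 2 -yD env.java.opts=\"${opts}\" ${job_jar}"
}

# Example: one config file per topology (file and jar names are made up).
yarn_job_cmd /opt/newrelic/newrelic.jar /opt/newrelic/topology-a.yml my-job.jar
```

Since each job gets its own short-lived cluster in this mode, each submission can point at a different newrelic.yml. The agent jar and config file still need to be readable at those paths on the YARN nodes.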
2

First, remove the quotes around the value (the right-hand side):

env.java.opts: -javaagent:/opt/newrelic/newrelic.jar -Dnewrelic.config.file=/opt/newrelic/newrelic.yml

Also make sure that you put the files in Flink's "lib" directory and rewrite the setting as:

env.java.opts: -javaagent:lib/newrelic.jar -Dnewrelic.config.file=lib/newrelic.yml

All the files in the "lib" directory are copied to the job manager and task managers and are available under the relative path "./lib".
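A small pre-flight check along these lines can catch a missing file before the cluster is started. This is only a sketch: the function name `check_agent_in_lib` is hypothetical, and the file names are the ones used in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm the agent files are readable in
# Flink's lib directory, then print the matching relative-path setting.
# Usage: check_agent_in_lib <flink home>
check_agent_in_lib() {
  local flink_home="$1"
  local f
  for f in newrelic.jar newrelic.yml; do
    if [ ! -r "${flink_home}/lib/${f}" ]; then
      echo "missing: ${flink_home}/lib/${f}" >&2
      return 1
    fi
  done
  echo "env.java.opts: -javaagent:lib/newrelic.jar -Dnewrelic.config.file=lib/newrelic.yml"
}
```

Running `check_agent_in_lib /path/to/flink` on each machine prints the line to put in flink-conf.yaml, or reports the first missing file on stderr. The relative paths only resolve if the JVM's working directory is the Flink home, which this sketch does not verify.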

Hemispheroid answered 24/10, 2017 at 16:48 Comment(1)
Happy to hear that. - Hemispheroid
