How Can Apache NiFi Flow Be Tested?

We have started using NiFi for a lot of our data pipeline jobs. One of the things that is challenging in NiFi is regression testing changes to the flows.

What are the common ways to handle unit and functional testing of NiFi flows? Are there any frameworks?

Ditmore answered 16/7, 2019 at 17:2 Comment(0)

There is a lot that can be written on this topic, but I'll try to keep it focused and brief.

  • Unit testing
    • The NiFi framework comes with extensive testing utilities for the framework itself as well as individual processors. You can examine the test code of any bundled processor to see common test patterns (testing a specific logic method vs. testing the execution of arbitrary flowfiles through the TestRunner mock execution). Many mock classes and services are available to streamline these tests. Example: TestEncryptContent
    • Groovy unit testing and Spock are also supported as test frameworks to allow for descriptive scenarios. Example: StandardHttpResponseMapperSpec
  • Integration testing
    • You can also build dynamic flows in test code (i.e. configure multiple processors and connections) and then pass in arbitrary data to evaluate behavior. Building the flow programmatically may take some time at first, but once complete, you'll have a repeatable flow definition you can use with many different input characteristics. Example: ITestHandleHttpRequest
    • You can test the application of variables, etc. on process groups. Example: StandardProcessGroupIT
    • You can use Docker containers to test dependent services like MongoDB, etc. Some OS-integration features are tested with containers using TestContainers. Example: ShellUserGroupProviderIT
  • Smoke testing
    • You can have a special bucket in your NiFi Registry which contains "test flows" used to establish baselines on a new/upgraded NiFi instance. Perhaps one flow tries to exhaust memory, another network, another CPU via heavy processing, etc. You can deploy these versioned flows onto a new system and run them to determine performance in common known scenarios.
    • You can replay specific flowfiles through a flow after modifying it to gather more information during flow development, tighten the feedback loop, and verify expected behavior. NiFi User Guide - Replaying a Flowfile
    • You can use GenerateFlowFile to mock static or dynamic flowfile content and attributes, which you can feed into a process group where the "flow under test" (FUT) is deployed. From the FUT's perspective, this is no different from a production scenario. When the flow is updated, the same GenerateFlowFile processor can be used to "verify" the new behavior; it can then be disabled and the "production" input connection dragged onto the same Input Port. More examples in my presentation BYOP: Custom Processor Development with Apache NiFi (slides)
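
As a rough illustration of the TestRunner pattern mentioned under unit testing, here is a minimal sketch. TagProcessor and the "tagged" attribute are hypothetical names invented for this example. It assumes the nifi-mock artifact (and the nifi-api classes it depends on) is on the test classpath. In a real project these checks would normally live in a JUnit test rather than a main method.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;
import org.apache.nifi.util.MockFlowFile;
import org.apache.nifi.util.TestRunner;
import org.apache.nifi.util.TestRunners;

public class TestRunnerSketch {

    // Hypothetical processor, defined here only so the example is complete.
    // It adds a "tagged" attribute and routes every flowfile to "success".
    public static class TagProcessor extends AbstractProcessor {
        static final Relationship REL_SUCCESS = new Relationship.Builder()
                .name("success")
                .description("FlowFiles that were tagged")
                .build();

        @Override
        public Set<Relationship> getRelationships() {
            return Collections.singleton(REL_SUCCESS);
        }

        @Override
        public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
            FlowFile flowFile = session.get();
            if (flowFile == null) {
                return; // nothing queued
            }
            flowFile = session.putAttribute(flowFile, "tagged", "true");
            session.transfer(flowFile, REL_SUCCESS);
        }
    }

    // Runs one flowfile through the processor using the mock framework
    // and returns the single flowfile routed to "success".
    public static MockFlowFile runOnce(String content) {
        TestRunner runner = TestRunners.newTestRunner(new TagProcessor());
        runner.enqueue(content.getBytes(StandardCharsets.UTF_8));
        runner.run();
        runner.assertAllFlowFilesTransferred(TagProcessor.REL_SUCCESS, 1);
        return runner.getFlowFilesForRelationship(TagProcessor.REL_SUCCESS).get(0);
    }

    public static void main(String[] args) {
        MockFlowFile out = runOnce("hello");
        out.assertAttributeEquals("tagged", "true"); // attribute was added
        out.assertContentEquals("hello");            // content is untouched
        System.out.println("TagProcessor check passed");
    }
}
```

The same enqueue / run / assert sequence can be repeated with different input content and attributes, which is what makes it usable as a regression check when a processor changes.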
Apollyon answered 16/7, 2019 at 17:48 Comment(3)
Thanks Andy. I will start looking into each one of these. - Ditmore
Hello Andy, could you elaborate on how to use the TestRunner with multiple processors? For example, two processors that share the same (stateless) controller service. Having to repeat the controller service configuration for each processor's TestRunner leads to a lot of duplicated code. - Jestude
@Jestude I have had the same issue before too, and I fixed it with a static TestHelper class that controls the logic for instantiating the controller services. - Californium
