Apache NiFi - Extract Attributes From Avro

2

7

I'm trying to get my head around extracting attributes from Avro and JSON. I can extract attributes from JSON by using the EvaluateJsonPath processor. I'm trying to do the same with Avro, but I'm not sure whether it is achievable.

Here is my flow: ExecuteSQL -> SplitAvro -> UpdateAttribute

UpdateAttribute is the processor where I want to extract the attributes. Below is a snapshot of the UpdateAttribute processor:

[Image: UpdateAttribute processor configuration]

So, my basic question is: can we extract attributes from Avro? If yes, please point me to the right approach. Or is it always necessary to use ConvertAvroToJSON before extracting the attributes?

Klystron answered 27/2, 2017 at 22:7 Comment(0)
14

Currently, there is no way in NiFi to extract attributes directly from Avro (there is not yet an AvroPath analogous to XPath for XML or JsonPath for JSON), so, as you said, you can use ConvertAvroToJSON before extracting the attributes.

Alternatively, I wrote a Groovy script for use in an ExecuteScript processor. It takes "Avro path" values as dynamic properties (each starting with avro.path, and whose value is really a JsonPath expression), converts the Avro to JSON in memory, and requires that you download and point to the Avro JARs. I can post it here if you are interested, but its only real advantage is that it keeps the flow file content in Avro. Although it might be annoying, you could use ConvertAvroToJSON -> EvaluateJsonPath -> ConvertJSONToAvro as the workaround.
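For illustration, the idea behind such a script can be sketched in a few lines of Python. This is not the Groovy script mentioned above, only a minimal sketch under stated assumptions: the Avro record is assumed to be already decoded into a plain dict (decoding with an Avro library is not shown), the exact `avro.path.` prefix, the property names, and the choice to name each attribute after the remainder of the property name are all hypothetical, and the tiny evaluator handles only dotted paths such as `$.a.b`, not full JsonPath.

```python
from typing import Any, Dict

# Assumed prefix for the dynamic properties (hypothetical, for illustration).
PREFIX = "avro.path."


def eval_simple_path(record: Dict[str, Any], path: str) -> Any:
    """Evaluate a dotted path like '$.customer.name' against a decoded record.

    Only plain field access is supported; this is not a JsonPath engine.
    Returns None when any step of the path is missing.
    """
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    current: Any = record
    for step in path[1:].split("."):
        if not step:
            continue  # skip the empty piece produced by the leading '.'
        if not isinstance(current, dict) or step not in current:
            return None
        current = current[step]
    return current


def extract_attributes(record: Dict[str, Any],
                       properties: Dict[str, str]) -> Dict[str, str]:
    """Build flow-file-style attributes from properties that carry the prefix."""
    attributes: Dict[str, str] = {}
    for name, path in properties.items():
        if not name.startswith(PREFIX):
            continue  # ignore properties that are not path definitions
        value = eval_simple_path(record, path)
        if value is not None:
            # Attributes are strings, so the extracted value is stringified.
            attributes[name[len(PREFIX):]] = str(value)
    return attributes


# Hypothetical record and properties, for demonstration only.
record = {"id": 42, "customer": {"name": "Ada"}}
properties = {
    "avro.path.id": "$.id",
    "avro.path.customer_name": "$.customer.name",
    "unrelated.property": "ignored",
}
attributes = extract_attributes(record, properties)
```

The record itself is left untouched, which mirrors the stated advantage of the script: the content stays in Avro and only the attributes are derived from it.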

Rafaelarafaelia answered 28/2, 2017 at 3:2 Comment(3)
Thanks again for helping. I will use ConvertAvroToJSON -> EvaluateJsonPath -> ConvertJSONToAvro as the workaround. – Klystron
Couldn't you also extract the attributes from the JSON content and then merge the attributes back onto the flowfile with the Avro content? – Schoenfeld
@Andy, I can extract the attributes from JSON using EvaluateJsonPath, but I have never tried to merge the attributes with the Avro content. I was trying to extract attributes from both Avro and JSON. – Klystron
2

Maybe use PartitionRecord instead of the SplitAvro and UpdateAttribute processors. It partitions your records based on the attributes you provide, so there is no need for explicit splitting; alternatively, you can do the splitting later in the flow.

E.g., for the setup in OP's question (ExecuteSQL -> SplitAvro -> UpdateAttribute):

[Screenshot not preserved]

And you can configure PartitionRecord like below, with a corresponding RecordPath for each attribute:

[Screenshot not preserved]
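To make the behavior concrete, here is a small Python sketch of what partitioning by RecordPath does conceptually. It is not NiFi code and not the configuration from the answer: records that share the same values for the configured paths land in the same outgoing group, and those values become attributes on that group. The field names (`country`, `id`) are hypothetical, and the path handling covers only simple top-level paths such as `/country`.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]
Group = Tuple[Dict[str, str], List[Record]]


def partition_records(records: List[Record],
                      paths: Dict[str, str]) -> List[Group]:
    """Group records by the values found at each configured path.

    `paths` maps an attribute name to a simple path like '/country'.
    Returns (attributes, records) pairs in first-seen order.
    """
    groups: Dict[Tuple[Tuple[str, str], ...], Group] = {}
    for record in records:
        attrs: Dict[str, str] = {}
        for attr_name, path in paths.items():
            field = path.lstrip("/")  # only top-level fields are handled here
            value = record.get(field)
            if value is not None:
                attrs[attr_name] = str(value)
        # Records with identical attribute values share one group.
        key = tuple(sorted(attrs.items()))
        groups.setdefault(key, (attrs, []))[1].append(record)
    return list(groups.values())


# Hypothetical rows, standing in for the output of ExecuteSQL.
rows = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "US"},
]
partitions = partition_records(rows, {"country": "/country"})
```

Each resulting group corresponds to one outgoing flow file whose attributes can be used for routing, while the records themselves stay in their original format.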

Audacious answered 21/11, 2023 at 5:18 Comment(0)
