How do we Write a Apache beam pipeline output to a variable instead of a file?
Asked Answered
A

0

7

I need to process some values in a data pipeline and need to use the value later somewhere in the program.

Here is a simple example

import apache_beam as beam

p = beam.Pipeline()

resu=(
    p
    | beam.Create([1,3,5,3,5,3])
    | beam.CombineGlobally(beam.combiners.MeanCombineFn())
    | beam.io.WriteToText("result.txt")
)

p.run()

Now the mean value is calculated and put into the file "result.txt". If I need to use the mean value later in the program I need to do a file io operation. I want to have the result come in memory as a variable instead. How do I achieve this?

something like

mean_value=resu.values()
# use mean_value as a regular variable
some_other_value=mean_value/2
Arturo answered 21/3, 2020 at 11:6 Comment(3)
did you got the answer?Heedful
I believe this cannot be done.Arturo
One thing can be done using something called beam.pvalue.AsSingleton to provide the value as side input to another Map/ParDo Refer: linkHeedful

© 2022 - 2024 — McMap. All rights reserved.