I'm hoping someone can clarify the relationship between TFX (TensorFlow Extended) and its dependencies (Beam, Airflow, Flink, etc.).
I'm referencing the main TFX guide, for example https://www.tensorflow.org/tfx/guide#creating_a_tfx_pipeline_with_airflow
In the examples, I see three variants:
https://github.com/tensorflow/tfx/tree/master/tfx/examples/chicago_taxi_pipeline
taxi_pipeline_flink.py, taxi_pipeline_kubeflow.py, taxi_pipeline_simple.py
Beam Example?
There is no "Beam" example and little documentation describing its use.
Is it correct to assume that taxi_pipeline_simple.py would run even if Airflow weren't installed? I think not, since it uses AirflowDAGRunner. If not, can you run TFX with only Beam and its runner? If so, why is there no example of that?
Flink Example
In taxi_pipeline_flink.py, AirflowDAGRunner is used. I assume that means Airflow acts as the orchestrator, which in turn uses Flink as its executor. Is that correct?
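To make that assumption concrete, here is a toy sketch of the layering I have in mind. It is purely illustrative: `Orchestrator`, `Component`, `local_backend`, and `cluster_backend` are hypothetical names I made up, not TFX, Airflow, Beam, or Flink APIs, and the real libraries may divide the work differently.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for whatever actually crunches the data.
Backend = Callable[[List[int]], List[int]]


def local_backend(rows: List[int]) -> List[int]:
    """Stands in for an in-process runner (assumption, not a real API)."""
    return [r * 2 for r in rows]


def cluster_backend(rows: List[int]) -> List[int]:
    """Stands in for a remote cluster such as Flink (assumption)."""
    return [r * 2 for r in rows]


class Component:
    """Hypothetical pipeline step that delegates data processing to a backend."""

    def __init__(self, name: str, backend: Backend) -> None:
        self.name = name
        self.backend = backend

    def run(self, rows: List[int]) -> List[int]:
        return self.backend(rows)


class Orchestrator:
    """Hypothetical scheduler: it only decides the order in which steps run."""

    def __init__(self, components: List[Component]) -> None:
        self.components = components

    def run(self, rows: List[int]) -> Tuple[List[int], List[str]]:
        executed = []
        for component in self.components:
            rows = component.run(rows)  # the data work happens in the backend
            executed.append(component.name)
        return rows, executed


pipeline = Orchestrator([
    Component("transform", local_backend),
    Component("train", cluster_backend),
])
result, order = pipeline.run([1, 2, 3])
```

In this picture the orchestrator never touches the data itself, and swapping `local_backend` for `cluster_backend` changes where the processing happens without changing the schedule. What I can't tell from the docs is whether that is how Airflow, Beam, and Flink actually relate in TFX.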
Airflow Example
The page states that Beam is a required dependency, yet Airflow doesn't have Beam as one of its executors. It only has SequentialExecutor, LocalExecutor, CeleryExecutor, DaskExecutor, and KubernetesExecutor. So is Beam only needed when not using Airflow? If it is required even when using Airflow, what is its purpose there?
Thank you for any insights.