I have a saved PipelineModel:
pipe_model = pipe.fit(df_train)
pipe_model.write().overwrite().save("/user/pipe_text_2")
And now I want to add to this Pipe a new already fited PipelineModel:
pipe_model = PipelineModel.load("/user/pipe_text_2")
df2 = pipe_model.transform(df1)
kmeans = KMeans(k=20)
pipe2 = Pipeline(stages=[kmeans])
pipe_model2 = pipe2.fit(df2)
Is that possible without fitting it again? In order to obtain a new PipelineModel but not a new Pipeline. The ideal thing would be the following:
pipe_model_new = pipe_model + pipe_model2
TypeError: unsupported operand type(s) for +: 'PipelineModel' and 'PipelineModel'
I've found Join two Spark mllib pipelines together but with this solution you need to fit the whole Pipe again. That is what I'm trying to avoid.