I want to update my code of pyspark. In the pyspark, it must put the base model in a pipeline, the office demo of pipeline use the LogistictRegression as an base model. However, it seems not be able to use XGboost model in the pipeline api. How can I use the pyspark like this
from xgboost import XGBClassifier
...
model = XGBClassifier()
model.fit(X_train, y_train)
pipeline = Pipeline(stages=[..., model, ...])
...
It is convenient to use the pipeline api, so can anybody give some advices? Thanks.