I have an sklearn k-means model. I am training the model and saving it in a pickle file so I can deploy it later using azure ml library. The model that I am training uses a custom Feature Encoder called MultiColumnLabelEncoder. The pipeline model is defined as follow :
# Pipeline
kmeans = KMeans(n_clusters=3, random_state=0)
pipe = Pipeline([
("encoder", MultiColumnLabelEncoder()),
('k-means', kmeans),
])
#Training the pipeline
model = pipe.fit(visitors_df)
prediction = model.predict(visitors_df)
#save the model in pickle/joblib format
filename = 'k_means_model.pkl'
joblib.dump(model, filename)
The model saving works fine. The Deployment steps are the same as the steps in this link :
However the deployment always fails with this error :
File "/var/azureml-server/create_app.py", line 3, in <module>
from app import main
File "/var/azureml-server/app.py", line 27, in <module>
import main as user_main
File "/var/azureml-app/main.py", line 19, in <module>
driver_module_spec.loader.exec_module(driver_module)
File "/structure/azureml-app/score.py", line 22, in <module>
importlib.import_module("multilabelencoder")
File "/azureml-envs/azureml_b707e8c15a41fd316cf6c660941cf3d5/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'multilabelencoder'
I understand that pickle/joblib has some problems unpickling the custom function MultiLabelEncoder. That's why I defined this class in a separate python script (which I executed also). I called this custom function in the training python script, in the deployment script and in the scoring python file (score.py). The importing in the score.py file is not successful. So my question is how can I import custom python module to azure ml deployment environment ?
Thank you in advance.
EDIT: This is my .yml file
name: project_environment
dependencies:
# The python interpreter version.
# Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2
- pip:
- multilabelencoder==1.0.4
- scikit-learn
- azureml-defaults==1.0.74.*
- pandas
channels:
- conda-forge