I am trying to deploy an existing Scikit-Learn model in Amazon SageMaker, i.e. a model that was trained locally on my machine rather than on SageMaker.
On my local (Windows) machine I saved my model as model.joblib and packed it into model.tar.gz.
Next, I uploaded this archive to my S3 bucket ('my_bucket') at the path s3://my_bucket/models/model.tar.gz. I can see the tar file in S3.
But when I try to deploy the model, I keep getting the error message "Failed to extract model data archive".
The .tar.gz was generated on my local machine by running 'tar -czf model.tar.gz model.joblib' in a PowerShell command window.
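For comparison, the same archive could also be built from Python instead of the PowerShell tar command. This is only a minimal sketch using the standard library's tarfile module; the helper name make_model_archive is hypothetical, and whether it behaves differently from the Windows tar binary in this case is an assumption I have not verified.

```python
import tarfile
from pathlib import Path


def make_model_archive(model_path, archive_path="model.tar.gz"):
    """Pack a single model file into a gzip-compressed tar archive.

    Hypothetical helper for illustration; not part of any SageMaker API.
    """
    model_path = Path(model_path)
    # "w:gz" explicitly requests gzip compression from Python's tarfile,
    # independent of whichever tar binary is installed on the machine.
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname stores the file at the archive root, without local folders.
        tar.add(str(model_path), arcname=model_path.name)
    return Path(archive_path)
```

Calling make_model_archive("model.joblib") would write model.tar.gz to the current directory with model.joblib as its only member.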
The code for uploading to S3:
import boto3
s3 = boto3.client("s3",
                  region_name='eu-central-1',
                  aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)
s3.upload_file(Filename='model.tar.gz', Bucket=my_bucket, Key='models/model.tar.gz')
The code for creating the estimator and deploying:
import boto3
from sagemaker.sklearn.estimator import SKLearnModel
...
model_data = 's3://my_bucket/models/model.tar.gz'
sklearn_model = SKLearnModel(model_data=model_data,
                             role=role,
                             entry_point="my-script.py",
                             framework_version="0.23-1")
predictor = sklearn_model.deploy(instance_type="ml.t2.medium", initial_instance_count=1)
The error message:
UnexpectedStatusException: Error hosting endpoint sagemaker-scikit-learn-2021-01-24-17-24-42-204: Failed. Reason: Failed to extract model data archive for container "container_1" from URL "s3://my_bucket/models/model.tar.gz". Please ensure that the object located at the URL is a valid tar.gz archive
Is there a way to see why the archive is invalid?
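One local check that might at least narrow this down is to inspect the file before uploading it. The sketch below is a minimal example under my own assumptions: inspect_archive is a hypothetical helper, it only looks at the local file (not at what SageMaker does during extraction), and it tests two things: whether the file starts with the gzip magic bytes, and which members Python's tarfile module can list.

```python
import tarfile

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def inspect_archive(path):
    """Return (is_gzip, member_names) for a local archive file.

    Hypothetical helper for illustration. Raises tarfile.ReadError if the
    file cannot be read as a tar archive at all.
    """
    with open(path, "rb") as f:
        is_gzip = f.read(2) == GZIP_MAGIC
    # "r:*" lets tarfile detect the compression itself, so a plain
    # (uncompressed) tar is still listed instead of failing outright.
    with tarfile.open(path, "r:*") as tar:
        names = tar.getnames()
    return is_gzip, names
```

If inspect_archive("model.tar.gz") reports is_gzip as False while still listing model.joblib, the file would be a plain tar with a .tar.gz name, which would be one possible explanation for the extraction error.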