I want to get real-time predictions from my machine learning model with the help of SageMaker. I want to get inferences directly on my website. How can I use the deployed model for predictions?
SageMaker endpoints are not publicly exposed to the Internet, so you'll need some way of creating a public HTTP endpoint that can route requests to your SageMaker endpoint. One way you can do this is with an AWS Lambda function fronted by API Gateway.
I created an example web app that takes webcam images and passes them on to a SageMaker endpoint for classification. This uses the API Gateway -> Lambda -> SageMaker endpoint strategy that I described above. You can see the whole example, including instructions for how to set up the Lambda function (and the code to put in it), at this GitHub repository: https://github.com/gabehollombe-aws/webcam-sagemaker-inference/
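To make the pattern concrete, here is a minimal sketch of what the Lambda function behind API Gateway might look like in Python. The ENDPOINT_NAME environment variable, the use of a Lambda proxy integration, and the JSON content type are illustrative assumptions, not details from the repository above.

import os

import boto3

# Client for the SageMaker runtime API, used to invoke the deployed endpoint
sm_runtime = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    # With a Lambda proxy integration, API Gateway passes the raw
    # HTTP request body through as a string on the event
    payload = event["body"]

    # ENDPOINT_NAME is a hypothetical environment variable configured
    # on the Lambda function; set it to your endpoint's name
    response = sm_runtime.invoke_endpoint(
        EndpointName=os.environ["ENDPOINT_NAME"],
        ContentType="application/json",
        Body=payload,
    )

    # Return the model's response to API Gateway
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": response["Body"].read().decode(),
    }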
You can invoke the endpoint with the AWS CLI like this:
aws sagemaker-runtime invoke-endpoint \
    --endpoint-name <endpoint-name> \
    --body '{"instances": [{"in0":[863],"in1":[882]}]}' \
    --content-type application/json \
    --accept application/json \
    results
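The final argument, results, is the output file: invoke-endpoint writes the response body to that file, so you can inspect the prediction afterwards (for example with cat results).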
I found it in a tutorial about accessing SageMaker via API Gateway.
You can invoke the SageMaker endpoint using API Gateway or Lambda.
Lambda:
Use the AWS SDK (the SageMaker runtime client) to invoke the endpoint from a Lambda function.
API Gateway:
Use API Gateway with an AWS service proxy integration to pass request parameters directly to the endpoint, as sketched below.
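For the service proxy route, the integration can be set up in the console or programmatically. Below is a rough Python sketch using boto3's API Gateway client; the REST API ID, resource ID, role ARN, endpoint name, and the exact integration URI format are placeholder assumptions to verify against the AWS documentation, not details given in this answer.

import boto3

apigw = boto3.client("apigateway")

# All identifiers below are hypothetical placeholders
region = "us-east-1"
endpoint_name = "my-endpoint"

apigw.put_integration(
    restApiId="abc123",           # your REST API ID
    resourceId="def456",          # the resource behind e.g. POST /predict
    httpMethod="POST",
    type="AWS",                   # AWS service proxy (no Lambda in between)
    integrationHttpMethod="POST",
    # Service proxy URI targeting the SageMaker runtime InvokeEndpoint API;
    # verify this format against the AWS docs for your region
    uri=(
        f"arn:aws:apigateway:{region}:runtime.sagemaker:"
        f"path//endpoints/{endpoint_name}/invocations"
    ),
    # IAM role that allows API Gateway to call sagemaker:InvokeEndpoint
    credentials="arn:aws:iam::123456789012:role/apigw-sagemaker-role",
)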
Documentation with example:
Hope it helps.
As other answers have mentioned, your best option is fronting the SageMaker endpoint with a REST API in API Gateway. The API then lets you control authorisation and 'hides' the backend SageMaker endpoint from API clients, lowering the coupling between API clients (your website) and your backend. (By the way, you don't need a Lambda function there, you can directly integrate the REST API with SageMaker as a backend).
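Whichever variant you choose, your website then simply calls the public API Gateway URL. As a quick end-to-end check, the client-side call could look like the following Python sketch; the invoke URL and path are made-up placeholders, and the payload shape is borrowed from the CLI example above.

import requests

# Hypothetical invoke URL of the deployed API Gateway stage
api_url = "https://abc123.execute-api.us-east-1.amazonaws.com/prod/predict"

payload = {"instances": [{"in0": [863], "in1": [882]}]}

response = requests.post(api_url, json=payload, timeout=10)
response.raise_for_status()
print(response.json())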
However, if you are simply testing the endpoint after deploying it and you want to quickly get some inferences using Python, there are two options:
1. After deploying your endpoint with predictor = model.deploy(...), if you still have the predictor object available in your Python scope, you can simply run predictor.predict(), as documented here. However, it's rather likely that you deployed the endpoint a while ago and can no longer access the predictor object, and naturally one doesn't want to re-deploy the entire endpoint just to get the predictor.

2. If your endpoint already exists, you can invoke it using boto3 as follows, as documented here:

import boto3

payload = "string payload"
endpoint_name = "your-endpoint-name"

sm_runtime = boto3.client("runtime.sagemaker")

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="text/csv",
    Body=payload,
)

response_str = response["Body"].read().decode()
Naturally, you can adjust the above invocation according to your content type, to send JSON data for example. Just be aware of the (de)serializer the endpoint uses, as well as the ContentType argument you pass to invoke_endpoint.
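For instance, a JSON variant of the call above might look like the sketch below; the payload shape is an assumption borrowed from the CLI example earlier, and it must match what your endpoint's deserializer expects.

import json

import boto3

endpoint_name = "your-endpoint-name"
# Hypothetical payload; adapt to your model's expected input schema
payload = {"instances": [{"in0": [863], "in1": [882]}]}

sm_runtime = boto3.client("runtime.sagemaker")

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",  # matches the JSON body below
    Accept="application/json",
    Body=json.dumps(payload),
)

result = json.loads(response["Body"].read().decode())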