Endpoint URL
Vertex AI document - guide - Send an online prediction request
HTTP method and URL:
- LOCATION: The region where you are using Vertex AI.
- PROJECT: Your project ID
- ENDPOINT_ID: The ID for the endpoint.
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT_ID:predict
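As a minimal sketch of how the three placeholders combine, the shell function below assembles the predict URL. The function name build_predict_url and the sample values are hypothetical and used only for illustration.

```shell
# Hypothetical helper: assemble the online prediction URL from its three parts.
build_predict_url() {
  local location="$1" project="$2" endpoint_id="$3"
  # LOCATION appears twice: once in the host name and once in the resource path.
  printf 'https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/endpoints/%s:predict\n' \
    "$location" "$project" "$location" "$endpoint_id"
}

# Example with placeholder values (not a real project or endpoint).
build_predict_url us-central1 my-project 1234567890
```

Note that the region has to match in both places, otherwise the host and the resource path point at different locations.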
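Alternatively, we can list the endpoint URLs with the gcloud ai endpoints list command. Note that the URI it prints uses the v1beta1 API version and does not include the :predict suffix.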
$ LOCATION='us-central1'
$ gcloud ai endpoints list --region=${LOCATION} --uri
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
https://us-central1-aiplatform.googleapis.com/v1beta1/projects/335513839215/locations/us-central1/endpoints/5031620295400620032
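The listed URI can be turned into a prediction URL by appending the :predict verb. The sketch below also swaps v1beta1 for v1 to match the URL format shown earlier, on the assumption that the resource path is the same under both versions. The function name to_predict_url is hypothetical.

```shell
# Hypothetical helper: turn an endpoint resource URI into a predict URL.
# Assumes the URI has the shape printed by `gcloud ai endpoints list --uri`.
to_predict_url() {
  # Append the :predict verb, then replace the API version segment with v1.
  printf '%s:predict\n' "$1" | sed 's#/v1beta1/#/v1/#'
}

to_predict_url "https://us-central1-aiplatform.googleapis.com/v1beta1/projects/335513839215/locations/us-central1/endpoints/5031620295400620032"
```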
The available regional service endpoints (the FQDN part of the URL) are listed in the Vertex AI API reference, Service: aiplatform.googleapis.com:
https://us-central1-aiplatform.googleapis.com
https://us-east1-aiplatform.googleapis.com
https://us-east4-aiplatform.googleapis.com
https://us-west1-aiplatform.googleapis.com
https://us-west2-aiplatform.googleapis.com
https://northamerica-northeast1-aiplatform.googleapis.com
https://northamerica-northeast2-aiplatform.googleapis.com
https://europe-west1-aiplatform.googleapis.com
https://europe-west2-aiplatform.googleapis.com
https://europe-west3-aiplatform.googleapis.com
https://europe-west4-aiplatform.googleapis.com
https://europe-west6-aiplatform.googleapis.com
https://asia-east1-aiplatform.googleapis.com
https://asia-east2-aiplatform.googleapis.com
https://asia-northeast1-aiplatform.googleapis.com
https://asia-northeast3-aiplatform.googleapis.com
https://asia-south1-aiplatform.googleapis.com
https://asia-southeast1-aiplatform.googleapis.com
https://australia-southeast1-aiplatform.googleapis.com
Google API for Vertex endpoint prediction
The URL format above is based on the API definition Method: projects.locations.endpoints.predict.
POST https://{service-endpoint}/v1/{endpoint}:predict
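Where {service-endpoint} is one of the supported service endpoints and {endpoint} is the endpoint resource name in the form projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT_ID.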
Authentication
GCP uses OAuth 2.0 for authentication by default. The Vertex AI endpoint authenticates each request using the OAuth 2.0 access token sent in the HTTP Authorization header as a Bearer token.
RFC 6749 - The OAuth 2.0 Authorization Framework
+--------+                               +---------------+
|        |--(A)- Authorization Request ->|   Resource    |
|        |                               |     Owner     |
|        |<-(B)-- Authorization Grant ---|               |
|        |                               +---------------+
|        |
|        |                               +---------------+
|        |--(C)-- Authorization Grant -->| Authorization |
| Client |                               |     Server    |
|        |<-(D)----- Access Token -------|               |
|        |                               +---------------+
|        |
|        |                               +---------------+
|        |--(E)----- Access Token ------>|    Resource   |
|        |                               |     Server    |
|        |<-(F)--- Protected Resource ---|               |
+--------+                               +---------------+
If the gcloud command is available, e.g. inside a GCP environment, we can first authenticate with ADC (Application Default Credentials) using gcloud auth application-default login, and then obtain the token with the gcloud auth application-default print-access-token command. The token can then be set in the HTTP Authorization header, as in the Vertex AI documentation example below.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/endpoints/ENDPOINT_ID:predict"
Request body format
{
  "instances": [
    value
  ],
  "parameters": value
}
The actual values to set depend on the framework of the deployed model. If you are getting predictions from a custom model, see Get online predictions from custom-trained models - Format your input for online prediction.
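As a rough illustration only, the snippet below writes a request.json file whose instances are two numeric feature vectors. These values are made up. The real shape of each instance depends on what the deployed model expects, and the parameters field is left out on the assumption that the model does not need it.

```shell
# Write a sample request body for the predict call.
# The instance values are hypothetical; adjust them to the model's input format.
cat > request.json <<'EOF'
{
  "instances": [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0]
  ]
}
EOF

# Show the file that curl would send with: -d @request.json
cat request.json
```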