In short: we have a web app running on Google App Engine (Node.js runtime, flexible environment). Five days ago, all our deployments started failing with this error:
ERROR: (gcloud.app.deploy) Error Response: [4] Timed out waiting for the app infrastructure to become healthy.
Full error stack trace:
Updating service [default] (this may take several minutes)...\DEBUG: Operation [apps/PROJECT_ID/operations/45d6fec1-9261-41d2-943a-648976b971ed] not complete. Waiting to retry.
Updating service [default] (this may take several minutes)...-DEBUG: Operation [apps/PROJECT_ID/operations/45d6fec1-9261-41d2-943a-648976b971ed] complete. Result: {
  "metadata": {
    "user": "[email protected]",
    "target": "apps/PROJECT_ID/services/default/versions/release-0-6-3",
    "@type": "type.googleapis.com/google.appengine.v1.OperationMetadataV1",
    "insertTime": "2018-02-19T06:08:56.439Z",
    "method": "google.appengine.v1.Versions.CreateVersion"
  },
  "done": true,
  "name": "apps/PROJECT_ID/operations/45d6fec1-9261-41d2-943a-648976b971ed",
  "error": {
    "message": "Timed out waiting for the app infrastructure to become healthy.",
    "code": 4
  }
}
Updating service [default] (this may take several minutes)...failed.
DEBUG: (gcloud.app.deploy) Error Response: [4] Timed out waiting for the app infrastructure to become healthy.
Traceback (most recent call last):
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/cli.py", line 797, in Execute
    resources = calliope_command.Run(cli=self, args=args)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/calliope/backend.py", line 757, in Run
    resources = command_instance.Run(args)
  File "/usr/lib/google-cloud-sdk/lib/surface/app/deploy.py", line 65, in Run
    parallel_build=False)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/command_lib/app/deploy_util.py", line 588, in RunDeploy
    flex_image_build_option=flex_image_build_option)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/command_lib/app/deploy_util.py", line 394, in Deploy
    extra_config_settings)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/app/appengine_api_client.py", line 188, in DeployService
    message=message)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/app/operations_util.py", line 246, in WaitForOperation
    sleep_ms=retry_interval)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/waiter.py", line 266, in WaitFor
    sleep_ms=sleep_ms)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/core/util/retry.py", line 222, in RetryOnResult
    if not should_retry(result, state):
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/util/waiter.py", line 260, in _IsNotDone
    return not poller.IsDone(operation)
  File "/usr/lib/google-cloud-sdk/lib/googlecloudsdk/api_lib/app/operations_util.py", line 171, in IsDone
    encoding.MessageToPyValue(operation.error)))
OperationError: Error Response: [4] Timed out waiting for the app infrastructure to become healthy.
ERROR: (gcloud.app.deploy) Error Response: [4] Timed out waiting for the app infrastructure to become healthy.
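As far as I can tell, the `[4]` is a status code rather than anything App Engine specific. Here is a minimal sketch of how I read it, assuming the `error` object follows the standard `google.rpc.Code` numbering (that mapping is my assumption, the log itself does not say so). `describe_error` and the trimmed `RESULT` string are just illustrative names, not part of gcloud:

```python
import json

# Trimmed copy of the "Result" JSON printed in the debug log above.
RESULT = """
{
  "done": true,
  "name": "apps/PROJECT_ID/operations/45d6fec1-9261-41d2-943a-648976b971ed",
  "error": {
    "message": "Timed out waiting for the app infrastructure to become healthy.",
    "code": 4
  }
}
"""

# Canonical google.rpc.Code values (assumed to apply to this error object).
RPC_CODES = {
    0: "OK", 1: "CANCELLED", 2: "UNKNOWN", 3: "INVALID_ARGUMENT",
    4: "DEADLINE_EXCEEDED", 5: "NOT_FOUND", 6: "ALREADY_EXISTS",
    7: "PERMISSION_DENIED", 8: "RESOURCE_EXHAUSTED",
    9: "FAILED_PRECONDITION", 10: "ABORTED", 11: "OUT_OF_RANGE",
    12: "UNIMPLEMENTED", 13: "INTERNAL", 14: "UNAVAILABLE",
    15: "DATA_LOSS", 16: "UNAUTHENTICATED",
}


def describe_error(result_json):
    """Return a one-line summary of the operation's error, or None if it has none."""
    error = json.loads(result_json).get("error")
    if error is None:
        return None
    name = RPC_CODES.get(error["code"], "UNRECOGNIZED")
    return "{} ({}): {}".format(name, error["code"], error["message"])


print(describe_error(RESULT))
```

If that reading is right, the error only says that a deadline expired while waiting for the new version, which is why I am looking for more detail than the operation itself reports.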
Before that, about 3 weeks ago, deploys had already become really slow (5-20 minutes each).
Describing the failed operation with this command:
gcloud beta app operations describe OPERATION_ID
gives this:
done: true
error:
  code: 4
  message: Timed out waiting for the app infrastructure to become healthy.
metadata:
  '@type': type.googleapis.com/google.appengine.v1.OperationMetadataV1
  endTime: '2018-02-19T06:36:02.752Z'
  insertTime: '2018-02-19T06:08:56.439Z'
  method: google.appengine.v1.Versions.CreateVersion
  target: apps/PROJECT_ID/services/default/versions/release-0-6-3
  user: [email protected]
name: apps/PROJECT_ID/operations/45d6fec1-9261-41d2-943a-648976b971ed
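The only extra information in that output is `endTime`, so the one thing I can work out is how long the operation ran before it gave up. A small sketch of that arithmetic, using only the two timestamps above (the helper names `parse_ts` and `elapsed_seconds` are mine):

```python
from datetime import datetime

# Timestamps copied from the operation metadata above.
INSERT_TIME = "2018-02-19T06:08:56.439Z"
END_TIME = "2018-02-19T06:36:02.752Z"


def parse_ts(ts):
    """Parse a UTC timestamp of the form 2018-02-19T06:08:56.439Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def elapsed_seconds(start, end):
    """Seconds between two timestamps in the format handled by parse_ts."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


elapsed = elapsed_seconds(INSERT_TIME, END_TIME)
minutes, seconds = divmod(elapsed, 60)
print("{} min {:.3f} s".format(int(minutes), seconds))
```

That comes to roughly 27 minutes between the operation being created and it failing, noticeably longer than even the slow deploys we were seeing before.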
Any ideas on how to get more information about this operation and which steps it actually performs?
Best,
Alex