Why is an activity task scheduled but never started?
Looking at the workflow history, I have an activity task that was scheduled but never started, and the workflow then timed out. I can confirm that the Cadence worker is running, because other workflows work fine at the same time.

Why does the history show no started event for the activity? How should I investigate an issue like this?

My activity timeout is the same as the workflow timeout.

This question comes from this GitHub issue.

Bride answered 2/12, 2020 at 17:42 Comment(0)

First of all, an activity with a retry policy only writes its started event to history when the activity completes or finally fails or times out.

From the history, the workflow timed out before the activity's events could be written. Make sure the workflow timeout is larger than the activity timeout, so that the activity timeout event can be written into history before the workflow times out.
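
The timeout rule can be sketched as a small check. This is only an illustration: `timeoutConfig`, its fields, and `leavesRoomForActivityTimeout` are hypothetical names, not part of the Cadence client, and the 20-second activity value is an assumed example.

```go
package main

import (
	"fmt"
	"time"
)

// timeoutConfig holds the two timeouts being compared.
// The type and its field names are illustrative, not Cadence API.
type timeoutConfig struct {
	WorkflowTimeout time.Duration // workflow execution start-to-close timeout
	ActivityTimeout time.Duration // total time the activity may take, retries included
}

// leavesRoomForActivityTimeout reports whether the workflow outlives the
// activity, so the activity's timeout event can be recorded before the
// workflow itself is closed.
func leavesRoomForActivityTimeout(c timeoutConfig) bool {
	return c.WorkflowTimeout > c.ActivityTimeout
}

var (
	// Workflow timeout of 120s with an assumed 20s activity timeout.
	okConfig = timeoutConfig{WorkflowTimeout: 120 * time.Second, ActivityTimeout: 20 * time.Second}
	// The situation from the question: both timeouts are equal.
	equalConfig = timeoutConfig{WorkflowTimeout: 120 * time.Second, ActivityTimeout: 120 * time.Second}
)

func main() {
	fmt.Println("workflow longer than activity:", leavesRoomForActivityTimeout(okConfig))
	fmt.Println("equal timeouts:", leavesRoomForActivityTimeout(equalConfig))
}
```

With equal timeouts the check fails, which matches the question: the workflow closes before any activity event can be written.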

So how do we tell whether an activity has actually started?

  1. While an activity with a retry policy is running, the best way to see its status is through the CLI:
~/cadence [qlong-cli-wf-show-actvities-retry] M % ./cadence --do qlong wf show -w retry_db345b68-0e50-4c24-8d2d-8c6dd18d88dc
   1  WorkflowExecutionStarted  {WorkflowType:{Name:main.retryWorkflow},
                                TaskList:{Name:retryactivityGroup}, Input:[],
                                ExecutionStartToCloseTimeoutSeconds:120,
                                TaskStartToCloseTimeoutSeconds:60,
                                ContinuedFailureDetails:[], LastCompletionResult:[],
                                OriginalExecutionRunId:63acf35f-9ede-48ef-aee7-66579382fed5,
                                Identity:35027@IT-USA-25920@,
...
...
...
...
                                ExpirationIntervalInSeconds:20},
                                Header:{Fields:map{}}}
============Pending activities============
[
  {
    "ActivityID": "0",
    "ActivityType": {
      "name": "main.batchProcessingActivity"
    },
    "State": "STARTED",
    "LastStartedTimestamp": "2020-10-11T22:47:16-07:00",
    "LastHeartbeatTimestamp": "2020-10-11T22:47:16-07:00",
    "Attempt": 0,
    "MaximumAttempts": 15,
    "ExpirationTimestamp": "2020-10-11T22:47:36-07:00"
  }
]
NOTE: For an activity with a retry policy, the ActivityTaskStarted event is written into history only when the activity finishes.

Alternatively, the describe-workflow view in the web UI also shows the pending activities.

  2. If step 1 confirms that the activity has not started, check the task list and whether a worker is available and polling it:
./cadence --do <> tl desc --tl <>
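
The two checks can be combined into a rough triage helper. This is a sketch under assumptions: `diagnose`, `verdict`, and the `pollerCount` argument are hypothetical, with the poller count read by hand from the task list description. The JSON fields are the ones shown in the pending activities output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pendingActivity mirrors the fields of the CLI's pending activities JSON
// that this sketch needs.
type pendingActivity struct {
	ActivityID   string `json:"ActivityID"`
	ActivityType struct {
		Name string `json:"name"`
	} `json:"ActivityType"`
	State string `json:"State"`
}

// verdict is a hypothetical label for the outcome of the two checks.
type verdict string

const (
	running    verdict = "activity is running; its started event appears in history once it finishes"
	noPollers  verdict = "activity not started and no worker is polling the task list"
	notStarted verdict = "activity not started although workers are polling; inspect the task list further"
)

// diagnose applies step 1 (is any pending activity STARTED?) and then
// step 2 (is any worker polling the task list?).
func diagnose(pendingJSON []byte, pollerCount int) (verdict, error) {
	var pending []pendingActivity
	if err := json.Unmarshal(pendingJSON, &pending); err != nil {
		return "", fmt.Errorf("parsing pending activities: %w", err)
	}
	for _, a := range pending {
		if a.State == "STARTED" {
			return running, nil
		}
	}
	if pollerCount == 0 {
		return noPollers, nil
	}
	return notStarted, nil
}

// sample is trimmed from the pending activities output shown in step 1.
const sample = `[
  {
    "ActivityID": "0",
    "ActivityType": {"name": "main.batchProcessingActivity"},
    "State": "STARTED",
    "Attempt": 0,
    "MaximumAttempts": 15
  }
]`

func main() {
	v, err := diagnose([]byte(sample), 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v)
}
```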

Finally, the reason activities with retry behave this way: history must be immutable, and that is an invariant. The started event of a retried activity can change from attempt to attempt until the activity finally settles, so it cannot be written earlier. This invariant is important to the Cadence architecture, but it is admittedly confusing in the web UI. Here is the issue to improve it.

Bride answered 2/12, 2020 at 17:42 Comment(0)
