We need some help understanding the locking behavior around DagRun scheduling.
We noticed that after a few DagRuns, subsequent runs are no longer scheduled, and the following message appears in the logs:
{scheduler_job_runner.py:1426} INFO - DAG dag-test scheduling was skipped, probably because the DAG record was locked.
We are currently running a single scheduler pod. We would like to understand how this locking works in general and under what conditions the lock on a DAG record is released.
Our current workaround is to restart the scheduler pod, which releases the lock, but that isn't ideal for production.
We would appreciate any help or pointers.