What is a crashloop?

A crashloop occurs when a process keeps crashing and a watchdog daemon keeps restarting it, indefinitely.

That is, the history is:

Process starts at time T.
Process crashes at time T+1.
Watchdog daemon restarts process.
Process starts at time T+2.
Process crashes at time T+3.
Watchdog daemon restarts process.
Process starts at time T+4, and so on.
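
The history above can be reproduced with a small simulation. This is only a sketch: `run_with_watchdog`, `always_crashes`, and the `max_restarts` cap are hypothetical names introduced for illustration, and the cap exists only so the example terminates. A real watchdog would keep restarting indefinitely.

```python
def run_with_watchdog(task, max_restarts):
    """Run task, restarting it after every crash; return the event history.

    max_restarts is an illustrative cap so the example terminates.
    """
    history = []
    for attempt in range(max_restarts + 1):
        history.append(f"start {attempt}")
        try:
            task()
        except Exception as exc:
            # The process crashed: record it and let the loop restart it.
            history.append(f"crash {attempt}: {exc}")
            continue
        # The process exited cleanly, so there is nothing to restart.
        history.append(f"exit {attempt}")
        return history
    history.append("gave up")
    return history


def always_crashes():
    """Stand-in for a process that fails every time it starts."""
    raise RuntimeError("boom")


history = run_with_watchdog(always_crashes, max_restarts=2)
for event in history:
    print(event)
```

Each crash is followed by another start, which is the crashloop pattern in miniature.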

Here, the watchdog daemon is Borg, and the process is encapsulated in a task.

In general, in distributed computing, if you want something to eventually succeed, you have to write down your intent for it to be completed, and you need a worker that loops continually to act on that intent. This is "at-least-once delivery" of a work item.

Here, the intent is that the task runs (written down in Borg), and Borg itself runs the loop that constantly tries to make sure the task is running. This is why a task is restarted when it crashes. When a task crashes every time it is restarted, you end up with a crashloop.
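
The "written-down intent plus a loop" idea can be sketched as follows. This is not how Borg is implemented; `reconcile`, `desired`, `actual`, and the task names are hypothetical, and the sets stand in for whatever durable store and process table a real system would use.

```python
def reconcile(desired, actual, start):
    """One pass of the loop: start every task that should run but is not running."""
    restarted = []
    for name in sorted(desired):
        if name not in actual:
            start(name)        # act on the recorded intent
            actual.add(name)
            restarted.append(name)
    return restarted


desired = {"web", "db"}   # the intent: these tasks should be running
actual = {"db"}           # what is running right now
started = []              # log of every start the loop performed

first = reconcile(desired, actual, started.append)    # "web" is missing, so it is started
actual.discard("web")                                 # simulate "web" crashing
second = reconcile(desired, actual, started.append)   # the loop notices and restarts it
third = reconcile(desired, actual, started.append)    # nothing to do this time

print(first, second, third, started)
```

If "web" crashed after every pass, every pass would restart it, and the log of starts would grow without bound. That is the crashloop.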
