Jira's Lexorank algorithm for new stories

I am looking to create a large list of items that allows for easy insertion of new items and for easily changing the position of items within that list. When updating the position of an item, I want to change as few fields as possible regarding the order of items.

After some research, I found that Jira's Lexorank algorithm fulfills all of these needs. Each story in Jira has a 'rank-field' containing a string which is built up of 3 parts: <bucket>|<rank>:<sub-rank>. (I don't know whether these parts have actual names, this is what I will call them for ease of reference)

Examples of valid rank-fields:

0|vmis7l:hl4
0|i000w8:
0|003fhy:zzzzzzzzzzzw68bj

When dragging a card above 0|vmis7l:hl4, the new card will receive rank 0|vmis7l:hl2, which means that only the rank-field for this new card needs to be updated while the entire list can always be sorted on this rank-field. This is rather clever, and I can't imagine that Lexorank is the only algorithm to use this.

Is there a name for this method of sorting used in the sub-rank?

My question is related to the creation of new cards in Jira. Each new card starts with an empty sub-rank, and the rank is always chosen such that the new card is located at the bottom of the list. I've created a bunch of new stories just to see how the rank would change, and it seems that the rank is always incremented by 8 (in base-36).

Does anyone know more specifically how the rank for new cards is generated? Why is it incremented by 8?

I can only imagine that after some time (270 million cards) there are no more ranks to generate, and the system needs to recalculate the rank-field of all cards to make room for additional ranks.

Are there other triggers that require recalculation of all rank-fields?
I suppose the bucket plays a role in this recalculation. I would like to know how?

We are talking about a special kind of indexing here. This is not sorting; it is just preparing items to end up in a certain order in case someone happens to sort them (by whatever sorting algorithm). I know that variants of this kind of indexing have been used in libraries for decades, maybe centuries, to ensure that books belonging together but lacking a common title end up next to each other in the shelves, but I have never heard of a name for it.
The 8 is probably chosen wisely as a compromise, maybe even by analyzing typical use cases. Consider this: If you choose a small increment, e. g. 1, then all tickets will have ranks like [a, b, c, …]. This will be great if you create a lot of tickets (up to 26) in the correct order because then your rank fields keep small (one letter). But as soon as you move a ticket between two other tickets, you will have to add a letter: [a, b] plus a new ticket between them: [a, an, b]. If you expect to have this a lot, you better leave gaps between the ranks: [a, i, q, …], then an additional ticket can get a single letter as well: [a, e, i, q, …]. But of course if you now create lots of tickets in the correct order right in the beginning, you quickly run out of letters: [a, i, q, y, z, za, zi, zq, …]. The 8 probably is a good value which allows for enough gaps between the tickets without increasing the need for many letters too soon. Keep in mind that other scenarios (maybe not Jira tickets which are created manually) might make other values more reasonable.
You are right, the rank fields get recalculated now and then, Lexorank calls this "balancing". Basically, balancing takes place in one of three occasions: ① The ranks are exhausted (largest value reached), ② the ranks are due to user-reranking of tickets too close together ([a, b, i] and something is supposed to go in between a and b), and ③ a balancing is triggered manually in the management page. (Actually, according to the presentation, Lexorank allows for up to three letter ranks, so "too close together" can be something like aaa and aab but the idea is the same.)
The <bucket> part of the rank is increased during balancing, so a messy [0|a, 0|an, 0|b] can become a nice and clean [1|a, 1|i, 1|q] again. The brownbag presentation about Lexorank (as linked by @dandoen in the comments) mentions a round-robin use of <buckets>, so instead of a constant increment (0→1→2→3→…) a 2 is increased modulo 3, so it will turn back to 0 after the 2 (0→1→2→0→…). When comparing the ranks, the sorting algorithm can consider a 0 "greater" than a 2 (it will not be purely lexicographical then, admitted). If now the balancing algorithm works backwards (reorder the last ticket first), this will keep the sorting order intact all the time. (This is just a side aspect, that's why I keep the explanation small, but if this is interesting, ask, and I will elaborate on this.)

Sidenote: Lexorank also keeps track of minimum and maximum values of the ranks. For the functioning of the algorithm itself, this is not necessary.

Recommended topics

Hot tags