Merging two partial (jointly overdetermined) sets of ordering information

Asked 21/3, 2014 at 22:29 Answered 23/3, 2014 at 10:44

algorithm sorting merge partial-ordering

I have a web app with data in a grid. The user can reorder the columns, and the server can change which columns exist. I would like to save the user's column order in a cookie and restore it on page load.

More formally, I have two arrays of unique IDs (strings) called user_columns and server_columns. I would like to reorder server_columns such that I respect all of the ordering information from user_columns, and as much from server_columns as possible. How do I do this? What's a reasonable formal definition of "as much as possible"?

My analysis so far:

One aspect of the problem is trivial: if the server removes some columns, delete the corresponding entries from user_columns. Any information about the ordering of columns which are no longer there is moot. The problem then becomes one of merging two potentially conflicting sets of ordering information.

This corresponds to a family of problems from voting theory: given a set of ballots, each of which contains a partial order between the candidates, produce a complete ordering of the candidates which in some sense reflects the ballots.

This leads me to think that I might get a workable solution by applying e.g. the Schulze Method or Ranked Pairs to a sufficiently rigged set of ballots based on user_columns and server_columns. For UX reasons, breaking ties by inserting new columns last (to the right) seems like a good idea to me.

Does this sound like it's on the right track?

Note also that we can consider three kinds of comparisons: A and B are both in user_columns, one of them is, or none of them are. The former and latter kinds are easily resolved (refer to user_columns and server_columns, respectively); the one in the middle, and its interactions with the latter, are the tricky parts.

Adage answered 21/3, 2014 at 22:29 Comment(6)

Possibly related: #15933101 and #6352712 – Assassinate 21/3, 2014 at 22:30

What's the language you want to implement this in? – Husch 21/3, 2014 at 23:5

@Ivarpoiss: javascript – Assassinate 21/3, 2014 at 23:10

Please clarify the question. Can the server reorder columns as well or just change the set of columns? Because in the former case, why do you mention voting theory at all? There's no voting necessary if only one side has an ordering at all – Edna 22/3, 2014 at 3:5

Both user_columns and server_columns are arrays, and the server can change the order of columns, i.e. if A and B both occur in server_columns in two consecutive page loads, they can occur in two different orders. The more interesting property, I think, is that user_columns and server_columns can disagree in a single page load, but in either case I think the answer is "yes" :) – Assassinate 22/3, 2014 at 3:14

Would you be happy with an algorithm that minimizes the inversions with respect to server_columns and respects user_columns completely (zero inversions w.r.t. user_columns)? Because that how I interpret the question. – Edna 22/3, 2014 at 4:25

Let's say we have C columns, numbered from 1 to C. We have two sequences sequences of columns, U = u₁, u₂, ... u_n and S = s₁, s₂, ... s_m. We want to find a permutation P of S, such that P has no inversions with regard to U and a minimal number of inversions with regard to S.

We can show that there is such an optimal P which is an interleaving of U ∩ S and S \ U. By "interleaving" I mean that P has no inversions with regard to U ∩ S or S \ U.

We can apply dynamic programming to find the optimal interleaving: Let A = (a_i) = U ∩ S and B = (b_j) = S \ U. Let f(i,j) be the number of inversions w.r.t. S of the optimal interleaving of the prefixes a_1...i of A and b_1...j of B. The idea is very similar to the longest common subsequence DP algorithm. We have the recurrence

f(0,j) = 0 for all j >= 0
f(i,0) = f(i-1, 0) + sum(k=1 to i-1, [1 if A[i] appears before A[k] in S])
f(i,j) = min(f(i-1, j) + sum(k=1 to i-1, [1 if A[i] appears before A[k] in S])
                       + sum(k=1 to j, [1 if A[i] appears before B[k] in S]),
             f(i, j-1) + sum(k=1 to i, [1 if B[j] appears before A[k] in S])
                       + sum(k=1 to j-1, [1 if B[j] appears before B[k] in S]))

I used the notation [1 if X] here to denote the value 1, if X is true and 0, if X is false.

The matrix f can be built in time O(|A|^2 * |B|^2). The minimal cost (number of inversions w.r.t. S) will be f(|A|, |B|).

We can reconstruct the optimal permutation using the DP matrix as well: We build it from back to front. We start with the tuple (i,j) = (|A|, |B|) and at every step depending on which of the two options is minimum in the DP transition, we know whether we need to put A[i] or B[j] to the front of the permutation. Then we proceed with (i-1, j) or (i, j-1) depending on which we choose.

Here is an implementation of the algorithm, please excuse my lack of JS skills.

Edna answered 22/3, 2014 at 5:14 Comment(6)

Regarding the interleaving proof: assume we have b₂, aᵢ, ..., aⱼ, b₁ as part of the solution with b₁ inverted relative to b₂. Let Inv(1) be the number of inversions between b₁ and aᵢ..aⱼ, and Inv(2) likewise. If Inv(1) ≤ Inv(2) then aᵢ, ..., aⱼ, b₁, b₂ has fewer inversions than the original(new Inv(2) = old Inv(1) ≤ old Inv(2), and we fixed the b₁/b₂ inversion), and similarly in reverse if Inv(2) ≤ Inv(1). Since swapping adjacent out-of-order b-elements lowers the inversion count, the solution has the elements of b in order. QED. Correct? – Assassinate 23/3, 2014 at 6:12

@JonasKölker Very good, that is actually very similar to the proof idea I had in mind and it is a very compact formulation of it. I like it – Edna 23/3, 2014 at 6:16

@JonasKölker Actually I think swapping b1 and b2 can cost you one additional inversion, but since you also decrement it by putting b1 and b2 in order you get at least an equally good permutation. Or to be more precise, new Inv(2) = old Inv(1) <= old Inv(2) doesn't necessarily hold if I'm not mistaken, but it's a bit too late here for me to be sure about that right now – Edna 23/3, 2014 at 6:23

@JonasKölker By the way in case you're interested, the recurrence can probably be solved in O(|A| * |B|) – Edna 23/3, 2014 at 6:28

I'd be very interested in both a fix of my proof and a faster algorithm :) – Assassinate 23/3, 2014 at 6:30

@Jonas The algorithmic part is easier: E.g. Let g(x,K) = sum(k=1 to K, [1 if x appears before A[k] in S]). Then we have g(x,K) = g(x,K-1) + [1 if x appears before A[K] in S], so we can compute all the sums in O(1). You can answer "X appears before Y in S" queries in O(1) by precomputing the ranks of elements in S – Edna 23/3, 2014 at 6:38

One definition that might be reasonable is to minimize the number of inversions relative to the server's order, subject to the constraint that the restriction to the columns in common of the two orders be equal. I don't know offhand an algorithm to minimize this objective.

Singley answered 21/3, 2014 at 22:59 Comment(3)

This sounds like a reasonable definition. I suspect that the Schulze Method or Ranked Pairs can achieve this, but I haven't though enough about it. – Assassinate 21/3, 2014 at 23:3

I think we can just use DP to optimally interleave the client column set with the server column set excluding the client columns – Edna 22/3, 2014 at 5:14

@NiklasB. I thought that there might be something like that. I had to rush out the door five minutes after I started this answer. – Singley 22/3, 2014 at 13:23

This relates to Niklas B's answer:

Theorem: Consider a sequence S = s₁, …, sₙ of some orderet set (e.g. integers). If i < j and sᵢ > sⱼ, then swapping sᵢ and sⱼ decreases the number of inversions—that is, let S' = s₁, …, sᵢ₋₁, sⱼ, sᵢ₊₁, …, sⱼ₋₁, sᵢ, sⱼ₊₁, …, sₙ; then S' has fewer inversions than S.

Intuitively stated: if two elements are out of order and you swap them, you're closer to having a sorted list.

Proof: Observe that the only elements that have a different relative order in S and S' are (sᵢ, sⱼ), (sᵢ, sₖ) and (sⱼ, sₖ) for each k where i < k < j. We know that (sᵢ, sⱼ) is an inversion in S but not S', so consider sₖ for some such k.

Either sₖ < sⱼ < sᵢ or sⱼ < sₖ < sᵢ or sⱼ < sᵢ < sₖ (we assume the elements of S to be unique).

In the first case, (sᵢ, sₖ) is an inversion in S and (sⱼ, sₖ) is an inversion in S'. In the second case, (sᵢ, sₖ) and (sⱼ, sₖ) are inversions in S but not in S'. In the third case, (sⱼ, sₖ) is an inversion in S and (sᵢ, sₖ). These are all the changes in inversions.

In each case, the number of inversions in S' is either the same as that in S or it is smaller. Recall that (sᵢ, sⱼ) got fixed from S to S', and we get the desired result. ■

Thus, if we have a₁, bᵢ, …, bⱼ, a₂ with each a ∈ S \ U and each b ∈ U ∩ S and a₁ > a₂ and we swap a₁ and a₂, getting a₂, bᵢ, …, bⱼ, a₁, the inversion count is lower. Since such swaps only rearrange elements of S \ U and not those of U ∩ S, any solution which has zero inversions on U ∩ S and (subject to that) a minimal number of inversions on S \ U must make all such swaps.

Ergo: the elements of S \ U must occur in order, and thus the solution is an interleaving of U ∩ S and S \ U.

Adage answered 23/3, 2014 at 10:44 Comment(1)

Kudos for doing the proof on your own, I really appreciate that :) It looks good to me, I had something wrong yesterday when I thought there was a problem. My approach was something like this: Let's say we have b2 ai, ..., aj b1 as a substring of the permutation and we have X < b1 < Y < b2 < Z in S, where X, Y and Z represent subsets of ai, ..., aj. Then b2 ai, ..., aj b1 has 1 + |X| + 2|Y| + |Z| inversions, b1 b2 ai, ..., aj has 2|X| + |Z| and ai, ..., aj b1 b2 has 2|Z| + |Y|. So depending on whether |X| < Z| or not we can always do strictly better using one of those – Edna 23/3, 2014 at 21:7

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags