Algorithm : Letters and envelopes pairing
Asked Answered
R

3

8

Disclaimer : This isn't any kind of homework, the problem just came to my mind while I was going through all the Christmas cards

The problem is given as follows : We've got M envelopes and N letters, each of which is described as a pair of positive integers. Both envelopes and letters are rectangular and obviously can be rotated. A letter fits into an envelope if both dimensions are smaller or equal to the envelope's ones. The goal is to find maximum envelopes-letters matching.

The problem is easily convertible to maximum bipartite matching problem, for which an algorithm running in O(sqrt(M+N) * MN) exists (Hopcroft-Karp, the conversion runs trivially in O(MN)). I tried to come up with a greedy algorithm or with a dynamic approach, but I haven't found any.

Do you know about any faster solution?

Rhee answered 4/1, 2014 at 21:50 Comment(6)
Just a thought: in the one-dimensional case, where all the envelopes and letters are the same height, this is O(n lg n). We just sort both the letters and the envelopes by width, and repeatedly put the widest possible letter in the widest possible envelope. Haven't found a way to adapt this observation to the 2D case, as you've only got a partial ordering there.Docilla
Unfortunately, this doesn't work. Just take letters 50 1, 40 2 and envelopes 55 2, 54 1. You can find a similar example for all sorts of greedy approach.Rhee
Yeah, I said it didn't work in the 2D case, because there's only a partial ordering there - the letters (50, 1) and (40, 2) are incomparable. However, in an average problem there should be long chains of elements that are comparable, and that might be useful in speeding up the matching process.Docilla
Just to clarify: it's a one-to-one correspondence between envelopes and letters, right? i.e., you can only put one letter in an envelope, and you can't put one letter into multiple envelopes?Reviviscence
Jan 4: Either 2013 Christmas cards in which case you need O(-12day) or for 2014 you are quite early and O(Dec24,2014) will work. *<|:o)>Lordly
Perhaps it would be best to ask this on math.stackexchange.com since it appears to be a math puzzle. (EDIT: whoops, ignore this comment. I just failed a Stack Overflow audit).Ailsa
T
0

The following "greedy"-type approach might help.

Define m[i] to be the minimum of the two integers of envelope i.

mins = distinct values of m[i], in increasing order
letters_to_match = all letters
for min in mins:
    envs = envelopes i with m[i] == min
    match letters_to_match with envs
    remove matched letters from letters_to_match
Tiffin answered 6/1, 2014 at 10:49 Comment(7)
What is the time complexity of this? It seems to be O(kN) where k is the number of distinct envelopes, which in the worst case, is O(MN) (not any faster than the OP's solution).Reviviscence
I don't think that match letters_to_match with envs is a trivial task. Can you please explain it a bit further?Rhee
@asQuirreL as far as I can see, the OP is suggesting a solution in O(sqrt(M+N) * MN).Tiffin
@Rhee all envelopes in envs have the same smallest dimension, so sort them by their other dimension, and stuff them with the largest letter they can take.Tiffin
But how do you define the largest letter? As Andy mentioned in the OP comments, the ordering is only partial.Rhee
@mitchus, Apologies, misread the question. I thought "and the conversion runs trivially in O(MN)" meant that there was a specialisation of the algorithm that ran in O(MN), as opposed to that being the time complexity of the reduction.Reviviscence
@Rhee the letter with the largest maximum.Tiffin
C
0

Isn't this equivalent to maximum bipartite matching?

i.e Suppose you had an algorithm for this problem, then you can solve max bipartite matching.

Given a bipartite graph, we can assign Envelopes and Letters (i.e. two positive integers) to each node...

Clung answered 2/4, 2014 at 4:3 Comment(0)
S
0

This is solvable in O(max(M,N) log(max(M,N))) (the time to sort letters and envelopes) with a 2D vertical-sweep-line algorithm. This is the same as O(M+N log (M+N)).

  1. Since rotation is allowed, swap any pair (x, y) of letter/envelope dimensions so that x <= y.

  2. Combine the lists, marking the original list of each point. We can do this by making (x, y, 'E') represent points from the Envelope list, for example, and 'L' for letters.

  3. Sort the list by x, then by y to tiebreak, then with 'L' before 'E' if still tied. We're going to treat a letter (x0, y0, 'L') as a point in 2D space, which can match any 'envelope' point (x1, y1, 'E') if and only if x0 <= y0 and x1 <= y1.

  4. Initialize a balanced binary-search-tree (BST) unmatched_letters, which will contain our so-far-unmatched letters, with their key/ordering in the BST based on their y-value only.

The idea is that we'll be moving a vertical sweep line over the points, from left to right, processing the points on the line from bottom to top (this is the exact order our list is already sorted in):

  • If we see a letter, we'll add it to our unmatched_letters BST, sorted by 'y'. That means any envelopes we see after adding it only need to be checked for larger/equal 'y' values (the sweep line ensures the later x value isn't smaller).
  • If we see an envelope, we'll try to match it with the highest-height letter available for it in our BST using a binary search. If there is none, we can 'throw away'/disregard that envelope, since any letters we see in the future will either have higher x or y values than this envelope. Otherwise, match them. You can show by a greedy-exchange argument that this will give the optimal solution; I may include a proof of that if it isn't too long or nontrivial.

For the more detailed, 'pseudocode' version of the step we just described:

  1. Iterate over the (x0, y0, category) points in our list. If category == 'L', insert it to unmatched_letters with key y0. If category == 'E', binary-search for the largest key-value in unmatched_letters that is at most y0. If there is a match: the letter has an x-value at most x0 since it was processed earlier. Remove that letter, and record it as part of our solution with this envelope.

Since all we did was sort the lists and iterate, this can clearly be done in O(sorting). It's not hard to argue why this must give a maximum matching, but the exact proof of greedy algorithms' correctness can be long. Part of the argument goes like this: if we compare to any fixed, optimal solution, and the earliest difference (by x-coordinate) is that greedy matches envelope 'E1' with letter 'L1' and optimal doesn't, then there are two (not disjoint) possibilities:

  • In the optimal solution, 'E1' matches a different letter, 'L2'. By construction, 'L2' has a y-value at most as large as 'L1', so if 'L1' is matched by a later (by x-value) envelope in the optimal solution, this later envelope must also fit 'L2' (and if not, we can swap 'L1' and 'L2' in optimal to match greedy).
  • In the optimal solution, 'L1' matches a different envelope, 'E2'. Then, again, envelope 'E1' in the optimal solution is matched with a smaller-or-equal-y-value letter (or empty), and we can exchange these pairings in the optimal solution to match greedy.
Stuartstub answered 19/2, 2022 at 0:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.