Finding rectangles in a 2d block grid

Asked 27/4, 2011 at 21:11 Answered 27/5, 2016 at 14:20

Let's say I have a grid of blocks, 7x12. We use the colors '*','%','@' and an empty cell '-'.

1 2 3 4 5 6 7
- - - - - - -  1
- - - - - - -  2
% % - - - - -  3
% % - - - - *  4 
% % - - - @ %  5
@ @ @ - - @ %  6
@ @ * * * - *  7
* * * % % % %  8 
% @ @ % * * %  9
% @ % % % % %  10
* * * % % @ @  11
* * @ @ @ @ *  12

I want to find rectangles in this grid of a certain minimum size, and the biggest I can find and then smaller until no rectangles greater or equal to the minimum size can be found.

In this example, consider the minimum size 1x4, 4x1, 2x2 so a 1x3 is not valid but a 2x3 is. If we want the biggest rectangles we find the following:

4x1 at (4,8)
5x1 at (3,10)
2x3 at (1,3)
2x2 at (6,1)
2x2 at (1,11)
4x1 at (3,12)

Note that rectangles cannot be in each others space, they cannot overlap. For example the 2x2 rectangle at (4,10) is not mentioned because it would overlap the 5x1 rectangle at (3,10).

All are perfectly valid rectangles: they are equal or greater that the minimum size and all the blocks per rectangle are of the same color.

What I want is to do this programmatically. When you tell someone to find rectangles in a grid, he finds them immediatly, without any thinking about it. The question is, how can I write an algoritm that does the same?

I considered bruteforcing but I need the algorithm to execute as fast as possible as it will need to be executed a lot in a very small time frame on a limited (mobile) device.

I see a lot of questions on the internet about rectangles, but I'm suprised this one hasn't been asked anywhere yet. Am I thinking too difficult or has no one ever wanted to do something like this?

Maineetloire answered 27/4, 2011 at 21:11 Comment(4)

Can rectangles cross? I see you didn't mention 2x2 at (4,10), is it because it has a common part with a bigger 5x1 at (3,10)? – Keaton 27/4, 2011 at 21:22

No, rectangles cannot cross each other. Each rectangle cannot be in the space of others. – Maineetloire 27/4, 2011 at 21:25

@Sebeazz wouldn't that potentially disqualify rectangles based on what order they were found in? – Biparietal 27/4, 2011 at 21:34

@GlowCoder: No, because I want to find the biggest first. – Maineetloire 28/4, 2011 at 7:9

Call the width and height of the input array W and H respectively.

Run this clever O(WH) algorithm for determining the largest rectangle, but instead of tracking just the single largest rectangle, for each (x, y) location record in a W*H matrix the width and height of (one or all of) the largest rectangles whose top-left corner is (x, y), updating these values as you go.
Loop through this matrix, adding each sufficiently-large rectangle in it to a max-heap ordered by area (width * height).
Read entries out of this heap; they will be produced in decreasing area order. With every entry read whose top-left corner is (x, y) and which has width w and height h, mark each of the wh locations included in the rectangle as "used" in a WH bit array. When reading rectangles from the heap, we must discard any rectangles that contain "used" squares to avoid producing overlapping rectangles. It's sufficient to check just the four edges of each candidate rectangle against the "used" array, since the only other way that the candidate rectangle could overlap another rectangle would be if the latter rectangle was completely contained by it, which is impossible due to the fact that we are reading rectangles in decreasing area order.

This approach is "greedy" insofar as it won't guarantee to choose the largest sequence of rectangles overall if there are multiple ways to carve a solid coloured region into maximal rectangles. (E.g. it might be that there are several rectangles whose top-left corner is at (10, 10) and which have an area of 16: 16x1, 8x2, 4x4, 2x8, 1x16. In this case one choice might produce bigger rectangles "downstream" but my algorithm doesn't guarantee to make that choice.) If necessary you could find this overall optimal series of rectangles using backtracking, though I suspect this could be very slow in the worst case.

The maximum-rectangle algorithm I mention is designed for single-colour rectangles, but if you can't adapt it to your multi-colour problem you can simply run it once for each colour before starting step 2.

Burkhart answered 28/4, 2011 at 2:44 Comment(7)

Would you mind updating the link? It doesn't seem to work anymore. – Postconsonantal 15/11, 2013 at 23:14

@JanBerktold: At first I thought so too -- it timed out. But then I found a different URL for the same article: drdobbs.com/database/the-maximal-rectangle-problem/…. This brought up the article after about 30s... and a second attempt on the original URL worked too! So I'm leaving the link as-is. – Burkhart 16/11, 2013 at 0:24

Can you explain further @Burkhart how to do what you describe in step one? I'm not sure A) how to provide an x/y location to the algorithm, B) where to grab the data for each rectangle from the algorithm - I just see the max / max area. – Esker 13/11, 2016 at 17:44

@ZacharyCarter: Whenever the algorithm chooses a maximum of several values to solve a particular subproblem, you should record in a "predecessor" array which choice this was. Then once you have solved the entire problem, you can trace backwards through this array to reconstruct an optimal solution. – Burkhart 13/11, 2016 at 18:25

@Burkhart both links are still broken for me – Piper 10/9, 2019 at 1:41

@AlexReinking: Same for me, just now. I wasn't able to find another copy of that article (in fact Google's top result for "dr dobbs maximal rectangle problem" is still the same, now-useless link!) but it seems the author of the article wrote an answer to a very similar question, and I've asked him if he has another copy online somewhere. Fingers crossed... – Burkhart 10/9, 2019 at 21:53

As is often the case, Archive.org has a copy -- web.archive.org/web/20150921112543/http://www.drdobbs.com/… – Piper 16/9, 2019 at 20:9

Note: this operates under the assumption that you're trying to find the biggest k rectangles.

We know we must, in the worst case, look at every node in the grid at least once. This means our best-case worst-cast is O(len*wid).

Your brute-force is going to be O(len*len*wid*wid) with the naive approach of "Checking for rectangles at a point is O(len*wid), and you do that O(len*wid) times.

It may be that you find this to not be the case, as each time you find a rectangle, you have potential to reduce the problem space. A brute force approach of "check each rectangle" I feel is going to be the best approach. There are things you can do to speed it up, though.

Basic algorithm:

for(x = 1 .. wid) {
    for(y = 1 .. len) {
        Rectangle rect = biggestRectOriginatingAt(x,y);
        // process this rectangle for being added
    }
}

Keep track of the largest k rectangles. As you go along, you can search the perimeter of where an eligible rectangle could possibly be.

Rectangle biggestRectOriginatingAt(x,y) {
    area = areaOf(smallestEligibleRectangle); // if we want the biggest k rect's, this
                                              // returns the area of the kth biggest
                                              // known rectangle thus far

    for(i = 1 .. area) {
        tempwid = i
        templen = area / i

        tempx = x + tempwid
        tempy = y + templen

        checkForRectangle(x,y,tempx,tempy); // does x,y --> tempx,tempy form a rectangle?
    }

}

This allows you to get big performance gains towards the end of your large searches (if it's a small search, you don't gain as much but you don't care because it's a small search!)

This also doesn't work as well for more random distrobutions.

Another optimization is to use a paint-fill algorithm to find the largest consecutive areas. This is O(len*wid), which is a small cost. This will allow you to search the most likely areas for a large rectangle to be.

Note that neither of these approaches reduce the worst case. But, they do reduce the real-world expected running time.

Biparietal answered 27/4, 2011 at 21:33 Comment(0)

I had to solve a very similar problem for my first person shooter. I use that in input:
[  ][  ][  ][  ][  ][  ][  ][  ]
[  ][  ][  ][X][  ][  ][  ][  ]
[  ][X][X][X][X][X][X][X]
[  ][  ][X][X][X][X][  ][  ]
[  ][X][X][X][X][  ][  ][  ]
[  ][X][X][X][X][  ][  ][  ]
[  ][  ][X][  ][  ][  ][  ][  ]
[  ][  ][  ][  ][  ][  ][  ][  ]

I get that in output:
[  ][  ][  ][  ][  ][  ][  ][  ]
[  ][  ][  ][A][  ][  ][  ][  ]
[  ][B][G][G][G][F][E][E]
[  ][  ][G][G][G][F][  ][  ]
[  ][D][G][G][G][  ][  ][  ]
[  ][D][G][G][G][  ][  ][  ]
[  ][  ][C][  ][  ][  ][  ][  ]
[  ][  ][  ][  ][  ][  ][  ][  ]

This schema is better. The source code (under GNU General Public License version 2) is here, it is heavily commented. You may have to adapt it a bit to your needs like the one suggested by j_random_hacker.

Fives answered 10/12, 2012 at 13:53 Comment(3)

Where exactly is the relevant code? The sourceforge folder is quite big. – Postconsonantal 15/11, 2013 at 23:11

Jan, I've just fixed the URL, the code containing the relevant code is in ArrayHelper.computeFullArraysFromNonFullArray(). – Fives 4/2, 2014 at 19:39

@JanBerktold I have commented absolutely everything in this class and I have created a unit test that used square and rectangular arrays in input. It's not very fast but it works :) – Fives 1/11, 2015 at 22:55

My own solution is to find the biggest rectangle, using the same algorithm as in @j_random_hacker answer, then split the remaining area into 4 regions, and recursively search for the biggest rectangle again in each of these regions.

Link to C++ sources

It will find less rectangles than the accepted answer, because I have found it difficult to adopt that algorithm to save each intermediary rectangle when searching for the biggest one. The algorithm skips all smaller rectangles, so we have to iterate through every point in our grid, to save every rectangle possible, then discard smaller ones, and this bumps the algorithm back to O(M³ ⋅ N³) complexity.

We can split the remaining area in two ways, the algorithm will check both, and will use the option which covers the most area, so it will perform the recursive call twice - first time to calculate the area, second time to fill the output array.

    ****|***|***                ************
    ****|***|***                ************
    ****#####***                ----#####---
    ****#####***        vs      ****#####***
    ****#####***                ----#####---
    ****|***|***                ************

We can leave only one choice of area split to make the algorithm run faster, because this area comparison does not provide much improvement to the amount of detected rectangles, to be honest.

Edit: I just realized that recursively checking both splitting variants raises the algorithm to factorial complexity, to something like O(min(M,N)!). So I've disabled the second area split, which leaves the algorithm with complexity around O(M⋅N⋅log(M⋅N)).

Delaminate answered 27/5, 2016 at 14:20 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags