Pre RTree step: Divide a set of points into rectangular regions each containing one point

D

3

6

given my current position (lat,long) I want to quickly find the nearest neighbor in a points of interest problem. Thus I intend to use an R-Tree database, which allows for quick lookup. However, first the database must be populated - of course. Therefore, I need to determine the rectangular regions that covers the area, where each region contains one point of interest.

My question is how do I preprocess the data, i.e. how do I subdivide the area into these rectangular sub-regions? (I want rectangular regions because they are easily added to the RTree - in contrast to more general Voronoi regions).

/John

Denishadenison answered 15/2, 2009 at 21:31 Comment(0)

T

2

Edit: The below approach works, but ignores the critical feature of R-trees -- that The splitting behavior of R-tree nodes is well defined, and maintains a balanced tree (through B-tree-like properties). So in fact, all you have to do is:

Pick the maximum number of rectangles per page
Create seed rectangles (use points furthest away from each other, centroids, whatever).
For each point, choose a rectangle to put it into
1. If it already falls into a single rectangle, put it in there
2. If it does not, extend the rectangle that needs to be extended least (different ways to measure "least" exits -- area works)
3. If multiple rectangles apply -- choose one based on how full it is, or some other heuristic
If overflow occurs -- use the quadratic split to move things around...
And so on, using R-tree algorithms straight out of a text book.

I think the method below is ok for finding your initial seed rectangles; but you don't want to populate your whole R-tree that way. Doing the splits and rebalancing all the time can be a bit expensive, so you will probably want to do a decent chunk of the work with the KD approach below; just not all of the work.

The KD approach: enclose everything in a rectangle.

If the number of points in the rectangle is > threshold, sweep in direction D until you cover half the points.

Divide into rectangles left and right (or above and below) the splitting point).

Call the same procedure recursively on the new rectangles, with the next direction (if you were going left to right, you will now go top to bottom, and vice versa).

The advantage this has over the divide-into-squares approach offered by another poster is that it accommodates skewed point distributions better.

Trigonal answered 15/2, 2009 at 22:10 Comment(5)

Yes, this makes a subdivision even though it is a bit crude/naive approach, resulting in a not very good subdivision. There must (I think at least) exist a better algorithm for this problem. Maybe a two-step method, which first generates the Voronoi regions and then converts those into rectangles.. – Denishadenison 16/2, 2009 at 13:14

John, it actually works pretty ok in practice, unless you have so many dimensions you run out of points before you rotate through. This is how KD-trees are built. Try it. – Trigonal 16/2, 2009 at 13:36

Ok, I might give it a try - but I got a really bad case just by testing with paper and pen... What I otherwise will do, if no-one else comes up with an interesting enough suggestion, is to generate the Voronoi diagram and then build one Minimum Bounding Rectangle for each Voronoi cell. /John – Denishadenison 16/2, 2009 at 14:1

Forgot to say that those generated MBRs will sometimes overlap, but that will not be a problem since the resultset will only contain a handful of candidates, which I can test individually. – Denishadenison 16/2, 2009 at 14:5

We're posting at the same time.. so, the whole thing about r-trees is that you CAN overlap. So it's actually NEVER a problem! – Trigonal 16/2, 2009 at 14:6

R

2

Oracle Spatial Cartridge documentation describes tessellation algorithm that can do what you want. In short:

enclose all your points in square
if square contains 1 point - index square
if square does not contain points - ignore it
if square contains more then 1 point
- split square into 4 equal squares and repeat analysis for each new square

Result should be something like this:
alt text http://download-uk.oracle.com/docs/cd/A64702_01/doc/cartridg.805/a53264/sdo_ina5.gif

Rubyeruch answered 15/2, 2009 at 22:1 Comment(0)

T

2

Edit: The below approach works, but ignores the critical feature of R-trees -- that The splitting behavior of R-tree nodes is well defined, and maintains a balanced tree (through B-tree-like properties). So in fact, all you have to do is:

Pick the maximum number of rectangles per page
Create seed rectangles (use points furthest away from each other, centroids, whatever).
For each point, choose a rectangle to put it into
1. If it already falls into a single rectangle, put it in there
2. If it does not, extend the rectangle that needs to be extended least (different ways to measure "least" exits -- area works)
3. If multiple rectangles apply -- choose one based on how full it is, or some other heuristic
If overflow occurs -- use the quadratic split to move things around...
And so on, using R-tree algorithms straight out of a text book.

I think the method below is ok for finding your initial seed rectangles; but you don't want to populate your whole R-tree that way. Doing the splits and rebalancing all the time can be a bit expensive, so you will probably want to do a decent chunk of the work with the KD approach below; just not all of the work.

The KD approach: enclose everything in a rectangle.

If the number of points in the rectangle is > threshold, sweep in direction D until you cover half the points.

Divide into rectangles left and right (or above and below) the splitting point).

Call the same procedure recursively on the new rectangles, with the next direction (if you were going left to right, you will now go top to bottom, and vice versa).

The advantage this has over the divide-into-squares approach offered by another poster is that it accommodates skewed point distributions better.

Trigonal answered 15/2, 2009 at 22:10 Comment(5)

Yes, this makes a subdivision even though it is a bit crude/naive approach, resulting in a not very good subdivision. There must (I think at least) exist a better algorithm for this problem. Maybe a two-step method, which first generates the Voronoi regions and then converts those into rectangles.. – Denishadenison 16/2, 2009 at 13:14

John, it actually works pretty ok in practice, unless you have so many dimensions you run out of points before you rotate through. This is how KD-trees are built. Try it. – Trigonal 16/2, 2009 at 13:36

Ok, I might give it a try - but I got a really bad case just by testing with paper and pen... What I otherwise will do, if no-one else comes up with an interesting enough suggestion, is to generate the Voronoi diagram and then build one Minimum Bounding Rectangle for each Voronoi cell. /John – Denishadenison 16/2, 2009 at 14:1

Forgot to say that those generated MBRs will sometimes overlap, but that will not be a problem since the resultset will only contain a handful of candidates, which I can test individually. – Denishadenison 16/2, 2009 at 14:5

We're posting at the same time.. so, the whole thing about r-trees is that you CAN overlap. So it's actually NEVER a problem! – Trigonal 16/2, 2009 at 14:6

T

0

I think you a missing something in the problem statement. Assume you have N points (x, y) such that every point has a unique x- and y-coordinate. You can divide your area into N rectangles then by just dividing it into N vertical columns. But that does not help you to solve the nearest POI problem easily, does it? So I think you are thinking about something about the rectangle structure which you haven't articulated yet.

Illustration:

| | | | |5| | |
|1| | | | |6| |
| | |3| | | | |
| | | | | | | |
| |2| | | | | |
| | | | | | |7|
| | | |4| | | |

The numbers are POIs and the vertical lines show a subdivision into 7 rectangular areas. But clearly there isn't much "interesting" information in the subdivision. Is there some additional criterion on the subdivision which you haven't mentioned?

Torrell answered 16/2, 2009 at 8:21 Comment(2)

Well, I mentioned that Voronoi diagrams is not well-suited for the R-Tree, even though they are mathematically correct and divides the area into perfect regions. This implied, at least to me ;-) that I was looking for something similar but with the added constraint that the regions to be rectangular – Denishadenison 16/2, 2009 at 11:9

I ran out of characters in my previous comment... So, yes, my problem statement could be made more exact and explicit - but at least the first two respondents understood my original question. :-) /John – Denishadenison 16/2, 2009 at 11:14

Recommended topics

Hot tags