Joins on spatial mysql indexes
Asked Answered
A

4

12

I have two tables: one with points, the other with polys.

CREATE TABLE `points` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `point` point NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM;

CREATE TABLE `ranges` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `poly` polygon NOT NULL,
  PRIMARY KEY (`id`),
  SPATIAL KEY `poly` (`poly`)
) ENGINE=MyISAM;

I want to join ranges to points on points inside polys. Queries look simple:

SELECT * 
  FROM points 
    LEFT JOIN ranges 
      ON MBRCONTAINS(poly, point) 
  WHERE points.id = 2;

This query works fast and uses indexes, part of explain:

table  | type  | possible_keys | key  | key_len
ranges | range | poly          | poly | 34

But, when I try to join with several rows from table points:

SELECT * 
  FROM points 
   LEFT JOIN ranges 
    ON MBRCONTAINS(poly, point) 
  WHERE points.id IN (1,2,3);

everything breaks down:

+----+-------------+------------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table      | type  | possible_keys | key     | key_len | ref  | rows   | Extra       |
+----+-------------+------------+-------+---------------+---------+---------+------+--------+-------------+
|  1 | SIMPLE      | points     | range | PRIMARY       | PRIMARY | 4       | NULL |      3 | Using where |
|  1 | SIMPLE      | ranges     | ALL   | poly          | NULL    | NULL    | NULL | 155183 |             |
+----+-------------+------------+-------+---------------+---------+---------+------+--------+-------------+

Adding FORCE INDEX (poly) does not help.

Sample data to test queries (sorry, only php version, I'm not common with SQL procedures):

//points
for($i=0;$i<=500;$i++) {
    $point = mt_rand();
    mysql_query('INSERT INTO points (point) VALUES (POINTFROMWKB(POINT('.$point.', 0)))');
}

$qty = 20000;
$max = mt_getrandmax();
$add = $max / $qty
$end = 0;

//polys
while($end < $max) {
    $start = $end;
    $end = mt_rand($start, $start + $add);
    mysql_query('INSERT INTO ranges (poly) VALUES (
        GEOMFROMWKB(POLYGON(LINESTRING(
            POINT('.$start.', -1),
            POINT('.$end.',   -1),
            POINT('.$end.',    1),
            POINT('.$start.',  1),
            POINT('.$start.', -1)
          )))
    )');
}
Advowson answered 23/11, 2011 at 15:16 Comment(7)
dba.stackexchange.com ?Misfeasance
Can you explain what do you mean with 'everthing breaks down'? The error message is hard to read?!Damle
Would you be kind to post a sample insert for those tables, and Did you tried rewriting that IN condition to use Joins?Selfexistent
@David 'everthing breaks down' - query not using indexes.Advowson
@Selfexistent I'll make sample inserts in few hours. I can substitute 'IN(1,2,3)' with 'points.id <= 3'. Thanks to everybody!Advowson
It appears your polygons in this example are all squares... will this always be the case, or will you also be dealing with other polygon shapes that may be non-standard (Regular -- Pentagon, Irregular--Simple House Shape, Concave-- like a bow tie, Complex-- outline of a star) as just examples...Ensepulcher
@Ensepulcher I need to use only squares.Advowson
M
6

I believe that it's because MySQL doesn't support merging spatial indexes. Not sure if it's still true but I've read it somewhere in the past. If you have an OR statement, then the spatial indexes are not used

In your case, where are you doing points.id = 1, that's a straight select with one result returned that gets used in the mbrcontains. That uses the index.

When you add points.in (1,2,3), that returns 3 results and each needs to be mapped to the ranges table, therefore not working

result

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   SIMPLE  points  range   PRIMARY     PRIMARY     4   NULL    3   100.00  Using where
1   SIMPLE  ranges  ALL     poly    NULL    NULL    NULL    6467418     100.00   

You can simplify your test without the the point table by doing this: SELECT * FROM ranges where mbrcontains( poly, GEOMFROMWKB(POINT(0, 0)))

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   SIMPLE  ranges  range   poly    poly    34  NULL    1   100.00  Using where

And now this; SELECT * FROM ranges where mbrcontains( poly, GEOMFROMWKB(POINT(0, 0))) OR mbrcontains( poly, GEOMFROMWKB(POINT(10, 10)))

result

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   SIMPLE  ranges  ALL     poly    NULL    NULL    NULL    6467418     100.00  Using where

See that in the second case, you are not using index and just scanning.

You could force the query to use index by creating UNION for each specific point but I am not sure if that's going to be faster. I did some tests locally and it was a bit slower than your first query.

EXPLAIN EXTENDED 
SELECT *
FROM points
FORCE INDEX (PRIMARY )
LEFT JOIN ranges
FORCE INDEX ( poly ) ON mbrcontains( poly, point )
WHERE points.id = 1
UNION DISTINCT
SELECT *
FROM points
FORCE INDEX (PRIMARY )
LEFT JOIN ranges
FORCE INDEX ( poly ) ON mbrcontains( poly, point )
WHERE points.id = 2
UNION DISTINCT
SELECT *
FROM points
FORCE INDEX (PRIMARY )
LEFT JOIN ranges
FORCE INDEX ( poly ) ON mbrcontains( poly, point )
WHERE points.id = 3

result

id  select_type     table   type    possible_keys   key     key_len     ref     rows    filtered    Extra
1   PRIMARY     points  const   PRIMARY     PRIMARY     4   const   1   100.00   
1   PRIMARY     ranges  range   poly    poly    34  NULL    1   100.00  Using where
2   UNION   points  const   PRIMARY     PRIMARY     4   const   1   100.00   
2   UNION   ranges  range   poly    poly    34  NULL    1   100.00  Using where
3   UNION   points  const   PRIMARY     PRIMARY     4   const   1   100.00   
3   UNION   ranges  range   poly    poly    34  NULL    1   100.00  Using where
NULL    UNION RESULT    <union1,2,3>    ALL     NULL    NULL    NULL    NULL    NULL    NULL     
Millymilman answered 29/11, 2011 at 19:26 Comment(1)
@oroboros102, found the reference just in case you want it: dev.mysql.com/doc/refman/5.0/en/range-optimization.html . Last paragraph in 7.3.1.3.1.Millymilman
F
3

I have used successfully similar queries, with only one difference in data model: a spatial key on the points database. In my case:

CREATE TABLE geopoints (
  pid int(11) NOT NULL AUTO_INCREMENT,
  description varchar(255) NOT NULL DEFAULT '',
  geopoint point NOT NULL,
  PRIMARY KEY (pid),
  SPATIAL KEY geopoint (geopoint) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

And everything went well in queries like this:

SELECT pt.pid, x(geopoint), Y(geopoint), pl.pid, AsText(geopolygon) 
  FROM geopoints pt INNER JOIN geopolygons pl ON MBRCONTAINS(geopolygon, geopoint)
 WHERE pt.pid IN (1,2,4,5) AND pl.pid BETWEEN 1 AND 5;

my two cents,

Fatality answered 28/7, 2012 at 21:23 Comment(0)
E
0

If all you are dealing with are squares, I would just deal with 4 numbers in your table that can be indexed representing Top, Left, Height, Width, then run your query where your point in question has its "X" coordinate between Left, Left+Width and "Y" coordinate between Top, Top+Height.

Ensepulcher answered 30/11, 2011 at 12:27 Comment(2)
But I use geometry, because it is much faster than BETWEEN.Advowson
Actualy, MySQL can not join on BETWEEN. So, I had an idea to make such joins on spatial indexes. But, still, that diddn't worked out.Advowson
S
0

You can force MySQL to use the index by encapsulating the subselect in a function.

For example:

DELIMITER $$

DROP FUNCTION IF EXISTS `GetMyPolygon`$$
CREATE DEFINER=`root`@`localhost` FUNCTION `GetMyPolygon`(p POINT) RETURNS INTEGER
BEGIN

DECLARE ret INTEGER;

SET ret = (SELECT range_id FROM ranges WHERE ST_CONTAINS(poly, p) ;

RETURN ret;

END$$

If polygons do not overlap, you can then make:

SELECT *, GetMyPolygon(point) FROM points 

If they do overlap, but there a few, you could make a similar function that makes group_concat...

Soukup answered 11/10, 2018 at 17:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.