SQL Server : Geography search performance - query nearest stores
Asked Answered
E

2

7

I have a performance query nearest stores:

We have a table that contains around 50,000 records (stores/point of sale locations) in one country.

Each record has location columns of type "geography"

[LOCATION_geo] [geography]

Also for performance I created a SPATIAL INDEX over that location column using this syntax

CREATE SPATIAL INDEX [LOCATION_geoIndex] 
ON [dbo].[StoreLocations] ([LOCATION_geo])
USING GEOGRAPHY_GRID 
WITH (
GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM), 
CELLS_PER_OBJECT = 16, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

I have a stored procedure to return the nearest 1000 store for the user current location.

USE [CompanyDB]
GO
SET STATISTICS TIME ON;  
GO  
declare @point geography;
    set @point = geography::Point(49.2471855, -123.1078987, 4326);

    SELECT top (1000) [id]
          ,[Location_Name]
          ,[LOCATION_geo]from [MYDB].[dbo].[StoreLocations]
        where [LOCATION_geo].STDistance(@point) <= 10000
        ORDER BY [LOCATION_geo].STDistance(@point)

The problem is that query always takes 656 ms to 800 ms. And this is not acceptable performance for our web site, because we expect too many synchronous calls.

(1000 row(s) affected)

SQL Server Execution Times: CPU time = 923 ms, elapsed time = 1511 ms.

Note: that most of stores located in some cities (about 10 cities).

Also I noticed that Clustered Index Seek cost >= 45% of total query cost.

So my question is are there a better way to improve the performance of that query?

Ermey answered 23/12, 2016 at 15:2 Comment(10)
Is the spatial index used? Could you confirm it is used via the execution plan?Imprecation
Yes it's used, costs 8%, and estimated row size 17 B , ordered : true, estimated number of excecutions 109.8.Ermey
Also this Index in the excution plan always give a warning "Columns with no statistics: ....SRID , ....pk0", Actually I don't know what it meansErmey
I guess if you are sure you are using the index, this is the best you can get as performance. You can think of an other solution - for example create 10 tables for storing the locations for each big city, and one table for storing the rest of the records; then, having the coordinates of the user, calculated which of these tables to query; maybe reducing the records will improve the performance;Imprecation
I'm curious if returning 1000 rows is what's causing the slowness. What happens if you drop it to 1 as a test? Also, do you need to return the location_geo column in your result set?Sherrylsherurd
@BenThul, number of results doesn't matter, the big differnce happen only if I reduce the radias from (10,000, to for example 1000 meters), this will reduce the time taken dramatically. note: I'm not returning the location_geo in the result.Ermey
Your example query is returning location_geo, which is why I said as much. As for the performance gain in reducing the radius, you might be able to get what you're looking for by adjusting the grid resolution in your spatial index.Sherrylsherurd
@BenThul , could you be more specific , how could I adjusting the grid resolution?, sorry i have no experience on that :) . are there an example?Ermey
The grid resolution is the GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM) part in your spatial index. The documentation does a much better job than I could. technet.microsoft.com/en-us/library/bb964712(v=sql.105).aspxSherrylsherurd
One other thing that occurred to me is that I've had some success by writing these sorts of queries as where @g.STIntersects([LOCATION_geo]) = 1 with @g being defined as declare @g geography = @point.STBuffer(10000);Sherrylsherurd
B
1

I would suggest to add one more column named distance to the table where distance would be distance of LOCATION_geo from Point(0, 0, 0). See sample insert statement below:

   INSERT INTO [GWDB].[dbo].[StoreLocations]
          ([id]
          ,[Location_Name]
          ,[LOCATION_geo]
          ,[Distance])
    Values(@id
          ,@Location_Name
          ,@LOCATION_geo
          ,@LOCATION_geo..STDistance(Point(0, 0, 0))

You should also create an index on the new column distance and change your Stored Procedure as below:

USE [CompanyDB]
GO
SET STATISTICS TIME ON;  
GO  
declare @point geography;
declare @distance float;
    set @point = geography::Point(49.2471855, -123.1078987, 4326);
    set @distance = @point.STDistance(geography::Point(0, 0, 0);

    SELECT top (1000) [id]
          ,[Location_Name]
          ,[LOCATION_geo]from [GWDB].[dbo].[StoreLocations]
        where 
        distance < @distance AND
        [LOCATION_geo].STDistance(@point) <= 10000
        ORDER BY [LOCATION_geo].STDistance(@point)
Bruns answered 27/12, 2016 at 6:9 Comment(2)
It deserves a try. But this will not result in accurate result, I think if the search point in the west, this query will get only the east stores only (the stores in the east of the search point ).Ermey
I tried it , it doesn't improve the performance scientifically because I think the main reason of low performance is the order by statement. Thank you.Ermey
A
0

I am not sure how well this will work in your application, in certain scenarios this is faster, but slower in others. When the point you are searching around is close to your data, this search process is faster. When the search point is further from your data, it is slower.

In my scenario, all of my points are relatively close (16 million records). These are the speed differences I see.

|--Search Location--|--STIntersects() time--|--Numbers time--|
--------------------------------------------------------------
|Close              |5 seconds              |700 ms          |
|Far                |90 ms                  |4 seconds       |

Basically, the idea is to incrementally expand your search area using a numbers table.

DECLARE @point GEOGRAPHY = GEOGRAPHY::Point(49.2471855, -123.1078987, 4326)
DECLARE @MaximumRaidus INT = 10000

SELECT TOP 100
  ID,
  Location_Name,
  Location_geo
FROM
  GWDB.dbo.StoreLocations WITH(INDEX([LOCATION_geoIndex]))
CROSS JOIN
  GWDB.dbo.Numbers N
WHERE
  N.n BETWEEN 0 AND SQRT(@MaximumRadius)
  AND Location_geo.STIntersects(@Point.STBuffer(POWER(N.n,2))) = 1
ORDER BY
  N.n
Aquilar answered 29/12, 2016 at 23:29 Comment(2)
Thank you, but I didn't get what is the dbo.Numbers table?? I don't have such table.Ermey
@Tarek_El-Mallah a numbers table is simply a table of integers. I.e. The table numbers has one column (n INT) that has the values 0 through 10000 or so. They are very useful for doing things like this, or for filling in gaps in data, or splitting data out for a histogram, etc. Google the term and you should find some references to their uses and how to create them.Aquilar

© 2022 - 2024 — McMap. All rights reserved.