Oracle Analytic Question
Asked Answered
E

4

6

Given a function zipdistance(zipfrom,zipto) which calculates the distance (in miles) between two zip codes and the following tables:

create table zips_required(
   zip varchar2(5)
);

create table zips_available(
   zip varchar2(5),
   locations number(100)
);

How can I construct a query that will return to me each zip code from the zips_required table and the minimum distance that would produce a sum(locations) >= n.

Up till now we've just run an exhaustive loop querying for each radius until we've met the criteria.

--Do this over and over incrementing the radius until the minimum requirement is met
select count(locations) 
from zips_required zr 
left join zips_available za on (zipdistance(zr.zip,za.zip)< 2) -- Where 2 is the radius

This can take a while on a large list. It feels like this could be done with an oracle analytic query along the lines of:

min() over (
  partition by zips_required.zip 
  order by zipdistance( zips_required.zip, zips_available.zip)
  --range stuff here?
) 

The only analytic queries I have done have been "row_number over (partition by order by)" based, and I'm treading into unknown areas with this. Any guidance on this is greatly appreciated.

Engvall answered 23/6, 2009 at 16:42 Comment(0)
W
2

This is what I came up with :

SELECT zr, min_distance
  FROM (SELECT zr, min_distance, cnt, 
               row_number() over(PARTITION BY zr ORDER BY min_distance) rnk
           FROM (SELECT zr.zip zr, zipdistance(zr.zip, za.zip) min_distance,
                         COUNT(za.locations) over(
                             PARTITION BY zr.zip 
                             ORDER BY zipdistance(zr.zip, za.zip)
                         ) cnt
                    FROM zips_required zr
                   CROSS JOIN zips_available za)
          WHERE cnt >= :N)
 WHERE rnk = 1
  1. For each zip_required calculate the distance to the zip_available and sort them by distance
  2. For each zip_required the count with range allows you to know how many zip_availables are in the radius of that distance.
  3. filter (first where COUNT(locations) > N)

I used to create sample data:

INSERT INTO zips_required
   SELECT to_char(10000 + 100 * ROWNUM) FROM dual CONNECT BY LEVEL <= 5;

INSERT INTO zips_available
   (SELECT to_number(zip) + 10 * r, 100 - 10 * r FROM zips_required, (SELECT ROWNUM r FROM dual CONNECT BY LEVEL <= 9));

CREATE OR REPLACE FUNCTION zipdistance(zipfrom VARCHAR2,zipto VARCHAR2) RETURN NUMBER IS
BEGIN
   RETURN abs(to_number(zipfrom) - to_number(zipto));
END zipdistance;
/

Note: you used COUNT(locations) and SUM(locations) in your question, I assumed it was COUNT(locations)

Wulf answered 24/6, 2009 at 7:51 Comment(0)
G
1
SELECT  *
FROM    (
        SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY rn DESC) AS rn2
        FROM    (
                SELECT  zip, zd, ROW_NUMBER() OVER (PARTITION BY zip ORDER BY zd DESC) AS rn
                FROM    (
                        SELECT  zr.zip, zipdistance(zr.zip, za.zip) AS zd
                        FROM    zips_required zr
                        JOIN    zips_available za
                        )
                )
        WHERE   rn <= n
        )
WHERE   rn2 = 1

For each zip_required, this will select the minimal distance into which fit N zip_available's, or maximal distance if the number of zip_available's is less than N.

Grimbald answered 23/6, 2009 at 16:49 Comment(3)
I think this is close. In your example, rn will just be the ranking of the distance between 2 zips ordered by the distance. What I need is the zipdistance of the last one in that list which the sum of its locations plus all previous locations is greater than or equal to N.Engvall
@Josh: this will return the distance of the farthest location with the N closest. Isn't it what do you want?Grimbald
limit 1 in an Oracle query? I missed something.Obese
S
1

I solved the same problem by creating a subset of ZIP's within a square radius from the given zip (easy math: < or > NSWE radius ), then iterating through each entry in the subset to see if it was within the needed radius. Worked like a charm and was very fast.

Sofer answered 27/6, 2009 at 23:42 Comment(0)
C
0

I had partly similar requirements in one of my old projects... to calculate distance between 2 zipcodes in the US. To solve the same I had made great use of US Spatial Data. Basically the approach was to get the Source Zipcode(Latitude, Longitude) and Destination Zipcode(Latitude, Longitude). Now then I had applied a function to get the distance based on the above. The base formula that helps in doing this calculation is available in the following site I had also validated the outcome by referring to this site...

Note: However this will provide approximate distances, so one can use this accordingly. Benefits are once constructed its superfast to fetch the results.

Culp answered 10/8, 2012 at 18:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.