I have a road_events
table:
create table road_events (
event_id number(4,0),
road_id number(4,0),
year number(4,0),
from_meas number(10,2),
to_meas number(10,2),
total_road_length number(10,2)
);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (1,1,2020,25,50,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (2,1,2000,25,50,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (3,1,1980,0,25,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (4,1,1960,75,100,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (5,1,1940,1,100,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (6,2,2000,10,30,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (7,2,1975,30,60,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (8,2,1950,50,90,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (9,3,2050,40,90,100);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (10,4,2040,0,200,200);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (11,4,2013,0,199,200);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (12,4,2001,0,200,200);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (13,5,1985,50,70,300);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (14,5,1985,10,50,300);
insert into road_events (event_id, road_id, year, from_meas, to_meas, total_road_length) values (15,5,1965,1,301,300);
commit;
select * from road_events;
EVENT_ID ROAD_ID YEAR FROM_MEAS TO_MEAS TOTAL_ROAD_LENGTH
---------- ---------- ---------- ---------- ---------- -----------------
1 1 2020 25 50 100
2 1 2000 25 50 100
3 1 1980 0 25 100
4 1 1960 75 100 100
5 1 1940 1 100 100
6 2 2000 10 30 100
7 2 1975 30 60 100
8 2 1950 50 90 100
9 3 2050 40 90 100
10 4 2040 0 200 200
11 4 2013 0 199 200
12 4 2001 0 200 200
13 5 1985 50 70 300
14 5 1985 10 50 300
15 5 1965 1 301 300
I want to select the events that represent the most recent work on each road.
This is a tricky operation, because the events often pertain to only a portion of the road. This means that I can't simply select the most recent event per road; I need to only select the most recent event mileage that doesn't overlap.
Possible logic (in order):
I'm reluctant to guess at how this problem could be solved, because it could end up hurting more than it helps (kind of like the XY Problem). On the other hand, it might provide insight into the nature of the problem, so here it goes:
- Select the most recent event for each road. We'll call the most recent event:
event A
. - If
event A
is>= total_road_length
, then that's all I need. The algorithm ends here. - Else, get the next chronological event (
event B
) that does not have the same extents asevent A
. - If the extents of
event B
overlap the extents ofevent A
, then only get the portion(s) ofevent B
that do not overlap. - Repeat steps 3 and 4 until the total event length is
= total_road_length
. Or stop when there are no more events for that road.
Question:
I know it's a tall order, but what would it take to do this?
This is a classic linear referencing problem. It would be extremely helpful if I could do linear referencing operations as part of queries.
The result would be:
EVENT_ID ROAD_ID YEAR TOTAL_ROAD_LENGTH EVENT_LENGTH
---------- ---------- ---------- ----------------- ------------
1 1 2020 100 25
3 1 1980 100 25
4 1 1960 100 25
5 1 1940 100 25
6 2 2000 100 20
7 2 1975 100 30
8 2 1950 100 30
9 3 2050 100 50
10 4 2040 200 200
13 5 1985 300 20
14 5 1985 300 40
15 5 1965 300 240
Related question: Select where number range does not overlap
Option 1 (dnoeth's option):
Returning 3 rows would be fine; it's the most explicit option.Option 2:
However, in hindsight, I don't think it's really necessary to return thefrom_meas
andto_meas
as part of the query. But it is necessary to know the length of the events that are returned. Therefore, we could just return anevent_length
column, which would be an aggregate of10-20/2010
and40-100/2010
. And of course, a row for20-40/2020
. – Caelian