I have a table in an oracle database. The schema is
create table PERIODS
(
ID NUMBER,
STARTTIME TIMESTAMP,
ENDTIME TIMESTAMP,
TYPE VARCHAR2(100)
)
I have two different TYPE's
: TYPEA
and TYPEB
. The have independent start and end times and they can overlap. What I would like to find are the periods of TYPEB
that started, are fully contained or ended within a given period of TYPEA
.
Here is what I came up with so far (with some sample data)
WITH mydata
AS (SELECT 100 ID,
To_timestamp('2015-08-01 11:00', 'YYYY-MM-DD HH24:MI') STARTTIME,
To_timestamp('2015-08-01 11:20', 'YYYY-MM-DD HH24:MI') ENDTIME,
'TYPEA' TYPE
FROM dual
UNION ALL
SELECT 110 ID,
To_timestamp('2015-08-01 11:30', 'YYYY-MM-DD HH24:MI') STARTTIME,
To_timestamp('2015-08-01 11:50', 'YYYY-MM-DD HH24:MI') ENDTIME,
'TYPEA' TYPE
FROM dual
UNION ALL
SELECT 120 ID,
To_timestamp('2015-08-01 12:00', 'YYYY-MM-DD HH24:MI') STARTTIME,
To_timestamp('2015-08-01 12:20', 'YYYY-MM-DD HH24:MI') ENDTIME,
'TYPEA' TYPE
FROM dual
UNION ALL
SELECT 105 ID,
To_timestamp('2015-08-01 10:55', 'YYYY-MM-DD HH24:MI') STARTTIME,
To_timestamp('2015-08-01 11:05', 'YYYY-MM-DD HH24:MI') ENDTIME,
'TYPEB' TYPE
FROM dual
UNION ALL
SELECT 108 ID,
To_timestamp('2015-08-01 11:05', 'YYYY-MM-DD HH24:MI') STARTTIME,
To_timestamp('2015-08-01 11:15', 'YYYY-MM-DD HH24:MI') ENDTIME,
'TYPEB' TYPE
FROM dual
UNION ALL
SELECT 111 ID,
To_timestamp('2015-08-01 11:15', 'YYYY-MM-DD HH24:MI') STARTTIME,
To_timestamp('2015-08-01 12:25', 'YYYY-MM-DD HH24:MI') ENDTIME,
'TYPEB' TYPE
FROM dual),
typeas
AS (SELECT starttime,
endtime
FROM mydata
WHERE TYPE = 'TYPEA'),
typebs
AS (SELECT id,
starttime,
endtime
FROM mydata
WHERE TYPE = 'TYPEB')
SELECT id
FROM typebs b
join typeas a
ON ( b.starttime BETWEEN a.starttime AND a.endtime )
OR ( b.starttime BETWEEN a.starttime AND a.endtime
AND b.endtime BETWEEN a.starttime AND a.endtime )
OR ( b.endtime BETWEEN a.starttime AND a.endtime )
ORDER BY id;
This seems to work in principle, the result from the query above is
ID
----------
105
108
111
so it selects the three periods TYPEB
that started or ended inside the first TYPEA
period.
The problem is that the table has about 200k entries and already at this size the above query is quite slow --- which is very surprising to me as the number of both TYPEA
and TYPEB
entries is quite low ( 1-2k )
Is there a more efficient way to perform this type of self join? Did I miss something else in my query?
type
values, but eachtype
only has 1,000-2,000 rows. Does that imply that ~196,000 rows in the table have aNULL
for theirtype
? If you're asking a tuning question, can you post the query plan you're getting and what indexes are available? – Nomadicjoin
conditions what you are doing is (A union B union A intersection B), which is equivalent to A union B (conditions being referred to as A,B). Try taking that out and see if you get any performance gain. – Tolerance(type,starttime,endtime)
and that the statistics ontype
were accurate (which might require a histogram depending on how many differenttype
values they really are and whether typeA and typeB are more or less common than other type values). – NomadicWHERE
clause as it is redundant, see sqlfiddle.com/#!4/4d32c/12 – Synchronizeselect type, count(*) from periods group by type;
? – Hotheaded