SQL to find overlapping time periods and sub-faults
Asked Answered
H

2

9

Long time stalker, first time poster (and SQL beginner). My question is similar to this one SQL to find time elapsed from multiple overlapping intervals, except I'm able to use CTE, UDFs etc and am looking for more detail.

On a piece of large scale equipment I have a record of all faults that arise. Faults can arise on different sub-components of the system, some may take it offline completely (complete outage = yes), while others do not (complete outage = no). Faults can overlap in time, and may not have end times if the fault has not yet been repaired.

Outage_ID     StartDateTime     EndDateTime     CompleteOutage
1             07:00 3-Jul-13    08:55 3-Jul13   Yes
2             08:30 3-Jul-13    10:00 4-Jul13   No
3             12:00 4-Jul-13                    No
4             12:30 4-Jul13     12:35 4-Jul-13  No


1 |---------|
2    |---------|
3                 |--------------------------------------------------------------
4                      |---|

I need to be able to work out for a user defined time period, how long the total system is fully functional (no faults), how long its degraded (one or more non-complete outages) and how long inoperable (one or more complete outages). I also need to be able to work out for any given time period which faults were on the system. I was thinking of creating a "Stage Change" table anytime a fault is opened or closed, but I am stuck on the best way to do this - any help on this or better solutions would be appreciated!

Helton answered 16/12, 2013 at 11:26 Comment(5)
Just thinking about doing this in SQL makes my head explode. :) May I ask why not do this in application code?Instinct
This can be done in SQL (search for the gaps and islands problem) but it's too complicated.Goods
imho it would be interesting to do it via SQL, if done via code and you had to go through multiple years of faults of different components it would quickly become unfeasible.Evora
Which DBMS are you using? Postgres? Oracle?Talion
I know how you feel - my head a exploded a couple of days ago already! I figured would be faster in SQL rather than a series of loops in application code. I'm using MS SQL Server 2005.Helton
B
2

This isn't a complete solution (I leave that as an exercise :)) but should illustrate the basic technique. The trick is to create a state table (as you say). If you record a 1 for a "start" event and a -1 for an "end" event then a running total in event date/time order gives you the current state at that particular event date/time. The SQL below is T-SQL but should be easily adaptable to whatever database server you're using.

Using your data for partial outage as an example:

DECLARE @Faults TABLE (
    StartDateTime DATETIME NOT NULL,
    EndDateTime DATETIME NULL
)
INSERT INTO @Faults (StartDateTime, EndDateTime)
    SELECT '2013-07-03 08:30', '2013-07-04 10:00'
    UNION ALL SELECT '2013-07-04 12:00', NULL
    UNION ALL SELECT '2013-07-04 12:30', '2013-07-04 12:35'

-- "Unpivot" the events and assign 1 to a start and -1 to an end
;WITH FaultEvents AS (
    SELECT *, Ord = ROW_NUMBER() OVER(ORDER BY EventDateTime)
        FROM (
            SELECT EventDateTime = StartDateTime, Evt = 1
                FROM @Faults
            UNION ALL SELECT EndDateTime, Evt = -1
                FROM @Faults
                WHERE EndDateTime IS NOT NULL
        ) X
)
-- Running total of Evt gives the current state at each date/time point
, FaultEventStates AS (
    SELECT A.Ord, A.EventDateTime, A.Evt, [State] = (SELECT SUM(B.Evt) FROM FaultEvents B WHERE B.Ord <= A.Ord)
        FROM FaultEvents A
)
SELECT StartDateTime = S.EventDateTime, EndDateTime = F.EventDateTime
    FROM FaultEventStates S
        OUTER APPLY (
            -- Find the nearest transition to the no-fault state
            SELECT TOP 1 *
                FROM FaultEventStates B
                WHERE B.[State] = 0
                    AND B.Ord > S.Ord
                ORDER BY B.Ord
        ) F
    -- Restrict to start events transitioning from the no-fault state
    WHERE S.Evt = 1 AND S.[State] = 1

If you are using SQL Server 2012 then you have the option to calculate the running total using a windowing function.

Berget answered 17/12, 2013 at 9:10 Comment(1)
Thanks David! Took me a while to get my head around this but I think I can use it as a base.Helton
T
1

The below is a rough guide to getting this working. It will compare against an interval table of dates and an interval table of 15 mins. It will then sum the outage events (1 event per interval), but not sum a partial outage if there is a full outage.

You could use a more granular time interval if you needed, I choose 15 mins for speed of coding.

I already had a date interval table set up "CAL.t_Calendar" so you would need to create one of your own to run this code.

Please note, this does not represent actual code you should use. It is only intended as a demonstration and to point you in a possible direction...

EDIT I've just realised I have't accounted for the null end dates. The code will need amending to check for NULL endDates and use @EndDate or GETDATE() if @EndDate is in the future

            --drop table ##Events
            CREATE TABLE #Events (OUTAGE_ID INT IDENTITY(1,1) PRIMARY KEY 
                                    ,StartDateTime datetime
                                    ,EndDateTime datetime
                                    , completeOutage bit)

            INSERT INTO #Events VALUES ('2013-07-03 07:00','2013-07-03 08:55',1),('2013-07-03 08:30','2013-07-04 10:00',0)
                                        ,('2013-07-04 12:00',NULL,0),('2013-07-04 12:30','2013-07-04 12:35',0)


            --drop table #FiveMins
            CREATE TABLE #FiveMins (ID int IDENTITY(1,1) PRIMARY KEY, TimeInterval Time)


            DECLARE @Time INT = 0

            WHILE @Time <= 1410 --number of 15 min intervals in day * 15 

            BEGIN

                INSERT INTO #FiveMins SELECT DATEADD(MINUTE , @Time,  '00:00')

                SET @Time = @Time + 15

            END

            SELECT * from #FiveMins



            DECLARE @StartDate DATETIME = '2013-07-03'
            DECLARE @EndDate DATETIME = '2013-07-04 23:59:59.999'


            SELECT SUM(FullOutage) * 15 as MinutesFullOutage
                    ,SUM(PartialOutage) * 15 as MinutesPartialOutage
                    ,SUM(NoOutage) * 15  as MinutesNoOutage
            FROM
            (
                SELECT DateAnc.EventDateTime
                        , CASE WHEN COUNT(OU.OUTAGE_ID) > 0 THEN 1 ELSE 0 END AS FullOutage
                        , CASE WHEN COUNT(OU.OUTAGE_ID) = 0 AND COUNT(pOU.OUTAGE_ID) > 0 THEN 1 ELSE 0 END AS PartialOutage
                        , CASE WHEN COUNT(OU.OUTAGE_ID) > 0 OR  COUNT(pOU.OUTAGE_ID) > 0 THEN 0 ELSE 1 END AS NoOutage
                FROM 
                (
                    SELECT CAL.calDate + MI.TimeInterval AS EventDateTime
                    FROM CAL.t_Calendar CAL

                    CROSS JOIN #FiveMins MI

                    WHERE CAL.calDate BETWEEN @StartDate AND @EndDate 
                ) DateAnc

                LEFT JOIN #Events OU
                ON DateAnc.EventDateTime BETWEEN OU.StartDateTime AND OU.EndDateTime
                AND OU.completeOutage = 1

                LEFT JOIN #Events pOU
                ON DateAnc.EventDateTime BETWEEN pOU.StartDateTime AND pOU.EndDateTime
                AND pOU.completeOutage = 0

                GROUP BY DateAnc.EventDateTime
            ) AllOutages
Thiazine answered 16/12, 2013 at 17:1 Comment(2)
I thought about using a similar approach to this, but I need to run the same code for multiple systems at multiple sites (roughly 60 total), so I thought that it would get quite slow. Granularity is important (some faults are less than a minute for systems resets) so would have to use short time intervals too, which makes it worse. Thanks for the answer though, I do appreciate it!Helton
It is true that this solution may not scale very well, especially over long periods of time. I would also say that the CTE method will perhaps also suffer from scaling issues, but from number of events, not necessarily time range. I'd test both to be honest and keep an eye out for another method - whichever suits your needs better.Thiazine

© 2022 - 2024 — McMap. All rights reserved.