Query for maximum number of concurrent time spans
Asked Answered
Z

5

8

I have a SQL Server table with two datetime fields (CnxStartdatetime, CnxEnddatetime). Each row represents a transmission of information. I am trying to find the maximum number of concurrent transmissions based on these two timestamps. I have a working query but it is both slow and extremely cumbersome. I know there must be a better way to go about this but can't come up with any.

For the current version, if I run it with 5 "levels" and get results I have to go back and add a ton of SQL to test if there are instances of 6 concurrent transmissions, etc. Once the query gets 7-8 "levels" deep it becomes very slow.

Snippet of current version:

select 
    t1.id, t2.id, t3.id, t4.id, t5.id, t6.id, t7.id, t8.id, t9.id, t10.id

FROM
dbo.MyTable t1, dbo.MyTable t2, dbo.MyTable t3, dbo.MyTable t4, dbo.MyTable t5,
dbo.MyTable t6, dbo.MyTable t7, dbo.MyTable t8, dbo.MyTable t9, dbo.MyTable t10
WHERE
(((t2.cnxstartdatetime >= t1.cnxstartdatetime) and (t2.cnxstartdatetime <= t1.cnxenddatetime))
or ((t2.cnxenddatetime >= t1.cnxstartdatetime) and (t2.cnxenddatetime <= t1.cnxenddatetime)))
AND
t2.id != t1.id
AND
(((t3.cnxstartdatetime >= t2.cnxstartdatetime) and (t3.cnxstartdatetime >= t1.cnxstartdatetime)and (t3.cnxstartdatetime <= t1.cnxenddatetime) and (t3.cnxstartdatetime <= t2.cnxenddatetime))
or ((t3.cnxenddatetime >= t2.cnxstartdatetime) and (t3.cnxenddatetime >= t1.cnxstartdatetime)and (t3.cnxenddatetime <= t1.cnxenddatetime) and (t3.cnxenddatetime <= t2.cnxenddatetime)))
AND
t3.id != t2.id AND t3.id != t1.id
AND
(((t4.cnxstartdatetime >= t3.cnxstartdatetime) and (t4.cnxstartdatetime >= t1.cnxstartdatetime)and (t4.cnxstartdatetime >= t2.cnxstartdatetime) and (t4.cnxstartdatetime <= t1.cnxenddatetime) and (t4.cnxstartdatetime <= t3.cnxenddatetime)and (t4.cnxstartdatetime <= t2.cnxenddatetime))
or ((t4.cnxenddatetime >= t3.cnxstartdatetime) and (t4.cnxenddatetime >= t1.cnxstartdatetime)and (t4.cnxenddatetime >= t2.cnxstartdatetime) and (t4.cnxenddatetime <= t1.cnxenddatetime)and (t4.cnxenddatetime <= t3.cnxenddatetime)and (t4.cnxenddatetime <= t2.cnxenddatetime)))
AND
t4.id != t3.id AND t4.id != t2.id AND t4.id != t1.id
... *snip*

Edit Many of the responses are suggesting I use a cross join. This does not achieve the results I am looking for. Here's an example of the results of the cross join for one record's "overlaps." This is the list that it gives me for ID 11787 As you can see, 11781 does not overlap 11774 This is simply a list of any record whose time span intersects 11787

11774    2011-04-29 01:02:56.780    2011-04-29 01:02:58.793
11777    2011-04-29 01:02:56.780    2011-04-29 01:02:58.843
11778    2011-04-29 01:02:56.780    2011-04-29 01:02:58.950
11775    2011-04-29 01:02:56.793    2011-04-29 01:02:58.843
11776    2011-04-29 01:02:56.793    2011-04-29 01:02:58.890
11780    2011-04-29 01:02:58.310    2011-04-29 01:03:02.687
11779    2011-04-29 01:02:58.327    2011-04-29 01:03:02.543
11787    2011-04-29 01:02:58.530    2011-04-29 01:03:08.827 **
11781    2011-04-29 01:02:59.030    2011-04-29 01:03:05.187
11782    2011-04-29 01:02:59.247    2011-04-29 01:03:05.467
11784    2011-04-29 01:02:59.293    2011-04-29 01:03:05.810
11791    2011-04-29 01:03:00.107    2011-04-29 01:03:13.623
11786    2011-04-29 01:03:00.843    2011-04-29 01:03:08.983
11783    2011-04-29 01:03:02.560    2011-04-29 01:03:05.793
11785    2011-04-29 01:03:02.717    2011-04-29 01:03:07.357
11790    2011-04-29 01:03:05.200    2011-04-29 01:03:14.153
11804    2011-04-29 01:03:05.687    2011-04-29 01:03:25.577
11811    2011-04-29 01:03:07.093    2011-04-29 01:03:35.153
11799    2011-04-29 01:03:07.123    2011-04-29 01:03:24.437
11789    2011-04-29 01:03:08.793    2011-04-29 01:03:13.577

I've also attempted writing a CTE with recursion but I can't figure out how to insure the current ID doesn't match any previous ID in the current stack of concurrency. The below just recurses upon itself until it hits the limit.

WITH TransmissionConcurrency (StartTime, EndTime, ConcurrencyLevel) AS
(
    SELECT
        CnxStartDatetime AS StartTime,
        CnxEndDatetime AS EndTime,
        1 AS ConcurrencyLevel
    FROM dbo.MyTable

    UNION ALL

    SELECT
        CASE WHEN d.CnxStartDatetime > tc.StartTime THEN d.CnxStartDatetime ELSE tc.StartTime END AS StartTime,
        CASE WHEN d.CnxEndDatetime < tc.EndTime THEN d.CnxEndDatetime ELSE tc.EndTime END AS EndDate,
        tc.ConcurrencyLevel + 1 as ConcurrencyLevel
    FROM dbo.MyTable d
        INNER JOIN TransmissionConcurrency tc ON
            ((d.CnxStartDatetime between tc.StartTime and tc.EndTime)
            or
            (d.CnxEndDatetime between tc.StartTime and tc.EndTime)
            or
            (d.CnxStartDatetime <= tc.StartTime and d.CnxEndDatetime >= tc.EndTime))
)

SELECT * 
FROM TransmissionConcurrency
ORDER BY ConcurrencyLevel, StartTime, EndTime

I've come up with the below diagram to try to better explain what I'm looking for.

A         [--------]
B    [-----]
C              [------]
D   [---]
E             [---]
F         [-]

In this instance, the cross join methods would tell me that the maximum concurrency with A is 6 (A with B, C, D, E and F) What I'm looking for would be a max concurrency of 3 (A with B,F or A with C,E)

Zeal answered 29/4, 2011 at 5:31 Comment(1)
possible duplicate of Finding simultaneous events in a database between timesTrillium
S
1

Jeff. I've written a similar query once - but in Oracle - not sure whether this will work in SQL-Server, but it's worth trying: maybe it'll give you some idea:

select
  t.time as b,
  lead(t.time)  over (order by t.time, t.weight desc) as e,
  sum(t.weight) over (order by t.time, t.weight desc) as cnt
from
  ( select trunc(:aStartWith)   as time,  0 as weight from dual
    union all
    select req_recieved as time, +1 as weight
      from log_tbl
      where trunc(req_recieved, 'mi') between :aStartWith - interval '10' minute and :aEndWith + interval '10' minute
    union all
    select response_sent as time, -1 as weight
      from log_tbl
      where trunc(req_recieved, 'mi') between :aStartWith - interval '10' minute and :aEndWith + interval '10' minute
    union all
    select trunc(:aEndWith) as time,  0 as weight from dual
  ) t

The general idea is that I go through all requests between :aStartWith date and :aEndWith date assigning +1 weight portion to every request that starts in the given period and -1 to every request that end in the same period.

Here I assume that requests are no longer that 10 minutes (where trunc(req_recieved, 'mi') between :aStartWith - interval '10' minute and :aEndWith + interval '10' minute); and select ... from dual are boundary conditions.

Then with analytic functions I find the end time of the request (lead(t.time) over (order by t.time, t.weight desc) as e) and sum the weights for the current request - this will give a number of requests starting at time b and ending at time e (sum(t.weight) over (order by t.time, t.weight desc) as cnt).

To find maximum number of requests you can just wrap this query with desired evaluations.

Could you please try if this scenario works for you? Hope it does :)

Standoffish answered 29/4, 2011 at 6:42 Comment(1)
So this is based on passing in pairs of timestamps representing time spans to check for the number of requests occurring within the time span? Given that the timing of requests starting/ending goes down to the millisecond how exactly am I supposed come up with all suitable pairs of start/end timestamps to pass in? I might be confused about how your solution works.Zeal
V
1
declare @T table (ID int, Starts datetime, Ends datetime)
insert into @T (ID, Starts, Ends) values
(1, '2000-12-30', '2000-12-31'),
(2, '2001-01-01', '2001-01-10'),
(3, '2001-01-02', '2001-01-05'),
(4, '2001-01-03', '2001-01-04'),
(5, '2001-01-05', '2001-01-10')

select T1.ID, count(*) as Levels
from @T as T1
  cross join @T as T2
where
  T1.Starts < T2.Ends and
  T1.Starts > T2.Starts
group by T1.ID

select top 1 T1.ID, count(*) as Levels
from @T as T1
  cross join @T as T2
where
  T1.Starts < T2.Ends and
  T1.Starts > T2.Starts
group by T1.ID
order by count(*) desc

Result

ID          Levels
----------- -----------
3           1
4           2
5           1

(3 row(s) affected)

ID          Levels
----------- -----------
4           2

If you want the rows that is the involved you can use this:

select T2.*
from (select top 1 T1.ID
      from @T as T1
        cross join @T as T2
      where
        T1.Starts < T2.Ends and
        T1.Starts > T2.Starts
      group by T1.ID
      order by count(*) desc) as C
  inner join @T as T1
    on C.ID = T1.ID
  inner join @T as T2
    on T1.Starts < T2.Ends and
       T1.Starts > T2.Starts or
       T2.ID = C.ID

Result:

ID          Starts                  Ends
----------- ----------------------- -----------------------
2           2001-01-01 00:00:00.000 2001-01-10 00:00:00.000
3           2001-01-02 00:00:00.000 2001-01-05 00:00:00.000
4           2001-01-03 00:00:00.000 2001-01-04 00:00:00.000
Versed answered 29/4, 2011 at 17:13 Comment(2)
I added some information to the question to show how this isn't quite what I'm looking for.Zeal
@Jeff Added another version. It is not the cross join that is to blame it is the where clause. I guess that the duplicate suggested from Martin is what you want. The difference between that and my change is that when using between is that between is inclusive. This version counts the max concurrent levels instead of total overlapping spans.Versed
B
0

It is rather reporting solution than "standard" database query. Best option for this is to write somewhere transactions quantity at start of every transaction). All other solutions will be slow. But if you really need this...

Simplest solution is to split time period for small parts (eg days) and analyse counts in every piece of period. Here is an example:

DECLARE @table TABLE
    (
      starts DATETIME ,
      ends DATETIME ,
      trn INT
    )

INSERT  INTO @table
        ( starts ,
          ends ,
          trn
        )
        SELECT  '2003-01-01' ,
                '2003-01-03' ,
                1
        UNION
        SELECT  '2003-01-02' ,
                '2003-01-04' ,
                2
        UNION
        SELECT  '2003-01-02' ,
                '2005-06-06' ,
                3 ;
WITH    numbers
          AS ( SELECT   Row_NUmber() OVER ( ORDER BY o.object_id, o2.object_id ) Number
               FROM     sys.objects o
                        CROSS JOIN sys.objects o2
             ),
        Maxx
          AS ( SELECT   MIN(starts) MaxStart ,
                        MAX(ends) MaxEnd
               FROM     @table
             ),
        DDays
          AS ( SELECT   MIN(starts) DDay
               FROM     @table
               UNION ALL
               SELECT   DDay + 1
               FROM     DDays
               WHERE    dday + 1 <= ( SELECT    MaxEnd
                                      FROM      Maxx
                                    )
             )
    SELECT  DDay ,
            COUNT(*) Transactions
    FROM    @Table T
            JOIN DDays D ON D.DDay >= T.starts
                            AND D.DDay <= T.ends
    GROUP BY DDay
    HAVING COUNT(*)>1
    ORDER BY COUNT(*) DESC
OPTION  ( MAXRECURSION 0 )

You can modify last statement to get needed information (transactions in maximum loading period etc)

Bulbul answered 29/4, 2011 at 7:4 Comment(1)
I get the feeling this approach has the same issue as andr's in that the timestamps I'm talking about differ by milliseconds while your approach requires I concoct some range of time periods that cover every possible scenario. Or am I misunderstanding how this works?Zeal
O
0
/* prepare sample data (if needed) */
CREATE TABLE MyTable (ID int, CnxStartdatetime datetime, CnxEnddatetime datetime);
INSERT INTO MyTable (ID, CnxStartdatetime, CnxEnddatetime)
SELECT 11774, '2011-04-29 01:02:56.780', '2011-04-29 01:02:58.793' UNION ALL
SELECT 11777, '2011-04-29 01:02:56.780', '2011-04-29 01:02:58.843' UNION ALL
SELECT 11778, '2011-04-29 01:02:56.780', '2011-04-29 01:02:58.950' UNION ALL
SELECT 11775, '2011-04-29 01:02:56.793', '2011-04-29 01:02:58.843' UNION ALL
SELECT 11776, '2011-04-29 01:02:56.793', '2011-04-29 01:02:58.890' UNION ALL
SELECT 11780, '2011-04-29 01:02:58.310', '2011-04-29 01:03:02.687' UNION ALL
SELECT 11779, '2011-04-29 01:02:58.327', '2011-04-29 01:03:02.543' UNION ALL
SELECT 11787, '2011-04-29 01:02:58.530', '2011-04-29 01:03:08.827' UNION ALL
SELECT 11781, '2011-04-29 01:02:59.030', '2011-04-29 01:03:05.187' UNION ALL
SELECT 11782, '2011-04-29 01:02:59.247', '2011-04-29 01:03:05.467' UNION ALL
SELECT 11784, '2011-04-29 01:02:59.293', '2011-04-29 01:03:05.810' UNION ALL
SELECT 11791, '2011-04-29 01:03:00.107', '2011-04-29 01:03:13.623' UNION ALL
SELECT 11786, '2011-04-29 01:03:00.843', '2011-04-29 01:03:08.983' UNION ALL
SELECT 11783, '2011-04-29 01:03:02.560', '2011-04-29 01:03:05.793' UNION ALL
SELECT 11785, '2011-04-29 01:03:02.717', '2011-04-29 01:03:07.357' UNION ALL
SELECT 11790, '2011-04-29 01:03:05.200', '2011-04-29 01:03:14.153' UNION ALL
SELECT 11804, '2011-04-29 01:03:05.687', '2011-04-29 01:03:25.577' UNION ALL
SELECT 11811, '2011-04-29 01:03:07.093', '2011-04-29 01:03:35.153' UNION ALL
SELECT 11799, '2011-04-29 01:03:07.123', '2011-04-29 01:03:24.437' UNION ALL
SELECT 11789, '2011-04-29 01:03:08.793', '2011-04-29 01:03:13.577';
/* start the job: */
WITH columnified AS (
  /* transform every row of (ID, CnxStartdatetime, CnxEnddatetime)
     into two rows as follows:
     (ID, CnxStartdatetime, CountChange = 1)
     (ID, CnxEnddatetime, CountChange = -1)
  */
  SELECT
    t.ID,
    Time = CASE x.CountChange WHEN 1 THEN CnxStartdatetime ELSE CnxEnddatetime END,
    x.CountChange
  FROM dbo.MyTable t
    CROSS JOIN (SELECT 1 AS CountChange UNION ALL SELECT -1) x
),
groupedandranked AS (
  /* group and rank the timestamps */
  SELECT
    Time,
    CountChange = SUM(CountChange),
    TimeRN = ROW_NUMBER() OVER (ORDER BY Time)
  FROM columnified
  GROUP BY time
),
counted AS (
  /* get the running counts by summing CountChange */
  SELECT
    Time,
    TimeRN,
    RunningCount = CountChange
  FROM groupedandranked
  WHERE TimeRN = 1
  UNION ALL
  SELECT
    t.Time,
    t.TimeRN,
    RunningCount = t.CountChange + c.RunningCount
  FROM groupedandranked t
    INNER JOIN counted c ON t.TimeRN = c.TimeRN + 1
),
countsranked AS (
  /* rank the running counts */
  SELECT
    *,
    CountRN = DENSE_RANK() OVER (ORDER BY RunningCount DESC)
  FROM counted
)
/* get the top ranked rows and their corresponding
   subsequent rows (for the ending timestamps) */
SELECT
  MaxCount = s.RunningCount,
  MaxCountStart = s.Time,
  MaxCountEnd = e.Time
FROM countsranked s
  LEFT JOIN countsranked e ON e.TimeRN = s.TimeRN + 1
WHERE s.CountRN = 1;
/* remove the sample data (unless it's your table) */
DROP TABLE MyTable
Outworn answered 30/4, 2011 at 14:31 Comment(0)
D
0

I know cursors are frowned upon, but so are cross joins. This returns 8 for the sample data provided.

-- assuming times table with columns s and e
declare @s datetime, @e datetime;
declare @t table(d datetime);
declare c cursor for select s,e from times order by s;
open c
while(1=1) begin
  fetch next from c into @s,@e
  if @@FETCH_STATUS<>0 break;
  update top(1) @t set d=@e where d<=@s;
  if @@ROWCOUNT=0 insert @t(d) values(@e);
end
close c
deallocate c

select COUNT(*) as MaxConcurrentTimeSpans from @t
Deming answered 11/10, 2013 at 23:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.