SQL Server CTE loop; insert all record together
Asked Answered
R

2

6

I have this situation:

drop table #t1;
drop table #t2

select * 
into #t1
from
    (select 'va1'c1,'vb1'c2,'vc1'c3 union all
     select 'va2'c1,'vb2'c2,'vc2'c3 union all
     select 'va3'c1,'vb3'c2,'vc3'c3 union all
     select 'va1'c1,'vb1'c2,'vc1'c3 union all
     select 'va2'c1,'vb2'c2,'vc2'c3 union all
     select 'va3'c1,'vb3'c2,'vc3'c3 union all
     select 'va1'c1,'vb1'c2,'vc1'c3 union all
     select 'va2'c1,'vb2'c2,'vc2'c3 union all
     select 'va3'c1,'vb3'c2,'vc3'c3 union all
     select 'va1'c1,'vb1'c2,'vc1'c3 union all
     select 'va2'c1,'vb2'c2,'vc2'c3 union all
     select 'va3'c1,'vb3'c2,'vc3'c3 union all
     select 'va4'c1,'vb4'c2,'vc4'c3) t

select *
into #t2
from #t1
where 0 = 1

;with tmp1 as
(
    select 
        t1.*,
        ROW_NUMBER() over (partition by t1.c1 order by (select null)) r
    from 
        #t1 t1
    left join 
        #t2 t2 on t1.c1 = t2.c1
    where 
        t2.c1 is null   
), tmp2 as
(
    select 
        0 n,*
    from 
        tmp1
    union all
    select 
        n+1 n, t1.c1, t1.c2, t1.c3, t1.r
    from 
        tmp2 t1
    join 
        tmp1 t2 on t1.c1 = t2.c1
                and t2.r = t1.r + 1
    where 
        n < 10
)
--insert #t2
select c1, c2, c3  --,r
from tmp2

When I run this, it selects everything just fine (103 records).

The problem is when I this code to insert into #t2 (13 records!!!)

I think SQL runs step by step and insert records during running and than my condition in tmp1 is over...

How to resolve it?

My goal is to check if data exists, than loop and insert results...but SQL stops after 1st cycle...

Ruvolo answered 3/5, 2018 at 14:7 Comment(14)
i d like to insert 103 records like on select...not only 13...Ruvolo
I am sure this is clear to you but for those of us not close to what you are doing this doesn't make a lot of sense.Geocentric
How do you get 103 rows from 13?Motherland
no, so...if you run cte query u ll have 103 records...if u run same cte query with insert u ll have 13 records..Ruvolo
If you run the provided query, you'll see the final select returns 103 records, if the insert is uncommented, 13 records are inserted. I think that is the question.Spiceberry
yes thank u HoneyBadger! it s what i need...i would like to don t use temp table to keep results and than insert it in final table..Ruvolo
You have some really strange stuff going on here. You create an empty table and then do a left join to it. What is the point of that? If you don't use the select into to create an empty table and remove that join in the first cte you could then use select into at the end at it works just fine.Geocentric
yes, i know...this is only a example...in real case it s more complex...Ruvolo
It looks like SQL ignoring recursive CTE if using it with insert.Candi
i think sql has insert step by step results...on first cycle insert, than run the second but left join don t get resuts, than nothing else to insert..Ruvolo
I would suggest reporting this as a bug to Microsoft. In the meantime, @lad2025's workaround gets there as would doing a SELECT INTO a third temporary table and then using that to insert into your #t2Euell
are u sure it s a bug?? in case, how can i report it?Ruvolo
Here is the really weird part. The left join to #t2 in tmp1 is pointless. If you remove that left join and the where clause from tmp1 everything works as advertised. But with that left join it only inserts 13 rows even though the select statement returns 103. Quite strange indeed.Geocentric
yes exactly...but i need the left join to check which rows i need process...i think sql run first loop step, then insert result and on second step cte run again tmp1 and no rows to process..Ruvolo
V
6

You could use MERGE:

select * into #t1
from(
select 'va1'c1,'vb1'c2,'vc1'c3 union all
select 'va2'c1,'vb2'c2,'vc2'c3 union all
select 'va3'c1,'vb3'c2,'vc3'c3 union all
select 'va1'c1,'vb1'c2,'vc1'c3 union all
select 'va2'c1,'vb2'c2,'vc2'c3 union all
select 'va3'c1,'vb3'c2,'vc3'c3 union all
select 'va1'c1,'vb1'c2,'vc1'c3 union all
select 'va2'c1,'vb2'c2,'vc2'c3 union all
select 'va3'c1,'vb3'c2,'vc3'c3 union all
select 'va1'c1,'vb1'c2,'vc1'c3 union all
select 'va2'c1,'vb2'c2,'vc2'c3 union all
select 'va3'c1,'vb3'c2,'vc3'c3 union all
select 'va4'c1,'vb4'c2,'vc4'c3 
)t;

select * into #t2 from #t1 where 0=1;

;with tmp1 as(
    select t1.*, ROW_NUMBER()over(partition by t1.c1 order by(select null))r
    from #t1 t1
    left join #t2 t2 
      on t1.c1=t2.c1
    where t2.c1 is null 
),tmp2 as (
    select 0 n,*
    from tmp1
    union all
    select n+1 n,t1.c1,t1.c2,t1.c3,t1.r
    from tmp2 t1
    join tmp1 t2
      on t1.c1=t2.c1
     and t2.r=t1.r+1
    where n<10
)
MERGE #t2
USING tmp2
  ON #t2.c1 = tmp2.c1
WHEN NOT MATCHED THEN
  INSERT VALUES (tmp2.c1, tmp2.c2, tmp2.c3);

SELECT @@ROWCOUNT;
-- 103

DBFiddle Demo


EDIT:

Kudos to Bartosz Ratajczyk for examining this case:

It turns out it is related to lazy/eager table/index spooling. There are at least two more ways to force SQL Server to generate different execution plan:

a) By using TOP (100) PERCENT

DECLARE @n INT = 100;

;with tmp1 as (
    select t1.*,
           ROW_NUMBER()over(partition by t1.c1 order by(select null))r
    from #t1 t1
    left join #t2 t2 
      on t1.c1=t2.c1
    where t2.c1 is null 
),tmp2 as
(
    select 0 n,*
    from tmp1
    union all
    select n+1 n,t1.c1,t1.c2,t1.c3,t1.r
    from tmp2 t1
    join tmp1 t2
      on t1.c1=t2.c1
     and t2.r=t1.r+1
    where n<10
)
insert #t2
select TOP (@n) PERCENT c1, c2, c3  --,r
from tmp2

SELECT @@ROWCOUNT;

b) By using ORDER BY .. OFFSET 0 ROWS:

;with tmp1 as(
    select t1.*,
           ROW_NUMBER()over(partition by t1.c1 order by(select null))r
    from #t1 t1
    left join #t2 t2 
      on t1.c1=t2.c1
    where t2.c1 is null 
),tmp2 as
(
    select 0 n,*
    from tmp1
    union all
    select n+1 n,t1.c1,t1.c2,t1.c3,t1.r
    from tmp2 t1
    join tmp1 t2
      on t1.c1=t2.c1
     and t2.r=t1.r+1
    where n<10
)
insert #t2
select c1, c2, c3  --,r
from tmp2
ORDER BY 1 OFFSET 0 ROWS;

SELECT @@ROWCOUNT;

db<>fiddle demo2


Addendum: How does the recursive CTE work? by Bartosz Ratajczyk

Vesiculate answered 3/5, 2018 at 14:24 Comment(0)
P
2

You hit a peculiarity in MS SQL server's CTE implementation. It is not handled that way in all backends. You have to select first into a temporary cursor and then insert from it. ie:

SELECT *
INTO #t1
FROM(
    SELECT 'va1' c1, 'vb1' c2, 'vc1' c3
    UNION ALL
    SELECT 'va2' c1, 'vb2' c2, 'vc2' c3
    UNION ALL
    SELECT 'va3' c1, 'vb3' c2, 'vc3' c3
    UNION ALL
    SELECT 'va1' c1, 'vb1' c2, 'vc1' c3
    UNION ALL
    SELECT 'va2' c1, 'vb2' c2, 'vc2' c3
    UNION ALL
    SELECT 'va3' c1, 'vb3' c2, 'vc3' c3
    UNION ALL
    SELECT 'va1' c1, 'vb1' c2, 'vc1' c3
    UNION ALL
    SELECT 'va2' c1, 'vb2' c2, 'vc2' c3
    UNION ALL
    SELECT 'va3' c1, 'vb3' c2, 'vc3' c3
    UNION ALL
    SELECT 'va1' c1, 'vb1' c2, 'vc1' c3
    UNION ALL
    SELECT 'va2' c1, 'vb2' c2, 'vc2' c3
    UNION ALL
    SELECT 'va3' c1, 'vb3' c2, 'vc3' c3
    UNION ALL
    SELECT 'va4' c1, 'vb4' c2, 'vc4' c3
    )t;

SELECT * INTO #t2 FROM #t1 WHERE 0=1;

DECLARE @tmp TABLE(c1 VARCHAR(10), c2 VARCHAR(10), c3 VARCHAR(10));

WITH
    tmp1 AS (
                SELECT t1.*, ROW_NUMBER() OVER (PARTITION BY t1.c1 ORDER BY(SELECT NULL)) r
                FROM #t1 t1
                     LEFT JOIN #t2 t2 ON t1.c1=t2.c1
                WHERE t2.c1 IS NULL
            ),
    tmp2 AS (
                SELECT 0 n, * FROM tmp1
                UNION ALL
                SELECT n+1 n, t1.c1, t1.c2, t1.c3, t1.r
                FROM tmp2 t1
                     JOIN tmp1 t2 ON t1.c1=t2.c1
                                 AND t2.r=t1.r+1
                WHERE n<10
            )
INSERT @tmp(c1, c2, c3)
SELECT c1, c2, c3 --,r
FROM tmp2;

INSERT #t2 SELECT * FROM @tmp;

SELECT * FROM #t2;

DROP TABLE #t1;
DROP TABLE #t2;
Pinole answered 3/5, 2018 at 14:22 Comment(7)
hi, yes i know about this solution, but i like to do it directly because i have lot of records in real case...Ruvolo
This makes no sense at all....but it works. Do you have any documentation on this? I have never seen this behavior before. Even more strange is you can insert into a table variable correctly but not a temp table. Something fishy is happening here.Geocentric
If you do SELECT tmp2.c1, tmp2.c2,tmp2.c3 INTO #t3 FROM tmp2, you also get the expected result (103 rows). I too would love to see some documentation on this.Spiceberry
@SeanLange probably it is considered a bug and not found in documents. I just happen to know from experience.Pinole
I realize it is a bug. But obviously something you have encountered before. I am asking if you have any references on this as it is something I have never seen before and the behavior makes no sense from how it should be working.Geocentric
sorry, i don t have any refereces, i found it working on sql...do u know how report it to microsoft??Ruvolo
This is bizarre. If they are both table variables it also fails.Kirst

© 2022 - 2024 — McMap. All rights reserved.