+= in SELECT clause; constant vs column [duplicate]
Asked Answered
P

3

9

I observed some strange behavior in MS SQL Server 2017.

  • += in select acts as aggregator ('concatenate values from all rows') when on the right is a constant.
  • += in select acts as 'just set the value' when on the right is a column name. (also, this turns of aggregating behavior for other columns)

So my questions are:

  1. Why @c1 result contains value only last the row, even when += is used?
  2. Why is @c2 affected, by the change +=->= for @c1?

Version 1:

BEGIN
    DECLARE
        @c1 NVARCHAR(MAX) = N'',
        @c2 NVARCHAR(MAX) = N'';

    SELECT
        @c1 = constraint_name, -- version-1
        @c2 += '+'
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
    ORDER BY 1 DESC
    ;

    PRINT '@c1=' + @c1;
    PRINT '@c2=' + @c2;
END
;

Version 1 result:

@c1 = fk_abcde
@c2 = ++++++++++++++++++++++++++++++++++++++++++
(`@c2` result is aggregation of many rows; one plus for each row)

Version 2:

BEGIN
    DECLARE
        @c1 NVARCHAR(MAX) = N'',
        @c2 NVARCHAR(MAX) = N'';

    SELECT
        @c1 += constraint_name, -- version-2
        @c2 += '+'
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
    ORDER BY 1 DESC
    ;

    PRINT '@c1=' + @c1;
    PRINT '@c2=' + @c2;
END
;

Version 2 Result:

@c1 = fk_abcde
@c2 = +
(`@c2` is just value assigned from last processed row)

This feels very strange - kind of like a bug. I fail to find any docs about this. The doc on '+= string' don't mention += usage in select query at all.

(at the moment my goal is to understand the behavior fully, so I would not accidentally step on it. Any hints to right documentation/keywords to search for would be helpful)

Penninite answered 27/2, 2019 at 6:39 Comment(8)
It's in the wrong place in the documentation, so not surprising that you haven't found it: "Don't use a variable in a SELECT statement to concatenate values (that is, to compute aggregate values). Unexpected query results may occur. Because, all expressions in the SELECT list (including assignments) aren't necessarily run exactly once for each output row"Newbold
@Newbold To add to your comment, I was going to suggest that what we are seeing in the second case is just the variable concatenation happening once, the other times being ignored or discarded.Chape
And to add some more: This approach is called quirky update and is - in most cases - something to avoid...Pulchritudinous
@Newbold I believe your comment and the link to the docs should be the answer here.Whittaker
@Penninite what are you trying to do? If you want to aggregate strings use STRING_AGG in SQL Server 2017. There are other ways to do the same in previous versions, all of them described in Aaron Bertrand's articles. The fastest and most scalable ways are to use a SQLCLR UDF or FOR XMLPippin
@Penninite what you posted is a hack often used in MySQL, that doesn't really work there either. It depends on the server using a very specific way to execute the query and produce results which changes from version to version and can't ever include parallel execution.Pippin
@Newbold Please copy your comment (link + quote) as an answer.Penninite
@PanagiotisKanavos Initially I wanted to aggregate/concatenate strings. For that I will use STRING_AGG. Thanks.Penninite
N
2

It's in the wrong place in the documentation, so not surprising that you haven't found it:

Don't use a variable in a SELECT statement to concatenate values (that is, to compute aggregate values). Unexpected query results may occur. Because, all expressions in the SELECT list (including assignments) aren't necessarily run exactly once for each output row

It's better to look for different ways of doing string concatenation. If your version supports it, choose to use STRING_AGG. For earlier versions, Aaron Bertrand has provided a good set of options (hat tip to Panagiotis Kanavos for providing the link)

Newbold answered 28/2, 2019 at 6:21 Comment(0)
R
-1

Well, what I found is quite interesting: without ORDER BY query runs consistently and expectedly.

But when adding ORDER BY caluse, I get same results as you.

What I suggest is to use CTE with ordering - unfortunately ordering isn't allowed in CTE or subqueries, but workaround is to use TOP keyword, which allows us to order in such situations.

See script below:

;with cte as (
    select top 100 percent table_schema
    from information_schema.columns
    order by 1
)
-- This is more reliable
select @c1 = table_schema, @c2 += '+' from cte
Reviewer answered 27/2, 2019 at 8:46 Comment(1)
There's a reason this is called a quirky update and discouraged against. It works by accident and can always break even among minor patch versions, or even if eg a parallel operation changes how the results are processed.Pippin
S
-2

You should use Order by constraint_name instead of Order by 1, because you're assigning values into variable. Its not select statement.

Screamer answered 27/2, 2019 at 7:21 Comment(1)
Doesn't matter how you tweak it. As I've linked to in the comments, using variables for aggregation in a SELECT is documented to be inherently unreliable.Newbold

© 2022 - 2025 — McMap. All rights reserved.