What are the pitfalls of using sql_variant?
Asked Answered
H

4

12

I've read and heard several times that sql_variant should be avoided. I think I have a great use case for it. I've used varchar(max) in the past to store different types in the same column, but it seems sensible to avoid the de/serialization overhead when there's a built-in type that does exactly what I want.

So, what exactly are the pitfalls of using sql_variant? Are they performance related, or easy-to-make programming mistakes, or something else? Incidentally, I'll be interacting with this column from client code and CLR functions, if that's something to consider.

Hexachord answered 20/10, 2011 at 16:33 Comment(2)
Why are you storing different types in the same column? Is this an EAV structure?Bradberry
No. I don't really want to get sidetracked with the validity of my use case, but I have a table of filters that can be applied to various columns. Thus the comparable values are of different types.Hexachord
I
6

Storing different types in the same column via SQL_VARIANT is almost the same thing as casting everything to Object in .NET. And sometimes there are valid reasons for using this type as it can certainly allow for a more generic programmatic structure.

However, as you were anticipating, there are some pitfalls to using SQL_VARIANT that you should be aware of, especially as one of them might be a deal-breaker:

  1. Just like casting everything to Object in .NET (and possibly requiring boxing / unboxing depending on the base type), there is a definite performance hit when using SQL_VARIANT. Depending on the use case, it might be acceptable to have reduced performance if the functionality really needs it and/or the usage is not very frequent (i.e. many times per second).

  2. Unlike casting everything to Object in .NET, the SQL_VARIANT datatype has limitations on what base datatypes it can contain. The following datatypes cannot be stored as SQL_VARIANT:

    • VARCHAR(MAX)
    • NVARCHAR(MAX)
    • VARBINARY(MAX)
    • XML
    • TIMESTAMP / ROWVERSION
    • TEXT (you shouldn't be using this type anyway as of SQL Server 2005)
    • NTEXT (you shouldn't be using this type anyway as of SQL Server 2005)
    • IMAGE (you shouldn't be using this type anyway as of SQL Server 2005)

    This limitation can easily prevent the possibility of using SQL_VARIANT if there is a requirement to store any of these datatypes. Please note that the issue here is the base datatype and not the size of the data, as the following test shows:

    DECLARE @tmp1 TABLE (col1 SQL_VARIANT NOT NULL);
    INSERT INTO @tmp1 (col1) VALUES (CONVERT(VARCHAR(MAX), 'g'));
    

    Returns:

    Msg 206, Level 16, State 2, Line 2
    Operand type clash: varchar(max) is incompatible with sql_variant
    

To be fair, one benefit to using SQL_VARIANT over casting everything to NVARCHAR is that SQL_VARIANT retains the underlying type info and enforces its usage so that you cannot easily misuse values in completely inappropriate contexts.

DECLARE @tmp2 TABLE (col1 SQL_VARIANT NOT NULL);
INSERT INTO @tmp2 (col1) VALUES (1);

SELECT CONVERT(DATETIME, col1) FROM @tmp2;

SELECT CONVERT(TIME, col1) FROM @tmp2;

Returns:

1900-01-02 00:00:00.000

Msg 529, Level 16, State 3, Line 6
Explicit conversion from data type int to time is not allowed.

Regarding not being able to use SQL_VARIANT as a PK: this is really a non-issue since the very nature of a generic datatype pretty much excludes it from being desirable in the first place for such a use.

Regarding not being able to use SQL_VARIANT with a LIKE operator: this is mostly a non-issue due to being able to convert it to an appropriate type that does work with LIKE, as in:

WHERE CONVERT(NVARCHAR(50), [sql_variant_field]) LIKE '%something%'

The above is certainly not the most efficient, but it is functional, and as mentioned above, efficiency was already ruled out as it was sacrificed in return for functionality when deciding to use the SQL_VARIANT datatype.

Interference answered 11/8, 2015 at 20:7 Comment(0)
B
2

I've seen both performance problems and code quality related problems:

Most of the time you access this field, you are going to have to check the type (using sql_variant_property). This makes your queries more complex, which can cause both of the problems you list.

You will also have to cast this field every time you use it, causing further performance penalty.

Further, sql_variant columns cant be part of a primary key, they don't work as part of a computed column, and they don't work with LIKE in a WHERE clause.

Bauman answered 31/1, 2015 at 21:3 Comment(0)
F
2

It also makes it easier for programming errors to occur. A DBA/Programmer looks at a column and it looks like a integer, so he puts an integer in it, but farther down the line a process wants that to be a string. I've seen this with poorly written imports into sql_variant columns.

Folacin answered 1/8, 2015 at 3:39 Comment(0)
T
1

The only obvious pitfall that comes to mind is in situations wherein you have values that you want to push into your sql_variant field that exceed its max length (8016 bytes, per this web page: http://msdn.microsoft.com/en-us/library/ms173829.aspx). If your values never approach that limit, then sql_variant can be a very good approach. Else, you could still use sql_variant, but provide a separate "isBlob" bit field that points to a separate table with your varbinary(max) values (for eg).

Tessin answered 20/10, 2011 at 17:50 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.