How to create a unique index on a NULL column?
Asked Answered
E

5

116

I am using SQL Server 2005. I want to constrain the values in a column to be unique, while allowing NULLS.

My current solution involves a unique index on a view like so:

CREATE VIEW vw_unq WITH SCHEMABINDING AS
    SELECT Column1
      FROM MyTable
     WHERE Column1 IS NOT NULL

CREATE UNIQUE CLUSTERED INDEX unq_idx ON vw_unq (Column1)

Any better ideas?

Ezequiel answered 10/10, 2008 at 14:3 Comment(2)
no chance of using sql 2008? you can create a filtered index using 'where'Coimbra
You didn't mean unique, allowing NULLs, you seem to have meant unique, but including multiple NULLs. Otherwise, NULL is indexed like any other value and the uniqueness constraint works as expected - just not according to SQL standards, as @pst mentioned in a comment below.Rennin
M
25

Pretty sure you can't do that, as it violates the purpose of uniques.

However, this person seems to have a decent work around: http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html

Macaroon answered 10/10, 2008 at 14:17 Comment(7)
It seems the content of the link you provided was actually (partially) copied without attribution from here: decipherinfosys.wordpress.com/2007/11/30/…Fourteen
I disagree that it "violates the purpose of uniques" -- NULL is a special value in SQL (similar in many ways to NaN) and needs to be treated accordingly. It's actually a failure in in SQL Server to honor various SQL specifications: here is a link for a request for the "correct implementation" for what it is worth: connect.microsoft.com/SQLServer/feedback/details/299229/….Killian
for reference in 2008 on you can do CREATE UNIQUE INDEX foo ON dbo.bar(key) WHERE key IS NOT NULL;Awl
I disagree as well with "violates the purpose of uniques", NULL does not equal to NULL, so you should be able create unique index on nullable column and insert multiple nulls.Poorhouse
Null not equally null is pedantly. null == null -> IS NULL AND IS NULL, there's no reason for this to not work on a constraint about uniquenessTrait
SQL is not the same as programming languages wherenull is a deliberate value, and null==null. In SQL null means no value. Note that (a) even in Microsoft, constraints (such as foreign key and check constraints) are not applied to null, (b) even in Microsoft WHERE null=null is false and (c) most, if not all, other DBMSs don’t apply the UNIQUE constraint to null, MSSQL is really the odd one out.Smearcase
Please make answers self-contained so it remains valid and useful. At least summarize or mention what the linked content says, what you link to. The linked page may and eventually will become unavailable. stackoverflow.com/help/how-to-answerKrick
L
119

Using SQL Server 2008, you can create a filtered index.

CREATE UNIQUE INDEX AK_MyTable_Column1 ON MyTable (Column1) WHERE Column1 IS NOT NULL

Another option is a trigger to check uniqueness, but this could affect performance.

Linalool answered 7/7, 2010 at 2:50 Comment(5)
create unique index UIX on MyTable (Column1) where Column1 is not nullXebec
Note: currently SQL Server Management Studio doesn't seem to know how to create such indexes so if you later modify the table it'll get confused and try to drop it so remember to recreate itCoimbra
It seems that Microsoft has updated SSMS to support this. I have SSMS 10.50.1617 and in the Index Properties dialog you can select the Filter page to edit the filter. e.g. "([Column1] IS NOT NULL)"Linalool
Allowing multiple nulls in an index and filtering nulls from an index are separate things. Filtering an index actually excludes records from the index, whereas the other solutions transform the null into a useful unique value. Be aware of the difference.Rennin
"whereas the other solutions transform the null into a useful unique value. Be aware of the difference" - yes they're excluded, but what if the column holds some ints. And the pk is some other ints. If you're searching for where your nullable column = some int then the nullbuster technique is actively dangerous - there is a non-zero chance that there will be an int value in the nullable column that is also present in the pk column, potentially allowing you to find the wrong row entirely! Excluding rows you don't want to find is safer.Blindworm
C
76

The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:

CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X  int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)
Carlettacarley answered 10/10, 2008 at 14:56 Comment(5)
This looks like a cool trick. Oddly searching for nullbuster doesn't bring up too much stuff. I'm wondering if this will be useful for speeding up searches too - rather than a computed column of just 1 and 0 for null or not, if using the PK gives the index something more to work with? Going to test this weekend on a big table and see.Kowtow
@DavidStorfer, you can't do that because you could have a collision between the IDs of the two different tables.Picaroon
Improvement: ISNULL(X, CONVERT(VARCHAR(10),pk))Conferee
@Faiz: Improvement is in the eye of the beholder. I prefer the look of the original.Carlettacarley
@NunoG, this should be the accepted answer since it provide a good solution compliant with your requirements, instead of just linking an external site which may disappear.Marlo
M
25

Pretty sure you can't do that, as it violates the purpose of uniques.

However, this person seems to have a decent work around: http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html

Macaroon answered 10/10, 2008 at 14:17 Comment(7)
It seems the content of the link you provided was actually (partially) copied without attribution from here: decipherinfosys.wordpress.com/2007/11/30/…Fourteen
I disagree that it "violates the purpose of uniques" -- NULL is a special value in SQL (similar in many ways to NaN) and needs to be treated accordingly. It's actually a failure in in SQL Server to honor various SQL specifications: here is a link for a request for the "correct implementation" for what it is worth: connect.microsoft.com/SQLServer/feedback/details/299229/….Killian
for reference in 2008 on you can do CREATE UNIQUE INDEX foo ON dbo.bar(key) WHERE key IS NOT NULL;Awl
I disagree as well with "violates the purpose of uniques", NULL does not equal to NULL, so you should be able create unique index on nullable column and insert multiple nulls.Poorhouse
Null not equally null is pedantly. null == null -> IS NULL AND IS NULL, there's no reason for this to not work on a constraint about uniquenessTrait
SQL is not the same as programming languages wherenull is a deliberate value, and null==null. In SQL null means no value. Note that (a) even in Microsoft, constraints (such as foreign key and check constraints) are not applied to null, (b) even in Microsoft WHERE null=null is false and (c) most, if not all, other DBMSs don’t apply the UNIQUE constraint to null, MSSQL is really the odd one out.Smearcase
Please make answers self-contained so it remains valid and useful. At least summarize or mention what the linked content says, what you link to. The linked page may and eventually will become unavailable. stackoverflow.com/help/how-to-answerKrick
C
1

It is possible to use filter predicates to specify which rows to include in the index.

From the documentation:

WHERE <filter_predicate> Creates a filtered index by specifying which rows to include in the index. The filtered index must be a nonclustered index on a table. Creates filtered statistics for the data rows in the filtered index.

Example:

CREATE TABLE Table1 (
  NullableCol int NULL
)

CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;
Conventionalize answered 22/2, 2021 at 17:43 Comment(2)
Filtered indexes were introduced in SQL Server 2008. OP states he is using 2005 (The question is 12.5 years old, hence the outdated version number).Kissie
@Kissie Thank you for the explanation.Conventionalize
P
-3

Strictly speaking, a unique nullable column (or set of columns) can be NULL (or a record of NULLs) only once, since having the same value (and this includes NULL) more than once obviously violates the unique constraint.

However, that doesn't mean the concept of "unique nullable columns" is valid; to actually implement it in any relational database we just have to bear in mind that this kind of databases are meant to be normalized to properly work, and normalization usually involves the addition of several (non-entity) extra tables to establish relationships between the entities.

Let's work a basic example considering only one "unique nullable column", it's easy to expand it to more such columns.

Suppose we the information represented by a table like this:

create table the_entity_incorrect
(
  id integer,
  uniqnull integer null, /* we want this to be "unique and nullable" */
  primary key (id)
);

We can do it by putting uniqnull apart and adding a second table to establish a relationship between uniqnull values and the_entity (rather than having uniqnull "inside" the_entity):

create table the_entity
(
  id integer,
  primary key(id)
);

create table the_relation
(
  the_entity_id integer not null,
  uniqnull integer not null,

  unique(the_entity_id),
  unique(uniqnull),
  /* primary key can be both or either of the_entity_id or uniqnull */
  primary key (the_entity_id, uniqnull), 
  foreign key (the_entity_id) references the_entity(id)
);

To associate a value of uniqnull to a row in the_entity we need to also add a row in the_relation.

For rows in the_entity were no uniqnull values are associated (i.e. for the ones we would put NULL in the_entity_incorrect) we simply do not add a row in the_relation.

Note that values for uniqnull will be unique for all the_relation, and also notice that for each value in the_entity there can be at most one value in the_relation, since the primary and foreign keys on it enforce this.

Then, if a value of 5 for uniqnull is to be associated with an the_entity id of 3, we need to:

start transaction;
insert into the_entity (id) values (3); 
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;

And, if an id value of 10 for the_entity has no uniqnull counterpart, we only do:

start transaction;
insert into the_entity (id) values (10); 
commit;

To denormalize this information and obtain the data a table like the_entity_incorrect would hold, we need to:

select
  id, uniqnull
from
  the_entity left outer join the_relation
on
  the_entity.id = the_relation.the_entity_id
;

The "left outer join" operator ensures all rows from the_entity will appear in the result, putting NULL in the uniqnull column when no matching columns are present in the_relation.

Remember, any effort spent for some days (or weeks or months) in designing a well normalized database (and the corresponding denormalizing views and procedures) will save you years (or decades) of pain and wasted resources.

Plebeian answered 24/2, 2015 at 5:33 Comment(1)
As already stated on accepted answer's comment with fifty upvotes, it should be supported by MS Sql Server to have multiple null in a columns indexed as unique. It is a failure to implement SQL standards not to allow so. Null is not a value, null is not equal to null, that is a basic SQL rule since years. So your first sentence is wrong and most readers will not bother reading on.Marlo

© 2022 - 2024 — McMap. All rights reserved.