Len() vs datalength() in SQL Server 2005
Asked Answered
G

6

35

Recently I faced a issue when using len() in a query to find out the length of a query, len() was not counting the trailing spaces in the value. But datalength() is counting the trailing spaces also.

Does this means that if I'm doing some operation which deals with the actual length of the value then I have to use dalalength() over len().

ex: If I need the value of a particular value to be is of 10 character length. i.e. If the value is 3 character length I've to append 7 spaces to it.

Gainey answered 20/7, 2009 at 4:20 Comment(0)
S
23

Yes that's exactly what you must do. If you want to just get number of characters excluding blanks you would use LEN() function, while in all other cases DATALENGTH().

Even LEN() documentation has an information that to get number of bytes to represent the extension you should use DATALENGTH()

Here are the links to MSDN docs:

LEN()

DATALENGTH()

Silence answered 20/7, 2009 at 4:26 Comment(6)
LEN(N'testing one two three, nothing but a test') returns 41 characters including whitespace (and DATALENGTH() on that returns 82)Epilepsy
@Epilepsy ... it doesn't count trailing whitespace though. LEN('HELLO'), LEN('HELLO ') and LEN(' HELLO') returns 5, 5 and 10 consecutively. This is documented ("").Epigastrium
Those examples were meant to have no leading/trailing whitespace, five spaces after HELLO and five spaces before HELLO. The joys of HTML collapsing whitespace :-)Epigastrium
@marc_s, It doesn't always count whitespaces, right? check this: sqlfiddle.com/#!18/56f78/236Jarvis
@DT: yes, as "Chris J" already commented - LEN() does not count trailing whitespace(s) - but it does count whitespace embedded within the stringEpilepsy
Another difference seems to be that DATALENGTH() is faster than LEN(), probably because there is no string processing in the first case.Etheridge
S
37

Be careful. DATALENGTH returns the number of bytes used, not the number of characters.

Seineetmarne answered 20/7, 2009 at 4:28 Comment(2)
+1 LEN on a NVARCHAR field returns the number of characters in the string, while DATALENGTH() returns twice that number (since NVARCHAR stores 2 bytes per character)Epilepsy
@Epilepsy You are spot on, as always :)Linage
S
23

Yes that's exactly what you must do. If you want to just get number of characters excluding blanks you would use LEN() function, while in all other cases DATALENGTH().

Even LEN() documentation has an information that to get number of bytes to represent the extension you should use DATALENGTH()

Here are the links to MSDN docs:

LEN()

DATALENGTH()

Silence answered 20/7, 2009 at 4:26 Comment(6)
LEN(N'testing one two three, nothing but a test') returns 41 characters including whitespace (and DATALENGTH() on that returns 82)Epilepsy
@Epilepsy ... it doesn't count trailing whitespace though. LEN('HELLO'), LEN('HELLO ') and LEN(' HELLO') returns 5, 5 and 10 consecutively. This is documented ("").Epigastrium
Those examples were meant to have no leading/trailing whitespace, five spaces after HELLO and five spaces before HELLO. The joys of HTML collapsing whitespace :-)Epigastrium
@marc_s, It doesn't always count whitespaces, right? check this: sqlfiddle.com/#!18/56f78/236Jarvis
@DT: yes, as "Chris J" already commented - LEN() does not count trailing whitespace(s) - but it does count whitespace embedded within the stringEpilepsy
Another difference seems to be that DATALENGTH() is faster than LEN(), probably because there is no string processing in the first case.Etheridge
B
12

len counts the number of characters used not the storage required, this will be even more evident when you use nvarchar instead of varchar

len does not count trailing spaces either

take a look at this

declare @v nchar(5)
select @v ='ABC  '


select len(@v),datalength(@v)

and the output for len is 3 while the output for datalength =10

Berber answered 20/7, 2009 at 14:20 Comment(0)
S
3

Just use Replace():

SELECT LEN(REPLACE(N'4 Trailing Spaces:    ', ' ', '_'))

This will replace trailing spaces with a character that LEN() will actually count.

The problem with DataLength() is you have to keep track of wether your string is Unicode (nChar, nVarChar) or ASCII (Char, VarChar) to know if you also need to divide the datalength of a Unicode string by 2.

Sled answered 18/3, 2013 at 21:34 Comment(0)
P
2

I was about to use the Len(Replace('blah blah ',' ','_') suggestion when it struck me it may be more efficient to use. Just posting in case someone stumbles upon this thread as I did.

len('blah blah ' + '.')-1

Paresh answered 26/7, 2017 at 4:18 Comment(0)
O
1

Unfortunately there is no perfect solution that I am aware of.

One of the proposed solutions, LEN(string + '.')-1 returns wrong results (-1) if the string is Unicode of size 4000 or non-Unicode and of size 8000. That is because the concatenation is ignored. You can overcome this if you want, by casting the string to a MAX-size string: LEN(CAST(string as nvarchar(max)) + '.')-1, but is it worth it?

As mentioned by others, DATALENGTH(string) returns the number of bytes used for storage. For Unicode strings, it may not be enough to divide the result by 2: Unicode surrogate characters can take more than 16 bits.

All in all, be mindful of the limitations of each approach and choose whatever you believe will cause you less issues.

Obbligato answered 11/10, 2018 at 6:26 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.