Fastest way to check if a character is a digit?

I am having issues with SQL Server's ISNUMERIC function, which returns true for ','.

I am parsing a postal code and trying to see whether the second character (which is supposed to be a digit) is a 0 or not, and do something different in each case. The issue is that I can't safely cast the character just by checking ISNUMERIC first. Here is the code for my scalar-valued function, which returns the digit in the second character position, or -1 if it is not a digit.

DECLARE @firstDigit int

IF ISNUMERIC(SUBSTRING(@postal,2,1)) = 1
   SET @firstDigit = CAST(SUBSTRING(@postal,2,1) AS int)
ELSE
   SET @firstDigit = -1

RETURN @firstDigit

This fails when the postal code is not quite valid. I am just trying to find out how to check whether the second character of the nvarchar @postal is a digit from 0-9. I have seen different types of solutions, such as using LIKE '[0-9]' or PATINDEX.
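To make the difference concrete, here is a rough sketch that runs the three checks side by side on a few single characters. The sample characters and column aliases are made up for illustration, and the VALUES table constructor assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample characters; only ',' comes from the actual problem.
SELECT c,
       IsNumericResult = ISNUMERIC(c),
       LikeDigit       = CASE WHEN c LIKE '[0-9]' THEN 1 ELSE 0 END,
       PatIndexDigit   = CASE WHEN PATINDEX('%[0-9]%', c) > 0 THEN 1 ELSE 0 END
FROM (VALUES (N'5'), (N','), (N'.'), (N'$'), (N'A')) AS t(c);
```

ISNUMERIC accepts characters that can appear in some numeric or money format, which is why ',' slips through, while the LIKE and PATINDEX checks only accept the digit range.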

Is there a better/easier way to do this, and if not which method will be the fastest?

EDIT: Code added as per Aaron Bertrand's suggestion

ON z.postal = 
   CASE
      WHEN CONVERT(INT, CASE WHEN SUBSTRING(v.patientPostal,2,1) LIKE '[0-9]' 
          THEN SUBSTRING(v.patientPostal, 2,1) END) = 0 THEN v.patientPostal
      WHEN CONVERT(INT, CASE WHEN SUBSTRING(v.patientPostal,2,1) LIKE '[0-9]' 
          THEN SUBSTRING(v.patientPostal, 2,1) END) > 0 THEN LEFT(v.patientPostal,3)
   END
Amorous answered 22/9, 2011 at 17:57 Comment(5)
Could you post more of the code? I think the reason it's failing on a non-numeric second character may have more to do with what follows the IF clause you've shown. - Protege
That is pretty much all it is. I created a scalar-valued function to return the digit, or -1 if it's not one. I can update the question with the whole function if that helps. - Amorous
Thanks, I will look into that. So do you think I should just do the whole check in place and just use LIKE? I didn't realize a scalar UDF was a big hit; I have only been using SQL Server for a short time. - Amorous
What patterns are good and bad? Postcodes are different in each country... - Nickey
I am just looking for a digit there. The postal code patterns I am looking at only relate to Ontario, Canada, which look like A1B2C3. - Amorous

I'd be very surprised if you would ever be able to detect any difference between WHERE col LIKE '[0-9]' and any other methods you come up with. But I agree with Denis, put that away in a function so that you use the same check consistently throughout all your code (or at least, if you're avoiding UDFs because of large scans etc., put a marker in your code that will make it easy to change on a wide scale later).

That said, you are almost certainly going to see more of a performance hit from using a scalar UDF at all than from whichever method you use to parse inside the function. You really ought to compare the performance of the UDF against doing the check inline using CASE. For example:

SELECT Postal = CONVERT(INT, CASE WHEN SUBSTRING(postal,2,1) LIKE '[0-9]' 
       THEN SUBSTRING(postal, 2,1) END)
FROM ...

This will yield NULL if the character is not numeric.
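As a quick way to try this out, here is a minimal sketch against a throwaway table variable. The table variable @t, the SecondDigit alias, and the sample postal codes are hypothetical, and the multi-row INSERT assumes SQL Server 2008 or later.

```sql
-- Hypothetical sample data: two valid codes and two with a non-digit second character.
DECLARE @t TABLE (postal nvarchar(10));
INSERT @t (postal) VALUES (N'K0A1B2'), (N'M5V3L9'), (N'K,A1B2'), (N'KXA1B2');

SELECT postal,
       -- The CASE has no ELSE, so non-digits produce NULL and CONVERT never sees them.
       SecondDigit = CONVERT(INT, CASE WHEN SUBSTRING(postal,2,1) LIKE '[0-9]'
                                       THEN SUBSTRING(postal,2,1) END)
FROM @t;
-- K0A1B2 -> 0, M5V3L9 -> 5, K,A1B2 -> NULL, KXA1B2 -> NULL
```

If you want -1 instead of NULL, as in the original function, wrapping the expression in ISNULL(..., -1) should do it.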

If you are only dealing with checking local variables, it really is not going to matter what parsing method you use, and you are better off focusing your optimization efforts elsewhere.

EDIT: adding a suggestion for the JOIN clause shown in the question. This will potentially lead to fewer constant scans and is also a lot more readable (far fewer SUBSTRING calls, etc.):

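;WITH v AS 
(
    -- Compute the second character once so it is not repeated in the join
    SELECT /* other columns, */ patientPostal, 
      ss = SUBSTRING(patientPostal,2,1)
      FROM [whatever table is aliased v in current query]
)
SELECT /* column list */
FROM [whatever table is aliased z in current query]
INNER JOIN v ON z.postal = CASE 
    WHEN v.ss = '0' THEN v.patientPostal                  -- 0: match on the full postal code
    WHEN v.ss LIKE '[1-9]' THEN LEFT(v.patientPostal, 3)  -- 1-9: match on the first three characters
END;  -- anything else yields NULL, so the row does not join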
Oral answered 22/9, 2011 at 18:3 Comment(5)
Thanks for the info, I didn't realize scalar UDFs were a big hit. I will edit in the new code so it makes more sense. I was trying to move the check out into a function since the query was quite busy already and the check was inside a CASE statement itself. But if the function is a bigger hit, I am okay with having a little extra code inline. - Amorous
It often is, but not always. It depends on when the UDF gets hit; for example, if your query returns only one row, it may only get called once, but as with a lot of things, it depends. - Oral
So I have edited in what my join statement's ON clause looks like. Is it fastest to leave it as such, or will I benefit at all from moving the check into its own function? - Amorous
Thanks for all of the info. I just tried it both ways: doing it inline took 9 seconds, while doing it the old way took 1:11. - Amorous
We call that #SQLWinning. :-) - Oral

The best way to do it is this:

IF SUBSTRING(@postal,2,1) LIKE '[0-9]'
   SET @firstDigit = CAST(SUBSTRING(@postal,2,1) AS int)
Inbeing answered 22/9, 2011 at 18:3 Comment(2)
Just to play devil's advocate, why is that "best"? And relative to what? - Oral
"Best" might be a bit strong. Given the requirement, using LIKE '[0-9]' without adding a UDF for repeatability gets him up and running now. He may only have access to modify that script; adding a UDF might be out of scope or require additional testing that there isn't time for. So your answer is much better under optimal circumstances; under non-optimal circumstances, "making this work" no matter how it happens is the likely result. - Inbeing

Take a look at IsNumeric, IsInt, IsNumber; it has checks for those three types.

Quathlamba answered 22/9, 2011 at 18:0 Comment(0)
