Using RegEx in SQL Server
Asked Answered
S

7

126

I'm looking how to replace/encode text using RegEx based on RegEx settings/params below:

RegEx.IgnoreCase = True     
RegEx.Global = True     
RegEx.Pattern = "[^a-z\d\s.]+"   

I have seen some examples on RegEx, but confused as to how to apply it the same way in SQL Server. Any suggestions would be helpful. Thank you.

Sulfathiazole answered 19/1, 2012 at 15:9 Comment(3)
Hi take a look at this article: codeproject.com/Articles/42764/…Emmaline
There's also a fine TSQL + Windows API solution at Robyn Page and Phil Factor's that relies on VBScript.RegExp class, which, I belieave, is shipped on every Windows version since Windows 2000.Wauters
If you absolutely positively need RegEx via TSQL, an option for SQL Server 2016 and above is to use R services.Liverpool
I
139

You do not need to interact with managed code, as you can use LIKE:

CREATE TABLE #Sample(Field varchar(50), Result varchar(50))
GO
INSERT INTO #Sample (Field, Result) VALUES ('ABC123 ', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123.', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123&', 'Match')
SELECT * FROM #Sample WHERE Field LIKE '%[^a-z0-9 .]%'
GO
DROP TABLE #Sample

As your expression ends with + you can go with '%[^a-z0-9 .][^a-z0-9 .]%'

EDIT:
To make it clear: SQL Server doesn't support regular expressions without managed code. Depending on the situation, the LIKE operator can be an option, but it lacks the flexibility that regular expressions provides.

Inseverable answered 31/1, 2012 at 8:22 Comment(4)
@MikeYoung, you're right. This answer incorrectly addresses the + quantifier as {1,2} when it should take it as {1, }. Surprisingly, this worked to the OP.Inseverable
This won't works in sql server since it doesn't support regex.Araxes
@VVN, LIKE is not regex (it's a more limited pattern-matching syntax), so the lack of regex support doesn't mean that this won't work.Bumgarner
@RubensFarias wouldn't it be nice to update the answer in light of comments from @mike-young?Droplight
B
13

You will have to build a CLR procedure that provides regex functionality, as this article illustrates.

Their example function uses VB.NET:

Imports System
Imports System.Data.Sql
Imports Microsoft.SqlServer.Server
Imports System.Data.SqlTypes
Imports System.Runtime.InteropServices
Imports System.Text.RegularExpressions
Imports System.Collections 'the IEnumerable interface is here  


Namespace SimpleTalk.Phil.Factor
    Public Class RegularExpressionFunctions
        'RegExIsMatch function
        <SqlFunction(IsDeterministic:=True, IsPrecise:=True)> _
        Public Shared Function RegExIsMatch( _
                                            ByVal pattern As SqlString, _
                                            ByVal input As SqlString, _
                                            ByVal Options As SqlInt32) As SqlBoolean
            If (input.IsNull OrElse pattern.IsNull) Then
                Return SqlBoolean.False
            End If
            Dim RegExOption As New System.Text.RegularExpressions.RegExOptions
            RegExOption = Options
            Return RegEx.IsMatch(input.Value, pattern.Value, RegExOption)
        End Function
    End Class      ' 
End Namespace

...and is installed in SQL Server using the following SQL (replacing '%'-delimted variables by their actual equivalents:

sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE

IF EXISTS ( SELECT   1
            FROM     sys.objects
            WHERE    object_id = OBJECT_ID(N'dbo.RegExIsMatch') ) 
   DROP FUNCTION dbo.RegExIsMatch
go

IF EXISTS ( SELECT   1
            FROM     sys.assemblies asms
            WHERE    asms.name = N'RegExFunction ' ) 
   DROP ASSEMBLY [RegExFunction]

CREATE ASSEMBLY RegExFunction 
           FROM '%FILE%'
GO

CREATE FUNCTION RegExIsMatch
   (
    @Pattern NVARCHAR(4000),
    @Input NVARCHAR(MAX),
    @Options int
   )
RETURNS BIT
AS EXTERNAL NAME 
   RegExFunction.[SimpleTalk.Phil.Factor.RegularExpressionFunctions].RegExIsMatch
GO

--a few tests
---Is this card a valid credit card?
SELECT dbo.RegExIsMatch ('^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\d{3})\d{11})$','4241825283987487',1)
--is there a number in this string
SELECT dbo.RegExIsMatch( '\d','there is 1 thing I hate',1)
--Verifies number Returns 1
DECLARE @pattern VARCHAR(255)
SELECT @pattern ='[a-zA-Z0-9]\d{2}[a-zA-Z0-9](-\d{3}){2}[A-Za-z0-9]'
SELECT  dbo.RegExIsMatch (@pattern, '1298-673-4192',1),
        dbo.RegExIsMatch (@pattern,'A08Z-931-468A',1),
        dbo.RegExIsMatch (@pattern,'[A90-123-129X',1),
        dbo.RegExIsMatch (@pattern,'12345-KKA-1230',1),
        dbo.RegExIsMatch (@pattern,'0919-2893-1256',1)
Barnaba answered 19/1, 2012 at 15:11 Comment(5)
This is in Classic ASP, does it support? I think CLR is only for .NET functions, right?Sulfathiazole
CLR procedures are installed into the SQL Server environment and can be invoked like any other stored procedure or user-defined function, so if Classic ASP can invoke a stored procedure or user-defined function, it can invoke a CLR procedure.Barnaba
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From ReviewMonaural
Thanks @FedericoklezCulloca. This was an old answer and I've updated it accordingly.Barnaba
@Barnaba thanks for that. I see it's old, but it popped up in a review queue :)Monaural
P
12

Slightly modified version of Julio's answer.

-- MS SQL using VBScript Regex
-- select dbo.RegexReplace('aa bb cc','($1) ($2) ($3)','([^\s]*)\s*([^\s]*)\s*([^\s]*)')
-- $$ dollar sign, $1 - $9 back references, $& whole match

CREATE FUNCTION [dbo].[RegexReplace]
(   -- these match exactly the parameters of RegExp
    @searchstring varchar(4000),
    @replacestring varchar(4000),
    @pattern varchar(4000)
)
RETURNS varchar(4000)
AS
BEGIN
    declare @objRegexExp int, 
        @objErrorObj int,
        @strErrorMessage varchar(255),
        @res int,
        @result varchar(4000)

    if( @searchstring is null or len(ltrim(rtrim(@searchstring))) = 0) return null
    set @result=''
    exec @res=sp_OACreate 'VBScript.RegExp', @objRegexExp out
    if( @res <> 0) return '..VBScript did not initialize'
    exec @res=sp_OASetProperty @objRegexExp, 'Pattern', @pattern
    if( @res <> 0) return '..Pattern property set failed'
    exec @res=sp_OASetProperty @objRegexExp, 'IgnoreCase', 0
    if( @res <> 0) return '..IgnoreCase option failed'
    exec @res=sp_OAMethod @objRegexExp, 'Replace', @result OUT,
         @searchstring, @replacestring
    if( @res <> 0) return '..Bad search string'
    exec @res=sp_OADestroy @objRegexExp
    return @result
END

You'll need Ole Automation Procedures turned on in SQL:

exec sp_configure 'show advanced options',1; 
go
reconfigure; 
go
sp_configure 'Ole Automation Procedures', 1; 
go
reconfigure; 
go
sp_configure 'show advanced options',0; 
go
reconfigure;
go
Paradies answered 19/7, 2016 at 15:21 Comment(1)
BTW, it's much faster to destroy and recreate the regex object than to cache and reuse it. We ran 10,000 comparisons with significantly higher numbers reusing the object.Paradies
I
12

Regular Expressions In SQL Server Databases Implementation Use

Regular Expression - Description
. Match any one character
* Match any character
+ Match at least one instance of the expression before
^ Start at beginning of line
$ Search at end of line
< Match only if word starts at this point
> Match only if word stops at this point
\n Match a line break
[] Match any character within the brackets
[^...] Matches any character not listed after the ^
[ABQ]% The string must begin with either the letters A, B, or Q and can be of any length
[AB][CD]% The string must have a length of two or more and which must begin with A or B and have C or D as the second character
[A-Z]% The string can be of any length and must begin with any letter from A to Z
[A-Z0-9]% The string can be of any length and must start with any letter from A to Z or numeral from 0 to 9
[^A-C]% The string can be of any length but cannot begin with the letters A to C
%[A-Z] The string can be of any length and must end with any of the letters from A to Z
%[%$#@]% The string can be of any length and must contain at least one of the special characters enclosed in the bracket

Icing answered 19/10, 2019 at 3:27 Comment(5)
I am confused by this post. Not all of the above are supported by SQL Server LIKE. But some of them are. Is this a mixture of LIKE expressions and standard regex regular expressions? e.g. < and > and ^ and $ for start and end of line.Picturesque
@Picturesque this answer is misleading. LIKE (or PATINDEX for that matter) only supports the wildcard matches listed in the docs, which are just %, [], [^], _.Gardiner
This is very misleading and unclear. Please clarify where these apply, and what restrictions there are.Zee
I have just explaining how Regular Expression use each expression.Icing
@RaviMakwana Yes, but your answer is incorrect. This is fine for languages that have regular expressions implemented, but TSQL does not.Abolition
H
6
SELECT * from SOME_TABLE where NAME like '%[^A-Z]%'

Or some other expression instead of A-Z

Haught answered 12/5, 2016 at 9:32 Comment(1)
How can I check for email validation using the above method ?Dump
C
2

A similar approach to @mwigdahl's answer, you can also implement a .NET CLR in C#, with code such as;

using System.Data.SqlTypes;
using RX = System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
 [Microsoft.SqlServer.Server.SqlFunction]
 public static SqlString Regex(string input, string regex)
 {
  var match = RX.Regex.Match(input, regex).Groups[1].Value;
  return new SqlString (match);
 }
}

Installation instructions can be found here

Carter answered 17/12, 2019 at 13:28 Comment(0)
E
0

I thought I needed RegEx in order to perform my query (to find content that has an <img> tag but does not have an alt attribute in the tag) but that ended up not being the case (I had found a regex for the above criteria but it wasn't working for me in SQL).

In my case the following query was perfect. You may also have a situation where you don't actually need RegEx for your situation. The below query works because of sequential processing of the WHERE clause:

select *
from Content c
where c.Deleted = 0
    and c.content_html like '%<img%'
    and c.content_html not like '%alt=%'
Excepting answered 10/4, 2024 at 17:6 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.