Perform regex (replace) in an SQL query
Asked Answered
G

5

18

What is the best way to replace all '&lt' with < in a given database column? Basically perform s/&lt[^;]/</gi

Notes:

  • must work in MS SQL Server 2000
  • Must be repeatable (and not end up with <;;;;;;;;;)
Gabbi answered 29/9, 2008 at 22:16 Comment(2)
your substitution will change "&lt." to "<", losing the character after the &lt.Elianaelianora
would s/&lt([^;])/<\1/gi work better?Gabbi
B
18

Some hacking required but we can do this with LIKE, PATINDEX, LEFT AND RIGHT and good old string concatenation.

create table test
(
    id int identity(1, 1) not null,
    val varchar(25) not null
)

insert into test values ('&lt; <- ok, &lt <- nok')

while 1 = 1
begin
    update test
        set val = left(val, patindex('%&lt[^;]%', val) - 1) +
                      '&lt;' +
                      right(val, len(val) - patindex('%&lt[^;]%', val) - 2)
    from test
    where val like '%&lt[^;]%'

    IF @@ROWCOUNT = 0 BREAK
end

select * from test

Better is that this is SQL Server version agnostic and should work just fine.

Bejewel answered 29/9, 2008 at 22:46 Comment(0)
R
12

I think this can be done much cleaner if you use different STUFF :)

create table test
(
    id int identity(1, 1) not null,
    val varchar(25) not null
)

insert into test values ('&lt; <- ok, &lt <- nok')

WHILE 1 = 1
BEGIN
    UPDATE test SET
        val = STUFF( val , PATINDEX('%&lt[^;]%', val) + 3 , 0 , ';' )
    FROM test
    WHERE val LIKE '%&lt[^;]%'

    IF @@ROWCOUNT = 0 BREAK
END

select * from test
Rashidarashidi answered 30/9, 2008 at 15:50 Comment(0)
P
7

How about:

    UPDATE tableName
    SET columName = REPLACE(columName , '&lt', '&lt;')
    WHERE columnName LIKE '%lt%'
    AND columnName NOT LIKE '%lt;%'

Edit:

I just realized this will ignore columns with partially correct &lt; strings.

In that case you can ignore the second part of the where clause and call this afterward:

    UPDATE tableName
    SET columName = REPLACE(columName , '&lt;;', '&lt;')
Plutonium answered 30/9, 2008 at 16:5 Comment(0)
C
1

If MSSQL's regex flavor supports negative lookahead, that would be The Right Way to approach this.

s/&lt(?!;)/&lt;/gi

will catch all instances of &lt which are not followed by a ; (even if they're followed by nothing, which [^;] would miss) and does not capture the following non-; character as part of the match, eliminating the issue mentioned in the comments on the original question of that character being lost in the replacement.

Unfortunately, I don't use MSSQL, so I have no idea whether it supports negative lookahead or not...

Chortle answered 30/9, 2008 at 16:26 Comment(1)
It doesn't. You can only match the same kinds of expressions the LIKE operator uses.Doughty
L
1

Very specific to this pattern, but I have done similar to this in the past:

REPLACE(REPLACE(columName, '&lt;', '&lt'), '&lt', '&lt;')

broader example (encode characters which may be inappropriate in a TITLE attribute)

REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
    columName
    -- Remove existing encoding:
    , '&amp;', '&')
    , '&#34;', '"')
    , '&#39;', '''')
    -- Reinstate/Encode:
    , '&', '&amp;')
    -- Encode:
    , '"', '&#34;')
    , '''', '&#39;')
    , ' ', '%20')
    , '<', '%3C')
    , '>', '%3E')
    , '/', '%2F')
    , '\', '%5C')
Lordsandladies answered 3/1, 2018 at 11:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.