SQL find-and-replace regular-expression capturing-group limit?

I need to convert data from a spreadsheet into insert statements in SQL. I've worked out most of the regular expressions for using the find and replace tool in SSMS, but I'm running into an issue when trying to reference the 9th parenthesized item in my final replace.

Here is the original record:

Blue Doe 12/21/1967 1126 Queens Highway Torrance CA 90802 N 1/1/2012

And this is what I need (for now):

select 'Blue','Doe','19671221','1126 Queens Highway','Torrance','CA','90802','N','20120101'

Due to limitations on the number of parenthesized items allowed, I have to run through the replace three times. This may be worked into a stored procedure if I can first make this work as a POC.

This is the first matching expression:

^{:w:b:w:b}{:z}/{:z}/{:z:b[0-9A-Za-z:b]+:b:w:b[A-Z]+:b:z:b:w:b}{:z}/{:z}/{:z}

And the replace: \10\2/0\3/\40\5/0\6/\7

This adds zeros to the months and days so that they have at least two characters.

The next match reformats the dates into the format required in the query (please, no comments about not using a date field; this is a client requirement for the database).

Matching expression:

^{:w:b:w:b}[0-9]*{[0-9]^2}/[0-9]*{[0-9]^2}/{:z}{:b[0-9A-Za-z:b]+:b:w:b[A-Z]+:b:z:b:w:b}[0-9]*{[0-9]^2}/[0-9]*{[0-9]^2}/{:z}

And the replace: \1\4\(2,2)\(2,3)\5\8\(2,6)\(2,7)
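
For comparison only, the combined effect of these first two passes (pad the month and day to two digits, then reorder to yyyymmdd) can be sketched in Python. This uses Python's `re` module rather than the SSMS Find/Replace syntax, and the helper name `normalize_dates` is made up for this illustration:

```python
import re

# Matches m/d/yyyy or mm/dd/yyyy; month and day may be one or two digits.
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def normalize_dates(record: str) -> str:
    """Rewrite every m/d/yyyy date in the record as yyyymmdd (hypothetical helper)."""
    def to_yyyymmdd(match: re.Match) -> str:
        month, day, year = match.groups()
        # Zero-pad month and day, then put the year first.
        return f"{year}{int(month):02d}{int(day):02d}"
    return DATE.sub(to_yyyymmdd, record)

record = "Blue Doe 12/21/1967 1126 Queens Highway Torrance CA 90802 N 1/1/2012"
print(normalize_dates(record))
# Blue Doe 19671221 1126 Queens Highway Torrance CA 90802 N 20120101
```

The callback does in one step what the two SSMS passes do separately, because a replacement function can pad and reorder at the same time.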

Finally, the last match wraps the results in the select statement that will feed the insert.

Matching expression:

^{:w}:b{:w}:b{:z}:b{[0-9A-Za-z:b]+}:b{:w}:b{[A-Z]+}:b{:z}:b{:w}:b{:z}

And the replace: select '\1','\2','\3','\4','\5','\6','\7','\8','\9'
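
As a point of reference, here is a rough Python equivalent of this last pass. It is only a sketch: it assumes `:w` corresponds to `[A-Za-z]+`, `:z` to `\d+`, and `:b` to a single space, and that the dates have already been rewritten as yyyymmdd. Python's `re` module resolves `\9` in a replacement string, so it shows the output the SSMS replace is intended to produce:

```python
import re

# Assumed translations of the SSMS shorthands (not taken from SSMS itself):
#   :w -> [A-Za-z]+    :z -> \d+    :b -> a single space
NINE_FIELDS = re.compile(
    r"^([A-Za-z]+) ([A-Za-z]+) (\d+) ([0-9A-Za-z ]+) "
    r"([A-Za-z]+) ([A-Z]+) (\d+) ([A-Za-z]+) (\d+)$"
)
REPLACEMENT = r"select '\1','\2','\3','\4','\5','\6','\7','\8','\9'"

line = "Blue Doe 19671221 1126 Queens Highway Torrance CA 90802 N 20120101"
result = NINE_FIELDS.sub(REPLACEMENT, line)
print(result)
# select 'Blue','Doe','19671221','1126 Queens Highway','Torrance','CA','90802','N','20120101'
```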

It all works except the last replacement. For some reason the \9 is NOT getting the data from the match. If I just replace the whole replace expression with \9 I get a blank space. If I use \8, I get N. If I eliminate the 8th parenthesized item, thus making my 9th item eighth, it returns what I want, 20120101.

So my question is, does SSMS / SQL allow for 9 tagged expressions when using find / replace and regular expressions? Or am I missing something here? I know there are other ways to do this. I'm just trying to get it done quickly as a POC before we move this into a sproc or application.

Thanks for any assistance. -Peter

Bolus answered 29/3, 2012 at 17:32 Comment(6)
You can import from a spreadsheet directly. Is the data already in separate columns? - Horse
Why do you need to use replace? If the data is from a spreadsheet and you are using SSMS, why not use the import/export manager? Also, why reformat the data? Does the spreadsheet not support mm/dd/yyyy date formats? Couldn't you write some cell formulas to concatenate a string which inserts the data for you? - Ardeb
Please edit your question to add proper formatting of code expressions. You can do this by surrounding them with backticks (`), by selecting all and clicking the toolbar button {}, or by marking a block of code and pressing Ctrl+K. You can preview your post (before posting it) immediately below the "Submit question" button as you're entering it; the preview updates in real time, so it's a WYSIWYG view. Proper formatting makes your question easier to read and understand, and therefore makes it much more likely you'll get an answer. Thanks. :) - Luminance
Thanks guys, but these are limitations that I have to work with. The data is coming in the format I stated. There are other pieces of data that are added on in the insert statement that are NOT coming from this sheet, so a direct import will not work. I'm not looking for an alternate workaround. I need to make this work. Thanks - Bolus
@Peter Anderson Do not forget to accept an answer and let us know what you decided. Also, if you know why the find/replace expressions are not working, that would be nice to have. - Ardeb
@Trisped: I'll accept an answer when someone answers the question. Everyone is giving me alternatives to my process as opposed to answering the question about regular expressions in SQL using the find and replace system and, most notably, the \9 parameter for the ninth parenthesized item. - Bolus

None of your matching expressions work with the record you provided in my MS SQL Server Management Studio 2008r2.

From your description it sounds like there is an issue with the Tagged Expression 9 since the desired result is returned when using Tagged Expression 8, but not 9. You may want to ask Microsoft or report it as a bug.

A quicker solution would be to move the text you are performing the Find/Replace on in SSMS to a spreadsheet and use cell formulas to parse the data into insert commands. If you have MS Excel, the CONCATENATE, FIND, and MID functions will probably be useful. It also helps to split the values into their own columns so you can format the date, then use one CONCATENATE to build your insert.

Please let me know if you need an example.

Update: I tried your example in MS SQL Server Management Studio 2008r2, Visual Studio 2005, and Visual Studio 2010 and got the same result you did: \9 returns an empty string. Checking around, I found that others are also having this issue (see the community content from Henrique Evaristo) and that the whole system has been replaced in the new editors.

So in answer to your question, SSMS does not support 9 tagged expressions due to a bug.

If you are unable to use the spreadsheet idea, you could try splitting the action into two parts: set the first 8 values, then make a second pass to do the last. For example:

^{:w}:b{:w}:b{:z}:b{[0-9A-Za-z:b]+}:b{:w}:b{[A-Z]+}:b{:z}:b{:w}:b:z
select '\1','\2','\3','\4','\5','\6','\7','\8','\0'

:w:b:w:b:z:b[0-9A-Za-z:b]+:b:w:b[A-Z]+:b:z:b:w:b{:z}
\1
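
To show what the two passes above do to the sample record, here is a rough emulation in Python. It is a sketch under assumptions, not SSMS behavior: `:w`, `:z`, and `:b` are translated as `[A-Za-z]+`, `\d+`, and a single space, the dates are assumed to be in yyyymmdd form already, and `\g<0>` is Python's spelling of the whole-match reference written as `\0` above:

```python
import re

# Assumed translations of the SSMS shorthands used in the two passes.
W, Z, ADDR = r"[A-Za-z]+", r"\d+", r"[0-9A-Za-z ]+"

# Pass 1: eight tagged groups; the ninth field is matched but not tagged.
PASS1 = re.compile(rf"^({W}) ({W}) ({Z}) ({ADDR}) ({W}) ([A-Z]+) ({Z}) ({W}) {Z}$")
REPL1 = r"select '\1','\2','\3','\4','\5','\6','\7','\8','\g<0>'"

# Pass 2: the same record shape, now with only the last field tagged.
PASS2 = re.compile(rf"{W} {W} {Z} {ADDR} {W} [A-Z]+ {Z} {W} ({Z})")
REPL2 = r"\1"

line = "Blue Doe 19671221 1126 Queens Highway Torrance CA 90802 N 20120101"
after_pass1 = PASS1.sub(REPL1, line)   # last quoted item holds the whole record
result = PASS2.sub(REPL2, after_pass1) # collapse that record to its last field
print(result)
# select 'Blue','Doe','19671221','1126 Queens Highway','Torrance','CA','90802','N','20120101'
```

The second pattern is not anchored to the start of the line, because after the first pass the original record only survives inside the last pair of quotes.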
Ardeb answered 29/3, 2012 at 18:30 Comment(3)
Thanks for your reply. I think the reason they don't work for you is a problem with the way they were pasted into SO. I've updated the original text to reflect the 'singular space' between each item that came across as multiple when pasted from SQL to SO. - Bolus
@Peter Anderson updated with results. I would switch to Excel or code up your own solution, but I have supplied one in case you want it. You might want to prefix the '\0' with a special character since you will no longer be able to search by line. Or, you could change the second query to find the '\0' in the select and work from there. - Ardeb
Thanks for your answer on this. That's what I needed. And thanks for the additional option at the end. I'm already splitting this thing three times to handle the various replaces; I hadn't thought of just using the \0 and taking the last bit on a fourth replace. Thanks again. - Bolus
