How to Replace Multiple Characters in Access SQL?
Asked Answered
G

9

1

I'm a novice at SQL, so hopefully someone can spell this out for me. I tried following the "Replace Multiple Strings in SQL Query" posting, but I got stuck.

I'm trying to do the same thing as the originator of the above posting but with a different table and different fields. Let's say that the following field "ShiptoPlant" in table "BTST" has three records (my table actually has thousands of records)...

Table Name: BTST

   ---------------
   | ShiptoPlant |
   | ----------- |
   | Plant #1    |
   | Plant - 2   |
   | Plant/3     |
   ---------------

Here's what I'm trying to type in the SQL screen:

SELECT CASE WHEN ShipToPlant IN ("#", "-", "/") Then ""
ELSE ShipToPlant END FROM BTST;

I keep getting the message (Error 3075)...

"Syntax error (missing operator) in query expression 
'CASE WHEN ShiptoPlant IN (";","/"," ") Then "" ELSE ShipToPlant END'."

I want to do this operation for every character on the keyboard, with exception of "*" since it is a wildcard.

Any help you could provide would be greatly appreciated!

EDIT: Background Information added from the comments

I have collected line-item invoice-level data from each our 14 suppliers for the 2008 calendar year. I am trying to normalize the plant names that are given to us by our suppliers.

Each supplier can call a plant by a different name e.g.

Signode Service on our master list could be called by suppliers

Signode Service 
Signode - Service.
SignodeSvc
SignodeService

I'm trying to strip non-alphanumeric chars so that I can try to identify the plant using our master listing by creating a series of links that look at the first 10 char, if no match, 8 char, 6, 4...

My basic hang-up is that I don't know how to strip the alphanumeric characters from the table. I'll be doing this operation on several columns, but I planned on creating separate queries to edit the other columns.

Perhaps I need to do a mass update query that strips all the alphanumerics. I'm still unclear on how to write it. Here's what I started out with to take out all the spaces. It worked great, but failed when I tried to nest the replace

UPDATE BTST SET ShipToPlant = replace(ShipToPlant," ","");

EDIT 2: Further Information taken from Comments

Every month, up to 100 new permutations of our plant names appear in our line item invoice data- this could represent thousands of invoice records. I'm trying to construct a quick and dirty way to assign a master_id of the definitive name to each plant name permutation. The best way I can see to do so is to look at the plant, address, city and state fields, but the problem with this is that these fields have various permutations as well, for example,

128 Brookview Drive
128 Brookview Lane

By taking out alphanumerics and doing

LEFT(PlantName,#chars) & _
LEFT(Address,#chars) & _
LEFT(City,#chars) & _
LEFT(State,#chars) 

and by changing the number of characters until a match is found between the invoice data and the Master Plant Listing (both tables contain the Plant, Address, City and State fields), you can eventually find a match. Of course, when you start dwindling down the number of characters you are LEFTing, the accuracy becomes compromised. I've done this in excel and had decent yield. Can anyone recommend a better solution?

Glasgow answered 14/4, 2009 at 17:48 Comment(13)
@Glasgow - Do you need to permanently update the records or simply select a resultset with the characters replaced? Also, when you say every character, do you literally mean every character e.g. a-z, A-Z, 1-9, !"£$ etc...? Can I ask what you need to do this for?Ostrogoth
This is a bit confusing - Are you just trying strip non-alphanumeric characters from the plant name?Hellman
Russ - It's a bit complicated. The simple answer to your question is that I can do either. I have collected line-item invoice-level data from each ouf our fourteen suppliers for the 2008 calendar year. I am trying to normalize the plant names that are given to us by our suppliers.Glasgow
Each supplier can call a plant by a different name (ie. "Signode Service" vs "Signode - Service"). I'm trying to strip non-alphanumeric chars so that I can try to identify the plant (using our master listing) by creating a series of links that look at the first 10 char, if no match, 8 char, 6, 4...Glasgow
For example, our master plant listing could say "Signode Service". One supplier may list this plant as "Signode Svc". Another may use "Signode - Service". If I strip non-alphanumerics from each of these, I can begin looking for matches on various lengths of the string.Glasgow
1. SignodeService (say this is is the master listing w/ PK = 1) 2. SignodeSvc (this is Supplier A) 3. SignodeService (this is Supplier B) - (1) and (3) match left(plantname,9) - All match left(plantname,8) - Goal = Link tbls so that both suppliers to bring back primary key # 1.Glasgow
My basic hang-up is that I don't know how to strip the alphanumeric characters from the table. I'll be doing this operation on several columns, but I planned on creating separate queries to edit the other columns.Glasgow
Perhaps I need to do a mass update query that strips all the alphanumerics. I'm still unclear on how to write it. Here's what I started out with to take out all the spaces. It worked great, but failed when I tried to nest the replace: UPDATE BTST SET ShipToPlant = replace(ShipToPlant," ","");Glasgow
@Glasgow - See my update for a way to do this. You could also handle using a lookup table. Essentially, map all of the different supplier permutations for a plant name to one id.Ostrogoth
Alternatively, Remou's regular expression approach will work well. I would advise cleaning the data instead of running database queries on partial string matches which will not be performantOstrogoth
@Glasgow - there are three good solutions here: mine, Remou's or Russ' - pick one of them...Hellman
@DJ - Please see my response to Russ's posting. Thanks for your help, by the way.Glasgow
You got plenty of answers. For filtering the records you are going to update and save time, don't forget the power of Like: WHERE ShipToPlant LIKE "*[#-/]*"Benefield
M
8

You may wish to consider a User Defined Function (UDF)

SELECT ShiptoPlant, CleanString([ShiptoPlant]) AS Clean
FROM Table


Function CleanString(strText)
Dim objRegEx As Object

Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True

objRegEx.Pattern = "[^a-z0-9]"
CleanString = objRegEx.Replace(strText, "")

End Function
Melva answered 14/4, 2009 at 20:1 Comment(0)
O
4

You could use the built in Replace function within Access

SELECT
    Replace(Replace(Replace(ShipToPlant, "#", ""), "-", ""), "/", "") AS ShipToPlant
FROM
    BTST

As others have said, within Access you can write your own functions in VBA and use them in your queries.

EDIT:

Here's a way to handle the nested Replace limit by wrappering the Replace function within our own function. It feels dirty but it works- put this in a module within Access

Public Function SuperReplace(ByRef field As String, ByVal ReplaceString As String) As String
    ' Size this as big as you need... it is zero-based by default' 
    Dim ReplaceArray(3) As String

    'Fill each element with the character you need to replace'  
    ReplaceArray(0) = "#"
    ReplaceArray(1) = "-"
    ReplaceArray(2) = "/"
    ReplaceArray(3) = " "

    Dim i As Integer
    For i = LBound(ReplaceArray) To UBound(ReplaceArray)    
       field = Replace(field, ReplaceArray(i), ReplaceString)
    Next i

    SuperReplace = field    
End Function

Then test it with this query

SELECT 
    SuperReplace(ShipToPlant,"") AS ShipToPlant
FROM
    BTST

You might want to take this an expand it so that you can pass in an array of strings instead of hard-coding them into the function.

EDIT 2:

In response to the additional information in the comments on the question, here's a suggestion for how you might want to handle the situation differently. The advantage to this apprach is that once you have mapped in a plant name permutation, you won't need to perform a string replace on future data in future years, only add new plant names and permutations to the map.

Start with creating another table, let's call it plant_map

CREATE TABLE plant_map (id AUTOINCREMENT PRIMARY KEY, name TEXT, master_id LONG)

into plant_map, add all of the permutations for plant names and insert the id for the name you wish to use to refer to a particular plant name permutation group with, into the master_id field. From your comments, I'll use Signode Service

INSERT INTO plant_map(name, master_id) VALUES ("Signode Service", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode Svc", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode - Service", 1);
INSERT INTO plant_map(name, master_id) VALUES ("Signode svc", 1);
INSERT INTO plant_map(name, master_id) VALUES ("SignodeService", 1);

Now when you query BTST table, you can get data for Signode Service using

SELECT
    field1,
    field2
FROM
    BTST source
INNER JOIN
    (
    plant_map map1      
    INNER JOIN
    plant_map map2
    ON map1.master_id = map2.id
    )
    ON source.ShipToPlant = map1.name
WHERE
    map2.name = "Signode Service"

Data within table BTST can remain unchanged.

Essentially, this is joining on the plant name in BTST to the name in plant_map then, using master_id, self joining on id within plant_map so that you need only pass in one "common" name. I would advise putting an index on each of the columns name and master_id in plant_map as both fields will be used in joins.

Ostrogoth answered 14/4, 2009 at 18:41 Comment(7)
Hi Russ, Evidently Access has a limit to the number times you can nest the Replace function. I tried doing 35 nests. Of course, I could break it up across several queries, but I'd rather have everything in one query if possible.Glasgow
Russ -- We're thinking along the same track. Every month, up to 100 new permutations of our plant names appear in our line item invoice data. I'm trying to construct a quick and dirty way to assign the master_id you mentioned...Glasgow
Rather, 100 unique permutations of our plant names appear -- this could represent thousands of invoice records. Anyway, I'm trying to create a process to assign the master_id that I have in our "Master Plant Listing". The best way to do so is to look at the plant, address, city and state fields...Glasgow
but these fields have various permutations as well (Ex. 128 Brookview Drive vs. 128 Brookview Lane). By taking out alphanumerics and doing LEFT(PlantName,#chars)&LEFT(Address,#chars)&LEFT(City,#chars)&LEFT(State,#chars) and by changing the # chars until a match is found between the invoice data...Glasgow
and the Master Plant Listing (both tables contain the Plant, Address, City and State fields), you can eventually find a match. Of course, when you start dwindling down the # of chars you are lefting, the accuracy becomes compromised. I've done this in excel and had decent yield.Glasgow
I don't envy having 100 new permutations of the plant names every month - can the process of customers detailing plant names be tightened up in any way, or is it entirely out of your control? It sounds like you might want to look at pattern matching to ascertain whether 2 plants are the same plantOstrogoth
+1 for the _map table approach. FWIW in our shop we identify such tables by appending the element name 'Synonyms'.Northrup
D
2

Don't think Access supports the CASE statement. Consider using iif:

iif ( condition, value_if_true, value_if_false )

For this case you can use the REPLACE function:

SELECT 
    REPLACE(REPLACE(REPLACE(yourfield, '#', ''), '-', ''), '/', '') 
    as FieldName
FROM
    ....
Dorris answered 14/4, 2009 at 18:26 Comment(2)
Hi Andomar, I just tried to nest 35 replace statements and received a message saying that the query was "too complex". I'm going to try the IIF option next. Will post a response within a few minutes. Thanks for your help.Glasgow
IIf() will be much worse. If you have 35 values you want to replace, then you need to write a function to do this.Matrilateral
H
1

OK, your question has changed, so the solution will too. Here are two ways to do it. The quick and dirty way will only partially solve your issue because it won't be able to account for the more odd permutations like missing spaces or misspelled words. The quick and dirty way:

  1. Create a new table - let's call it tChar.
  2. Put a text field in it - the char(s) you want to replace - we'll call it char for this example
  3. Put all the char or char combinatios that you want removed in this table.
  4. Create and run the query below. Note that it will only remove one item at a time, but you can also put different versions of the same replacement in it too like ' -' or '-' For this example I created a table called tPlant with a field called ShipToPlant.

    SELECT tPlant.ShipToPlant, Replace([ShipToPlant], (SELECT top 1 char FROM tChar WHERE instr(ShipToPlant,char)<>0 ORDER BY len(char) Desc),"" ) AS New FROM tPlant;

The better (but much more complex) way. This explanation is going to be general because it would be next to impossible to put the whole thing in here. If you want to contact me directly use my user name at gmail.:

  1. Create a table of Qualifiers - mistakes that people enter like svc instead of service. Here you would enter every wierd permutation you get.
  2. Create a table with QualifierID and Plant ID. Here you would say which qualifier goes to which plant.
  3. Create a query that joins the two and your table with mistaken plant names in it. Use instr so say what is in the fields.
  4. Create a second query that aggragates the first. Count the instr field and use it as a score. The entry with the highest score is the plant.
  5. You will have to hand enter the ones it can't find, but pretty soon that will be next to none as you have more and more entries in the table.

ughh


You have a couple different choices. In Access there is no CASE in sql, you need to use IIF. It's not quite as elegant as the solutions in the more robust db engines and needs to be nested for this instance, but it will get the job done for you.

SELECT
    iif(instr(ShipToPlant,"#")<>0,"",
    iif(instr(ShipToPlant,"-")<>0,"",
    iif(instr(ShipToPlant,"/")<>0,"",ShipToPlant ))) AS FieldName
FROM BTST;

You could also do it using the sql to limit your data.

SELECT YourID, nz(aBTST.ShipToPlant,"") AS ShipToPlant  
FROM BTST LEFT JOIN (
    SELECT YourID, ShipToPlant 
    FROM BTST 
    WHERE ShipToPlant NOT IN("#", "-", "/")
    ) as aBTST ON BTST.YourID=aBTST.YourID

If you know VB you can also create your own functions and put them in the queries...but that is another post. :) HTH

Hemphill answered 14/4, 2009 at 18:30 Comment(2)
Hi Praesagus, I tried using your first option. When I ran the query, it did not show the contents of the cells that had #, -, or /. In other words, a cell that showed "Bob/Johnson" returned blank. I was expecting to see "BobJohnson".Glasgow
Now that you have updated your question it is a much different scenario. You are on the right track - Replace is the answer. UPDATE BTST SET ShipToPlant = replace(replace(replace(ShipToPlant,"#",""),"-",""),"/",""). Or run 3 queries with one replace each.Hemphill
H
1

Create a public function in a Code module.

Public Function StripChars(ByVal pStringtoStrip As Variant, ByVal pCharsToKeep As String) As String

Dim sChar As String
Dim sTemp As String
Dim iCtr As Integer

  sTemp = ""

  For iCtr = 1 To Len(pStringtoStrip)
    sChar = Mid(pStringtoStrip, iCtr, 1)
    If InStr(pCharsToKeep, sChar) > 0 Then
      sTemp = sTemp & sChar
    End If
  Next

  StripChars = sTemp

End Function

Then in your query

SELECT
    StripChars(ShipToPlant, "abcdefghijklmnopqrstuvwxyz0123456789") AS ShipToPlantDisplay  
FROM 
    BTST

Notes - this will be slow for lots of records - if you what this to be permanent then create an update query using the same function.

EDIT: to do an Update:

UPDATE BTST
    SET ShipToPlant = StripChars(ShipToPlant, "abcdefghijklmnopqrstuvwxyz0123456789")
Hellman answered 14/4, 2009 at 19:54 Comment(2)
Hi DJ -- I tried doing this but for non-alphanumerics. The resultset yielded only blank cells. Any guidance?Glasgow
Sorry - slight typo -> If InStr(sCharsToKeep, sChar) > 0 Then should be If InStr(pCharsToKeep, sChar) > 0 Then - I updated my answer - also you list the characters you want to keep - anything else will get stripped.Hellman
S
0

SELECT 
IIF
(
    Instr(1,ShipToPlant , "#") > 0 
    OR Instr(1,ShipToPlant , "/") > 0 
    OR Instr(1,ShipToPlant , "-") > 0, ""
    , ShipToPlant 
)
FROM BTST

Silique answered 14/4, 2009 at 18:28 Comment(1)
Hi Shahkalpesh, I tried using your first option. When I ran the query, it did not show the contents of the cells that had #, -, or /. In other words, a cell that showed "Bill-Jones" returned blank. I was expecting to see "BillJones".Glasgow
G
0

All - I wound up nesting the REPLACE() function in two separate queries. Since there's upwards of 35 non-alphanumeric characters that I needed to replace and Access limits the complexity of the query to somewhere around 20 nested functions, I merely split it into two processes. Somewhat clunky, but it worked. Should have followed the KISS principle in this case. Thanks for your help!

Glasgow answered 16/4, 2009 at 21:44 Comment(1)
Did you see the query string example I made for you? It will essentially take the place of as many replaces as you want in your recordset, plus you can manage them all from a table. very easy. :)Hemphill
C
0

I know this is a really old question, but I stumbled over it whilst looking for a solution to this problem, but ended up using a different approach.

The field that I wish to update is called 'Customers'. There are 20-odd accented characters in the 'CustName' field for which I wish to remove the diacritics - so (for example) ã > a.

For each of these characters I created a new table 'recodes' with 2 fields 'char' and 'recode'. 'char' contains the character I wish to remove, and 'recode' houses the replacement.

Then for the replace I did a full outer join inside the update statement

UPDATE Customers, Recodes SET Customers.CustName = Replace([CustName],[char],[recode]);

This has the same effect as nesting all of the replace statements, and is a lot easier to manage.

Cestar answered 14/9, 2015 at 16:10 Comment(0)
V
0

This query grabs the 3 first characters and replace them with Blanks

Example: BO-1234
Output: 1234

BO: IIf(IsNumeric(Left([sMessageDetails],3)),[sMessageDetails],Replace([sMessageDetails],Left([sMessageDetails],3),""))
Variate answered 20/5, 2019 at 14:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.