Find and Replace credit card numbers
Asked Answered
J

6

5

We have a large database with a lot of data in it. I found out recently our sales and shipping department have been using a part of the application to store clients credit card numbers in the open. We've put a stop to it, but now there are thousands of rows with the numbers.

We're trying to figure out how to scan certain columns for 16 digits in a row (or dash separation) and replace them with X's.

It's not a simple UPDATE statement because the card numbers are stored among large amounts of text. So far I've been unable to figure out if SQL Server is capable of regex (it would seem not).

All else fails i will do this through PHP since that is what i'm best at... but it'll be painful.

Jacobo answered 26/6, 2012 at 13:23 Comment(6)
RegEx is certainly possible in SQL Server, just not natively - you need to use CLR. There are lots of examples of this online: - codeproject.com/Articles/42764/… - justgeeks.blogspot.com/2008/08/… - and probably the most authoritative resource here: - sqlmag.com/article/tsql3/clr-or-not-clr-is-that-the-question- In SQL Server 2008 R2+, you can access some RegEx functionality through Master Data Services, without having to write your own CLR: - thefirstsql.com/2011/02/07/rOsteoporosis
This is the kind of thing that application code tends to be easier to design to handle. You are right, reg ex seems the way to go. You could do it in C# and put it in a CLR stored proc so it can easily be run from SSMS. Sadly since this is in a text field, you may need to identify which records are potentially bad and then have someone manually examine them. After all a part number could end up with the right number of digits as well. I suggest the people who did the data entry incorrectly be the ones to do the manual examining.Endamoeba
You might also search for the phrases credit card, card number, CCN as clues to records that might have the card number in them.Endamoeba
I agree with HLGEM. Also, consider adding validation on the front-end to prevent this from happening in the future. Rather than just telling users not to do it, you could attempt to detect that a card number was entered and then require, say, a badge scan or whatever from a supervisor. That way, there's less chance of it becoming a problem again later.Brisling
Yeah, you should be clear at the legal implications with your users as well. If your systems fail to be PCI compliant you can lose the ability to take cards at all.Mariellemariellen
@Endamoeba "I suggest the people who did the data entry incorrectly be the ones to do the manual examining." ---- THISMariellemariellen
Q
4

I think you are better off doing this programatically, especially since you mentioned the data can be in a couple of different formats. Do keep in mind that not all credit card numbers are 16 digits long (Amex is 15, Visa is 13 or 16, etc).

The ability to check for various regexes and validate code will probably be best served at a cleanup job level, if possible.

Quota answered 26/6, 2012 at 13:33 Comment(1)
Agreed - we do this programmatically. I added an example below for anyone who wants to use PHP.Stoned
A
5

Sounds like you need to use PATINDEX with a WHERE LIKE clause.

Something like this. Create a stored proc with something similar, then call it with a bunch of different parameters (make @pattern & @patternlength the params) that you have identified, until you've replaced all of the instances.

declare @pattern varchar(100), @patternlength int
set @pattern = '[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'
set @patternlength = 19

update  tableName
set fieldName = 
    LEFT(fieldName, patindex('%'+ @pattern + '%', fieldName)-1) 
    + 'XXXX-XXXX-XXXX-XXXX' 
    + SUBSTRING(fieldName, PATINDEX('%'+ @pattern + '%', fieldName)+@patternlength, LEN(fieldName))
from tableName
where fieldName like '%'+ @pattern + '%'

The trick is just finding the appropriate patterns, and setting the appropriate @patternlength value (not the length of @pattern as that won't work!)

Assagai answered 26/6, 2012 at 13:26 Comment(0)
Q
4

I think you are better off doing this programatically, especially since you mentioned the data can be in a couple of different formats. Do keep in mind that not all credit card numbers are 16 digits long (Amex is 15, Visa is 13 or 16, etc).

The ability to check for various regexes and validate code will probably be best served at a cleanup job level, if possible.

Quota answered 26/6, 2012 at 13:33 Comment(1)
Agreed - we do this programmatically. I added an example below for anyone who wants to use PHP.Stoned
T
1

Improvised Sean's answer.

The following will find all the occurrences of @maskPattern in @text and replace them with 'x'. Example, If @maskPattern = XXXX-XXXX-XXXX-XXXX, it will find this pattern in @text and replace all occurrences with XXXX-XXXX-XXXX-XXXX. If it does not find any occurrence, it will leave the text as is.

This stored procedure can also be manipulated to only mask 3/4th of the beginning of the maskPattern. Cheers!

  ALTER PROCEDURE [dbo].[SP_MaskCharacters] @text nvarchar(max),
  @maskPattern nvarchar(500)
  AS
   BEGIN
  DECLARE @numPattern nvarchar(max) = REPLACE(@maskPattern, 'x', '[0-9]')
  DECLARE @patternLength int = LEN(@maskPattern)
  WHILE (@text IS NOT NULL)
  BEGIN
  IF PATINDEX('%' + @numPattern + '%', @text) = 0  BREAK;
  SET @text =
    LEFT(@text, PATINDEX('%' + @numPattern + '%', @text)-1) --Get beginning chars of the input text until first occurance of pattern is found
    + @maskPattern --Append aasking pattern
    + SUBSTRING(@text, PATINDEX('%' + @numPattern + '%', @text) + @patternLength, LEN(@text)) -- Get & append rest of the text found after masking attern
    END
    SELECT @text
END
Timon answered 27/8, 2016 at 0:29 Comment(0)
A
1

I faced this situation recently. Using Patindex and Stuff should help, but you would need to repeat for CC numbers with different number of digits separately.

-- For 16 digits CC numbers
UPDATE table
SET    columnname = Stuff (columnname, Patindex(
'%[3-6][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
, columnname), 16, '################')
WHERE  Patindex(
'%[3-6][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
, columnname) > 0 

Ardor answered 13/4, 2021 at 9:39 Comment(0)
T
0

You can use patindex. It won't be pretty and there might be a more concise way to write it. But you can use sets ie [0-9]

patindex: http://msdn.microsoft.com/en-us/library/ms188395.aspx

similar question: SQL Server Regular expressions in T-SQL

Trafalgar answered 26/6, 2012 at 13:31 Comment(0)
S
0

For anyone finding this question who does want to use PHP, here's a function I use that takes a credit card number (all digits, with dashes, or with spaces) and replaces all but the first and last 4 digits with 'X'.

To accept credit card numbers with dashes as well, use this regex pattern instead:

$cc_regex_pattern = '/(\d{4})(-)?(\d{4})(-)?(\d{4})(-)?(\d{4})/'

and remove the preprocessing of the cc number that removes the dashes:

$compressed_cc_number = preg_replace('/(\ |-)/', '', $credit_card_number);

and so the replacement string becomes (because we've changed the index of patterns - note the $7):

$cc_regex_replacement = '$1' . $cc_middle_pattern . '$7';

or if you want, simply replace the whole cc number, like in the original question:

$cc_regex_replacement = 'XXXX$2XXXX$4XXXX$6XXXX';

Here's the original function for credit card numbers with or without spaces or dashes, which obfuscates and removes any dashes:

/**
 * @param integer|string $credit_card_number
 * @return mixed
 */
static function obfuscate_credit_card($credit_card_number)
{

    $compressed_cc_number = preg_replace('/(\ |-)/', '', $credit_card_number);

    $cc_length = strlen($compressed_cc_number);
    $cc_middle_length = $cc_length >= 9 ? $cc_length - 8 : 0;

    //create middle pattern
    $cc_middle_pattern = '';
    for ($i = 0; $i < $cc_middle_length; $i++) {
        $cc_middle_pattern .= 'X';
    }

    //replace cc middle digits with middle pattern
    $cc_regex_pattern = '/(\d{4})(\d+)(\d{4})/';
    $cc_regex_replacement = '$1' . $cc_middle_pattern . '$3';
    $obfuscated_cc = preg_replace($cc_regex_pattern, $cc_regex_replacement, $compressed_cc_number);

    return $obfuscated_cc;
}
Stoned answered 25/5, 2018 at 20:28 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.