Some changes on Soundex Algorithm

3

7

This algorithm runs over the first word only, or until it has filled the four-character code. For instance, the result for the input "Horrible Great" is H612. It neglects the second word; in other words, it takes only the first letter of the second word to fill the encoded string.

I would like to change it so that it encodes the first word and THEN encodes the second word separately; the output should be "H614 G600". I would like to know whether there is a way to do that by making some changes to this code.
Thank you so much :)

    private string Soundex(string data)
    {
        StringBuilder result = new StringBuilder();
        if (data != null && data.Length > 0)
        {
            string previousCode = "", currentCode = "", currentLetter = "";
            result.Append(data.Substring(0, 1));
            for (int i = 1; i < data.Length; i++)
            {
                currentLetter = data.Substring(i,1).ToLower();
                currentCode = "";

                if ("bfpv".IndexOf(currentLetter) > -1)
                    currentCode = "1";
                else if ("cgjkqsxz".IndexOf(currentLetter) > -1)
                    currentCode = "2";
                else if ("dt".IndexOf(currentLetter) > -1)
                    currentCode = "3";
                else if (currentLetter == "l")
                    currentCode = "4";
                else if ("mn".IndexOf(currentLetter) > -1)
                    currentCode = "5";
                else if (currentLetter == "r")
                    currentCode = "6";

                if (currentCode != previousCode)
                    result.Append(currentCode);

                if (result.Length == 4) break;

                if (currentCode != "")
                    previousCode = currentCode;
            }
        }

        if (result.Length < 4)
            result.Append(new String('0', 4 - result.Length));

        return result.ToString().ToUpper();
    }
Beestings answered 4/10, 2011 at 18:15 Comment(0)
4

Sure, here is the solution I came up with. I wrapped the existing algorithm in another method that splits the string into words and calls the original method on each one. To use it, call SoundexByWord("Horrible Great") instead of Soundex("Horrible Great"); the output is "H614 G630".

    private string SoundexByWord(string data)
    {
        var soundexes = new List<string>();
        foreach (var str in data.Split(' '))
        {
            soundexes.Add(Soundex(str));
        }
        // Net35OrLower is assumed to be a compilation symbol defined in the project.
    #if Net35OrLower
        // string.Join in .NET 3.5 and earlier requires the second parameter to be an array.
        return string.Join(" ", soundexes.ToArray());
    #else
        // string.Join in .NET 4 has an overload that takes IEnumerable<string>.
        return string.Join(" ", soundexes);
    #endif
    }
Photoelectron answered 5/10, 2011 at 17:13 Comment(3)
You are amazing! Thank you so much for sharing. Allow me to edit the code you wrote above: private string SoundexByWord(string data) { var soundexes = new List<string>(); foreach (var str in data.Split(' ')) { soundexes.Add(Soundex(str)); } return string.Join(" ", soundexes.ToArray()); } // Convert the list to an array, because string.Join takes a string[] :) - Beestings
That's a good point. The original answer was based on .NET 4. Based on your suggestion I expanded the answer to include earlier versions as well. - Photoelectron
I admire your last edit and your way of explaining :) Thanks again. - Beestings
0

Yes. First parse the string into an array of words (after you pick a separator).

Then run the algorithm on each word.

Then assemble the results in some acceptable way and return them.
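
The three steps above can be sketched roughly as follows. This is only an illustration, not the code from the question: `SplitEncodeJoin`, `EncodePhrase`, `EncodeWord` and `CodeFor` are hypothetical names, `EncodeWord` is a compact stand-in that follows the same coding rules as the question's `Soundex` method, and both the space separator and the decision to skip empty entries (so repeated spaces do not produce extra codes) are assumptions.

```csharp
using System;
using System.Linq;
using System.Text;

public static class SplitEncodeJoin
{
    // Step 1: split on the separator. Step 2: encode each word. Step 3: join.
    public static string EncodePhrase(string phrase)
    {
        if (string.IsNullOrEmpty(phrase)) return "";

        // Assumption: words are separated by spaces; empty entries are dropped.
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words.Select(w => EncodeWord(w)).ToArray());
    }

    // Stand-in per-word encoder using the same letter groups as the question.
    private static string EncodeWord(string word)
    {
        var result = new StringBuilder();
        result.Append(char.ToUpperInvariant(word[0]));
        char previous = '\0';

        for (int i = 1; i < word.Length && result.Length < 4; i++)
        {
            char code = CodeFor(char.ToLowerInvariant(word[i]));
            if (code != '\0' && code != previous) result.Append(code);
            if (code != '\0') previous = code;
        }

        // Pad short codes with zeros up to four characters.
        return result.ToString().PadRight(4, '0');
    }

    // Returns '\0' for letters that carry no code (vowels, h, w, y, others).
    private static char CodeFor(char c)
    {
        if ("bfpv".IndexOf(c) >= 0) return '1';
        if ("cgjkqsxz".IndexOf(c) >= 0) return '2';
        if ("dt".IndexOf(c) >= 0) return '3';
        if (c == 'l') return '4';
        if ("mn".IndexOf(c) >= 0) return '5';
        if (c == 'r') return '6';
        return '\0';
    }
}
```

With these rules, `SplitEncodeJoin.EncodePhrase("Horrible Great")` gives "H614 G630", the same per-word codes the wrapper answer reports.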

Gallion answered 4/10, 2011 at 18:19 Comment(1)
I thought of splitting the string and then concatenating the results to separate and assemble them, but I want to change the algorithm itself somehow. I appreciate your answer though :)) - Beestings
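
For the variant asked about in this comment, changing the algorithm itself rather than wrapping it, one possible sketch is a single pass that restarts the encoding at every space. This is one reading of the request, not the asker's or any answerer's code: `SinglePassSoundex`, `Encode` and `CodeFor` are hypothetical names, the letter groups are copied from the question, and treating only the space character as a word boundary is an assumption.

```csharp
using System;
using System.Text;

public static class SinglePassSoundex
{
    public static string Encode(string data)
    {
        if (string.IsNullOrEmpty(data)) return "";

        var output = new StringBuilder();
        int wordLength = 0;    // characters emitted so far for the current word
        char previous = '\0';  // last code seen in the current word
        bool inWord = false;

        foreach (char raw in data)
        {
            if (raw == ' ')
            {
                // Word boundary: pad the finished word to four characters.
                if (inWord) { output.Append('0', 4 - wordLength); inWord = false; }
                continue;
            }

            if (!inWord)
            {
                // First letter of a new word is kept as-is (upper-cased).
                if (output.Length > 0) output.Append(' ');
                output.Append(char.ToUpperInvariant(raw));
                wordLength = 1;
                previous = '\0';
                inWord = true;
                continue;
            }

            if (wordLength == 4) continue; // code is full; skip the rest of the word

            char code = CodeFor(char.ToLowerInvariant(raw));
            if (code != '\0' && code != previous) { output.Append(code); wordLength++; }
            if (code != '\0') previous = code;
        }

        // Pad the last word if the input did not end with a space.
        if (inWord) output.Append('0', 4 - wordLength);
        return output.ToString();
    }

    // Returns '\0' for letters that carry no code.
    private static char CodeFor(char c)
    {
        if ("bfpv".IndexOf(c) >= 0) return '1';
        if ("cgjkqsxz".IndexOf(c) >= 0) return '2';
        if ("dt".IndexOf(c) >= 0) return '3';
        if (c == 'l') return '4';
        if ("mn".IndexOf(c) >= 0) return '5';
        if (c == 'r') return '6';
        return '\0';
    }
}
```

The trade-off is that the word-boundary bookkeeping (`inWord`, `wordLength`) is mixed into the encoding loop, which avoids the intermediate array from splitting but is harder to read than calling a per-word method.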
0

The implementation in the question is correct but creates excess garbage with its string operations. Here's a char-array-based implementation that's faster and creates very little garbage. It's designed as an extension method, and it handles phrases (words separated by spaces) as well, returning one code per word joined by commas:

    public static String Soundex( this String input )
    {
        var words = input.Split( ' ' );
        var result = new String[ words.Length ];
        for( var i = 0; i < words.Length; i++ )
            result[ i ] = words[ i ].SoundexWord();

        return String.Join( ",", result );
    }

    private static String SoundexWord( this String input )
    {
        var result = new Char[ 4 ] { '0', '0', '0', '0' };
        var inputArray = input.ToUpper().ToCharArray();

        if( inputArray.Length > 0 )
        {
            var previousCode = ' ';
            var resultIndex = 0;

            result[ resultIndex ] = inputArray[ 0 ];

            for( var i = 1; i < inputArray.Length; i++ )
            {
                var currentLetter = inputArray[ i ];
                var currentCode = ' ';

                if( "BFPV".IndexOf( currentLetter ) > -1 )
                    currentCode = '1';
                else if( "CGJKQSXZ".IndexOf( currentLetter ) > -1 )
                    currentCode = '2';
                else if( "DT".IndexOf( currentLetter ) > -1 )
                    currentCode = '3';
                else if( currentLetter == 'L' )
                    currentCode = '4';
                else if( "MN".IndexOf( currentLetter ) > -1 )
                    currentCode = '5';
                else if( currentLetter == 'R' )
                    currentCode = '6';

                if( currentCode != ' ' && currentCode != previousCode )
                    result[ ++resultIndex ] = currentCode;

                if( resultIndex == 3 ) break;

                if( currentCode != ' ' )
                    previousCode = currentCode;
            }
        }

        return new String( result );
    }
Jemmy answered 25/6, 2014 at 17:6 Comment(0)
