How can I convert text to Pascal case?

I have a variable name, say "WARD_VS_VITAL_SIGNS", and I want to convert it to Pascal case format: "WardVsVitalSigns"

WARD_VS_VITAL_SIGNS -> WardVsVitalSigns

How can I make this conversion?

Leuctra answered 5/9, 2013 at 3:11 Comment(3)
Do you really need to use regular expressions, or is a method without regular expressions fine?Infliction
If you have a problem and you want to use a regular expression to solve it, you now have two problems. ;-)Ceramics
@AshishGupta ;-) You're right, using a regex to solve this does make the problem more complicated.Leuctra
C
24

First off, you are asking for title case and not camel-case, because in camel-case the first letter of the word is lowercase and your example shows you want the first letter to be uppercase.

At any rate, here is how you could achieve your desired result:

// Requires: using System; using System.Globalization;
string textToChange = "WARD_VS_VITAL_SIGNS";
System.Text.StringBuilder resultBuilder = new System.Text.StringBuilder();

foreach(char c in textToChange)
{
    // Replace anything other than letters and digits with a space
    if(!Char.IsLetterOrDigit(c))
    {
        resultBuilder.Append(" ");
    }
    else 
    { 
        resultBuilder.Append(c); 
    }
}

string result = resultBuilder.ToString();

// Lowercase the string first: ToTitleCase leaves words that are entirely uppercase unchanged (it treats them as acronyms)
result = result.ToLower();

// Creates a TextInfo based on the "en-US" culture.
TextInfo myTI = new CultureInfo("en-US",false).TextInfo;

result = myTI.ToTitleCase(result).Replace(" ", String.Empty);

Note: result is now WardVsVitalSigns.

If you did, in fact, want camel-case, then after all of the above, just use this helper function:

public string LowercaseFirst(string s)
{
    if (string.IsNullOrEmpty(s))
    {
        return string.Empty;
    }

    char[] a = s.ToCharArray();
    a[0] = char.ToLower(a[0]);

    return new string(a);
}

So you could call it, like this:

result = LowercaseFirst(result);
Coakley answered 5/9, 2013 at 3:27 Comment(5)
Why does this not make result = Wardvsvitalsigns?Tarango
@KarlAnderson if(!Char.IsLetterOrDigit(c)) { resultBuilder.Append(" "); } should be if (!Char.IsLetterOrDigit(c)) { resultBuilder.Append(" "); } else { resultBuilder.Append(c); }, otherwise resultBuilder is always empty. ;-)Leuctra
I think technically it's called Pascal case (or Upper camel case)Makebelieve
@Makebelieve Thank you for the reminder, I will update the question.Leuctra
The result is not Title Case since it removes spaces and probably doesn't have rules for "a", "the", "and", etc.Lyallpur
A
87

You do not need a regular expression for that.

// Requires: using System; using System.Globalization;
var yourString = "WARD_VS_VITAL_SIGNS".ToLower().Replace("_", " ");
TextInfo info = CultureInfo.CurrentCulture.TextInfo;
yourString = info.ToTitleCase(yourString).Replace(" ", string.Empty);
Console.WriteLine(yourString);
Adkison answered 5/9, 2013 at 3:32 Comment(2)
simple, elegant, and worksUncovered
Does not work (except for the specific OP example), this will convert a "PascalCase" to "Pascalcase"Crocidolite
L
51

Here is my quick LINQ & regex solution to save someone's time:

using System;
using System.Linq;
using System.Text.RegularExpressions;

public string ToPascalCase(string original)
{
    Regex invalidCharsRgx = new Regex("[^_a-zA-Z0-9]");
    Regex whiteSpace = new Regex(@"(?<=\s)");
    Regex startsWithLowerCaseChar = new Regex("^[a-z]");
    Regex firstCharFollowedByUpperCasesOnly = new Regex("(?<=[A-Z])[A-Z0-9]+$");
    Regex lowerCaseNextToNumber = new Regex("(?<=[0-9])[a-z]");
    Regex upperCaseInside = new Regex("(?<=[A-Z])[A-Z]+?((?=[A-Z][a-z])|(?=[0-9]))");

    // insert an underscore after each whitespace character, then remove all invalid characters (including the whitespace)
    var pascalCase = invalidCharsRgx.Replace(whiteSpace.Replace(original, "_"), string.Empty)
        // split by underscores
        .Split(new char[] { '_' }, StringSplitOptions.RemoveEmptyEntries)
        // set first letter to uppercase
        .Select(w => startsWithLowerCaseChar.Replace(w, m => m.Value.ToUpper()))
        // lowercase the second and all following uppercase letters when no lowercase letter follows (ABC -> Abc)
        .Select(w => firstCharFollowedByUpperCasesOnly.Replace(w, m => m.Value.ToLower()))
        // uppercase the first lowercase letter that follows a digit (Ab9cd -> Ab9Cd)
        .Select(w => lowerCaseNextToNumber.Replace(w, m => m.Value.ToUpper()))
        // lowercase the second and following uppercase letters, except the last one if a lowercase letter follows it (ABcDEf -> AbcDef)
        .Select(w => upperCaseInside.Replace(w, m => m.Value.ToLower()));

    return string.Concat(pascalCase);
}

Example output:

"WARD_VS_VITAL_SIGNS"          "WardVsVitalSigns"
"Who am I?"                    "WhoAmI"
"I ate before you got here"    "IAteBeforeYouGotHere"
"Hello|Who|Am|I?"              "HelloWhoAmI"
"Live long and prosper"        "LiveLongAndProsper"
"Lorem ipsum dolor..."         "LoremIpsumDolor"
"CoolSP"                       "CoolSp"
"AB9CD"                        "Ab9Cd"
"CCCTrigger"                   "CccTrigger"
"CIRC"                         "Circ"
"ID_SOME"                      "IdSome"
"ID_SomeOther"                 "IdSomeOther"
"ID_SOMEOther"                 "IdSomeOther"
"CCC_SOME_2Phases"             "CccSome2Phases"
"AlreadyGoodPascalCase"        "AlreadyGoodPascalCase"
"999 999 99 9 "                "999999999"
"1 2 3 "                       "123"
"1 AB cd EFDDD 8"              "1AbCdEfddd8"
"INVALID VALUE AND _2THINGS"   "InvalidValueAnd2Things"
Lachus answered 7/9, 2017 at 11:59 Comment(4)
This answer does not have enough upvotes. Nice utility. Thanks!Cytoplasm
yeah, this or some other efficient version (not sure if it's possible to implement it without regex) should be part of .NETThulium
This answer is too complex to be considered quick. Also there is not a true understanding of how to handle character classes in regular expressions in an efficient way. I show how to do such a replacement more efficiently in my answer below.Glutelin
I think replacing all invalid chars with an underscore works better: var pascalCase = invalidCharsRgx.Replace(whiteSpace.Replace(original, ""), "")Mcquiston
O
13

Single semicolon solution:

public static string PascalCase(this string word)
{
    return string.Join("" , word.Split('_')
                 .Select(w => w.Trim())
                 .Where(w => w.Length > 0)
                 .Select(w => w.Substring(0,1).ToUpper() + w.Substring(1).ToLower()));
}
Ogle answered 3/8, 2017 at 13:51 Comment(4)
Not sure what this was intending to be but this certainly does not result in PascalCase. PascalCase doesn't contain spaces...Hypomania
This is joining every bit of the text, don't see empty spacesMauer
It's been 5 years since I wrote this but I'd agree, don't really know what the spaces problem was. It's splitting on the underscore, trimming the spaces off, filtering out any zero length strings, then uppercasing the first and lowercasing the rest, then using string.join to put everything back together (should use string builder underneath). Meh!Ogle
@Ogle FYI, The spaces problem would occur if word contains spaces instead of _. Since Trim() only removes spaces before and after the split wordPauiie
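The space limitation described in the last comment can be worked around by splitting on more than one delimiter. The following is only a sketch under that assumption: the class name, the method name `ToPascalCaseMultiDelimiter`, and the delimiter set are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Linq;

public static class PascalCaseSplitSketch
{
    // Hypothetical delimiter set; extend it to match your own input.
    private static readonly char[] Delimiters = { '_', ' ', '-' };

    public static string ToPascalCaseMultiDelimiter(this string word)
    {
        if (string.IsNullOrEmpty(word))
        {
            return string.Empty;
        }

        // RemoveEmptyEntries drops the empty parts produced by repeated delimiters,
        // so w[0] below is always safe.
        var parts = word
            .Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => char.ToUpperInvariant(w[0]) + w.Substring(1).ToLowerInvariant());

        return string.Concat(parts);
    }
}
```

With this sketch, "WARD VS_VITAL-SIGNS".ToPascalCaseMultiDelimiter() gives WardVsVitalSigns. Like the answer above, it lowercases everything after the first letter of each part, so existing Pascal case input is not preserved.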
S
9

An extension method for System.String that is compatible with .NET Core and needs only System and System.Linq.

Does not modify the original string.

using System;
using System.Linq;

public static class StringExtensions
{
    /// <summary>
    /// Converts a string to PascalCase
    /// </summary>
    /// <param name="str">String to convert</param>

    public static string ToPascalCase(this string str)
    {
        // Return null or empty input unchanged; string.Join throws on a null sequence.
        if (string.IsNullOrEmpty(str))
        {
            return str;
        }

        // Replace every character that is not a letter or digit with an underscore and lowercase the rest.
        string sample = string.Join("", str.Select(c => Char.IsLetterOrDigit(c) ? c.ToString().ToLower() : "_"));

        // Split the resulting string by underscore,
        // then uppercase the first character of each part and append the remainder.
        var arr = sample
            .Split(new []{'_'}, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => $"{s.Substring(0, 1).ToUpper()}{s.Substring(1)}");

        // Join the resulting collection
        return string.Join("", arr);
    }
}

public class Program
{
    public static void Main()
    {
        Console.WriteLine("WARD_VS_VITAL_SIGNS".ToPascalCase()); // WardVsVitalSigns
        Console.WriteLine("Who am I?".ToPascalCase()); // WhoAmI
        Console.WriteLine("I ate before you got here".ToPascalCase()); // IAteBeforeYouGotHere
        Console.WriteLine("Hello|Who|Am|I?".ToPascalCase()); // HelloWhoAmI
        Console.WriteLine("Live long and prosper".ToPascalCase()); // LiveLongAndProsper
        Console.WriteLine("Lorem ipsum dolor sit amet, consectetur adipiscing elit.".ToPascalCase()); // LoremIpsumDolorSitAmetConsecteturAdipiscingElit
    }
}
Spacetime answered 21/10, 2016 at 6:19 Comment(0)
T
2
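// Requires: using System; using System.Linq;
// RemoveEmptyEntries avoids indexing into an empty string for inputs such as "A__B".
var xs = "WARD_VS_VITAL_SIGNS".Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries);

var q =
    from x in xs
    let first_char = char.ToUpper(x[0])
    let rest_chars = new string(x.Skip(1).Select(c => char.ToLower(c)).ToArray())
    select first_char + rest_chars;

// The query yields one string per word; concatenate them for the final result.
var result = string.Concat(q); // WardVsVitalSigns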
Tittup answered 5/9, 2013 at 3:33 Comment(0)
I
2

Some answers are correct but I really don't understand why they set the text to LowerCase first, because the ToTitleCase will handle that automatically:

var text = "WARD_VS_VITAL_SIGNS".Replace("_", " ");

TextInfo textInfo = CultureInfo.CurrentCulture.TextInfo;
text = textInfo.ToTitleCase(text).Replace(" ", string.Empty);

Console.WriteLine(text);
Inez answered 8/8, 2019 at 21:39 Comment(2)
Because ToTitleCase is not efficient in many use cases, see this answer in current page.Elitism
ToTitleCase doesn't make the other characters lowercase, at least not in .NET Core 3.1, where I just needed it. So I had to call ToLower first; then it was correct.Coquelicot
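The behaviour reported in the comment above is easy to check. `TextInfo.ToTitleCase` is documented to leave words that are entirely uppercase unchanged, because it treats them as acronyms. Here is a minimal sketch; the class and method names are made up for illustration, and it uses the invariant culture (an assumption) so the result does not depend on machine settings.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    // Hypothetical helper: title-cases the input without lowercasing it first.
    public static string WithoutToLower(string input)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        return textInfo.ToTitleCase(input.Replace("_", " ")).Replace(" ", string.Empty);
    }

    // Hypothetical helper: lowercases the input before title-casing it.
    public static string WithToLower(string input)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        string lowered = input.ToLowerInvariant().Replace("_", " ");
        return textInfo.ToTitleCase(lowered).Replace(" ", string.Empty);
    }
}
```

For "WARD_VS_VITAL_SIGNS", WithoutToLower returns WARDVSVITALSIGNS, while WithToLower returns WardVsVitalSigns.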
V
2

You can use this:

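// Requires: using System; using System.Text;
public static string ConvertToPascal(string underScoreString)
{
    // RemoveEmptyEntries skips the empty words produced by leading, trailing, or doubled underscores,
    // which would otherwise make Substring(0, 1) throw.
    string[] words = underScoreString.Split(new[] { '_' }, StringSplitOptions.RemoveEmptyEntries);

    StringBuilder returnStr = new StringBuilder();

    foreach (string wrd in words)
    {
        returnStr.Append(wrd.Substring(0, 1).ToUpper());
        returnStr.Append(wrd.Substring(1).ToLower());
    }
    return returnStr.ToString();
}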
Valerianaceous answered 7/12, 2020 at 9:26 Comment(0)
T
2

This answer relies on Unicode categories, which a regex can use to skip connector characters such as _ while processing the text. In regex syntax, \p{...} matches a Unicode category, and Pc is the connector punctuation category, so \p{Pc} matches any connector character. Note that the hyphen (-) is dash punctuation (Pd), not Pc, so this pattern does not treat it as a separator. Regex.Replace invokes our MatchEvaluator once for each match.

During the match phase, each match captures a word and consumes any trailing connector characters, so the replacement drops the connectors. Once we have the matched word, we lowercase it and then uppercase only its first character as the replacement value:

// Requires: using System; using System.Text.RegularExpressions;
public static class StringExtensions
{
    public static string ToPascalCase(this string initial)
        => Regex.Replace(initial, 
                       // Capture a run of non-connector characters, then consume any trailing connector punctuation
                         @"([^\p{Pc}]+)[\p{Pc}]*", 
                         new MatchEvaluator(mtch =>
        {
            var word = mtch.Groups[1].Value.ToLower();

            return $"{Char.ToUpper(word[0])}{word.Substring(1)}";
        }));
}

Usage:

"TOO_MUCH_BABY".ToPascalCase(); // TooMuchBaby
"HELLO|ITS|ME".ToPascalCase();  // Hello|its|me ('|' is a math symbol (Sm), not connector punctuation, so it is not treated as a separator)

See Word Character in Character Classes in Regular Expressions

Pc Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), u+005F.
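
To see which separators actually fall into the Pc category, you can ask .NET for a character's Unicode category. This is just a small sketch for checking; the class and method names are made up for illustration.

```csharp
using System.Globalization;

public static class UnicodeCategoryDemo
{
    // Returns the Unicode general category that .NET assigns to a character.
    public static UnicodeCategory CategoryOf(char c)
    {
        return char.GetUnicodeCategory(c);
    }
}
```

CategoryOf('_') is ConnectorPunctuation (Pc), CategoryOf('-') is DashPunctuation (Pd), and CategoryOf('|') is MathSymbol (Sm). A pattern built only on \p{Pc} therefore splits on underscores but not on hyphens or pipes.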

Terminus answered 20/10, 2021 at 20:48 Comment(1)
Have you tested with "AlreadyPascalCase" :)Matz
L
2

If you want to convert any formatted string to Pascal case, you can do this:

    public static string ToPascalCase(this string original)
    {
        string newString = string.Empty;
        bool makeNextCharacterUpper = false;
        for (int index = 0; index < original.Length; index++)
        {
            char c = original[index];
            if(index == 0)
                newString += $"{char.ToUpper(c)}";
            else if (makeNextCharacterUpper)
            {
                newString += $"{char.ToUpper(c)}";
                makeNextCharacterUpper = false;
            }
            else if (char.IsUpper(c))
                newString += $" {c}";
            else if (char.IsLower(c) || char.IsNumber(c))
                newString += c;
            else
            {
                makeNextCharacterUpper = true;   
                newString += ' ';
            }
        }

        return newString.TrimStart().Replace(" ", "");
    }

Tested with these strings:

I|Can|Get|A|String
ICan_GetAString
i-can-get-a-string
i_can_get_a_string
I Can Get A String
ICanGetAString

Leathers answered 21/4, 2022 at 15:29 Comment(4)
Apart from its generality, I prefer this approach because it's more efficient and arguably clearer than other suggestions. Its efficiency can be improved, though, by making newString a StringBuilder, and consistently appending individual characters (rather than sometimes strings).Caterwaul
@Caterwaul Thanks, and I agree with using a string builder. Would be insightful to performance test between the twoLeathers
As a heads-up, "any formatted string" is technically incorrect: it fails for I|CAN|GET|A|STRING, I-CAN-GET-A-STRING, etc. While I'd say it's not possible to handle an all-caps string with no delimiter, in the event of a delimiter being present it would be good to handle this case (since it does handle all lower-case cases).Poliard
As a correction to above: it "does handle all-lower-case cases with a delimiter". It also fails with multiple non-' '-character delimiters (i.e. i__can__get__a__string).Poliard
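Following the StringBuilder suggestion in the comments, here is one possible single-pass sketch. It is not the answer's code: the class name and the method name `ToPascalCaseWithBuilder` are made up, and it assumes that every character that is not a letter or digit is a delimiter. It handles repeated delimiters, but like the original it keeps the existing case inside each word, so all-caps input such as I-CAN-GET-A-STRING is still not converted.

```csharp
using System.Text;

public static class PascalCaseBuilderSketch
{
    public static string ToPascalCaseWithBuilder(this string original)
    {
        if (string.IsNullOrEmpty(original))
        {
            return string.Empty;
        }

        var builder = new StringBuilder(original.Length);
        bool startOfWord = true;

        foreach (char c in original)
        {
            // Assumption: anything that is not a letter or digit separates words.
            if (!char.IsLetterOrDigit(c))
            {
                startOfWord = true;
                continue;
            }

            // Uppercase the first character of each word; keep the rest as-is.
            builder.Append(startOfWord ? char.ToUpperInvariant(c) : c);
            startOfWord = false;
        }

        return builder.ToString();
    }
}
```

For example, "i__can__get__a__string".ToPascalCaseWithBuilder() gives ICanGetAString, and so do the other test strings listed in the answer.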
K
1

I found this gist useful after adding a ToLower() to it.

"WARD_VS_VITAL_SIGNS"
.ToLower()
.Split(new [] {"_"}, StringSplitOptions.RemoveEmptyEntries)
.Select(s => char.ToUpperInvariant(s[0]) + s.Substring(1, s.Length - 1))
.Aggregate(string.Empty, (s1, s2) => s1 + s2)
Kirkpatrick answered 18/9, 2020 at 15:31 Comment(0)
