Separating CamelCase string into space-separated words
Asked Answered
B

14

35

I would like to separate a CamelCase string into space-separated words in a new string. Here is what I have so far:

var camelCaps: String {
    guard self.count > 0 else { return self }
    var newString: String = ""

    let uppercase = CharacterSet.uppercaseLetters
    let first = self.unicodeScalars.first!
    newString.append(Character(first))
    for scalar in self.unicodeScalars.dropFirst() {
        if uppercase.contains(scalar) {
            newString.append(" ")
        }
        let character = Character(scalar)
        newString.append(character)
    }

    return newString
}

let aCamelCaps = "aCamelCaps"
let camelCapped = aCamelCaps.camelCaps // Produce: "a Camel Caps"

let anotherCamelCaps = "ÄnotherCamelCaps"
let anotherCamelCapped = anotherCamelCaps.camelCaps // "Änother Camel Caps"

I'm inclined to suspect that this may not be the most efficient way to convert to space-separated words, if I call it in a tight loop, or 1000's of times. Are there more efficient ways to do this in Swift?

[Edit 1:] The solution I require should remain general for Unicode scalars, not specific to Roman ASCII "A..Z".

[Edit 2:] The solution should also skip the first letter, i.e. not prepend a space before the first letter.

[Edit 3:] Updated for Swift 4 syntax, and added caching of uppercaseLetters, which improves performance in very long strings and tight loops.

Bremble answered 22/12, 2016 at 22:40 Comment(3)
Single line return unicodeScalars.reduce("") { CharacterSet.uppercaseLetters.contains($1) ? $0 + " " + String($1) : $0 + String($1)}Squall
How to deal with string with consecutive capital characters? for eg: with above code "upperCased LETTERS" is returned as "upper Cased L E T T E R S". While the expected output is "upper Cased Letters".Bezant
@Bezant Simply check the string being created, $0 in our case, and see if the last letter is uppercase also. If yes, you just add the character, $1, with no space.Perversity
A
10

As far as I tested on my old MacBook, your code seems to be efficient enough for short strings:

import Foundation

extension String {

    var camelCaps: String {
        var newString: String = ""

        let upperCase = CharacterSet.uppercaseLetters
        for scalar in self.unicodeScalars {
            if upperCase.contains(scalar) {
                newString.append(" ")
            }
            let character = Character(scalar)
            newString.append(character)
        }

        return newString
    }

    var camelCaps2: String {
        var newString: String = ""

        let upperCase = CharacterSet.uppercaseLetters
        var range = self.startIndex..<self.endIndex
        while let foundRange = self.rangeOfCharacter(from: upperCase,range: range) {
            newString += self.substring(with: range.lowerBound..<foundRange.lowerBound)
            newString += " "
            newString += self.substring(with: foundRange)

            range = foundRange.upperBound..<self.endIndex
        }
        newString += self.substring(with: range)

        return newString
    }

    var camelCaps3: String {
        struct My {
            static let regex = try! NSRegularExpression(pattern: "[A-Z]")
        }
        return My.regex.stringByReplacingMatches(in: self, range: NSRange(0..<self.utf16.count), withTemplate: " $0")
    }
}
let aCamelCaps = "aCamelCaps"

assert(aCamelCaps.camelCaps == aCamelCaps.camelCaps2)
assert(aCamelCaps.camelCaps == aCamelCaps.camelCaps3)

let t0 = Date().timeIntervalSinceReferenceDate

for _ in 0..<1_000_000 {
    let aCamelCaps = "aCamelCaps"

    let camelCapped = aCamelCaps.camelCaps
}

let t1 = Date().timeIntervalSinceReferenceDate
print(t1-t0) //->4.78703999519348

for _ in 0..<1_000_000 {
    let aCamelCaps = "aCamelCaps"

    let camelCapped = aCamelCaps.camelCaps2
}

let t2 = Date().timeIntervalSinceReferenceDate
print(t2-t1) //->10.5831440091133

for _ in 0..<1_000_000 {
    let aCamelCaps = "aCamelCaps"

    let camelCapped = aCamelCaps.camelCaps3
}

let t3 = Date().timeIntervalSinceReferenceDate
print(t3-t2) //->14.2085000276566

(Do not try to test the code above in the Playground. The numbers are taken from a single trial executed as a CommandLine app.)

Armagh answered 23/12, 2016 at 6:53 Comment(3)
The last method, camelCaps3, appears to be the fastest because it caches the regular expression in a static. However, it does not handle non-ASCII capital letters (e.g., Ä).Bremble
@BrianArnold, this article's main intension is to show that using regex is not so fast as expected. If it was faster than your code, it could be improved to include non-ASCII capital letters. For example, you can use "\\p{Lu}" instead of "[A-Z]". Please try by yourself.Armagh
Thanks, using "\\p{Lu}"for the regex is more general, and when I compile this as a command line tool for optimal performance, regex is indeed not so fast as expected. Oddly, the optimized single line suggestion by Leo Daubus is also not so fast as expected when compiled as a command line tool. So, I am left with keeping the code I suggested, as it is fast enough and readable.Bremble
K
29
extension String {
    func camelCaseToWords() -> String {
        return unicodeScalars.dropFirst().reduce(String(prefix(1))) {
            return CharacterSet.uppercaseLetters.contains($1)
                ? $0 + " " + String($1)
                : $0 + String($1)
        }
    }
}
print("ÄnotherCamelCaps".camelCaseToWords()) // Änother Camel Caps

May be helpful for someone :)

Kinnie answered 4/4, 2017 at 15:52 Comment(4)
Thank you so much for your code... I modified it a bit to address an issueCasemaker
@Casemaker happy to hear code was useful for you. Can you please give a hint about what problem you solved using the same logic? May be helpful for me and others too.Kinnie
sure! I posted an answer some days ago ;) https://mcmap.net/q/423278/-separating-camelcase-string-into-space-separated-wordsCasemaker
This will not work for something like "TestABC"Selfabsorption
M
20

One Line Solution

I concur with @aircraft, regular expressions can solve this problem in one LOC!

// Swift 5 (and probably 4?)
extension String {
    func titleCase() -> String {
        return self
            .replacingOccurrences(of: "([A-Z])",
                                  with: " $1",
                                  options: .regularExpression,
                                  range: range(of: self))
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .capitalized // If input is in llamaCase
    }
}

Props to this JS answer.

P.S. I have a gist for snake_case → CamelCase here.

P.P.S. I updated this for New Swift (currently 5.1), then saw @busta's answer, and swapped out my startIndex..<endIndex for his range(of: self). Credit where it's due y'all!

Matthia answered 6/5, 2018 at 18:9 Comment(2)
Thank you, I appreciate this answer, and perhaps it will be useful to some. However, my goal wasn't to implement one line of code, but to find a more efficient, maintainable implementation. The regex solutions run slower than the other solutions provided above, and the fastest solution requires more lines of code.Bremble
The parameter range: range(of: self) is not required, since we want to search the whole string.Jacobsohn
C
13

I might be late but I want to share a little improvement to Augustine P A answer or Leo Dabus comment.
Basically, that code won't work properly if we are using upper camel case notation (like "DuckDuckGo") because it will add a space at the beginning of the string.
To address this issue, this is a slightly modified version of the code, using Swift 3.x, and it's compatible with both upper and lower came case:

extension String {

    func camelCaseToWords() -> String {
        return unicodeScalars.reduce("") {
            if CharacterSet.uppercaseLetters.contains($1) {
                if $0.count > 0 {
                    return ($0 + " " + String($1))
                }
            }
            return $0 + String($1)
        }
    }
}
Casemaker answered 31/8, 2017 at 12:1 Comment(0)
S
11

a better full swifty solution... based on AmitaiB answer

extension String {
    func titlecased() -> String {
        return self.replacingOccurrences(of: "([A-Z])", with: " $1", options: .regularExpression, range: self.range(of: self))
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .capitalized
    }
}
Singspiel answered 12/6, 2018 at 15:43 Comment(0)
A
10

As far as I tested on my old MacBook, your code seems to be efficient enough for short strings:

import Foundation

extension String {

    var camelCaps: String {
        var newString: String = ""

        let upperCase = CharacterSet.uppercaseLetters
        for scalar in self.unicodeScalars {
            if upperCase.contains(scalar) {
                newString.append(" ")
            }
            let character = Character(scalar)
            newString.append(character)
        }

        return newString
    }

    var camelCaps2: String {
        var newString: String = ""

        let upperCase = CharacterSet.uppercaseLetters
        var range = self.startIndex..<self.endIndex
        while let foundRange = self.rangeOfCharacter(from: upperCase,range: range) {
            newString += self.substring(with: range.lowerBound..<foundRange.lowerBound)
            newString += " "
            newString += self.substring(with: foundRange)

            range = foundRange.upperBound..<self.endIndex
        }
        newString += self.substring(with: range)

        return newString
    }

    var camelCaps3: String {
        struct My {
            static let regex = try! NSRegularExpression(pattern: "[A-Z]")
        }
        return My.regex.stringByReplacingMatches(in: self, range: NSRange(0..<self.utf16.count), withTemplate: " $0")
    }
}
let aCamelCaps = "aCamelCaps"

assert(aCamelCaps.camelCaps == aCamelCaps.camelCaps2)
assert(aCamelCaps.camelCaps == aCamelCaps.camelCaps3)

let t0 = Date().timeIntervalSinceReferenceDate

for _ in 0..<1_000_000 {
    let aCamelCaps = "aCamelCaps"

    let camelCapped = aCamelCaps.camelCaps
}

let t1 = Date().timeIntervalSinceReferenceDate
print(t1-t0) //->4.78703999519348

for _ in 0..<1_000_000 {
    let aCamelCaps = "aCamelCaps"

    let camelCapped = aCamelCaps.camelCaps2
}

let t2 = Date().timeIntervalSinceReferenceDate
print(t2-t1) //->10.5831440091133

for _ in 0..<1_000_000 {
    let aCamelCaps = "aCamelCaps"

    let camelCapped = aCamelCaps.camelCaps3
}

let t3 = Date().timeIntervalSinceReferenceDate
print(t3-t2) //->14.2085000276566

(Do not try to test the code above in the Playground. The numbers are taken from a single trial executed as a CommandLine app.)

Armagh answered 23/12, 2016 at 6:53 Comment(3)
The last method, camelCaps3, appears to be the fastest because it caches the regular expression in a static. However, it does not handle non-ASCII capital letters (e.g., Ä).Bremble
@BrianArnold, this article's main intension is to show that using regex is not so fast as expected. If it was faster than your code, it could be improved to include non-ASCII capital letters. For example, you can use "\\p{Lu}" instead of "[A-Z]". Please try by yourself.Armagh
Thanks, using "\\p{Lu}"for the regex is more general, and when I compile this as a command line tool for optimal performance, regex is indeed not so fast as expected. Oddly, the optimized single line suggestion by Leo Daubus is also not so fast as expected when compiled as a command line tool. So, I am left with keeping the code I suggested, as it is fast enough and readable.Bremble
H
9
extension String {
    func titlecased() -> String {
        return self
            .replacingOccurrences(of: "([a-z])([A-Z](?=[A-Z])[a-z]*)", with: "$1 $2", options: .regularExpression)
            .replacingOccurrences(of: "([A-Z])([A-Z][a-z])", with: "$1 $2", options: .regularExpression)
            .replacingOccurrences(of: "([a-z])([A-Z][a-z])", with: "$1 $2", options: .regularExpression)
            .replacingOccurrences(of: "([a-z])([A-Z][a-z])", with: "$1 $2", options: .regularExpression)
    }
}

In

 "ThisStringHasNoSpacesButItDoesHaveCapitals"
 "IAmNotAGoat"
 "LOLThatsHilarious!"
 "ThisIsASMSMessage"

Out

"This String Has No Spaces But It Does Have Capitals" 
"I Am Not A Goat" 
"LOL Thats Hilarious!" 
"This Is ASMS Message" // (Difficult tohandle single letter words when they are next to acronyms.)

enter link description here

Herewith answered 26/8, 2019 at 0:0 Comment(1)
This should be the accepted answer. This worked perfectly for what I was looking for as it handles capitalized letters that should stay together, such as "LOL".Dragnet
H
5

I can do this extension in less lines of code (and without a CharacterSet), but yes, you basically have to enumerate each String if you want to insert spaces in front of capital letters.

extension String {
    var differentCamelCaps: String {
        var newString: String = ""
        for eachCharacter in self {
            if "A"..."Z" ~= eachCharacter {
                newString.append(" ")
            }
            newString.append(eachCharacter)
        }
        return newString
    }
}

print("ÄnotherCamelCaps".differentCamelCaps) // Änother Camel Caps
Halutz answered 23/12, 2016 at 3:34 Comment(1)
if "A"..."Z" ~= eachCharacter {Squall
S
3

Swift 5 solution

extension String {

    func camelCaseToWords() -> String {
        return unicodeScalars.reduce("") {
            if CharacterSet.uppercaseLetters.contains($1) {
                if $0.count > 0 {
                    return ($0 + " " + String($1))
                }
            }
            return $0 + String($1)
        }
    }
}
Stream answered 17/10, 2019 at 13:45 Comment(0)
M
3

Here's what I came up with using Unicode character classes: (Swift 5)

extension String {
    var titleCased: String {
        self
            .replacingOccurrences(of: "(\\p{UppercaseLetter}\\p{LowercaseLetter}|\\p{UppercaseLetter}+(?=\\p{UppercaseLetter}))",
                                  with: " $1",
                                  options: .regularExpression,
                                  range: range(of: self)
            )
            .capitalized
    }
}

Output:

fillPath                ➝ Fill Path
ThisStringHasNoSpaces   ➝ This String Has No Spaces
IAmNotAGoat             ➝ I Am Not A Goat
LOLThatsHilarious!      ➝ Lol Thats Hilarious!
ThisIsASMSMessage       ➝ This Is Asms Message
Mainis answered 8/9, 2020 at 8:59 Comment(0)
E
3

Swift 5+

Small style improvements on previous answers

import Foundation

extension String {
    func camelCaseToWords() -> String {
        unicodeScalars.reduce("") {
            guard CharacterSet.uppercaseLetters.contains($1),
                  $0.count > 0
            else { return $0 + String($1) }
            return ($0 + " " + String($1))
        }
    }
}

Using guard let statements is usually recommended, as they provide an "early exit" for non matching cases and decrease the overall nesting levels of your code (which usually improves readability quite a lot... and remember, readability counts!)

Expository answered 13/5, 2022 at 2:56 Comment(0)
H
2

If you want to make it more efficient, you can use Regular Expressions.

 extension String {
    func replace(regex: NSRegularExpression, with replacer: (_ match:String)->String) -> String {
    let str = self as NSString
    let ret = str.mutableCopy() as! NSMutableString

    let matches = regex.matches(in: str as String, options: [], range: NSMakeRange(0, str.length))
    for match in matches.reversed() {
        let original = str.substring(with: match.range)
        let replacement = replacer(original)
        ret.replaceCharacters(in: match.range, with: replacement)
    }
        return ret as String
    }
}

let camelCaps = "aCamelCaps"  // there are 3 Capital character
let pattern = "[A-Z]"
let regular = try!NSRegularExpression(pattern: pattern)
let camelCapped:String = camelCaps.replace(regex: regular) { " \($0)" }
print("Uppercase characters replaced: \(camelCapped)")
Hispania answered 23/12, 2016 at 5:50 Comment(1)
What makes you think that this is more efficient than OP's code?Caroncarotene
W
1

Swift way:

extension String {
    var titlecased: String {
        map { ($0.isUppercase ? " " : "") + String($0) }
            .joined(separator: "")
            .trimmingCharacters(in: .whitespaces)
    }
}
Wigwam answered 8/4, 2022 at 22:7 Comment(0)
K
1

Here's a succinct solution using the new regex APIs in iOS 16 and macOS 13:

extension String {
    var camelToTitleCase: String {
        replacing(#/[[:upper:]]/#) { " " + $0.output }.capitalized
    }
}
"camelToTitleCase".camelToTitleCase -> "Camel To Title Case"

If the first character might be also uppercased, we could just add

.trimmingCharacters(in: .whitespaces)

A more general (but slightly less succinct) solution:

extension StringProtocol {
    var string: String {
        String(self)
    }

    func prepending(_ other: Self) -> String {
        other.appending(self)
    }
}

extension String {
    var camelToTitleCase: String {
        replacing(#/(\b[[:lower:]])|(\B[[:upper:]])/#) {
            ($0.output.1?.string ?? $0.output.2?.string.prepending(" ") ?? "")
                .uppercased()
        }
    }
}

Here the regex matches lowercase characters at word boundaries (including the beginning of the string) or uppercase characters, and the result need not be capitalized nor trimmed.

"ÄnotherCamelCaps".camelToTitleCase -> "Änother Camel Caps"
"änotherCamelCaps".camelToTitleCase -> "Änother Camel Caps"
Kinghood answered 17/10, 2023 at 11:45 Comment(0)
K
0

Solution with REGEX

let camelCase = "SomeATMInTheShop"
let regexPattern = "[A-Z-_&](?=[a-z0-9]+)|[A-Z-_&]+(?![a-z0-9])"
let newValue = camelCase.replacingOccurrences(of: regexPattern, with: " $0", options: .regularExpression, range: nil)

Otuput ==> Some ATM In The Shop

Kooky answered 2/9, 2022 at 13:19 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.