Swift 5: Index of a Character in String

Before Swift 5, I had this extension working:

fileprivate extension String {
    func indexOf(char: Character) -> Int? {
        return firstIndex(of: char)?.encodedOffset
    }
}

Now I get a deprecation warning:

'encodedOffset' is deprecated: encodedOffset has been deprecated as most common usage is incorrect. Use `utf16Offset(in:)` to achieve the same behavior.

Is there a simpler solution to this instead of using utf16Offset(in:)?

I just need the index of the character position passed back as an Int.

Hengist answered 27/3, 2019 at 19:2 Comment(1)
Don't use encodedOffset. Use the Collection distance method instead: https://mcmap.net/q/23526/-how-to-convert-quot-index-quot-to-type-quot-int-quot-in-swift Sidelong

After some time I have to admit that my original answer was incorrect.

Swift has two methods: firstIndex(of:) and lastIndex(of:).

On an Array, both return an Int? holding the index of the first/last element equal to the passed element, or nil if there is none. (On a String they return a String.Index? instead.)

So you should avoid a custom method for getting the index: the same element can occur more than once, and you wouldn't know which index you are getting back. Think about your usage and decide which index suits you better, first or last. 🙏
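
As a minimal sketch of the difference between the two methods (the sample values `letters` and `text` are made up for illustration), the following shows what each call returns on an Array and on a String. On a String the result is a String.Index, so the sketch converts it to an Int with the Collection method distance(from:to:):

```swift
// Illustrative sample data; "b" occurs twice on purpose.
let letters: [Character] = ["a", "b", "c", "b"]

// On an Array, both methods return Int?.
let firstInArray = letters.firstIndex(of: "b")   // Optional(1)
let lastInArray = letters.lastIndex(of: "b")     // Optional(3)

// On a String, both methods return String.Index?, not Int?.
let text = "abcb"
let firstInString = text.firstIndex(of: "b")
let lastInString = text.lastIndex(of: "b")

// Convert a String.Index to a Character offset by measuring
// the distance from startIndex.
let firstOffset = firstInString.map { text.distance(from: text.startIndex, to: $0) }  // Optional(1)
let lastOffset = lastInString.map { text.distance(from: text.startIndex, to: $0) }    // Optional(3)
```

Which of the two offsets is the right one depends on the use case, which is the point made above.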


Original answer:

And what is wrong with utf16Offset(in:)? This is the way to go in Swift 5:

fileprivate extension String {
    func indexOf(char: Character) -> Int? {
        return firstIndex(of: char)?.utf16Offset(in: self)
    }
}
Quixotism answered 27/3, 2019 at 19:5 Comment(7)
I rest my case! I was having issues with the utf16Offset(in: self) part. Hengist
You should use the Collection distance method: distance(from start: String.Index, to end: String.Index) -> String.IndexDistance Sidelong
try "🇧🇷🇺🇸".indexOf(char: "🇺🇸") // 4 Sidelong
Can confirm this is not the correct answer; Leo Dabus’ comments should be an answer and marked as accepted. The UTF-16 offset is unreliable, because Character represents grapheme clusters of varying sizes. Treating every string as though it were UTF-16, regardless of the mix of character sizes, is decidedly incorrect. The only correct way is to use the Collection functions for converting between Int offsets and String.Index. Fixity
Felt I should back up my statement above with a handy explanatory link: utf8everywhere.org/#myths - this section of this very informative site describes very clearly why you cannot rely on fixed-width offsets for any Unicode text (and Swift.String is a collection of Swift.Character values, each of which is generally a wrapper around a grapheme cluster). Fixity
(shakes head at deleted my-lang-does-it-better-but-not-really drama) It’s a complex issue, friends. If you’ve solved it even for one platform (both performance and ergonomics), you are a super-genius and should publish your work. If not, maybe don’t attack others. Fixity
Just wanted to point out that @LeoDabus gets it right and points out the flaws in the simplest way possible. I keep seeing the UTF-16 offset answer and KNOW it is incorrect for this very reason. I plan to implement the indexDistance for a parser. Circassian
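
The approach recommended in the comments above can be sketched roughly as follows. The helper name `characterOffset(of:)` is hypothetical (it does not come from the thread); it uses the Collection method distance(from:to:) and is compared against utf16Offset(in:) on the flag example from the comments:

```swift
extension String {
    // Hypothetical helper: offset, counted in Characters, of the first
    // occurrence of `char`, or nil if the string does not contain it.
    // Note that distance(from:to:) walks the string, so this is O(n).
    func characterOffset(of char: Character) -> Int? {
        guard let index = firstIndex(of: char) else { return nil }
        return distance(from: startIndex, to: index)
    }
}

let flags = "🇧🇷🇺🇸"
let usFlag: Character = "🇺🇸"

// UTF-16 offset: each flag occupies 4 UTF-16 code units.
let utf16Position = flags.firstIndex(of: usFlag)?.utf16Offset(in: flags)  // Optional(4)

// Character offset: the second flag is the Character at position 1.
let characterPosition = flags.characterOffset(of: usFlag)                 // Optional(1)
```

The two results only agree when every Character before the match fits in a single UTF-16 code unit, which is why the UTF-16 offset is not a safe stand-in for a character position.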
