We know that String.utf16 provides the UTF-16 code units and String.unicodeScalars provides the Unicode scalars.
If we manipulate the code units or unicode scalars, for example by removing some elements, is there a way to construct the resulting string back?
Update for Swift 2.1:
You can create a String from an array of UTF-16 code units with the
public init(utf16CodeUnits: UnsafePointer<unichar>, count: Int)
initializer. Example:
let str = "H€llo 😄"
// String to UTF16 array:
let utf16array = Array(str.utf16)
print(utf16array)
// Output: [72, 8364, 108, 108, 111, 32, 55357, 56836]
// UTF16 array to string:
let str2 = String(utf16CodeUnits: utf16array, count: utf16array.count)
print(str2)
// H€llo 😄
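This initializer comes from the Foundation overlay (import Foundation), and the array argument works because Swift converts [UInt16] to UnsafePointer<unichar> for the duration of the call. Here is a sketch with the pointer made explicit, in current Swift syntax (str3 is a hypothetical name; the force-unwrap of baseAddress assumes a non-empty array):

import Foundation

let str3 = utf16array.withUnsafeBufferPointer {
    // baseAddress is non-nil because utf16array is not empty
    String(utf16CodeUnits: $0.baseAddress!, count: $0.count)
}
print(str3) // H€llo 😄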
Previous answer:
There is nothing "built-in" (as far as I know), but you can use the UTF16 struct, which provides a decode() method:
extension String {
    init?(utf16chars: [UInt16]) {
        var str = ""
        var generator = utf16chars.generate()
        var utf16 = UTF16()
        var done = false
        while !done {
            // Decode the next Unicode scalar from the UTF-16 sequence.
            let r = utf16.decode(&generator)
            switch r {
            case .EmptyInput:
                done = true
            case let .Result(val):
                str.append(Character(val))
            case .Error:
                // Invalid UTF-16 (e.g. an unpaired surrogate).
                return nil
            }
        }
        self = str
    }
}
Example:
let str = "H€llo 😄"
// String to UTF16 array:
let utf16array = Array(str.utf16)
print(utf16array)
// Output: [72, 8364, 108, 108, 111, 32, 55357, 56836]
// UTF16 array to string:
if let str2 = String(utf16chars: utf16array) {
    print(str2)
    // Output: H€llo 😄
}
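For reference, a sketch of the same loop updated for current Swift, where generate() is now makeIterator() and the decoding result cases are .emptyInput, .scalarValue, and .error; a labelled break replaces the done flag:

extension String {
    init?(utf16chars: [UInt16]) {
        var str = ""
        var iterator = utf16chars.makeIterator()
        var utf16 = UTF16()
        decoding: while true {
            switch utf16.decode(&iterator) {
            case .emptyInput:
                break decoding
            case .scalarValue(let val):
                str.append(Character(val))
            case .error:
                return nil
            }
        }
        self = str
    }
}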
More generically, you could define a method that creates a string from an array (or any sequence) of code units, using a given codec:
extension String {
    init?<S : SequenceType, C : UnicodeCodecType where S.Generator.Element == C.CodeUnit>
        (codeUnits: S, var codec: C) {
        var str = ""
        var generator = codeUnits.generate()
        var done = false
        while !done {
            let r = codec.decode(&generator)
            switch r {
            case .EmptyInput:
                done = true
            case let .Result(val):
                str.append(Character(val))
            case .Error:
                return nil
            }
        }
        self = str
    }
}
Then the conversion from UTF-16 is done as:
if let str2a = String(codeUnits: utf16array, codec: UTF16()) {
print(str2a)
}
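The same initializer works with any codec conforming to UnicodeCodecType; for example, here is a sketch decoding the UTF-8 bytes of "H€llo" (same Swift 2 syntax as the extension above; utf8array and str2b are hypothetical names):

let utf8array: [UInt8] = [72, 226, 130, 172, 108, 108, 111]
if let str2b = String(codeUnits: utf8array, codec: UTF8()) {
    print(str2b) // H€llo
}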
Here is another possible solution. While the previous methods are "pure Swift", this one uses the Foundation framework and the automatic bridging between NSString and Swift String:
import Foundation

extension String {
    init?(utf16chars: [UInt16]) {
        let data = NSData(bytes: utf16chars, length: utf16chars.count * sizeof(UInt16))
        if let ns = NSString(data: data, encoding: NSUTF16LittleEndianStringEncoding) {
            self = ns as String
        } else {
            return nil
        }
    }
}
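For current Swift, a sketch of the same idea with Data and String.Encoding (like the original, it assumes the host stores UInt16 values little-endian, which holds on all Apple platforms):

import Foundation

extension String {
    init?(utf16chars: [UInt16]) {
        // Copy the code units' raw bytes into a Data value.
        let data = utf16chars.withUnsafeBufferPointer { Data(buffer: $0) }
        guard let s = String(data: data, encoding: .utf16LittleEndian) else { return nil }
        self = s
    }
}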
The answer is as simple as:
/// An array of the UTF-16 for "Hello, world!".
let a: [UTF16.CodeUnit] = Array("Hello, world!".utf16)
/// A string representation of a, interpreted as UTF-16
let s = String(decoding: a, as: UTF16.self) // <=== The API you want
print(s)
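Note that unlike the failable initializers above, String(decoding:as:) never fails: ill-formed input is repaired with the Unicode replacement character U+FFFD instead of being reported.

let lone: [UTF16.CodeUnit] = [0xD83D] // unpaired high surrogate
print(String(decoding: lone, as: UTF16.self)) // "\u{FFFD}"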
String(utf16CodeUnits: a, count: a.count) – Mambo

Here it is.
extension String {
    static func fromUTF16Chars(_ utf16s: [UInt16]) -> String {
        var str = ""
        var i = 0
        while i < utf16s.count {
            let hi = Int(utf16s[i])
            switch hi {
            case 0xD800...0xDBFF:
                // High surrogate: combine it with the following low surrogate.
                i += 1
                let lo = Int(utf16s[i])
                let us = 0x10000
                    + (hi - 0xD800) * 0x400 + (lo - 0xDC00)
                str.append(Character(UnicodeScalar(UInt32(us))!))
            default:
                // BMP code unit; the force-unwrap assumes no unpaired surrogates.
                str.append(Character(UnicodeScalar(UInt32(hi))!))
            }
            i += 1
        }
        return str
    }
}
let str = "aαあ🐣aαあ🐣"
var utf16cs = [UInt16]()
for utf16c in str.utf16 {
    utf16cs.append(utf16c)
}
let str2 = String.fromUTF16Chars(utf16cs)
assert(str2 == str)
print(str2)
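For example, 🐣 (U+1F423) is stored as the surrogate pair 0xD83D, 0xDC23, and the formula recovers it: 0x10000 + (0xD83D - 0xD800) * 0x400 + (0xDC23 - 0xDC00) = 0x10000 + 0x3D * 0x400 + 0x23 = 0x1F423.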
The while !done part is one of the few times I've found labelled breaks useful in Swift, i.e. end: while true { … case .EmptyInput: break end } – Weakminded