round trip Swift number types to/from Data

With Swift 3 leaning towards Data instead of [UInt8], I'm trying to ferret out the most efficient/idiomatic way to encode/decode Swift's various number types (UInt8, Double, Float, Int64, etc.) as Data objects.

There's this answer for using [UInt8], but it seems to be using various pointer APIs that I can't find on Data.

I'd basically like some custom extensions that look something like:

let input = 42.13 // implicit Double
let bytes = input.data
let roundtrip = bytes.to(Double) // --> 42.13

The part that really eludes me, I've looked through a bunch of the docs, is how I can get some sort of pointer thing (OpaquePointer or BufferPointer or UnsafePointer?) from any basic struct (which all of the numbers are). In C, I would just slap an ampersand in front of it, and there ya go.

Advisement answered 25/6, 2016 at 0:31 Comment(1)

Note: The code has been updated for Swift 5 (Xcode 10.2) now. (Swift 3 and Swift 4.2 versions can be found in the edit history.) Also possibly unaligned data is now correctly handled.

How to create Data from a value

As of Swift 4.2, data can be created from a value simply with

let value = 42.13
let data = withUnsafeBytes(of: value) { Data($0) }

print(data as NSData) // <713d0ad7 a3104540>

Explanation:

  • withUnsafeBytes(of: value) invokes the closure with a buffer pointer covering the raw bytes of the value.
  • A raw buffer pointer is a sequence of bytes, therefore Data($0) can be used to create the data.
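The same call works for the other fixed-size number types. A minimal sketch, where the helper name bytes(of:) is made up for illustration and is not part of Foundation; the size of the resulting Data matches the size of the type:

```swift
import Foundation

// Hypothetical helper (not a Foundation API): copies the raw bytes of a value into Data.
func bytes<T>(of value: T) -> Data {
    return withUnsafeBytes(of: value) { Data($0) }
}

let u8 = bytes(of: UInt8(255))
let f32 = bytes(of: Float(1.0))
let i64 = bytes(of: Int64(-1))

print(u8.count, f32.count, i64.count) // 1 4 8
print(Array(u8))                      // [255]
print(Array(i64))                     // eight bytes of 255 (two's complement -1)
```

As discussed further below, this is only meaningful for "trivial" types such as integers and floating point numbers.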

How to retrieve a value from Data

As of Swift 5, the withUnsafeBytes(_:) method of Data invokes the closure with an “untyped” UnsafeRawBufferPointer to the bytes. The load(fromByteOffset:as:) method then reads the value from the memory:

let data = Data([0x71, 0x3d, 0x0a, 0xd7, 0xa3, 0x10, 0x45, 0x40])
let value = data.withUnsafeBytes {
    $0.load(as: Double.self)
}
print(value) // 42.13

There is one problem with this approach: It requires that the memory is properly aligned for the type (here: aligned to an 8-byte address). But that is not guaranteed, e.g. if the data was obtained as a slice of another Data value.

It is therefore safer to copy the bytes to the value:

let data = Data([0x71, 0x3d, 0x0a, 0xd7, 0xa3, 0x10, 0x45, 0x40])
var value = 0.0
let bytesCopied = withUnsafeMutableBytes(of: &value, { data.copyBytes(to: $0)} )
assert(bytesCopied == MemoryLayout.size(ofValue: value))
print(value) // 42.13

Explanation:

  • withUnsafeMutableBytes(of:_:) invokes the closure with a mutable buffer pointer covering the raw bytes of the value.
  • The copyBytes(to:) method of DataProtocol (to which Data conforms) copies bytes from the data to that buffer.

The return value of copyBytes() is the number of bytes copied. It is equal to the size of the destination buffer, or less if the data does not contain enough bytes.
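As a rough sketch of both points (a slice as the source, and a source that is too short), assuming a little-endian host as in the examples above; the variable names are illustrative only:

```swift
import Foundation

// Nine bytes: one leading padding byte followed by the 8 bytes of 42.13
// (little-endian layout assumed).
let buffer = Data([0xff, 0x71, 0x3d, 0x0a, 0xd7, 0xa3, 0x10, 0x45, 0x40])

// A slice starting at offset 1 is not necessarily 8-byte aligned,
// but copying the bytes does not require any alignment.
let slice = buffer[1...]

var value = 0.0
let copied = withUnsafeMutableBytes(of: &value) { slice.copyBytes(to: $0) }
print(copied, value) // 8 42.13

// With too little data, fewer bytes are copied and the value is only partially overwritten.
var partial = 0.0
let short = withUnsafeMutableBytes(of: &partial) { Data([0x71, 0x3d]).copyBytes(to: $0) }
print(short) // 2
```

Checking the returned count (or the size of the data beforehand) is therefore what guards against reading a half-filled value.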

Generic solution #1

The above conversions can now easily be implemented as generic methods of struct Data:

extension Data {

    init<T>(from value: T) {
        self = Swift.withUnsafeBytes(of: value) { Data($0) }
    }

    func to<T>(type: T.Type) -> T? where T: ExpressibleByIntegerLiteral {
        var value: T = 0
        guard count >= MemoryLayout.size(ofValue: value) else { return nil }
        _ = Swift.withUnsafeMutableBytes(of: &value, { copyBytes(to: $0)} )
        return value
    }
}

The constraint T: ExpressibleByIntegerLiteral is added here so that we can easily initialize the value to “zero” – that is not really a restriction because this method can be used only with “trivial” (integer and floating point) types anyway, see below.

Example:

let value = 42.13 // implicit Double
let data = Data(from: value)
print(data as NSData) // <713d0ad7 a3104540>

if let roundtrip = data.to(type: Double.self) {
    print(roundtrip) // 42.13
} else {
    print("not enough data")
}

Similarly, you can convert arrays to Data and back:

extension Data {

    init<T>(fromArray values: [T]) {
        self = values.withUnsafeBytes { Data($0) }
    }

    func toArray<T>(type: T.Type) -> [T] where T: ExpressibleByIntegerLiteral {
        var array = Array<T>(repeating: 0, count: self.count/MemoryLayout<T>.stride)
        _ = array.withUnsafeMutableBytes { copyBytes(to: $0) }
        return array
    }
}

Example:

let value: [Int16] = [1, Int16.max, Int16.min]
let data = Data(fromArray: value)
print(data as NSData) // <0100ff7f 0080>

let roundtrip = data.toArray(type: Int16.self)
print(roundtrip) // [1, 32767, -32768]

Generic solution #2

The above approach has one disadvantage: It actually works only with "trivial" types like integers and floating point types. "Complex" types like Array and String have (hidden) pointers to the underlying storage and cannot be passed around by just copying the struct itself. It also would not work with reference types which are just pointers to the real object storage.

To solve that problem, one can

  • Define a protocol which defines the methods for converting to Data and back:

    protocol DataConvertible {
        init?(data: Data)
        var data: Data { get }
    }
    
  • Implement the conversions as default methods in a protocol extension:

    extension DataConvertible where Self: ExpressibleByIntegerLiteral {
    
        init?(data: Data) {
            var value: Self = 0
            guard data.count == MemoryLayout.size(ofValue: value) else { return nil }
            _ = withUnsafeMutableBytes(of: &value, { data.copyBytes(to: $0)} )
            self = value
        }
    
        var data: Data {
            return withUnsafeBytes(of: self) { Data($0) }
        }
    }
    

    I have chosen a failable initializer here which checks that the number of bytes provided matches the size of the type.

  • And finally declare conformance to all types which can safely be converted to Data and back:

    extension Int : DataConvertible { }
    extension Float : DataConvertible { }
    extension Double : DataConvertible { }
    // add more types here ...
    

This makes the conversion even more elegant:

let value = 42.13
let data = value.data
print(data as NSData) // <713d0ad7 a3104540>

if let roundtrip = Double(data: data) {
    print(roundtrip) // 42.13
}

The advantage of the second approach is that you cannot inadvertently do unsafe conversions. The disadvantage is that you have to list all "safe" types explicitly.

You could also implement the protocol for other types which require a non-trivial conversion, such as:

extension String: DataConvertible {
    init?(data: Data) {
        self.init(data: data, encoding: .utf8)
    }
    var data: Data {
        // Note: a conversion to UTF-8 cannot fail.
        return Data(self.utf8)
    }
}

or implement the conversion methods in your own types to do whatever is necessary to serialize and deserialize a value.
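For a custom type, one possible approach is to serialize the fields one by one. The following is only a sketch: the Sample type and its layout (one UInt8 followed by one Double in host byte order) are invented for illustration, and the protocol is repeated so that the snippet is self-contained.

```swift
import Foundation

// Protocol repeated from above so that the sketch is self-contained.
protocol DataConvertible {
    init?(data: Data)
    var data: Data { get }
}

// Hypothetical example type. Serializing field by field avoids depending on
// the in-memory padding of the struct.
struct Sample: DataConvertible {
    var id: UInt8
    var reading: Double

    init(id: UInt8, reading: Double) {
        self.id = id
        self.reading = reading
    }

    init?(data: Data) {
        // Expect exactly one byte for `id` plus the bytes of the Double.
        guard data.count == 1 + MemoryLayout<Double>.size else { return nil }
        id = data[data.startIndex]
        var reading = 0.0
        // Copy the remaining bytes; copying does not require aligned memory.
        _ = withUnsafeMutableBytes(of: &reading) { data.dropFirst().copyBytes(to: $0) }
        self.reading = reading
    }

    var data: Data {
        var result = Data([id])
        withUnsafeBytes(of: reading) { result.append(contentsOf: $0) }
        return result
    }
}

let sample = Sample(id: 7, reading: 42.13)
let encoded = sample.data
print(encoded.count) // 9

if let decoded = Sample(data: encoded) {
    print(decoded.id, decoded.reading) // 7 42.13
}
```

The encoded form is 9 bytes long, whereas copying the struct itself would also copy whatever padding the compiler inserts between the two fields.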

Byte order

No byte order conversion is done in the above methods; the data is always in the host byte order. For a platform-independent representation (e.g. “big endian” aka “network” byte order), use the corresponding integer properties or initializers. For example:

let value = 1000
let data = value.bigEndian.data
print(data as NSData) // <00000000 000003e8>

if let roundtrip = Int(data: data) {
    print(Int(bigEndian: roundtrip)) // 1000
}

Of course this conversion can also be done generally, in the generic conversion method.
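One way this might look is an extension of FixedWidthInteger that always serializes in big-endian order. This is a sketch under assumptions: the names bigEndianData and init(bigEndianData:) are hypothetical and not part of the code above.

```swift
import Foundation

// Hypothetical helpers (names are illustrative): always serialize integers as big-endian.
extension FixedWidthInteger {

    var bigEndianData: Data {
        // `bigEndian` swaps the bytes only if the host is little-endian.
        return withUnsafeBytes(of: self.bigEndian) { Data($0) }
    }

    init?(bigEndianData data: Data) {
        guard data.count == MemoryLayout<Self>.size else { return nil }
        var value: Self = 0
        _ = withUnsafeMutableBytes(of: &value) { data.copyBytes(to: $0) }
        // Convert from big-endian back to the host byte order.
        self.init(bigEndian: value)
    }
}

let encoded = UInt32(1000).bigEndianData
print(Array(encoded)) // [0, 0, 3, 232]

if let decoded = UInt32(bigEndianData: encoded) {
    print(decoded) // 1000
}
```

Because the swap happens inside the conversion, the caller no longer has to remember to apply bigEndian on both sides.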

Ard answered 25/6, 2016 at 1:12 Comment(34)
Does the fact that we have to make a var copy of the initial value, mean that we're copying the bytes twice? In my current use case, I'm turning them into Data structs, so I can append them to a growing stream of bytes. In straight C, this is as easy as *(cPointer + offset) = originalValue. So the bytes are copied just once.Advisement
@TravisGriggs: Copying an int or float will most probably not be relevant, but you can do similar things in Swift. If you have a ptr: UnsafeMutablePointer<UInt8> then you can assign to the referenced memory via something like UnsafeMutablePointer<T>(ptr + offset).pointee = value which closely corresponds to your C code. There is one potential problem: Some processors allow only aligned memory access, e.g. you cannot store an Int at an odd memory location. I don't know if that applies to the currently used Intel and ARM processors.Ard
@TravisGriggs: (cont'd) ... Also this requires that a sufficiently large Data object has already been created, and in Swift you can only create and initialize the Data object, so you might have an additional copy of zero bytes during the initialization. – If you need more details then I would suggest that you post a new question.Ard
I love the generic solution #2 and am using it. But anytime I subscript a Data, it doesn't work. Is there an elegant way to make it work with subscripted Data ranges? E.g. Int32(data: input[0...3])?Advisement
@TravisGriggs: Int32(data: data.subdata(in: 0 ..< 4)) should work, but I don't know if that involves another copy. It should also be possible to expand the above methods by another parameter, e.g. Int32(data: data, atOffset: 4).Ard
@MartinR how would array types work with generic solution #2?Papen
@HansBrende: I am afraid that is currently not possible. It would require an extension Array: DataConvertible where Element: DataConvertible. That is not possible in Swift 3, but planned for Swift 4 (as far as I know). Compare "Conditional conformances" in github.com/apple/swift/blob/master/docs/…Ard
I am wondering if there are any potential issues with byte alignment here. Could it be that the pointer casting causes a non-aligned pointer to be returned which is then dereferenced as, for example, a double, causing a memory fault. If that is the case, might it be better to prefer a solution that byte copies the source bytes to the target?Gormand
@rghome: That is a good point. I strongly assume that the buffer of a newly created data object is aligned for all scalar types, similar to what malloc() returns, but I don't know if that is guaranteed.Ard
I am pretty sure a newly allocated Data is long aligned and I see your code only ever gets/sets stuff from the entire object, so it is OK. Where you might have problems is attempting to reuse parts of the code (as I am) to get/set stuff from offsets in the Data. In that case, you can't just cast the byte pointer to a non-byte pointer and dereference; you will have to do a byte copy into the target. Maybe I will post an answer if I get something working.Gormand
Are these solutions still valid for Swift 4?Chesty
@user965972: They should. Do you have any problems with it in Swift 4?Ard
@Chesty Generic solution #2 works perfect with Swift 4.Dealt
One more nice trick you want to add to your answer. Once you have your extension set up, you can create a mutating append function that takes any type.Nucleoside
I was looking for a little more safety with Generic Solution 1, so added some type restrictions and preconditions func to<T>(_ type: T.Type) -> T where T: FixedWidthInteger and then inside the function precondition(self.count == T.bitWidth >> 2). This prevents memory corruption with something like data[0..4].to(UInt32.self)Langan
oops that should be a >> 3Langan
Generic solution #1 crashes on Swift 4.2.Galilee
@Moebius: Can you provide more details? It works for me (just double-checked in Xcode 10.1, Swift 4.2).Ard
@MartinR Just a heads up (because I know you have lots of answers that use this API), the withUnsafeBytes method on Data that gives you back a typed pointer is to be deprecated in Swift 5. It will now give you back a raw buffer pointer to the underlying bytes.Novanovaculite
@Hamish: Thank you for the information, it seems that I have to update them all :( – Can you tell me what the reason for the deprecation was?Ard
@MartinR The old API made it far too easy to accidentally write code that wasn't memory safe – it was implemented using assumingMemoryBound(to:), which can yield undefined behaviour for withUnsafeBytes if the user gets two pointers of unrelated type to Data's underlying buffer. Even if it were switched to use bindMemory(to:), that would cause memory soundness issues with Data's init(bytesNoCopy:) initialiser.Novanovaculite
There's some further discussion on this issue over at: forums.swift.org/t/…Novanovaculite
The missing piece to this answer is that for UInts you should use one of the CFSwap functions to ensure you get the right byte order because this answer will implicitly load the data in host order.Hierolatry
@Max: I think that I addressed the byte order issue at the very end of this answer. There are “native” Swift methods which can be used instead of CFSwapXXX. Please let me know if something is missing or unclear.Ard
@MartinR yeah, you're right. I didn't realize that UInt16(littleEndian:) does the same thing as CFSwapInt16HostToLittle. The documentation is written in a way I find confusing but they are equivalent.Hierolatry
In Swift 5, solution #1 does not seem to work for Ints of any kind (Int8, Int16 etc) since they do not conform to ExpressibleByIntegerLiteral. I don't know if conformance was never there, or it was removed in later versions of Swift.Academe
@m_katsifarakis: I am fairly sure that it works with all integer types. let data = Data([1, 2]); if let value = data.to(type: Int16.self) { print(value) } compiles and runs as expected in Xcode 11 with Swift 5.Ard
@MartinR on Xcode 10.2.1 I get Instance method 'to(type:)' requires that 'Int.Type' conform to 'ExpressibleByIntegerLiteral'. Perhaps I should update Xcode and try again.Academe
@m_katsifarakis: I do not have Xcode 10.2.1 anymore, but it works with Xcode 10.3, which is the current released version.Ard
@m_katsifarakis: Could it be that you mistyped Int.self as Int.Type ?Ard
@MartinR that was it! Such a stupid mistake. Thanks very much for your help and the brilliant solution!Academe
@MartinR Hello, thank you for the great solution. What about using solution #2 for types that don't conform to ExpressibleByIntegerLiteral? I was using your Swift 4 solution #2 with Bool type (and others like Struct) and it was working nicely but now I can't use it anymore because they don't conform to ExpressibleByIntegerLiteral.Ifc
@ciclopez: One option would be to write a extension Bool: DataConvertible. – This ExpressibleByIntegerLiteral requirement is actually just a workaround for the fact that there is no protocol that all “simple” types conform to. There may be better solutions in Swift 5.3, I'll think about it some time ...Ard
Thanks for the detailed explanation @MartinR , I have been trying to chase down information regarding Data and how to use it, as I have been working in Metal, but wanted to do some byte comparisons, while not leveraging Metal. I was wondering if you might have some suggested reading material/blog that could help in gaining a deeper understanding. My goal is to compare two class instances and compare their byte differences.Spoondrift

You can get an unsafe pointer to mutable objects by using withUnsafePointer:

withUnsafePointer(to: &input) { /* $0 is your pointer */ }

I don't know of a way to get one for immutable objects, because the inout operator only works on mutable objects.

This is demonstrated in the answer that you've linked to.

Blucher answered 25/6, 2016 at 0:55 Comment(0)

In my case, Martin R's answer helped but the result was inverted. So I made a small change to his code:

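extension UInt16 : DataConvertible {

    init?(data: Data) {
        guard data.count == MemoryLayout<UInt16>.size else {
            return nil
        }
        // Copy the bytes instead of dereferencing a typed pointer
        // (the typed-pointer variant of withUnsafeBytes is deprecated in Swift 5).
        // As in the original version, the bytes are read in host byte order.
        var value: UInt16 = 0
        _ = withUnsafeMutableBytes(of: &value) { data.copyBytes(to: $0) }
        self = value
    }

    var data: Data {
        // I think the default on iOS is little-endian, because the bytes were reversed.
        let value = CFSwapInt16HostToBig(self)
        return withUnsafeBytes(of: value) { Data($0) }
    }
}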

The problem is related to little-endian versus big-endian byte order.

Ascogonium answered 9/12, 2016 at 19:11 Comment(0)
