Converting TrueDepth data to grayscale image produces distorted image

I'm getting the depth data from the TrueDepth camera and converting it to a grayscale image. (I realize I could pass the AVDepthData to a CIImage constructor. For testing purposes, however, I want to make sure my array is populated correctly, and constructing the image manually would confirm that.)

I notice that when I try to convert the depth data to a grayscale image, I get weird results. Namely, the image appears in the top half, and the bottom half is distorted (sometimes showing the image twice, other times showing nonsense).

For example:

Expected output (i.e. CIImage(depthData: depthData)):

Actual output (20% of the time):

Actual output (80% of the time):

I started with Apple's sample code and tried to extract the pixels from the CVPixelBuffer.

let depthDataMap: CVPixelBuffer = ...
let width = CVPixelBufferGetWidth(depthDataMap) // 640
let height = CVPixelBufferGetHeight(depthDataMap) // 480
let bytesPerRow = CVPixelBufferGetBytesPerRow(depthDataMap) // 1280
let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap)
assert(kCVPixelFormatType_DepthFloat16 == CVPixelBufferGetPixelFormatType(depthDataMap))
let byteBuffer = unsafeBitCast(baseAddress, to: UnsafeMutablePointer<Float16>.self)

var pixels = [Float]()
for row in 0..<height {
  for col in 0..<width {
    let byteBufferIndex = col + row * bytesPerRow
    let distance = byteBuffer[byteBufferIndex]
    pixels.append(Float(distance)) // Float16 must be converted to store it in [Float]
  }
}

// TODO: render pixels as a grayscale image

Any idea what is wrong here?

TL;DR

You should always unwrap the call to CVPixelBufferGetBaseAddress so that you don't miss important warnings.

Turns out the problem is how the value inside the byteBuffer is being accessed. If instead of using unsafeBitCast() you use the method Apple uses in their example (assumingMemoryBound), you will get the correct results.

Although it looks like:

// BAD CODE

let byteBuffer = unsafeBitCast(baseAddress, to: UnsafeMutablePointer<Float16>.self)
// ...
let byteBufferIndex = col + row * bytesPerRow
let distance = byteBuffer[byteBufferIndex]

... should behave the same as:

// GOOD CODE

let rowData = baseAddress! + row * bytesPerRow
let distance = rowData.assumingMemoryBound(to: Float16.self)[col]

... the two are in fact very different, with the former producing the bad results mentioned above, and the latter producing good results.

The final (fixed) code should look like this:

let depthDataMap: CVPixelBuffer = ...
let width = CVPixelBufferGetWidth(depthDataMap) // 640
let height = CVPixelBufferGetHeight(depthDataMap) // 480
let bytesPerRow = CVPixelBufferGetBytesPerRow(depthDataMap) // 1280
// Assumes depthDataMap stays locked (CVPixelBufferLockBaseAddress) while it is read.
let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap)!
assert(kCVPixelFormatType_DepthFloat16 == CVPixelBufferGetPixelFormatType(depthDataMap))

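var pixels = [Float]()
pixels.reserveCapacity(width * height)
for row in 0..<height {
  // Step the raw pointer by whole rows (counted in bytes), then view that row as Float16 values.
  let rowData = (baseAddress + row * bytesPerRow).assumingMemoryBound(to: Float16.self)
  for col in 0..<width {
    // Convert Float16 to Float so the value fits the [Float] array.
    pixels.append(Float(rowData[col]))
  }
}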

// TODO: render pixels as a grayscale image
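
For the remaining TODO, one possible first step is to scale the depth values to 8-bit gray levels. The following is only a sketch: the function name grayscaleBytes(from:) is hypothetical, and mapping the nearest value to black, the farthest to white, and non-finite values (such as NaN) to black is an assumption, not something Apple's sample prescribes.

```swift
/// Linearly maps depth values to 8-bit gray levels (hypothetical helper).
/// The smallest finite value becomes 0 and the largest becomes 255.
/// Non-finite values (NaN, infinity) are mapped to 0 by assumption.
func grayscaleBytes(from pixels: [Float]) -> [UInt8] {
    let finite = pixels.filter { $0.isFinite }
    guard let lo = finite.min(), let hi = finite.max(), hi > lo else {
        // No usable range (empty, constant, or all non-finite): return all black.
        return [UInt8](repeating: 0, count: pixels.count)
    }
    return pixels.map { value in
        guard value.isFinite else { return 0 }
        // Normalize to 0...1, clamp defensively, then scale to 0...255.
        let t = min(max((value - lo) / (hi - lo), 0), 1)
        return UInt8((t * 255).rounded())
    }
}
```

The resulting array has one byte per pixel in row-major order, which is the layout an 8-bit grayscale bitmap expects.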

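Row padding is not the issue here, because we know:

assert(MemoryLayout<Float16>.size == 2)
assert(width == 640)
assert(bytesPerRow == 1280)
assert(width * 2 == bytesPerRow)

So there are no extra bytes at the end of a row, and in this case the buffer can be read as one giant array of Float16 values. (That equality happens to hold here; it is not guaranteed for every pixel buffer.)

The difference is in the units of the index. byteBuffer is an UnsafeMutablePointer<Float16>, so byteBuffer[i] reads the i-th 2-byte element, not the i-th byte. Indexing with col + row * bytesPerRow therefore advances 1280 elements (2560 bytes, i.e. two image rows) for every row. Rows 0..<240 read every other row of the image, which is why the picture appears squashed into the top half. Rows 240..<480 read past the end of the buffer, which is why the bottom half shows garbage.

In the good code, row * bytesPerRow is added to a raw pointer, where offsets are counted in bytes, and only col is counted in Float16 elements. The equivalent index on the typed pointer would be col + row * (bytesPerRow / MemoryLayout<Float16>.stride).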
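
One way to compare the two snippets is to work out which byte offset each one reads. The sketch below uses the dimensions from the question. The helper names (badByteOffset, goodByteOffset, demoTypedVersusRaw) are hypothetical, and UInt16 stands in for Float16 because both are 2 bytes wide and UInt16 is available on every platform.

```swift
// Dimensions reported in the question.
let width = 640
let height = 480
let bytesPerRow = 1280
let elementSize = MemoryLayout<UInt16>.stride   // 2 bytes, same as Float16
let bufferSize = height * bytesPerRow           // 614400 bytes

// Byte offset read by `typedPointer[col + row * bytesPerRow]`:
// a typed pointer scales the whole index by the element size.
func badByteOffset(row: Int, col: Int) -> Int {
    return (col + row * bytesPerRow) * elementSize
}

// Byte offset read by `(rawPointer + row * bytesPerRow).assumingMemoryBound(...)[col]`:
// the row offset is in bytes, and only `col` is scaled by the element size.
func goodByteOffset(row: Int, col: Int) -> Int {
    return row * bytesPerRow + col * elementSize
}

// Checks on real memory that a typed subscript and a raw byte offset
// address the same element when the raw offset is index * stride.
func demoTypedVersusRaw() -> (typed: UInt16, raw: UInt16) {
    let count = 8
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: count * MemoryLayout<UInt16>.stride,
        alignment: MemoryLayout<UInt16>.alignment)
    defer { raw.deallocate() }
    let typed = raw.bindMemory(to: UInt16.self, capacity: count)
    for i in 0..<count { typed[i] = UInt16(i) }
    let viaTyped = typed[4]                                        // element 4
    let viaRaw = (raw + 8).assumingMemoryBound(to: UInt16.self)[0] // byte 8
    return (viaTyped, viaRaw)
}

print(goodByteOffset(row: 1, col: 0))             // start of row 1
print(badByteOffset(row: 1, col: 0))              // start of row 2
print(badByteOffset(row: 240, col: 0) >= bufferSize)
```

With these numbers, the bad index for row 1 lands at the start of row 2, and from row 240 onward it falls at or beyond the end of the buffer, which matches the "top half fine, bottom half garbage" symptom.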

Update:

If you force unwrap the call to CVPixelBufferGetBaseAddress:

let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap)!

... things start to make a bit more sense.

Namely, you will see a warning on this line:

let byteBuffer = unsafeBitCast(baseAddress, to: UnsafeMutablePointer<Float16>.self)

⚠️ 'unsafeBitCast' from 'UnsafeMutableRawPointer' to 'UnsafeMutablePointer' gives a type to a raw pointer and may lead to undefined behavior

⚠️ Use the 'assumingMemoryBound' method if the pointer is known to point to an existing value or array of type 'Float16' in memory

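The warning points toward the right fix, although the distortion I was seeing comes from the index rather than from the cast itself: on a Float16 pointer, row * bytesPerRow counts 2-byte elements, so it skips every other row and eventually runs past the end of the buffer.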

The lesson, therefore, is that you should always unwrap the result of CVPixelBufferGetBaseAddress before attempting to use it (e.g. in unsafeBitCast).
