How to implement ktf.image.resize_images in CoreML Custom Layer?

I am trying to implement a Core ML custom layer in Swift that replaces a Keras Lambda layer wrapping the ktf.image.resize_images function.

This is my Python script:

# Assumed imports for this excerpt: TensorFlow reached through the Keras
# backend (so ktf.image.resize_images is available) and the coremltools
# protobuf definitions used by the Lambda converter below.
from keras.backend import tf as ktf
from coremltools.proto import NeuralNetwork_pb2


def resizeImage(x, size):
    # Bilinearly resize the feature map x to the target (height, width).
    return ktf.image.resize_images(x, size)

def convert_lambda(layer):
    if layer.function == resizeImage:
        params = NeuralNetwork_pb2.CustomLayerParams()

        # Must match the @objc name of the Swift MLCustomLayer class.
        params.className = "resizeImage"
        params.description = "Decoder Resizing"

        # Pass the first dimension of the target size to Swift as "scale".
        params.parameters["scale"].intValue = layer.arguments["size"][0].value

        print("LAMBDA CONVERSION = Size embedded to CoreML Model: %d" % layer.arguments["size"][0].value)

        return params
    else:
        return None

...

# Decoder: at each step, resize x to the spatial size of the matching encoder
# layer, convolve, then concatenate the corresponding skip connection.
for i in range(decoder_n):
    strides = 1
    reverse_i = decoder_n - i - 1
    size = encoder_layers[decoder_n - i - 1].shape[1:3]
    out_channels = 2 ** ((decoder_n - i - 2) // 3 + 5) if i != decoder_n - 1 else 2 ** 5

    x = Lambda(resizeImage, arguments={'size':size})(x)
    x = Convolution2D(out_channels, kernel_size=(3, 3), activation='relu', strides=strides, padding='same')(x)

    x = concatenate([x, encoder_layers[decoder_n - i - 1]], axis=3)
    out_channels = 2 ** ((decoder_n - i - 2) // 3 + 5) if i != decoder_n - 1 else channels
    activation = 'relu' if i != decoder_n - 1 else 'sigmoid'
    x = Convolution2D(out_channels, kernel_size=(3, 3), activation=activation, strides=strides, padding='same')(x)

And so far, this is the Swift class:

import Foundation
import CoreML
import Accelerate
import UIKit

@objc(resizeImage) class resizeImage: NSObject, MLCustomLayer {

    let scale: Float

    required init(parameters: [String : Any]) throws {
        if let scale = parameters["scale"] as? Float {
            self.scale = scale
        } else {
            self.scale = 1.0
        }

        print(#function, parameters)

        super.init()
    }

    func setWeightData(_ weights: [Data]) throws {
        print(#function, weights)
    }

    func outputShapes(forInputShapes inputShapes: [[NSNumber]]) throws -> [[NSNumber]] {
        print(#function, inputShapes)

        return inputShapes
    }

    func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
        print(#function, inputs.count, outputs.count)

        for i in 0..<inputs.count {
            let input = inputs[i]
            let output = outputs[i]

            for j in 0..<input.count {
                let x = input[j].floatValue
                let y = x * self.scale
                output[j] = NSNumber(value: y)
            }
        }
    }
}

Any suggestions on why the predicted output image does not come out correctly?


Talapoin answered 16/2, 2018 at 13:34
tf.image.resize_images() resizes the image using bilinear interpolation. But your custom layer appears not to do any resizing at all; it just multiplies each pixel by some value. That's a completely different operation. If you're going to be resizing the images by a fixed scale factor at each step (for example, making them twice as large), then I suggest making the Core ML layer an UpsampleLayerParams instead. – Loftin
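
If the layer really is going to change the spatial dimensions, outputShapes(forInputShapes:) also has to report the enlarged size; returning inputShapes unchanged tells Core ML that the output is the same size as the input. A minimal sketch of what that method could look like instead, assuming the shapes arrive in Core ML's usual (sequence, batch, channels, height, width) order and that scale is a whole-number upscaling factor:

func outputShapes(forInputShapes inputShapes: [[NSNumber]]) throws -> [[NSNumber]] {
    // Each shape is assumed to be [sequence, batch, channels, height, width].
    return inputShapes.map { shape in
        var newShape = shape
        newShape[3] = NSNumber(value: shape[3].intValue * Int(scale))   // height
        newShape[4] = NSNumber(value: shape[4].intValue * Int(scale))   // width
        return newShape
    }
}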
@MatthijsHollemans, apologies. This was just how my last try looked before publishing the question here. I actually intend to replicate the bilinear interpolation in Swift, but I've got no clue how to implement it. That's why, so far, I was just trying a simple scale operation. – Talapoin
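
For the bilinear part itself, here is a minimal, unoptimized sketch of what the resampling inside evaluate(inputs:outputs:) could look like. It assumes the multi-arrays are float32 and laid out as (sequence, batch, channels, height, width), that outputShapes already reports the enlarged height and width, and it uses the simple oy * inH / outH coordinate mapping (roughly what the old align_corners=False behaviour of resize_images does):

func evaluate(inputs: [MLMultiArray], outputs: [MLMultiArray]) throws {
    for (input, output) in zip(inputs, outputs) {
        // Assumed layout: [sequence, batch, channels, height, width], float32.
        let planes = input.shape[0].intValue * input.shape[1].intValue * input.shape[2].intValue
        let inH = input.shape[3].intValue
        let inW = input.shape[4].intValue
        let outH = output.shape[3].intValue
        let outW = output.shape[4].intValue

        let src = input.dataPointer.assumingMemoryBound(to: Float.self)
        let dst = output.dataPointer.assumingMemoryBound(to: Float.self)

        for p in 0..<planes {
            let srcPlane = src + p * inH * inW
            let dstPlane = dst + p * outH * outW

            for oy in 0..<outH {
                // Map the output row back into input coordinates.
                let fy = Float(oy) * Float(inH) / Float(outH)
                let y0 = min(Int(fy), inH - 1)
                let y1 = min(y0 + 1, inH - 1)
                let wy = fy - Float(y0)

                for ox in 0..<outW {
                    let fx = Float(ox) * Float(inW) / Float(outW)
                    let x0 = min(Int(fx), inW - 1)
                    let x1 = min(x0 + 1, inW - 1)
                    let wx = fx - Float(x0)

                    // Fetch the four neighbours, blend horizontally, then vertically.
                    let p00 = srcPlane[y0 * inW + x0]
                    let p01 = srcPlane[y0 * inW + x1]
                    let p10 = srcPlane[y1 * inW + x0]
                    let p11 = srcPlane[y1 * inW + x1]

                    let top = p00 + (p01 - p00) * wx
                    let bottom = p10 + (p11 - p10) * wx
                    dstPlane[oy * outW + ox] = top + (bottom - top) * wy
                }
            }
        }
    }
}

The inner loops are plain Swift for clarity; once the math is verified, the per-row work could be vectorized, for example with the planar scaling routines in Accelerate's vImage.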
