How to use Apple's machine learning to lift subjects from the background?

In iOS 16 you can lift the subject from an image or isolate the subject by removing the background.

You can see it in action here: https://developer.apple.com/wwdc22/101?time=1101

I wonder whether this feature is also available for developers to use in their own apps. One could probably train a machine learning model and use it with the Vision Framework.

Here's an example of how to implement this; however, Apple's solution is already good and I don't want to spend time reinventing the wheel when there's a shortcut.

Harry answered 16/9, 2022 at 10:31 Comment(6)
So, Daniel, you need a working example to "lift subjects from the background," right?Reitz
Which Apple example do you mean? I was hoping to get the same results as with the implementation in iOS 16. I'm sure Apple uses their own model and also does some extra image processing in order to get good results. For example, there are higher-level APIs from Apple to recognize barcodes. I want the same for background removal.Harry
@Harry Hmm, that's gonna require a lot of expertise if you want a high-level API, especially if Apple doesn't want to reveal the super high-end onesVentilator
So unfortunately the best you can probably get is the code Apple themselves providedVentilator
Not sure if the article is new, but I think I have found what I am looking for; I just have to test it: developer.apple.com/documentation/vision/…Harry
@Harry Can you please report back on what solution worked for you? It seems that the Vision framework only works for certain kinds of subjects, which is not the case with the Photos app?Mediocre

Apple's Core ML framework and the DeepLabV3 image segmentation model are what you are looking for.

The DeepLabV3 model has been trained to recognize and segment these items:

  • aeroplane
  • bicycle
  • bird
  • boat
  • bottle
  • bus
  • car
  • cat
  • chair
  • cow
  • dining table
  • dog
  • horse
  • motorbike
  • person
  • potted plant
  • sheep
  • sofa
  • train
  • tv or monitor

VNCoreMLRequest is the Vision API to use with the Core ML model. It accepts a completion handler in which you receive the features detected in the image, namely VNCoreMLFeatureValueObservation objects.

The VNCoreMLFeatureValueObservation object gives you the segmentation map of the picture, which is what you are looking for. Removing the background then comes down to masking out one of these segments (a sketch of that masking step follows the code below).

A complete, nicely written, step-by-step guide is here.

From that link the main part is as below:


// use DeepLabV3
func runVisionRequest() {

    guard let model = try? VNCoreMLModel(for: DeepLabV3(configuration: .init()).model)
    else { return }

    let request = VNCoreMLRequest(model: model, completionHandler: visionRequestDidComplete)
    request.imageCropAndScaleOption = .scaleFill

    DispatchQueue.global().async {
        // inputImage is a UIImage property on the surrounding class
        let handler = VNImageRequestHandler(cgImage: self.inputImage.cgImage!, options: [:])
        do {
            try handler.perform([request])
        } catch {
            print(error)
        }
    }
}

// extract the segmentation map and convert it to an image using a third-party library

func visionRequestDidComplete(request: VNRequest, error: Error?) {
    DispatchQueue.main.async {
        if let observations = request.results as? [VNCoreMLFeatureValueObservation],
           let segmentationMap = observations.first?.featureValue.multiArrayValue {

            // image(min:max:) and resizedImage(for:) are MLMultiArray/UIImage helpers from the guide
            let segmentationMask = segmentationMap.image(min: 0, max: 1)
            self.outputImage = segmentationMask!.resizedImage(for: self.inputImage.size)!

            self.maskInputImage()
        }
    }
}
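
The maskInputImage() helper called above is not shown in the snippet. Here is a minimal sketch of what that masking step could look like, assuming inputImage and outputImage are UIImage properties on the same class and outputImage currently holds the segmentation mask (subject light, background dark). The Core Image approach and names here are illustrative, not the guide's exact implementation.

import CoreImage
import CoreImage.CIFilterBuiltins
import UIKit

func maskInputImage() {
    guard let original = CIImage(image: inputImage),
          let mask = CIImage(image: outputImage) else { return }

    // Blend the original image over a transparent background, letting the
    // segmentation mask decide which pixels survive.
    let filter = CIFilter.blendWithMask()
    filter.inputImage = original
    filter.backgroundImage = CIImage(color: .clear).cropped(to: original.extent)
    filter.maskImage = mask

    let context = CIContext()
    if let result = filter.outputImage,
       let cgImage = context.createCGImage(result, from: result.extent) {
        // Replace the mask with the final cut-out image
        outputImage = UIImage(cgImage: cgImage)
    }
}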

Alpine answered 19/10, 2022 at 7:42 Comment(0)

The segmentation task you're talking about comes under Apple's computer vision framework (the Vision framework), which is listed here

There are various topics listed for each specific task.
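
For example, if the subject is a person, Vision ships a built-in segmentation request. A minimal sketch, assuming iOS 15+ and a CGImage input (the function name is illustrative):

import Vision
import CoreVideo

func personMask(for cgImage: CGImage) throws -> CVPixelBuffer? {
    // Built-in Vision request that produces a person/background mask
    let request = VNGeneratePersonSegmentationRequest()
    request.qualityLevel = .accurate                      // favor mask quality over speed
    request.outputPixelFormat = kCVPixelFormatType_OneComponent8

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])

    // Single-channel mask: values near 1 mark the person, near 0 the background
    return request.results?.first?.pixelBuffer
}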

Furring answered 18/10, 2022 at 12:27 Comment(0)

First of all, I just want to share what I understand about this feature from Apple.

On the Apple Developer side, there are CreateML and CoreML projects.

https://developer.apple.com/machine-learning/

You can download trained models from there, or you can create your own or update one of them according to your segmentation requirements.

Here are sample models: https://developer.apple.com/machine-learning/models/#text

A developer may connect to trained data through an API (you are right), but since the data will be very large (about 40 billion), it is recommended to write an ML project instead.
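
If you do download a model yourself, loading it with Core ML is straightforward. A minimal sketch, assuming you have downloaded a raw .mlmodel file (for example DeepLabV3.mlmodel from the models page) to a local URL; the function name is illustrative:

import CoreML

func loadDownloadedModel(at url: URL) throws -> MLModel {
    // Compile the raw .mlmodel into the .mlmodelc format the device executes
    let compiledURL = try MLModel.compileModel(at: url)
    return try MLModel(contentsOf: compiledURL)
}

If you instead drag the .mlmodel file into an Xcode project, Xcode generates a typed Swift class for you, like the DeepLabV3 class used in the answer above.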

Also: WWDC sample: https://github.com/vincentspitale/SSC2022

CoreML sample: https://blog.devgenius.io/foreground-background-separation-using-core-ml-82efbe7e7fc8

Apropos answered 18/10, 2022 at 12:57 Comment(0)
