Apple’s Core ML framework and the DeepLabV3 image segmentation model are what you are looking for. DeepLabV3 has been trained to recognize and segment the following classes:
- aeroplane
- bicycle
- bird
- boat
- bottle
- bus
- car
- cat
- chair
- cow
- dining table
- dog
- horse
- motorbike
- person
- potted plant
- sheep
- sofa
- train
- tv or monitor
VNCoreMLRequest is the Vision framework API to use for running the model. You pass it a completion handler, and inside that handler you read the request's results, which for this model are VNCoreMLFeatureValueObservation objects. The VNCoreMLFeatureValueObservation carries the segmentation map of the picture, which is what you were looking for: it tells you which of the classes above each pixel belongs to. Removing the background then comes down to masking out everything except the segment you want to keep.
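If you want to inspect the raw map directly before converting it to an image, a minimal sketch might look like the following. This is an assumption-laden example: it presumes the observation's multiArrayValue is a two-dimensional array of class indices, with 0 as background and 15 as person (the usual PASCAL VOC ordering):

// Hypothetical sketch: read the predicted class at the center of the map.
// Assumes `segmentationmap` is the MLMultiArray pulled from the observation (see the code below).
let height = segmentationmap.shape[0].intValue
let width = segmentationmap.shape[1].intValue
let centerClass = segmentationmap[[NSNumber(value: height / 2), NSNumber(value: width / 2)]].intValue
let isPerson = (centerClass == 15) // 15 = person, assuming the VOC label order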
A complete, nicely written, step-by-step guide is here. The main part from that link is shown below:
// Run the DeepLabV3 model on the input image via a Vision request
func runVisionRequest() {
    // Wrap the auto-generated DeepLabV3 class in a Vision-compatible model
    guard let model = try? VNCoreMLModel(for: DeepLabV3(configuration: .init()).model)
    else { return }
    let request = VNCoreMLRequest(model: model, completionHandler: visionRequestDidComplete)
    request.imageCropAndScaleOption = .scaleFill
    // Perform the request off the main thread
    DispatchQueue.global().async {
        let handler = VNImageRequestHandler(cgImage: self.inputImage.cgImage!, options: [:])
        do {
            try handler.perform([request])
        } catch {
            print(error)
        }
    }
}
// Extract the segmentation map and convert it to an image
// (image(min:max:) and resizedImage(for:) come from third-party helper extensions)
func visionRequestDidComplete(request: VNRequest, error: Error?) {
    DispatchQueue.main.async {
        if let observations = request.results as? [VNCoreMLFeatureValueObservation],
           let segmentationmap = observations.first?.featureValue.multiArrayValue {
            let segmentationMask = segmentationmap.image(min: 0, max: 1)
            self.outputImage = segmentationMask!.resizedImage(for: self.inputImage.size)!
            self.maskInputImage()
        }
    }
}
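The completion handler above finishes by calling maskInputImage(), which the guide implements separately. As a hedged sketch of that last step, assuming inputImage and outputImage are UIImage properties on the same type, you can composite the photo over a plain background with Core Image's CIBlendWithMask filter:

// Hedged sketch of the masking step (needs import UIKit and import CoreImage).
// Keeps the pixels covered by the mask and replaces everything else with white.
func maskInputImage() {
    guard let original = CIImage(image: inputImage),
          let mask = CIImage(image: outputImage) else { return }

    // Any CIImage can be used here if you want a different background.
    let background = CIImage(color: CIColor.white).cropped(to: original.extent)

    let filter = CIFilter(name: "CIBlendWithMask")
    filter?.setValue(original, forKey: kCIInputImageKey)
    filter?.setValue(background, forKey: kCIInputBackgroundImageKey)
    filter?.setValue(mask, forKey: kCIInputMaskImageKey)

    let context = CIContext()
    if let result = filter?.outputImage,
       let cgImage = context.createCGImage(result, from: result.extent) {
        outputImage = UIImage(cgImage: cgImage)
    }
}

White areas of the mask keep the original pixels and black areas fall back to the background, so swapping the white CIImage for another picture replaces the background instead of removing it.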