OpenCV, Python: How to use mask parameter in ORB feature detector

From reading a few answers on Stack Overflow, I've learned this much so far:

The mask has to be a numpy array (which has the same shape as the image) with data type CV_8UC1 and have values from 0 to 255.

What is the meaning of these numbers, though? Is it that any pixels with a corresponding mask value of zero will be ignored in the detection process and any pixels with a mask value of 255 will be used? What about the values in between?

Also, how do I initialize a numpy array with data type CV_8UC1 in Python? Can I just use dtype=cv2.CV_8UC1?

Here is the code I am using currently, based on the assumptions I'm making above. But the issue is that I don't get any keypoints when I run detectAndCompute for either image. I have a feeling it might be because the mask isn't the correct data type. If I'm right about that, how do I correct it?

# convert images to grayscale
base_gray = cv2.cvtColor(self.base, cv2.COLOR_BGRA2GRAY)
curr_gray = cv2.cvtColor(self.curr, cv2.COLOR_BGRA2GRAY)

# initialize feature detector
detector = cv2.ORB_create()

# create a mask using the alpha channel of the original image--don't
# use transparent or partially transparent parts
base_cond = self.base[:,:,3] == 255
base_mask = np.array(np.where(base_cond, 255, 0))

curr_cond = self.base[:,:,3] == 255
curr_mask = np.array(np.where(curr_cond, 255, 0), dtype=np.uint8)

# use the mask and grayscale images to detect good features
base_keys, base_desc = detector.detectAndCompute(base_gray, mask=base_mask)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, mask=curr_mask)

print("base keys: ", base_keys)
# []
print("curr keys: ", curr_keys)
# []
Rabelais answered 22/8, 2017 at 6:33 Comment(3)
"how do I initialize a numpy array" -- Did you try reading the numpy documentation on data types? (Eldin)
The question is, to what data type on that list does CV_8UC1 correspond? I'm inclined to believe that it's uint8 because of the 8 and the U, although I haven't found any documentation confirming that. The issue is that I'm not getting any keypoints from that (Rabelais)
docs.opencv.org/2.4/modules/core/doc/basic_structures.html -- first paragraph. You got it right, uint8. | Inspect the masks and make sure they make sense. (Eldin)

So here is most, if not all, of the answer:

What is the meaning of those numbers

0 means to ignore the pixel and 255 means to use it. I'm still unclear on the values in between, but I don't think all nonzero values are considered "equivalent" to 255 in the mask. See here.

Also, how do I initialize a numpy array with data type CV_8UC1 in python?

The type CV_8U is an unsigned 8-bit integer, which in numpy is numpy.uint8. The C1 suffix means that the array has 1 channel, as opposed to 3 channels for color images and 4 channels for RGBA images. So, to create a 1-channel array of unsigned 8-bit integers:

import numpy as np
np.zeros((480, 720), dtype=np.uint8)

(a three-channel array would have shape (480, 720, 3), four-channel (480, 720, 4), etc.) This mask would cause the detector and extractor to ignore the entire image, though, since it's all zeros.
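To get a mask that keeps part of the image, one option is to start from zeros and set a region to 255. This is only a minimal sketch: the image size and the rectangle coordinates are made-up values for illustration, not anything from my actual code.

```python
import numpy as np

height, width = 480, 720  # hypothetical image size

# Start with everything masked out (0 = ignore this pixel).
mask = np.zeros((height, width), dtype=np.uint8)

# Enable a rectangular region of interest (255 = use this pixel).
# Rows 100..299 and columns 200..499 are arbitrary example bounds.
mask[100:300, 200:500] = 255

print(mask.dtype, mask.shape)   # uint8 (480, 720)
print(np.count_nonzero(mask))   # 60000
```

A mask built this way can be passed as the mask argument in the same way as the alpha-based masks in the question.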

how do I correct [the code]?

There were two separate issues, one per image, and each caused the corresponding keypoint array to be empty.

First, I forgot to set the dtype for base_mask:

base_mask = np.array(np.where(base_cond, 255, 0)) # wrong
base_mask = np.array(np.where(base_cond, 255, 0), dtype=np.uint8) # right

Second, I used the wrong image to generate my curr_cond array:

curr_cond = self.base[:,:,3] == 255 # wrong
curr_cond = self.curr[:,:,3] == 255 # right

Some pretty dumb mistakes.
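The first mistake is easy to reproduce without OpenCV. The sketch below uses a tiny synthetic BGRA array and a hypothetical helper, alpha_to_mask, which is just a name I made up for illustration. It shows that np.where without an explicit dtype does not return uint8, while converting the result does.

```python
import numpy as np

def alpha_to_mask(bgra):
    """Hypothetical helper: 255 where alpha is fully opaque, 0 elsewhere."""
    opaque = bgra[:, :, 3] == 255
    return np.where(opaque, 255, 0).astype(np.uint8)

# Synthetic 4x4 BGRA image: left half opaque, right half semi-transparent.
img = np.zeros((4, 4, 4), dtype=np.uint8)
img[:, :2, 3] = 255
img[:, 2:, 3] = 128

untyped = np.where(img[:, :, 3] == 255, 255, 0)  # dtype left to numpy
mask = alpha_to_mask(img)

print(untyped.dtype)  # a default integer type, not uint8
print(mask.dtype)     # uint8
```

Calling the helper once per image (for example on self.base and on self.curr) would also make it harder to reuse the wrong image by accident, which was the second mistake.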

Here is the full corrected code:

# convert images to grayscale
base_gray = cv2.cvtColor(self.base, cv2.COLOR_BGRA2GRAY)
curr_gray = cv2.cvtColor(self.curr, cv2.COLOR_BGRA2GRAY)

# initialize feature detector
detector = cv2.ORB_create()

# create a mask using the alpha channel of the original image--don't
# use transparent or partially transparent parts
base_cond = self.base[:,:,3] == 255
base_mask = np.array(np.where(base_cond, 255, 0), dtype=np.uint8)

curr_cond = self.curr[:,:,3] == 255
curr_mask = np.array(np.where(curr_cond, 255, 0), dtype=np.uint8)

# use the mask and grayscale images to detect good features
base_keys, base_desc = detector.detectAndCompute(base_gray, mask=base_mask)
curr_keys, curr_desc = detector.detectAndCompute(curr_gray, mask=curr_mask)

TL;DR: The mask parameter is a 1-channel numpy array with the same shape as the grayscale image in which you are trying to find features (if image shape is (480, 720), so is mask).

The values in the array are of type np.uint8; 255 means "use this pixel" and 0 means "don't".
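Following the suggestion in the comments to inspect the masks, the rules above can be turned into a quick sanity check to run before calling the detector. This is a sketch under my own assumptions: check_mask is a hypothetical helper, not part of OpenCV, and it only tests the conditions described in this answer.

```python
import numpy as np

def check_mask(gray, mask):
    """Hypothetical helper: return a list of problems found in a detector mask."""
    problems = []
    if mask.dtype != np.uint8:
        problems.append(f"dtype is {mask.dtype}, expected uint8")
    if mask.ndim != 2:
        problems.append(f"mask has {mask.ndim} dimensions, expected 2")
    if mask.shape != gray.shape[:2]:
        problems.append(f"shape {mask.shape} does not match image {gray.shape[:2]}")
    if not mask.any():
        problems.append("mask is all zeros, so every pixel is ignored")
    return problems

gray = np.zeros((480, 720), dtype=np.uint8)       # stand-in grayscale image
good = np.full((480, 720), 255, dtype=np.uint8)   # valid mask, everything enabled
bad = np.zeros((480, 720), dtype=np.int64)        # wrong dtype and all zeros

print(check_mask(gray, good))  # []
print(check_mask(gray, bad))   # two problems: dtype and all-zero mask
```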

Thanks to Dan Mašek for leading me to parts of this answer.

Rabelais answered 23/8, 2017 at 0:33 Comment(0)
