Extract an object from the image of a box containing an object

I have a box that is transparent at the front, and I am placing a camera against the front transparent panel to capture images of the inside. Most of the time the box is empty, but if someone places an object inside it, I have to extract just that object from the captured image.

(My real aim is to recognize the object placed inside the box, but the first step is to extract the object and then extract features to build a training model. For now I am focusing only on extracting the object from the image.)

I am new to OpenCV, which I am using with Python, and I have found a few OpenCV functions that might help me:

  • GrabCut: this works perfectly for me and I am able to extract just the object, provided that I mark a rectangle over the object. But since the object can be anywhere inside the box, it is not possible to draw a rectangle of exactly the right size in advance. If there is a way around this, please suggest it.
  • Difference of images: since I have an image of the empty box, when the object is present I can use the cv2.absdiff function to calculate the difference between the two images (see the sketch after this list). But this doesn't work properly in most cases: it is a pixel-by-pixel difference, so the results are noisy, and changes in lighting conditions make it even harder.
  • Background subtraction: I read a few posts on this, and it looks like it is what I need, but the example I found is for video, and I didn't understand how to make it work with just two images: one of the empty box and another with the object.
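
For reference, this is roughly my cv2.absdiff attempt (a minimal sketch; the filenames and the threshold value of 30 are just placeholders I picked):

import cv2

# my two captures: the empty box and the box with an object
empty = cv2.imread("empty.jpg", cv2.IMREAD_GRAYSCALE)
full = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)

# pixel-by-pixel absolute difference between the two images
diff = cv2.absdiff(full, empty)

# keep only strong differences; small lighting changes still
# leave noise scattered all over the mask
_, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)

cv2.imshow('mask', mask)
cv2.waitKey(0)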

The code I found for background subtraction is as follows; even on video it doesn't work very well at short distances:

import cv2
import numpy as np

cap = cv2.VideoCapture(0)
fgbg = cv2.createBackgroundSubtractorMOG2()
fgbg2 = cv2.createBackgroundSubtractorKNN()

# create the display windows once, before the loop
cv2.namedWindow('Real', cv2.WINDOW_NORMAL)
cv2.namedWindow('MOG2', cv2.WINDOW_NORMAL)
cv2.namedWindow('KNN', cv2.WINDOW_NORMAL)
cv2.namedWindow('MOG2_ERODE', cv2.WINDOW_NORMAL)
cv2.namedWindow('KNN_ERODE', cv2.WINDOW_NORMAL)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('Real', frame)

    # update both background models with the current frame
    fgmask = fgbg.apply(frame)
    fgmask2 = fgbg2.apply(frame)

    # erode the masks to remove small speckles of noise
    kernel = np.ones((3, 3), np.uint8)
    fgmask_erode = cv2.erode(fgmask, kernel, iterations=1)
    fgmask2_erode = cv2.erode(fgmask2, kernel, iterations=1)

    cv2.imshow('MOG2', fgmask)
    cv2.imshow('KNN', fgmask2)
    cv2.imshow('MOG2_ERODE', fgmask_erode)
    cv2.imshow('KNN_ERODE', fgmask2_erode)

    k = cv2.waitKey(30) & 0xff
    if k == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()

Can anyone please help with this, and also show how to modify the above code to use just the two images? When I tried, I got blank images (my attempt is sketched below). Thanks in advance.
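
For completeness, this is roughly my two-image attempt that produces the blank masks (a minimal sketch; I suspect I am misusing the learningRate defaults):

import cv2

empty = cv2.imread("empty.jpg")
full = cv2.imread("object.jpg")

fgbg = cv2.createBackgroundSubtractorMOG2()
fgbg.apply(empty)          # first frame: should learn the empty box
fgmask = fgbg.apply(full)  # second frame: expected the object here

cv2.imshow('MOG2', fgmask)
cv2.waitKey(0)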

Sample images from the camera follow. (I am using an 8 MP camera, which is why the images are large; I reduced their size before uploading them here.)

Empty Box

Object-1

Object-2

Dorkus answered 28/3, 2017 at 5:19 Comment(6)
To improve the performance of GrabCut, perform edge detection to get the edge of the box. After obtaining that you can fit a rectangle and then perform GrabCut. – Petrous
How are you expecting people to help you with an image processing task if you don't provide images? – Morie
@JeruLuke: Thanks, I just added some sample images, and I will try what you are suggesting. – Dorkus
@m3h0w: I just added three images: one of the empty box, and another two with lunch boxes as the objects. – Dorkus
@Dorkus so you ask mainly about image difference and background subtraction, and then you accept an answer that uses edges? Jeru Luke's solution is cool and interesting, but does it actually answer your question and provide the reliability you'll need? – Morie
@m3h0w I accepted your answer, though I wanted to accept both answers. Your approach seems better to me: the images I shared were captured while holding the camera in my hand, so there is as much variation as possible, and the results achieved by your method are better compared to the other one. But I will definitely consider the other method suggested by JeruLuke and tune its parameters to see how I can further improve my results. Thank you both for your time, I really appreciate your help; one thing is clear, I have to put a lot of effort into understanding this subject. – Dorkus

You have mentioned subtraction, and I believe that in this case it is the best approach. I have implemented a very simple algorithm that takes care of the cases you have provided us with. I explain the code with comments, and the images below show the most important steps that you had problems with, the crux of the algorithm.

Difference between the images: (image)

Difference threshold, inverted: (image)

Both of the above combined: (image)

Result no. 1: (image)

Result no. 2: (image)

Code with explanation:

import cv2
import numpy as np

# load the images
empty = cv2.imread("empty.jpg")
full = cv2.imread("full_2.jpg")

# save color copy for visualization
full_c = full.copy()

# convert to grayscale
empty_g = cv2.cvtColor(empty, cv2.COLOR_BGR2GRAY)
full_g = cv2.cvtColor(full, cv2.COLOR_BGR2GRAY)

# blur to account for small camera movement
# (different kernel sizes may be more reliable
# for broader cases, so it is worth experimenting)
empty_g = cv2.GaussianBlur(empty_g, (41, 41), 0)
full_g = cv2.GaussianBlur(full_g, (41, 41), 0)

# get the difference between the full and the empty box
# (note: this is uint8 subtraction, so it wraps around; background
# pixels where the empty image is slightly brighter end up near 255
# and are removed by the inverse threshold below)
diff = full_g - empty_g
cv2.imwrite("diff.jpg", diff)

# inverse thresholding to change every pixel above 190
# to black (that means everything except the bag)
_, diff_th = cv2.threshold(diff, 190, 255, cv2.THRESH_BINARY_INV)
cv2.imwrite("diff_th.jpg", diff_th)

# combine the difference image and the inverse threshold
# will give us just the bag
bag = cv2.bitwise_and(diff, diff_th, None)
cv2.imwrite("just_the_bag.jpg", bag)

# threshold to get the mask instead of gray pixels
_, bag = cv2.threshold(bag, 100, 255, cv2.THRESH_BINARY)

# dilate to account for the blurring in the beginning
kernel = np.ones((15, 15), np.uint8)
bag = cv2.dilate(bag, kernel, iterations=1)

# find contours, sort them, and draw the biggest one
# (OpenCV 3.x signature; in OpenCV 4.x, findContours returns
# just two values: contours and hierarchy)
_, contours, _ = cv2.findContours(bag, cv2.RETR_TREE,
                                  cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:3]
cv2.drawContours(full_c, [contours[0]], -1, (0, 255, 0), 3)

# show and save the result
cv2.imshow("bag", full_c)
cv2.imwrite("result2.jpg", full_c)
cv2.waitKey(0)

Now, of course, the algorithm can be improved and will have to be adjusted to whatever conditions you'll have to deal with. You've mentioned differences in lighting, for example: you'll have to handle that to make sure the background is similar across the subtracted images. To do that you'll probably have to look at contrast enhancement algorithms, and maybe at image registration if the camera moves, which could be a completely separate issue on its own.
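
As a starting point for the lighting issue, here is a minimal sketch of one such contrast enhancement step, CLAHE, applied to both grayscale images before the blurring and subtraction above (CLAHE is just one option among many, and the clipLimit and tile size are arbitrary, untuned values):

import cv2

empty_g = cv2.imread("empty.jpg", cv2.IMREAD_GRAYSCALE)
full_g = cv2.imread("full_2.jpg", cv2.IMREAD_GRAYSCALE)

# equalize local contrast so the backgrounds of the two images
# look similar before they are blurred and subtracted
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
empty_g = clahe.apply(empty_g)
full_g = clahe.apply(full_g)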

I would also consider the GrabCut approach that JeruLuke mentioned, using a bounding rectangle of the contour found by my approach as its input. To make sure the object is contained within it, just expand the rectangle.

Tahoe answered 28/3, 2017 at 12:16 Comment(1)
I am working on another problem with a similar kind of approach. The problem I'm facing is that I can't get closed contours. My question #54956110 may help explain what I mean: I am trying to extract leaves from an image, and if there are 10 leaves in the image, I want 10 new images, each with one leaf. @m3h0w – Confess

I have a rough solution in place. You will have to refine it to suit your needs if you wish to take it further.

First, I performed edge detection using cv2.Canny() on a blurred version of the gray-scale image:

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #---convert image to gray---
blur = cv2.GaussianBlur(gray, (5, 5), 0)   #---blurred the image---
edges = cv2.Canny(blur, lower, upper)   #---how to find perfect edges see link below---
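
If you don't want to hand-tune lower and upper, one common heuristic (not necessarily the one from the link; the sigma value of 0.33 is just a widely used rule of thumb) derives both thresholds from the median of the blurred image:

import numpy as np

v = np.median(blur)   #---median of the blurred grayscale image above---
sigma = 0.33          #---typical value used with this heuristic---
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edges = cv2.Canny(blur, lower, upper)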


I dilated the edges to make them more visible:

kernel = np.ones((3, 3), np.uint8)
dilated = cv2.morphologyEx(edges, cv2.MORPH_DILATE, kernel)

Next, I found the contours present in the edge-detected image:

_, contours, hierarchy = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)   #---OpenCV 3.x signature; 4.x returns only contours, hierarchy---

Note that I used cv2.RETR_EXTERNAL to get the outer contours only.

Then I found the contour with the largest area and put a bounding box around it, as sketched below.
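
A rough sketch of that step, assuming contours comes from the findContours call above and image is the original color image:

largest = max(contours, key=cv2.contourArea)   #---contour with the largest area---
x, y, w, h = cv2.boundingRect(largest)   #---its bounding box---
cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)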


Now I used the GrabCut algorithm to segment the lunch box. To do so I got all the help I needed from THIS LINK HERE.

I used the coordinates of the bounding rectangle I obtained after finding the contour as the input rectangle for the GrabCut algorithm; a rough sketch of that step follows.
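
This sketch follows the standard OpenCV GrabCut tutorial; the iteration count of 5 is arbitrary, and x, y, w, h come from the bounding box step above:

import numpy as np

mask = np.zeros(image.shape[:2], np.uint8)
bgdModel = np.zeros((1, 65), np.float64)   #---internal model arrays GrabCut needs---
fgdModel = np.zeros((1, 65), np.float64)

rect = (x, y, w, h)   #---bounding rectangle from the contour step---
cv2.grabCut(image, mask, rect, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_RECT)

# keep the pixels GrabCut marked as sure or probable foreground
mask2 = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype('uint8')
segmented = image * mask2[:, :, np.newaxis]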

FINAL OUTPUT:

(images of the final segmentation results for both objects)

As you can see it is not perfect, but this is the best I could get.

Hope it helps. Do post if you get a better solution!!! :D

Petrous answered 28/3, 2017 at 10:49 Comment(5)
To get better edge detection, SEE THIS ANSWER I POSTED. – Petrous
Nice! I feel the "weak point" is the "biggest contour detection", because if your object's contour isn't closed by the Canny detection, then the biggest contour may well be one of the box contours. Maybe, if your object is never "box colored", you could also go for color segmentation to detect the contour? It would still probably be pretty noisy and you'd have to select the biggest contour, but it could be another approach if Canny + findContours fails. – Chemoreceptor
@Chemoreceptor I thought about the same thing too. K-Means clustering would help in that case. Since the background is constant, I also thought of adaptive thresholding. – Petrous
Interesting solution. I think, though, that not using background subtraction in any form would be a waste in this situation. An edge-based approach might prove to be unreliable. – Morie
@m3h0w I considered the fact that there was no reference image given. How would you proceed in such a case? – Petrous
