Python OpenCV: Detecting a general direction of movement?

I'm still hacking together a book scanning script, and for now, all I need is to be able to automagically detect a page turn. The book fills up 90% of the screen (I'm using a cruddy webcam for the motion detection), so when I turn a page, nearly all of the motion in the frame is in the direction of the turn.

I have modified a motion-tracking script, but taking derivatives of the tracked position is getting me nowhere:

#!/usr/bin/env python

import cv, numpy

class Target:
    def __init__(self):
        self.capture = cv.CaptureFromCAM(0)
        cv.NamedWindow("Target", 1)

    def run(self):
        # Capture first frame to get size
        frame = cv.QueryFrame(self.capture)
        frame_size = cv.GetSize(frame)
        grey_image = cv.CreateImage(cv.GetSize(frame), cv.IPL_DEPTH_8U, 1)
        moving_average = cv.CreateImage(cv.GetSize(frame), cv.IPL_DEPTH_32F, 3)
        difference = None
        movement = []

        while True:
            # Capture frame from webcam
            color_image = cv.QueryFrame(self.capture)

            # Smooth to get rid of false positives
            cv.Smooth(color_image, color_image, cv.CV_GAUSSIAN, 3, 0)

            if not difference:
                # Initialize
                difference = cv.CloneImage(color_image)
                temp = cv.CloneImage(color_image)
                cv.ConvertScale(color_image, moving_average, 1.0, 0.0)
            else:
                cv.RunningAvg(color_image, moving_average, 0.020, None)

            # Convert the scale of the moving average.
            cv.ConvertScale(moving_average, temp, 1.0, 0.0)

            # Take the absolute difference between the current frame and the moving average.
            cv.AbsDiff(color_image, temp, difference)

            # Convert the image to grayscale.
            cv.CvtColor(difference, grey_image, cv.CV_RGB2GRAY)

            # Convert the image to black and white.
            cv.Threshold(grey_image, grey_image, 70, 255, cv.CV_THRESH_BINARY)

            # Dilate and erode to get object blobs
            cv.Dilate(grey_image, grey_image, None, 18)
            cv.Erode(grey_image, grey_image, None, 10)

            # Calculate movements
            storage = cv.CreateMemStorage(0)
            contour = cv.FindContours(grey_image, storage, cv.CV_RETR_CCOMP, cv.CV_CHAIN_APPROX_SIMPLE)
            points = []

            while contour:
                # Draw rectangles
                bound_rect = cv.BoundingRect(list(contour))
                contour = contour.h_next()

                pt1 = (bound_rect[0], bound_rect[1])
                pt2 = (bound_rect[0] + bound_rect[2], bound_rect[1] + bound_rect[3])
                points.append(pt1)
                points.append(pt2)
                cv.Rectangle(color_image, pt1, pt2, cv.CV_RGB(255,0,0), 1)

            num_points = len(points)

            if num_points:
                x = 0
                for point in points:
                    x += point[0]
                x /= num_points

                movement.append(x)

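            # At least two samples are needed before a difference can be taken.
            # movement[-30:] covers the most recent 30 samples, including the latest.
            if len(movement) > 1:
                if numpy.average(numpy.diff(movement[-30:])) > 0:
                    print 'Left'
                else:
                    print 'Right'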

            # Display frame to user
            cv.ShowImage("Target", color_image)

            # Listen for ESC or ENTER key
            c = cv.WaitKey(7) % 0x100
            if c == 27 or c == 10:
                break

if __name__=="__main__":
    t = Target()
    t.run()

It detects the average motion of the average center of all of the boxes, which is extremely inefficient. How would I go about detecting such motions quickly and accurately (i.e. within a threshold)?

I'm using Python, and I plan to stick with it, as my whole framework is based on Python.

Any help is appreciated, so thank you all in advance. Cheers.

Outstrip answered 21/12, 2010 at 3:38 Comment(4)
Do you really need the motion tracking? Why not just detect changes over some threshold (i.e. something along the lines of sum(abs(img2 - img1)) > threshold)? - Schwejda
Hmm, I'll fiddle with that. But how would I tell whether the page was turned forwards or backwards, or even worse, turned halfway and then back? I'll play with the graphs, as that's how I work. Thanks! - Outstrip
Ah, true, I just assumed that you needed to know that a page had been turned... If you need to know the direction, my comment above clearly isn't a good option! - Schwejda
You seem to know what you are talking about. Do you mind if I ask if you know how to use cv.CalcOpticalFlowLK()? I have it working properly, but it gives me completely unusable results (it's like a slowed-down version of the threshold method you mentioned). - Outstrip
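For reference, the thresholding idea from the first comment could be sketched roughly as below. This is only a minimal illustration, assuming the frames are already grayscale NumPy uint8 arrays of the same shape; the function name frame_changed and the default threshold of 12.0 are made up for the example and would need tuning. As noted in the comments, it only says that something changed, not in which direction.

```python
import numpy as np


def frame_changed(prev_gray, curr_gray, threshold=12.0):
    """Return True when the mean absolute pixel difference exceeds threshold.

    prev_gray, curr_gray: 2-D uint8 arrays of the same shape (grayscale frames).
    threshold: hypothetical value on the 0-255 scale; tune it for your camera.
    """
    # Cast to a signed type first: subtracting uint8 arrays wraps around,
    # so 5 - 10 would otherwise come out as 251 instead of -5.
    diff = np.abs(curr_gray.astype(np.int16) - prev_gray.astype(np.int16))
    return float(diff.mean()) > threshold
```

Using the mean rather than the sum keeps the threshold independent of the frame size.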

I haven't used OpenCV in Python before, just a bit in C++ with openFrameworks.

For this, I presume the velx and vely outputs of optical flow would work.

For more on how Optical Flow works check out this paper.

HTH

Schou answered 21/12, 2010 at 3:51 Comment(5)
Oooooh! That looks shiny. I will check this out, definitely, as this seems to be what I'm looking for. - Outstrip
I've got it, but I can't figure out what's happening. I get a fluctuation in velx, but it's randomly in the positive or negative direction. Do you have anything I can look at? This seems like something I can use, but I just can't figure out how... - Outstrip
A bit late, but I modified a demo package from OpenCV's Python bindings to suit my needs. Thanks! - Outstrip
@Outstrip - I'm trying to do exactly what you want, would you mind sharing your code? Or if not, would you be so kind as to point me in the direction of the demo package you used? You would really make my day :-) - Adept
@kramer65: I don't think I kept the code around, unfortunately. I just modified the opt_flow.py demo script and set a threshold for the x and y directions. It didn't work as well as I had hoped. - Outstrip
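Since the original code was not kept, here is a rough sketch of what "dense optical flow plus a threshold on the x direction" might look like. It is not the asker's script. It assumes the newer cv2 bindings (not the legacy cv module used in the question) and grayscale uint8 frames; the helper names and the min_speed and min_fraction defaults are hypothetical. Averaging only over pixels that move faster than min_speed is one way to damp the random sign flips in velx mentioned above.

```python
import numpy as np


def dense_flow(prev_gray, curr_gray):
    """Dense Farneback optical flow between two grayscale uint8 frames.

    Returns an (H, W, 2) float32 array holding per-pixel (vx, vy).
    """
    # Imported here so horizontal_direction() can be used without OpenCV.
    import cv2
    # Positional arguments: pyr_scale, levels, winsize, iterations,
    # poly_n, poly_sigma, flags. The values are common starting points.
    return cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)


def horizontal_direction(flow, min_speed=1.0, min_fraction=0.2):
    """Classify the dominant horizontal motion in a flow field.

    min_speed (pixels per frame) and min_fraction (share of pixels that
    must be moving) are hypothetical thresholds. Returns 'left', 'right',
    or None when too little of the frame is moving.
    """
    vx = flow[..., 0]
    moving = np.abs(vx) > min_speed
    if moving.mean() < min_fraction:
        return None
    # Positive x points toward the right edge of the image.
    return "right" if vx[moving].mean() > 0 else "left"
```

Note that "right" here means toward the right edge of the captured image; a mirrored webcam feed would flip it.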

Why don't you use cv.GoodFeaturesToTrack? It may improve the script's runtime and shorten the code.
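A possible way to use that suggestion, sketched with the newer cv2 bindings rather than the legacy cv module: pick corners with goodFeaturesToTrack, follow them into the next frame with pyramidal Lucas-Kanade, and take the median horizontal displacement. The helper names, the corner parameters, and min_points are assumptions made for this example, and pairing the corner detector with Lucas-Kanade tracking is my reading of the suggestion, not something the answer spells out.

```python
import numpy as np


def track_corners(prev_gray, curr_gray, max_corners=100):
    """Find corners in prev_gray and track them into curr_gray.

    Returns two (N, 2) arrays with the old and new (x, y) positions of
    the corners that were tracked successfully.
    """
    # Imported here so median_dx() can be used without OpenCV installed.
    import cv2
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                 qualityLevel=0.01, minDistance=8)
    if p0 is None:  # no corners found, e.g. on a blank frame
        return np.empty((0, 2)), np.empty((0, 2))
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, p0, None)
    ok = status.ravel() == 1
    return p0.reshape(-1, 2)[ok], p1.reshape(-1, 2)[ok]


def median_dx(old_pts, new_pts, min_points=5):
    """Median horizontal displacement in pixels, or 0.0 with too few points.

    The median is less sensitive than the mean to a few badly tracked corners.
    """
    if len(old_pts) < min_points:
        return 0.0
    return float(np.median(new_pts[:, 0] - old_pts[:, 0]))
```

A threshold on the sign and size of median_dx over a few consecutive frames could then stand in for the bounding-box averaging in the question.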

Aristophanes answered 15/3, 2011 at 17:24 Comment(0)
