What do I do with the fundamental matrix?

I am trying to reconstruct a 3d shape from multiple 2d images. I have calculated a fundamental matrix, but now I don't know what to do with it.

I am finding multiple conflicting answers on Stack Overflow and in academic papers. For example, one source says you need to compute the rotation and translation matrices from the fundamental matrix.

Another says you need to find the camera matrices.

Another says you need to find the homographies.

Yet another says you need to find the epipolar lines.

Which is it?? (And how do I do it? I have read the H&Z book but I do not understand it. It says I can 'easily' use the 'direct formula' in result 9.14, but result 9.14 is neither easy nor direct to understand.)

Stack overflow wants code so here's what I have so far:

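    import numpy as np
    import cv2

    # cameraTransformMatrix and cameraView (used below) are helper functions not shown here

    # let's create some sample data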

    Wpts = np.array([[1, 1, 1, 1],  # A Cube in world points
                     [1, 2, 1, 1],
                     [2, 1, 1, 1],
                     [2, 2, 1, 1],
                     [1, 1, 2, 1],
                     [1, 2, 2, 1],
                     [2, 1, 2, 1],
                     [2, 2, 2, 1]])


    Cpts = np.array([[0, 4, 0, 1],  #slightly up
                     [4, 0, 0, 1],
                     [-4, 0, 0, 1],
                     [0, -4, 0, 1]])
    Cangles = np.array([[0, -1, 0],  #slightly looking down
                        [-1, 0, 0],
                        [1, 0, 0],
                        [0,1,0]])



    views = []
    transforms = []
    clen = len(Cpts)
    for i in range(clen):
        cangle = Cangles[i]
        cpt = Cpts[i]

        transform = cameraTransformMatrix(cangle, cpt)
        transforms.append(transform)
        newpts = np.dot(Wpts, transform.T)
        view = cameraView(newpts)
        views.append(view)



    F = cv2.findFundamentalMat(views[0], views[1])[0]
    ## now what???  How do I recover the cube shape?

Edit: I do not know the camera parameters

Outrigger answered 24/11, 2019 at 3:27 Comment(3)
might be wrong since I didn't work in that field yet, but doesn't the fundamental matrix give you information about the camera movement/displacement and with that you can use stereo reconstruction by ray-intersection (in the most naive way)?Corniculate
That's what I thought, but I am seeing conflicting reports on how to do thatOutrigger
This is a great post! I've gotten this very question before in interviews for computer vision / autonomous vehicle positions.Hurlow

Fundamental Matrix

First, listen to the fundamental matrix song ;).

The fundamental matrix only describes the mathematical relationship between point correspondences in two images (x' in image 2, x in image 1). "That means, for all pairs of corresponding points holds $x'^{\top} F x = 0$" (Wikipedia). This also means that outliers or incorrect point correspondences directly degrade the quality of your fundamental matrix.
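To make the constraint concrete, here is a minimal NumPy sketch that evaluates x'^T F x for a set of correspondences. The cameras, the 3D points and the helper names (`skew`, `project`, `epipolar_residuals`) are made up for illustration and are not part of the question's code. F is built from a known pose, so the residuals should be numerically zero for correct matches and clearly non-zero for a deliberately shifted match.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def project(K, R, t, X):
    """Project Nx3 points with x ~ K (R X + t); returns Nx2 pixel coordinates."""
    x = (K @ (R @ X.T + t[:, None])).T
    return x[:, :2] / x[:, 2:]

def epipolar_residuals(F, x1, x2):
    """Evaluate x2_i^T F x1_i for Nx2 arrays of pixel coordinates."""
    h1 = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous points, image 1
    h2 = np.hstack([x2, np.ones((len(x2), 1))])  # homogeneous points, image 2
    return np.einsum("ij,jk,ik->i", h2, F, h1)

# Hypothetical setup: same intrinsics for both views, second camera shifted along x.
K = np.array([[75.0, 0.0, 50.0], [0.0, 75.0, 50.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([1.0, 0.0, 0.0])
X = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 6.0], [-1.0, 0.5, 4.0], [0.5, 1.0, 7.0]])

x1 = project(K, np.eye(3), np.zeros(3), X)
x2 = project(K, R, t, X)

# Ground-truth F for this setup: F = K^-T [t]_x R K^-1
F = np.linalg.inv(K).T @ skew(t) @ R @ np.linalg.inv(K)

residuals = epipolar_residuals(F, x1, x2)                    # correct matches
bad = epipolar_residuals(F, x1, x2 + np.array([0.0, 3.0]))   # matches shifted by 3 px
print(residuals, bad)
```

With real data you would plug in the F returned by `cv2.findFundamentalMat` and your matched points; large residuals point to outliers.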

Additionally, a similar structure exists for the relationship of point correspondences between three images; it is called the trifocal tensor.

A 3d reconstruction using exclusively the properties of the Fundamental Matrix is not possible because "The epipolar geometry is the intrinsic projective geometry between two views. It is independent of scene structure, and only depends on the cameras’ internal parameters and relative pose." (HZ, p.239).

Camera matrix

Regarding your question of how to reconstruct the shape from multiple images: you need to know the camera matrices of your images (K', K). The camera matrix is a 3x3 matrix composed of the camera focal lengths or principal distance (fx, fy) as well as the optical center or principal point (cx, cy).


$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$

You can derive your camera matrix using camera calibration.
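As a small illustration, the sketch below builds a camera matrix and projects one point given in camera coordinates. The helper name `camera_matrix` and the numbers (fx = fy = 75, cx = cy = 50, i.e. a guess for a 100x100 image) are assumptions for the example, not calibrated values.

```python
import numpy as np

def camera_matrix(fx, fy, cx, cy):
    """Build the 3x3 intrinsic matrix K from focal lengths and principal point."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

# Assumed values for illustration only; calibrate your camera for real ones.
K = camera_matrix(fx=75.0, fy=75.0, cx=50.0, cy=50.0)

# A point in the camera coordinate frame (z is the depth along the optical axis).
X_cam = np.array([0.2, -0.1, 2.0])

# Pinhole projection: multiply by K, then divide by the third coordinate.
u, v, w = K @ X_cam
pixel = (u / w, v / w)
print(pixel)  # (57.5, 46.25)
```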

Essential matrix

When you know your camera matrices, you can turn your fundamental matrix into an essential matrix E.


$$E = K'^{\top} F K$$

You could say, somewhat loosely, that your fundamental matrix is now "calibrated".
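A minimal NumPy sketch of this step, under made-up assumptions: both images share the same hypothetical K, and F is synthesized from a known rotation and translation so the result can be checked. The helper names are illustrative. A valid essential matrix has two equal singular values and one zero singular value, which is a handy sanity check on a guessed K.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def essential_from_fundamental(F, K1, K2):
    """E = K2^T F K1, with K1 belonging to image 1 (x) and K2 to image 2 (x')."""
    return K2.T @ F @ K1

# Hypothetical ground truth, used only to build a consistent F for the demo.
K = np.array([[75.0, 0.0, 50.0], [0.0, 75.0, 50.0], [0.0, 0.0, 1.0]])
a = 0.1  # rotation about the y axis, in radians
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.0])
F = np.linalg.inv(K).T @ skew(t) @ R @ np.linalg.inv(K)

E = essential_from_fundamental(F, K, K)
singular_values = np.linalg.svd(E, compute_uv=False)
print(singular_values)  # two equal values and one that is numerically zero
```

If the two largest singular values of your E differ a lot, the assumed intrinsics (or F itself) are probably off.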

The essential matrix can be used to get the rotation (rotation matrix R) and translation (vector t) of your second image relative to your first image, with the translation known only up to an unknown scale: t will be a unit vector. For this purpose you can use the OpenCV functions decomposeEssentialMat or recoverPose (which uses the cheirality check), or read the more detailed explanations in HZ.
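To show what happens conceptually, here is a NumPy-only sketch of the SVD decomposition into four (R, t) candidates followed by a cheirality check (points must lie in front of both cameras). It is not OpenCV's implementation; the function names and the synthetic test data are assumptions for illustration, and the points are expected in normalized coordinates (pixels multiplied by K^-1).

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates encoded in E; t has unit length."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:   # keep proper rotations (E is only defined up to sign)
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1, R2, t = U @ W @ Vt, U @ W.T @ Vt, U[:, 2]
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def triangulate(P1, P2, x1, x2):
    """Linear triangulation of one correspondence."""
    A = np.array([x1[0] * P1[2] - P1[0], x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0], x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

def recover_pose_numpy(E, x1n, x2n):
    """Pick the candidate that puts the most points in front of both cameras."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    best, best_count = None, -1
    for R, t in decompose_essential(E):
        P2 = np.hstack([R, t[:, None]])
        count = 0
        for a, b in zip(x1n, x2n):
            X = triangulate(P1, P2, a, b)
            if X[2] > 0 and (R @ X + t)[2] > 0:   # cheirality check
                count += 1
        if count > best_count:
            best, best_count = (R, t), count
    return best

# Synthetic check with a made-up pose and made-up points.
a = 0.1
R_true = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([1.0, 0.0, 0.2])
t_true = t_true / np.linalg.norm(t_true)
X = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 6.0], [-1.0, 0.5, 4.0], [0.5, 1.0, 7.0]])
x1n = X[:, :2] / X[:, 2:]
X2 = (R_true @ X.T).T + t_true
x2n = X2[:, :2] / X2[:, 2:]

R_est, t_est = recover_pose_numpy(skew(t_true) @ R_true, x1n, x2n)
print(np.round(R_est, 3), np.round(t_est, 3))
```

Only one of the four candidates places the points in front of both cameras, which is why the check is needed.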

Projection matrix

Knowing your translation and rotation, you can build the projection matrices for your images. The projection matrix is defined as $P = K [R \mid t]$. Finally, you can use triangulation (triangulatePoints) to derive the 3D coordinates of your image points. I recommend running a bundle adjustment afterwards to refine the result. There is also an sfm module in OpenCV.
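The sketch below puts these two steps together in plain NumPy: it builds P = K [R | t] for both views and triangulates with the linear (DLT) method. The intrinsics, pose and points are invented for the demo, and the helper names are illustrative. In OpenCV, `cv2.triangulatePoints` takes the two 3x4 projection matrices and 2xN point arrays and returns homogeneous 4xN points. The last lines also show the scale ambiguity: doubling t doubles the reconstruction.

```python
import numpy as np

def projection_matrix(K, R, t):
    """P = K [R | t], a 3x4 matrix."""
    return K @ np.hstack([R, np.reshape(t, (3, 1))])

def project(P, X):
    """Project one 3D point to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence given in pixels."""
    A = np.array([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]   # null vector of A
    return X[:3] / X[3]

# Made-up intrinsics and relative pose for illustration.
K = np.array([[75.0, 0.0, 50.0], [0.0, 75.0, 50.0], [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])

P1 = projection_matrix(K, np.eye(3), np.zeros(3))   # first camera at the origin
P2 = projection_matrix(K, R, t)

X_true = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 6.0], [-1.0, 0.5, 4.0]])
pts1 = [project(P1, X) for X in X_true]
pts2 = [project(P2, X) for X in X_true]

X_rec = np.array([triangulate_dlt(P1, P2, p, q) for p, q in zip(pts1, pts2)])

# Same image points, but translation scaled by 2: the scene comes out twice as large.
P2_scaled = projection_matrix(K, R, 2.0 * t)
X_scaled = np.array([triangulate_dlt(P1, P2_scaled, p, q) for p, q in zip(pts1, pts2)])
print(X_rec, X_scaled)
```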

Since knowledge of homographies or epipolar lines is not strictly necessary for the 3D reconstruction, I did not explain these concepts.

Miyasawa answered 25/11, 2019 at 8:58 Comment(9)
Sorry, I do not have the camera parameters/matricesOutrigger
I assume your image is 100x100 pixels. If you do not have the camera parameters you could assume a focal length (e.g. 0.75 of your image size fx,fy = 75) and you can set the optical center to the center of your image (cx = 50, cy =50). I want to point out that it is definitely better to calibrate the camera or refer to real values (e.g. in the EXIF of the image)Miyasawa
I have been assuming the focal length is 1 unit and trying calibration from there but have not been able to produce any recognizable results.Outrigger
It is strange that you say H&Z says you cannot reconstruct the scene from the fundamental matrix when pg 265 explicitly says the opposite. "If a set of point correspondences in two views determine the fundamental matrix uniquely, then the scene and cameras may be reconstructed from these correspondences alone"Outrigger
Yeah, but only up to a projective transformation, not a metric reconstruction, as can be seen in the example in Fig. 10.3 on page 267. There are ways to go from a projective reconstruction to a metric reconstruction using autocalibration (youtube.com/watch?v=37QM0I2jDYo&feature=youtu.be).Miyasawa
I asked the author if this is useful today. He replied: "... I think a better approach is to guess the K matr. for each img, then use corresp. to solve for an initial metric reco. using the 5 point method between pairs of imgs. Even if you guess f incorrectly, it won't prevent you from making an initial metric reco, because f is largely ambiguous with t due to the dolly-zoom ambig, so you will still be able to form an initial metric reconstr., and since you never have to go through a dimensionality reduction your reproj. errors will likely be lower than if you had started with proj. approach."Miyasawa
if you could make an answer explaining this I will mark it as the answerOutrigger
What if we have a guess of the focal length, cx and cy, then start with that guess, then is it possible with tie points between two images improve the camera Matrix and also calculate the coefficients?Cyruscyst
This answer is partly wrong. If you know the fundamental matrix then you know everything. If you only have the essential matrix, then you don't have the camera intrinsics. So the fundamental matrix enriches the essential matrix with the camera intrinsic parameters.Dardani

With your fundamental matrix, you can determine the camera matrices P and P' in a canonical form, as described in (HZ, pp. 254-256). From these camera matrices you can theoretically triangulate a projective reconstruction that differs from the real scene by an unknown projective transformation.
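For reference, the canonical pair from HZ Result 9.14 is P = [I | 0] and P' = [[e']_x F | e'], where e' is the epipole in the second image (e'^T F = 0). A minimal NumPy sketch, assuming F has rank 2; the function names and the synthetic F are made up for the demo. The check at the end confirms that arbitrary points projected by this pair satisfy x'^T F x = 0, i.e. the cameras are consistent with F even though they are not the true metric cameras.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def canonical_cameras(F):
    """Return P = [I | 0] and P' = [[e']_x F | e'] for a rank-2 F."""
    U, _, _ = np.linalg.svd(F)
    e2 = U[:, -1]                      # left null vector of F: e2^T F = 0
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(e2) @ F, e2[:, None]])
    return P1, P2

# Synthetic rank-2 F from a made-up calibrated setup, normalized to unit norm.
K = np.array([[75.0, 0.0, 50.0], [0.0, 75.0, 50.0], [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])
F = np.linalg.inv(K).T @ skew(t) @ R @ np.linalg.inv(K)
F = F / np.linalg.norm(F)

P1, P2 = canonical_cameras(F)

# Project arbitrary homogeneous points with the canonical pair and test x'^T F x.
rng = np.random.default_rng(0)
Xh = rng.uniform(-5.0, 5.0, size=(6, 4))
x1 = (P1 @ Xh.T).T
x2 = (P2 @ Xh.T).T
residuals = np.einsum("ij,jk,ik->i", x2, F, x1)
print(residuals)  # numerically zero
```

Points triangulated with this pair live in a projective frame, so the cube from the question would generally come out skewed until the reconstruction is upgraded to metric.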

Note that the linear triangulation methods aren't suitable for projective reconstruction, as stated in (HZ, Discussion, p. 313) ["...neither of these two linear methods is quite suitable for projective reconstruction, since they are not projective-invariant."]. The triangulation technique recommended there should therefore be used to obtain valuable results (it is more work to implement).

From this projective reconstruction you could use self-calibration approaches that can work in some scenarios but will not yield the accuracy and robustness that you can obtain with a calibrated camera and the utilization of the essential matrix to compute the motion parameters.

Methylal answered 29/12, 2019 at 14:9 Comment(0)
