How do orthographic and perspective camera models in structure from motion differ from each other?

How do structure from motion methods that assume an orthographic camera model work, compared with those that assume a perspective camera model?

Also, how do these techniques differ from each other?

Tatianatatianas answered 15/9, 2016 at 22:42 Comment(0)

Say you have a static scene and a moving camera (or, equivalently, a rigidly moving scene and a static camera), and you want to reconstruct the scene geometry and the camera motion from two or more images. The reconstruction is usually based on point correspondences: the matched points give you a set of equations that must be solved for the 3-D points and the camera motion.

The solution can be based either on nonlinear minimization or on various approximations. The camera can be modelled by orthographic or perspective projection. In the simplest SfM case the camera is approximated by orthographic projection (or, more generally, by weak perspective projection), and the scene can be recovered up to scale. However, translation perpendicular to the image plane (that is, along the optical axis) can never be recovered, because orthographic projection discards depth.

Newer SfM methods use perspective projection, because orthographic projection cannot recover all of the information. With full perspective projection we can recover, for example, the translation along the optical axis. That is, the geometry and the full motion can be recovered up to a global scale factor.
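To make the point about translation along the optical axis concrete, here is a minimal sketch in plain Python. The function names, the sample points and the focal length `f=1.0` are my own assumptions for illustration; they do not come from any particular SfM library.

```python
def project_orthographic(point):
    """Orthographic projection: keep X and Y, drop the depth Z."""
    x, y, _ = point
    return (x, y)


def project_perspective(point, f=1.0):
    """Pinhole (perspective) projection with an assumed focal length f."""
    x, y, z = point
    return (f * x / z, f * y / z)


def translate(point, t):
    """Shift a 3-D point by the translation vector t."""
    return tuple(p + d for p, d in zip(point, t))


# Two hypothetical scene points in camera coordinates (Z is the optical axis).
points = [(1.0, 2.0, 5.0), (-1.0, 0.5, 8.0)]

# Move the scene 3 units along the optical axis.
moved = [translate(p, (0.0, 0.0, 3.0)) for p in points]

ortho_before = [project_orthographic(p) for p in points]
ortho_after = [project_orthographic(p) for p in moved]
persp_before = [project_perspective(p) for p in points]
persp_after = [project_perspective(p) for p in moved]

# The orthographic images are identical, so the Z translation leaves no trace.
print(ortho_before == ortho_after)
# The perspective images differ, so the Z translation is observable.
print(persp_before == persp_after)
```

Since the orthographic image does not change at all, no algorithm working from those image points could tell the two camera positions apart, whereas the perspective image does carry that information.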

Ashelyashen answered 16/9, 2016 at 10:22 Comment(3)

Thank you @Ashelyashen for your answer. So, assuming there is no motion along the optical axis, can we expect the two models to give close results? – Tatianatatianas 19/9, 2016 at 5:17

It depends heavily on the scene, but I think that SfM with orthographic projection is quite useful when the 3-D objects (moving in the image) have small relative movement in Z. – Ashelyashen 19/9, 2016 at 6:46

Have you ever come across a good tutorial applying the orthographic model that you could refer me to? – Tatianatatianas 19/9, 2016 at 11:16

To understand why each method is chosen, we need to look at the camera model in each case: orthographic and perspective.

The orthographic camera model is a special case where we assume that the distance of the scene from the center of projection is infinite. This means we assume there is no distortion caused by the distance between the object and the image. As a consequence, we expect the object's coordinates in the real world and in the image to be identical up to a scale factor.

So, for example, if we have a triangle in the real world with vertices (X1,Y1,Z1), (X2,Y2,Z2), (X3,Y3,Z3), we expect to see it in the image as (x1,y1), (x2,y2), (x3,y3), where X1 = w*x1, X2 = w*x2, ..., Y1 = w*y1, and so on, with w being some scaling factor.

When is this a good assumption? Note that I didn't take the Z values of the points into consideration. So this assumption is good when the distance of the scene from the camera is almost constant.
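As a rough check of the "almost constant distance" condition, the sketch below compares a perspective projection with a single-scale (weak perspective) projection of a triangle. Everything here is an assumption made for illustration: the function names, the two sample triangles, the focal length `f=1.0`, and the choice of w as the mean depth divided by f.

```python
def perspective(point, f=1.0):
    """Pinhole projection: each point is divided by its own depth Z."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


def weak_perspective(point, mean_depth, f=1.0):
    """Single-scale projection: X = w * x with one w for all points.

    Assumption for this sketch: w = mean_depth / f.
    """
    X, Y, _ = point
    w = mean_depth / f
    return (X / w, Y / w)


def max_error(triangle, f=1.0):
    """Largest coordinate difference between the two projections."""
    mean_depth = sum(p[2] for p in triangle) / len(triangle)
    worst = 0.0
    for p in triangle:
        xp, yp = perspective(p, f)
        xw, yw = weak_perspective(p, mean_depth, f)
        worst = max(worst, abs(xp - xw), abs(yp - yw))
    return worst


# Hypothetical triangle far from the camera with almost constant depth.
flat_triangle = [(1.0, 0.0, 100.0), (0.0, 1.0, 101.0), (-1.0, -1.0, 99.0)]
# Hypothetical triangle close to the camera with strongly varying depth.
deep_triangle = [(1.0, 0.0, 2.0), (0.0, 1.0, 6.0), (-1.0, -1.0, 10.0)]

print(max_error(flat_triangle))  # small: the single scale w fits well
print(max_error(deep_triangle))  # large: the single scale w fits poorly
```

The first triangle has depth variation that is tiny relative to its distance, so one scale factor describes all three vertices well. For the second triangle the depths differ by a factor of five and the single-scale model breaks down.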

Note: This is a very simplistic explanation that doesn't take into consideration many other factors, such as the camera itself, lens distortion, and more.

Nosewheel answered 16/9, 2016 at 7:3 Comment(1)

Above all, thank you for your reply. I am still wondering how to decide, for a given application, whether to use a method that assumes an orthographic or a perspective model. Could you give examples of applications where the orthographic model is the better choice, and similarly for the perspective model? – Tatianatatianas 16/9, 2016 at 10:18
