What do the elements in a homography matrix mean?

Asked 22/8, 2012 at 9:44 Answered 12/2, 2013 at 0:11

I'm new to image processing, but I'm using EMGU for C# image analysis. However, I know the homography matrix isn't unique to EMGU, and so perhaps someone with knowledge of another language can explain better.

Please can someone explain, as simply as possible, what each element does? I've looked this up online but can't find an answer that I can properly understand (as I said, I'm kinda new to all this!)

I analyse 2 images, both 2-dimensional. Therefore a 3x3 matrix is needed to account for the rotation / translation of the image. If no movement is detected, the homography matrix is the identity, with rows: 1 0 0, 0 1 0, 0 0 1

I know from research (e.g. OpenCV Homography, Transform a point, what is this code doing?) that the matrix has rows: 1 0 Tx, 0 1 Ty, X X X

The 1 0, 0 1 bit is the rotation of the x and y coordinates. The Tx and Ty bits are the translational movement, but what is the XXX bit? This is what I don't understand. Is it something to do with affine transformations? Please can someone explain: 1. whether I'm currently right in what I say above; 2. what the XXX bit means.

Summertree answered 22/8, 2012 at 9:44 Comment(3)

Is my answer to this similar question any help to you? If so we can close this as a dupe of that. – Girl 22/8, 2012 at 10:7

Not really... As I said, I'm new to this, so you'll have to bear with me... I understand we can't use non-square matrices, but I still don't understand what the final row does. For example, in the research I did, it calculated a Z = 1/tz using the third row, but I have no clue what this tz is, hence I don't get what the Z is. – Summertree 22/8, 2012 at 12:36

@Summertree Check this answer to see how homography is related to rotation and translation and projects a 2D point to 3D coords. https://mcmap.net/q/225212/-get-3d-coordinates-from-2d-image-pixel-if-extrinsic-and-intrinsic-parameters-are-known – Apathetic 23/8, 2012 at 12:20

It's not that difficult to understand if you have a grasp of matrix multiplication. Assume your point x is

/a\
\b/,

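and you want to transform it by a 2x2 matrix A (the "rotation" part of the transformation; the numbers here are arbitrary and do not form a pure rotation):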

/3 4\
\5 6/

and "move" it by t

/2\
\2/.

The latter matrices are the components of the affine transformation to get the new point y:

y = A*x + t = <a'; b'>T //(T means transposed).

As you know, to get that, one can construct a 3x3 matrix B and a vector x' looking like

    /3 4 2\         /a\
B = |5 6 2| ,  x' = |b|
    \0 0 1/         \1/

such that

     /a'\
y' = |b'| = B*x'
     \ 1/

from which you can extract y. Let's see how that works. In the original transformation (using addition), the first step would be to carry out the multiplication, i.e. the rotating part y_r:

y_r = A*x = <3a+4b; 5a+6b>T

then you add the "absolute" part:

y = y_r + t = <3a+4b+2; 5a+6b+2>T

Now look at how B works. I'll calculate y' row by row:

1) a' = 3*a + 4*b + 2*1

2) b' = 5*a + 6*b + 2*1

3) the rest: 0*a + 0*b + 1*1 = 1

Just what we expected. First, the rotation part gets calculated--addition and multiplication. Then, the x-part of the translational part gets added, multiplied by 1--it stays the same. The same thing for the second row.

In the third row, a and b are dropped (multiplied by 0). The last part is kept the same, and happens to be 1. So all the last row does is "drop" the values of the point and keep the 1.
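The row-by-row calculation above can be checked numerically. This is only a minimal sketch in plain Python, assuming an arbitrary sample point (a, b) = (1, 2); the helper names `mat_vec`, `affine_2x2` and `affine_3x3` are made up for this illustration and do not come from EMGU or OpenCV.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

A = [[3, 4],
     [5, 6]]           # 2x2 part from the answer
t = [2, 2]             # translation from the answer

B = [[3, 4, 2],
     [5, 6, 2],
     [0, 0, 1]]        # A and t packed into one 3x3 matrix

def affine_2x2(x):
    """y = A*x + t, with the translation added as a separate step."""
    return [r + ti for r, ti in zip(mat_vec(A, x), t)]

def affine_3x3(x):
    """Same mapping via B: append a 1, multiply, split off the last entry."""
    y_h = mat_vec(B, x + [1])
    return y_h[:2], y_h[2]

x = [1, 2]                        # sample point (assumption, chosen arbitrarily)
y = affine_2x2(x)                 # 3*1 + 4*2 + 2 = 13, 5*1 + 6*2 + 2 = 19
y_from_b, last = affine_3x3(x)    # same two values, plus the trailing 1
print(y, y_from_b, last)
```

Both routes give the same point, and the third component of B*x' stays at 1, which is all the last row is doing here.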

It could be argued, then, that a 2x3 matrix would be enough for that. That's partially true, but has one significant disadvantage: you lose composability. Suppose you are basically satisfied with B, but want to mirror one coordinate. Then you can choose another transformation matrix

    /-1 0 0\
C = | 0 1 0|
    \ 0 0 1/

and have a result

y'' = C*B*x' = <-3a-4b-2; 5a+6b+2; 1>T

This simple multiplication could not be done that easily with 2x3 matrices, simply because of the properties of matrix multiplication.
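To make the composability point concrete, here is a small sketch in plain Python. It assumes the same B and C as above and an arbitrary sample point (1, 2); `mat_mul` and `mat_vec` are hypothetical helpers written just for this example.

```python
def mat_mul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

B = [[3, 4, 2],
     [5, 6, 2],
     [0, 0, 1]]

C = [[-1, 0, 0],
     [ 0, 1, 0],
     [ 0, 0, 1]]       # mirrors the first coordinate

CB = mat_mul(C, B)     # one matrix that does "B, then C"
# CB has rows [-3, -4, -2], [5, 6, 2], [0, 0, 1]

x_h = [1, 2, 1]                               # sample point with the 1 appended
step_by_step = mat_vec(C, mat_vec(B, x_h))    # apply B, then C
combined = mat_vec(CB, x_h)                   # apply the pre-multiplied matrix
print(CB, step_by_step, combined)
```

The two results agree, so a whole chain of transformations can be collapsed into a single 3x3 matrix up front. With 2x3 matrices the product C*B is not even defined, which is why the extra row is kept.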

In principle, for an affine transformation like the one above, the last row (the XXX) could also be anything else of the form <0;0;x>, since it is only there to drop the point values. It does, however, need to be exactly <0;0;1> to make composition by multiplication work.

Finally, Wikipedia seems quite informative to me in this case.

Gimp answered 22/8, 2012 at 13:45 Comment(1)

3x3 matrix is in homogeneous coordinates en.wikipedia.org/wiki/Homogeneous_coordinates – Xylidine 24/7, 2013 at 12:3

First of all, affine transformations are those that preserve straight lines, and they can be of arbitrary dimensionality.

A homography describes the mapping between two planes, or what happens to the image during pure camera rotation.

The last row represents the projective (perspective) part: the transformed coordinates are divided by a term that is a function of both x and y.
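To see what a last row other than 0 0 1 does in a general homography, here is a hedged sketch in plain Python. The matrix values and the sample point are made up for illustration, and `apply_homography` is a hypothetical helper, not an EMGU or OpenCV function. It uses the usual homogeneous-coordinate rule: multiply by the 3x3 matrix, then divide by the third component.

```python
def apply_homography(h, x, y):
    """Map the 2D point (x, y) through the 3x3 matrix h (list of rows)."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]   # this is where the last row acts
    if w == 0:
        raise ValueError("point maps to infinity")
    return xh / w, yh / w                     # divide by the third component

# Pure translation: last row is 0 0 1, so w is always 1.
H_affine = [[1, 0, 5],
            [0, 1, 7],
            [0, 0, 1]]

# Same matrix with one non-zero entry in the last row (made-up value).
H_persp = [[1,   0, 5],
           [0,   1, 7],
           [0.5, 0, 1]]

p_affine = apply_homography(H_affine, 2, 3)   # w = 1   -> (7.0, 10.0)
p_persp = apply_homography(H_persp, 2, 3)     # w = 2.0 -> (3.5, 5.0)
print(p_affine, p_persp)
```

With the last row 0 0 1 the divisor is always 1 and the mapping is affine. As soon as the first two entries of the last row are non-zero, the divisor changes from point to point, which is what produces the perspective effect.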

Shuttlecock answered 12/2, 2013 at 0:11 Comment(0)
