Why do I have to divide by Z?

Asked 8/1, 2014 at 22:8 Answered 15/10, 2021 at 15:16

Solved c++ opengl matrix-multiplication glm-math

I needed to implement 'choosing an object' in a 3D environment. So instead of going with a robust, accurate approach such as raycasting, I decided to take the easy way out. First, I transform the object's world position into screen coordinates:

glm::mat4 modelView, projection, accum;
glGetFloatv(GL_PROJECTION_MATRIX, (GLfloat*)&projection);
glGetFloatv(GL_MODELVIEW_MATRIX, (GLfloat*)&modelView);
accum = projection * modelView;
glm::vec4 transformed = accum * glm::vec4(objectLocation, 1.0f); // mat4 * vec4 yields a vec4

This is followed by some trivial code that maps the OpenGL coordinate system to normal window coordinates and does a simple distance-from-the-mouse check. But that doesn't quite work. To get from world space to screen space, I need one more calculation added to the end of the code shown above:

transformed.x /= transformed.z;
transformed.y /= transformed.z;

I don't understand why I have to do this. I was under the impression that once you multiply a vertex by the accumulated modelViewProjection matrix, you have your screen coordinates. But I have to divide by Z to get it to work properly. In my OpenGL 3.3 shaders, I never have to divide by Z. Why is this?

EDIT: The code to transform from the OpenGL coordinate system to screen coordinates is this:

int screenX = (int)((trans.x + 1.f)*640.f); //640 = 1280/2
int screenY = (int)((-trans.y + 1.f)*360.f); //360 = 720/2

And then I test if the mouse is near that point by doing:

float length = glm::distance(glm::vec2(screenX, screenY), glm::vec2(mouseX, mouseY));
if(length < 50) {//you can guess the rest

EDIT #2
This method is called upon a mouse click event:

glm::mat4 modelView;
glm::mat4 projection;
glm::mat4 accum;
glGetFloatv(GL_PROJECTION_MATRIX, (GLfloat*)&projection);
glGetFloatv(GL_MODELVIEW_MATRIX, (GLfloat*)&modelView);
accum = projection * modelView;
float nearestDistance = 1000.f;
gameObject* nearest = NULL;
for(uint i = 0; i < objects.size(); i++) {
    gameObject* o = objects[i];
    o->selected = false;
    glm::vec4 trans = accum * glm::vec4(o->location,1);
    trans.x /= trans.z;
    trans.y /= trans.z;
    int clipX = (int)((trans.x+1.f)*640.f);
    int clipY = (int)((-trans.y+1.f)*360.f);
    float length = glm::distance(glm::vec2(clipX,clipY), glm::vec2(mouseX, mouseY));
    if(length<50) {
        nearestDistance = trans.z;
        nearest = o;
    }
}
if(nearest) {
    nearest->selected = true;
}

mouseRightPressed = true;

The code as a whole is incomplete, but the parts relevant to my question work fine. The 'objects' vector contains only one element for my tests, so the loop doesn't get in the way at all.

Totality answered 8/1, 2014 at 22:8 Comment(1)

can you show the trivial code? – Unscreened 8/1, 2014 at 22:21

I've figured it out. As Mr David Lively pointed out,

Typically in this case you'd divide by .w instead of .z to get something useful, though.

My .w values were very close to my .z values, so in my code I changed the statements:

transformed.x /= transformed.z;
transformed.y /= transformed.z;

to:

transformed.x /= transformed.w;
transformed.y /= transformed.w;

And it still worked just as before.

https://mcmap.net/q/1776743/-perspecitve-divide-in-vertex-shader explains that the division by w is done later in the pipeline. Obviously, because my code simply multiplies the matrices together on the CPU, there is no 'later pipeline' to do it for me. I was just getting lucky in a sense: my .z value was so close to my .w value that there was the illusion that it was working.
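
Here is a minimal sketch of doing that 'later pipeline' step by hand on the CPU, assuming a gluPerspective-style projection and the 1280x720 window from the question. It is plain C++ so that it stands alone; `Mat4`, `Vec4`, `Pixel`, `perspective`, `transform`, and `clipToWindow` are illustrative names, not glm or OpenGL API.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Column-major 4x4 matrix, indexed m[column][row] like OpenGL and glm.
using Mat4 = std::array<std::array<float, 4>, 4>;
struct Vec4 { float x, y, z, w; };
struct Pixel { int x, y; };

// Builds the same matrix that gluPerspective(fovyDegrees, aspect, zNear, zFar) documents.
Mat4 perspective(float fovyDegrees, float aspect, float zNear, float zFar) {
    const float f = 1.0f / std::tan(fovyDegrees * 3.14159265358979f / 360.0f);
    Mat4 m{};  // all zeros
    m[0][0] = f / aspect;
    m[1][1] = f;
    m[2][2] = (zFar + zNear) / (zNear - zFar);
    m[2][3] = -1.0f;  // copies -z_eye into clip-space w
    m[3][2] = 2.0f * zFar * zNear / (zNear - zFar);
    return m;
}

// Matrix * column vector.
Vec4 transform(const Mat4& m, const Vec4& v) {
    return {
        m[0][0] * v.x + m[1][0] * v.y + m[2][0] * v.z + m[3][0] * v.w,
        m[0][1] * v.x + m[1][1] * v.y + m[2][1] * v.z + m[3][1] * v.w,
        m[0][2] * v.x + m[1][2] * v.y + m[2][2] * v.z + m[3][2] * v.w,
        m[0][3] * v.x + m[1][3] * v.y + m[2][3] * v.z + m[3][3] * v.w,
    };
}

// Clip space -> normalized device coordinates (divide by w) -> window pixels
// with the origin at the top-left, as in the question's screenX/screenY code.
Pixel clipToWindow(const Vec4& clip, int width, int height) {
    assert(clip.w > 0.0f);  // point must be in front of the camera
    const float ndcX = clip.x / clip.w;
    const float ndcY = clip.y / clip.w;
    return {
        static_cast<int>((ndcX + 1.0f) * 0.5f * width),
        static_cast<int>((1.0f - ndcY) * 0.5f * height),
    };
}
```

For a point 10 units straight ahead of the camera with gluPerspective(75.f, 1280.f/720.f, 0.1f, 500.f), `transform` gives a clip-space w of 10 and a z of about 9.8, which is why dividing by .z looked almost right; `clipToWindow` then puts that point at (640, 360).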

Totality answered 8/1, 2014 at 23:12 Comment(0)

The divide-by-Z step effectively applies the perspective transformation. Without it, you'd have an isometric-style (orthographic) view. Imagine two view-space vertices: A(-1,0,1) and B(-1,0,100).

Without the divide-by-Z step, both end up at the same screen coordinates, (-1,0).

With the divide-by-Z, they are different: A(-1,0) and B(-0.01,0). So, things farther away from the view-space origin (camera) are smaller in screen space than things that are closer. I.e., perspective.
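
As a rough sketch of the two-vertex example above (plain C++, no OpenGL or glm; `ViewPoint`, `ScreenPoint`, and `divideByDepth` are names made up for illustration), dividing x and y by the depth is what separates A from B:

```cpp
#include <cassert>

// Hypothetical helper types, for illustration only.
struct ViewPoint { float x, y, z; };  // view-space position
struct ScreenPoint { float x, y; };   // normalized screen position

// Perspective by hand: scale x and y by 1/depth.
ScreenPoint divideByDepth(const ViewPoint& p) {
    assert(p.z != 0.0f);  // a point at the camera itself has no projection
    return { p.x / p.z, p.y / p.z };
}
```

With A(-1,0,1) and B(-1,0,100), `divideByDepth` returns (-1,0) and (-0.01,0) respectively.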

That said: if your projection matrix (and matrix multiplication code) is correct, the setup for this is already in place, as the projection matrix copies the view-space depth into .w so that a later divide by .w does the 1/Z scaling. So, some questions:

1. Are you really using the output of a projection transform, or just the view transform?
2. Are you doing this in a pixel/fragment shader? Screen coordinates there are normalized (-1,-1) to (+1,+1), not pixel coordinates, with the origin at the middle of the viewport. Typically in this case you'd divide by .w instead of .z to get something useful, though.
3. If you're doing this on the CPU, how are you getting this information back to the host?

Lisp answered 8/1, 2014 at 22:26 Comment(6)

1: I am using both, as you can see from me composing the GL_MODELVIEW_MATRIX and GL_PROJECTION_MATRIX into one glm::mat4 2: I have no shaders at all, I am using gl 1.1 I think? I'll try dividing by w, but it works as is, which is what is puzzling me. 3: I don't understand the question – Totality 8/1, 2014 at 22:47

@Totality it'd be really, really useful if you'd post the whole method. Assuming it's not 500 lines long. Which would be another problem. – Lisp 8/1, 2014 at 22:49

@Totality okay, looks fine at first glance. This may be silly, but how are you creating and assigning the projection matrix? Can I assume that everything is drawing correctly? – Lisp 8/1, 2014 at 22:56

Yeah, everything draws just fine. I call glMatrixMode(GL_PROJECTION); glLoadIdentity(); gluPerspective(75.f, 1280.f/720.f, 0.1f, 500.f); glMatrixMode(GL_MODELVIEW); and then assign my camera transforms. – Totality 8/1, 2014 at 22:58

I've figured it out and answered my own question. Thank you for the insight into the problem though! – Totality 8/1, 2014 at 23:13

@Totality glad for whatever help I provided. Thanks for posting an answer (other than "Thanks! Got it!" xkcd.com/979). – Lisp 8/1, 2014 at 23:22

I guess it is because you are going from three dimensions to two, so you are normalizing the 3D world onto 2D coordinates.

P = (X,Y,Z) in 3D will be q = (x,y) in 2D where x=X/Z and y = Y/Z

So a circle in 3D will not necessarily be a circle in 2D.

You can check this video out: https://www.youtube.com/watch?v=fVJeJMWZcq8

I hope I understand your question correctly.

Communize answered 15/10, 2021 at 15:16 Comment(1)

You're not wrong, but that's not the nature of the question. I'm aware that a divide by Z is necessary, but my question is why divide by Z didn't quite work correctly. In graphics pipelines, divide by W is the correct way to do it. This is so the Z value can be normalized into a correct range as well as the X and Y. – Totality 27/10, 2021 at 20:38
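
To illustrate the point about depth, here is a small sketch assuming the gluPerspective(75.f, 1280.f/720.f, 0.1f, 500.f) setup mentioned earlier in the thread. `ClipZW`, `clipDepth`, and `ndcDepth` are hypothetical helper names, not part of OpenGL or glm.

```cpp
#include <cassert>

// Clip-space z and w for a point `depth` units straight ahead of the camera
// (eye-space z = -depth), using the z and w rows of a gluPerspective-style matrix.
struct ClipZW { float z, w; };

ClipZW clipDepth(float depth, float zNear, float zFar) {
    assert(depth > 0.0f);  // point must be in front of the camera
    const float a = (zFar + zNear) / (zNear - zFar);
    const float b = 2.0f * zFar * zNear / (zNear - zFar);
    return { a * -depth + b, depth };
}

// Normalized device depth: clip z divided by clip w.
float ndcDepth(float depth, float zNear, float zFar) {
    const ClipZW c = clipDepth(depth, zNear, zFar);
    return c.z / c.w;
}
```

Dividing by w maps the near plane (depth 0.1) to about -1 and the far plane (depth 500) to about +1, whereas dividing z by itself would give 1 everywhere and carry no depth information.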
