Implementing Ray Picking

6

34

I have a renderer using DirectX and OpenGL, and a 3D scene. The viewport and the window have the same dimensions.

How do I implement picking, given mouse coordinates x and y, in a platform-independent way?

Toaster answered 19/1, 2010 at 11:34 Comment(1)
Take a look at this guide, it might be helpful. - Rowan
30

If you can, do the picking on the CPU by calculating a ray from the eye through the mouse pointer and intersect it with your models.

If this isn't an option I would go with some type of ID rendering. Assign each object you want to pick a unique color, render the objects with these colors and finally read out the color from the framebuffer under the mouse pointer.
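As a rough illustration of the ID-rendering idea, the sketch below packs an object ID into an 8-bit-per-channel RGB colour and recovers it from the bytes read back under the cursor. The function names `idToColor` and `colorToId` are hypothetical, and the sketch assumes the framebuffer stores exactly 8 bits per channel with no blending, lighting, dithering or anti-aliasing applied to the ID pass.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Pack a 24-bit object ID into an RGB triple (hypothetical helper).
// Assumes an 8-bit-per-channel colour buffer.
std::array<std::uint8_t, 3> idToColor(std::uint32_t id) {
    assert(id < (1u << 24));  // only 24 bits fit into RGB8
    return {{ static_cast<std::uint8_t>(id & 0xFF),
              static_cast<std::uint8_t>((id >> 8) & 0xFF),
              static_cast<std::uint8_t>((id >> 16) & 0xFF) }};
}

// Recover the object ID from the RGB bytes read back under the mouse pointer.
std::uint32_t colorToId(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint32_t>(r)
         | (static_cast<std::uint32_t>(g) << 8)
         | (static_cast<std::uint32_t>(b) << 16);
}
```

Reserving one value (for example 0) for the background makes it easy to tell "nothing picked" apart from a real object.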

EDIT: If the question is how to construct the ray from the mouse coordinates, you need the following: a projection matrix P and the camera transform C (the matrix that takes world space to camera space). If the mouse pointer is at (x, y) and the size of the viewport is (width, height), one position in clip space along the ray is:

mouse_clip = [
  float(x) * 2 / float(width) - 1,
  1 - float(y) * 2 / float(height),
  0,
  1]

(Notice that I flipped the y-axis, since the origin of the mouse coordinates is often in the upper-left corner.)

The following is also true:

mouse_clip = P * C * mouse_worldspace

Which gives:

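mouse_worldspace = inverse(C) * inverse(P) * mouse_clip
mouse_worldspace = mouse_worldspace / mouse_worldspace.w //homogeneous divide, needed because inverse(P) leaves w != 1

We now have:

p = C.position(); //origin of camera in worldspace (if C is a plain view matrix, this is the translation column of inverse(C))
n = normalize(mouse_worldspace.xyz - p); //unit vector from p through mouse pos in worldspace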
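The derivation above can be sketched in C++ roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes OpenGL conventions (column vectors, NDC z running from -1 at the near plane to +1 at the far plane), the types `Vec4`, `Mat4`, `Ray` and the functions `mul`, `inverse`, `mouseRay` are all hypothetical names, and `invViewProj` is assumed to be `inverse(P * C)`. It unprojects the mouse position at both the near and the far plane and performs the divide by w on each, so the ray origin is the point on the near plane rather than the camera position.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <utility>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<Vec4, 4>;  // m[row][col], used with column vectors

// Matrix * column vector.
Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) r[i] += m[i][j] * v[j];
    return r;
}

// Gauss-Jordan inverse with partial pivoting (asserts that m is invertible).
Mat4 inverse(Mat4 m) {
    Mat4 inv{};
    for (int i = 0; i < 4; ++i) inv[i][i] = 1.0;
    for (int c = 0; c < 4; ++c) {
        int p = c;  // row with the largest pivot candidate in column c
        for (int r = c + 1; r < 4; ++r)
            if (std::fabs(m[r][c]) > std::fabs(m[p][c])) p = r;
        assert(std::fabs(m[p][c]) > 1e-12);
        std::swap(m[c], m[p]);
        std::swap(inv[c], inv[p]);
        const double d = m[c][c];
        for (int j = 0; j < 4; ++j) { m[c][j] /= d; inv[c][j] /= d; }
        for (int r = 0; r < 4; ++r) {
            if (r == c) continue;
            const double f = m[r][c];
            for (int j = 0; j < 4; ++j) {
                m[r][j] -= f * m[c][j];
                inv[r][j] -= f * inv[c][j];
            }
        }
    }
    return inv;
}

struct Ray {
    std::array<double, 3> origin;  // point on the near plane, world space
    std::array<double, 3> dir;     // unit direction, world space
};

// (x, y): mouse position with the origin in the upper-left corner.
Ray mouseRay(double x, double y, double width, double height,
             const Mat4& invViewProj) {
    assert(width > 0.0 && height > 0.0);
    const double nx = 2.0 * x / width - 1.0;
    const double ny = 1.0 - 2.0 * y / height;  // flip y
    const Vec4 nearClip{{nx, ny, -1.0, 1.0}};
    const Vec4 farClip{{nx, ny, 1.0, 1.0}};
    const Vec4 a = mul(invViewProj, nearClip);
    const Vec4 b = mul(invViewProj, farClip);
    Ray ray{};
    double len = 0.0;
    for (int i = 0; i < 3; ++i) {
        ray.origin[i] = a[i] / a[3];               // divide by w
        ray.dir[i] = b[i] / b[3] - ray.origin[i];  // far point minus near point
        len += ray.dir[i] * ray.dir[i];
    }
    len = std::sqrt(len);
    for (int i = 0; i < 3; ++i) ray.dir[i] /= len;
    return ray;
}
```

Because both points are unprojected rather than relying on a camera position, the same sketch should also cover orthographic projections, where every ray has the same direction but a different origin. For DirectX-style NDC the near plane would be at z = 0 instead of -1.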
Poetics answered 19/1, 2010 at 11:42 Comment(6)
@Tom That wasn't totally clear from the question. Anyways, I've edited my answer, hope it's some help. - Poetics
It's worth noting that if you use DirectX-style matrices, the multiplication order is reversed. - Biskra
thanks :) I should've been clearer at first, but your edited answer is what I wanted! - Toaster
I'm not sure if the error is in your work or mine, but I could only get this algorithm to work when I used the negative value of the near clipping plane for the z coordinate of mouse_clip. I.e.: mouse_clip = [float(x) * 2 / float(width) - 1, 1 - float(y) * 2 / float(height), -1 * near_clipping_plane, 1] - Oviparous
Just to save others the trouble: this method only works if the third coordinate in mouse_clip is not 0 but -near_depth. Additionally, for an orthographic projection matrix, p has to be computed differently. - Pirali
Should mouseWorldSpace be divided by its own w coordinate, since it has a perspective transformation applied to it? - Tenatenable
26

Here's the viewing frustum:

[Image: viewing frustum]

First you need to determine where on the nearplane the mouse click happened:

  1. rescale the window coordinates (0..640,0..480) to [-1,1], with (-1,-1) at the bottom-left corner and (1,1) at the top-right.
  2. 'undo' the projection by multiplying the scaled coordinates by what I call the 'unview' matrix: unview = (P * M).inverse() = M.inverse() * P.inverse(), where M is the ModelView matrix and P is the projection matrix.

Then determine where the camera is in worldspace, and draw a ray starting at the camera and passing through the point you found on the nearplane.

The camera is at M.inverse().col(4), i.e. the final column of the inverse ModelView matrix.

Final pseudocode:

normalised_x = 2.0 * mouse_x / win_width - 1.0
normalised_y = 1.0 - 2.0 * mouse_y / win_height
// note the y pos is inverted, so +y is at the top of the screen

unviewMat = (projectionMat * modelViewMat).inverse()

// z = -1 is the near plane in OpenGL normalised device coordinates
near_point = unviewMat * Vec(normalised_x, normalised_y, -1, 1)
near_point = near_point / near_point.w  // homogeneous divide
camera_pos = ray_origin = modelViewMat.inverse().col(4)
ray_dir = near_point - camera_pos
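If the ModelView matrix is a pure rotation plus translation (no scaling or shearing), the camera position can also be obtained without a general matrix inverse. The sketch below illustrates this under that assumption; the name `cameraPosition` and the `Mat4` layout (`m[row][col]`, column-vector convention) are hypothetical and not part of the answer above.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;  // m[row][col]

// For a rigid view matrix [R | t], the camera position in world space is
// -transpose(R) * t, which equals the last column of the inverse matrix.
// Assumption: R is a pure rotation (orthonormal), i.e. no scale or shear.
std::array<double, 3> cameraPosition(const Mat4& view) {
    // The bottom row of an affine view matrix should be (0, 0, 0, 1).
    assert(std::fabs(view[3][0]) < 1e-12 && std::fabs(view[3][1]) < 1e-12 &&
           std::fabs(view[3][2]) < 1e-12 && std::fabs(view[3][3] - 1.0) < 1e-12);
    std::array<double, 3> pos{};
    for (int i = 0; i < 3; ++i)
        for (int k = 0; k < 3; ++k)
            pos[i] -= view[k][i] * view[k][3];  // -(R^T)[i][k] * t[k]
    return pos;
}
```

This avoids one 4x4 inversion per pick, although for a single mouse click the cost difference is unlikely to matter.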
Reduplication answered 7/6, 2011 at 11:43 Comment(2)
What is that "modelView" matrix you refer to? Is that the combination of the modelToWorld matrix of the model we're trying to hit and the camera's viewMatrix? - Angevin
It's been a while since I wrote this, but I think it's the matrix that transforms world coordinates into camera coordinates. If there's a line in your vertex shader like gl_Position = projection * modelView * vertexPos;, it's the bit in the middle, where the projection matrix is the transformation from camera to viewport coordinates. HTH :/ - Reduplication
2

Well, it's pretty simple; the theory behind this is always the same:

1) Unproject your 2D coordinate into 3D space twice: once at Min Z and once at Max Z. (Each API has its own function, but you can implement your own if you want.)

2) With these two points, calculate the vector that goes from the Min Z point to the Max Z point.

3) With that vector and the Min Z point, build the ray that goes from Min Z to Max Z.

4) Now you have a ray; with this you can do a ray-triangle/ray-plane/ray-something intersection and get your result.
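For step 4, one common choice for the ray-triangle case is the Möller-Trumbore test. The sketch below is only an illustration of that technique, not something the steps above prescribe; `Vec3`, `sub`, `cross`, `dot` and `intersectTriangle` are hypothetical names, and the epsilon is an arbitrary assumption.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) {
    return {{a[0] - b[0], a[1] - b[1], a[2] - b[2]}};
}
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {{a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0]}};
}
double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Moller-Trumbore ray/triangle test. On a hit, returns true and writes the
// distance along the ray (in units of |dir|) to t.
bool intersectTriangle(const Vec3& orig, const Vec3& dir,
                       const Vec3& v0, const Vec3& v1, const Vec3& v2,
                       double& t) {
    assert(dot(dir, dir) > 0.0);  // direction must be non-zero
    const double eps = 1e-9;      // assumed tolerance for "parallel"
    const Vec3 e1 = sub(v1, v0);
    const Vec3 e2 = sub(v2, v0);
    const Vec3 p = cross(dir, e2);
    const double det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray parallel to the triangle
    const double inv = 1.0 / det;
    const Vec3 s = sub(orig, v0);
    const double u = dot(s, p) * inv;        // first barycentric coordinate
    if (u < 0.0 || u > 1.0) return false;
    const Vec3 q = cross(s, e1);
    const double v = dot(dir, q) * inv;      // second barycentric coordinate
    if (v < 0.0 || u + v > 1.0) return false;
    const double dist = dot(e2, q) * inv;
    if (dist < 0.0) return false;            // triangle is behind the origin
    t = dist;
    return true;
}
```

To pick an object, one would typically run this over the candidate triangles and keep the hit with the smallest t, usually after a cheaper bounding-volume test.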

Congratulatory answered 19/1, 2010 at 21:35 Comment(0)
1

I have little DirectX experience, but I'm sure it's similar to OpenGL. What you want is the gluUnProject call.

Assuming you have a valid Z buffer, you can read the depth at the mouse position and unproject it with:

// obtain the viewport, modelview matrix and projection matrix
// you may keep the viewport and projection matrices throughout the program if you don't change them
GLint viewport[4];
GLdouble modelview[16];
GLdouble projection[16];
glGetIntegerv(GL_VIEWPORT, viewport);
glGetDoublev(GL_MODELVIEW_MATRIX, modelview);
glGetDoublev(GL_PROJECTION_MATRIX, projection);

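// obtain the Z position (not world coordinates but in range 0 - 1)
// note: window coordinates have their origin at the bottom-left, so if y_cursor
// comes from a top-left-origin mouse event, use viewport[3] - y_cursor - 1 instead
GLfloat z_cursor;
glReadPixels(x_cursor, y_cursor, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT, &z_cursor);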

// obtain the world coordinates
GLdouble x, y, z;
gluUnProject(x_cursor, y_cursor, z_cursor, modelview, projection, viewport, &x, &y, &z);

If you don't want to use GLU, you can also implement gluUnProject yourself; its functionality is relatively simple and is described at opengl.org.
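Two details of the snippet above are easy to get wrong: the window y coordinate and the meaning of the depth value. The sketch below illustrates both under stated assumptions: the mouse origin is in the upper-left corner, the projection is a standard OpenGL perspective matrix (as built by glFrustum or gluPerspective), and the depth range is the default [0, 1]. The names `flipWindowY` and `linearizeDepth` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Convert a top-left-origin mouse y into the bottom-left-origin window y
// that glReadPixels and gluUnProject expect.
int flipWindowY(int mouseY, int viewportHeight) {
    assert(viewportHeight > 0);
    return viewportHeight - 1 - mouseY;
}

// Convert a depth-buffer value in [0, 1] into the positive distance from the
// camera along the view axis. Assumes a standard perspective projection and
// the default glDepthRange(0, 1).
double linearizeDepth(double depth, double zNear, double zFar) {
    assert(zNear > 0.0 && zFar > zNear);
    const double zNdc = 2.0 * depth - 1.0;  // window depth -> NDC z in [-1, 1]
    return 2.0 * zNear * zFar / (zFar + zNear - zNdc * (zFar - zNear));
}
```

A depth value of exactly 1.0 usually means nothing was drawn under the cursor, so it is worth checking for that before treating the unprojected point as a hit.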

Gipon answered 20/1, 2010 at 8:58 Comment(2)
@Tom as I said, if you don't want to use the glu function you can just implement its functionality yourself; all you would then need is to get the modelview and projection matrices for each and get the z window position for each. - Gipon
which would require me to figure out the same xyz. I could then post that xyz and mark that as the answer instead of this one, you see my reasoning? Someone else posted the math anyways. - Toaster
0

Ok, this topic is old, but it was the best I found on the subject and it helped me a bit, so I'll post here for those who are following ;-)

This is the way I got it to work without having to compute the inverse of the projection matrix:

void Application::leftButtonPress(u32 x, u32 y){
    GL::Viewport vp = GL::getViewport(); // just a call to glGet GL_VIEWPORT
    vec3f p = vec3f::from(
        ((float)(vp.width - x) / (float)vp.width),
        ((float)y / (float)vp.height),
        1.);
    // alternatively vec3f p = vec3f::from(                        
    //      ((float)x / (float)vp.width),
    //      ((float)(vp.height - y) / (float)vp.height),
    //      1.);

    p *= vec3f::from(APP_FRUSTUM_WIDTH, APP_FRUSTUM_HEIGHT, 1.);
    p += vec3f::from(APP_FRUSTUM_LEFT, APP_FRUSTUM_BOTTOM, 0.);

    // now p elements are in (-1, 1)
    vec3f near = p * vec3f::from(APP_FRUSTUM_NEAR);
    vec3f far = p * vec3f::from(APP_FRUSTUM_FAR);

    // ray in world coordinates
    Ray ray = { _camera->getPos(), -(_camera->getBasis() * (far - near).normalize()) };

    _ray->set(ray.origin, ray.dir, 10000.); // this is a debugging vertex array to see the Ray on screen

    Node* node = _scene->collide(ray, Transform());
    cout << "node is : " << node << endl;
}

This assumes a perspective projection, but the question never arises for the orthographic one in the first place.
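The same idea (building the ray directly from the frustum shape instead of inverting the projection matrix) can be sketched in a self-contained form as below. This is not the code above rewritten: it assumes a symmetric perspective frustum described by a vertical field of view, an aspect ratio equal to width / height, and an OpenGL-style camera looking down -Z with +Y up. The name `eyeRayDirection` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Unit ray direction in camera (eye) space through the pixel (x, y), where
// (x, y) has its origin in the upper-left corner of the viewport.
Vec3 eyeRayDirection(double x, double y, double width, double height,
                     double fovyRadians) {
    assert(width > 0.0 && height > 0.0);
    const double tanHalf = std::tan(fovyRadians / 2.0);
    const double nx = 2.0 * x / width - 1.0;   // [-1, 1], left to right
    const double ny = 1.0 - 2.0 * y / height;  // [-1, 1], bottom to top
    Vec3 d{{nx * tanHalf * (width / height), ny * tanHalf, -1.0}};
    const double len = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
    for (double& c : d) c /= len;
    return d;
}
```

Rotating the result by the camera's orientation (its basis matrix) and pairing it with the camera position gives the world-space ray.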

Hock answered 29/4, 2012 at 16:24 Comment(0)
0

I've got the same situation with ordinary ray picking, but something is wrong. I've performed the unproject operation the proper way, but it just doesn't work. I think I've made a mistake somewhere, but I can't figure out where. My matrix multiplication, inverse, and vector-by-matrix multiplication all seem to work fine; I've tested them. In my code I'm reacting to WM_LBUTTONDOWN, so lParam returns the [Y][X] coordinates as 2 words in a dword. I extract them, then convert to normalized space; I've checked that this part also works fine. When I click the lower-left corner I get values close to -1 -1, and good values for the other 3 corners. I then use the line_points.vtx array for debugging, and the result isn't even close to reality.

unsigned int x_coord=lParam&0x0000ffff; //X RAW COORD
unsigned int y_coord=client_area.bottom-(lParam>>16); //Y RAW COORD

double xn=((double)x_coord/client_area.right)*2-1; //X [-1 +1]
double yn=1-((double)y_coord/client_area.bottom)*2;//Y [-1 +1]

_declspec(align(16))gl_vec4 pt_eye(xn,yn,0.0,1.0); 
gl_mat4 view_matrix_inversed;
gl_mat4 projection_matrix_inversed;
cam.matrixProjection.inverse(&projection_matrix_inversed);
cam.matrixView.inverse(&view_matrix_inversed);

gl_mat4::vec4_multiply_by_matrix4(&pt_eye,&projection_matrix_inversed);
gl_mat4::vec4_multiply_by_matrix4(&pt_eye,&view_matrix_inversed);

line_points.vtx[line_points.count*4]=pt_eye.x-cam.pos.x;
line_points.vtx[line_points.count*4+1]=pt_eye.y-cam.pos.y;
line_points.vtx[line_points.count*4+2]=pt_eye.z-cam.pos.z;
line_points.vtx[line_points.count*4+3]=1.0;
Concretion answered 20/11, 2014 at 7:20 Comment(0)
