Calculate surface normals from depth image using neighboring pixels cross product
As the title says, I want to calculate the surface normals of a given depth image by using the cross product of neighbouring pixels. I would like to use OpenCV for this and avoid PCL. However, I do not really understand the procedure, since my knowledge of the subject is quite limited, so I would be grateful if someone could provide some hints. Note that I do not have any other information except the depth image and the corresponding RGB image; in particular, no camera intrinsics matrix K.

Thus, lets say that we have the following depth image:

[depth image]

and I want to find the normal vector at a corresponding point with a corresponding depth value like in the following image:

[depth image with the query point and its depth value marked]

How can I do that using the cross product of the neighbouring pixels? I do not mind if the normals are not highly accurate.

Thanks.


Update:

OK, I tried to follow @timday's answer and port his code to OpenCV, with the following code:

Mat depth = <my_depth_image>; // of type CV_32FC1
Mat normals(depth.size(), CV_32FC3);

// Note: with this nomenclature, x indexes rows and y indexes columns.
// Start at 1 and stop one short of the border to avoid out-of-bounds reads.
for(int x = 1; x < depth.rows - 1; ++x)
{
    for(int y = 1; y < depth.cols - 1; ++y)
    {
        // Central differences of the depth values.
        float dzdx = (depth.at<float>(x + 1, y) - depth.at<float>(x - 1, y)) / 2.0f;
        float dzdy = (depth.at<float>(x, y + 1) - depth.at<float>(x, y - 1)) / 2.0f;

        Vec3f d(-dzdx, -dzdy, 1.0f);
        Vec3f n = normalize(d);

        normals.at<Vec3f>(x, y) = n;
    }
}

imshow("depth", depth / 255);
imshow("normals", normals);

I am getting the correct result shown below (I had to replace double with float and Vec3d with Vec3f; I do not know why that would make any difference, though):

[resulting normals image]
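For reference, the same computation can also be done without an explicit per-pixel loop. This is a NumPy sketch under my own assumptions: `depth` is a 2-D float array (e.g. loaded with IMREAD_UNCHANGED), and the function name `depth_to_normals` is hypothetical:

```python
import numpy as np

def depth_to_normals(depth):
    # Central differences; np.gradient falls back to one-sided differences at
    # the borders, avoiding the out-of-bounds reads of a naive per-pixel loop.
    dzdy, dzdx = np.gradient(depth.astype(np.float32))
    direction = np.dstack((-dzdx, -dzdy, np.ones_like(depth, dtype=np.float32)))
    return direction / np.linalg.norm(direction, axis=2, keepdims=True)
```

For display, remap with `normals * 0.5 + 0.5` so that negative components survive the conversion to an 8-bit image.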

Artie answered 6/1, 2016 at 22:30 Comment(5)
Depends what OpenCV does when you dump an image of vectors. It doesn't look like your XYZ is mapping to RGB, and the positive/negative component range may not map well to 0-255 pixel values without some scaling. That's why my code also includes a simple shading model to produce a greyscale image from the normals.Capillary
Hi @Capillary, I do not think it is an OpenCV problem, because if I load the normals from a Matlab script into OpenCV and imshow() them, I get a nice image compared to the one above.Artie
You might find this useful.Senility
@Senility really interesting link. Thanks.Artie
Thanks for the update. One small correction: in most cases a cv::Mat is accessed as (row, col), i.e. (y, x), while an image point is cv::Point(x, y), so there is a mismatch in your rows being the 'x'. Keeping your loop nomenclature the same, the normal should be 'Vec3f d(-dzdy, -dzdx, 1.0f);'.Forrer

You don't really need to use the cross product for this, but see below.

Consider your range image as a function z(x,y).

The normal to the surface is in the direction (-dz/dx, -dz/dy, 1). (Where by dz/dx I mean the differential: the rate of change of z with x.) The normals are then conventionally normalized to unit length.

Incidentally, if you're wondering where that (-dz/dx,-dz/dy,1) comes from: if you take the two orthogonal tangent vectors in the planes parallel to the x and y axes, those are (1,0,dzdx) and (0,1,dzdy). The normal is perpendicular to the tangents, so it is (1,0,dzdx)X(0,1,dzdy) (where 'X' is the cross product), which is (-dzdx,-dzdy,1). So there's your cross-product-derived normal, but there is little need to compute it so explicitly in code when you can just use the resulting expression for the normal directly.
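To make that derivation concrete, here is a tiny numeric check (a sketch; the slope values are arbitrary):

```python
import numpy as np

dzdx, dzdy = 0.3, -0.7             # arbitrary slopes, just for illustration
tx = np.array([1.0, 0.0, dzdx])    # tangent along x
ty = np.array([0.0, 1.0, dzdy])    # tangent along y
n = np.cross(tx, ty)               # -> array([-0.3, 0.7, 1.0]) == (-dzdx, -dzdy, 1)
```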

Pseudocode to compute a unit-length normal at (x,y) would be something like

dzdx=(z(x+1,y)-z(x-1,y))/2.0;
dzdy=(z(x,y+1)-z(x,y-1))/2.0;
direction=(-dzdx,-dzdy,1.0)
magnitude=sqrt(direction.x**2 + direction.y**2 + direction.z**2)
normal=direction/magnitude

Depending on what you're trying to do, it might make more sense to replace the NaN values with just some large number.
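The pseudocode above can be run directly as a small Python function (a sketch: `z` is a 2-D array of floats indexed z[y][x], and (x, y) must be an interior pixel):

```python
import math

def unit_normal(z, x, y):
    # Central differences of the depth; z is indexed z[y][x].
    dzdx = (z[y][x + 1] - z[y][x - 1]) / 2.0
    dzdy = (z[y + 1][x] - z[y - 1][x]) / 2.0
    direction = (-dzdx, -dzdy, 1.0)
    magnitude = math.sqrt(sum(c * c for c in direction))
    return tuple(c / magnitude for c in direction)
```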

Using that approach, from your range image, I can get this:

[shaded normals computed from the range image]

(I'm then using the normal directions calculated to do some simple shading; note the "steppy" appearance due to the range image's quantization; ideally you'd have higher precision than 8-bit for the real range data).

Sorry, not OpenCV or C++ code, but just for completeness: the complete code which produced that image (GLSL embedded in a Qt QML file; can be run with Qt5's qmlscene) is below. The pseudocode above can be found in the fragment shader's main() function:

import QtQuick 2.2

Image {
  source: 'range.png'  // The provided image

  ShaderEffect {
    anchors.fill: parent
    blending: false

    property real dx: 1.0/parent.width
    property real dy: 1.0/parent.height
    property variant src: parent

    vertexShader: "
      uniform highp mat4 qt_Matrix;
      attribute highp vec4 qt_Vertex;
      attribute highp vec2 qt_MultiTexCoord0;
      varying highp vec2 coord;
      void main() {
        coord=qt_MultiTexCoord0;
        gl_Position=qt_Matrix*qt_Vertex;
      }"

    fragmentShader: "
      uniform highp float dx;
      uniform highp float dy;
      varying highp vec2 coord;
      uniform sampler2D src;
      void main() {
        highp float dzdx=( texture2D(src,coord+vec2(dx,0.0)).x - texture2D(src,coord+vec2(-dx,0.0)).x )/(2.0*dx);
        highp float dzdy=( texture2D(src,coord+vec2(0.0,dy)).x - texture2D(src,coord+vec2(0.0,-dy)).x )/(2.0*dy);
        highp vec3 d=vec3(-dzdx,-dzdy,1.0);
        highp vec3 n=normalize(d);
        highp vec3 lightDirection=vec3(1.0,-2.0,3.0);
        highp float shading=0.5+0.5*dot(n,normalize(lightDirection));
        gl_FragColor=vec4(shading,shading,shading,1.0);
      }"
  }
}
Capillary answered 6/1, 2016 at 23:40 Comment(11)
Thanks for the answer; I am now trying to port your code to OpenCV. However, how would I do the same using the cross product, since I need to do it that way?Artie
in your example you are using 4-neighbours cross product, right? If I wanted to use 8-neighbours then I should apply the same procedure for the diagonal pixels, right?Artie
@theodore: Well one way to think of it is that the "input" to the cross product is two tangent vectors, and the cross-product generates the perpendicular normal. Above, I'm effectively using 4 neighbours as the end-points of two tangent vectors. It's not obvious how to extend that to 8 neighbouring points... however I do know there are more advanced methods which fit splines to more points in the neighbourhood, and then use the normal to the splined surface (example at bib.irb.hr/datoteka/150807.1-0239.pdf ).Capillary
Thanks, I see. Actually what I had in mind was something like: dzdx=(z(x+1,y)-z(x-1,y))/2.0; dzdy=(z(x,y+1)-z(x,y-1))/2.0; direction=(-dzdx,-dzdy,1.0); magnitude=sqrt(direction.x**2 + direction.y**2 + direction.z**2); dzdx1=(z(x+1,y+1)-z(x-1,y-1))/2.0; dzdy1=(z(x-1,y+1)-z(x+1,y-1))/2.0; direction1=(-dzdx1,-dzdy1,1.0); magnitude1=sqrt(direction1.x**2 + direction1.y**2 + direction1.z**2); normal=(direction/magnitude) + (direction1/magnitude1). But I do not know how correct this is.Artie
Hmmm I could imagine doing something like computing one normal using the 4 x-y axis adjacent points, and then another normal using the 4 "diagonal" points... and then averaging those (and renormalizing). Hard to guess how much better that might or might not be than the 4-adjacent-points only version though. If your data is noisy and you're trying to remove the effect of spurious values, some sort of pre-filtering might be a better approach than building a lot of complication into normal generation.Capillary
Ok, so my thought in the comment above is not completely wrong. For the moment it has nothing to do with noisy data; I was just curious what it means to compute the normals from the cross product of 4-neighbour versus 8-neighbour pixels. I do not know whether you would like to add the 8-neighbourhood approach from your comment to your answer above, for completeness for future readers. In any case, thanks a lot for the help and the guidance ;-)Artie
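For future readers, the averaging idea discussed in the comments above might look like this in Python. This is a sketch under my own assumptions (the names `normal_8` and `_unit` are hypothetical): one normal from the axis-aligned neighbours, one from the diagonal neighbours (with the diagonal gradients rotated back into the x/y frame), then an average that is renormalized.

```python
import math

def _unit(v):
    m = math.sqrt(sum(c * c for c in v))
    return tuple(c / m for c in v)

def normal_8(z, x, y):
    # Normal from the 4 axis-aligned neighbours (central differences).
    dzdx = (z[y][x + 1] - z[y][x - 1]) / 2.0
    dzdy = (z[y + 1][x] - z[y - 1][x]) / 2.0
    n0 = _unit((-dzdx, -dzdy, 1.0))
    # Directional derivatives along the two diagonals; each diagonal step
    # covers sqrt(2) pixels, hence the extra factor in the denominator.
    d1 = (z[y + 1][x + 1] - z[y - 1][x - 1]) / (2.0 * math.sqrt(2.0))
    d2 = (z[y + 1][x - 1] - z[y - 1][x + 1]) / (2.0 * math.sqrt(2.0))
    # Rotate the diagonal gradient back into the x/y frame.
    dzdx_d = (d1 - d2) / math.sqrt(2.0)
    dzdy_d = (d1 + d2) / math.sqrt(2.0)
    n1 = _unit((-dzdx_d, -dzdy_d, 1.0))
    # Average the two unit normals and renormalize.
    return _unit(tuple(a + b for a, b in zip(n0, n1)))
```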
Is this also true for range images obtained from a projective projection? I'm unable to make the connection to the following explanation: #30993711 Especially the atan is a mystery to me...Tristis
@ASML: A perspective projection would change things slightly as the rays aren't parallel and a constant "range" value becomes a spherical surface rather than a flat plane. The correction factor might well involve a tangent if done in spherical geometry. You should probably ask a separate question; comments aren't the place.Capillary
@timday: Thanks! I have asked a separate question some time ago, but haven't gotten any answers: #50242319Tristis
In the pseudo code, why is the z component in direction vector set to 1?Hackberry
@Hackberry Because if, for example, dzdx were 1.0 (and dzdy were 0.0), that would be a 45-degree slope and you would want to form a right isosceles triangle (hence z=1). The normal direction lies along the hypotenuse of that triangle, and you want that to have unit length, hence the subsequent normalisation step.Capillary

I think this code (a matrix calculation) is right:

import numpy as np

def normalization(data):
    mo_chang = np.sqrt(np.multiply(data[:,:,0], data[:,:,0])
                       + np.multiply(data[:,:,1], data[:,:,1])
                       + np.multiply(data[:,:,2], data[:,:,2]))
    mo_chang = np.dstack((mo_chang, mo_chang, mo_chang))
    return data / mo_chang

x, y = np.meshgrid(np.arange(0, width), np.arange(0, height))
x = x.reshape([-1])
y = y.reshape([-1])
xyz = np.vstack((x, y, np.ones_like(x)))
pts_3d = np.dot(np.linalg.inv(K), xyz * img1_depth.reshape([-1]))
pts_3d_world = pts_3d.reshape((3, height, width))
f = pts_3d_world[:, 1:height-1, 2:width] - pts_3d_world[:, 1:height-1, 1:width-1]
t = pts_3d_world[:, 2:height, 1:width-1] - pts_3d_world[:, 1:height-1, 1:width-1]
normal_map = np.cross(f, t, axisa=0, axisb=0)  # fixed: the original had np.cross(f, l, ...), but no 'l' is defined
normal_map = normalization(normal_map)
normal_map = normal_map*0.5 + 0.5
alpha = np.full((height-2, width-2, 1), 1.0, dtype="float32")
normal_map = np.concatenate((normal_map, alpha), axis=2)
  1. We should use the camera intrinsics, named 'K'. The values f and t are based on 3D points in camera coordinates.

  2. For the normal vector, (-1,-1,100) and (255,255,100) are the same colour in an 8-bit image, but they are totally different normals, so we should map the normal values to (0,1) via normal_map = normal_map*0.5 + 0.5.
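A minimal, self-contained usage sketch of the pipeline above. All the inputs here are hypothetical stand-ins (a 32x32 image, a made-up pinhole K, and a flat synthetic depth map), chosen only so the result is easy to check:

```python
import numpy as np

# Hypothetical stand-ins: image size, pinhole intrinsics, flat depth map 2 m away.
height, width = 32, 32
K = np.array([[100.0,   0.0, width / 2.0],
              [  0.0, 100.0, height / 2.0],
              [  0.0,   0.0, 1.0]])
img1_depth = np.full((height, width), 2.0, dtype=np.float32)

# Back-project every pixel to a 3-D point in camera coordinates.
x, y = np.meshgrid(np.arange(width), np.arange(height))
xyz = np.vstack((x.reshape(-1), y.reshape(-1), np.ones(width * height)))
pts_3d_world = (np.linalg.inv(K) @ (xyz * img1_depth.reshape(-1))).reshape((3, height, width))

# Tangent vectors to the right-hand and lower neighbours, then their cross product.
f = pts_3d_world[:, 1:height-1, 2:width] - pts_3d_world[:, 1:height-1, 1:width-1]
t = pts_3d_world[:, 2:height, 1:width-1] - pts_3d_world[:, 1:height-1, 1:width-1]
normal_map = np.cross(f, t, axisa=0, axisb=0)
normal_map /= np.linalg.norm(normal_map, axis=2, keepdims=True)
# Every normal of this fronto-parallel plane points along +z, i.e. (0, 0, 1).
```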

Comments and discussion are welcome.

Sociopath answered 23/8, 2020 at 11:47 Comment(0)
