How to align the Kinect's depth image with the color image

The images produced by the Kinect's color and depth sensors are slightly out of alignment. How can I transform them so that they line up?

Finbur asked 27/7, 2011 at 12:59

The key to this is the call to 'Runtime.NuiCamera.GetColorPixelCoordinatesFromDepthPixel'.
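In isolation, that call maps a single depth-image coordinate, together with the depth reading at that coordinate, to the matching coordinate in the color image. Here is a minimal sketch of just that step, with illustrative variable names (colorFrame stands for the current frame from the video stream; depthX, depthY, and depthValue come from the depth frame):

    // Find the color-image pixel that corresponds to one depth-image pixel.
    int colorX, colorY;
    runtime.NuiCamera.GetColorPixelCoordinatesFromDepthPixel(
        colorFrame.Resolution,     // resolution of the color image
        colorFrame.ViewArea,       // zoom/pan state of the color stream
        depthX, depthY,            // coordinate in the depth image
        depthValue,                // 16-bit depth reading at that coordinate
        out colorX, out colorY);   // receives the matching color coordinate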

Here is an extension method for the Runtime class. It returns a WriteableBitmap that is automatically updated as new frames come in, so using it is really simple:

    kinect = new Runtime();
    kinect.Initialize(RuntimeOptions.UseColor | RuntimeOptions.UseSkeletalTracking | RuntimeOptions.UseDepthAndPlayerIndex);
    kinect.DepthStream.Open(ImageStreamType.Depth, 2, ImageResolution.Resolution320x240, ImageType.DepthAndPlayerIndex);
    kinect.VideoStream.Open(ImageStreamType.Video, 2, ImageResolution.Resolution640x480, ImageType.Color);
    myImageControl.Source = kinect.CreateLivePlayerRenderer(); 

And here's the code itself:

    using System;
    using System.Diagnostics;
    using System.Windows;
    using System.Windows.Media;
    using System.Windows.Media.Imaging;
    using Microsoft.Research.Kinect.Nui;

    public static class RuntimeExtensions
    {
        public static WriteableBitmap CreateLivePlayerRenderer(this Runtime runtime)
        {
            if (runtime.DepthStream.Width == 0)
                throw new InvalidOperationException("Either open the depth stream before calling this method or use the overload which takes in the resolution that the depth stream will later be opened with.");
            return runtime.CreateLivePlayerRenderer(runtime.DepthStream.Width, runtime.DepthStream.Height);
        }

        public static WriteableBitmap CreateLivePlayerRenderer(this Runtime runtime, int depthWidth, int depthHeight)
        {
            PlanarImage depthImage = new PlanarImage();
            WriteableBitmap target = new WriteableBitmap(depthWidth, depthHeight, 96, 96, PixelFormats.Bgra32, null);
            var depthRect = new Int32Rect(0, 0, depthWidth, depthHeight);

            runtime.DepthFrameReady += (s, e) =>
            {
                depthImage = e.ImageFrame.Image;
                Debug.Assert(depthImage.Height == depthHeight && depthImage.Width == depthWidth);
            };

            runtime.VideoFrameReady += (s, e) =>
            {
                // don't do anything if we don't yet have a depth image
                if (depthImage.Bits == null) return;

                byte[] color = e.ImageFrame.Image.Bits;
                int colorWidth = e.ImageFrame.Image.Width;
                int colorHeight = e.ImageFrame.Image.Height;

                byte[] output = new byte[depthWidth * depthHeight * 4];

                // loop over each pixel in the depth image
                int outputIndex = 0;
                for (int depthY = 0, depthIndex = 0; depthY < depthHeight; depthY++)
                {
                    for (int depthX = 0; depthX < depthWidth; depthX++, depthIndex += 2)
                    {
                        // combine the 2 bytes of depth data representing this pixel
                        short depthValue = (short)(depthImage.Bits[depthIndex] | (depthImage.Bits[depthIndex + 1] << 8));

                        // extract the id of a tracked player from the lower 3 bits
                        // of the first byte (0 means no player at this pixel)
                        int player = depthImage.Bits[depthIndex] & 7;

                        // find the pixel in the color image which matches this
                        // coordinate from the depth image
                        int colorX, colorY;
                        runtime.NuiCamera.GetColorPixelCoordinatesFromDepthPixel(
                            e.ImageFrame.Resolution,
                            e.ImageFrame.ViewArea,
                            depthX, depthY,           // depth coordinate
                            depthValue,               // depth value
                            out colorX, out colorY);  // color coordinate

                        // clamp the calculated color location to the bounds of the color image
                        colorX = Math.Max(0, Math.Min(colorX, colorWidth - 1));
                        colorY = Math.Max(0, Math.Min(colorY, colorHeight - 1));

                        // copy the BGR bytes of the matching color pixel, and make the
                        // output pixel opaque only where a tracked player was detected
                        int colorIndex = 4 * (colorX + (colorY * colorWidth));
                        output[outputIndex++] = color[colorIndex + 0];
                        output[outputIndex++] = color[colorIndex + 1];
                        output[outputIndex++] = color[colorIndex + 2];
                        output[outputIndex++] = player > 0 ? (byte)255 : (byte)0;
                    }
                }
                target.WritePixels(depthRect, output, depthWidth * PixelFormats.Bgra32.BitsPerPixel / 8, 0);
            };

            return target;
        }
    }
Footless answered 28/7, 2011 at 0:05
Sadly, that link is throwing a yellow screen of death my way right now. But I am looking into the method you mentioned. – Finbur
@Mr-Bell - I've updated this post with the actual code instead of a link to it. – Footless
This looks like it works. It does seem like calling GetColorPixelCoordinatesFromDepthPixel is killing my framerate, though. – Finbur
Is it possible to call GetColorPixelCoordinatesFromDepthPixel for a small number of calibration corners, and then do interpolation or extrapolation inside your code? Are those misalignments mostly affine? (See the sketch after these comments.) – Wheelsman
@rwong, I don't know - that's a great question. If you post it as a separate question on this site, I'd vote it up. – Footless
@Robert Levy: Please go ahead and ask it yourself. I'm just raising this question out of curiosity, as I do not have a use for the Kinect yet. (If I asked it, I wouldn't have any means to verify the answers.) – Wheelsman
@robert Thanks for your solution; the user looks great in this program with the background removed. – Northwestward
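
Following up on the interpolation idea raised in the comments above: one possible way to reduce the per-pixel cost is to query GetColorPixelCoordinatesFromDepthPixel only on a coarse grid each frame and bilinearly interpolate between the sampled points. This is an untested sketch, not part of the answer above; it assumes the mapping varies smoothly across the image (roughly affine locally), so it will drift slightly near depth discontinuities:

    // Untested sketch: sample the depth-to-color mapping on a coarse grid,
    // then bilinearly interpolate for the pixels in between.
    public static class CoarseDepthToColorMap
    {
        const int Step = 8; // query the exact mapping every 8th depth pixel

        // Build the coarse maps once per depth frame. depthBits is the raw
        // buffer from the depth stream (2 bytes per pixel, packed format).
        public static void Build(Runtime runtime, ImageResolution colorResolution, ImageViewArea viewArea,
                                 byte[] depthBits, int depthWidth, int depthHeight,
                                 out int[,] mapX, out int[,] mapY)
        {
            int gridW = (depthWidth - 1) / Step + 2;
            int gridH = (depthHeight - 1) / Step + 2;
            mapX = new int[gridH, gridW];
            mapY = new int[gridH, gridW];
            for (int gy = 0; gy < gridH; gy++)
                for (int gx = 0; gx < gridW; gx++)
                {
                    int dx = Math.Min(gx * Step, depthWidth - 1);
                    int dy = Math.Min(gy * Step, depthHeight - 1);
                    int depthIndex = 2 * (dy * depthWidth + dx);
                    short depthValue = (short)(depthBits[depthIndex] | (depthBits[depthIndex + 1] << 8));
                    runtime.NuiCamera.GetColorPixelCoordinatesFromDepthPixel(
                        colorResolution, viewArea, dx, dy, depthValue,
                        out mapX[gy, gx], out mapY[gy, gx]);
                }
        }

        // Look up any depth pixel by interpolating the four surrounding samples.
        public static void Lookup(int[,] mapX, int[,] mapY, int x, int y, out int colorX, out int colorY)
        {
            int gx = x / Step, gy = y / Step;
            float fx = (x % Step) / (float)Step;
            float fy = (y % Step) / (float)Step;
            colorX = (int)((1 - fx) * (1 - fy) * mapX[gy, gx] + fx * (1 - fy) * mapX[gy, gx + 1]
                         + (1 - fx) * fy * mapX[gy + 1, gx] + fx * fy * mapX[gy + 1, gx + 1]);
            colorY = (int)((1 - fx) * (1 - fy) * mapY[gy, gx] + fx * (1 - fy) * mapY[gy, gx + 1]
                         + (1 - fx) * fy * mapY[gy + 1, gx] + fx * fy * mapY[gy + 1, gx + 1]);
        }
    }

Build would be called from the DepthFrameReady handler, and Lookup would replace the per-pixel GetColorPixelCoordinatesFromDepthPixel call in the video handler, cutting the number of SDK calls by roughly a factor of Step squared.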

One way to do this is to assume that the color and depth images have similar variations in them, and to cross-correlate the two images (or smaller versions of them).

  • Pre-whiten the images to get at the underlying variations.
  • Cross-correlate the pre-whitened images or smaller versions of them.
  • The peak position of the cross-correlation will tell you the offset in x and y (a minimal sketch follows).
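
Here is a minimal sketch of that idea, assuming both images have already been converted to same-sized grayscale float arrays. Pre-whitening is approximated by subtracting a local box-blur mean, and the correlation peak is found by brute force over a small window of shifts (a frequency-domain implementation would be much faster, but this shows the structure):

    using System;

    public static class ImageAlignment
    {
        // Crude pre-whitening: subtract a local (box) mean so that only the
        // high-frequency variations remain.
        public static float[,] Whiten(float[,] img, int radius)
        {
            int h = img.GetLength(0), w = img.GetLength(1);
            var result = new float[h, w];
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                {
                    double sum = 0; int n = 0;
                    for (int yy = Math.Max(0, y - radius); yy <= Math.Min(h - 1, y + radius); yy++)
                        for (int xx = Math.Max(0, x - radius); xx <= Math.Min(w - 1, x + radius); xx++)
                        { sum += img[yy, xx]; n++; }
                    result[y, x] = img[y, x] - (float)(sum / n);
                }
            return result;
        }

        // Brute-force normalized cross-correlation over a window of shifts;
        // the position of the peak is the estimated (dx, dy) offset of b
        // relative to a.
        public static void EstimateOffset(float[,] a, float[,] b, int maxShift, out int bestDx, out int bestDy)
        {
            int h = a.GetLength(0), w = a.GetLength(1);
            double best = double.NegativeInfinity;
            bestDx = 0; bestDy = 0;
            for (int dy = -maxShift; dy <= maxShift; dy++)
                for (int dx = -maxShift; dx <= maxShift; dx++)
                {
                    double dot = 0, normA = 0, normB = 0;
                    for (int y = Math.Max(0, -dy); y < Math.Min(h, h - dy); y++)
                        for (int x = Math.Max(0, -dx); x < Math.Min(w, w - dx); x++)
                        {
                            float va = a[y, x], vb = b[y + dy, x + dx];
                            dot += va * vb; normA += va * va; normB += vb * vb;
                        }
                    double score = dot / Math.Sqrt(normA * normB + 1e-9);
                    if (score > best) { best = score; bestDx = dx; bestDy = dy; }
                }
        }
    }

Running Whiten over downscaled copies of the color and depth images and then EstimateOffset on the results yields a single global (x, y) translation, which matches the "it might just be an offset" intuition in the comments below.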
Tacye answered 28/7, 2011 at 0:15
Peter, those are interesting articles. However, I think the solution might be significantly more empirical; it might just be an offset or something like that. – Finbur
:-) OK. I'm probably over-thinking it. I've just been reading this sort of stuff... – Tacye
In the factory, each Kinect device is calibrated and the offsets between the cameras are burned into the device's memory. The trick is in finding the right API to make use of that data. Right now the official Kinect SDK only provides one such API, but others are being considered for future releases. – Footless
@Robert: Thanks for the info! Sounds like fun. :-) – Tacye
