Kinect SDK 1 - Depth and Video Space
Written by Mike James   

Converting from depth to video

Converting from depth to video coordinates is simply a matter of projective geometry. What we have are two perspective views of the same scene, so it is perfectly possible to implement a function which converts between them: possible, but not easy to get right.

For this reason the DepthImageFrame object has a set of coordinate conversion methods. In this case we need the MapToColorImagePoint method. It takes the depth image co-ordinates, takes the depth at that point into account, and returns the corresponding location in the video image. All you have to tell the method, in addition to the depth co-ordinates, is the format of the video image.

(Note: you don't have to supply the depth of the pixel to the methods as in the early beta SDK.)

Next we need to compute the video co-ordinates vx,vy from the depth co-ordinates x,y:

ColorImagePoint p = DFrame.MapToColorImagePoint(
    x, y,
    ColorImageFormat.RgbResolution640x480Fps30);

The value is returned as a ColorImagePoint which is basically a structure with an X and Y field. You can unpack the result into vx and vy using:

vx = p.X;
vy = p.Y;

The bad news is that the returned co-ordinates aren't guaranteed to be within the video image. It is perfectly possible for points at the edge of the depth image to map outside the video image because they cannot be seen from the video camera's position. The solution is simply to map such points to the edge of the video image. The limits are Width - 2 and Height - 2, rather than Width - 1 and Height - 1, because the mask operation writes a 2x2 block of video pixels starting at vx,vy:

vx = Math.Max(0, Math.Min(vx,
    VFrame.Width - 2));
vy = Math.Max(0, Math.Min(vy,
    VFrame.Height - 2));
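
The clamp can also be wrapped in a small helper. What follows is only a minimal sketch: the names MaskHelpers, ClampToFrame and blockSize are illustrative assumptions and not part of the Kinect SDK. The blockSize argument reflects the fact that the event handler writes two pixels across and two rows down from the mapped point, so the last usable starting position is two short of the frame size.

```csharp
using System;

static class MaskHelpers
{
    // Hypothetical helper (not part of the Kinect SDK): clamp a mapped
    // co-ordinate so that a block of blockSize pixels starting at the
    // result still fits inside a frame dimension of the given size.
    public static int ClampToFrame(int v, int size, int blockSize)
    {
        return Math.Max(0, Math.Min(v, size - blockSize));
    }
}
```

With this in place, MaskHelpers.ClampToFrame(p.X, VFrame.Width, 2) should give the same value as the expression above.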

Finally we can use vx,vy to do the mask operation as before.

Now if you run the program you will find that the mask doesn't always fit perfectly, but there is no regular shift in its location.

You will also notice that there are areas that are not masked at all. This is simply because no pixel in the depth image mapped to those video co-ordinates. If you want a full mask without holes and other artifacts you are going to have to put in a little more work.
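
One possible way to deal with the untouched areas, offered here as a sketch rather than as the approach used in this article, is to build the mask in video space first and apply it in a separate pass. Every video pixel then starts out hidden and only the pixels that a player depth pixel maps to are switched on. The class VideoMask and its method names are hypothetical. This removes the unmasked areas but can leave small gaps inside the player, which would need further filling.

```csharp
static class VideoMask
{
    // One byte per video pixel. Every entry starts at 0 (hidden), so
    // video pixels that no depth pixel maps to are masked out instead
    // of being left untouched.
    public static byte[] Create(int width, int height)
    {
        return new byte[width * height];
    }

    // Mark the 2x2 block at (vx, vy) as visible. Assumes vx <= width - 2
    // and vy <= height - 2, which the clamp to Width - 2 and
    // Height - 2 provides.
    public static void Stamp(byte[] mask, int width, int vx, int vy)
    {
        int i = vx + vy * width;
        mask[i] = mask[i + 1] = 0xFF;
        mask[i + width] = mask[i + width + 1] = 0xFF;
    }

    // AND every byte of each video pixel with that pixel's mask entry.
    public static void Apply(byte[] pixeldata, byte[] mask,
        int bytesPerPixel)
    {
        for (int i = 0; i < mask.Length; i++)
            for (int k = 0; k < bytesPerPixel; k++)
                pixeldata[i * bytesPerPixel + k] &= mask[i];
    }
}
```

In the event handler you would call Stamp only when player is non-zero, and call Apply once after the loops have finished.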

 

The complete event handler is:

void FramesReady(object sender,
    AllFramesReadyEventArgs e)
{
    // Both frames are disposable, so the using blocks
    // release them on every exit path.
    using (DepthImageFrame DFrame =
        e.OpenDepthImageFrame())
    using (ColorImageFrame VFrame =
        e.OpenColorImageFrame())
    {
        if (DFrame == null || VFrame == null) return;

        short[] depthimage =
            new short[DFrame.PixelDataLength];
        DFrame.CopyPixelDataTo(depthimage);

        byte[] pixeldata =
            new byte[VFrame.PixelDataLength];
        VFrame.CopyPixelDataTo(pixeldata);

        byte player;
        int vx, vy;

        for (int y = 0; y < DFrame.Height; y++)
        {
            for (int x = 0; x < DFrame.Width; x++)
            {
                // 0xFF keeps a player pixel, 0 blanks the rest
                player = (byte)(depthimage[
                    x + y * DFrame.Width] &
                    DepthImageFrame.PlayerIndexBitmask);
                if (player != 0) player = 0xFF;

                // Map the depth pixel into video space
                ColorImagePoint p =
                    DFrame.MapToColorImagePoint(
                        x, y,
                        ColorImageFormat.RgbResolution640x480Fps30);
                vx = p.X;
                vy = p.Y;

                // Clamp so the 2x2 block stays inside the image
                vx = Math.Max(0, Math.Min(vx,
                    VFrame.Width - 2));
                vy = Math.Max(0, Math.Min(vy,
                    VFrame.Height - 2));

                // 8 bytes cover two pixels at 4 bytes per pixel,
                // on row vy and on row vy + 1
                for (int k = 0; k < 8; k++)
                {
                    pixeldata[(vx + vy * VFrame.Width) *
                        VFrame.BytesPerPixel + k] &= player;
                    pixeldata[(vx + (vy + 1) * VFrame.Width) *
                        VFrame.BytesPerPixel + k] &= player;
                }
            }
        }
        pictureBox1.Image = ByteToBitmap(pixeldata,
            VFrame.Width, VFrame.Height);
    }
}
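
The two pieces of arithmetic inside the loops, extracting the player mask from a depth value and finding the byte offset of a video pixel, can be tried out without a Kinect attached. The following is a sketch only: DepthPixel, PlayerMask and ByteOffset are hypothetical names, and the bitmask is written as the literal 7 (the value of DepthImageFrame.PlayerIndexBitmask in SDK 1, where the low three bits hold the player index) so that the code compiles without the SDK.

```csharp
static class DepthPixel
{
    // Stand-in for DepthImageFrame.PlayerIndexBitmask, written out as
    // a literal so this sketch compiles without the Kinect SDK.
    const int PlayerIndexBitmask = 7;

    // Returns 0xFF when the depth value carries a player index,
    // otherwise 0x00, ready to be ANDed with the video bytes.
    public static byte PlayerMask(short depthValue)
    {
        return (depthValue & PlayerIndexBitmask) != 0
            ? (byte)0xFF : (byte)0x00;
    }

    // Offset of the first byte of pixel (x, y) in a packed pixel array.
    public static int ByteOffset(int x, int y, int width,
        int bytesPerPixel)
    {
        return (x + y * width) * bytesPerPixel;
    }
}
```

For a 640-pixel-wide frame at 4 bytes per pixel, ByteOffset(1, 1, 640, 4) is (1 + 640) * 4 = 2564.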

 

There are obviously lots of improvements that can be made to this code and many variations, but this is the basic algorithm for making the connection between depth and video pixels.

 

You can download the code for the Windows Forms version of this program from the CodeBin (note you have to register first).

Practical Windows Kinect in C#
Chapter List

  1. Introduction to Kinect
  2. Getting started with Microsoft Kinect SDK 1
  3. Using the Depth Sensor
  4. The Player Index
  5. Depth and Video Space
  6. Skeletons
  7. The Full Skeleton
  8. A 3D Point Cloud
