Kinect SDK 1.0 – 4 – Kinect in depth with the DepthStream

20 April 2012 12:04 by Renaud   //   Comments (15)
 1. Introduction to the API
 2. Use the ColorImageStream
 3. Track the users with the SkeletonStream
 4. Kinect in depth!
 5. Speech recognition

This article will show you how to use one of the Kinect's most notable capabilities: its ability to give you a 3-dimensional view of the space in front of it.

Maybe it seems trivial today, because you can just go to the cinema to watch a 3D movie, but the technology used by the Kinect isn't the same as the one used in cinemas. For a 3D movie, you need two cameras capturing two images from slightly different positions, just like your eyes do. Your brain then processes those images and extracts a sense of depth from them.

The Kinect, however, can't give you two images! Instead, it uses its infrared sensor to calculate the depth of each pixel that it sees.

So with the small project below, we will see how to get a DepthImageFrame and how to use it to produce a 3-dimensional video. At the same time, we will see how to use polling instead of the event model used in the previous examples.

You can download the sources right here:

XnaKinect - 3D vision using DepthStream

For that project, we will use the XNA 4.0 Framework. If you're not familiar with it, it's not a problem: me neither ;)

What you need to know is that an XNA application runs inside a loop, so we don't have to add event handlers to receive the frames. Instead, we will ask the Kinect for a new frame directly, each time we go through the loop.
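For comparison, here is a rough sketch of the two approaches, assuming kinect is a started KinectSensor with its DepthStream enabled; the event-based version is the one used in the previous articles of this series, and you typically pick one approach or the other, not both at once:

            // Event model (previous articles): the SDK pushes each new frame to a handler.
            kinect.DepthFrameReady += (sender, e) =>
            {
                using (DepthImageFrame pushedFrame = e.OpenDepthImageFrame())
                {
                    if (pushedFrame != null)
                    {
                        // process the frame
                    }
                }
            };

            // Polling model (this article): we pull a frame ourselves,
            // typically once per iteration of the XNA game loop.
            using (DepthImageFrame polledFrame = kinect.DepthStream.OpenNextFrame(0))
            {
                if (polledFrame != null)
                {
                    // process the frame
                }
            }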

3D scene: the principle

So the idea, as explained above, is to produce a video in 3 dimensions based on the information received from the Kinect. Once the video is rendered, we will be able to move the camera position virtually: the Kinect stays in the same place, but we get to look at the scene from a different point of view!

Initialize the Kinect sensor

Once again, we simply take the first connected Kinect and enable the DepthStream and the ColorStream:

        public Game1()
        {
            graphics = new GraphicsDeviceManager(this);
            Content.RootDirectory = "Content";
            // Use the first Connected Kinect
            Kinect = KinectSensor.KinectSensors.FirstOrDefault(k => k.Status == KinectStatus.Connected);
            if (Kinect != null)
            {
                // Activate Near mode (only supported by Kinect for Windows hardware;
                // an Xbox 360 sensor will throw an InvalidOperationException here, see the comments below)
                Kinect.DepthStream.Range = DepthRange.Near;
                // Enable the DepthStream
                Kinect.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
                // ... and instantiate the needed arrays to store some data
                depthPixelData = new short[Kinect.DepthStream.FramePixelDataLength];
                colorCoordinates = new ColorImagePoint[depthPixelData.Length];
                vertexPositionColors = new VertexPositionColor[depthPixelData.Length * 2];

                // Enable the Color Stream
                Kinect.ColorStream.Enable(ColorImageFormat.RgbResolution1280x960Fps12);
                colorFramePixelData = new byte[Kinect.ColorStream.FramePixelDataLength];

                Kinect.Start();
            }
        }

In the Update method, we retrieve the color and depth frames:

            if (Kinect != null)
            {
                // Ask for a DepthFrame
                using (DepthImageFrame depthFrame = Kinect.DepthStream.OpenNextFrame(0))
                {
                    if (depthFrame != null)
                    {
                        // Copy the data
                        depthFrame.CopyPixelDataTo(depthPixelData);

                        // And match each point of the depthframe to a position
                        // in the colorframe that we are about to receive
                        Kinect.MapDepthFrameToColorFrame(depthFrame.Format, depthPixelData, Kinect.ColorStream.Format, colorCoordinates);

                        using (ColorImageFrame colorFrame = Kinect.ColorStream.OpenNextFrame(0))
                        {
                            if (colorFrame != null)
                            {
                                // Copy the data
                                colorFrame.CopyPixelDataTo(colorFramePixelData);
                            }
                        }
                    }
                }
            }

Here, we use the OpenNextFrame method directly on the streams to retrieve the frames. The value passed as a parameter indicates how many milliseconds you are willing to wait for a frame. If no frame arrives within that time, you'll receive a null value.

Notice that after getting the depth frame, we call MapDepthFrameToColorFrame to fill an array that contains, for each depth pixel, the coordinates of the corresponding color pixel in a color frame of the given format!
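As a side note, each short in depthPixelData packs two pieces of information: the distance in millimeters in its high-order bits, and a player index in its 3 low-order bits. We don't need to unpack it here (MapDepthFrameToColorFrame and MapDepthToSkeletonPoint take the raw values as-is), but if you ever want the distance yourself, a small sketch like this would do it for a given index i:

            // The distance is stored in the upper bits, the player index in the 3 lower bits.
            short rawValue = depthPixelData[i];
            int distanceInMillimeters = rawValue >> DepthImageFrame.PlayerIndexBitmaskWidth;
            int playerIndex = rawValue & DepthImageFrame.PlayerIndexBitmask;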

Then, in the Draw method, we create the points we want to draw:

                // For each pixel index in the colorCoordinates array
                for (int i = 0; i < colorCoordinates.Length; i++)
                {
                    // If the coordinates are in the range of the colorstream frame size
                    if (colorCoordinates[i].X < Kinect.ColorStream.FrameWidth
                        && colorCoordinates[i].Y < Kinect.ColorStream.FrameHeight)
                    {
                        // Calculate the X,Y coordinates of the pixel in the depthstream frame
                        var pixelY = i / Kinect.DepthStream.FrameWidth;
                        var pixelX = i % Kinect.DepthStream.FrameWidth;

                        // Find the corresponding value in the Skeleton referential
                        var skeletonPoint = Kinect.MapDepthToSkeletonPoint(Kinect.DepthStream.Format, pixelX, pixelY, depthPixelData[i]);

                        // Retrieve the first index of the four-byte pixel in the ColorFrame.
                        var baseIndex = (colorCoordinates[i].Y * Kinect.ColorStream.FrameWidth + colorCoordinates[i].X) * 4;

                        // And finally we update the color and position of the corresponding vertex
                        vertexPositionColors[i * 2].Color.R = colorFramePixelData[baseIndex + 2];
                        vertexPositionColors[i * 2].Color.G = colorFramePixelData[baseIndex + 1];
                        vertexPositionColors[i * 2].Color.B = colorFramePixelData[baseIndex + 0];
                        vertexPositionColors[i * 2].Position.X = skeletonPoint.X;
                        vertexPositionColors[i * 2].Position.Y = skeletonPoint.Y;
                        vertexPositionColors[i * 2].Position.Z = skeletonPoint.Z;

                        // Use another point right behind the first one
                        vertexPositionColors[i * 2 + 1] = vertexPositionColors[i * 2];
                        vertexPositionColors[i * 2 + 1].Position.Z = skeletonPoint.Z + 0.05f;
                    }
                }

In the code above, for each pixel that we want to draw, we find its position in the skeleton coordinate space thanks to the MapDepthToSkeletonPoint method.

You have to know that in XNA 4.0, you can't draw a single "point". So what we do instead is draw small lines, each going from the actual pixel position to a second point 5 cm behind the first one.

There are probably much more efficient ways to achieve this, but I didn't find them. It's just for fun, of course :)

And finally, we draw a list of lines based on the points we just defined:

                foreach (EffectPass effectPass in basicEffect.CurrentTechnique.Passes)
                {
                    effectPass.Apply();

                    graphics.GraphicsDevice.DrawUserPrimitives<VertexPositionColor>(
                        PrimitiveType.LineList,
                        vertexPositionColors,
                        0,
                        vertexPositionColors.Length / 2
                    );
                }

Move the camera position

First, we initialize the position at the application startup:

        /// <summary>
        /// LoadContent will be called once per game and is the place to load
        /// all of your content.
        /// </summary>
        protected override void LoadContent()
        {
            basicEffect = new BasicEffect(GraphicsDevice);

            // The position of the camera
            cameraPosition = new Vector3(0, 0, -1);
            // creates the view based on the camera position, the target position, and the up orientation.
            viewMatrix = Matrix.CreateLookAt(cameraPosition, new Vector3(0, 0, 2), Vector3.Up);
            basicEffect.View = viewMatrix;
            basicEffect.Projection = Matrix.CreatePerspectiveFieldOfView(MathHelper.ToRadians(45f), GraphicsDevice.Viewport.AspectRatio, 1f, 1000f);
            basicEffect.VertexColorEnabled = true;
        }

And to be able to move it, we add the following code in the Update method:

           foreach (Keys key in Keyboard.GetState().GetPressedKeys())
            {
                if (key == Keys.D)
                {
                    cameraPosition.X -= 0.01f;
                }
                if (key == Keys.Q)
                {
                    cameraPosition.X += 0.01f;
                }
                if (key == Keys.Z)
                {
                    cameraPosition.Y += 0.01f;
                }
                if (key == Keys.S)
                {
                    cameraPosition.Y -= 0.01f;
                }
                if (key == Keys.E)
                {
                    cameraPosition.Z += 0.01f;
                }
                if (key == Keys.X)
                {
                    cameraPosition.Z -= 0.01f;
                }
            }
            viewMatrix = Matrix.CreateLookAt(cameraPosition, new Vector3(0, 0, 2), Vector3.Up);
            basicEffect.View = viewMatrix;

Those lines change the camera position whenever you press one of those keys. For example, pressing Z moves the camera up along the Y axis, and pressing S moves it down (the Z/Q/S/D/E/X keys match an AZERTY keyboard layout).

Comments (15) -

Salvatore russo
5/26/2012 11:05:07 AM #

Hi, great job!! I would like to do the same thing but with my Asus Xtion Pro Live, which uses OpenNI. Is there any source code? If not, never mind, I'm only interested in the walkthrough! :) Also, did you test it by adding a virtual object to the scene? That's the aspect I want to work on!

Best regards,
Salvatore

salvatore
5/26/2012 12:05:57 PM #

Regarding the source code, I misread... probably I need a pair of glasses :D I found it at the beginning of the page :D

Secondly, I forgot to ask whether this method can also be applied in real time. What I mean is: if I move my Kinect around, can I still display the rendering?

Renaud
5/26/2012 6:05:30 PM #

Hello Salvatore,

Yes, it's absolutely possible to move the Kinect during the process! But here, as I'm trying to keep some pixels visible even when the Kinect can't see them anymore, it would probably result in some strange scenes! ^^

I never tried the Asus Xtion nor OpenNI, so I'm not sure if there are code samples somewhere, sorry!

Salvatore russo
5/26/2012 7:05:17 PM #

Thanks for replying :) Don't worry, I'm implementing the class! I've already done it for an XNA project :) Can't wait to see the result :)

Renaud
5/26/2012 8:05:57 PM #

Nice! :D Let me know what you achieve if you share it somewhere! I would be interested to see the result as well!

salvatore
5/27/2012 12:05:07 AM #

Hi, I have a question about this line:

var skeletonPoint = Kinect.MapDepthToSkeletonPoint(Kinect.DepthStream.Format, pixelX, pixelY, depthPixelData[i]);

Is this what renders the body? If so, how can I avoid that?

salvatore
5/27/2012 12:05:19 AM #

My purpose is to render only the background :)

Renaud
5/27/2012 12:05:35 AM #

No, this method just gives me a SkeletonPoint corresponding to a pixel of the depth frame. It was only there so I could work in the skeleton coordinate space, where each value (X/Y/Z) is expressed in meters, instead of the depth frame space, where the values are in pixels/pixels/millimeters (X/Y/Z).

If you just want the background, then for each pixel of the depth frame (in the Kinect SDK) you can know whether the pixel represents a player or not. If it does, you can simply ignore it!
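In practice, inside the loop of the Draw method, it could look like this (just a sketch based on the arrays used in the article; note that the player index is only filled in when the SkeletonStream is enabled):

// A non-zero player index in the 3 low-order bits of the depth value
// means that this pixel belongs to a tracked user.
int playerIndex = depthPixelData[i] & DepthImageFrame.PlayerIndexBitmask;
if (playerIndex != 0)
{
    continue; // skip the player, keep only the background
}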

Hope it will help you :)

venu
6/18/2012 2:06:49 PM #

Hi, the code is not working: we are getting errors at Kinect.Start() and at using (ColorImageFrame colorFrame = Kinect.ColorStream.OpenNextFrame(0)), like an unhandled InvalidOperationException, and the XNA 3D content file could not be found.

venu
6/19/2012 9:06:27 AM #

We are working on this project and we get an error at DepthImageFrame depthFrame = sensor.DepthStream.OpenNextFrame(0); the 3D content is also not working properly, and we even get an error at KinectSensor.Start(). Can you help us?

Renaud
6/21/2012 1:06:38 AM #

Did you try using the Kinect in any other simple application, to see if you get the same error when calling Kinect.Start()?

Renaud
7/26/2012 1:07:18 AM #

Indeed, it makes sense: Near mode is only supported by the Kinect for Windows sensor!
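If you want the same sample to run with both kinds of sensors, a simple guard should do the trick (just a sketch; I'm assuming the SDK throws an InvalidOperationException when the hardware doesn't support near mode):

try
{
    // Near mode is only supported by Kinect for Windows hardware
    Kinect.DepthStream.Range = DepthRange.Near;
}
catch (InvalidOperationException)
{
    // Fall back to the default range with an Xbox 360 sensor
    Kinect.DepthStream.Range = DepthRange.Default;
}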

Thanks for your comment ;)

OutlawLemur
7/26/2012 1:07:36 AM #

I have this problem. I am using the Xbox 360 Kinect. Just remove the "Kinect.NearDepthMode = true;" line. Then it should work.

saleh salous
9/27/2013 2:42:02 PM #

Hi all,
I have a question: if I get the X,Y,Z head joint position from the skeleton, can I get the corresponding pixel in the depth frame?

Renaud
10/2/2013 10:42:37 PM #

Yes, that's very easy to achieve. Use this reference: msdn.microsoft.com/.../...keletonpointtodepth.aspx

You just have to call the MapSkeletonPointToDepth method on your KinectSensor object. You give two parameters:
- your SkeletonPoint (with X,Y,Z coordinates)
- the format of the DepthFrame in which you want the pixel coordinates

You'll then receive a DepthImagePoint with X,Y coordinates corresponding to your DepthFrame!
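A minimal sketch of that call, where headJoint is just a placeholder for the joint you read from your Skeleton:

// Map the head position (expressed in meters in skeleton space)
// to pixel coordinates in a 640x480 depth frame.
SkeletonPoint headPosition = headJoint.Position;
DepthImagePoint depthPoint = Kinect.MapSkeletonPointToDepth(headPosition, DepthImageFormat.Resolution640x480Fps30);
int pixelX = depthPoint.X;
int pixelY = depthPoint.Y;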

Hope it helped!

Renaud

