Friday, May 25, 2012

Engadget Tests Leap Motion Gesture Recognition Device

Engadget published an interview with David Holz and Michael Buckwald, the two men behind Leap Motion's gesture recognition device. David and Michael declined to explain how it works, mentioning only that it's based on IR LEDs and cameras. The demo looks nice, although the window curtain is too dense for my taste:

16 comments:

  1. Vlad, I didn't follow your comment on the "window curtain". Can you elaborate?

  2. Eric, sure. In the video it appears that extra care was taken to block the daylight from the window. The curtain is black and very dense. It makes me wonder how well Leap's system works in normal living-room daylight conditions.

    Replies
    1. OK, I thought you were talking about some sort of virtual light curtain. I noticed in the video that hand placement seemed to be important - the system did not respond when he was gesturing while talking, which makes me think there is a specific operating distance from the camera.

      I think it is great that someone has this working so nicely. I think it is funny that the collective minds of this blog still don't know how this works (even if some guesses have been proposed).

    2. Hasn't the answer already been suggested in a previous post? They use a couple of IR LEDs and a single camera looking at the shadow of the hand projected onto the ceiling. Vertical motion is given by the scaling of the shadow. Horizontal motion is given by panning and stretching. Combine the two and you have 3D. Next time someone gets a chance to talk to him, just ask him if the thing works outdoors ;) Can't believe someone dropped out of school for this; it could become a nice final-year project =) Z
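
      A rough Python sketch of that shadow geometry, purely to illustrate the guess (ceiling height, LED placement, and the assumed hand width are all invented; nothing here is confirmed by Leap):

      ```python
      # Hypothetical "ceiling shadow" geometry: one IR LED on the table at the
      # origin, a camera watching the ceiling at height H. A hand of assumed
      # width HAND_W at height y casts a shadow whose width and position
      # encode y and x through similar triangles.
      H = 2.5        # ceiling height above the LED, meters (made up)
      HAND_W = 0.08  # assumed physical hand width, meters (made up)

      def hand_from_shadow(shadow_center_x, shadow_width):
          # The ray from the LED through a point at height y is scaled by
          # H / y at the ceiling, so shadow_width = HAND_W * H / y and
          # shadow_center_x = x * H / y.
          y = HAND_W * H / shadow_width   # "scaling gives vertical motion"
          x = shadow_center_x * y / H     # "panning gives horizontal motion"
          return x, y

      # Forward-project a hand at x = 0.10 m, y = 0.20 m, then invert it back.
      x_true, y_true = 0.10, 0.20
      shadow_center = x_true * H / y_true
      shadow_width = HAND_W * H / y_true
      print(hand_from_shadow(shadow_center, shadow_width))  # ~ (0.10, 0.20)
      ```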

    3. You are joking I hope. Ceiling projection could give you x and z but not y.

    4. By "y" you mean the distance to the table? That is given by the scaling of the shadow as I said.

    5. Another way is to use two LEDs and light them up alternately. The distance between the shadows on the ceiling gives you y. Or one LED and two cameras, but that would be stereo vision and he said it's not the case. An easy concept, and I'm surprised you couldn't get it. Maybe I should have dropped out of school and started patenting obvious things =) Z
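
      A quick sketch of the two-LED variant's math, again just speculation with made-up numbers (baseline B between the LEDs, ceiling at height H):

      ```python
      # Two IR LEDs a known baseline apart flash alternately; the camera sees
      # where each shadow of the same fingertip lands on the ceiling. The
      # separation between the two shadows encodes the fingertip height.
      H = 2.5   # ceiling height above the LEDs, meters (made up)
      B = 0.04  # baseline between the two LEDs, meters (made up)

      def finger_height(shadow_sep):
          # separation = B * (H - y) / y  =>  y = B * H / (separation + B)
          return B * H / (shadow_sep + B)

      def finger_x(shadow_x_led1, y):
          # horizontal position from the shadow cast by the LED at the origin
          return shadow_x_led1 * y / H

      # Fingertip at x = 0.05 m, y = 0.15 m: project both shadows, then invert.
      x_t, y_t = 0.05, 0.15
      s1 = x_t * H / y_t             # shadow from the LED at x = 0
      s2 = B + (x_t - B) * H / y_t   # shadow from the LED at x = B
      y = finger_height(s1 - s2)
      print(y, finger_x(s1, y))      # ~ 0.15, 0.05
      ```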

    6. Relative to the display, x is horizontal, y is vertical, and z is the orthogonal distance to the plane of the display.

      We can come up all day long with all kinds of solutions, and fixes to problems with those solutions, as it seems you have suggested. I just wonder if someone actually knows.

      And nothing wrong with patenting a solution if it works well.

    7. That's exactly the problem: if it becomes known how this thing works, then its limitations become known too, and all the hype and upsell are killed right there. And if my theory is correct, this solution does NOT work well in quite a few scenarios: outdoors, lecture halls, ceiling lights, etc. I'm sure the math is interesting. But like I said, this is probably good for a student project but not much more.

    8. Hi,
      I'm not a specialist, but in my opinion a ceiling shadow is possible only with a laser or very collimated light. They mention that they use cameras, so there is more than one; I think there are more than two. It could be a matrix of sensors...

    9. My iPhone 4 flash LED does a pretty good job casting a clear shadow on the ceiling of a reasonably dark room. Its beam is probably 45 degrees wide. The y-resolution is not so good, though, and the finger must be close to the light in order to see the scaling effect. I'd go with a matrix of LEDs aimed at different angles.

      If you have an iPhone 4, just download a flashlight app and try it out.

  3. The limited volume of interaction makes me think that this technology is going to compete with touchscreens, rather than with 3D cameras...
    Matteo

    Replies
    1. Engadget describes it more like a 3D touchscreen:

      "there's a problem with creating and manipulating 3D models using a mouse and keyboard --it's a needlessly complicated operation involving clicks and drop down menus. Holz wanted a way to make "molding virtual clay as easy as molding clay in the real world."

    2. No, it cannot replace the touch screen, because the field of view is conic, so the device has to be placed either very far from the screen or at the corner of the screen. This is an example of NIT's sensor inside a touch screen: http://www.youtube.com/watch?v=-Ie0OIGU2Kc
      It would be hard to use this device.

      -yang ni

  4. Maybe they do blob detection on 2-4 VGA 60Hz CMOS imagers with narrow NIR (750/800/850/900nm) band-pass filters & matched illuminators. An FPGA/DSP does 2D blob detection and triangulates 3D based on first-order-moment blob centroids. This probably works okay when you have a high Z gradient, or good depth variation to segment the centroids (e.g. fingers, pencil, etc.). It should have a hard time with surfaces parallel to the sensor/display.

    Maybe they can have a latency around one frame period (1/60 s), though I wonder how latency is defined with a rolling shutter. I guess it's a nice product for HCI.
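
    A rough Python sketch of that pipeline, with an invented focal length and baseline (my reading of the guess above, not anything Leap has disclosed):

    ```python
    # Threshold each NIR frame, take the first-order-moment centroid of the
    # bright blob, then triangulate depth from the centroid disparity between
    # two rectified cameras.
    import numpy as np

    F_PX = 600.0     # focal length in pixels for a VGA sensor (made up)
    BASELINE = 0.04  # camera separation in meters (made up)

    def blob_centroid(frame, threshold=128):
        """First-order-moment centroid (x, y) of pixels above threshold."""
        ys, xs = np.nonzero(frame > threshold)
        return None if xs.size == 0 else (xs.mean(), ys.mean())

    def triangulate_depth(x_left, x_right):
        """Depth from horizontal disparity of one blob in two rectified views."""
        disparity = x_left - x_right
        return None if disparity <= 0 else F_PX * BASELINE / disparity

    # Synthetic frames: one bright 10x10 blob, shifted 12 px between the views.
    left = np.zeros((480, 640), np.uint8)
    right = np.zeros((480, 640), np.uint8)
    left[200:210, 320:330] = 255
    right[200:210, 308:318] = 255
    (xl, _), (xr, _) = blob_centroid(left), blob_centroid(right)
    print(triangulate_depth(xl, xr))  # ~ 600 * 0.04 / 12 = 2.0 m
    ```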

  5. Can't it be that they use some kind of reflectance interpretation by illuminating the fingers with a known light source? That might be the reason why they measure in a considerably dimmed room. This technique is known as shape from shading. In case my assumption is correct, their algorithm is quite impressive.
    Motion blur would never be a problem, but ambient light would be critical. An NIR light source with a narrowband filter would help, and it seems that they apply this technique.
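
    A toy sketch of how brightness could be inverted into range under that assumption (single known point source, Lambertian skin with an assumed albedo, ambient light suppressed by the narrowband filter; all constants are invented for illustration):

    ```python
    # For a Lambertian patch facing a point source, image brightness falls off
    # as cos(theta) / r^2, so with a known albedo it can be inverted into range.
    import math

    SOURCE_POWER = 0.02  # lumped LED power * sensor gain (made up)
    SKIN_ALBEDO = 0.5    # assumed NIR reflectance of skin

    def intensity(r, theta_deg=0.0):
        return SOURCE_POWER * SKIN_ALBEDO * math.cos(math.radians(theta_deg)) / r**2

    def range_from_intensity(i, theta_deg=0.0):
        return math.sqrt(SOURCE_POWER * SKIN_ALBEDO * math.cos(math.radians(theta_deg)) / i)

    i = intensity(0.25)             # a patch 25 cm away, facing the LED
    print(range_from_intensity(i))  # ~ 0.25
    ```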


All comments are moderated to avoid spam and personal attacks.