Eric Fossum put his Samsung IEDM 2011 paper on-line:
"A 192×108 pixel ToF-3D image sensor with single-tap concentric-gate demodulation pixels in 0.13 μm technology"
T.Y. Lee, Y.J. Lee, D.K. Min, S.H. Lee, W.H. Kim, S.H. Kim, J.K. Jung, I. Ovsiannikov,
Y.G. Jin, Y.D. Park, E.R. Fossum, and C.H. Chung
"A 3D-ToF FSI image sensor using novel concentric photogate [CG] pixels with single-tap operation is described. Through the use of CG structure, we are able to achieve high DC at larger pixel pitches. The new CG pixel structure substantially improves DC [demodulation contrast] to 53% at 20MHz at 28 μm pixel pitch. Recent initial results from a backside-illuminated (BSI) implementation of the same sensor show further improved performance and will be reported elsewhere."
When is (mass market) VGA+ ToF coming? Fingers are usually further away than 1 m...
What is the LED optical power used in the test?
Mass market? Your guess is as good as mine. One could say Kinect is a mass market device, so competing with that technology could happen before long.
Do you usually sit more than 1.5 meters from your display? And how close are your hands?
Only the information in the paper was approved for release, but the optical power is quite similar to other ToF devices reported, perhaps a little less due to improved QE and DC. As with all imaging, the more photons the better, up to full well (unless it is unmodulated sunlight!). Optical power requirements are a major disadvantage of active-light ranging sensors for mobile applications... but not so bad for outlet-powered applications.
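To put the "more photons the better up to full well" remark in simple terms: in the shot-noise limit, SNR grows as the square root of the collected charge and stops improving at saturation. A minimal sketch, with an illustrative full-well value that is not from the paper:

import math

FULL_WELL = 20_000  # electrons; illustrative value, not from the paper

def shot_limited_snr(signal_electrons):
    # Shot-noise-limited SNR grows as sqrt(N) and stops improving once
    # the pixel saturates; beyond full well, extra photons (e.g. from
    # unmodulated sunlight) only fill the well without adding signal.
    n = min(signal_electrons, FULL_WELL)
    return math.sqrt(n)

for n in (100, 1_000, 10_000, 50_000):
    print(f"{n:>6} e-  ->  SNR ~ {shot_limited_snr(n):5.1f}")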
Hi Eric,
I do if the display is a TV. Or if gesturing at my laptop or phone for a presentation.
Kinect is purportedly VGA, but trying to extract fine detail from the raw data will remind you more of QCIF. Still impressive for the price.
There is a teaser of some other Samsung work, to be reported at ISSCC in Feb 2012, regarding a 480×360 ToF sensor embedded in a 2 Mpixel color sensor (1.5 Mpixel after the ToF pixels), shown on page 51 of my slides referenced a few days ago in this blog.
That reminds me. Before I consider doing some experiments at Dartmouth, does anyone know anything about the "X-Y resolution of depth" of the human vision system? That is, consider an RGBZ sensor. We know that the HVS response for red and blue is far worse for X-Y resolution than for green. But, what about depth? How many Z pixels do we need to embed in an RGBZ sensor to achieve "being there" performance as far as depth goes (after ISP of course so that segmentation is already performed). So far I have not found much on depth perception in humans, and we are still good at judging depth even with one eye closed.
Eric, I do not have the answer on depth resolution, but regarding "we are still good at judging depth even with one eye closed": when trying to play volleyball and tennis with one eye closed, my ability to intercept balls degrades significantly. Might be just personal experience though.
Sure, of course depth perception with one eye is not so good, but the brain plays an important part in extracting depth information from visual field motion and other knowledge-based clues.
Human depth perception is mainly subjective. Beyond a certain distance, it's hard to give an objective depth estimate. This is determined by triangulation, the precision of which decreases quadratically with distance (see the sketch after this comment). There has been a lot of discussion on the ordering of shape and depth perception; the best-known example is the random dot stereogram. If you can segment the objects correctly, then you can create a vivid 3D sensation by using just a few distance planes, as in theater stage decoration. But by direct distance measurement, that will be much more difficult. It's hard to create an HD 3D scene from a 192×108-pixel depth map. For some gesture controls, it's OK.
-yang ni
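The quadratic degradation Yang Ni mentions is the standard triangulation error relation sigma_Z = Z^2 * sigma_d / (f * b), with baseline b, focal length f (in pixels), and disparity error sigma_d. A minimal sketch with illustrative numbers (roughly the human interocular baseline; the focal length and disparity noise are assumptions, not measurements):

def depth_sigma(z_m, baseline_m=0.065, focal_px=600.0, disparity_sigma_px=0.25):
    # sigma_Z = Z^2 * sigma_d / (f * b): the depth error of any
    # triangulating system grows quadratically with distance.
    # All parameter values here are illustrative assumptions.
    return (z_m ** 2) * disparity_sigma_px / (focal_px * baseline_m)

for z in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"Z = {z:4.1f} m  ->  sigma_Z ~ {depth_sigma(z) * 100:6.2f} cm")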
The upcoming necessity of a standard for comparing depth sensors is evident. The question of the needed depth resolution must be answered more generally, independent of the chosen technology. As far as I know, there are some discussions at the research level, but without a useful outcome for industry (e.g., application development).
With global models and an accepted standard, it will become evident that, for example, structured-light sensors need more resolution than ToF to deliver the same usable depth information. Furthermore, an evolved sensor system like the human eye with brain-based image processing uses stereo vision in some way, but also depth estimation from the different spatial frequencies in an image sequence, as compound eyes do. A benchmark of different technologies requires an accepted standard.
Vision is a task-oriented system. It's hard to set a universal standard for such a particular performance. Can you come up with a universal criterion that covers both a fly's eye and a human's eye?
-yang ni
Here are some relevant patent documents. The first one is the most relevant.
US 2010/0039546
US 2011/0049590
US 2008/0258187
US 2004/0245433
US 7977717
JP 2011-49446
The first is sort of relevant. You need to read the patent to understand what it teaches. It is not the same, even though a quick look at the pictures might suggest it is. Actually, the Microsoft device is similar to a device presented at IISW last June by a Japanese group, as I mentioned after their talk.