Monday, July 20, 2020

Microsoft Develops Under-Display Camera Solution for Videoconferencing

Microsoft Research works on embedding a camera under a display for videoconferencing:

"From the earliest days of videoconferencing it was recognized that the separation of the camera and the display meant the system could not convey gaze awareness accurately. Videoconferencing systems remain unable to recreate eye contact—a key element of effective communication.

Locating the camera above the display results in a vantage point that’s different from a face-to-face conversation, especially with large displays, which can create a sense of looking down on the person speaking.

Worse, the distance between the camera and the display means that the participants will not experience a sense of eye contact. If I look directly into your eyes on the screen, you will see me apparently gazing below your face. Conversely, if I look directly into the camera to give you a sense that I am looking into your eyes, I'm no longer in fact able to see your eyes, and I may miss subtle non-verbal feedback cues.

With transparent OLED displays (T-OLED), we can position a camera behind the screen, potentially solving the perspective problem. But because the screen is not fully transparent, looking through it degrades image quality by introducing diffraction and noise.

To compensate for the image degradation inherent in photographing through a T-OLED screen, we used a U-Net neural-network structure that both improves the signal-to-noise ratio and de-blurs the image.

We were able to achieve a recovered image that is virtually indistinguishable from an image that was photographed directly."
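The restoration network described above can be sketched as a small U-Net-style encoder-decoder. This is an illustrative sketch in PyTorch, not Microsoft's actual architecture: the layer sizes, channel counts, and single skip connection are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """Minimal U-Net-style encoder-decoder: one downsampling stage,
    one skip connection, one upsampling stage. Trained end to end, a
    network like this can jointly denoise and deblur frames captured
    through the T-OLED panel."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU())
        self.down = nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1)
        self.up = nn.ConvTranspose2d(ch * 2, ch, 2, stride=2)
        self.dec1 = nn.Sequential(nn.Conv2d(ch * 2, ch, 3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(ch, 3, 1)  # restored RGB image

    def forward(self, x):
        s = self.enc1(x)              # full-resolution features for the skip
        b = torch.relu(self.down(s))  # bottleneck at half resolution
        u = self.up(b)                # back to full resolution
        u = torch.cat([u, s], dim=1)  # skip connection preserves fine detail
        return self.out(self.dec1(u))

# A degraded frame photographed through the screen goes in; the trained
# network outputs the restored image at the same resolution.
net = MiniUNet()
restored = net(torch.randn(1, 3, 64, 64))
print(restored.shape)  # torch.Size([1, 3, 64, 64])
```

The skip connection is what makes the U-shape useful here: low-level detail lost in the downsampling path is reinjected before the final layers, which matters when the goal is restoring sharpness rather than classifying content.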

Via MSPowerUser


  1. Another solution is to put two cameras on either side of the device, forming a stereo pair. The depth map can then be used to render a virtual camera positioned in the middle of the screen. I played around with the idea, built a prototype, and I think it could work well.

    I've written an article about it here: The code is also available on GitHub:

  2. At AIRY3D, we produce technologies that enable monolithic, single-sensor (monocular) systems for producing both 3D images and 2D color images with minimal computation/latency and only a small increment to the sensor cost. This can enable not only 3D perspective correction, but new features based on user attention/pose/gestures.

  3. James, what is the working range for your 3D application?
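The stereo-pair idea in comment 1 can be sketched with a simple forward warp: shift each pixel of the left image halfway toward the right camera by its disparity, approximating a virtual camera centered on the screen. This is a minimal NumPy sketch under assumed rectified cameras; the function name is hypothetical, and occlusion handling and hole filling are omitted.

```python
import numpy as np

def midpoint_view(left, disparity):
    """Warp the left image halfway toward the right camera using its
    disparity map, approximating a virtual camera at the screen center.
    Illustrative only: no occlusion reasoning or hole filling."""
    h, w = left.shape[:2]
    out = np.zeros_like(left)
    xs = np.arange(w)
    for y in range(h):
        # Shift each pixel by half its disparity; clip to stay in frame.
        tx = np.clip(xs - (disparity[y] / 2).astype(int), 0, w - 1)
        out[y, tx] = left[y, xs]
    return out

# A single bright pixel with disparity 4 moves 2 columns toward the midpoint.
left = np.zeros((2, 8))
left[0, 4] = 1.0
disp = np.full((2, 8), 4.0)
warped = midpoint_view(left, disp)
print(warped[0, 2])  # 1.0
```

A real implementation would warp both views toward the midpoint and blend them, which fills most disocclusions that a single-image warp leaves empty.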


All comments are moderated to avoid spam and personal attacks.