Thursday, June 25, 2020

v2e and Event-Driven Camera Nonidealities

ETH Zurich publishes a paper, "v2e: From Video Frames to Realistic DVS Event Camera Streams," by Tobi Delbruck, Yuhuang Hu, and Zhe He. The v2e open-source tool is available here.

"To help meet the increasing need for dynamic vision sensor (DVS) event camera data, we developed the v2e toolbox, which generates synthetic DVS event streams from intensity frame videos. Videos can be of any type, either real or synthetic. v2e optionally uses synthetic slow motion to upsample the video frame rate and then generates DVS events from these frames using a realistic pixel model that includes event threshold mismatch, finite illumination-dependent bandwidth, and several types of noise. v2e includes an algorithm that determines the DVS thresholds and bandwidth so that the synthetic event stream statistics match a given reference DVS recording. v2e is the first toolbox that can synthesize realistic low light DVS data. This paper also clarifies misleading claims about DVS characteristics in some of the computer vision literature. The v2e website is this https URL and code is hosted at this https URL."
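The core idea behind generating DVS events from frames can be sketched in a few lines: each pixel remembers the log intensity at its last event and fires an ON or OFF event for every threshold-sized change, with per-pixel threshold mismatch modeled as fixed Gaussian variation. This is a simplified illustration of the principle, not v2e's actual code; the function name and parameters are hypothetical, and bandwidth limits and noise (which v2e does model) are omitted here.

```python
import numpy as np

def dvs_events_from_frames(frames, timestamps, pos_thresh=0.2, neg_thresh=0.2,
                           thresh_sigma=0.03, seed=0):
    """Toy DVS event generator: emit (t, y, x, polarity) events whenever
    a pixel's log intensity moves by more than its (mismatched) threshold
    since that pixel's last event. Illustrative sketch only, not v2e."""
    rng = np.random.default_rng(seed)
    eps = 1e-3  # avoid log(0) in dark pixels
    memory = np.log(frames[0].astype(np.float64) + eps)  # log I at last event
    shape = memory.shape
    # fixed-pattern threshold mismatch, clipped away from zero
    pos = np.clip(pos_thresh + thresh_sigma * rng.standard_normal(shape), 0.05, None)
    neg = np.clip(neg_thresh + thresh_sigma * rng.standard_normal(shape), 0.05, None)
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        diff = np.log(frame.astype(np.float64) + eps) - memory
        # number of threshold crossings per pixel, per polarity
        n_on = np.floor(np.maximum(diff, 0.0) / pos).astype(int)
        n_off = np.floor(np.maximum(-diff, 0.0) / neg).astype(int)
        for n, polarity, thr, sign in ((n_on, 1, pos, 1.0), (n_off, -1, neg, -1.0)):
            ys, xs = np.nonzero(n)
            for y, x in zip(ys, xs):
                events.extend([(t, y, x, polarity)] * n[y, x])
            memory += sign * n * thr  # reset memory toward the new level
    return events
```

A pixel that triples in brightness between two frames (about 1.1 in log units) would emit several ON events at the second frame's timestamp, while unchanged pixels stay silent.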

The paper also explains some of the misconceptions about DVS sensors:

"Debunking myths of event cameras: Computer vision papers about event cameras have made rather misleading claims such as “Event cameras [have] no motion blur” and have “latency on the order of microseconds” [7]–[9], which were perhaps fueled by the titles (though not the content) of papers like [1], [10], [11]. Review papers like [5] are more accurate in their descriptions of DVS limitations, but are not very explicit about the actual behavior.

DVS cameras must obey the laws of physics like any other vision sensor: They must count photons. Under low illumination conditions, photons become scarce and therefore counting them becomes noisy and slow. v2e is aimed at realistic modeling of these conditions, which are crucial for deployment of event cameras in uncontrolled natural lighting."
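The "noisy and slow" point can be made concrete: the DVS photoreceptor behaves roughly like a low-pass filter whose bandwidth scales with photocurrent, so dark pixels respond sluggishly. The sketch below applies a first-order low-pass to log intensity with an intensity-dependent cutoff. This is an illustrative assumption-laden model, not v2e's exact filter (v2e uses a more detailed photoreceptor model); the function name and the linear cutoff scaling are my own simplifications.

```python
import numpy as np

def lowpass_log_intensity(frames, dt, f3db_at_full_scale=300.0, full_scale=255.0):
    """Filter log intensity with a per-pixel first-order IIR low-pass whose
    cutoff frequency scales with pixel intensity (a stand-in for photocurrent).
    Bright pixels track changes quickly; dark pixels lag. Sketch, not v2e."""
    eps = 1e-3  # avoid log(0)
    state = np.log(frames[0] + eps)
    out = [state.copy()]
    for frame in frames[1:]:
        target = np.log(frame + eps)
        # cutoff proportional to intensity, floored so it never reaches zero
        f3db = f3db_at_full_scale * np.clip(frame / full_scale, 1e-3, 1.0)
        alpha = np.clip(2.0 * np.pi * f3db * dt, 0.0, 1.0)  # discrete IIR gain
        state = state + alpha * (target - state)
        out.append(state.copy())
    return out
```

With a step input, a full-scale pixel settles almost immediately while a pixel at 1% of full scale is still far from its target after several milliseconds, which is exactly the low-light slowness the paper describes.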


  1. If you can generate DVS images from frame-based video, what is the utility of DVS??

    1. > v2e processes video about 20 to 100 times slower than realtime...
      This is for use cases where you have a lot of visual data but no event camera, and you want to develop algorithms for those environments.

  2. One can always generate dynamic images through processing of sequential frames. The answer lies in comparing the energy required to reach the same outcome with event-driven versus frame-based sensors. Regarding the no-motion-blur and low-latency arguments: one cannot breach basic physical laws to reach thresholds in no time and trigger a readout. The trick is in the right pixel topologies to minimize these delays.


  3. This means that all the information is inside the frame-based image sequence. You don't need to do this transformation, and many algorithms are already available.

    1. No. The information is actually in the events. The same group also has a tool called e2vid that generates frame sequences from events. Also, a number of papers show that ML works directly on events, so the information is in the events.

