Saturday, February 06, 2021

Do Event-Based Imagers Have an Advantage over Regular Global Shutter Ones?

A team of researchers from the Czech Institute of Informatics, Robotics and Cybernetics at the Czech Technical University in Prague and the University of Defence in Brno, Czech Republic, publishes an interesting MDPI paper "Experimental Comparison between Event and Global Shutter Cameras" by Ondřej Holešovský, Radoslav Škoviera, Václav Hlaváč, and Roman Vítek.

"We compare event-cameras with fast (global shutter) frame-cameras experimentally, asking: “What is the application domain, in which an event-camera surpasses a fast frame-camera?” Surprisingly, finding the answer has been difficult.

Our methodology was to test event- and frame-cameras on generic computer vision tasks where event-camera advantages should manifest. We used two methods: (1) a controlled, cheap, and easily reproducible experiment (observing a marker on a rotating disk at varying speeds); (2) selecting one challenging practical ballistic experiment (observing a flying bullet having a ground truth provided by an ultra-high-speed expensive frame-camera). The experimental results include sampling/detection rates and position estimation errors as functions of illuminance and motion speed; and the minimum pixel latency of two commercial state-of-the-art event-cameras (ATIS, DVS240).

Event-cameras respond more slowly to positive than to negative large and sudden contrast changes. They outperformed a frame-camera in bandwidth efficiency in all our experiments. Both camera types provide comparable position estimation accuracy. The better event-camera was limited by pixel latency when tracking small objects, resulting in motion blur effects. Sensor bandwidth limited the event-camera in object recognition. However, future generations of event-cameras might alleviate bandwidth limitations.

We tested two event-cameras: iniVation (Zurich, Switzerland) DVS240 (DVS240 in short), which is an evolved version of the popular DAVIS240, and Prophesee (Paris, France) ATIS HVGA Gen3 (ATIS in short)."

When comparing the following data, especially the low-light part of it, please note that the Basler camera has about 16 times smaller pixel area than the other three cameras. Also note that power efficiency was not part of this research.
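
As a rough sanity check of that factor (a back-of-the-envelope sketch; the pixel pitches are approximate values recalled from the respective datasheets, roughly 4.8 um for the Basler's Python 300 and 18.5 um for the DVS240, not numbers taken from the paper):

    basler_pitch_um = 4.8     # Python 300 pixel pitch, approximate
    dvs240_pitch_um = 18.5    # DVS240/DAVIS240 pixel pitch, approximate
    area_ratio = (dvs240_pitch_um / basler_pitch_um) ** 2
    print(f"pixel area ratio ~ {area_ratio:.0f}x")   # -> ~15x, in line with the ~16x noted above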


"In future work, (a) we aim to research event/frame-camera performance in high dynamic range scenes; and (b) use event-cameras in robotic perception tasks requiring fast feedback loops."

9 comments:

  1. Interesting paper. Maybe it's worth mentioning that the compared Basler runs a Python 300 sensor (this sensor has been around for about 10 years...) on USB3.

    https://www.baslerweb.com/en/products/cameras/area-scan-cameras/ace/aca640-750um/

    https://www.onsemi.com/pub/Collateral/NOIP1SN1300A-D.PDF

    https://www.onsemi.com/products/sensors/image-sensors-processors/image-sensors/python300

    Replies
    1. Also the DAVIS240c (~2013) and ATIS (~2010) have been around for about 10 years.

  2. Wouldn't it be necessary to custom-tailor the vision algorithms to the event-based camera output for event-based cameras to truly outperform a regular camera?

    Replies
    1. I think this is a key point with event sensors. What do you win if you pack 'all events that have happened in the last 100 us' into a frame? Well, maybe the data reduction on the sensor still remains. But processing this data is a challenge: you really have to think in 3D (x, y, t) and match your scene accordingly. A lot of image processing software exists for frame-based images, and there is a lot of frame-based thinking in the heads of a lot of software engineers.
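
      Roughly, such time-slice accumulation amounts to something like the following (a minimal sketch in Python; the (x, y, t, polarity) event format and the 100 us window are illustrative assumptions, not anything from the paper):

      import numpy as np

      def accumulate_events(events, width, height, t_start_us, window_us=100):
          # Pack every event falling in [t_start_us, t_start_us + window_us)
          # into one signed count image; polarity is +1 or -1.
          frame = np.zeros((height, width), dtype=np.int32)
          for x, y, t_us, polarity in events:
              if t_start_us <= t_us < t_start_us + window_us:
                  frame[y, x] += polarity
          return frame

      Everything downstream then sees only this frame; the per-event timestamps inside the window are discarded.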

      Another challenge is Figure 4: pixel latency depends on contrast. Real-world objects are not just black/white but also grey. This means that your objects get blurred depending on brightness, because bright parts of edges register earlier than greyer parts of edges.

      Pixel latency itself is a problem/challenge. The real world is captured hundreds of microseconds late. At 1 m/s (not that fast), that results in hundreds of micrometers of position error due to pixel latency.
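
      To put numbers on that (a back-of-the-envelope sketch; the 200 us latency is only an illustrative value in the hundreds-of-microseconds range, not a measured figure):

      speed_m_per_s = 1.0    # object speed, 1 m/s ("not that fast")
      latency_s = 200e-6     # assumed pixel latency, hundreds of microseconds
      error_um = speed_m_per_s * latency_s * 1e6
      print(f"position error ~ {error_um:.0f} um")   # -> ~200 um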

      Another challenge is real-time behaviour if you want to feed the information from the cameras into a servo control loop. We tried to do this, but it turned out to be really challenging. It seems problems where you have a few ms before you have to react are better suited to these cameras than direct interaction in servo loops.

      And another challenge: if there is no movement, there is no data. I think it might be promising to combine a higher-resolution 2D sensor with an event sensor via a beam splitter (as already presented in various papers).

      One point the paper misses is the approach of triggering a short light pulse within a longer open shutter window. This is quite promising. Also, I think you can go below 56 us using a 2020 global-shutter sensor (or a camera that allows shorter trigger pulses; I suppose the 56 us limit is more a feature of the Basler implementation than a limit of the sensor).

      But I really like the event-sensor approach; it is very interesting technically. It will be interesting to see if real applications appear where it is really a solution.

      As has been posted here various times: there is a great (constantly maintained) collection of links on event-based imaging at https://github.com/uzh-rpg/event-based_vision_resources

    2. Thanks for the reply! There seem to be quite a few challenges that will have to be overcome. Do you think a hybrid approach could have advantages (e.g. an event sensor that can also take a regular 2D image)? I know the DAVIS camera has this option (although in the papers it is usually only briefly mentioned), and I think the original ATIS sensor also had this functionality, but I don't know if it gives any real advantage.

    3. I think it really depends on what you want to do, what type of problem you are trying to solve... I found the quality of the 2D images from the DAVIS quite poor: sufficient, for example, to run camera calibration on some static fiducial, but not comparable with a standard 2D sensor. It was also relatively slow. I think I'd favor a solution where the event sensor is purely an event sensor and a "real" 2D sensor is combined via a beam splitter. You also rarely have only event problems for a camera: often you have 80% standard 2D problems to solve and one step that requires events (or other special information from another smart-pixel sensor; there are several other types of sensors with this in-pixel data reduction approach). Usually they lack 2D performance, and that is a drawback, because most of the time you need the 2D image plus the add-on provided by the smart pixels.

    4. Prophesee asked me to post this comment:

      Great to see that the experiments verify our own characterization results obtained with a different approach. Nevertheless, including power efficiency would improve the quality of the paper, as there is a huge difference between the cameras, especially considering that the Photron is a 10 kg device that requires 230 W of DC power. The authors' main suggestions for improvements were already implemented in a later sensor design published by Prophesee and Sony in February 2020 at ISSCC (https://ieeexplore.ieee.org/abstract/document/9063149):


      The readout bandwidth was increased by two orders of magnitude, to 1.1 billion events per second, and the data efficiency was pushed further, achieving a value of only 1.5 bit/event at high data rates. The new sensor shows superior low-light performance even with a reduced pixel size of 4.86 um (similar to the sensor in the Basler camera), while the power consumption of a VGA version of the sensor would be as low as 10 mW.

  3. I don't think that the paper is of very high quality; it is just a slightly better version of their first paper on the topic, which was of low quality. I think it is unfair to compare global shutter sensors, which have been around for decades and are well characterized in a standardized manner, with event sensors, which are fairly new and still suffer from poor characterization and standardization. As many have said, they obviously miss the intrinsic nature of the data they process when they apply image processing to event accumulations in time slices.
    They claim that only pixel pitch is relevant for comparing the illumination received, but they completely fail to factor in the fill factor, which is extremely low in old generations of event sensors (we're talking about 10-20%).
    So I have just one dream: that one of the manufacturers cited in this paper does a proper version of the tests, so we know once and for all where event-based stands with respect to global shutter.

    Replies
    1. Agree. The reviewers missed the question of the characterization methodology. On the other hand, this is not the most famous journal in sensors; I guess there is a reason why it was published here.

