Event sensors are bio-inspired devices that imitate the brain’s efficient event-driven communication mechanisms. In contrast to conventional sensors, which capture the scene synchronously at a fixed rate, event sensors report changes in the scene asynchronously.
For example, event sensors such as DVS cameras do not capture intensity images, as conventional sensors do; instead, each pixel independently reports changes in luminosity over time. The benefits of higher temporal resolution, lower latency, higher dynamic range, and greater power efficiency have sparked interest in machine learning on event data.
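Concretely, a DVS event stream is commonly represented as a list of (x, y, t, polarity) tuples rather than as frames. The sketch below illustrates this asynchronous, per-pixel format; the structured-array layout and the sample values are illustrative assumptions, not a specific camera's output format.

```python
import numpy as np

# A DVS event stream as (x, y, t, p) tuples: pixel coordinates, a
# timestamp (e.g. in microseconds), and a polarity indicating whether
# the luminosity at that pixel increased (+1) or decreased (-1).
# The dtype layout here is an illustrative assumption.
events = np.array(
    [(12, 34, 1000, 1), (12, 35, 1450, -1), (40, 7, 2100, 1)],
    dtype=[("x", "u2"), ("y", "u2"), ("t", "u8"), ("p", "i1")],
)

# Unlike a frame, each pixel fires independently and asynchronously;
# sorting by timestamp recovers the temporal order of the stream.
events = np.sort(events, order="t")
print(events["t"])  # timestamps in ascending order
```

Because the data is a sparse, unordered set of spatio-temporal points rather than a dense pixel grid, frame-based operations do not transfer to it directly.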
With the improved performance of deep learning (DL) algorithms on various tasks in recent years, learning from event sensor data has attracted growing attention. However, overfitting causes a DL model that performs well on training data to degrade when validated on new, unseen data.
A straightforward solution to the overfitting problem is to significantly increase the amount of labeled data. This is theoretically feasible but can be prohibitively expensive in practice. Moreover, overfitting is more severe for event data because event datasets tend to be small.
Data Augmentation: A solution for generalization
Many studies suggest that data augmentation improves the generalization ability of DL models by increasing the amount and diversity of data derived from the available data. Translation, rotation, flipping, cropping, and contrast or sharpness adjustment are common image augmentation techniques. However, event data differ fundamentally from frame-like data (such as images), so augmentation techniques developed for frame-like data cannot be applied directly to asynchronous event data.
Researchers from Chongqing University, the National University of Singapore, the German Aerospace Center, and Tsinghua University present a new technique called EventDrop for augmenting event data by dropping events. EventDrop is the first work to augment asynchronous event data by dropping events in a way that is simple to implement, computationally low-cost, and applicable to a variety of event-based tasks.
This study was inspired by the observation that the number of events in a scene changes significantly over time. For example, the output of an event camera for the same scene under the same lighting conditions can vary considerably over time, likely due to sensor noise. Randomly dropping a proportion of events can therefore improve the diversity of event data and, in turn, the performance of downstream applications.
Furthermore, when performing certain tasks on real-world data, scenes in images processed by DL algorithms may be partially occluded. The ability of algorithms to generalize well across different data sets is thus highly dependent on the diversity of the training dataset in terms of occlusion. However, the majority of available training datasets have low occlusion variance.
EventDrop addresses these issues by dropping events selected using various strategies to increase training data diversity. The researchers propose the following three methods for deciding which events should be dropped:
- Random drop: randomly drops a proportion of events in the sequence, mitigating the noise originating from event sensors.
- Drop by time: drops events that occur within a randomly chosen time interval, simulating the case where an object is partially occluded for a period of time.
- Drop by area: drops events that occur within a randomly chosen pixel area, simulating scenarios in which parts of an object are partially occluded.
These augmentation operations allow for an increase in the amount of training data and the diversity of the data.
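The three strategies can be sketched roughly as follows. This is an illustrative simplification in NumPy, not the authors' reference implementation; the function names, the structured event-array layout, and the default drop ratio are assumptions.

```python
import numpy as np

def random_drop(events, ratio=0.2, rng=None):
    """Randomly drop a proportion of events (simulates sensor noise)."""
    rng = rng or np.random.default_rng()
    keep = rng.random(len(events)) >= ratio
    return events[keep]

def drop_by_time(events, ratio=0.2, rng=None):
    """Drop all events inside a random time window (temporal occlusion)."""
    rng = rng or np.random.default_rng()
    t = events["t"].astype(np.float64)
    t0, t1 = t.min(), t.max()
    window = ratio * (t1 - t0)
    start = rng.uniform(t0, t1 - window)
    keep = (t < start) | (t > start + window)
    return events[keep]

def drop_by_area(events, ratio=0.2, resolution=(128, 128), rng=None):
    """Drop all events inside a random pixel region (spatial occlusion)."""
    rng = rng or np.random.default_rng()
    w, h = resolution
    bw, bh = int(ratio * w), int(ratio * h)
    x0 = rng.integers(0, w - bw + 1)
    y0 = rng.integers(0, h - bh + 1)
    inside = ((events["x"] >= x0) & (events["x"] < x0 + bw) &
              (events["y"] >= y0) & (events["y"] < y0 + bh))
    return events[~inside]
```

At training time, one could pick one of these strategies (or no augmentation at all) at random for each sample, so the network sees a different corrupted view of the same recording on every epoch.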
The researchers evaluated EventDrop on the N-Caltech101 and N-Cars datasets. They found that dropping events significantly improved the accuracy of several deep neural networks on object classification tasks across both datasets.
The team plans to apply the proposed method to other event-based learning tasks, including place recognition, pose estimation, traffic flow estimation, simultaneous localization and mapping, and many more.