Facebook AI has unveiled UVO (Unidentified Video Objects), a new dataset intended to boost research on open-world segmentation. UVO is a computer vision benchmark meant to help machines approach humans’ ability to detect unfamiliar visual objects with ease and precision. Its main draw is the scale and annotation quality of its video content.
Object segmentation has become a popular research area in computer vision over the past few years. It is key to correctly identifying objects and understanding where they are located. Researchers have proposed many different approaches, such as Mask R-CNN and MaskProp. These methods are limited, however, because they assume a closed world: every object category they are expected to detect and segment has been defined in advance.
In real-world applications such as embodied AI or augmented reality assistants, models encounter countless object concepts they have never seen or learned. It is not feasible to train a model on every unseen object it may encounter in the open world.
When it comes to identifying objects, people have a distinct advantage over machines. People can detect unfamiliar items without any previous knowledge of them, while the same cannot be said for robots and other machines. Closing this gap has been one of the most challenging problems for the Facebook AI team. Their work explores how machines can detect and segment any object they encounter, regardless of whether it was previously known or unknown.
Facebook AI researchers are excited about UVO, their new dataset of real-world videos drawn from an action recognition benchmark and annotated with dense, exhaustive, high-quality object masks. The included video clips contain an average of 13.5 unique object instances, which is eight times as many as in existing datasets built under the closed-world assumption. Facebook AI researchers believe UVO is a versatile test bed for developing novel approaches to open-world segmentation, and hope it inspires more research on building comprehensive understanding beyond classification or detection.
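To make the per-clip statistic concrete, here is a minimal Python sketch of how an average count of unique object instances per video could be computed. The record layout is an assumption for illustration only: the keys `video_id` and `instance_id` are hypothetical and are not taken from the UVO annotation format, and the toy data is made up.

```python
from collections import defaultdict


def mean_instances_per_video(annotations):
    """Average number of unique object instances per video.

    `annotations` is an iterable of dicts with the hypothetical keys
    'video_id' and 'instance_id' (one record per annotated mask).
    Videos with no annotation records are not counted.
    """
    instances = defaultdict(set)
    for ann in annotations:
        # The same instance annotated on several frames is counted once.
        instances[ann["video_id"]].add(ann["instance_id"])
    if not instances:
        return 0.0
    return sum(len(ids) for ids in instances.values()) / len(instances)


# Made-up toy records, not real UVO data.
toy = [
    {"video_id": "a", "instance_id": 1},
    {"video_id": "a", "instance_id": 1},  # same instance, later frame
    {"video_id": "a", "instance_id": 2},
    {"video_id": "b", "instance_id": 1},
]
toy_mean = mean_instances_per_video(toy)
```

In this toy example, video "a" has two unique instances and video "b" has one, so the average is 1.5. Applied to a real annotation file, the same counting would need to be adapted to whatever field names that file actually uses.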
UVO v0.5: Google Drive Link