In 2019, Facebook AI Research released Detectron2, which gave developers an easy path to plugging custom modules into any object detection system. Detectron2 is a PyTorch-based library for training ML models to perform object detection and segmentation. Building on Detectron2, the Mobile Vision team at Facebook Reality Labs has released Detectron2Go (D2Go).
D2Go is a new, state-of-the-art extension for training and deploying efficient deep learning object detection models on mobile devices and hardware. It is built on top of Detectron2, TorchVision, and PyTorch Mobile. As the first tool of its kind, D2Go lets users take their models all the way from training to mobile deployment.
The case for using D2Go for object detection rests mainly on two factors:
- Latency (speed)
Latency is a major challenge for many vision systems. Devices that rely on server- or cloud-based models need time to gather data, send it to the cloud for processing, and then act on the result. Latency drops if the model can live on the edge (inside the device itself), because the network round trip is removed.
- Privacy
End users also get additional security and privacy benefits from on-device models. Object recognition raises privacy concerns because people worry about sensitive data, such as personal images, being sent to the cloud. Because models built with D2Go run on-device, the data and its processing stay on the device.
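To make the latency argument concrete, here is a minimal back-of-the-envelope sketch in Python. It is not part of D2Go: the class names, the `max_fps` helper, and every timing value are hypothetical and chosen only for illustration. The sketch adds up the steps a cloud-based detector has to go through (network round trip, image upload, server inference) and compares the total with a single on-device inference.

```python
from dataclasses import dataclass


@dataclass
class CloudPath:
    """Hypothetical timings for a server-side detector."""
    network_rtt_ms: float       # round trip between device and server
    image_kb: float             # size of the uploaded frame
    uplink_kb_per_s: float      # upload bandwidth of the device
    server_inference_ms: float  # model forward pass on the server

    def total_ms(self) -> float:
        # Time to push the frame over the uplink, in milliseconds.
        upload_ms = self.image_kb * 1000.0 / self.uplink_kb_per_s
        return self.network_rtt_ms + upload_ms + self.server_inference_ms


@dataclass
class OnDevicePath:
    """Hypothetical timing for a detector running on the device itself."""
    device_inference_ms: float  # model forward pass on the phone

    def total_ms(self) -> float:
        # No network hop: the only cost is local inference.
        return self.device_inference_ms


def max_fps(latency_ms: float) -> float:
    """Upper bound on frames per second when frames are processed one at a time."""
    return 1000.0 / latency_ms


# Illustrative numbers only; real values depend on the network, model, and hardware.
cloud = CloudPath(
    network_rtt_ms=60.0,
    image_kb=200.0,
    uplink_kb_per_s=1000.0,
    server_inference_ms=20.0,
)
edge = OnDevicePath(device_inference_ms=50.0)

for name, path in (("cloud", cloud), ("on-device", edge)):
    latency = path.total_ms()
    print(f"{name}: {latency:.0f} ms per frame, at most {max_fps(latency):.1f} fps")
```

Under these assumed numbers the server runs the model faster than the phone does, yet the upload and round trip dominate the total, which is the point the latency argument rests on. The on-device path also never transmits the image, which is where the privacy benefit comes from.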
D2Go takes this one step further. Developers can create FBNet models that are pre-optimized for mobile devices, with architectures that perform detection and segmentation tasks efficiently. According to the team, these models can perform the same functions as larger, server-based models with more accuracy and efficiency. FAIR ran tests on mobile models developed with D2Go; the results showed lower latency and higher precision than the server-based counterparts.
D2Go also gives developers the option to use PyTorch Lightning as a training framework and to leverage the community’s existing tools.
According to the team, D2Go combined with FBNetV3 provides efficient detection, instance segmentation, and keypoint estimation models. These models save compute in resource-abundant cases and allow such applications to run on-device. Facebook uses D2Go to develop computer vision models where hardware-aware, real-time performance is essential to a good user experience.