Apple AI Researchers Introduce ‘MobileOne’, a Novel Mobile Backbone that Cuts Inference Time to Under One Millisecond on an iPhone 12

In a recent research paper, a group of researchers from Apple frames the problem as lowering the latency cost of efficient architectures while improving their accuracy, by identifying the key bottlenecks that affect on-device latency.

Although lowering the number of floating-point operations (FLOPs) and the parameter count has produced efficient mobile designs with high accuracy, factors such as memory access cost and degree of parallelism continue to hurt latency during inference.

In the new paper An Improved One millisecond Mobile Backbone, the research team introduces MobileOne, an efficient neural network backbone for mobile devices that brings inference time to under one millisecond on an iPhone 12 while achieving 75.9% top-1 accuracy on ImageNet.

The team’s significant contributions are summarized as follows:

  • The team presents MobileOne, an architecture that runs in under one millisecond on a mobile device and delivers state-of-the-art image classification accuracy among efficient model architectures. The performance gains also carry over to desktop CPUs.
  • They analyze the performance bottlenecks in activations and branching that cause high latency costs on mobile in current efficient networks.
  • They study the effects of train-time re-parameterizable branches and of dynamically relaxing regularization during training. Together, these help overcome optimization bottlenecks that can occur when training small models.
  • Their model generalizes to other tasks, such as object detection and semantic segmentation, and outperforms earlier efficient approaches.

The paper begins with an overview of MobileOne’s architectural blocks, which are designed for convolutional layers factorized into depthwise and pointwise layers. The basic block builds on Google’s MobileNet-V1 block: a 3×3 depthwise convolution followed by a 1×1 pointwise convolution. To boost model performance, over-parameterization branches are also employed.
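To make the benefit of the depthwise/pointwise factorization concrete, the following minimal sketch compares parameter and multiply-accumulate (MAC) counts for a dense k×k convolution and its depthwise-separable counterpart. The function names and the layer shape are hypothetical, chosen only for illustration; the sketch assumes stride 1, no bias terms, and the same output size for both variants.

```python
def standard_conv_cost(k, c_in, c_out, h_out, w_out):
    """Return (parameters, MACs) for a dense k x k convolution, bias ignored."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out  # every weight is used once per output position
    return params, macs


def depthwise_separable_cost(k, c_in, c_out, h_out, w_out):
    """Return (parameters, MACs) for a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 conv that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h_out * w_out
    return params, macs


# Hypothetical layer shape, not taken from the paper.
dense = standard_conv_cost(3, 64, 128, 56, 56)
factored = depthwise_separable_cost(3, 64, 128, 56, 56)
print("dense    (params, MACs):", dense)
print("factored (params, MACs):", factored)
print("parameter reduction: %.1fx" % (dense[0] / factored[0]))
```

These counts capture only arithmetic and storage. As the article notes, memory access and parallelism also affect latency, so a lower count does not by itself guarantee a faster layer.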

MobileOne employs a depth scaling strategy similar to MobileNet-V2: the early stages, which operate on higher input resolution and therefore have slower layers, are kept shallower. Because the design does not require a multi-branched architecture at inference time, it avoids the data movement costs that branching incurs. Compared to multi-branched architectures, this allows the researchers to scale up model parameters aggressively without paying a heavy latency penalty.
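The reason a block can be multi-branched during training yet single-branched at inference is that convolution is linear: parallel branches that are summed can be folded into one kernel. The sketch below illustrates this idea on a single channel in plain Python. It is a simplified illustration, not the authors' implementation: it omits batch normalization and multi-channel weights, which a real implementation would also have to fold, and all function names are hypothetical.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding; output matches input size."""
    h, w, k = len(image), len(image[0]), len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += image[y][x] * kernel[di][dj]
    return out


def multi_branch_forward(image, kernels_3x3, scale_1x1, use_skip):
    """Train-time view: evaluate every branch separately and sum the results."""
    h, w = len(image), len(image[0])
    branch_outputs = [conv2d_same(image, kern) for kern in kernels_3x3]
    branch_outputs.append(conv2d_same(image, [[scale_1x1]]))  # 1x1 scale branch
    if use_skip:
        branch_outputs.append([row[:] for row in image])      # identity skip branch
    total = [[0.0] * w for _ in range(h)]
    for branch in branch_outputs:
        for i in range(h):
            for j in range(w):
                total[i][j] += branch[i][j]
    return total


def merge_branches(kernels_3x3, scale_1x1, use_skip):
    """Inference-time view: fold all branches into a single 3x3 kernel."""
    merged = [[0.0] * 3 for _ in range(3)]
    for kern in kernels_3x3:
        for r in range(3):
            for c in range(3):
                merged[r][c] += kern[r][c]
    merged[1][1] += scale_1x1   # a 1x1 kernel only contributes to the centre tap
    if use_skip:
        merged[1][1] += 1.0     # identity is equivalent to a centre tap of 1
    return merged


# Arbitrary illustrative inputs.
image = [[1.0, 2.0, 0.0, -1.0],
         [0.5, 1.5, 2.5, 3.0],
         [-2.0, 0.0, 1.0, 4.0],
         [3.0, -1.0, 0.5, 2.0]]
kernels = [[[0.5, 0.0, -0.5], [1.0, 2.0, 1.0], [0.0, -1.0, 0.25]],
           [[0.25, 0.5, 0.0], [0.0, -1.0, 0.5], [1.0, 0.0, -0.25]]]

reference = multi_branch_forward(image, kernels, 0.5, True)
fused = conv2d_same(image, merge_branches(kernels, 0.5, True))
max_diff = max(abs(a - b) for ra, rb in zip(reference, fused) for a, b in zip(ra, rb))
print("max difference between branched and fused outputs:", max_diff)
```

The fused path performs one convolution instead of four separate branch evaluations and a sum, which is where the savings in data movement come from.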

MobileOne was evaluated on the ImageNet benchmark, with latency measured on mobile devices. On an iPhone 12, the MobileOne-S1 model achieved an inference time of under one millisecond while reaching 75.9% top-1 accuracy. MobileOne’s versatility was also demonstrated on other computer vision tasks: the researchers used it as the backbone feature extractor for a single-shot object detector and for a DeepLab V3 segmentation network.

The research team also examines how two popular metrics, FLOPs and parameter count, relate to latency on a mobile device, and how different architectural design decisions affect latency on the phone. Their design and training procedure are based on the results of this evaluation.
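One common way to quantify how closely a proxy metric such as FLOPs tracks measured latency is a rank correlation. The sketch below computes Spearman's rank correlation in plain Python; the choice of this statistic is an assumption made here for illustration, the helper names are hypothetical, and the input values are made-up placeholders rather than measurements from the paper.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equal-length sequences without ties."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Made-up placeholder values, NOT measurements from the paper.
toy_flops_millions = [100, 200, 300, 400]
toy_latency_ms = [1.0, 3.0, 2.0, 4.0]
rho = spearman(toy_flops_millions, toy_latency_ms)
print("Spearman rho between FLOPs and latency:", rho)
```

A value near 1 would mean that ranking models by FLOPs also ranks them by latency; a noticeably lower value means the proxy is a weaker guide to on-device speed.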

Overall, the study confirms the proposed MobileOne as an efficient, general-purpose backbone that produces state-of-the-art outcomes while being several times quicker on mobile devices compared to existing efficient designs.

This article is a summary written by Marktechpost Staff based on the paper 'An Improved One millisecond Mobile Backbone'. All credit for this research goes to the researchers on this project.
