Neural Architecture Search (NAS) has recently become an active area of deep learning research, offering promising results. One approach, vanilla NAS, uses search techniques to explore the search space and evaluates each new architecture by training it from scratch. However, this can require thousands of GPU hours, a computing cost that is too high for many research applications.
Researchers often use another approach, one-shot NAS, which substantially lowers the computing cost by using a supernet. A supernet can approximate the accuracy of the neural architectures in the search space without each of them having to be trained from scratch. However, the search can be hampered by inaccurate predictions from the supernet, making it hard to identify suitable architectures.
Facebook has therefore recently introduced few-shot NAS. This method combines the accurate network ranking of vanilla NAS with the speed and low computing cost of one-shot NAS.
Few-shot NAS enables any user to design a powerful customized model for their tasks using very few GPUs. The researchers also show that it effectively designs numerous state-of-the-art models, ranging from convolutional neural networks to generative adversarial networks.
Few-shot NAS improves performance estimation by partitioning the search space into independent regions and then employing multiple sub-supernets to cover these regions. To partition the search space sensibly, the researchers use the structure of the original supernet. By picking each edge connection individually, they split the search space in a way that is consistent with how the supernet is constructed.
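The partitioning idea can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the toy search space (4 edges, 6 candidate operations per edge, 1,296 architectures in total), the operation names, and the `partition` helper are all assumptions made for illustration. Each region is obtained by fixing the operation on one or more chosen edges, and each region would be covered by its own sub-supernet.

```python
from itertools import product

# Hypothetical toy search space (an assumption, not taken from the article):
# 4 edges, each choosing one of 6 candidate operations.
OPS = ["op0", "op1", "op2", "op3", "op4", "op5"]
NUM_EDGES = 4


def partition(num_edges, ops, split_edges):
    """Group every architecture by the operations it uses on `split_edges`.

    Each key identifies one region (one sub-supernet); its value lists the
    architectures (one operation per edge) that fall into that region.
    """
    split_edges = list(split_edges)
    regions = {}
    for arch in product(ops, repeat=num_edges):
        key = tuple(arch[e] for e in split_edges)
        regions.setdefault(key, []).append(arch)
    return regions


# Splitting on more edges yields more, smaller regions.
for k in range(NUM_EDGES):
    regions = partition(NUM_EDGES, OPS, split_edges=range(k))
    region_size = len(next(iter(regions.values())))
    print(f"split {k} edge(s): {len(regions)} region(s) of {region_size} architectures")
```

In this toy setting, splitting on no edges leaves a single region (the one-shot case), while splitting on 1, 2, or 3 edges gives 6, 36, or 216 regions. The regions never overlap and together cover the whole space, which is what lets each sub-supernet specialize on its own part.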
To investigate whether using multiple supernets could offer the best aspects of both one-shot NAS and vanilla NAS, the research team at Facebook designed a search space containing 1,296 networks. First, they trained all of the networks to rank them by their actual accuracies on the CIFAR10 data set. Next, they predicted the accuracies of the 1,296 networks using 6, 36, and 216 sub-supernets. Lastly, they compared the predicted rankings with the actual ranking. The results showed that the ranking improved significantly even when adding just a few sub-supernets.
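One common way to compare a predicted ranking with an actual ranking is a rank correlation such as Kendall's tau. The sketch below assumes that metric for illustration; the article does not state which measure the team used. The `kendall_tau` helper and all score values are made up for the example and are not results from the experiment.

```python
from itertools import combinations


def kendall_tau(true_scores, predicted_scores):
    """Kendall's tau-a between two equally long score lists.

    Counts pairs of networks ordered the same way by both lists (concordant)
    and pairs ordered oppositely (discordant). Tied pairs count as neither.
    Returns a value in [-1, 1]; 1 means the rankings agree exactly.
    """
    n = len(true_scores)
    if n != len(predicted_scores):
        raise ValueError("score lists must have the same length")
    if n < 2:
        raise ValueError("need at least two networks to compare rankings")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (true_scores[i] - true_scores[j]) * (
            predicted_scores[i] - predicted_scores[j]
        )
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)


# Placeholder scores for five hypothetical networks (illustrative only).
toy_true = [0.91, 0.93, 0.89, 0.95, 0.90]    # accuracy after full training
toy_pred_a = [0.60, 0.58, 0.62, 0.65, 0.55]  # a noisier predictor
toy_pred_b = [0.70, 0.74, 0.66, 0.78, 0.69]  # a predictor preserving the order

print("predictor A:", kendall_tau(toy_true, toy_pred_a))
print("predictor B:", kendall_tau(toy_true, toy_pred_b))
```

A predictor does not need to reproduce the true accuracies to be useful for search. It only needs to order the candidates correctly, which is why a rank-based measure fits this comparison better than an error on the raw accuracy values.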
The team then tested their idea on real-world tasks and found that, compared with one-shot NAS, few-shot NAS improved the accuracy of architecture evaluations with only a small increase in evaluation cost.
On ImageNet, few-shot NAS finds models that reach nearly 80.5 percent top-1 accuracy at 600 MFLOPS and 77.5 percent top-1 accuracy at 238 MFLOPS. In AutoGAN, few-shot NAS outperforms the previously achieved results by almost 20 percent, and on CIFAR10 it reaches 98.72 percent top-1 accuracy without using any extra data or transfer learning.
Experiments have demonstrated that few-shot NAS significantly improves various one-shot methods, including four gradient-based and six search-based methods, on three different tasks in NasBench-201 and NasBench1-shot-1.
The work shows that few-shot NAS is a simple yet highly effective advance over one-shot NAS in improving ranking prediction. Moreover, it is also widely applicable to all existing NAS methods. While the team presents these scenarios as concrete examples, the technique they have developed can have broad applications, for example, whenever candidate architectures need to be evaluated quickly in search of better architectures.
Few-shot NAS contributes to the design of accurate and fast value models. Applying this few-shot approach can improve the search efficiency of various neural architecture search algorithms that use a supernet, such as AttentiveNAS and AlphaNet. The research team at Facebook hopes that their method can be used in even broader scenarios.
Consultant Intern: Kriti Maloo is currently pursuing her B.Tech from Indian Institute of Technology (IIT) Bhubaneswar. She is interested in Data Analytics and its applications in various domains. She is a Bibliophile and loves to explore new advancements in the field of technology.