Google AI has announced the release of Model Search, a platform that helps researchers develop machine learning (ML) models automatically and efficiently. Model Search is domain-agnostic, flexible, and well equipped to find the architecture that best fits a given dataset and problem, while minimizing coding time, effort, and compute resources. It is built on TensorFlow and can run either on a single machine or in a distributed setting.
The success of neural networks often depends on how well they generalize to various tasks. Designing neural networks that generalize well is challenging because the research community’s understanding of generalization is still limited, and the difficulty grows when the variety of machine learning domains is taken into account. Techniques such as neural architecture search (NAS) use algorithms like reinforcement learning (RL), evolutionary algorithms, and combinatorial search to build a neural network from a given search space. Although these techniques can deliver better results than manually designed networks, they are computationally heavy, often need to train thousands of models before converging, and are domain-specific.
Model Search is designed to overcome these shortcomings. The system consists of multiple trainers, a search algorithm, and a database that stores evaluated models. It runs training and evaluation experiments in an adaptive, asynchronous manner: each trainer conducts its experiments independently, and all trainers share the knowledge gained from them. At the start of every cycle, the search algorithm goes over all completed trials and uses beam search to decide what to try next. It then applies a mutation to one of the best architectures found so far and assigns the resulting model back to a trainer.
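The search cycle described above can be illustrated with a small, self-contained sketch. It is not the Model Search implementation: the block names, the `toy_evaluate` score, and all function names are hypothetical, and the loop runs sequentially on one machine rather than asynchronously across trainers. It only shows the general idea of keeping a database of trials, selecting a beam of the best ones, and mutating one of them.

```python
import random

# Hypothetical block names; a real search space is defined by the user.
BLOCKS = ["conv", "lstm", "dense", "residual"]


def mutate(architecture, rng):
    """Return a mutated copy of the architecture (the parent is not modified)."""
    child = list(architecture)
    if child and rng.random() < 0.5:
        # Replace one existing block with a randomly chosen one.
        child[rng.randrange(len(child))] = rng.choice(BLOCKS)
    else:
        # Grow the network by appending one block.
        child.append(rng.choice(BLOCKS))
    return child


def propose_next(completed_trials, beam_size, rng):
    """Choose the next architecture to try.

    completed_trials is a list of (architecture, score) pairs, higher is better.
    """
    # The beam: the `beam_size` best trials seen so far.
    beam = sorted(completed_trials, key=lambda t: t[1], reverse=True)[:beam_size]
    parent, _ = rng.choice(beam)
    return mutate(parent, rng)


def toy_evaluate(architecture):
    """Stand-in for training and evaluation: a made-up score, not an accuracy."""
    bonus = sum(1.0 for block in architecture if block == "residual")
    return bonus - 0.1 * len(architecture)


def run_search(num_cycles, beam_size=3, seed=0):
    """Run the toy search and return the database of evaluated trials."""
    rng = random.Random(seed)
    start = ["dense"]
    database = [(start, toy_evaluate(start))]  # stores every evaluated model
    for _ in range(num_cycles):
        candidate = propose_next(database, beam_size, rng)
        database.append((candidate, toy_evaluate(candidate)))
    return database
```

Calling `run_search(20)` returns 21 trials (the starting architecture plus one per cycle); sorting them by score shows which toy architecture the search favoured.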
The neural network is built from a set of predefined blocks. This approach is more efficient because it explores only structures built from blocks, not their fundamental and detailed components, which reduces the scale of the search space. Because the framework is built on TensorFlow, blocks can implement any function that takes a tensor as input. Moreover, the blocks provided can be fully defined neural networks that are already known to work for the given problem; in this case, Model Search can be configured to act as a powerful ensembling machine. The search algorithms used in Model Search are adaptive, greedy, and incremental, which makes them converge faster than RL algorithms.
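The idea of a block as "any function that takes a tensor as input" can be sketched without TensorFlow. In the example below a plain list of floats stands in for a tensor, and the three blocks, `build_network`, and `ensemble` are hypothetical names chosen for illustration. Averaging the outputs is only one simple way to ensemble and is an assumption here, not a description of how Model Search combines models.

```python
from typing import Callable, List, Sequence

Tensor = List[float]                 # stand-in for a real tensor type
Block = Callable[[Tensor], Tensor]   # a block maps a tensor to a tensor


def scale_block(x: Tensor) -> Tensor:
    """Hypothetical block: multiply every element by 2."""
    return [2.0 * v for v in x]


def relu_block(x: Tensor) -> Tensor:
    """Hypothetical block: clamp negative elements to zero."""
    return [max(0.0, v) for v in x]


def shift_block(x: Tensor) -> Tensor:
    """Hypothetical block: subtract 1 from every element."""
    return [v - 1.0 for v in x]


def build_network(blocks: Sequence[Block]) -> Block:
    """Chain blocks in order; the result is itself usable as a block."""
    def network(x: Tensor) -> Tensor:
        for block in blocks:
            x = block(x)
        return x
    return network


def ensemble(networks: Sequence[Block]) -> Block:
    """Average the element-wise outputs of several networks (an assumption)."""
    def averaged(x: Tensor) -> Tensor:
        outputs = [net(x) for net in networks]
        return [sum(vals) / len(vals) for vals in zip(*outputs)]
    return averaged
```

Because `build_network` returns a function with the same signature as a block, a fully defined network can be passed wherever a block is expected, which is what makes the ensembling case possible in this sketch.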
To improve efficiency and accuracy, Model Search enables transfer learning between its internal experiments in two ways: knowledge distillation and weight sharing. Knowledge distillation improves a candidate’s accuracy by adding a loss term that matches the predictions of high-performing models, in addition to the ground truth. Weight sharing, in contrast, bootstraps some of the network’s parameters from previously trained candidates by copying suitable weights from those models and randomly initializing the remaining ones.
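Both transfer mechanisms can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the Model Search code: the weighting factor `alpha`, the use of cross-entropy for the teacher-matching term, and the rule that a weight is "suitable" when the layer name and parameter count match are all choices made for this example.

```python
import math
import random


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_probs, one_hot_label, teacher_probs, alpha=0.5):
    """Ground-truth loss plus a weighted term that matches the teacher."""
    hard = cross_entropy(one_hot_label, student_probs)   # fit the ground truth
    soft = cross_entropy(teacher_probs, student_probs)   # fit the teacher's predictions
    return hard + alpha * soft


def share_weights(candidate_sizes, trained_weights, rng):
    """Initialise a new candidate from a previously trained model.

    candidate_sizes maps layer name -> number of parameters in the candidate.
    trained_weights maps layer name -> list of floats from a trained model.
    """
    weights = {}
    for name, size in candidate_sizes.items():
        donor = trained_weights.get(name)
        if donor is not None and len(donor) == size:
            # Assumed notion of "suitable": same layer name and same size.
            weights[name] = list(donor)
        else:
            # Everything else starts from a small random initialisation.
            weights[name] = [rng.gauss(0.0, 0.1) for _ in range(size)]
    return weights
```

For example, `share_weights({"conv1": 3, "dense": 2}, {"conv1": [0.1, 0.2, 0.3], "dense": [1.0]}, random.Random(0))` copies the three `conv1` weights and randomly initialises `dense`, because the trained `dense` layer has a different size.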
The researchers claim that Model Search improves upon production models with minimal iterations. They illustrated its capabilities in the speech domain by discovering models for keyword spotting and language identification; the search used fewer than 200 iterations and was reported to improve efficiency. The researchers also applied Model Search to find an architecture for image classification on the heavily explored CIFAR-10 dataset. There, it reached a benchmark accuracy of 91.83% in only 209 trials, compared with 5807 trials for the RL algorithm, roughly 28 times fewer.
The Model Search code aims to provide researchers with a flexible, domain-agnostic framework for ML model discovery. Given a search space composed of standard building blocks, the framework is powerful enough to build models with state-of-the-art performance on well-known problems. The release extends access to AutoML solutions to the wider research community.
Consultant Intern: Kriti Maloo is currently pursuing her B.Tech at the Indian Institute of Technology (IIT) Bhubaneswar. She is interested in data analytics and its applications in various domains. She is a bibliophile and loves to explore new advancements in technology.