CMU Researchers Explain the Effectiveness of AutoML for Diverse Tasks Using AutoML Decathlon and NAS-Bench-360

Machine learning (ML) has experienced a sharp increase in popularity and complexity over the past ten years. Improved deep neural networks are now used in a wide range of tasks, from detecting credit card fraud and solving partial differential equations (PDEs) to predicting medical conditions from gene sequences.

However, these applications typically require expert-driven design of intricate architectures and hyperparameter tuning schemes. Such resource-intensive iteration is costly and out of reach for most practitioners.

AutoML aims to enable ML developers to use ML on arbitrary new tasks. New research by Carnegie Mellon University investigates whether currently accessible AutoML tools easily and rapidly achieve near-expert performance on various learning tasks. The AutoML Decathlon and NAS-Bench-360 are two recent but related projects that assess the field’s current success in attaining this objective.

NAS-Bench-360: A NAS Benchmark for a Variety of Tasks

The first is a suite of benchmarks for neural architecture search (NAS), an emerging field that aims to automate the design of neural network models. NAS-Bench-360 is the first NAS testbed that extends beyond conventional AI domains like vision, text, and audio signals, with evaluations on ten different tasks.

The ten tasks differ in domain (such as images, financial time series, audio, and the natural sciences), problem type (such as regression, single-label, and multi-label classification), and scale (ranging from several thousand to hundreds of thousands of observations).
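To make these three axes of variation concrete, a benchmark task can be described by its domain, problem type, and scale. The sketch below is a hypothetical task descriptor; the `TaskSpec` class and the example task names are illustrative assumptions, not part of NAS-Bench-360's actual interface:

```python
from dataclasses import dataclass

# Hypothetical descriptor for a benchmark task; field names are
# illustrative, not NAS-Bench-360's actual API.
@dataclass
class TaskSpec:
    name: str
    domain: str        # e.g. "images", "audio", "financial time series"
    problem_type: str  # "regression", "single-label", or "multi-label"
    num_examples: int  # scale: thousands to hundreds of thousands

# Two example tasks spanning different domains and problem types
# (names chosen for illustration only).
tasks = [
    TaskSpec("image-task", "images", "single-label", 60_000),
    TaskSpec("pde-task", "natural sciences", "regression", 9_000),
]

# Grouping tasks by problem type, as an evaluator might when
# choosing a loss function and metric per task.
by_type = {}
for t in tasks:
    by_type.setdefault(t.problem_type, []).append(t.name)
```

Representing tasks this way makes it explicit that a single AutoML method must handle heterogeneous problem types and scales, rather than being specialized to one domain.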

Their assessment on NAS-Bench-360 serves as a robustness test of whether the computer-vision-driven progress in NAS is indicative of the broader success of AutoML across applications, data types, and tasks. More significantly, the benchmark will be an effective tool for developing and testing new, improved NAS methods.

Their findings show that a cutting-edge algorithm such as GAEA, searching over a large search space such as DARTS, does produce models that consistently outperform expert architectures on half of the tasks, in addition to beating XGBoost, a stalwart Kaggle favorite, and Perceiver IO, a recent attempt at a general-purpose architecture. On the other hand, it performs poorly on several tasks, barely outperforming a simple baseline such as a tuned Wide ResNet.
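The XGBoost baseline mentioned above is essentially a well-tuned gradient-boosted tree model fit on the task's features. The sketch below illustrates that style of non-neural baseline using scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost; the synthetic dataset and hyperparameters are illustrative assumptions, not the benchmark's actual setup:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one benchmark task; real tasks range from
# images to financial time series, typically flattened into vectors.
X, y = make_classification(n_samples=2000, n_features=30,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)

# Gradient-boosted trees: the kind of strong non-neural baseline
# (XGBoost in the study) that NAS-produced models are compared against.
model = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                   random_state=0)
model.fit(X_tr, y_tr)
acc = accuracy_score(y_te, model.predict(X_te))
```

The point of such a baseline is that any NAS or AutoML method claiming generality should at least match what a cheap, well-understood tree ensemble achieves on tabular-style tasks.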

The evaluation of contemporary NAS methods on NAS-Bench-360 demonstrates a lack of robustness in the field and the need for such a benchmark.

The AutoML Decathlon: A Competition with a Variety of Tasks and Techniques

The second project is a NeurIPS 2022 competition built on their NAS-Bench-360 work. The NAS-Bench-360 release is intended to encourage the creation of NAS techniques that excel at a variety of tasks. During the competition's public development phase, the researchers will release a set of tasks that serve as a sample of (though are separate from) the final set of test tasks used for evaluation.

The inconsistent results of NAS on this benchmark question whether autonomous architecture design should even be the core area of study for AutoML research in general.

This contest aims to close two gaps between research and application: lack of task variety and siloed methodological development.

By developing the competition in a practitioner-centric manner and addressing the two aforementioned shortcomings, the organizers intend to promote innovation in AutoML with outcomes that transfer immediately to ML practitioners.

Although AutoML is not a new field of study, the competition is timely given:

  1. The rapid expansion in the diversity of ML tasks
  2. The advancement of ML model development
  3. The rapid growth of dataset sizes and compute resource availability

The foundation of the AutoML Decathlon is a collection of 20 datasets that have been carefully chosen to reflect a wide range of real-world uses in science, technology, and industry fields.

Ten of the tasks will be used for development, and ten more will be used for final evaluation and won't be made public until after the competition.


