PDEBENCH: A Benchmark Suite of Time-Dependent Simulation Tasks for Evaluating the Performance of Novel Machine Learning Models

Recent advances in the emerging field of Scientific Machine Learning (also known as machine learning for physical sciences or data-driven science) have expanded the scope of traditional machine learning (ML) methods to include the time evolution of physical systems. Rapid progress has been made in using neural networks to make predictions from available observations over continuous domains, or under challenging constraints and physically motivated conservation laws. These neural networks offer a method for solving PDEs that complements traditional numerical solvers. Data-driven ML methods, for example, are helpful when observations are noisy or when the underlying physical model is not fully known or defined.

Furthermore, neural models have the advantage of being continuously differentiable in their inputs, which is helpful in various applications. In physical system design, for example, the models are physical objects and thus cannot be analytically differentiated. Similarly, benchmark physical simulation models exist in many fields, such as hydrology, but these forward simulation models are non-differentiable black boxes. This makes solving optimization, control, sensitivity analysis, and inverse inference problems more difficult. While complex methods such as Bayesian optimization or reduced order modeling attempt to compensate for this lack of differentiability, gradients for neural networks are both readily available and efficient to compute.
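To make the role of gradients concrete, here is a minimal sketch of a gradient-based inverse problem: recovering a diffusion coefficient from a few observations. Everything in it is an assumption made for illustration. The `Dual` class is a tiny forward-mode automatic differentiation helper standing in for the autodiff that neural network frameworks provide, and `surrogate` is a closed-form stand-in (the decay of the first Fourier mode of the 1D heat equation), not a trained network and not part of PDEBENCH.

```python
import math

class Dual:
    """A value together with its derivative (forward-mode autodiff)."""
    def __init__(self, val, grad=0.0):
        self.val, self.grad = val, grad

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.grad + other.grad)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.grad - other.grad)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule
        return Dual(self.val * other.val,
                    self.grad * other.val + self.val * other.grad)
    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.val, -self.grad)

def exp(x):
    """Exponential of a Dual number (chain rule)."""
    return Dual(math.exp(x.val), math.exp(x.val) * x.grad)

def surrogate(nu, t):
    """Hypothetical differentiable model: amplitude exp(-nu * t)."""
    return exp(-nu * t)

def loss(nu, observations):
    """Mean squared misfit between the surrogate and the observations."""
    total = Dual(0.0)
    for t, u_obs in observations:
        residual = surrogate(nu, t) - u_obs
        total = total + residual * residual
    return total * (1.0 / len(observations))

# Synthetic observations generated with a known coefficient
true_nu = 0.5
observations = [(t, math.exp(-true_nu * t)) for t in (0.5, 1.0, 1.5, 2.0)]

# Plain gradient descent on nu, using the derivative carried by Dual
nu_est = 0.1
for _ in range(500):
    grad = loss(Dual(nu_est, 1.0), observations).grad
    nu_est -= 0.5 * grad

print(nu_est)
```

With a black-box simulator, the same loop would need finite differences or a derivative-free search in place of the `.grad` lookup.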

There are numerous popular benchmarks for classical ML applications, such as image classification, time series prediction, and text mining, and evaluations using these benchmarks provide a standardized means of testing the effectiveness and efficiency of ML models. Scientific ML, by contrast, currently lacks widely available, practically simple, and statistically challenging benchmarks with ready-to-use datasets for comparing methods. While some progress has been made toward reference benchmarks in recent years, the PDEBENCH authors hope to provide a benchmark that is more comprehensive in terms of the PDEs covered and that allows for more diverse ways of evaluating the efficiency and accuracy of an ML method.

The benchmark problems involve a variety of governing equations, assumptions, and conditions; for a visual teaser, see the figure below. Data can be generated by running code through a standard interface or obtained by downloading high-fidelity simulation datasets. All code is distributed under a permissive open-source license, allowing easy reuse and extension. In addition, the authors provide an API to facilitate the implementation and evaluation of new methods, implementations of current competitive baseline methods such as FNOs and autoregressive models based on the U-Net, and a set of pre-computed performance metrics for these algorithms. As a result, users can compare a model's predictions to the “ground truth” provided by the baseline simulators used to generate the data.
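The autoregressive baselines mentioned above predict one time step at a time and feed each prediction back in as the next input. The rollout loop and the comparison against a ground-truth trajectory can be sketched as follows. The names `step_model`, `damped_model`, and `autoregressive_rollout` are hypothetical, and the "models" are simple stand-ins (a periodic shift of a 1D field) rather than trained networks or anything from the PDEBENCH code.

```python
def step_model(state):
    """Hypothetical one-step surrogate: shift a periodic 1D field by one cell.

    This is exact advection by one grid cell per step. A real baseline
    would be a trained neural network.
    """
    return state[-1:] + state[:-1]

def damped_model(state):
    """Imperfect stand-in model: same shift, but loses 10% amplitude per step."""
    return [0.9 * v for v in step_model(state)]

def autoregressive_rollout(model, initial_state, n_steps):
    """Apply `model` repeatedly, feeding each prediction back as input."""
    trajectory = [list(initial_state)]
    for _ in range(n_steps):
        trajectory.append(model(trajectory[-1]))
    return trajectory

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

u0 = [0.0, 1.0, 0.0, 0.0]
truth = autoregressive_rollout(step_model, u0, n_steps=4)      # "ground truth"
rollout = autoregressive_rollout(damped_model, u0, n_steps=4)  # model prediction

# Per-step error of the prediction against the ground truth
errors = [mse(p, t) for p, t in zip(rollout, truth)]
print(errors)
```

Because each prediction becomes the next input, a small per-step error compounds over the rollout, which is why errors are usually tracked per time step rather than only at the end.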

PDEBENCH offers a variety of non-trivial scientific challenges for benchmarking current and future ML methods, such as wave propagation and turbulent flow in 2D and 3D.

Benchmarks in Scientific ML, as in other machine learning application domains, can provide readily available training data for algorithm development and testing without the overhead of generating data from scratch. The training and test data in these emulation tasks are theoretically unlimited because a simulator can always generate more data. In practice, producing such datasets can be highly demanding in terms of computing time, storage, and access to the specialized skills required. PDEBENCH addresses the need for quick, off-the-shelf training data, avoiding these roadblocks while providing an easy on-ramp for future expansion.

In the paper, the authors propose a versatile benchmark suite for Scientific ML with the following properties:

  1. It provides diverse data sets with distinct properties based on 11 well-known time-dependent and time-independent PDEs.
  2. It covers both “classical” forward learning problems and inverse settings.
  3. It is accessible via a uniform interface to read/store data across several applications.
  4. It is extensible.
  5. It has results for popular state-of-the-art ML models (such as FNO and U-Net) across varying initial conditions and PDE parameters (e.g., viscosity).

Each data set contains sufficient samples for training and testing across a wide range of parameter values, with a resolution high enough to capture local dynamics. The authors' goal is not to provide a complete benchmark that includes all possible combinations of inference tasks on all known experiments, but to make it easier for future researchers to benchmark their preferred methods. They invite other researchers to use the ready-to-run models to fill in the gaps for themselves. To assess ML methods for scientific problems, they consider several metrics that go beyond the standard MSE and take physical properties into account.

The preliminary experimental results obtained with PDEBENCH confirm the importance of comprehensive Scientific ML benchmarks: no single model fits all problems, and there is plenty of room for new ML developments. The results show that the MSE on test data, the standard error measure in ML, is not a good proxy for evaluating ML models, particularly in turbulent and non-smooth regimes where it fails to capture small spatial scale changes. The authors also discuss an application in which a parameter of the underlying PDE significantly influences the difficulty of the problem for ML baselines. The dataset, code, and pretrained models are all freely available on the web.
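To illustrate why a single MSE value can hide small-scale errors, the following minimal sketch compares a plain RMSE with an error computed separately on low and high Fourier modes of a 1D signal. The function names (`dft`, `band_rmse`), the band boundary, and the toy signals are assumptions chosen for illustration. This is a simplified band-wise error, not the metric implementation shipped with PDEBENCH.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform, O(n^2); adequate for a small demo."""
    n = len(signal)
    return [
        sum(signal[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]

def rmse(pred, target):
    n = len(target)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / n)

def band_rmse(pred, target, k_min, k_max):
    """Root mean square of the error's Fourier amplitudes for k_min <= k <= k_max."""
    n = len(target)
    err_hat = dft([p - t for p, t in zip(pred, target)])
    # Normalised amplitudes of the non-negative wavenumbers inside the band
    band = [abs(err_hat[k]) / n for k in range(k_min, min(k_max, n // 2) + 1)]
    return math.sqrt(sum(a ** 2 for a in band) / len(band))

n = 64
x = [2 * math.pi * j / n for j in range(n)]

# Reference: a large-scale wave plus a small-scale ripple
target = [math.sin(xj) + 0.1 * math.sin(20 * xj) for xj in x]
# Prediction A misses the ripple entirely (a small-scale error)
pred_smooth = [math.sin(xj) for xj in x]
# Prediction B keeps the ripple but underestimates the large wave
pred_damped = [0.9 * math.sin(xj) + 0.1 * math.sin(20 * xj) for xj in x]

for name, pred in [("smooth", pred_smooth), ("damped", pred_damped)]:
    print(name,
          rmse(pred, target),
          band_rmse(pred, target, 0, 4),        # low wavenumbers
          band_rmse(pred, target, 5, n // 2))   # high wavenumbers
```

Both toy predictions have the same RMSE, yet the error of the first sits entirely in the high-wavenumber band and the error of the second entirely in the low band. A scalar MSE cannot tell these two failure modes apart.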

This article is a research summary written by Marktechpost staff based on the research paper 'PDEBENCH: An Extensive Benchmark for Scientific Machine Learning'. All credit for this research goes to the researchers on this project.

Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.
