A Study on Various Deep Learning-based Weather Forecasting Models

Because weather affects human life worldwide, weather forecasting has drawn interest from many research communities. Recent advances in deep learning, the wide availability of massive weather observation datasets, and progress in information and computer technology have motivated many studies that explore hidden hierarchical patterns in large weather datasets. Machine learning (ML) techniques have been applied to forecast extreme weather events, identify extreme weather and climate patterns in observed and modeled atmospheric conditions, and provide operational guidance and risk assessment for severe weather. The past few years have seen the development of deep learning-based weather forecasting models such as MetNet-2, WF-UNet, ClimaX, GraphCast, and Pangu-Weather. This article briefly discusses several of these models to give insight into how they are overtaking traditional meteorological simulators, in some cases by large margins.

ClimaX: Foundation Model For Weather & Climate 

Numerical atmospheric models grounded in physics are the backbone of today’s weather and climate forecasting software. These models describe nonlinear dynamics and intricate interactions among many variables, which makes them hard to approximate, and simulating atmospheric processes numerically at high spatial and temporal resolution is computationally demanding. Recent data-driven machine learning techniques instead tackle downstream forecasting or projection tasks directly by training a deep neural network to learn the functional mapping from data. Because these networks are trained on curated, homogeneous climate data for specific spatiotemporal tasks, they lack the generality of numerical models.

New research by Microsoft Autonomous Systems and Robotics Research, Microsoft Research AI4Science, and UCLA presents ClimaX, a deep learning model for weather and climate science that can be trained on heterogeneous datasets with different variables, spatial and temporal coverage, and physical foundations. ClimaX is pretrained without labels on climate datasets derived from CMIP6. To use the available compute effectively while keeping broad usability, ClimaX extends the Transformer architecture with novel encoding and aggregation blocks.

After pretraining, ClimaX can be fine-tuned to perform a wide range of climate and weather tasks, including tasks that involve atmospheric variables and time and space scales not seen during pretraining. Even when pretrained at lower resolutions and with smaller compute budgets, ClimaX’s generality allows it to outperform data-driven baselines on weather forecasting and climate prediction benchmarks.

The researchers believe this method’s universality may make it useful for more diverse purposes. This may include predicting extreme weather events and evaluating anthropogenic climate change, two examples of Earth systems science tasks that could benefit from a ClimaX backbone that has already been pretrained. Agriculture, demography, and actuarial sciences are also interesting candidates because of their close ties to weather and climate.

Pangu-Weather For Global Weather Forecasting

A team of researchers from Huawei Cloud Computing introduced Pangu-Weather, a global weather forecasting system based on deep learning. The team gathered 43 years of hourly global meteorological data from the ECMWF’s fifth-generation reanalysis (ERA5) to create a data-driven environment and trained a few deep neural networks with about 256 million parameters in total.

According to the team, this is the first AI-based approach that outperforms cutting-edge numerical weather prediction (NWP) techniques in accuracy on all variables (such as geopotential, specific humidity, wind speed, and temperature) and across all time scales (from one hour to one week). Prediction accuracy is increased by a hierarchical temporal aggregation strategy and a 3D Earth Specific Transformer (3DEST) architecture that formulates height (pressure level) information as cubic data. Pangu-Weather’s forte is short- to medium-range deterministic forecasting (i.e., forecast times ranging from one hour to one week).
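The article does not spell out how the hierarchical temporal aggregation works. One plausible reading, sketched below, is that several networks are trained for different lead times and a forecast is assembled from as few model calls as possible, so that less error accumulates across iterations. The lead times of 1, 3, 6, and 24 hours and the function name `plan_forecast_steps` are assumptions for illustration, not details taken from this article.

```python
from typing import List, Sequence


def plan_forecast_steps(lead_hours: int,
                        model_steps: Sequence[int] = (24, 6, 3, 1)) -> List[int]:
    """Greedily split a lead time into a short sequence of model calls.

    model_steps holds the assumed per-model lead times in hours. The
    default values are illustrative, not taken from the article.
    """
    if lead_hours < 0:
        raise ValueError("lead_hours must be non-negative")
    plan: List[int] = []
    remaining = lead_hours
    # Always use the longest available step first.
    for step in sorted(model_steps, reverse=True):
        count, remaining = divmod(remaining, step)
        plan.extend([step] * count)
    if remaining:
        raise ValueError("lead time cannot be reached with the given steps")
    return plan


if __name__ == "__main__":
    print(plan_forecast_steps(56))   # [24, 24, 6, 1, 1]
    print(len(plan_forecast_steps(168)))  # 7 calls for a one-week forecast
```

With a single 1-hour model, a one-week forecast would need 168 chained calls; under the assumed step sizes the same forecast needs only seven.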

Several downstream prediction options are available from Pangu-Weather, such as tropical cyclone tracking and real-time large-member ensemble forecast. Pangu-Weather answers the question of whether AI-based techniques can perform better than NWP techniques and makes fresh recommendations for enhancing deep learning weather forecasting systems.

The team believes that their training has not yet reached full convergence. There is room to increase the number of observational components, integrate the time dimension into the training of 4D deep networks, and use deeper and/or wider networks. All of these call for GPUs with more memory and FLOPs, so the team expects future forecasts to improve as computational resources grow.

A Multi-Resolution Deep Learning Framework

Extreme weather events substantially threaten human life and the economy, with annual costs in the billions of dollars and a human toll in the tens of thousands. As a result of climate change, their consequences and intensity are predicted to increase. Unfortunately, general circulation models (GCMs), the principal instrument for climate projections, cannot adequately characterize weather extremes.

A group of scientists from Verisk Analytics, Otto-von-Guericke University, and the Massachusetts Institute of Technology has developed a multi-resolution deep learning framework to speed up the simulation of extreme weather events. To remove biases and improve the resolution of the GCM simulation, they combine a physics-based GCM run at coarse resolution with machine-learning models trained on observational data.

The main ingredients are:

  • A divide-and-conquer training strategy that permits the training of regional models at a high spatial resolution
  • Novel statistical loss functions that emphasize extreme values and space-time coherency
  • A compact, multi-scale representation of physical processes on the sphere that efficiently captures energy transfers across scales. 
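The article does not give the form of these loss functions. As a rough illustration of what "emphasizing extreme values" can mean, the sketch below upweights samples whose target exceeds a threshold in an otherwise ordinary mean squared error. The function name, the threshold, and the weight are all hypothetical, and this is not the authors' loss.

```python
from typing import Sequence


def extreme_weighted_mse(pred: Sequence[float], target: Sequence[float],
                         threshold: float, extreme_weight: float = 5.0) -> float:
    """Weighted MSE in which samples with target >= threshold count more.

    Illustrative only: threshold and extreme_weight are assumed values.
    """
    if len(pred) != len(target) or len(target) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    total = 0.0
    weight_sum = 0.0
    for p, t in zip(pred, target):
        w = extreme_weight if t >= threshold else 1.0
        total += w * (p - t) ** 2
        weight_sum += w
    return total / weight_sum


if __name__ == "__main__":
    pred = [1.0, 1.0, 10.0]
    target = [1.0, 2.0, 20.0]  # the last value plays the role of an extreme
    print(extreme_weighted_mse(pred, target, threshold=15.0))
```

In this toy case the underestimated extreme dominates the loss much more strongly than it would in a plain MSE, which is the behavior such losses are meant to encourage.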

A decision maker can utilize the full-scale debiased simulation to look at current scenarios and gauge their exposure to catastrophic weather disasters, all with an arbitrary level of detail.

The suggested architecture makes million-year extreme weather simulations feasible, improving disaster-event quantification. As the need for global simulations that account for interdependencies across many geographies and threats continues to rise, the researchers believe this will help satisfy that requirement.

Real-time Bias Correction of Wind Field Forecasts

Forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF; EC for short) can serve as a foundation for maritime-disaster warning systems, but they contain some systematic biases. ECMWF’s fifth-generation atmospheric reanalysis (ERA5) is highly accurate; however, it becomes available only after a delay of a few days. Approximating the nonlinear mapping between EC and ERA5 data with a spatiotemporal deep-learning approach could therefore make the real-time EC wind forecasts more accurate.

A recent study by the Ocean University of China, the National Marine Environment Forecasting Center, and the University of Portsmouth designed a multi-task learning loss function to correct wind speed and wind direction with a single model. They implemented it in the Multi-Task-Double Encoder Trajectory Gated Recurrent Unit (MT-DETrajGRU) model, which employs an enhanced “double-encoder forecaster” architecture to model the spatiotemporal sequence of wind components. The western North Pacific (WNP) served as the study region. Rolling bias corrections were applied in real time to the EC’s 10-day wind-field forecasts from December 2020 to November 2021, covering all four seasons. After correction with the MT-DETrajGRU model, the wind speed and wind direction biases in the four seasons were reduced by 8-11% and 9-14%, respectively, compared with the original EC forecasts.
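The article does not give the multi-task loss itself. A minimal sketch of the general idea follows, assuming the model predicts the eastward and northward wind components (u, v) and that the loss combines a speed error with a direction error. Direction is circular, so 350° and 10° must count as 20° apart rather than 340°. The weighting `direction_weight` and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def speed_and_direction(u: float, v: float) -> Tuple[float, float]:
    """Convert wind components to speed and meteorological direction.

    Direction is in degrees and gives where the wind blows FROM
    (0 = from the north, 270 = from the west).
    """
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two directions, in [0, 180]."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def multitask_wind_loss(pred_uv: Sequence[Tuple[float, float]],
                        true_uv: Sequence[Tuple[float, float]],
                        direction_weight: float = 0.01) -> float:
    """Mean of squared speed error plus weighted squared direction error.

    direction_weight is an assumed balancing factor, not from the study.
    """
    if len(pred_uv) != len(true_uv) or len(true_uv) == 0:
        raise ValueError("inputs must be non-empty and equally long")
    total = 0.0
    for (pu, pv), (tu, tv) in zip(pred_uv, true_uv):
        p_speed, p_dir = speed_and_direction(pu, pv)
        t_speed, t_dir = speed_and_direction(tu, tv)
        total += (p_speed - t_speed) ** 2
        total += direction_weight * angular_difference(p_dir, t_dir) ** 2
    return total / len(true_uv)


if __name__ == "__main__":
    print(angular_difference(350.0, 10.0))  # 20.0
    print(multitask_wind_loss([(3.0, 4.0)], [(3.0, 4.0)]))  # 0.0
```

Training one model on a combined loss like this lets the speed and direction corrections share the same learned features, which is the motivation for a multi-task setup.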

Furthermore, the proposed technique modeled the data consistently under varying weather conditions. The similar correction performance under normal and typhoon conditions shows that the data-driven model built here is robust and generalizable. In future work, the team plans to incorporate other variables that influence the wind field, such as temperature, air pressure, and humidity.

Predicting Wind Farm Power And Downstream Wakes Using Weather Patterns

A new study by researchers at ECMWF (Bonn), Imperial College London, the UK Met Office (Exeter), and Shell Research Ltd establishes a novel wind energy workflow. It shows for the first time how complex numerical weather prediction models can be combined with unsupervised clustering algorithms to make accurate long-term predictions of wind farm power and downstream wakes efficiently. The procedure begins by identifying weather patterns with unsupervised k-means clustering on ERA5 reanalysis data to account for regional and temporal variability. To calculate each cluster’s power output and downstream wind farm wake, a Weather Research and Forecasting (WRF) simulation is run using the average meteorological conditions at the cluster center.

The analysis identifies which variable and domain size work best for defining weather patterns relevant to offshore wind energy production. After running the WRF simulations, the team applied a novel post-processing approach to the cluster simulations to improve long-term predictions of wind power output and downstream wakes. The method allows multi-year and multi-decadal estimates of an offshore wind farm’s power and downstream wakes without simulating the full period. While prior research has examined downstream wind farm wakes at small scale, the authors describe this as the first tool to provide precise and rapid long-term projections of them, which can inform decisions about wind farm siting.
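The workflow described above can be sketched in a few lines: cluster the historical weather states, obtain one power value per cluster from a simulation at the cluster center, and weight those values by how often each cluster occurs. The sketch below uses a plain k-means on one feature (wind speed) and made-up power values standing in for the WRF results; the study's post-processing step is not reproduced, and all names and numbers are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int,
           iters: int = 50) -> Tuple[List[Point], List[int]]:
    """Plain Lloyd's k-means; the first k points seed the centers."""
    centers: List[Point] = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(k), key=lambda c: _dist2(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers: List[Point] = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster as is
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def long_term_mean_power(labels: Sequence[int],
                         cluster_power: Dict[int, float]) -> float:
    """Frequency-weighted mean of the per-cluster simulated power."""
    counts = Counter(labels)
    return sum(cluster_power[c] * n for c, n in counts.items()) / len(labels)


if __name__ == "__main__":
    wind_speed = [(5.0,), (6.0,), (15.0,), (16.0,), (5.5,)]  # toy data, m/s
    centers, labels = kmeans(wind_speed, k=2)
    simulated_power = {0: 100.0, 1: 400.0}  # placeholder values per cluster
    print(centers, labels)
    print(long_term_mean_power(labels, simulated_power))
```

The saving comes from the last step: only one expensive simulation per cluster is needed, however many time steps the historical record contains.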

By applying this approach to two case study regions, the team demonstrates that the proposed long-term predictions agree closely with those from a year of WRF simulations while taking less than 2% of the computing effort. The results are most accurate when clustering on wind velocity.

GraphCast: Providing Efficient Medium-Range Global Weather Forecasting

From picking out an outfit to what to do in the event of a hurricane, people constantly adjust their plans based on weather forecasts. People rely on “medium-range” weather forecasts, which are issued by meteorological services up to four times daily, for making decisions that require knowledge of the weather ten days in the future.

A recent study by DeepMind and Google introduces GraphCast, a new ML-based weather simulator that outperforms the world’s most accurate deterministic operational medium-range weather forecasting system as well as all ML baselines. GraphCast is an autoregressive model trained on meteorological data from the ERA5 reanalysis archive of the European Centre for Medium-Range Weather Forecasts (ECMWF). The model is built on graph neural networks and a novel high-resolution multi-scale mesh representation. It has a resolution of around 25×25 kilometers at the equator and can create 10-day forecasts at 6-hour intervals for five surface variables and six atmospheric variables, the latter each at 37 vertical pressure levels.
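"Autoregressive" here means that the model predicts one 6-hour step and its output is fed back in as the next input, so a 10-day forecast is a chain of 240 / 6 = 40 steps. The sketch below shows only that rollout loop with a toy step function. The name `rollout` is hypothetical, and the real model's inputs and outputs are much richer than a single state value.

```python
from typing import Callable, List, TypeVar

State = TypeVar("State")


def rollout(step_fn: Callable[[State], State], initial: State,
            horizon_hours: int = 240, step_hours: int = 6) -> List[State]:
    """Apply a one-step model repeatedly, feeding each output back as input."""
    if step_hours <= 0 or horizon_hours % step_hours:
        raise ValueError("horizon must be a whole number of steps")
    states: List[State] = []
    state = initial
    for _ in range(horizon_hours // step_hours):
        state = step_fn(state)
        states.append(state)
    return states


# Variables per grid point implied by the figures in the article:
# 5 surface variables plus 6 atmospheric variables at 37 pressure levels.
N_CHANNELS = 5 + 6 * 37


if __name__ == "__main__":
    # Toy "model": the state is just an hour counter advanced by 6 per step.
    forecast = rollout(lambda hours: hours + 6, 0)
    print(len(forecast), forecast[-1])  # 40 240
    print(N_CHANNELS)  # 227
```

Because each step consumes the previous prediction, errors can compound over the 40 steps, which is why per-variable accuracy is usually reported separately for each lead time.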

In 90.0% of the 2760 variable and lead time combinations, GraphCast outperformed HRES, ECMWF’s deterministic operational forecasting system. It also outperformed the most accurate previous ML-based weather forecasting model on 99.2% of the 252 targets that model reported. On Cloud TPU v4 hardware, GraphCast can produce a 10-day forecast (35 GB of data) in under 60 seconds.

Unlike more traditional forecasting techniques, ML-based forecasting may easily grow in size and sophistication as additional data becomes available for training. This study is a major advance for ML-based weather modeling. In principle, it can be applied to a much broader set of environmental and other geo-spatial-temporal forecasting challenges, such as modeling various meteorological factors and seasonal and climate predictions, wildfires, deforestation, etc. 

WeatherFusionNet For Predicting Precipitation from Satellite Data

Deep learning methods have recently improved the accuracy of weather prediction. Researchers from Czech Technical University in Prague presented two deep learning models to forecast rainfall at the 2021 AI for Good World Summit Challenge on predicting extreme weather occurrences.

The first model, sat2rad, is a U-Net-based deep learning model that estimates rainfall in the current satellite frame time step. This model predicts rainfall for the full satellite area using convolutional neural networks’ spatial invariance, even if radar data is only available for a smaller area. The sat2rad model was applied to all four satellite frames separately to generate four channels.

The second model, PhyDNet, is a recurrent convolutional network that separates physical dynamics from supplementary visual input. Two branches of PhyDNet handle physical dynamics and residual information for future prediction. Due to competition limits, PhyDNet was trained on satellite data instead of radar frames. To make the prediction, another U-Net merged the outputs of both models with the input sequence.

The study indicated that combining the sat2rad and PhyDNet models improved rainfall prediction. The spatial invariance of convolutional neural networks helped estimate rainfall for the full satellite area, even though radar data was only available for a smaller area.

WF-UNet: Weather Fusion UNet for Precipitation Nowcasting

Accurate short-term forecasts (nowcasts) of precipitation are necessary when designing early warning systems for severe weather and its consequences, such as urban flooding or landslides. There are several environmental uses for nowcasting, from agricultural management to improving aviation safety. 

Collaborative research between Maastricht University and Utrecht University explores the feasibility of using a UNet core model, and an extension of that model, to predict rainfall in western Europe up to three hours in advance. Their study proposes the Weather Fusion UNet (WF-UNet) model, which builds on the Core 3D-UNet model by including both wind speed and precipitation as inputs during training, and then analyzes how these variables affect performance on the target task of predicting precipitation.

Using the ERA5 dataset from Copernicus, the European Union’s Earth observation program, the team compiled radar images of precipitation and wind for six years (January 2016 to December 2021) across 14 European nations, with 1-hour temporal resolution and 31 square km spatial resolution. They evaluate the proposed WF-UNet model against the persistence model and other UNet-based architectures trained only on precipitation radar input data. According to the findings, WF-UNet achieves 22%, 8%, and 6% lower mean squared error (MSE) than the best-performing of the other architectures at time horizons of 1, 2, and 3 hours, respectively. Compared to the traditional UNet model, decision-level fusion is better at capturing the spatiotemporal information in archived radar images. WF-UNet outperforms the other tested UNet-based models in short-term nowcasting thanks to its superior feature extraction capabilities.
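For readers unfamiliar with the evaluation terms, the sketch below shows the persistence baseline (the forecast is simply the most recent observed frame), the MSE between two frames, and how a "percent lower MSE" figure is computed. The frames are tiny made-up grids and the function names are hypothetical. None of the numbers come from the study.

```python
from typing import Sequence

Frame = Sequence[Sequence[float]]


def mse(pred: Frame, target: Frame) -> float:
    """Mean squared error over all grid cells of two equally sized frames."""
    errors = [(p - t) ** 2
              for pred_row, target_row in zip(pred, target)
              for p, t in zip(pred_row, target_row)]
    return sum(errors) / len(errors)


def persistence_forecast(history: Sequence[Frame]) -> Frame:
    """Persistence baseline: repeat the last observed frame unchanged."""
    return history[-1]


def relative_mse_reduction(model_mse: float, baseline_mse: float) -> float:
    """Percentage by which model_mse is lower than baseline_mse."""
    return 100.0 * (baseline_mse - model_mse) / baseline_mse


if __name__ == "__main__":
    history = [[[0.0, 0.0], [0.0, 0.0]],
               [[1.0, 0.0], [0.0, 1.0]]]  # toy precipitation frames
    observed_next = [[1.0, 1.0], [0.0, 1.0]]
    baseline = mse(persistence_forecast(history), observed_next)
    print(baseline)  # 0.25
    print(relative_mse_reduction(0.2, baseline))
```

Persistence is a useful floor for nowcasting because weather changes little over very short horizons; a learned model has to beat it to justify its cost.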



  • https://arxiv.org/pdf/2210.12137.pdf
  • https://arxiv.org/abs/2212.14160
  • https://arxiv.org/pdf/2211.16824.pdf
  • https://arxiv.org/pdf/2211.02556.pdf
  • https://arxiv.org/pdf/2212.12794.pdf
  • https://arxiv.org/pdf/2301.10343.pdf
  • https://arxiv.org/pdf/2302.04102.pdf
  • https://arxiv.org/pdf/2302.05886.pdf
  • https://search.zeta-alpha.com/tags/68633