Deep learning holds considerable promise for time series forecasting because it can automatically learn temporal dependence and handle temporal structures such as trends and seasonality. Most real-world datasets include a temporal component, so forecasting the future can be highly valuable. In time series machine learning, multi-horizon forecasting, that is, predicting variables of interest at several future time steps, is a critical challenge.
Deep neural networks (DNNs) are increasingly being employed in multi-horizon forecasting, and they have been shown to outperform classic time series models. While many models focus on recurrent neural network (RNN) variants, recent works use attention-based layers to improve the selection of relevant past time steps beyond the inductive bias of RNNs, namely the sequential, ordered processing of information. However, these models frequently overlook the variety of inputs common in multi-horizon forecasting: they either assume that all exogenous inputs are known into the future or ignore crucial static variables.
Most current DNN forecasting models are also black boxes, governed by intricate nonlinear interactions among a large number of parameters. This makes it difficult to understand how they arrive at their predictions. Although some attention-based models with intrinsic interpretability have been proposed for sequential data such as language or speech, multi-horizon forecasting involves many other types of inputs. Attention-based models can reveal which time steps are relevant, but they cannot tell you how important each individual feature is at a given time step. Achieving both high performance and interpretability in multi-horizon forecasting therefore requires new strategies for dealing with data heterogeneity.
New research from Google proposes the Temporal Fusion Transformer (TFT), an attention-based DNN model for multi-horizon forecasting. TFT is built to explicitly align the model with the general multi-horizon forecasting task, resulting in greater accuracy and interpretability across a wide range of applications.
TFT is designed to efficiently create feature representations for each input type (i.e., static, known, or observed inputs). Its major components include:
- Gating mechanisms: These allow the model to skip over any unneeded components, providing flexible depth and network complexity to suit a wide range of datasets.
- Variable selection networks: These select the important input variables at each time step. While traditional DNNs may overfit to irrelevant features, attention-based variable selection can improve generalization by pushing the model to focus most of its learning capacity on the most important features.
- Static covariate encoders: These incorporate static features to condition the modeling of temporal dynamics.
- Temporal processing: This learns both long- and short-term temporal relationships from time-varying inputs, both observed and known. Local processing is handled by a sequence-to-sequence layer, which benefits from its inductive bias for ordered information processing. Long-term dependencies are handled by a novel interpretable multi-head attention block. This can shorten the effective path length of information, because any previous time step containing relevant data can be attended to directly.
- Prediction intervals: Quantile forecasts give the range of likely target values at each prediction horizon. This helps users understand the output distribution rather than just the point forecasts.
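Two of the components above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `gated_residual` and `quantile_loss` are hypothetical, the gate is a simplified sigmoid gate on a skip connection (the published architecture also uses learned layers and normalization, which are left out here), and the pinball loss shown is the standard loss for training quantile forecasts, assumed here for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_residual(x, candidate, gate_logits):
    """Simplified gated skip connection (illustrative, not the TFT code).

    Returns x + sigmoid(gate_logits) * candidate. When the gate saturates
    near 0, the block reduces to the identity, i.e. it is "skipped".
    """
    return x + sigmoid(gate_logits) * candidate


def quantile_loss(y_true, y_pred, q):
    """Pinball loss for quantile q in (0, 1), averaged over all elements."""
    diff = y_true - y_pred
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


# Gating: the same candidate update is either suppressed or applied.
x = np.array([1.0, 2.0, 3.0])
candidate = np.array([10.0, -10.0, 5.0])
closed = gated_residual(x, candidate, np.full(3, -20.0))  # gate ~ 0
opened = gated_residual(x, candidate, np.full(3, 20.0))   # gate ~ 1

# Quantile loss: a forecast that is too low is penalized more heavily
# when it is meant to be the 90th percentile than when it is the median.
y_true = np.array([10.0, 12.0, 9.0])
y_low = np.array([8.0, 8.0, 8.0])
loss_p50 = quantile_loss(y_true, y_low, 0.5)
loss_p90 = quantile_loss(y_true, y_low, 0.9)
```

With the gate closed, the output is essentially the input `x`; with it open, the candidate update is added in full. Training one output per quantile with this kind of loss is what yields a prediction interval rather than a single point forecast.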
TFT was compared with a variety of multi-horizon forecasting models. These include deep learning models with iterative and direct approaches (e.g., DeepAR, DeepSSM, ConvTrans) and classic models such as ARIMA, ETS, and TRMF. The results over a variety of datasets show that TFT outperforms all of these benchmarks.
Using three use cases (variable importance, persistent temporal patterns, and identification of significant events), the team shows how TFT's design enables analysis of its individual components for improved interpretability.
So far, TFT has been used to help retail and logistics companies with demand forecasting, improving both forecasting accuracy and interpretability.
According to the team, TFT could also be used to address climate-related issues, such as reducing greenhouse gas emissions through the real-time balancing of energy supply and demand, and improving the accuracy and interpretability of rainfall forecasts.