In many commercial and scientific applications, forecasting future values from historical ones is a crucial challenge, because forecasts inform critical decisions such as how much inventory of a product to keep on hand or how to allocate resources in a data center. Forecasting operates on time-series data: a sequence of numerical values, typically gathered at regular time intervals. Examples include the daily total sales of a product on an e-commerce platform or the minute-by-minute CPU utilization of a server in a data center. Time series come in two types: stationary and non-stationary. A stationary time series preserves its statistical properties (such as the mean and variance) over time, with values remaining within a given range; a non-stationary time series is one whose statistical distribution changes over time.
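For illustration, here is a minimal sketch (not from the paper) contrasting the two types: white noise as a stationary series, and a random walk, whose variance grows with time, as a non-stationary one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stationary series: white noise -- mean and variance stay constant over time.
stationary = rng.normal(loc=0.0, scale=1.0, size=2000)

# Non-stationary series: a random walk -- the cumulative sum of noise,
# whose variance grows with time, so there is no fixed distribution.
non_stationary = np.cumsum(rng.normal(size=2000))

def half_variances(x):
    """Compare the variance of the first and second halves of a series."""
    mid = len(x) // 2
    return np.var(x[:mid]), np.var(x[mid:])

print(half_variances(stationary))      # two similar values
print(half_variances(non_stationary))  # typically very different values
```

For the stationary series the two halves look statistically alike; for the random walk they do not, which is exactly the property that trips up models trained on one part of the series and applied to another.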
Humans often acquire new concepts more quickly and effectively than machine learning models, which frequently need enormous quantities of training data to perform well. Meta-learning techniques seek to replicate this kind of rapid learning by using an inner and an outer learning loop. The inner loop quickly adapts to new information from a small set of samples known as the support set. The outer loop ensures that the inner loop can make this quick adaptation to new support sets; it does so by training on a query set, a set of examples related to, but distinct from, the support set.
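The inner/outer loop structure can be sketched in a few lines. The following toy example (illustrative only, not DeepTime's architecture) meta-learns an initialization for a one-parameter linear model across tasks with different slopes, using a first-order approximation of the meta-gradient; the function names are ours.

```python
import numpy as np

def inner_loop(w, support_x, support_y, lr=0.1, steps=5):
    """Quickly adapt parameters w to a small support set via gradient steps."""
    w = w.copy()
    for _ in range(steps):
        grad = 2 * support_x.T @ (support_x @ w - support_y) / len(support_y)
        w -= lr * grad
    return w

def outer_loop(w, tasks, meta_lr=0.05):
    """Update the initialization w so that inner-loop adaptation works well,
    as measured by the loss on each task's query set (first-order approx.)."""
    for support_x, support_y, query_x, query_y in tasks:
        adapted = inner_loop(w, support_x, support_y)
        grad = 2 * query_x.T @ (query_x @ adapted - query_y) / len(query_y)
        w = w - meta_lr * grad
    return w

# Toy tasks: each task is y = a * x with a different slope a.
rng = np.random.default_rng(1)
tasks = []
for a in (0.5, 1.5, 2.5):
    x = rng.normal(size=(20, 1))
    tasks.append((x[:10], a * x[:10, 0], x[10:], a * x[10:, 0]))

w = np.zeros(1)
for _ in range(100):
    w = outer_loop(w, tasks)
```

After meta-training, a handful of inner-loop steps on a new task's support set suffice to fit that task far better than the raw initialization would.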
As IT infrastructure has grown, so has the capacity to gather larger volumes of such data, resulting in extraordinarily long time-series datasets. As data is collected over long periods, the mechanism that produces it is also likely to change. For instance, daily sales may increase dramatically compared with prior years if a product becomes popular. As a result, the patterns in the gathered data shift over time, creating a non-stationary time series.
Although access to more data is typically advantageous in machine learning (ML), very long non-stationary sequences pose a challenge, because most techniques assume identically distributed data. Covariate shift and conditional distribution shift are two such issues that cause models to degrade when the underlying system changes. A covariate shift occurs when the statistics of the time-series values change, whereas a conditional distribution shift occurs when the mechanism that generates the outputs from those values changes.
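A small synthetic example (ours, for illustration) makes the distinction concrete: a model fitted on one regime survives a covariate shift, where only the input distribution moves, but fails under a conditional distribution shift, where the input-to-output mapping itself changes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Regime 1: inputs x ~ N(0, 1), relationship y = 2x + noise.
x1 = rng.normal(0, 1, size=1000)
y1 = 2 * x1 + rng.normal(scale=0.1, size=1000)

# Covariate shift: the input distribution moves (x ~ N(5, 1)),
# but the relationship y = 2x stays the same.
x2 = rng.normal(5, 1, size=1000)
y2 = 2 * x2 + rng.normal(scale=0.1, size=1000)

# Conditional distribution shift: the inputs look like regime 1,
# but the data-generating mechanism changes to y = -3x.
x3 = rng.normal(0, 1, size=1000)
y3 = -3 * x3 + rng.normal(scale=0.1, size=1000)

# Fit a line on regime 1 and evaluate it on both shifted regimes.
coef = np.polyfit(x1, y1, 1)
mse_covariate = np.mean((np.polyval(coef, x2) - y2) ** 2)    # small
mse_conditional = np.mean((np.polyval(coef, x3) - y3) ** 2)  # large
```

The fitted slope transfers unchanged under covariate shift, so the error stays near the noise floor, while the changed mechanism in the third regime produces a large error no matter how much regime-1 data was available.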
Thus, the performance of existing time-series forecasting methods can degrade under non-stationarity. To overcome these drawbacks, researchers from Salesforce have developed a new approach to non-stationary time-series forecasting dubbed DeepTime. The method addresses issues inherent in long time-series sequences by extending classical time-index models into the deep learning paradigm, and it uses a novel meta-learning formulation of the forecasting task to get around the problem of overly expressive neural networks. With this work, Salesforce is the first to show how deep time-index models can be used for time-series forecasting.
Whereas classical time-index methods manually specify the relationship between the time-index features and the output values, deep time-index models substitute a deep neural network for this pre-specified function, so the relationship can be learned from the data. Doing so naively, however, results in subpar forecasts: deep neural networks are highly expressive and frequently overfit the data. To solve this issue, the researchers used a meta-learning formulation.
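To make the idea concrete, here is a minimal sketch (ours, far simpler than DeepTime's actual architecture) of a time-index model: a small network that maps a normalized time index t directly to a value g(t), trained by plain gradient descent on a toy series.

```python
import numpy as np

rng = np.random.default_rng(0)

# A time-index model maps the (normalized) time index t directly to a value
# g(t), rather than mapping past observations to future ones.
t = np.linspace(0, 1, 200).reshape(-1, 1)   # normalized time indices
y = np.sin(2 * np.pi * t) + 0.5 * t         # illustrative series to fit

# One-hidden-layer network g(t) = tanh(t W1 + b1) W2 + b2, replacing the
# pre-specified function of classical time-index methods.
hidden = 64
W1 = rng.normal(scale=3.0, size=(1, hidden))
b1 = rng.uniform(-1.0, 1.0, size=hidden)
W2 = rng.normal(scale=0.1, size=(hidden, 1))
b2 = np.zeros(1)

def forward(t):
    h = np.tanh(t @ W1 + b1)
    return h, h @ W2 + b2

_, pred0 = forward(t)
mse_initial = float(np.mean((pred0 - y) ** 2))

lr = 0.02
for _ in range(20000):
    h, pred = forward(t)
    err = pred - y
    # Backpropagation for mean-squared-error loss.
    gW2 = h.T @ err / len(t)
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1 = t.T @ dh / len(t)
    gb1 = dh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, pred = forward(t)
mse_final = float(np.mean((pred - y) ** 2))
```

The network learns the time-to-value mapping from the data rather than having it specified by hand; the overfitting risk this creates is exactly what DeepTime's meta-learning formulation is designed to control.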
Salesforce's DeepTime addresses non-stationarity with the help of a locally stationary distribution assumption: even though the long sequence as a whole is non-stationary, it is reasonable to assume that nearby time steps still follow the same patterns and distribution, which change only gradually over time. A long time series can therefore be divided into tasks, each of which is assumed to be stationary. For each task, the time series is split into a lookback window (historical data) and a forecast horizon (the values to be predicted). In the meta-learning formulation, the lookback window is the support set and the forecast horizon is the query set.
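The split described above can be sketched as follows; the helper name and the choice of stride are ours, not the paper's.

```python
import numpy as np

def make_tasks(series, lookback, horizon, stride=None):
    """Split a long series into tasks of (lookback window, forecast horizon).
    Under local stationarity, each task is treated as (roughly) stationary."""
    stride = stride or horizon
    tasks = []
    for start in range(0, len(series) - lookback - horizon + 1, stride):
        support = series[start : start + lookback]                      # support set
        query = series[start + lookback : start + lookback + horizon]   # query set
        tasks.append((support, query))
    return tasks

series = np.arange(100)  # placeholder for a long, non-stationary series
tasks = make_tasks(series, lookback=24, horizon=8)
```

Each (support, query) pair then plays the role of one meta-learning task: adapt on the lookback window, evaluate on the horizon.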
Deep neural networks have many parameters to learn, so meta-learning the entire model can be laborious and time-consuming. To address this, the researchers modified the model architecture to shorten training. The key idea is to apply the meta-learning inner-loop adaptation step only to the final layer, a ridge regressor, which can be computed quickly in closed form during training. This formulation lets DeepTime get around the covariate shift and conditional distribution shift problems.
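The inner loop then reduces to solving a ridge regression per task. The sketch below (ours) uses a fixed random feature layer as a stand-in for the shared, meta-learned deep time-index features; the closed-form solve replaces any iterative inner optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32
W = rng.normal(scale=3.0, size=(1, d))  # fixed random layer: stand-in for the
b = rng.uniform(-1, 1, size=d)          # meta-learned deep time-index features

def features(t):
    """Shared feature extractor over normalized time indices, shape (n,) -> (n, d)."""
    return np.tanh(t[:, None] @ W + b)

def ridge_head(phi, y, lam=1e-2):
    """Inner-loop adaptation in closed form: only the final ridge-regression
    layer is fit per task, so a single linear solve replaces iterative steps."""
    return np.linalg.solve(phi.T @ phi + lam * np.eye(phi.shape[1]), phi.T @ y)

# One task: fit the head on the lookback window, predict the horizon.
t = np.linspace(0, 1, 40)
y = 0.5 * t + np.sin(2 * np.pi * t)   # toy task values
t_look, y_look = t[:32], y[:32]       # lookback window (support set)
t_hor = t[32:]                        # forecast horizon (query set)

w = ridge_head(features(t_look), y_look)
forecast = features(t_hor) @ w
```

Because the solve touches only a d-by-d system, per-task adaptation is cheap even when the feature extractor itself is a large network.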
The team tested the approach on both synthetic and real data. On synthetic data, DeepTime can extrapolate unseen functions containing novel patterns it was not exposed to during training. On six real-world time-series datasets spanning a variety of application domains and forecast horizons, the framework achieves state-of-the-art performance in 20 of 24 settings (by mean squared error). The architecture is also exceptionally efficient, outperforming all baselines in memory and running-time costs.
DeepTime's use of ridge regression lets the framework compute an exact one-step solution instead of an approximate iterative one, keeping predicted values closer to the actual ones. As a result, DeepTime offers a time-series forecasting technique that is faster, more precise, and ultimately more useful than competing methods. It can provide more accurate projections of business and economic effects, supporting downstream decisions such as resource allocation (when used for sales forecasting) or data-center planning. Additionally, thanks to the framework's efficiency, deploying the forecasting model in businesses could have a lower carbon footprint.
This article is written as a research summary by Marktechpost Staff based on the research paper 'DeepTime: Deep Time-Index Meta-Learning for Non-Stationary Time-Series Forecasting'. All credit for this research goes to the researchers on this project.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about machine learning, natural language processing, and web development, and enjoys learning more about the technical field by participating in challenges.