Researchers From Imperial College London Introduce TsT-GAN: A Novel Framework For Training Time-Series Generative Models

Data is often described as the fuel of data analytics. Many real-time applications rely on time-series data for analysis and forecasting, but they often lack enough of it, so data augmentation techniques are needed. Researchers from Imperial College London introduce TsT-GAN, a framework based on generative adversarial networks (GANs) that is used to augment time-series data. It has two objectives: capturing the stepwise conditional distribution of real sequences, and modeling the joint distribution of entire real sequences.

The paper’s main contribution is a model whose generator can produce entire joint distributions while respecting the conditional distributions. The training framework can be applied to any time-series dataset. It is evaluated quantitatively with the standard train-on-synthetic, test-on-real approach and qualitatively with t-SNE.

The TsT-GAN block diagram is shown in Figure 1. It is divided into four components, as described below.

  1. Embedder

This network consists of a transformer that accepts a multivariate real sequence as input and predicts the next entry in the sequence at each position. Each input vector is first linearly projected to the model dimension. The embedder network then converts the projected inputs into a set of embeddings.

  2. Predictor

The embeddings generated by the embedder are passed to the predictor network, which maps them back to the input dimensions.

  3. Generator

The generator is responsible for producing synthetic sequences that reflect the information in the original input vectors. Similar to the embedding network, it consists of a transformer encoder that uses bidirectional attention. The input vector is projected to the model dimension. A noise vector is passed to the generator, which produces latent embeddings and finally converts them into synthetic sequences.

  4. Discriminator

The discriminator is constructed similarly to BERT (Bidirectional Encoder Representations from Transformers): it is a transformer encoder with bidirectional attention. It receives both real sequences from the dataset and sequences produced by the generator, and it classifies the former as true and the latter as false.
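Two ideas from the component descriptions above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the helper names `next_step_pairs`, `bidirectional_mask`, and `causal_mask` and the toy values are assumptions made for illustration. The causal mask is included only as a contrast to bidirectional attention, and the article does not state which mask the embedder uses.

```python
from typing import List, Tuple

Step = List[float]  # one time step of a multivariate series


def next_step_pairs(sequence: List[Step]) -> Tuple[List[Step], List[Step]]:
    """Pair each position t with the entry at t + 1 (next-step prediction)."""
    if len(sequence) < 2:
        raise ValueError("need at least two time steps")
    return sequence[:-1], sequence[1:]


def bidirectional_mask(n: int) -> List[List[bool]]:
    """Every position may attend to every position, earlier or later."""
    return [[True] * n for _ in range(n)]


def causal_mask(n: int) -> List[List[bool]]:
    """Shown for contrast only: position i may attend to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


# Toy series with two features per time step (hypothetical values).
series = [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]
inputs, targets = next_step_pairs(series)
```

Here `targets[t]` is the value a next-step predictor would be asked to produce after seeing `inputs[t]`, and the two masks show what bidirectional attention permits compared with a strictly left-to-right view.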

The model is trained in three stages. The first stage trains the embedder-predictor components on their own, the second trains the generator using masked modeling, and the last trains all components jointly. All the components are feed-forward neural networks. Optimization uses adaptive moment estimation (Adam) along with normalization, with a learning rate of 0.001 in the first two stages and 0.00002 for joint training. The masked-modeling stage uses a mask of 0.3 for all datasets, and the batch size is 128. The Gaussian Error Linear Unit (GELU) is used as the activation function because the data is non-linear.
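The hyperparameters quoted above can be collected in a small configuration sketch. This is an illustration, not the authors' training script: the stage names, the `StageConfig` class, and the `random_mask` helper are hypothetical, and reading the mask of 0.3 as the fraction of time steps hidden from the generator is an assumption. The `gelu` function is the standard exact GELU definition.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StageConfig:
    name: str              # illustrative label, not from the paper
    learning_rate: float   # Adam learning rate quoted in the text
    batch_size: int = 128  # batch size quoted in the text


STAGES = [
    StageConfig("embedder_predictor", 1e-3),
    StageConfig("masked_generator", 1e-3),
    StageConfig("joint", 2e-5),
]

MASK_RATE = 0.3  # assumption: fraction of time steps hidden in stage two


def random_mask(seq_len: int, rate: float, rng: random.Random) -> List[bool]:
    """Return True at the positions that are hidden from the model."""
    n_masked = round(seq_len * rate)
    hidden = set(rng.sample(range(seq_len), n_masked))
    return [i in hidden for i in range(seq_len)]


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


mask = random_mask(seq_len=20, rate=MASK_RATE, rng=random.Random(0))
```

Under that reading of the mask value, a sequence of 20 time steps has 6 positions hidden in each masked-modeling example.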

The experiments use five standard datasets: Sines, Stocks, UCI Appliances Energy Prediction, UCI Hungarian Chickenpox Cases, and UCI Air Quality. TsT-GAN is compared with state-of-the-art techniques such as TimeGAN, RCGAN, C-RNN-GAN, COT-GAN, and Professor Forcing (P-Forcing). The results show that TsT-GAN outperforms the other models on all datasets in predictive score. In discriminative score, the autoregressive generator outperforms TsT-GAN on the Stocks and Chickenpox datasets, but the difference is negligible.
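The article does not define the two scores. The sketch below follows the definitions commonly used in the TimeGAN evaluation protocol, which is an assumption about what this comparison measures: the discriminative score is the distance from chance accuracy of a classifier trained to tell real from synthetic sequences, and the predictive score is the mean absolute error of a forecaster trained on synthetic data and tested on real data. The function names are hypothetical, and the classifier and forecaster themselves are left out.

```python
from typing import Sequence


def discriminative_score(labels: Sequence[int], predicted: Sequence[int]) -> float:
    """|accuracy - 0.5| for a real-vs-synthetic classifier; lower is better."""
    if len(labels) != len(predicted) or len(labels) == 0:
        raise ValueError("labels and predictions must be non-empty and aligned")
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return abs(correct / len(labels) - 0.5)


def predictive_score(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error of forecasts on real data; lower is better."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and aligned")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
```

A classifier that is right only half the time cannot separate real from synthetic data, so it yields a discriminative score of 0, the best possible value.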

The paper proposes TsT-GAN, a novel framework for training time-series generative models. The major drawback of the proposed model is its time complexity, which is O(n^2) for a sequence of length n. Future work could therefore aim to reduce the time complexity and improve the discriminative scores.
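In transformer models, quadratic cost of this kind typically comes from self-attention, which scores every position against every other position. The sketch below counts those scores for a few sequence lengths. It is a simplified illustration that assumes attention is the source of the cost here; it omits learned projections and softmax, and the function names and sequence lengths are hypothetical.

```python
from typing import List


def attention_scores(x: List[List[float]]) -> List[List[float]]:
    """Raw dot-product score between every pair of positions in x."""
    return [[sum(a * b for a, b in zip(q, k)) for k in x] for q in x]


def score_count(n: int, dim: int = 4) -> int:
    """Count the entries of the score matrix for a length-n sequence."""
    sequence = [[1.0] * dim for _ in range(n)]
    scores = attention_scores(sequence)
    return sum(len(row) for row in scores)


counts = {n: score_count(n) for n in (8, 16, 32)}
```

Doubling the sequence length quadruples the number of scores, which is the O(n^2) growth described above.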

This Article Is Based On The Research Paper 'Time-series Transformer Generative Adversarial Networks'. All Credit For This Research Goes To The Researchers of This Project.
