Continual Adapter Tuning (CAT): A Parameter-Efficient Machine Learning Framework that Avoids Catastrophic Forgetting and Enables Knowledge Transfer from Learned ASC Tasks to New ASC Tasks

Aspect Sentiment Classification (ASC) is a critical task aimed at discerning sentiment polarity within specific domains, such as product reviews, where the sentiment toward particular aspects needs to be identified. Continual Learning (CL) poses a significant challenge for ASC models due to Catastrophic Forgetting (CF), wherein learning new tasks leads to a detrimental loss of previously acquired knowledge. As ASC models must adapt to evolving data distributions across diverse domains, preventing CF becomes paramount. 

As the number of tasks grows, traditional techniques often require keeping a distinct model checkpoint for every task, which quickly becomes infeasible. Recent methods attempt to reduce CF by freezing the core model and training task-specific components separately. However, they often neglect effective knowledge transfer between tasks, which limits their ability to scale to a growing number of domains.


Recently, a research team from China published a new article introducing innovative methods to address the limitations of existing approaches in ASC. Their proposed approach, Continual Adapter Tuning (CAT), employs task-specific adapters while freezing the backbone pre-trained model to prevent catastrophic forgetting and enable efficient learning of new tasks. Additionally, continual adapter initialization aids knowledge transfer, while label-aware contrastive learning enhances sentiment polarity classification. A majority sentiment polarity voting strategy simplifies testing by eliminating the need for task IDs, resulting in a parameter-efficient framework that improves ASC performance.
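The core idea, training a small bottleneck adapter per task while the pre-trained backbone stays frozen, can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the dimensions, names, and zero-initialized up-projection are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 768      # BERT-base hidden size
BOTTLENECK = 64   # adapter bottleneck size (illustrative choice)

def adapter_forward(h, W_down, b_down, W_up, b_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add."""
    z = np.maximum(0.0, h @ W_down + b_down)   # (batch, BOTTLENECK)
    return h + z @ W_up + b_up                 # residual keeps the frozen backbone's signal

def new_adapter():
    """One adapter per ASC task; the frozen backbone is shared across tasks."""
    return (rng.normal(0.0, 0.02, (HIDDEN, BOTTLENECK)),
            np.zeros(BOTTLENECK),
            np.zeros((BOTTLENECK, HIDDEN)),   # zero-init up-proj => adapter starts as identity
            np.zeros(HIDDEN))

h = rng.normal(size=(2, HIDDEN))              # stand-in for frozen-backbone hidden states
params = new_adapter()
out = adapter_forward(h, *params)

# Only the adapter's parameters are trained for a new task:
adapter_params = HIDDEN * BOTTLENECK * 2 + BOTTLENECK + HIDDEN
print(out.shape, adapter_params)              # tiny next to a full BERT layer's millions
```

Because the up-projection starts at zero, a freshly added adapter initially passes the backbone's representation through unchanged, so adding a new task cannot disturb what the frozen backbone already encodes.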

The proposed CAT method addresses the challenge of sequential learning in ASC tasks by leveraging the Adapter-BERT architecture, a variant of BERT (Bidirectional Encoder Representations from Transformers) that extends the base model with adapters: small neural network modules inserted into each layer of the BERT architecture. These adapters allow BERT to be fine-tuned for specific downstream tasks while keeping most of its pre-trained parameters unchanged, enabling parameter-efficient fine-tuning across natural language processing tasks such as sentiment analysis, text classification, and language understanding.

In CAT, a separate adapter is learned for each ASC task while the backbone pre-trained model remains frozen, preventing catastrophic forgetting. The model takes input sentences and aspect terms and generates hidden states and label-aware features specific to each task. To improve classification, a label-aware classifier integrates contrastive learning to align input features and classifier parameters in the same space, leveraging label semantics. Training minimizes a combined loss comprising a variant cross-entropy loss and a label-aware contrastive loss.

Continual adapter initialization strategies, including LastInit, RandomInit, and SelectInit, transfer knowledge from previous tasks to new ones. Finally, a majority sentiment polarity voting strategy is used at test time: it eliminates the need for task IDs by producing the final sentiment polarity prediction through voting across the reasoning paths of the task-specific adapters. Through these steps, CAT delivers efficient and accurate sentiment polarity classification while supporting continual learning and knowledge transfer.
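The task-ID-free testing step described above, routing the input through every task's adapter path and taking a majority vote over the predicted polarities, might look like the sketch below. This is a hedged pure-Python illustration: the per-task adapter-plus-classifier paths are mocked as simple callables, which is our simplification, not the paper's code.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the polarity predicted by the most adapter paths
    (ties broken by first occurrence)."""
    return Counter(predictions).most_common(1)[0][0]

def predict_without_task_id(sentence_aspect, adapter_classifiers):
    """At test time the task ID is unknown, so the input is pushed through
    every task-specific adapter path and each path emits a polarity vote."""
    votes = [clf(sentence_aspect) for clf in adapter_classifiers]
    return majority_vote(votes)

# Toy stand-ins for three previously learned tasks' adapter+classifier paths.
adapter_classifiers = [
    lambda x: "positive",
    lambda x: "positive",
    lambda x: "negative",
]

pred = predict_without_task_id(("the battery life is great", "battery"),
                               adapter_classifiers)
print(pred)  # two of three paths vote "positive"
```

The appeal of this design is operational: a deployed model never has to be told which domain a review came from, since the adapters trained on related domains tend to agree on the correct polarity.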

The authors evaluated the CAT framework in experiments against various baselines on 19 ASC datasets, reporting accuracy and Macro-F1. Baselines included both non-continual and continual learning approaches, adapted for the domain-incremental learning setting, and the implementation used BERT-base with adapters. Results showed CAT outperformed the baselines in both accuracy and Macro-F1, and ablation studies and parameter-efficiency comparisons further validated its effectiveness.

In conclusion, the research team presents a straightforward yet highly effective parameter-efficient framework for continual aspect sentiment classification in a domain-incremental learning setting, achieving state-of-the-art accuracy and Macro-F1 scores. However, the framework's applicability beyond domain-incremental learning remains to be explored and will be addressed in future research.

Check out the Paper. All credit for this research goes to the researchers of this project.


Mahmoud is a PhD researcher in machine learning. He also holds a bachelor's degree in physical science and a master's degree in telecommunications and networking systems. His current areas of research concern computer vision, stock market prediction and deep learning. He has produced several scientific articles about person re-identification and the study of the robustness and stability of deep networks.
