AdaHAT: Adaptive hard attention to the task in task-incremental learning

Wang, Pengxiang; Bo, Hongbo; Hong, Jun; Liu, Weiru; Mu, Kedian

Authors

Pengxiang Wang

Hongbo Bo

Jun Hong Jun.Hong@uwe.ac.uk
Professor in Artificial Intelligence

Weiru Liu

Kedian Mu



Abstract

Catastrophic forgetting is a major issue in task-incremental learning, where a neural network loses what it has learned in previous tasks after being trained on new tasks. A number of architecture-based approaches have been proposed to address this issue. However, these approaches suffer from a network capacity problem when the network learns long sequences of tasks. As the network is trained on an increasing number of new tasks, more parameters become static to prevent the network from forgetting what it has learned in previous tasks. In this paper, we propose an adaptive task-based hard attention mechanism which allows adaptive updates to static parameters by taking into account information about previous tasks, namely the importance of these parameters to previous tasks and the current network capacity. We develop a new neural network architecture incorporating our proposed Adaptive Hard Attention to the Task (AdaHAT) mechanism. AdaHAT extends an existing architecture-based approach, Hard Attention to the Task (HAT), to learn long sequences of tasks in an incremental manner. We conduct experiments on a number of datasets and compare AdaHAT with a number of baselines, including HAT. Our experimental results show that AdaHAT achieves better average performance over tasks than these baselines, especially on long sequences of tasks, demonstrating the benefits of balancing the trade-off between stability and plasticity of a network when learning such sequences of tasks. Our code is available at github.com/pengxiang-wang/continual-learning-arena.
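
Illustrative sketch (not the authors' code): the abstract contrasts static parameters, whose updates are blocked, with adaptive updates that depend on parameter importance and network capacity. The NumPy snippet below may help picture the difference. The first function follows the gradient-masking rule commonly given for HAT, where the gradient of a weight is scaled by one minus the smaller of the cumulative attention values of the two units it connects. The second function, including its name, the per-unit task counts, the `eps` parameter, and the formula itself, is a hypothetical stand-in chosen only to show the idea; it is an assumption and should not be read as the rule defined in the paper.

```python
import numpy as np

def hat_grad_factor(cum_out, cum_in):
    """HAT-style factor for weight (i, j): 1 - min(a_out[i], a_in[j]).

    cum_out, cum_in: cumulative attention over previous tasks for the
    output and input units of a layer, with values in [0, 1].
    A factor of 0 means the weight is static (its gradient is blocked).
    """
    return 1.0 - np.minimum(cum_out[:, None], cum_in[None, :])

def adaptive_grad_factor(count_out, count_in, eps=0.1):
    """Hypothetical adaptive factor (an assumption, not the paper's formula).

    count_out, count_in: number of previous tasks that attended to each
    output and input unit. Weights claimed by no task keep factor 1.
    Claimed weights get a small non-zero factor that shrinks as their
    importance grows and rises as more of the layer is already claimed.
    """
    # Importance of weight (i, j): tasks that rely on both of its units.
    importance = np.minimum(count_out[:, None], count_in[None, :])
    claimed = importance > 0
    # Capacity used: fraction of weights claimed by at least one task.
    used = float(np.mean(claimed))
    factor = np.ones_like(importance, dtype=float)
    factor[claimed] = eps * used / (eps * used + importance[claimed])
    return factor

# Toy layer with 2 output units and 3 input units.
hard = hat_grad_factor(np.array([0.0, 1.0]), np.array([1.0, 1.0, 0.0]))
soft = adaptive_grad_factor(np.array([0.0, 2.0]), np.array([1.0, 3.0, 0.0]))
```

In this toy layer, `hard` blocks the two claimed weights completely, whereas `soft` leaves them a small update that is smaller for the weight shared by more previous tasks.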

Presentation Conference Type Conference Paper (published)
Conference Name ECML PKDD 2024
Start Date Sep 9, 2024
End Date Sep 13, 2024
Acceptance Date May 27, 2024
Online Publication Date Aug 22, 2024
Publication Date Aug 31, 2024
Deposit Date Sep 5, 2024
Publicly Available Date Aug 23, 2025
Publisher Springer
Peer Reviewed Peer Reviewed
Pages 143-160
Series Title Lecture Notes in Computer Science
Series Number 14943
Series ISSN 0302-9743
Book Title European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part III
ISBN 9783031703515
DOI https://doi.org/10.1007/978-3-031-70352-2_9
Public URL https://uwe-repository.worktribe.com/output/12842056