AdaHAT: Adaptive hard attention to the task in task-incremental learning

Wang, Pengxiang; Bo, Hongbo; Hong, Jun; Liu, Weiru; Mu, Kedian

doi:10.1007/978-3-031-70352-2_9

Authors

Pengxiang Wang

Hongbo Bo

Jun Hong (Jun.Hong@uwe.ac.uk), Professor in Artificial Intelligence

Weiru Liu

Kedian Mu

Abstract

Catastrophic forgetting is a major issue in task-incremental learning, where a neural network loses what it has learned in previous tasks after being trained on new tasks. A number of architecture-based approaches have been proposed to address this issue. However, the architecture-based approaches suffer from another issue on network capacity when the network learns long sequences of tasks. As the network is trained on an increasing number of new tasks in a long sequence of tasks, more parameters become static to prevent the network from forgetting what it has learned in previous tasks. In this paper, we propose an adaptive task-based hard attention mechanism which allows adaptive updates to static parameters by taking into account the information about previous tasks on both the importance of these parameters to previous tasks and the current network capacity. We develop a new neural network architecture incorporating our proposed Adaptive Hard Attention to the Task (AdaHAT) mechanism. AdaHAT extends an existing architecture-based approach, Hard Attention to the Task (HAT), to learn long sequences of tasks in an incremental manner. We conduct experiments on a number of datasets and compare AdaHAT with a number of baselines, including HAT. Our experimental results show that AdaHAT achieves better average performance over tasks than these baselines, especially on long sequences of tasks, demonstrating the benefits from balancing the trade-off between stability and plasticity of a network when learning such sequences of tasks. Our code is available at github.com/pengxiang-wang/continual-learning-arena.
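The contrast the abstract draws between static parameters and adaptively updated ones can be illustrated with a minimal NumPy sketch. It is not the paper's update rule, which this page does not give. The first function follows the usual HAT-style idea of zeroing the gradient of any weight that connects two units already claimed by earlier tasks. The second function is a hypothetical relaxation written only for illustration: the names `importance`, `capacity_used`, and `scale`, and the formula combining them, are all assumptions.

```python
import numpy as np


def hard_gradient_mask(cum_out, cum_in):
    """HAT-style hard mask for one layer's weight matrix.

    cum_out, cum_in: cumulative binary attention masks (1 = unit used by
    some previous task) for the layer's output and input units.
    Returns a multiplier per weight: 0 freezes the weight, 1 leaves it free.
    """
    # A weight counts as "used" only if both of its endpoints are used.
    used = np.minimum(cum_out[:, None], cum_in[None, :])
    return 1.0 - used


def adaptive_gradient_mask(cum_out, cum_in, importance, capacity_used, scale=0.1):
    """Hypothetical adaptive relaxation (an assumption, not the paper's formula).

    importance: per-weight count of previous tasks that rely on the weight.
    capacity_used: fraction of the network already claimed, in [0, 1].
    Frozen weights get a small update rate that grows as the network fills
    up and shrinks as the weight matters more to previous tasks.
    """
    hard = hard_gradient_mask(cum_out, cum_in)
    relaxed = scale * capacity_used / (1.0 + importance)
    # Free weights keep a multiplier of 1; frozen ones get the relaxed rate.
    return np.maximum(hard, np.minimum(relaxed, 1.0))
```

In a training loop, the returned array would multiply the layer's weight gradient element-wise before the optimizer step. With `capacity_used` set to 0 the adaptive mask reduces to the hard mask, so the sketch recovers fully static parameters while the network still has spare capacity.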

Presentation Conference Type	Conference Paper (published)
Conference Name	ECML PKDD 2024
Start Date	Sep 9, 2024
End Date	Sep 13, 2024
Acceptance Date	May 27, 2024
Online Publication Date	Aug 22, 2024
Publication Date	Aug 31, 2024
Deposit Date	Sep 5, 2024
Publicly Available Date	Aug 23, 2025
Publisher	Springer
Peer Reviewed	Peer Reviewed
Pages	143-160
Series Title	Lecture Notes in Computer Science
Series Number	14943
Series ISSN	0302-9743
Book Title	European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part III
ISBN	9783031703515
DOI	https://doi.org/10.1007/978-3-031-70352-2_9
Public URL	https://uwe-repository.worktribe.com/output/12842056

Files

This file is under embargo until Aug 23, 2025 due to copyright reasons.

Contact Jun.Hong@uwe.ac.uk to request a copy for personal use.
