Pengxiang Wang
AdaHAT: Adaptive hard attention to the task in task-incremental learning
Wang, Pengxiang; Bo, Hongbo; Hong, Jun; Liu, Weiru; Mu, Kedian
Authors
Abstract
Catastrophic forgetting is a major issue in task-incremental learning, where a neural network loses what it has learned in previous tasks after being trained on new tasks. A number of architecture-based approaches have been proposed to address this issue. However, the architecture-based approaches suffer from another issue on network capacity when the network learns long sequences of tasks. As the network is trained on an increasing number of new tasks in a long sequence of tasks, more parameters become static to prevent the network from forgetting what it has learned in previous tasks. In this paper, we propose an adaptive task-based hard attention mechanism which allows adaptive updates to static parameters by taking into account the information about previous tasks on both the importance of these parameters to previous tasks and the current network capacity. We develop a new neural network architecture incorporating our proposed Adaptive Hard Attention to the Task (AdaHAT) mechanism. AdaHAT extends an existing architecture-based approach, Hard Attention to the Task (HAT), to learn long sequences of tasks in an incremental manner. We conduct experiments on a number of datasets and compare AdaHAT with a number of baselines, including HAT. Our experimental results show that AdaHAT achieves better average performance over tasks than these baselines, especially on long sequences of tasks, demonstrating the benefits from balancing the trade-off between stability and plasticity of a network when learning such sequences of tasks. Our code is available at github.com/pengxiang-wang/continual-learning-arena.
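The contrast the abstract draws between static parameters and adaptively updated ones can be illustrated with a small sketch. The first function follows the HAT-style hard gradient mask, where a weight's gradient is scaled by one minus the smaller of the cumulative attention values of the two units it connects. The second function is a hypothetical illustration only, not the formula from the paper: the names `adaptive_gradient_scale`, `free_fraction`, and `alpha`, and the rule `rate / (rate + importance)`, are assumptions chosen to show how importance to previous tasks and remaining capacity could jointly set a small, non-zero update factor.

```python
def hat_gradient_scale(a_out, a_in):
    """HAT-style scale for each weight (i, j): 1 - min(a_out[i], a_in[j]).

    a_out and a_in are cumulative attention masks over previous tasks for the
    output and input units of a layer. With binary masks, a weight whose two
    units were both used before gets scale 0 and becomes static.
    """
    return [[1.0 - min(ao, ai) for ai in a_in] for ao in a_out]


def adaptive_gradient_scale(count_out, count_in, free_fraction, alpha=0.1):
    """Hypothetical adaptive scale (an assumption, not the paper's formula).

    count_out / count_in: how many previous tasks used each unit (importance).
    free_fraction: share of parameters not yet claimed by any task (capacity).
    alpha: assumed hyperparameter controlling how far static weights may move.
    """
    # The less free capacity remains, the more static weights may be updated.
    rate = alpha * (1.0 - free_fraction)
    scale = []
    for co in count_out:
        row = []
        for ci in count_in:
            importance = min(co, ci)
            if importance == 0:
                row.append(1.0)  # unused by previous tasks: fully trainable
            else:
                # More important weights get a smaller update factor.
                row.append(rate / (rate + importance))
        scale.append(row)
    return scale


hard = hat_gradient_scale([1.0, 0.0], [1.0, 0.0])
soft = adaptive_gradient_scale([2, 0], [1, 0], free_fraction=0.5)
```

In this sketch the weight shared by previously used units is frozen under the hard mask (`hard[0][0]` is 0), while the adaptive variant leaves it a small positive factor that shrinks as importance grows and as free capacity increases.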
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | ECML PKDD 2024 |
Start Date | Sep 9, 2024 |
End Date | Sep 13, 2024 |
Acceptance Date | May 27, 2024 |
Online Publication Date | Aug 22, 2024 |
Publication Date | Aug 31, 2024 |
Deposit Date | Sep 5, 2024 |
Publicly Available Date | Aug 23, 2025 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Pages | 143-160 |
Series Title | Lecture Notes in Computer Science |
Series Number | 14943 |
Series ISSN | 0302-9743 |
Book Title | European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part III |
ISBN | 9783031703515 |
DOI | https://doi.org/10.1007/978-3-031-70352-2_9 |
Public URL | https://uwe-repository.worktribe.com/output/12842056 |
Files
This file is under embargo until Aug 23, 2025 due to copyright reasons.
Contact Jun.Hong@uwe.ac.uk to request a copy for personal use.