Sentence encoding for dialogue act classification
Duran, Nathan; Battle, Steve; Smith, Jim
Authors
Dr Nathan Duran Nathan.Duran@uwe.ac.uk
Lecturer in Artificial Intelligence
Steve Battle Steve.Battle@uwe.ac.uk
Senior Lecturer
Jim Smith James.Smith@uwe.ac.uk
Professor in Interactive Artificial Intelligence
Abstract
In this study, we investigate the process of generating single-sentence representations for the purpose of Dialogue Act (DA) classification, including several aspects of text pre-processing and input representation which are often overlooked or underreported within the literature, such as the number of words to keep in the vocabulary or in input sequences. We assess each of these with respect to two DA-labelled corpora, using a range of supervised models representative of those most frequently applied to the task. Additionally, we compare context-free word embedding models with transfer learning via pre-trained language models, including several based on the transformer architecture, such as Bidirectional Encoder Representations from Transformers (BERT) and XLNet, which have thus far not been widely explored for the DA classification task. Our findings indicate that these text pre-processing considerations do have a statistically significant effect on classification accuracy. Notably, we found that viable input sequence lengths and vocabulary sizes can be much smaller than those typically used in DA classification experiments, yielding no significant improvements beyond certain thresholds. We also show that in some cases the contextual sentence representations generated by language models do not reliably outperform supervised methods, although BERT and its derivative models do represent a significant improvement over supervised approaches and over much of the previous work on DA classification.
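The two pre-processing choices highlighted in the abstract, vocabulary size and input sequence length, can be illustrated with a minimal sketch. This is not the authors' pipeline: the whitespace tokenisation, the `<pad>` and `<unk>` tokens, and the function names `build_vocab` and `encode` are assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical special tokens (an assumption, not taken from the paper).
PAD, UNK = "<pad>", "<unk>"


def build_vocab(utterances, vocab_size):
    """Map the `vocab_size` most frequent words to integer ids.

    Ids 0 and 1 are reserved for padding and out-of-vocabulary words.
    """
    counts = Counter(word for utt in utterances for word in utt.lower().split())
    vocab = {PAD: 0, UNK: 1}
    for word, _ in counts.most_common(vocab_size):
        vocab[word] = len(vocab)
    return vocab


def encode(utterance, vocab, max_len):
    """Convert an utterance to exactly `max_len` ids by truncating or padding."""
    ids = [vocab.get(word, vocab[UNK]) for word in utterance.lower().split()]
    ids = ids[:max_len]                               # truncate long utterances
    return ids + [vocab[PAD]] * (max_len - len(ids))  # pad short utterances


utterances = ["yeah i think so", "do you think so", "yeah"]
vocab = build_vocab(utterances, vocab_size=3)
encoded = [encode(u, vocab, max_len=3) for u in utterances]
```

Here `vocab_size` and `max_len` are the two quantities whose effect on accuracy the study examines; words outside the retained vocabulary collapse to the unknown id, and tokens beyond `max_len` are discarded.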
Citation
Duran, N., Battle, S., & Smith, J. (2023). Sentence encoding for dialogue act classification. Natural Language Engineering, 29(3), 794-823. https://doi.org/10.1017/S1351324921000310
Journal Article Type | Article |
---|---|
Acceptance Date | Sep 20, 2021 |
Online Publication Date | Nov 2, 2021 |
Publication Date | 2023-05 |
Deposit Date | Nov 19, 2021 |
Publicly Available Date | Jul 7, 2023 |
Journal | Natural Language Engineering |
Print ISSN | 1351-3249 |
Electronic ISSN | 1469-8110 |
Publisher | Cambridge University Press (CUP) |
Peer Reviewed | Peer Reviewed |
Volume | 29 |
Issue | 3 |
Pages | 794-823 |
DOI | https://doi.org/10.1017/S1351324921000310 |
Keywords | Artificial Intelligence; Linguistics and Language; Language and Linguistics; Software |
Public URL | https://uwe-repository.worktribe.com/output/8167980 |
Publisher URL | https://www.cambridge.org/core/journals/natural-language-engineering/article/sentence-encoding-for-dialogue-act-classification/2EF3DC8E57D1019960D18FDE685B1EBA# |
Additional Information | Copyright: © The Author(s), 2021. Published by Cambridge University Press |
Files
Sentence encoding for dialogue act classification
(870 Kb)
PDF
Licence
http://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/