Towards idea mining: Problem-solution phrase extraction from text

Liu, Haixia; Brailsford, Tim; Goulding, James; Maul, Tomas; Tan, Tao; Chaudhuri, Debanjan

Authors

Haixia Liu

James Goulding

Tomas Maul

Tao Tan

Debanjan Chaudhuri



Contributors

Weitong Chen
Editor

Lina Yao
Editor

Taotao Cai
Editor

Shirui Pan
Editor

Tao Shen
Editor

Xue Li
Editor

Abstract

This paper investigates the feasibility of extracting problem-solution phrases from scientific publications using neural network approaches. Bidirectional Long Short-Term Memory with Conditional Random Fields (Bi-LSTM-CRFs) and Bidirectional Encoder Representations from Transformers (BERT) were evaluated on two datasets. The first, created by the University of Cambridge Computer Laboratory, contains 1000 positive examples of problems and solutions with the corresponding phrases annotated (UCCL1000). The F1-scores computed on the UCCL1000 dataset indicate that BERT is an effective approach for extracting solution phrases (F1-score of 97%) and problem phrases (F1-score of 83%). To test the model’s robustness on a different corpus with a different annotation scheme, a dataset of 488 problem-solution samples from the Conference on Neural Information Processing Systems (NIPS488) was collected and annotated by human readers. Both Bi-LSTM-CRFs and BERT performed dramatically worse on NIPS488 than on UCCL1000.
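The abstract reports F1-scores for phrase extraction but does not say how they were computed. As an illustration only, the sketch below shows one common way to score this kind of task: exact-match, phrase-level F1 over BIO-tagged tokens. The BIO tagging scheme, the label names PROBLEM and SOLUTION, and the helper names bio_to_spans and span_f1 are assumptions made for this example, not details taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (label, start index, end index exclusive)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labelled phrases from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        # Open a span on "B-", or leniently on a stray "I-" with no open span.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[str], pred: List[str], label: str) -> float:
    """Exact-match phrase-level F1 for a single label."""
    g = {s for s in bio_to_spans(gold) if s[0] == label}
    p = {s for s in bio_to_spans(pred) if s[0] == label}
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical tags for a seven-token sentence.
gold = ["O", "B-PROBLEM", "I-PROBLEM", "O", "B-SOLUTION", "I-SOLUTION", "I-SOLUTION"]
pred = ["O", "B-PROBLEM", "O", "O", "B-SOLUTION", "I-SOLUTION", "I-SOLUTION"]

print(span_f1(gold, pred, "SOLUTION"))  # 1.0: the predicted phrase matches exactly
print(span_f1(gold, pred, "PROBLEM"))   # 0.0: the predicted phrase is one token short
```

Under exact matching, a prediction that misses one token of a phrase counts as a full miss, so scores of this kind can differ considerably from token-level scores on the same output.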

Citation

Liu, H., Brailsford, T., Goulding, J., Maul, T., Tan, T., & Chaudhuri, D. (2023). Towards idea mining: Problem-solution phrase extraction from text. In W. Chen, L. Yao, T. Cai, S. Pan, T. Shen, & X. Li (Eds.), Advanced Data Mining and Applications 18th International Conference, ADMA 2022, Brisbane, QLD, Australia, November 28–30, 2022, Proceedings, Part II (3-14). https://doi.org/10.1007/978-3-031-22137-8_1

Conference Name ADMA 2022: International Conference on Advanced Data Mining and Applications
Conference Location Brisbane, QLD; Conference Country: Australia
Start Date Nov 30, 2022
End Date Dec 2, 2022
Acceptance Date Aug 19, 2022
Online Publication Date Nov 24, 2022
Publication Date Jan 16, 2023
Deposit Date Jan 16, 2023
Publicly Available Date Nov 25, 2024
Publisher Springer Verlag
Volume 13726 LNAI
Pages 3-14
Series Title Lecture Notes in Computer Science book series (LNAI, volume 13726)
Series ISSN 0302-9743; 1611-3349
Book Title Advanced Data Mining and Applications 18th International Conference, ADMA 2022, Brisbane, QLD, Australia, November 28–30, 2022, Proceedings, Part II
ISBN 9783031221361
DOI https://doi.org/10.1007/978-3-031-22137-8_1
Keywords Text mining; Problem-solution extraction; NLP
Public URL https://uwe-repository.worktribe.com/output/10339175
Publisher URL https://link.springer.com/chapter/10.1007/978-3-031-22137-8_1
Related Public URLs https://link.springer.com/book/10.1007/978-3-031-22137-8; https://www.springer.com/series/558
Additional Information First Online: 24 November 2022; Conference Acronym: ADMA; Conference Name: International Conference on Advanced Data Mining and Applications; Conference City: Brisbane, QLD; Conference Country: Australia; Conference Year: 2022; Conference Start Date: 30 November 2022; Conference End Date: 2 December 2022; Conference Number: 18; Conference ID: adma2022; Conference URL: https://adma2022.uqcloud.net/index.html; Type: Single-blind; Conference Management System: CMT3; Number of Submissions Sent for Review: 198; Number of Full Papers Accepted: 72; Number of Short Papers Accepted: 0; Acceptance Rate of Full Papers: 36% (computed as "Number of Full Papers Accepted / Number of Submissions Sent for Review * 100" and rounded to a whole number); Average Number of Reviews per Paper: 5; Average Number of Papers per Reviewer: 3; External Reviewers Involved: No
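The acceptance rate quoted in the Additional Information entry can be checked from the two counts given there. The short sketch below applies the stated equation; the function name acceptance_rate is introduced here for illustration only.

```python
def acceptance_rate(accepted: int, reviewed: int) -> int:
    """Accepted / reviewed * 100, rounded to a whole number."""
    return round(accepted / reviewed * 100)


# 72 full papers accepted out of 198 submissions sent for review.
print(acceptance_rate(72, 198))  # 36
```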