Towards idea mining: Problem-solution phrases extraction from text

Liu, Haixia; Brailsford, Tim; Goulding, James; Maul, Tomas; Tan, Tao; Chaudhuri, Debanjan


Editors: Weitong Chen, Lina Yao, Taotao Cai, Shirui Pan, Tao Shen, Xue Li


This paper investigates the feasibility of extracting problem-solution phrases from scientific publications using neural network approaches. Bidirectional Long Short-Term Memory with Conditional Random Fields (Bi-LSTM-CRFs) and Bidirectional Encoder Representations from Transformers (BERT) were evaluated on two datasets. The first, UCCL1000, was created by the University of Cambridge Computer Laboratory and contains 1000 positive examples of problems and solutions with the corresponding phrases annotated. The F1-scores computed on UCCL1000 indicate that BERT is an effective approach for extracting solution phrases (F1-score of 97%) and problem phrases (F1-score of 83%). To test the model’s robustness on a different corpus with a different annotation scheme, a second dataset of 488 problem-solution samples from the Conference on Neural Information Processing Systems (NIPS488) was collected and annotated by human readers. Both Bi-LSTM-CRFs and BERT performed dramatically worse on NIPS488 than on UCCL1000.
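
The abstract reports separate F1-scores for problem and solution phrases but does not say how they were computed. The following is a minimal sketch of one common convention for phrase extraction: exact-match, span-level F1 over BIO-tagged token sequences. The BIO scheme, the tag names `PROBLEM` and `SOLUTION`, and the helper names `bio_to_spans` and `span_f1` are assumptions for illustration, not details taken from the paper.

```python
from typing import List, Set, Tuple

# (label, start index, exclusive end index); hypothetical representation
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO tag sequence into a set of labelled spans."""
    spans: Set[Span] = set()
    label, start = None, 0
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        # "B-" always opens a span; a stray "I-" is leniently treated as an opener.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]], label: str) -> float:
    """Exact-match span-level F1 for one label over a list of sentences."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g = {s for s in bio_to_spans(g_tags) if s[0] == label}
        p = {s for s in bio_to_spans(p_tags) if s[0] == label}
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy example (invented tags, not data from UCCL1000 or NIPS488).
gold = [["B-PROBLEM", "I-PROBLEM", "O", "B-SOLUTION", "I-SOLUTION"]]
pred = [["B-PROBLEM", "O", "O", "B-SOLUTION", "I-SOLUTION"]]

print("solution F1:", span_f1(gold, pred, "SOLUTION"))
print("problem F1:", span_f1(gold, pred, "PROBLEM"))
```

Under exact matching, the truncated problem phrase in the toy prediction counts as a miss, so boundary disagreements between annotation schemes can lower scores sharply. A token-level F1 would be more forgiving; which variant the paper uses is not stated in the abstract.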


Liu, H., Brailsford, T., Goulding, J., Maul, T., Tan, T., & Chaudhuri, D. (2022). Towards idea mining: Problem-solution phrases extraction from text. In W. Chen, L. Yao, T. Cai, S. Pan, T. Shen, & X. Li (Eds.), ADMA 2022: Advanced Data Mining and Applications (3–14).

Conference Name: ADMA 2022: 18th International Conference on Advanced Data Mining and Applications
Conference Location: Brisbane, Australia
Start Date: Nov 28, 2022
End Date: Nov 30, 2022
Acceptance Date: Sep 15, 2022
Publication Date: Nov 24, 2022
Deposit Date: Oct 6, 2022
Publicly Available Date: Nov 25, 2024
Publisher: Springer
Volume: 13726
Pages: 3–14
Series Title: Lecture Notes in Computer Science book series (LNAI, Volume 13726)
Book Title: ADMA 2022: Advanced Data Mining and Applications
Chapter Number: 1
ISBN: 978-3-031-22136-1
Keywords: Text mining, Problem-solution extraction, NLP