Towards computation of novel ideas from corpora of scientific text

Liu, Haixia; Goulding, James; Brailsford, Tim


P Rodrigues

V Santos Costa

J Gama

A Jorge

C Soares


© Springer International Publishing Switzerland 2015. In this work we present a method for the computation of novel ‘ideas’ from corpora of scientific text. The system first detects concept noun-phrases within the titles and abstracts of publications using Part-Of-Speech tagging, then classifies these into sets of problem and solution phrases via a target-word matching approach. By defining an idea as a co-occurring (problem, solution) pair, known-idea triples can be constructed by additionally assigning a relevance value (computed via either phrase co-occurrence or an ‘idea frequency-inverse document frequency’ score). The resulting triples are then fed into a collaborative filtering algorithm, where problem-phrases are treated as users and solution-phrases as the items to be recommended. The final output is a ranked list of novel idea candidates, which researchers could integrate into their hypothesis generation processes. The approach is evaluated on a subset of publications from the journal Science; precision, recall and F-measure results across a variety of model parametrizations indicate that the system is capable of generating useful novel ideas in an automated fashion.
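
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the problem and solution phrases have already been extracted for each document, uses raw co-occurrence counts as the relevance value, and substitutes a simple neighbourhood-based collaborative filter with cosine similarity, since the abstract does not specify the exact algorithm. The toy documents and the function names `known_idea_triples`, `cosine` and `recommend` are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy corpus: each document is reduced to
# (set of problem phrases, set of solution phrases).
docs = [
    ({"protein folding"}, {"molecular dynamics", "neural network"}),
    ({"protein folding", "drug design"}, {"molecular dynamics"}),
    ({"drug design"}, {"molecular dynamics", "docking"}),
    ({"image labelling"}, {"neural network"}),
]

def known_idea_triples(docs):
    """Count how often each (problem, solution) pair co-occurs in a document."""
    counts = Counter()
    for problems, solutions in docs:
        for p in problems:
            for s in solutions:
                counts[(p, s)] += 1
    return counts  # maps (problem, solution) -> relevance

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(triples, problem, top_n=3):
    """Rank solutions not yet paired with `problem`, treating problems as
    users and solutions as items (assumed neighbourhood-based scheme)."""
    rows = defaultdict(dict)
    for (p, s), relevance in triples.items():
        rows[p][s] = relevance
    target = rows.get(problem, {})
    scores = defaultdict(float)
    for other, solutions in rows.items():
        if other == problem:
            continue
        sim = cosine(target, solutions)
        if sim == 0.0:
            continue
        for s, relevance in solutions.items():
            if s not in target:  # keep only novel pairings
                scores[s] += sim * relevance
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

triples = known_idea_triples(docs)
# For this toy data, "docking" is the only unseen solution suggested
# for the problem "protein folding".
print(recommend(triples, "protein folding"))
```

In this sketch a solution is proposed for a problem only if it has never co-occurred with that problem, which mirrors the abstract's notion of a novel idea candidate; the ‘idea frequency-inverse document frequency’ weighting could replace the raw counts, but its definition is not given in the abstract and is therefore not reproduced here.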


Liu, H., Goulding, J., & Brailsford, T. (2015). Towards computation of novel ideas from corpora of scientific text. Lecture Notes in Artificial Intelligence, 9285, 541-556.

Journal Article Type: Conference Paper
Conference Name: Joint European Conference on Machine Learning and Knowledge Discovery in Databases
Acceptance Date: Jan 1, 2015
Publication Date: Jan 1, 2015
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Print ISSN: 0302-9743
Electronic ISSN: 1611-3349
Publisher: Springer Verlag
Peer Reviewed: Peer Reviewed
Volume: 9285
Pages: 541-556
Series Title: Lecture Notes in Computer Science
Book Title: Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science
Keywords: idea mining, text mining, natural language processing, recommender systems, collaborative filtering