Feature construction and selection using genetic programming and a genetic algorithm

Authors: Matthew G. Smith; Lawrence (Larry) Bull, School Director (Research & Enterprise) and Professor


Editors: Conor Ryan; Terence Soule; Maarten Keijzer; Edward Tsang; Riccardo Poli; Ernesto Costa


The use of machine learning techniques to automatically analyse data for information is becoming increasingly widespread. In this paper we examine the use of Genetic Programming and a Genetic Algorithm to pre-process data before it is classified using the C4.5 decision tree learning algorithm. The Genetic Programming is used to construct new features from those available in the data, a potentially significant process for data mining since it gives consideration to hidden relationships between features. The Genetic Algorithm is used to determine which such features are the most predictive. Using ten well-known datasets we show that our approach, in comparison to C4.5 alone, provides marked improvement in a number of cases.
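
The abstract describes a pipeline: construct new features, select the most predictive ones, then classify. The following is a minimal Python sketch of that idea under several loudly stated assumptions, not the authors' implementation. Exhaustively enumerated pairwise expressions stand in for Genetic Programming trees, a one-threshold rule scored on training data stands in for C4.5, and the toy data, the function names, the penalty term and all GA settings are hypothetical.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def make_data(rng, n=200):
    """Hypothetical toy data: the class depends on the product x0 * x1."""
    rows = [[rng.random() for _ in range(3)] for _ in range(n)]
    labels = [int(x[0] * x[1] > 0.25) for x in rows]
    return rows, labels

def candidates(n_features=3):
    """Original features plus every pairwise (op, i, j) combination.
    Assumption: this enumeration replaces GP-evolved expression trees."""
    cands = [("id", i, i) for i in range(n_features)]
    for op in OPS:
        for i in range(n_features):
            for j in range(i + 1, n_features):
                cands.append((op, i, j))
    return cands

def evaluate(feature, x):
    """Compute one candidate feature on one row of original features."""
    op, i, j = feature
    return x[i] if op == "id" else OPS[op](x[i], x[j])

def stump_accuracy(values, labels):
    """Best training accuracy of a one-threshold rule (stand-in for C4.5)."""
    best, n = 0.0, len(labels)
    for t in values:
        hits = sum((v > t) == (y == 1) for v, y in zip(values, labels))
        best = max(best, hits / n, (n - hits) / n)
    return best

def feature_scores(rows, labels, cands):
    """Score every candidate feature on its own."""
    return [stump_accuracy([evaluate(f, x) for x in rows], labels) for f in cands]

def fitness(mask, scores, penalty=0.01):
    """Best score among selected features, minus a small cost per feature."""
    chosen = [s for bit, s in zip(mask, scores) if bit]
    return max(chosen) - penalty * len(chosen) if chosen else 0.0

def select_features(scores, rng, pop_size=20, generations=30, p_mut=0.05):
    """Simple generational GA over bit masks: tournament selection,
    one-point crossover, bit-flip mutation, and elitism."""
    n = len(scores)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, scores), reverse=True)
        nxt = [pop[0][:]]  # elitism: keep the best mask unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=lambda m: fitness(m, scores))
            b = max(rng.sample(pop, 2), key=lambda m: fitness(m, scores))
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            nxt.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = nxt
    return max(pop, key=lambda m: fitness(m, scores))

if __name__ == "__main__":
    rng = random.Random(0)
    rows, labels = make_data(rng)
    cands = candidates()
    scores = feature_scores(rows, labels, cands)
    mask = select_features(scores, rng)
    print([c for bit, c in zip(mask, cands) if bit], fitness(mask, scores))
```

In this toy setting the constructed feature `("*", 0, 1)` separates the classes perfectly while no single original feature does, which illustrates the "hidden relationships between features" the abstract refers to. A faithful reproduction would evolve the expressions with Genetic Programming and score feature subsets with C4.5 on held-out data.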


Smith, M. G., & Bull, L. (2003). Feature construction and selection using genetic programming and a genetic algorithm. Lecture Notes in Artificial Intelligence, 2610, 229-237.

Journal Article Type: Conference Paper
Publication Date: Jan 1, 2003
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Print ISSN: 0302-9743
Electronic ISSN: 1611-3349
Publisher: Springer Verlag
Peer Reviewed: Not Peer Reviewed
Volume: 2610
Pages: 229-237
Series Title: Lecture Notes in Computer Science
Series Number: 2610
ISBN: 354000971X; 9783540009719
Keywords: programming techniques, computation by abstract devices, algorithm analysis and problem complexity, artificial intelligence, pattern recognition, bioinformatics