An automated machine learning approach for classifying infrastructure cost data
Dopazo, Daniel Adanza; Mahdjoubi, Lamine; Gething, Bill; Mahamadu, Abdul‐Majeed
Authors
Daniel Dopazo Daniel.Dopazo@uwe.ac.uk
Research Fellow - Data Science
Lamine Mahdjoubi Lamine.Mahdjoubi@uwe.ac.uk
Professor in Info. & Communication & Tech.
Bill Gething Bill.Gething@uwe.ac.uk
Professor in Architecture
Abdul‐Majeed Mahamadu
Abstract
Data on infrastructure project costs are often unstructured and lack consistency. To enable costs to be compared within and between organizations, large amounts of data must be classified to a common standard, which is typically a manual process. Manual classification is time-consuming, error-prone, inconsistent, and subjective, as it is based on human judgment. This paper describes a novel approach for automating the process by harnessing natural language processing to identify the relevant keywords in the text descriptions and by implementing machine learning classifiers to emulate the expert's knowledge. The task was to identify “extra over” cost items, to identify conversion factors, and to recognize the correct work breakdown structure (WBS) category. The results show that 94% of the “extra over” cases and 90% of the cases that needed conversion were correctly classified, and that the associated conversion factor was predicted with 87% accuracy. Finally, the WBS categories were identified with 72% accuracy. The approach has the potential to provide a step change in the speed and accuracy of structuring and classifying infrastructure cost data for benchmarking.
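The abstract does not say which keyword features or classifiers the authors used, so the following is only a minimal sketch of the general idea: tokenize each cost-item description into keywords and train a text classifier to flag “extra over” items. It assumes a multinomial Naive Bayes model with Laplace smoothing as a stand-in classifier. The descriptions, the labels `extra_over` and `standard`, and the names `tokenize` and `NaiveBayesTextClassifier` are all hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical cost-item descriptions and labels, invented for illustration only.
TRAINING_DATA = [
    ("extra over excavation for breaking out rock", "extra_over"),
    ("extra over pipe for bends and junctions", "extra_over"),
    ("extra over kerb for dropped kerb units", "extra_over"),
    ("excavation of trench for drainage pipe", "standard"),
    ("supply and lay concrete kerb", "standard"),
    ("reinforced concrete to bridge deck", "standard"),
]


def tokenize(text):
    """Lowercase the description and split it into alphabetic keyword tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing."""

    def fit(self, samples):
        self.class_counts = Counter()            # number of documents per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocabulary = set()
        for text, label in samples:
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


classifier = NaiveBayesTextClassifier().fit(TRAINING_DATA)
print(classifier.predict("extra over manhole for rock excavation"))  # extra_over
print(classifier.predict("supply and lay concrete pipe"))            # standard
```

The same pattern would extend to the other tasks named in the abstract by swapping the label set, for example WBS categories instead of the two labels used here. A real system would need far more training data and a held-out evaluation set to produce accuracy figures like those reported.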
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 9, 2023 |
Online Publication Date | Oct 19, 2023 |
Publication Date | Apr 1, 2024 |
Deposit Date | Oct 20, 2023 |
Publicly Available Date | Apr 11, 2024 |
Journal | Computer-Aided Civil and Infrastructure Engineering |
Print ISSN | 1093-9687 |
Electronic ISSN | 1467-8667 |
Publisher | Wiley |
Peer Reviewed | Peer Reviewed |
Volume | 39 |
Issue | 7 |
Pages | 1061-1076 |
DOI | https://doi.org/10.1111/mice.13114 |
Keywords | Computational Theory and Mathematics; Computer Graphics and Computer-Aided Design; Computer Science Applications; Civil and Structural Engineering; Building and Construction |
Public URL | https://uwe-repository.worktribe.com/output/11385155 |
Additional Information | Accepted: 2023-10-09; Published: 2023-10-19 |
Files
An automated machine learning approach for classifying infrastructure cost data
(883 Kb)
PDF
Licence
http://creativecommons.org/licenses/by/4.0/
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/