Deep learning with small datasets: Using autoencoders to address limited datasets in construction management

Davila Delgado, Manuel; Oyedele, Lukumon

doi:10.1016/j.asoc.2021.107836

Authors

Manuel Davila Delgado Manuel.Daviladelgado@uwe.ac.uk
Associate Professor - AR/VR Development with Artificial Intelligence

Lukumon Oyedele L.Oyedele@uwe.ac.uk
Professor in Enterprise & Project Management

Abstract

Large datasets are necessary for deep learning because the performance of the algorithms improves as the size of the dataset increases. Poor data management practices and the low level of digitisation in the construction industry are a major hurdle to compiling large datasets, which in many cases can be prohibitively expensive. In other fields, such as computer vision, data augmentation techniques and synthetic data have been used successfully to address issues with limited datasets. In this study, undercomplete, sparse, deep and variational autoencoders are investigated as methods for data augmentation and generation of synthetic data. Two financial datasets of underground and overhead power transmission projects are used as case studies. The datasets were augmented using the autoencoders, and the project cost was predicted using a deep neural network regressor. All the augmented datasets yielded better results than the original dataset. On average, the autoencoders provide a model score improvement of 7.2% and 11.5% for the underground and overhead datasets, respectively. The mean absolute error (MAE) and root mean squared error (RMSE) are lower for all autoencoders as well. The average error improvement for the underground and overhead datasets is 22.9% and 56.5%, respectively. Variational autoencoders provided more robust results and better represented the non-linear correlations among the attributes in both datasets. The novelty of this study is that it presents an approach to improve existing datasets, and thus the generalisation of deep learning models, when other approaches are not feasible. Moreover, this study provides practitioners with methods to address the limited access to big datasets, a visualisation method to extract insights from non-linear correlations in data, and a way to improve data privacy and enable the sharing of sensitive data using analogous synthetic data. The main contribution to knowledge of this study is that it presents a data augmentation technique for transformation variant data. Many techniques have been developed for transformation invariant data, and these have contributed to improving the performance of deep learning models. This study showed that autoencoders are a good option for data augmentation for transformation variant data.
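The general idea of autoencoder-based augmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a purely linear undercomplete autoencoder trained with plain gradient descent in NumPy, a made-up toy table standing in for the project datasets, and hypothetical function names (train_autoencoder, augment). Synthetic rows are produced by encoding existing rows, adding small noise in the latent space, and decoding.

```python
import numpy as np

# Hypothetical toy data standing in for a small tabular dataset:
# 60 rows, 5 correlated attributes driven by 2 hidden factors.
rng = np.random.default_rng(42)
factors = rng.normal(size=(60, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + 0.1 * rng.normal(size=(60, 5))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each attribute


def train_autoencoder(X, latent_dim=2, lr=0.05, epochs=1000, seed=0):
    """Fit a linear undercomplete autoencoder by gradient descent.

    Minimises the mean squared reconstruction error of X @ W_enc @ W_dec.
    """
    init = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = init.normal(scale=0.1, size=(d, latent_dim))
    W_dec = init.normal(scale=0.1, size=(latent_dim, d))
    for _ in range(epochs):
        Z = X @ W_enc                  # encode to the bottleneck
        err = Z @ W_dec - X            # reconstruction error
        grad_dec = Z.T @ err / n
        grad_enc = X.T @ (err @ W_dec.T) / n
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec


def augment(X, W_enc, W_dec, n_new, noise=0.1, seed=0):
    """Create synthetic rows by perturbing latent codes and decoding them."""
    sampler = np.random.default_rng(seed)
    idx = sampler.integers(0, X.shape[0], size=n_new)  # rows to resample
    Z = X[idx] @ W_enc
    Z_noisy = Z + sampler.normal(size=Z.shape) * noise * Z.std(axis=0)
    return Z_noisy @ W_dec


W_enc, W_dec = train_autoencoder(X)

# Compare against the trivial baseline of reconstructing every row as zeros.
mse_baseline = np.mean(X ** 2)
mse_after = np.mean((X @ W_enc @ W_dec - X) ** 2)

X_aug = augment(X, W_enc, W_dec, n_new=120)
X_combined = np.vstack([X, X_aug])  # original plus synthetic rows
print(X_combined.shape, mse_baseline, mse_after)
```

In a workflow like the one the abstract outlines, the combined table would then be used to train a regressor, and MAE and RMSE on held-out original rows would be compared with and without the synthetic rows. The sparse, deep and variational variants named in the abstract would replace the two linear maps with non-linear networks and different loss terms.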

Journal Article Type	Article
Acceptance Date	Aug 16, 2021
Online Publication Date	Aug 25, 2021
Publication Date	2021-11
Deposit Date	Aug 30, 2021
Publicly Available Date	Aug 26, 2022
Journal	Applied Soft Computing
Print ISSN	1568-4946
Publisher	Elsevier
Peer Reviewed	Peer Reviewed
Volume	112
Article Number	107836
DOI	https://doi.org/10.1016/j.asoc.2021.107836
Public URL	https://uwe-repository.worktribe.com/output/7717930

Files

Deep learning with small datasets: using autoencoders to address limited datasets in construction management (1 Mb)
PDF

Licence
http://creativecommons.org/licenses/by-nc-nd/4.0/

Publisher Licence URL
http://www.rioxx.net/licenses/all-rights-reserved

Copyright Statement
This is the author's accepted manuscript. The final published version is available here: https://doi.org/10.1016/j.asoc.2021.107836

Deep learning with small datasets: Using autoencoders to address limited datasets in construction management (1.5 Mb)
Document

Licence
http://creativecommons.org/licenses/by-nc-nd/4.0/

Publisher Licence URL
http://www.rioxx.net/licenses/all-rights-reserved

Copyright Statement
This is the author's accepted manuscript. The final published version is available here: https://doi.org/10.1016/j.asoc.2021.107836
