Deep learning with small datasets: Using autoencoders to address limited datasets in construction management

Davila Delgado, Manuel; Oyedele, Lukumon

Authors

Manuel Davila Delgado (Manuel.Daviladelgado@uwe.ac.uk)
Associate Professor - AR/VR Development with Artificial Intelligence

Lukumon Oyedele (L.Oyedele@uwe.ac.uk)
Professor in Enterprise & Project Management



Abstract

Large datasets are necessary for deep learning because the performance of the algorithms improves as dataset size increases. Poor data management practices and the low level of digitisation of the construction industry represent a major hurdle to compiling large datasets, which in many cases can be prohibitively expensive. In other fields, such as computer vision, data augmentation techniques and synthetic data have been used successfully to address issues with limited datasets. In this study, undercomplete, sparse, deep and variational autoencoders are investigated as methods for data augmentation and generation of synthetic data. Two financial datasets of underground and overhead power transmission projects are used as case studies. The datasets were augmented using the autoencoders, and the project cost was predicted using a deep neural network regressor. All the augmented datasets yielded better results than the original dataset. On average, the autoencoders provide a model score improvement of 7.2% and 11.5% for the underground and overhead datasets, respectively. MAE and RMSE are lower for all autoencoders as well. The average error improvement for the underground and overhead datasets is 22.9% and 56.5%, respectively. Variational autoencoders provided more robust results and better represented the non-linear correlations among the attributes in both datasets. The novelty of this study is that it presents an approach to improve existing datasets and thus improve the generalisation of deep learning models when other approaches are not feasible. Moreover, this study provides practitioners with methods to address the limited access to big datasets, a visualisation method to extract insights from non-linear correlations in data, and a way to improve data privacy and to enable sharing sensitive data using analogous synthetic data. The main contribution to knowledge of this study is that it presents a data augmentation technique for transformation-variant data.
Many techniques have been developed for transformation-invariant data, and these have contributed to improving the performance of deep learning models. This study showed that autoencoders are a good option for data augmentation for transformation-variant data.
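The augmentation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-hidden-layer undercomplete autoencoder written in NumPy, trained by plain gradient descent on randomly generated toy data, and it creates synthetic rows by adding Gaussian noise to the latent codes before decoding. All function names, hyperparameters and the toy dataset are hypothetical, and the sparse, deep and variational variants studied in the paper are not covered.

```python
import numpy as np


def init_params(n_features, latent_dim, rng):
    # Small random weights; the bottleneck (latent_dim < n_features)
    # is what makes the autoencoder "undercomplete".
    return {
        "W1": rng.normal(0.0, 0.1, (n_features, latent_dim)),
        "b1": np.zeros(latent_dim),
        "W2": rng.normal(0.0, 0.1, (latent_dim, n_features)),
        "b2": np.zeros(n_features),
    }


def encode(p, X):
    return np.tanh(X @ p["W1"] + p["b1"])


def decode(p, H):
    return H @ p["W2"] + p["b2"]


def reconstruction_mse(p, X):
    return float(np.mean((decode(p, encode(p, X)) - X) ** 2))


def train(p, X, epochs=2000, lr=0.05):
    # Full-batch gradient descent on the squared reconstruction error.
    n = X.shape[0]
    for _ in range(epochs):
        H = encode(p, X)
        err = decode(p, H) - X
        dZ = (err @ p["W2"].T) * (1.0 - H ** 2)  # backprop through tanh
        p["W2"] -= lr * (H.T @ err) / n
        p["b2"] -= lr * err.mean(axis=0)
        p["W1"] -= lr * (X.T @ dZ) / n
        p["b1"] -= lr * dZ.mean(axis=0)
    return p


def augment(X_raw, latent_dim=2, n_new=None, noise=0.1, seed=0):
    # Assumes every column has non-zero variance (needed to standardise).
    rng = np.random.default_rng(seed)
    mu, sigma = X_raw.mean(axis=0), X_raw.std(axis=0)
    X = (X_raw - mu) / sigma
    p = train(init_params(X.shape[1], latent_dim, rng), X)
    n_new = X.shape[0] if n_new is None else n_new
    # Pick existing rows, perturb their latent codes, decode to new rows.
    idx = rng.integers(0, X.shape[0], size=n_new)
    H = encode(p, X[idx])
    H_noisy = H + rng.normal(0.0, noise, H.shape)
    synthetic = decode(p, H_noisy) * sigma + mu  # back to original units
    return np.vstack([X_raw, synthetic]), p


# Hypothetical toy table: 40 rows, 4 non-linearly related attributes.
toy_rng = np.random.default_rng(42)
z = toy_rng.normal(size=(40, 2))
toy = np.column_stack([z[:, 0], z[:, 1], z[:, 0] * z[:, 1], z[:, 0] ** 2 + z[:, 1]])

augmented, params = augment(toy)
print(toy.shape, "->", augmented.shape)
```

In a workflow like the one the abstract describes, the augmented table would then be used to train the cost regressor, and MAE and RMSE would be compared against a model trained on the original rows only.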

Citation

Davila Delgado, M., & Oyedele, L. (2021). Deep learning with small datasets: Using autoencoders to address limited datasets in construction management. Applied Soft Computing, 112, Article 107836. https://doi.org/10.1016/j.asoc.2021.107836

Journal Article Type: Article
Acceptance Date: Aug 16, 2021
Online Publication Date: Aug 25, 2021
Publication Date: 2021-11
Deposit Date: Aug 30, 2021
Publicly Available Date: Aug 26, 2022
Journal: Applied Soft Computing
Print ISSN: 1568-4946
Publisher: Elsevier
Peer Reviewed: Yes
Volume: 112
Article Number: 107836
DOI: https://doi.org/10.1016/j.asoc.2021.107836
Public URL: https://uwe-repository.worktribe.com/output/7717930
