MIAEC: Missing data imputation based on the evidence Chain

Xu, Xiaolong; Chong, Weizhi; Li, Shancang; Arabo, Abdullahi

Authors

Xiaolong Xu

Weizhi Chong

Shancang Li

Abdullahi Arabo Abdullahi.Arabo@uwe.ac.uk
Associate professor of Cyber Science and Network Security



Abstract

© 2013 IEEE. Missing or incorrect data caused by improper operations can seriously compromise security investigations. Missing data not only damages the integrity of the information but also biases data mining and analysis. It is therefore necessary to impute missing values during data preprocessing to mitigate data loss caused by human error and faulty operations. Existing missing-value imputation approaches cannot satisfy analysis requirements because of their low accuracy and poor stability; in particular, their imputation accuracy decreases rapidly as the rate of missing data increases. In this paper, we propose a novel missing value imputation algorithm based on the evidence chain (MIAEC), which first mines all relevant evidence of missing values in each data tuple and then combines this evidence to build the evidence chain used to estimate the missing values. To extend MIAEC to large-scale data processing, we apply the map-reduce programming model to distribute and parallelize MIAEC. Experimental results show that the proposed approach provides higher imputation accuracy than the missing data imputation algorithm based on naive Bayes, the mode imputation algorithm, and the missing data imputation algorithm based on K-nearest neighbor. MIAEC's imputation accuracy also holds up as the rate of missing values increases or the position of the missing values changes. MIAEC is also shown to be suitable for distributed computing platforms and can achieve an ideal speedup ratio.
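
For orientation, the mode imputation baseline named in the abstract can be sketched as follows. This is a minimal illustration of that baseline only, not of MIAEC, whose evidence-chain construction is not specified on this page. The function name `mode_impute`, the use of `None` as the missing-value marker, and the sample records are all assumptions made for the example.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing cell (hypothetical choice)


def mode_impute(rows):
    """Fill each missing cell with the most frequent observed value in its column."""
    if not rows:
        return []
    n_cols = len(rows[0])
    modes = []
    for j in range(n_cols):
        # Collect only the observed (non-missing) values of column j.
        observed = [row[j] for row in rows if row[j] is not MISSING]
        # A column with no observed values stays missing.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else MISSING)
    # Build new rows so the input is left unmodified.
    return [[modes[j] if v is MISSING else v for j, v in enumerate(row)]
            for row in rows]


# Made-up categorical records, purely for illustration.
rows = [
    ["tcp", "allow"],
    ["tcp", None],
    ["udp", "allow"],
    [None, "deny"],
]
filled = mode_impute(rows)
```

Because the mode is computed per column and ignores the other attributes of a tuple, every missing cell in a column receives the same value. The abstract describes MIAEC as instead combining the relevant evidence found within each data tuple.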

Citation

Xu, X., Chong, W., Li, S., & Arabo, A. (2018). MIAEC: Missing data imputation based on the evidence Chain. IEEE Access, 6, 12983-12992. https://doi.org/10.1109/ACCESS.2018.2803755

Journal Article Type Article
Acceptance Date Jan 20, 2018
Online Publication Date Feb 21, 2018
Publication Date Feb 21, 2018
Deposit Date Jan 29, 2018
Publicly Available Date Feb 22, 2018
Journal IEEE Access
Electronic ISSN 2169-3536
Publisher Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed Peer Reviewed
Volume 6
Pages 12983-12992
DOI https://doi.org/10.1109/ACCESS.2018.2803755
Keywords data preprocessing, digital forensics, map-reduce
Public URL https://uwe-repository.worktribe.com/output/871311
Publisher URL http://dx.doi.org/10.1109/ACCESS.2018.2803755
Additional Information (c) 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
