MIAEC: Missing data imputation based on the evidence Chain
Xu, Xiaolong; Chong, Weizhi; Li, Shancang; Arabo, Abdullahi
Authors
Xiaolong Xu
Weizhi Chong
Shancang Li
Abdullahi Arabo Abdullahi.Arabo@uwe.ac.uk
Associate professor of Cyber Science and Network Security
Abstract
© 2013 IEEE. Missing or incorrect data caused by improper operations can seriously compromise security investigations. Missing data not only damages the integrity of the information but also biases data mining and analysis. It is therefore necessary to impute missing values during data preprocessing to reduce the impact of data lost through human error and improper operations. Existing missing-value imputation approaches cannot satisfy analysis requirements because of their low accuracy and poor stability; in particular, their imputation accuracy decreases rapidly as the rate of missing data increases. In this paper, we propose a novel missing value imputation algorithm based on the evidence chain (MIAEC), which first mines all relevant evidence of missing values in each data tuple and then combines this evidence to build the evidence chain used to estimate the missing values. To extend MIAEC to large-scale data processing, we apply the map-reduce programming model to distribute and parallelize MIAEC. Experimental results show that the proposed approach provides higher imputation accuracy than the missing data imputation algorithm based on naive Bayes, the mode imputation algorithm, and the missing data imputation algorithm based on K-nearest neighbor. Its imputation accuracy also holds up as the rate of missing values increases or the position of the missing values changes. MIAEC is also shown to be suitable for distributed computing platforms and can achieve an ideal speedup ratio.
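The abstract gives only the outline of the approach: gather evidence from the observed values in a tuple, combine it, and parallelize the counting with map-reduce. The sketch below illustrates that general idea in plain Python and is not the MIAEC algorithm itself. It assumes categorical attributes, treats each observed attribute-value pair as one piece of evidence, and combines evidence by summing co-occurrence counts; the scoring rule, the `MISSING` marker, and the function names `map_tuple`, `reduce_counts`, and `impute` are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

MISSING = None  # hypothetical marker for a missing value


def map_tuple(row):
    """Map phase: for each ordered pair of observed attributes in a row,
    emit ((evidence_attr, evidence_val, target_attr, target_val), 1)."""
    observed = [(a, v) for a, v in row.items() if v is not MISSING]
    for ev_attr, ev_val in observed:
        for tg_attr, tg_val in observed:
            if ev_attr != tg_attr:
                yield (ev_attr, ev_val, tg_attr, tg_val), 1


def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts per key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def impute(rows):
    """Fill each missing value with the candidate that co-occurs most often
    with the row's observed values (assumed scoring rule: sum of counts)."""
    counts = reduce_counts(p for row in rows for p in map_tuple(row))
    filled = []
    for row in rows:
        new_row = dict(row)
        evidence = {(a, v) for a, v in row.items() if v is not MISSING}
        for target, value in row.items():
            if value is not MISSING:
                continue
            scores = defaultdict(int)
            # Linear scan over all counts: simple, not efficient.
            for (ev_attr, ev_val, tg_attr, tg_val), n in counts.items():
                if tg_attr == target and (ev_attr, ev_val) in evidence:
                    scores[tg_val] += n
            if scores:  # leave the gap if no evidence supports any value
                new_row[target] = max(scores, key=scores.get)
        filled.append(new_row)
    return filled


rows = [
    {"os": "linux", "shell": "bash"},
    {"os": "linux", "shell": "bash"},
    {"os": "windows", "shell": "powershell"},
    {"os": "linux", "shell": MISSING},
]
print(impute(rows)[3])
```

Because the map step emits independent key-count pairs and the reduce step only sums them, the counting could in principle be spread across workers, which is the property the abstract relies on when it mentions map-reduce.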
Journal Article Type | Article |
---|---|
Acceptance Date | Jan 20, 2018 |
Online Publication Date | Feb 21, 2018 |
Publication Date | Feb 21, 2018 |
Deposit Date | Jan 29, 2018 |
Publicly Available Date | Feb 22, 2018 |
Journal | IEEE Access |
Electronic ISSN | 2169-3536 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Volume | 6 |
Pages | 12983-12992 |
DOI | https://doi.org/10.1109/ACCESS.2018.2803755 |
Keywords | data preprocessing, digital forensics, map-reduce |
Public URL | https://uwe-repository.worktribe.com/output/871311 |
Publisher URL | http://dx.doi.org/10.1109/ACCESS.2018.2803755 |
Additional Information | (c) 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Contract Date | Jan 29, 2018 |
Files
MIAEC Missing Data Imputation based on the Evidence Chain.pdf
(565 Kb)
PDF