Privacy preserving record linkage in the presence of missing values

Chi, Yuan; Hong, Jun; Jurek, Anna; Liu, Weiru; O'Reilly, Dermot


Authors

Yuan Chi

Jun Hong Jun.Hong@uwe.ac.uk
Professor in Artificial Intelligence

Anna Jurek

Weiru Liu

Dermot O'Reilly



Abstract

Record linkage is the problem of identifying records from two datasets that refer to the same entities (e.g. patients). One issue that has not been fully addressed is the presence of missing values in records. Another is how to preserve privacy and confidentiality during linkage. In this paper, we propose an approach for privacy preserving record linkage in the presence of missing values. For any missing value in a record, our approach imputes the similarity measure between the missing value and the value of the corresponding field in any possible matching record from the other dataset. For this similarity imputation, we use the k nearest neighbours (k-NNs) of the record with the missing value, drawn from the same dataset, together with their distances to that record. For privacy preservation, our approach uses the Bloom filter protocol in both settings: standard privacy preserving record linkage without missing values, and privacy preserving record linkage with missing values. We have conducted an experimental evaluation using three pairs of synthetic datasets with different rates of missing values. Our experimental results show the effectiveness and efficiency of our proposed approach.
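
The idea of k-NN similarity imputation over Bloom-filter-encoded fields can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the bigram tokenisation, the double-hashing scheme, the Dice coefficient as the similarity measure, the inverse-distance weighting of neighbours, and all function names (`bloom_encode`, `dice`, `record_distance`, `impute_similarity`). It is not the authors' implementation.

```python
import hashlib


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=256, num_hashes=4):
    """Encode a string as the set of bit positions set in a Bloom filter.

    Assumption: double hashing over SHA-1 and SHA-256 digests of each bigram.
    """
    bits = set()
    for gram in bigrams(value):
        h1 = int(hashlib.sha1(gram.encode()).hexdigest(), 16)
        h2 = int(hashlib.sha256(gram.encode()).hexdigest(), 16)
        for i in range(num_hashes):
            bits.add((h1 + i * h2) % size)
    return frozenset(bits)


def dice(a, b):
    """Dice coefficient between two Bloom filters given as sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def encode_record(raw):
    """Encode every non-missing field of a record; missing fields stay None."""
    return {f: (bloom_encode(v) if v is not None else None) for f, v in raw.items()}


def record_distance(r1, r2, skip_field):
    """Mean (1 - Dice) over fields present in both records, ignoring skip_field."""
    shared = [
        f for f in r1
        if f != skip_field and r1[f] is not None and r2.get(f) is not None
    ]
    if not shared:
        return 1.0
    return sum(1 - dice(r1[f], r2[f]) for f in shared) / len(shared)


def impute_similarity(record, field, dataset, candidate, k=2):
    """Impute sim(record[field], candidate[field]) when record[field] is missing.

    Assumption: an inverse-distance-weighted average of the similarities between
    the candidate's value and the values held by the record's k nearest
    neighbours in the same dataset.
    """
    ranked = sorted(
        (
            (record_distance(record, other, field), other)
            for other in dataset
            if other is not record and other.get(field) is not None
        ),
        key=lambda pair: pair[0],
    )[:k]
    if not ranked:
        return 0.0
    weights = [1.0 / (d + 1e-9) for d, _ in ranked]
    sims = [dice(n[field], candidate[field]) for _, n in ranked]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)
```

A record with a missing surname would be handled by calling `impute_similarity(record, "last", dataset, candidate)`, where `dataset` holds the encoded records from the same source as `record` and `candidate` is an encoded record from the other dataset. Only Bloom filters are compared, never the plaintext values.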

Citation

Chi, Y., Hong, J., Jurek, A., Liu, W., & O'Reilly, D. (2017). Privacy preserving record linkage in the presence of missing values. Information Systems, 71, 199-210. https://doi.org/10.1016/j.is.2017.07.001

Journal Article Type: Article
Acceptance Date: Jul 5, 2017
Online Publication Date: Jul 5, 2017
Publication Date: Nov 1, 2017
Deposit Date: Jul 10, 2017
Publicly Available Date: Jul 5, 2018
Journal: Information Systems
Print ISSN: 0306-4379
Publisher: Elsevier
Peer Reviewed: Peer Reviewed
Volume: 71
Pages: 199-210
DOI: https://doi.org/10.1016/j.is.2017.07.001
Keywords: record linkage, probabilistic record linkage, privacy preserving record linkage, missing values, k-nearest neighbours, data encryption
Public URL: https://uwe-repository.worktribe.com/output/879132
Publisher URL: https://doi.org/10.1016/j.is.2017.07.001
Related Public URLs: http://www.sciencedirect.com/science/article/pii/S030643791630504X
