Confidentiality and linked data
Ritchie, Felix; Smith, Jim
Authors
Felix Ritchie Felix.Ritchie@uwe.ac.uk
Professor in Economics
Jim Smith James.Smith@uwe.ac.uk
Professor in Interactive Artificial Intelligence
Contributors
Gentiana Roarson
Editor
Abstract
This chapter considers the confidentiality issues around linked data. It notes that the use and availability of secondary (administrative or social media) data, allied to powerful processing and machine learning techniques, means that in theory re-identification of confidential source data is likely in all types of releases.
In practice, there are barriers. Data linking is a complex and difficult process, and there are many things that could go wrong. However, this is less of a problem for a potential intruder, who is not concerned about re-identifying all data, but just enough to achieve his or her ends; the accuracy of the re-identification may not even be important if there is just a perception of poor confidentiality protection.
More importantly, focusing on the data-centred models misleads us into thinking "what can go wrong?" instead of "what will go wrong?". Aggregate statistics can be attacked to show hidden numbers of observations, but this does not necessarily disclose confidential information; aggregate statistics that could re-identify source data are not typically useful statistics. For the release of microdata, a user-centred perspective allows one to consider a range of non-statistical solutions which are both robust and fairly future-proof.
In summary, linked data does present a strong theoretical challenge to the protection of data, as statistical protection is outgunned by technology and software; but in practice a shift in focus to the evidence-based user-centred view shows that there are many directions for practical data protection to go.
Citation
Ritchie, F., & Smith, J. (2018). Confidentiality and linked data. In G. Roarson (Ed.), Privacy and Data Confidentiality Methods – a National Statistician’s Quality Review (pp. 1-34). Newport: Office for National Statistics.
| Field | Value |
|---|---|
| Online Publication Date | Dec 11, 2018 |
| Deposit Date | Jan 22, 2019 |
| Peer Reviewed | Peer Reviewed |
| Pages | 1-34 |
| Series Title | National Statistician's Quality Review |
| Book Title | Privacy and Data Confidentiality Methods – a National Statistician’s Quality Review |
| Keywords | confidentiality, privacy, linked data, artificial intelligence, data mining, machine learning, data-centred, user-centred |
| Public URL | https://uwe-repository.worktribe.com/output/856040 |
| Publisher URL | https://gss.civilservice.gov.uk/guidances/quality/nsqr/privacy-and-data-confidentiality-methods-a-national-statisticians-quality-review/ |