Confidentiality and linked data

Ritchie, Felix; Smith, Jim

Authors

Felix Ritchie Felix.Ritchie@uwe.ac.uk
Professor in Economics

Jim Smith James.Smith@uwe.ac.uk
Professor in Interactive Artificial Intelligence

Contributors

Gentiana Roarson
Editor

Abstract

This chapter considers the confidentiality issues around linked data. It notes that the use and availability of secondary (administrative or social media) data, allied to powerful processing and machine learning techniques, mean that in theory re-identification of confidential source data is likely in all types of releases.
In practice, there are barriers. Data linking is a complex and difficult process, and there are many things that could go wrong. However, this is less of a problem for a potential intruder, who is not concerned about re-identifying all data, but just enough to achieve his or her ends; the accuracy of the re-identification may not even be important if there is just a perception of poor confidentiality protection.
More importantly, focusing on data-centred models misleads us into asking "what can go wrong?" instead of "what will go wrong?". Aggregate statistics can be attacked to show hidden numbers of observations, but this does not necessarily disclose confidential information; aggregate statistics that could re-identify source data are not typically useful statistics. For the release of microdata, a user-centred perspective allows one to consider a range of non-statistical solutions which are both robust and fairly future-proof.
In summary, linked data does present a strong theoretical challenge to the protection of data, as statistical protection is outgunned by technology and software; but in practice a shift in focus to the evidence-based user-centred view shows that there are many directions for practical data protection to go.

Online Publication Date	Dec 11, 2018
Deposit Date	Jan 22, 2019
Peer Reviewed	Peer Reviewed
Pages	1-34
Series Title	National Statistician's Quality Review
Book Title	Privacy and Data Confidentiality Methods – a National Statistician’s Quality Review
Keywords	confidentiality, privacy, linked data, artificial intelligence, data mining, machine learning, data-centred, user-centred
Public URL	https://uwe-repository.worktribe.com/output/856040
Publisher URL	https://gss.civilservice.gov.uk/guidances/quality/nsqr/privacy-and-data-confidentiality-methods-a-national-statisticians-quality-review/
Contract Date	Jan 22, 2019

