10 is the safest number that there's ever been
Authors
Felix Ritchie (Felix.Ritchie@uwe.ac.uk)
Professor in Economics
Abstract
When checking frequency and magnitude tables for disclosure risk, the cell threshold (the minimum number of observations in each cell) is a crucial parameter. In rules-based environments, it is a hard limit on what can or cannot be published. In principles-based environments, it is less important but still affects the operational effectiveness of statistical disclosure control (SDC) processes. Determining the appropriate threshold is an unsolved problem. Ten is a common threshold value for both national statistics and research outputs, but five and twenty are also popular. Some organisations use multiple thresholds for different data sources. These higher thresholds are all entirely subjective. Three is the only threshold with an objective statistical foundation, but most organisations argue that it leaves little margin for error. Unfortunately, there is no equivalent statistical case for any number larger than three: ten is popular because it is popular. This is particularly the case for research environments, where there is no guidance. This paper provides the first empirical foundation for threshold selection by modelling alternative threshold values on both synthetic data and real datasets. The paper demonstrates that this is a complex question. The trade-off between risk and value is well known, but we demonstrate that the protection offered by a higher threshold depends on the risk measure used. There is no monotonic relation between threshold and risk, as higher thresholds can increase disclosure risk in particular scenarios. The blind application of high-threshold rules might mask new risks. There is no unambiguous result, other than the simplistic ones that more observations reduce risk and higher thresholds reduce utility. Finally, the paper notes that, for some risk scenarios, a reconsideration of disclosure checking practices can reduce risk irrespective of the threshold.
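As a rough illustration of what a cell threshold does in practice, the sketch below counts observations per cell in a frequency table and flags the cells that fall below a given threshold. It is not the paper's method: the function name `flag_low_cells`, the toy records, and the choice to treat "below threshold" as a strict inequality are all assumptions made for this example.

```python
from collections import Counter


def flag_low_cells(records, threshold):
    """Return cells whose observation count is below the threshold.

    `records` holds one hashable cell label per observation. Cells with no
    observations at all never appear in the result.
    """
    counts = Counter(records)
    return {cell: n for cell, n in counts.items() if n < threshold}


# Hypothetical microdata: one (region, sector) pair per observation.
records = (
    [("North", "Retail")] * 12
    + [("North", "Mining")] * 4
    + [("South", "Retail")] * 9
    + [("South", "Mining")] * 3
)

# Compare the thresholds named in the abstract on the same table.
for threshold in (3, 5, 10):
    flagged = flag_low_cells(records, threshold)
    print(f"threshold={threshold}: {len(flagged)} cell(s) flagged", sorted(flagged))
```

On this toy table, raising the threshold flags more cells for suppression, which is the loss of utility the abstract refers to. The sketch says nothing about how disclosure risk itself changes with the threshold.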
Journal Article Type | Article |
---|---|
Acceptance Date | Aug 1, 2022 |
Online Publication Date | Aug 31, 2022 |
Publication Date | Aug 31, 2022 |
Deposit Date | Aug 11, 2022 |
Publicly Available Date | Oct 1, 2022 |
Journal | Transactions on Data Privacy |
Print ISSN | 1888-5063 |
Peer Reviewed | Peer Reviewed |
Volume | 15 |
Issue | 2 |
Pages | 109-140 |
Series ISSN | 1888-5063 |
Keywords | privacy, confidentiality, data governance, statistical disclosure control |
Public URL | https://uwe-repository.worktribe.com/output/9853172 |
Publisher URL | http://www.tdp.cat/issues21/abs.a445a21.php |
Files
10 is the safest number that there's ever been
(680 Kb)
PDF
Licence
http://www.rioxx.net/licenses/all-rights-reserved
Publisher Licence URL
http://www.rioxx.net/licenses/all-rights-reserved
Copyright Statement
This is the author’s accepted manuscript. The final published version is available here: http://www.tdp.cat/issues21/abs.a445a21.php
The author(s) retain any copyright on the submitted material. The contributors grant the journal the right to publish, distribute, index, archive and publicly display the article (and the abstract) in printed, electronic or any other form.