Analyzing the disclosure risk of regression coefficients
Ritchie, Felix
Authors
Felix Ritchie Felix.Ritchie@uwe.ac.uk
Professor in Economics
Abstract
A major growth area in social science research this century has been access to highly sensitive confidential microdata, often via restricted-access remote facilities. These allow researchers largely unrestricted freedom to manipulate the data, but with checks for disclosure risk before the statistical results can be published. Effective output-based statistical disclosure control (OSDC) is therefore central to effective use of confidential microdata for research.
Multiple regression is a key analytical tool for researchers, and so knowing whether multiple regression results are ‘safe’ for release is essential for research facilities. This is a relatively unexplored field; guidelines used by almost all restricted-access facilities reference an informal document from 2006, but more recent work suggests that problems may exist.
This paper demonstrates that linear regression coefficients show no substantive disclosure risks in realistic environments, and so should be considered as ‘safe statistics’ in the terminology of this field. Conflicting results in the literature reflect institutional perceptions rather than statistical differences, the confusion of statistical quality with disclosure risk, or the failure to identify the source of risk. The result has important implications for those responsible for providing research access to sensitive data.
The paper explores this result on simple linear regression models; more complex models are shown to be ‘safer’ subsets. Non-linear models pose slightly different problems, but this paper indicates a way such models may be tackled.
Citation
Ritchie, F. (2019). Analyzing the disclosure risk of regression coefficients. Transactions on Data Privacy, 12(2), 145-173.
Journal Article Type | Article |
---|---|
Acceptance Date | Apr 30, 2019 |
Publication Date | Aug 1, 2019 |
Deposit Date | Jul 11, 2019 |
Publicly Available Date | Jul 11, 2019 |
Journal | Transactions on Data Privacy |
Print ISSN | 1888-5063 |
Peer Reviewed | Peer Reviewed |
Volume | 12 |
Issue | 2 |
Pages | 145-173 |
Keywords | privacy, confidentiality, statistical disclosure control, output SDC, principles-based, linear regression, safe statistics |
Public URL | https://uwe-repository.worktribe.com/output/1493597 |
Publisher URL | http://www.tdp.cat/issues16/abs.a303a18.php |
Additional Information | This is the accepted manuscript of an item currently in press and due to be published in Transactions on Data Privacy. |
Files
sdc for linear regression coefficients for TDP rev 2 v01 notracks.docx
(69 Kb)
Document
You might also like
Spontaneous recognition: An unnecessary control on data access?
(2017)
Book Chapter
Open data: Who needs it?
(2017)
Presentation / Conference
Lessons learned in training ‘safe users’ of confidential data
(2017)
Presentation / Conference
The "Five Safes": A framework for planning, designing and evaluating data access solutions
(2017)
Presentation / Conference
Spontaneous recognition: An unnecessary control on data access?
(2017)
Journal Article