Analyzing the disclosure risk of regression coefficients

Authors

Ritchie, Felix

Abstract

A major growth area in social science research this century has been access to highly sensitive confidential microdata, often via restricted-access remote facilities. These facilities give researchers largely unrestricted freedom to manipulate the data, but check statistical results for disclosure risk before they can be published. Effective output-based statistical disclosure control (OSDC) is therefore central to effective use of confidential microdata for research.
Multiple regression is a key analytical tool for researchers, and so knowing whether multiple regression results are ‘safe’ for release is essential for research facilities. This is a relatively unexplored field; guidelines used by almost all restricted-access facilities reference an informal document from 2006, but more recent work suggests that problems may exist.
This paper demonstrates that linear regression coefficients show no substantive disclosure risks in realistic environments, and so should be considered as ‘safe statistics’ in the terminology of this field. Conflicting results in the literature reflect institutional perceptions rather than statistical differences, the confusion of statistical quality with disclosure risk, or the failure to identify the source of risk. The result has important implications for those responsible for providing research access to sensitive data.
The paper explores this result on simple linear regression models; more complex models are shown to be ‘safer’ subsets. Non-linear models pose slightly different problems, but this paper indicates a way such models may be tackled.

Citation

Ritchie, F. (2019). Analyzing the disclosure risk of regression coefficients. Transactions on Data Privacy, 12(2), 145-173.

Journal Article Type Article
Acceptance Date Apr 30, 2019
Publication Date 2019-08
Journal Transactions on Data Privacy
Print ISSN 1888-5063
Peer Reviewed Peer Reviewed
Volume 12
Issue 2
Pages 145-173
Keywords privacy, confidentiality, statistical disclosure control, output SDC, principles-based, linear regression, safe statistics
Public URL https://uwe-repository.worktribe.com/output/1493597
Publisher URL http://www.tdp.cat/issues16/tdp.a303a18.pdf
Additional Information This is the accepted manuscript of an item currently in press and due to be published in Transactions on Data Privacy.
