Social scientists increasingly expect to have access to detailed source microdata for research purposes. As the level of detail increases, data owners worry about ‘spontaneous recognition’, the likelihood that a microdata user believes that he or she has accidentally identified one of the data subjects in the dataset, and may share that information. This concern, particularly in respect of microdata on businesses, leads to excessive restrictions on data use.
We argue that spontaneous recognition presents no meaningful risk to confidentiality. The standard ‘intruder’ model covers re-identification risk to an acceptable standard under most current legislation. If a spontaneous re-identification did occur, the user would be very unlikely to be in breach of any law or condition of access. A breach would occur only as a result of further actions by the user to confirm or assert identity, and these should be seen as a managerial problem.
Nevertheless, a consideration of spontaneous recognition does highlight some of the implicit assumptions made in data access decisions. It also shows the importance of the data owner’s institutional culture: for a default-open data owner, spontaneous recognition is a useful check on whether all relevant risks have been addressed, but for a default-closed data owner spontaneous recognition provides a way to place insurmountable barriers in front of those wanting to increase data access.
This is a shorter version of the paper published in ECB Statistical Paper No. 24.
Ritchie, F. (2017). Spontaneous recognition: An unnecessary control on data access? In G. Sandor, Z. Vereczkei, J. Poggi, P. Nymand-Andersen, A. Manninen, M. Karlberg, …C. Boldsen (Eds.), Selected papers from the 2016 Conference of European Statistics Stakeholders, 148-158. Publications Office of the European Union. https://doi.org/10.2785/091435