Jim Smith James.Smith@uwe.ac.uk
Professor in Interactive Artificial Intelligence
A genetic approach to statistical disclosure control
Smith, Jim; Clark, Alistair; Staggemeier, Andrea T.; Serpell, Martin
Authors
Alistair Clark
Andrea T. Staggemeier
Martin Serpell Martin2.Serpell@uwe.ac.uk
Senior Lecturer in Computer Systems and Networks
Abstract
Statistical disclosure control is the collective name for a range of tools used by data providers such as government departments to protect the confidentiality of individuals or organizations. When the published tables contain magnitude data such as turnover or health statistics, the preferred method is to suppress the values of certain cells. Assigning a cost to the information lost by suppressing any given cell creates the cell suppression problem. This consists of finding the minimum cost solution which meets the confidentiality constraints. Solving this problem simultaneously for all of the sensitive cells in a table is NP-hard and infeasible for medium-sized to large tables. In this paper, we describe the development of a heuristic tool for this problem which hybridizes linear programming (to solve a relaxed version for a single sensitive cell) with a genetic algorithm (to seek an order for considering the sensitive cells which minimizes the final cost). Considering a range of real-world and representative artificial datasets, we show that the method is able to provide relatively low cost solutions for far larger tables than the optimal approach can tackle. We show that our genetic approach is able to significantly improve on the initial solutions provided by existing heuristics for cell ordering, and outperforms local search. This approach is then extended and applied to large statistical tables with over 200,000 cells. © 2012 IEEE.
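The ordering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The per-cell linear program is replaced by a hypothetical greedy stand-in, `suppression_cost`, which lets each sensitive cell pick whichever of its candidate complementary-suppression sets adds the least new cost. The genetic operators (tournament selection, order crossover, swap mutation, elitism) are common textbook choices assumed for illustration, and the cell names, weights, and candidate sets are invented toy data.

```python
import random

def suppression_cost(order, options, weight):
    """Toy stand-in for the per-cell LP step (an assumption, not the paper's model).
    Cells are handled in the given order; each takes the candidate set that
    adds the least cost on top of what is already suppressed."""
    suppressed = set()
    for cell in order:
        best = min(options[cell],
                   key=lambda s: sum(weight[c] for c in s - suppressed))
        suppressed |= best
    return sum(weight[c] for c in suppressed)

def order_crossover(p1, p2, rng):
    """Order crossover (OX): copy a slice from p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = iter(g for g in p2 if g not in kept)
    for i in range(n):
        if child[i] is None:
            child[i] = next(fill)
    return child

def evolve_order(initial, cost, generations=50, pop_size=20, seed=0):
    """Search over permutations of the sensitive cells for a low-cost order."""
    rng = random.Random(seed)
    # Seed the population with the heuristic order plus random shuffles.
    pop = [list(initial)] + [rng.sample(initial, len(initial))
                             for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=cost)
        nxt = pop[:2]  # elitism: the two best orders survive unchanged
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 3), key=cost)  # tournament of size 3
            p2 = min(rng.sample(pop, 3), key=cost)
            child = order_crossover(p1, p2, rng)
            if rng.random() < 0.3:  # swap mutation
                i, j = rng.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            nxt.append(child)
        pop = nxt
    return min(pop, key=cost)

# Hypothetical toy instance: three sensitive cells, five candidate cells.
weight = {"a": 5, "b": 1, "c": 4, "d": 2, "e": 3}
options = {
    "s1": [frozenset("ab"), frozenset("c")],
    "s2": [frozenset("bd"), frozenset("e")],
    "s3": [frozenset("ce"), frozenset("ad")],
}
initial = ["s1", "s2", "s3"]
cost = lambda order: suppression_cost(order, options, weight)
best = evolve_order(initial, cost)
print(best, cost(best))
```

Because the initial order is placed in the population and the best individuals are carried forward, the returned order never costs more than the starting heuristic order. In a fuller version, `suppression_cost` would be replaced by a real linear-programming solve for each sensitive cell.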
Field | Value |
---|---|
Journal Article Type | Article |
Publication Date | Jan 1, 2012 |
Deposit Date | Feb 19, 2013 |
Publicly Available Date | Apr 4, 2016 |
Journal | IEEE Transactions on Evolutionary Computation |
Print ISSN | 1089-778X |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 16 |
Issue | 3 |
Pages | 431-441 |
DOI | https://doi.org/10.1109/TEVC.2011.2159271 |
Keywords | algorithm design and analysis, analysis of variance, equations, genetic algorithms, genetics, linear programming, perturbation methods, statistical disclosure control |
Public URL | https://uwe-repository.worktribe.com/output/965842 |
Publisher URL | http://dx.doi.org/10.1109/TEVC.2011.2159271 |
Additional Information | This article is copyright 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. |
Contract Date | Apr 4, 2016 |
Files
Download.pdf
(3.2 Mb)
PDF