Michael Carlson: Statistical disclosure control - an introduction with some examples and some technical aspects concerning risk assessment
Time: Wed 2016-05-18, 13.00 - 14.00
Location: Room B705, Department of Statistics, SU
Speaker: Michael Carlson, Department of Statistics, Stockholm University
Abstract:
For official statistics, protecting confidential information about individual respondents in a statistical survey or database is a legal requirement in most countries, but for obvious reasons it should also be viewed as an ethical requirement for all types of statistical activity. From a business perspective, it is also a matter of self-preservation: a statistical institute that fails to safeguard information given in confidence will soon lose public trust and its main source of information, i.e. the respondents to its surveys. The aim of statistical disclosure control (SDC) is to protect confidential data by distorting the original data, thereby minimizing the risk of disclosure, while preserving as much as possible of the statistical properties of the original unprotected data, i.e. maximizing the usefulness (utility) of the data.
During the first part of the seminar we will briefly review the legal framework in Sweden, present some of the basic concepts and problems of SDC, and illustrate these with some simple examples. This introduction will be relatively non-technical and should appeal to bachelor's and master's level students. The second part of the seminar will be slightly more technical, and we will formalize some concepts relating to risk assessment. Statisticians typically assess disclosure risks with a probabilistic approach, whereas the data science community has specified risk criteria in a slightly different manner. Furthermore, recent research has expanded SDC into information theory, suggesting entropy measures of both risk and utility. So-called diversity indices, familiar to biologists, can be viewed as yet another way of measuring risks. These seemingly different approaches are just different perspectives on the same thing, and we will demonstrate how they connect to each other and point to some ideas for future investigation.
