Statistical confidentiality is an essential principle
of modern demographic and economic data-gathering and related statistical
research. Today, the term "statistical confidentiality" serves as
shorthand for a bundle of now widely recognized activities. These
may include
legal protections, standards of professional conduct, a set of non-disclosure
assurances provided to respondents, and compilation and dissemination
practices designed to protect data providers from improper use of their
answers.
It wasn't always this way. As I and others have discussed, the practice
of statistical confidentiality has a history. See, for example, the website
of documents and papers that William Seltzer and I have prepared as
a resource: http://www.uwm.edu/~margo/govstat/integrity.htm.
Generations of statisticians and social scientists defined the
principles and practices we now take as givens as they built the data
production infrastructure of official statistics and social science,
primarily in the public sector. That infrastructure evolved in the
nineteenth and twentieth centuries, long before the computer revolution
and the internet.
A great deal of work has been done in recent years to develop technical
methods for protecting confidentiality and to set confidentiality standards
for data collection, preservation, and management. Much less
systematic work exists on the breaches, failures, and weaknesses of
such protections, beyond the recounting of scandals and controversies
when confidentiality is breached. I have called for more systematic
work on the history of confidentiality practice and of the ethical
issues involved in protecting data confidentiality. That history reveals how and why
current practices came to be developed and also where systematic weaknesses
exist. On the simplest level, such work involves examination
of the intentions and capacities of intruders, most notably those who
have the legal, political, or economic power to breach confidentiality
successfully.
In other words, in addition to intruders whom one can characterize as
data criminals (identity thieves, scandal-mongering journalists), there
have been in the past and will be in the future intruders who have legitimate
competing claims to access information, claims which nevertheless shatter the
commitments of data confidentiality. Included here are national security
claims and subpoenas of data for civil and criminal prosecutions. This
work involves historical and archival analysis of past controversies
and breaches in order to understand the technical, legal, and political
factors involved. See, for example, Margo Anderson and William Seltzer
(2007). "Challenges to the Confidentiality of U.S. Federal Statistics,
1910-1965." Journal of Official Statistics 23(1): 1-34, available at
http://www.uwm.edu/~margo/govstat/integrity.htm.