Data Confidentiality Workshop

WORKSHOP ON DATA CONFIDENTIALITY

September 6-7, 2007 in Arlington, VA

White Paper & Bio


First a rant:

A supermarket where I regularly shop recently added beer and wine to its selection. When I went to purchase some beer, I was asked to present my license as proof of age. The clerk took the identification and entered my birth date into the register. This is one of the worst cases of "collect more data" I have ever encountered: data entry was added to a routine social interaction without any request for consent. In a subsequent transaction, I asked that my birth date not be entered and, as I suspected, the transaction could not be completed without it. The best I could achieve was for the clerk to volunteer to enter false data, which she did with some zeal.

That experience hit a number of interests:

Fair information practices
Social interactions involving privacy
Proliferation of data

And primed the ground for a few more:

Techniques that compensate for false data
Checking a certain company’s responsiveness to privacy complaints

To continue the saga, the customer service department assured me that the data is entered but not captured (as evidenced by the need to re-enter it at each purchase). Should I trust them? More importantly, why should I have to trust them? What happens if the policy changes? I wouldn’t be able to tell the difference. If the ownership changes, will the new owners honor the same policy? Is there any benefit to data entry? If it is meant as a guarantee of the social interaction (she really did check my age!), does it really help? … It seems more likely that the problem of not really looking at the ID isn’t fixed in the long term (eventually you don’t really look, just enter the data). In short, the intent seems to have been acceptable but probably flawed, and the application was seriously flawed.

Professional interests:

Since a good chunk of my job goes toward making the Census Bureau’s microdata publications safe, the proliferation of data is a major concern. My current research project is secure regression analysis. Can statistical modeling be done without examination of record-level data, without running into differencing problems, without disclosing low-level counts, and with synthetic diagnostic information (e.g., residuals)? This is partly software development, partly understanding the ground of statistical perception (what really constitutes “playing with the data”?), and partly a good set of data protection rules.
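To make the question concrete, here is a minimal sketch of what one piece of such a system might look like. It is an illustration only, not the Census Bureau's software: the function name, the record threshold, and the choice to simulate residuals from a normal distribution are all assumptions made for the example. It fits a one-predictor regression, refuses to answer when too few records are involved, and returns only aggregate estimates plus synthetic residuals, never the record-level values.

```python
import random
import statistics

# Hypothetical release threshold; a real rule set would be chosen by policy.
MIN_RECORDS = 10


def fit_simple_regression(x, y, min_records=MIN_RECORDS, seed=0):
    """Fit y = intercept + slope * x and release only aggregate output.

    The caller never receives the true residuals. Synthetic residuals are
    drawn from a normal distribution with the estimated residual standard
    deviation (an assumption of this sketch, not an established method).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    # Refuse to release a model built on too few records
    # (at least 3 are needed to estimate the residual variance).
    if n < max(min_records, 3):
        raise ValueError("too few records to release a model")

    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # True residuals stay inside the function.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    residual_sd = (sum(r * r for r in residuals) / (n - 2)) ** 0.5

    # Synthetic diagnostics: same count and spread, no link to any record.
    rng = random.Random(seed)
    synthetic = [rng.gauss(0.0, residual_sd) for _ in range(n)]

    return {
        "n": n,
        "intercept": intercept,
        "slope": slope,
        "residual_sd": residual_sd,
        "synthetic_residuals": synthetic,
    }
```

This sketch does nothing about differencing (comparing two fits that differ by one record), which would need query tracking or a similar safeguard on top of the threshold.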

Philip Steel
Disclosure Avoidance Group
Statistical Research Division
Census Bureau

Philip Steel

U.S. Census Bureau

Biographical Data

Philip Steel serves on the Census Bureau’s Disclosure Review Board and is a member of the Disclosure Avoidance Group in the Bureau’s Statistical Research Division. He is the current chair of the Confidentiality and Data Access Committee, an interest group of the Federal Committee on Statistical Methodology. He maintains an active interest in the application of the Health Insurance Portability & Accountability Act's provisions on confidentiality, as well as its interaction with the Common Rule. His background is in probability theory and variance estimation. His research interests include disclosure risk measures, record linkage, and table enumeration, and he is currently involved in designing software for secure regression analysis.