Data Confidentiality Workshop

WORKSHOP ON DATA CONFIDENTIALITY

September 6-7, 2007 in Arlington, VA

White Paper & Bio


Large, multi-institute, scientific collaborations typically publish large data sets consisting of terabytes of information; they also replicate these data sets for reasons of performance and fault tolerance and to make them available to scientists at different sites. A number of tools are currently used in scientific Grid environments that provide data replication capabilities, efficient data transfer, and catalog support for registration and discovery of data sets. However, most of these tools provide limited support for protecting replicated data from the risks of tampering or unauthorized access.

To address the risks that scientific applications face when sharing their data sets, we are investigating ways to enhance existing data management tools so that they protect users’ data. These enhancements would make scientific applications less vulnerable to data security threats that could jeopardize multi-institute collaborations and the integrity of scientific results.

These improvements would also satisfy the needs of additional application domains, such as medical applications that require more stringent security and protection. In the medical domain, we are particularly interested in using Grid tools to store and retrieve radiology images and other medical data sets associated with a patient. Protecting the confidentiality of this patient information is a key requirement.

In addition, a health Grid of this type could be used by researchers conducting clinical trials. These researchers could query the Grid to obtain images with a specified set of characteristics relevant to the research study. Such studies would require that patient data are sufficiently anonymized so that researchers can examine collections of images without danger of discovering patient identities.

Dr. Ann Chervenak

USC

Biographical Data

Dr. Ann Chervenak is a Research Assistant Professor in the University of Southern California Computer Science Department and a Project Leader in the Center for Grid Technologies at USC Information Sciences Institute. Dr. Chervenak leads research and development efforts related to management of petascale data sets for scientific grid computing environments. Her current research interests include replica management, data placement and data confidentiality. Previously, Dr. Chervenak was an Assistant Professor in the College of Computing at the Georgia Institute of Technology. Dr. Chervenak received her Ph.D. from the University of California at Berkeley, where her research included work on RAID (Redundant Array of Independent Disks) and efficient management of tertiary storage systems.