Loukides, Grigorios ORCID: https://orcid.org/0000-0003-0888-5061, Gkoulalas-Divanis, Aris and Malin, Bradley 2010. Anonymization of electronic medical records for validating genome-wide association studies. Proceedings of the National Academy of Sciences 107 (17), pp. 7898-7903. 10.1073/pnas.0911686107
Abstract
Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them so that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks.
Item Type: | Article |
---|---|
Date Type: | Publication |
Status: | Published |
Schools: | Computer Science & Informatics |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Publisher: | National Academy of Sciences |
ISSN: | 0027-8424 |
Last Modified: | 18 Oct 2022 13:29 |
URI: | https://orca.cardiff.ac.uk/id/eprint/14103 |
Citation Data
Cited 93 times in Scopus.