Re-identification by phenotypic prediction aims to determine whether a genome belongs to a specific individual by comparing the individual's known traits with those predicted from the genome. This type of tracing attack is widely discussed in the genomic privacy literature, yet previous studies have been criticized for overstating its practical risks. Over the past decade, genome-wide association studies (GWAS) with increasing sample sizes have improved the accuracy of phenotypic prediction, potentially enhancing such attacks. To quantify their real-world threat, we developed a probabilistic framework that estimates the likelihood of a match between an individual's observed traits and polygenic scores (PGS) derived from a genome, while accounting for prediction accuracy and for genetic and environmental correlations between the traits. We benchmarked this re-identification method and examined how the prior probability (the a priori chance that a random genome and a random set of traits correspond to the same person) affects performance. Finally, we assessed whether sensitive information could be inferred through this attack by attempting to predict multiple sensitive haplotypes, such as APOE-ε4 (linked with Alzheimer's disease). Our re-identification method outperformed a state-of-the-art tool and reached a precision above 99% at a recall of 40% when considering a prior of 50%. However, after considering real-world settings, we estimated that realistic priors would not exceed 4 × 10⁻⁴%, resulting in a precision lower than 0.13% at the same recall (40%). The inference of sensitive genotypes also proved ineffective, as achieving a precision above 50% for identifying APOE-ε4 carriers was only possible at a recall below 20%. To conclude, although re-identification by phenotypic prediction is technically feasible, our findings indicate that its effectiveness in real-world conditions is limited. 
These results provide a counterpoint to earlier claims of severe genomic privacy risks and offer guidance for policymakers, biobank administrators, and research participants.
Cavinato, T., Hofmeister, R. J., Kutalik, Z.
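The dependence of precision on the prior reported in the abstract can be illustrated with a short Bayes' rule calculation. The sketch below is not the authors' framework: it assumes that the classifier's operating point (recall and false positive rate) stays fixed when the prior changes, and it takes the benchmark precision to be exactly 99%, whereas the abstract only states "above 99%". The function names and constants are hypothetical, chosen for illustration.

```python
def false_positive_rate(precision: float, recall: float, prior: float) -> float:
    """False positive rate implied by a (precision, recall) pair at a given prior.

    Derived by solving precision = prior*recall / (prior*recall + (1-prior)*fpr)
    for fpr.
    """
    return prior * recall * (1.0 - precision) / (precision * (1.0 - prior))


def precision_at_prior(recall: float, fpr: float, prior: float) -> float:
    """Precision (positive predictive value) from Bayes' rule."""
    true_pos = prior * recall          # P(match and flagged)
    false_pos = (1.0 - prior) * fpr    # P(non-match and flagged)
    return true_pos / (true_pos + false_pos)


# Figures taken from the abstract, expressed as fractions.
RECALL = 0.40
BENCHMARK_PRIOR = 0.50
REALISTIC_PRIOR = 4e-6  # 4 x 10^-4 percent

# Assumption: precision is exactly 0.99 at the benchmark prior.
BENCHMARK_PRECISION = 0.99

fpr = false_positive_rate(BENCHMARK_PRECISION, RECALL, BENCHMARK_PRIOR)
realistic_precision = precision_at_prior(RECALL, fpr, REALISTIC_PRIOR)

print(f"implied false positive rate: {fpr:.5f}")
print(f"precision at realistic prior: {100 * realistic_precision:.4f}%")
```

Under these assumptions, the precision at the realistic prior falls well below 1%, in line with the abstract's reported bound of less than 0.13%. The exact value depends on how far above 99% the benchmark precision actually is.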
