
Privacy-Utility Equilibrium Protocol for Federated Aggregating Multiparty Genome Data

Hai Liu1,2,3, Changgen Peng1,2,3,*, Youliang Tian2,3, Feng Tian4, and Zhenqiang Wu4

Corresponding Author:

Changgen Peng

Affiliation(s):

1 Guizhou Big Data Academy, Guizhou University, Guiyang 550025, China

2 College of Computer Science and Technology, Guizhou University, Guiyang 550025, China

3 State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China

4 School of Computer Science, Shaanxi Normal University, Xi’an 710119, China

*Corresponding author

Abstract:

A cloud server aggregates large amounts of genome data from multiple genome donors to facilitate scientific research. However, an untrusted cloud server may violate the privacy of the aggregated genome data. Each genome donor can therefore randomly perturb her genome data with a differential privacy mechanism before aggregation. This approach easily leads to two problems: a utility disaster in the aggregated genome data, caused by the donors' differing privacy preferences, and privacy leakage from the aggregated genome data, caused by kinship between donors. The key challenge is to achieve an equilibrium between privacy preservation and data utility when aggregating multiparty genome data. To this end, we propose a federated aggregation protocol for multiparty genome data (MGD-FAP) with privacy-utility equilibrium, which guarantees both the desired privacy protection and the desired data utility. First, we take the privacy budget and the accuracy as the desired privacy metric and utility metric of genome data, respectively. Second, we construct a federated aggregation model of multiparty genome data by combining a random perturbation method for genome data, which guarantees the desired data utility, with a federated comparing update method for the local privacy budget, which achieves the desired privacy preservation. Third, we present MGD-FAP, which maintains privacy-utility equilibrium under this federated aggregation model. Finally, our theoretical and experimental analysis shows that MGD-FAP can maintain privacy-utility equilibrium. MGD-FAP is thus practical and feasible for ensuring privacy-utility equilibrium when a cloud server aggregates multiparty genome data.
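The local perturbation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes k-ary randomized response over a three-valued SNP encoding and a single privacy budget shared by all donors; the abstract does not specify the mechanism used by MGD-FAP, so the encoding, the function names (`keep_probability`, `perturb`, `estimate_frequencies`), and the estimator are illustrative assumptions rather than the authors' method.

```python
import math
import random
from collections import Counter

# Assumption: each SNP is encoded as its minor-allele count (0, 1, or 2).
GENOTYPES = (0, 1, 2)


def keep_probability(epsilon, k=len(GENOTYPES)):
    """Probability of reporting the true value under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def perturb(genotype, epsilon, rng=random):
    """Donor side: report the true genotype with probability p, otherwise
    report one of the other genotypes chosen uniformly at random."""
    if rng.random() < keep_probability(epsilon):
        return genotype
    return rng.choice([g for g in GENOTYPES if g != genotype])


def estimate_frequencies(reports, epsilon):
    """Server side: unbiased estimate of the true genotype frequencies
    from the perturbed reports (all donors assumed to use the same epsilon)."""
    k = len(GENOTYPES)
    p = keep_probability(epsilon)
    q = (1 - p) / (k - 1)  # probability of reporting one specific wrong value
    n = len(reports)
    counts = Counter(reports)
    return {g: (counts[g] / n - q) / (p - q) for g in GENOTYPES}


if __name__ == "__main__":
    rng = random.Random(0)
    true_data = [rng.choice(GENOTYPES) for _ in range(10_000)]
    for eps in (0.5, 2.0):
        reports = [perturb(g, eps, rng) for g in true_data]
        print(eps, estimate_frequencies(reports, eps))
```

A smaller privacy budget makes each report noisier and the frequency estimate less accurate. This illustrates the privacy-utility tension described above, and why donors who choose very different budgets complicate aggregation.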

Keywords:

Multiparty genome data aggregation, cloud server, federated comparing, strategic game, privacy-utility equilibrium

Cite This Paper:

Hai Liu, Changgen Peng, Youliang Tian, Feng Tian, and Zhenqiang Wu (2021). Privacy-Utility Equilibrium Protocol for Federated Aggregating Multiparty Genome Data. Journal of Networking and Network Applications, Volume 1, Issue 3, pp. 103–111. https://doi.org/10.33969/J-NaNA.2021.010303.
