
Machine Learning with Distributed Processing using Secure Divided Data: Towards Privacy-Preserving Advanced AI Processing in a Super-Smart Society

Hirofumi Miyajima1, Noritaka Shigei2,*, Hiromi Miyajima2, and Norio Shiratori3

Corresponding Author:

Noritaka Shigei

Affiliation(s):

1 Nagasaki University, 1-14 Bunkyomachi, Nagasaki city, Nagasaki 852-8521, Japan

2 Kagoshima University, 1-21-40, Korimoto, Kagoshima, 890-0065, Japan

3 Chuo University, 1-13-27, Kasuga, Bunkyoku, Tokyo, 112-8551, Japan

*Corresponding author


Abstract:

Towards the realization of a super-smart society, AI analysis methods that preserve the privacy of big data in cyberspace are being developed. From the viewpoint of developing machine learning as a secure and safe AI analysis method for users, many studies have been conducted in this field on 1) secure multiparty computation (SMC), 2) quasi-homomorphic encryption, and 3) federated learning, among other techniques. Previous studies have shown that both security and utility are essential for machine learning using confidential data. However, there is a trade-off between these two properties, and there are no known methods that satisfy both simultaneously at a high level.

In this paper, we propose a learning method based on distributed processing using simple, secure, divided data and parameters that performs well in terms of both data privacy preservation and utility. In this method, individual data and parameters are divided into multiple pieces using random numbers in advance, and each piece is stored on a separate server. Learning in the proposed method is achieved by using these data and parameters in their divided form and by repeating partial computations on each server and integrated computations at the central server. The advantages of the proposed method are the preservation of data privacy, since the data and parameters are never restored during learning; improved usability, since it realizes machine learning based on distributed processing, as federated learning does; and almost no degradation in accuracy compared to conventional methods. Based on the proposed method, we present backpropagation and neural gas (NG) algorithms as examples of supervised and unsupervised machine learning applications, respectively. Our numerical simulations show that these algorithms achieve accuracy comparable to conventional models.
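The core idea described above — dividing data into random additive pieces, computing partially on each server, and integrating only the partial results at a central server — can be illustrated with a minimal additive secret-sharing sketch. This is an illustrative toy, not the authors' exact protocol; the function names and the use of public weights are assumptions made for simplicity (in the proposed method the parameters are divided as well).

```python
import random

def split(value, n_servers, scale=1.0):
    # Divide a scalar into n additive shares using random numbers;
    # the original value is recoverable only as the sum of all shares.
    shares = [random.uniform(-scale, scale) for _ in range(n_servers - 1)]
    shares.append(value - sum(shares))
    return shares

def split_vector(vec, n_servers):
    # Share each component independently; server i holds the
    # i-th share of every component of the input vector.
    per_component = [split(v, n_servers) for v in vec]
    return [[comp[i] for comp in per_component] for i in range(n_servers)]

def partial_weighted_sum(weights, x_share):
    # Local computation on one server: a weighted sum over its shares only.
    # No single server's result reveals the original vector.
    return sum(w * s for w, s in zip(weights, x_share))

# Toy run: three servers, public weights, divided input vector.
x = [0.5, -1.2, 2.0]
w = [0.3, 0.7, -0.1]
server_shares = split_vector(x, 3)
partials = [partial_weighted_sum(w, sh) for sh in server_shares]
integrated = sum(partials)  # the central server sees only partial sums
true_value = sum(wi * xi for wi, xi in zip(w, x))
assert abs(integrated - true_value) < 1e-9
```

Because the weighted sum is linear, the sum of the per-server partial results equals the result on the undivided data, so accuracy is preserved while no server ever holds the original vector.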

Keywords:

Machine learning, secure divided data, distributed processing, federated learning, neural networks, neural gas

Cite This Paper:

Hirofumi Miyajima, Noritaka Shigei, Hiromi Miyajima, and Norio Shiratori (2022). Machine Learning with Distributed Processing using Secure Divided Data: Towards Privacy-Preserving Advanced AI Processing in a Super-Smart Society. Journal of Networking and Network Applications, Volume 2, Issue 1, pp. 48–60. https://doi.org/10.33969/J-NaNA.2022.020105.
