
Soft-Tempering Deep Belief Networks Parameters Through Genetic Programming

Gustavo H. de Rosa*, João P. Papa

Corresponding Author:

Gustavo H. de Rosa

Affiliation(s):

Recogna Laboratory, School of Sciences, Department of Computing, São Paulo State University, Bauru, SP, Brazil
Email: {gustavo.rosa, joao.papa}@unesp.br

Abstract:

Deep neural networks have been widely adopted in recent years, primarily on account of their outstanding performance in tasks such as object, image, face, and speech recognition. However, such complex models usually require large-scale datasets for training; otherwise, they may overfit and fail to achieve consistent results on unseen data. Another problem among deep models concerns their hyperparameter setting, which is application-dependent and may demand an experienced user and considerable effort to calibrate. In this paper, we apply an evolutionary-inspired optimization technique, Genetic Programming, to Deep Belief Network hyperparameter selection, where the terminal nodes encode the hyperparameters of the model and the function nodes combine them through mathematical operators. Experimental results over distinct datasets showed that Genetic Programming can outperform state-of-the-art results obtained through other meta-heuristic techniques, thus standing as an exciting alternative to them.
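The abstract's encoding can be illustrated with a minimal sketch: terminal nodes hold candidate hyperparameter constants, function nodes hold arithmetic operators, and evaluating a tree produces one candidate value (e.g. a DBN learning rate) to be scored by training. All names and constants below are illustrative assumptions, not the authors' actual implementation.

```python
import random
import operator

# Function set (internal nodes) and terminal set (leaves).
# The terminal constants here are arbitrary placeholders.
FUNCS = [operator.add, operator.sub, operator.mul]
TERMINALS = [0.1, 0.01, 0.5, 0.9]

def random_tree(depth):
    """Grow a random expression tree, represented as nested tuples
    (function, left_subtree, right_subtree) with numeric leaves."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    f = random.choice(FUNCS)
    return (f, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(node):
    """Recursively evaluate a tree to a scalar candidate
    hyperparameter value."""
    if isinstance(node, tuple):
        f, left, right = node
        return f(evaluate(left), evaluate(right))
    return node

random.seed(0)
tree = random_tree(depth=3)
candidate = evaluate(tree)  # value that would be fed to DBN training
print(candidate)
```

In a full GP loop, a population of such trees would be evolved with crossover and mutation, using the DBN's reconstruction error or classification accuracy as the fitness of each evaluated candidate.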

Keywords:

Machine Learning, Deep Belief Networks, Optimization, Evolutionary Algorithms, Genetic Programming

Cite This Paper:

G. H. Rosa, J. P. Papa (2019). Soft-Tempering Deep Belief Networks Parameters Through Genetic Programming. Journal of Artificial Intelligence and Systems, 1, 43–59. https://doi.org/10.33969/AIS.2019.11003.
