Research article Special Issues

Bridge the gap between fixed-length and variable-length evolutionary neural architecture search algorithms

  • Received: 18 October 2023 Revised: 26 November 2023 Accepted: 12 December 2023 Published: 22 December 2023
  • Evolutionary neural architecture search (ENAS) aims to automate the architecture design of deep neural networks (DNNs). In recent years, various ENAS algorithms have been proposed, and their effectiveness has been demonstrated. In practice, most ENAS methods based on genetic algorithms (GAs) use fixed-length encoding strategies because the generated chromosomes can be directly processed by the standard genetic operators (especially the crossover operator). However, the performance of existing ENAS methods with fixed-length encoding strategies can also be improved because the optimal depth is regarded as a known priori. Although variable-length encoding strategies may alleviate this issue, the standard genetic operators are replaced by the developed operators. In this paper, we proposed a framework to bridge this gap and to improve the performance of existing ENAS methods based on GAs. First, the fixed-length chromosomes were transformed into variable-length chromosomes with the encoding rules of the original ENAS methods. Second, an encoder was proposed to encode variable-length chromosomes into fixed-length representations that can be efficiently processed by standard genetic operators. Third, a decoder cotrained with the encoder was adopted to decode those processed high-dimensional representations which cannot directly describe architectures into original chromosomal forms. Overall, the performances of existing ENAS methods with fixed-length encoding strategies and variable-length encoding strategies have both improved by the proposed framework, and the effectiveness of the framework was justified through experimental results. Moreover, ablation experiments were performed and the results showed that the proposed framework does not negatively affect the original ENAS methods.

    Citation: Yunhong Gong, Yanan Sun, Dezhong Peng, Xiangru Chen. Bridge the gap between fixed-length and variable-length evolutionary neural architecture search algorithms[J]. Electronic Research Archive, 2024, 32(1): 263-292. doi: 10.3934/era.2024013

    Related Papers:

  • Evolutionary neural architecture search (ENAS) aims to automate the architecture design of deep neural networks (DNNs). In recent years, various ENAS algorithms have been proposed, and their effectiveness has been demonstrated. In practice, most ENAS methods based on genetic algorithms (GAs) use fixed-length encoding strategies because the generated chromosomes can be directly processed by the standard genetic operators (especially the crossover operator). However, the performance of existing ENAS methods with fixed-length encoding strategies can also be improved because the optimal depth is regarded as a known priori. Although variable-length encoding strategies may alleviate this issue, the standard genetic operators are replaced by the developed operators. In this paper, we proposed a framework to bridge this gap and to improve the performance of existing ENAS methods based on GAs. First, the fixed-length chromosomes were transformed into variable-length chromosomes with the encoding rules of the original ENAS methods. Second, an encoder was proposed to encode variable-length chromosomes into fixed-length representations that can be efficiently processed by standard genetic operators. Third, a decoder cotrained with the encoder was adopted to decode those processed high-dimensional representations which cannot directly describe architectures into original chromosomal forms. Overall, the performances of existing ENAS methods with fixed-length encoding strategies and variable-length encoding strategies have both improved by the proposed framework, and the effectiveness of the framework was justified through experimental results. Moreover, ablation experiments were performed and the results showed that the proposed framework does not negatively affect the original ENAS methods.



    加载中


    [1] A. Esteva, K. Chou, S. Yeung, N. Naik, A. Madani, A. Mottaghi, et al., Deep learning-enabled medical computer vision, NPJ Digit. Med., 4 (2021), 1–9. https://doi.org/10.1038/s41746-020-00376-2 doi: 10.1038/s41746-020-00376-2
    [2] A. Bhargava, A. Bansal, Fruits and vegetables quality evaluation using computer vision: A review, J. King Saud Univ. Comput. Inf. Sci., 33 (2021), 243–257. https://doi.org/10.1016/j.jksuci.2018.06.002 doi: 10.1016/j.jksuci.2018.06.002
    [3] Z. Wang, Q. She, T. E. Ward, Generative adversarial networks in computer vision: A survey and taxonomy, ACM Comput. Surv., 54 (2021), 1–38. https://doi.org/10.1145/3439723 doi: 10.1145/3439723
    [4] Y. Gu, R. Tinn, H. Cheng, M. Lucas, N. Usuyama, X. Liu, et al., Domain-specific language model pretraining for biomedical natural language processing, ACM Trans. Comput. Healthcare, 3 (2021), 1–23. https://doi.org/10.1145/3458754 doi: 10.1145/3458754
    [5] D. H. Maulud, S. R. Zeebaree, K. Jacksi, M. A. M. Sadeeq, K. H. Sharif, State of art for semantic analysis of natural language processing, Qubahan Acad. J., 1 (2021), 21–28. https://doi.org/10.48161/qaj.v1n2a40 doi: 10.48161/qaj.v1n2a40
    [6] I. Guellil, H. Saâdane, F. Azouaou, B. Gueni, D. Nouvel, Arabic natural language processing: An overview, J. King Saud Univ. Comput. Inf. Sci., 3 (2021), 497–507. https://doi.org/10.1016/j.jksuci.2019.02.006 doi: 10.1016/j.jksuci.2019.02.006
    [7] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, preprint, arXiv: 1409.1556.
    [8] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (2016), 770–778. https://doi.org/10.1109/cvpr.2016.90
    [9] G. Huang, Z. Liu, L. Van Der Maaten, K. Q. Weinberger, Densely connected convolutional networks, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (2017), 4700–4708. https://doi.org/10.1109/cvpr.2017.243
    [10] J. Bergstra, R. Bardenet, Y. Bengio, B. K{é}gl, Algorithms for hyper-parameter optimization, in Advances in Neural Information Processing Systems, Curran Associates, Inc., 24 (2011), 1–9.
    [11] N. Mitschke, M. Heizmann, K. H. Noffz, R. Wittmann, Gradient based evolution to optimize the structure of convolutional neural networks, in 2018 25th IEEE International Conference on Image Processing, IEEE, (2018), 3438–3442. https://doi.org/10.1109/icip.2018.8451394
    [12] L. Xie, A. Yuille, Genetic cnn, in Proceedings of the IEEE International Conference on Computer Vision, IEEE, (2017), 1379–1388. https://doi.org/10.1109/iccv.2017.154
    [13] Z. Lu, I. Whalen, Y. Dhebar, K. Deb, E. D. Goodman, W. Banzhaf, et al., Multiobjective evolutionary design of deep convolutional neural networks for image classification, IEEE Trans. Evol. Comput., 25 (2020), 277–291. https://doi.org/10.1109/tevc.2020.3024708 doi: 10.1109/tevc.2020.3024708
    [14] X. Xiao, M. Yan, S. Basodi, C. Ji, Y. Pan, Efficient hyperparameter optimization in deep learning using a variable length genetic algorithm, preprint, arXiv: 2006.12703.
    [15] K. Zhou, Y. Dong, K. Wang, W. S. Lee, B. Hooi, H. Xu, et al., Understanding and resolving performance degradation in deep graph convolutional networks, in Proceedings of the 30th ACM International Conference on Information & Knowledge Management, ACM, (2021), 2728–2737. https://doi.org/10.1145/3459637.3482488
    [16] Q. Li, Z. Han, X. M. Wu, Deeper insights into graph convolutional networks for semi-supervised learning, in Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Press, 32 (2018), 1–8. https://doi.org/10.1609/aaai.v32i1.11604
    [17] K. Oono, T. Suzuki, On asymptotic behaviors of graph cnns from dynamical systems perspective, preprint, arXiv: 1905.10947.
    [18] D. Chen, Y. Lin, W. Li, P. Li, J. Zhou, X. Sun, Measuring and relieving the over-smoothing problem for graph neural networks from the topological view, in Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Press, 34 (2020), 3438–3445. https://doi.org/10.1609/aaai.v34i04.5747
    [19] L. Ma, Y. Liu, G. Yu, X. Wang, H. Mo, G. G. Wang, et al., Decomposition-based multiobjective optimization for variable-length mixed-variable pareto optimization and its application in cloud service allocation, IEEE Trans. Syst. Man Cybern.: Syst., 53 (2023), 7138–7151. https://doi.org/10.1109/tsmc.2023.3295371 doi: 10.1109/tsmc.2023.3295371
    [20] L. Muwafaq, N. K. Noordin, M. Othman, A. Ismail, F. Hashim, Cloudlet based computing optimization using variable-length whale optimization and differential evolution, IEEE Access, 11 (2023), 45098–45112. https://doi.org/10.1109/access.2023.3272901 doi: 10.1109/access.2023.3272901
    [21] R. Domala, U. Singh, A survey on state-of-the-art applications of variable length chromosome (vlc) based ga, in Advances in Artificial Intelligence and Data Engineering, Springer, 1133 (2021), 615–630. https://doi.org/10.1007/978-981-15-3514-7_47
    [22] A. Maruyama, N. Shibata, Y. Murata, K. Yasumoto, M. Ito, P-tour: A personal navigation system with travel schedule planning and route guidance based on schedule, IPSJ J., 45 (2004), 2678–2687.
    [23] M. Alajlan, A. Koubaa, I. Chaari, H. Bennaceur, A. Ammar, Global path planning for mobile robots in large-scale grid environments using genetic algorithms, in 2013 International Conference on Individual and Collective Behaviors in Robotics, IEEE, (2013), 1–8. https://doi.org/10.1109/icbr.2013.6729271
    [24] J. J. Lee, D. W. Kim, An effective initialization method for genetic algorithm-based robot path planning using a directed acyclic graph, Inf. Sci., 332 (2016), 1–18. https://doi.org/10.1016/j.ins.2015.11.004 doi: 10.1016/j.ins.2015.11.004
    [25] Z. Qiongbing, D. Lixin, A new crossover mechanism for genetic algorithms with variable-length chromosomes for path optimization problems, Expert Syst. Appl., 60 (2016), 183–189. https://doi.org/10.1016/j.eswa.2016.04.005 doi: 10.1016/j.eswa.2016.04.005
    [26] Y. Sun, B. Xue, M. Zhang, G. G. Yen, Evolving deep convolutional neural networks for image classification, IEEE Trans. Evol. Comput., 24 (2019), 394–407. https://doi.org/10.1109/TEVC.2019.2916183 doi: 10.1109/TEVC.2019.2916183
    [27] M. H. Aliefa, S. Suyanto, Variable-length chromosome for optimizing the structure of recurrent neural network, in 2020 International Conference on Data Science and its Applications, IEEE, (2020), 1–5. https://doi.org/10.1109/icodsa50139.2020.9213012
    [28] A. Rawal, J. Liang, R. Miikkulainen, Discovering gated recurrent neural network architectures, in Deep Neural Evolution, Springer, (2020), 233–251. https://doi.org/10.1007/978-981-15-3685-4_9
    [29] Y. Li, I. King, Autograph: Automated graph neural network, in International Conference on Neural Information Processing, Springer, 12533 (2020), 189–201. https://doi.org/10.1007/978-3-030-63833-7_16
    [30] Y. Gong, Y. Sun, D. Peng, P. Chen, Z. Yan, K. Yang, Analyze covid-19 ct images based on evolutionary algorithm with dynamic searching space, Complex Intell. Syst., 7 (2021), 3195–3209. https://doi.org/10.1007/s40747-021-00513-8 doi: 10.1007/s40747-021-00513-8
    [31] T. Elsken, J. H. Metzen, F. Hutter, Neural architecture search: A survey, preprint, arXiv: 1808.05377.
    [32] B. Zoph, Q. V. Le, Neural architecture search with reinforcement learning, preprint, arXiv: 1611.01578.
    [33] B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le, Learning transferable architectures for scalable image recognition, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (2018), 8697–8710. https://doi.org/10.1109/cvpr.2018.00907
    [34] Z. Zhong, J. Yan, W. Wu, J. Shao, C. L. Liu, Practical block-wise neural network architecture generation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (2018), 2423–2432. https://doi.org/10.1109/cvpr.2018.00257
    [35] H. Jin, Q. Song, X. Hu, Auto-keras: Efficient neural architecture search with network morphism, preprint, arXiv: 1806.10282.
    [36] K. Kandasamy, W. Neiswanger, J. Schneider, B. Poczos, E. P. Xing, Neural architecture search with bayesian optimisation and optimal transport, preprint, arXiv: 1802.07191.
    [37] F. Hutter, H. H. Hoos, K. Leyton-Brown, Sequential model-based optimization for general algorithm configuration, in Learning and Intelligent Optimization, Springer, 6683 (2021), 507–523. https://doi.org/10.1007/978-3-642-25566-3_40
    [38] Y. Sun, B. Xue, M. Zhang, G. G. Yen, An experimental study on hyper-parameter optimization for stacked auto-encoders, in 2018 IEEE Congress on Evolutionary Computation, IEEE, (2018), 1–8. https://doi.org/10.1109/cec.2018.8477921
    [39] Y. Du, Y. Fan, X. Liu, Y. Luo, J. Tang, P. Liu, et al., Multiscale cooperative differential evolution algorithm, Comput. Intell. Neurosci., 2019 (2019), 1–18. https://doi.org/10.1155/2019/5259129 doi: 10.1155/2019/5259129
    [40] V. P. Ha, T. K. Dao, N. Y. Pham, M. H. Le, A variable-length chromosome genetic algorithm for time-based sensor network schedule optimization, Sensors, 21 (2021), 3990. https://doi.org/10.3390/s21123990 doi: 10.3390/s21123990
    [41] E. Real, A. Aggarwal, Y. Huang, Q. V. Le, Regularized evolution for image classifier architecture search, in Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Press, 33 (2019), 4780–4789. https://doi.org/10.1609/aaai.v33i01.33014780
    [42] J. Snoek, O. Rippel, K. Swersky, R. Kiros, N. Satish, N. Sundaram, et al., Scalable bayesian optimization using deep neural networks, preprint, arXiv: 1502.05700.
    [43] Y. Liu, Y. Sun, B. Xue, M. Zhang, G. G. Yen, K. C. Tan, A survey on evolutionary neural architecture search, IEEE Trans. Neural Networks Learn. Syst., 34 (2021), 550–570. https://doi.org/10.1109/TNNLS.2021.3100554 doi: 10.1109/TNNLS.2021.3100554
    [44] Y. Wang, H. Yao, S. Zhao, Auto-encoder based dimensionality reduction, Neurocomputing, 184 (2016), 232–242. https://doi.org/10.1016/j.neucom.2015.08.104 doi: 10.1016/j.neucom.2015.08.104
    [45] M. Sakurada, T. Yairi, Anomaly detection using autoencoders with nonlinear dimensionality reduction, in Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis, ACM, (2014), 4–11. https://doi.org/10.1145/2689746.2689747
    [46] W. Wang, Y. Huang, Y. Wang, L. Wang, Generalized autoencoder: A neural network framework for dimensionality reduction, in 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, (2014), 496–503. https://doi.org/10.1109/cvprw.2014.79
    [47] X. Lu, Y. Tsao, S. Matsuda, C. Hori, Speech enhancement based on deep denoising autoencoder, in Interspeech, ISCA, (2013), 436–440. https://doi.org/10.21437/interspeech.2013-130
    [48] H. T. Chiang, Y. Y. Hsieh, S. W. Fu, K. H. Hung, Y. Tsao, S. Y. Chien, Noise reduction in ecg signals using fully convolutional denoising autoencoders, IEEE Access, 7 (2019), 60806–60813. https://doi.org/10.1109/access.2019.2912036 doi: 10.1109/access.2019.2912036
    [49] L. Yasenko, Y. Klyatchenko, O. Tarasenko-Klyatchenko, Image noise reduction by denoising autoencoder, in 2020 IEEE 11th International Conference on Dependable Systems, Services and Technologies, IEEE, (2020), 351–355. https://doi.org/10.1109/dessert50317.2020.9125027
    [50] Z. Wan, Y. Zhang, H. He, Variational autoencoder based synthetic data generation for imbalanced learning, in 2017 IEEE Symposium Series on Computational Intelligence, IEEE, (2017), 1–7. https://doi.org/10.1109/ssci.2017.8285168
    [51] S. Semeniuta, A. Severyn, E. Barth, A hybrid convolutional variational autoencoder for text generation, preprint, arXiv: 1702.02390.
    [52] W. Xu, S. Keshmiri, G. Wang, Adversarially approximated autoencoder for image generation and manipulation, IEEE Trans. Multimedia, 21 (2019), 2387–2396. https://doi.org/10.1109/tmm.2019.2898777 doi: 10.1109/tmm.2019.2898777
    [53] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, P. A. Manzagol, L. Bottou, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, J. Mach. Learn. Res., 11 (2010), 3371–3408.
    [54] M. Tschannen, O. Bachem, M. Lucic, Recent advances in autoencoder-based representation learning, preprint, arXiv: 1812.05069.
    [55] S. Lauly, H. Larochelle, M. M. Khapra, B. Ravindran, V. Raykar, A. Saha, et al., An autoencoder approach to learning bilingual word representations, preprint, arXiv: 1402.1454.
    [56] I. Sutskever, O. Vinyals, Q. V. Le, Sequence to sequence learning with neural networks, in Advances in Neural Information Processing Systems, Curran Associates, Inc. 27 (2014), 1–9.
    [57] Y. A. Chung, C. C. Wu, C. H. Shen, H. Y. Lee, L. S. Lee, Audio word2vec: Unsupervised learning of audio segment representations using sequence-to-sequence autoencoder, preprint, arXiv: 1603.00982.
    [58] H. Suresh, P. Szolovits, M. Ghassemi, The use of autoencoders for discovering patient phenotypes, preprint, arXiv: 1703.07004.
    [59] Z. Chen, Y. Zhou, Z. Huang, Auto-creation of effective neural network architecture by evolutionary algorithm and resnet for image classification, in 2019 IEEE International Conference on Systems, Man and Cybernetics, IEEE, (2019), 3895–3900. https://doi.org/10.1109/smc.2019.8914267
    [60] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, et al., Learning phrase representations using rnn encoder-decoder for statistical machine translation, preprint, arXiv: 1406.1078.
    [61] P. Koehn, Europarl: A parallel corpus for statistical machine translation, in Proceedings of Machine Translation Summit X: Papers, (2005), 79–86.
    [62] C. Moon, J. Kim, G. Choi, Y. Seo, An efficient genetic algorithm for the traveling salesman problem with precedence constraints, Eur. J. Oper. Res., 140 (2002), 606–617. https://doi.org/10.1016/s0377-2217(01)00227-2 doi: 10.1016/s0377-2217(01)00227-2
    [63] K. Chen, W. Pang, Immunetnas: An immune-network approach for searching convolutional neural network architectures, preprint, arXiv: 2002.12704.
    [64] M. Shi, D. A. Wilson, X. Zhu, Y. Huang, Y. Zhuang, J. Liu, et al., Evolutionary architecture search for graph neural networks, preprint, arXiv: 2009.10199.
    [65] A. Karpathy, Lessons learned from manually classifying cifar-10, Andrej Karpathy blog, 2011. Available from: http://karpathy.github.io/2011/04/27/manually-classifying-cifar10.
    [66] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, et al., Going deeper with convolutions, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE, (2015), 1–9. https://doi.org/10.1109/cvpr.2015.7298594
    [67] K. He, X. Zhang, S. Ren, J. Sun, Identity mappings in deep residual networks, in European Conference on Computer Vision, Springer, 9908 (2016), 630–645. https://doi.org/10.1007/978-3-319-46493-0_38
    [68] S. Zagoruyko, N. Komodakis, Wide residual networks, preprint, arXiv: 1605.07146.
    [69] T. Desell, A. ElSaid, A. G. Ororbia, An empirical exploration of deep recurrent connections using neuro-evolution, in International Conference on the Applications of Evolutionary Computation, Springer, 12104 (2020), 546–561. https://doi.org/10.1007/978-3-030-43722-0_35
    [70] A. Krizhevsky, Learning Multiple Layers of Features from Tiny Images, 2009.
    [71] M. P. Marcus, B. Santorini, M. A. Marcinkiewic, Building a large annotated corpus of english: The penn treebank, Tech. Rep. (CIS), 1993 (1993), 237.
    [72] S. Merity, C. Xiong, J. Bradbury, R. Socher, Pointer sentinel mixture models, preprint, arXiv: 1609.07843.
    [73] A. K. McCallum, K. Nigam, J. Rennie, K. Seymore, Automating the construction of internet portals with machine learning, Inf. Retr., 3 (2000), 127–163. https://doi.org/10.1023/A:1009953814988 doi: 10.1023/A:1009953814988
    [74] J. Wei, Y. Tay, Q. V. Le, Inverse scaling can become u-shaped, preprint, arXiv: 2211.02011.
  • Reader Comments
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Metrics

Article views(484) PDF downloads(60) Cited by(0)

Article outline

Figures and Tables

Figures(7)  /  Tables(9)

Other Articles By Authors

/

DownLoad:  Full-Size Img  PowerPoint
Return
Return

Catalog