Manually designed deep neural networks for image classification typically require extensive prior knowledge and expert experience; consequently, the automatic design of neural network architectures has been widely studied. Neural architecture search (NAS) methods based on differentiable architecture search (DARTS) ignore the interrelationships within the cells of the searched architecture. In addition, the candidate operations in the architecture search space lack diversity, and the many parametric and non-parametric operations in the search space make the search process inefficient. We propose a NAS method based on a dual attention mechanism (DAM-DARTS). An improved attention module is introduced into the cells of the network architecture to strengthen the interrelationships between important layers by enhancing the attention between them, which improves the accuracy of the searched architecture and reduces the search time. We also propose a more efficient architecture search space: attention operations are added to increase the complexity and diversity of the searched architectures, while non-parametric operations are reduced to lower the computational cost of the search. On this basis, we further analyze how changing certain operations in the search space affects the accuracy of the searched architectures. Extensive experiments on several open datasets demonstrate the effectiveness of the proposed search strategy, which is highly competitive with existing neural architecture search methods.
Citation: Cong Jin, Jinjie Huang, Tianshu Wei, Yuanjian Chen. Neural architecture search based on dual attention mechanism for image classification[J]. Mathematical Biosciences and Engineering, 2023, 20(2): 2691-2715. doi: 10.3934/mbe.2023126
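To make the two ideas in the abstract concrete, the sketch below illustrates (1) an attention operation added to a DARTS-style candidate-operation set and (2) the softmax-weighted mixed operation into which DARTS relaxes the discrete operation choice. This is a minimal PyTorch sketch under stated assumptions, not the authors' released DAM-DARTS implementation: `SEAttentionOp`, `make_candidates`, and the specific candidate set are illustrative choices (the attention op here is squeeze-and-excitation-style channel attention), while the paper's actual dual attention module and search space may differ.

```python
# Minimal sketch (assumptions noted above): an attention candidate operation
# plus the DARTS continuous relaxation over a candidate set.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEAttentionOp(nn.Module):
    """Squeeze-and-excitation-style channel attention, used here as one
    candidate operation in the search space (illustrative, hypothetical name)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))    # squeeze: global average pooling
        return x * w.view(b, c, 1, 1)      # excite: per-channel rescaling


def sep_conv(c: int, k: int) -> nn.Module:
    """Depthwise-separable convolution, a standard DARTS candidate."""
    return nn.Sequential(
        nn.Conv2d(c, c, k, padding=k // 2, groups=c, bias=False),
        nn.Conv2d(c, c, 1, bias=False),
        nn.BatchNorm2d(c),
        nn.ReLU(inplace=True),
    )


def make_candidates(c: int) -> nn.ModuleList:
    # Illustrative candidate set: fewer parameter-free ops and an added
    # attention op, mirroring the abstract's modified search space.
    return nn.ModuleList([
        nn.Identity(),        # skip connection
        sep_conv(c, 3),
        sep_conv(c, 5),
        SEAttentionOp(c),     # attention operation added to the space
    ])


class MixedOp(nn.Module):
    """DARTS continuous relaxation: each edge outputs a softmax-weighted sum
    of all candidates, o(x) = sum_i softmax(alpha)_i * op_i(x), so the
    architecture parameters alpha can be learned by gradient descent."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = make_candidates(channels)
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))


if __name__ == "__main__":
    edge = MixedOp(channels=16)
    out = edge(torch.randn(2, 16, 32, 32))
    print(out.shape)  # torch.Size([2, 16, 32, 32])
```

After search, a discrete architecture is typically derived by keeping, on each edge, the candidate with the largest softmax weight; replacing parameter-free candidates with an attention op, as sketched here, is one way a search space can gain diversity without the cost of additional large convolutions.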