Research article

New regularization methods for convolutional kernel tensors

  • Received: 28 June 2023 Revised: 21 August 2023 Accepted: 31 August 2023 Published: 12 September 2023
  • MSC: 15B05, 65F15

  • Convolution is a fundamental operation in convolutional neural networks, and bounding the convolutional layers during training is an active research topic. Each convolutional layer is represented by a kernel tensor, which corresponds to a structured transformation matrix. The objective is to keep the singular values of each transformation matrix close to 1 by adjusting the entries of the tensor. We propose three new regularization terms for a convolutional kernel tensor and derive a gradient descent algorithm for each penalty function. Numerical examples demonstrate the effectiveness of the algorithms.
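
The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not one of the paper's three regularization terms, which this page does not state. It assumes periodic (circular) padding on an n × n input, which the abstract does not specify; under that assumption the singular values of the transformation matrix are known (Sedghi, Gupta and Long) to be the singular values of the per-frequency blocks given by the 2D FFT of the zero-padded kernel. The penalty `sum ||P^H P - I||_F^2`, the kernel layout `(k, k, c_out, c_in)`, and all function names, step sizes and shapes below are illustrative assumptions.

```python
import numpy as np

def frequency_blocks(kernel, n):
    """2D FFT of the kernel zero-padded to n x n: one c_out x c_in block per frequency."""
    return np.fft.fft2(kernel, s=(n, n), axes=(0, 1))

def conv_singular_values(kernel, n):
    """Singular values of the transformation matrix, assuming circular padding."""
    return np.linalg.svd(frequency_blocks(kernel, n), compute_uv=False)

def penalty_and_grad(kernel, n):
    """Illustrative penalty sum_f ||P_f^H P_f - I||_F^2 and its gradient w.r.t. the kernel."""
    k = kernel.shape[0]                                   # square k x k kernel assumed
    P = frequency_blocks(kernel, n)                       # shape (n, n, c_out, c_in)
    G = np.conj(np.swapaxes(P, -1, -2)) @ P - np.eye(kernel.shape[3])
    penalty = float(np.sum(np.abs(G) ** 2))
    # Chain rule through the FFT: the gradient is 4 n^2 Re(ifft2(P G)),
    # restricted to the k x k support of the kernel.
    grad_full = 4 * n * n * np.fft.ifft2(P @ G, axes=(0, 1)).real
    return penalty, grad_full[:k, :k]

def gradient_descent(kernel, n, lr=5e-4, steps=300):
    """Plain fixed-step gradient descent on the penalty (hypothetical lr and steps)."""
    kernel = kernel.copy()
    for _ in range(steps):
        _, grad = penalty_and_grad(kernel, n)
        kernel -= lr * grad
    return kernel

rng = np.random.default_rng(0)
kernel0 = rng.normal(scale=0.1, size=(3, 3, 2, 2))        # small random 3 x 3 kernel, 2 channels
kernel1 = gradient_descent(kernel0, n=8)
before, _ = penalty_and_grad(kernel0, 8)
after, _ = penalty_and_grad(kernel1, 8)
print("penalty before:", before, "after:", after)
print("singular value range after:", conv_singular_values(kernel1, 8).min(),
      conv_singular_values(kernel1, 8).max())
```

When `c_out >= c_in` this penalty equals the sum of `(sigma^2 - 1)^2` over all singular values, so driving it down pushes the singular values toward 1. With zero padding instead of circular padding the transformation matrix has block Toeplitz rather than block circulant structure, and the FFT shortcut above no longer applies exactly.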

    Citation: Pei-Chang Guo. New regularization methods for convolutional kernel tensors. AIMS Mathematics, 2023, 8(11): 26188-26198. doi: 10.3934/math.20231335

  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)