Research article Special Issues

A visual transformer-based smart textual extraction method for financial invoices


  • In the era of big data, computer vision-assisted textual extraction techniques for financial invoices have become a major concern. Currently, such tasks are mainly implemented via traditional image processing techniques. However, these techniques rely heavily on manual feature extraction and are mainly developed for specific financial invoice scenes, so general applicability and robustness are their major challenges. By contrast, deep learning can adaptively learn feature representations for different scenes and can be used to address this issue. Accordingly, this work introduces a classic pre-training model, the visual transformer, to construct a lightweight recognition model for this purpose. First, we use image processing technology to preprocess the bill image. Then, we use a sequence transduction model with a visual transformer structure to extract information. In the target location stage, the horizontal-vertical projection method is used to segment the individual characters, and template matching is used to normalize the characters. In the feature extraction stage, the transformer structure is adopted to capture the relationships among fine-grained features through a multi-head attention mechanism. On this basis, a text classification procedure is designed to output detection results. Finally, experiments on a real-world dataset are carried out to evaluate the performance of the proposal, and the results show that the method achieves high accuracy and robustness in extracting financial bill information.

    Citation: Tao Wang, Min Qiu. A visual transformer-based smart textual extraction method for financial invoices[J]. Mathematical Biosciences and Engineering, 2023, 20(10): 18630-18649. doi: 10.3934/mbe.2023826




    Fractional calculus is a branch of mathematics that generalises integration and differentiation to arbitrary orders. The subject dates back to the ideas of G. W. Leibniz (1695) and L. Euler (1730). Fractional differential equations (FDEs) have lately gained attention because they model many phenomena realistically and accurately [1,2,3,4,5,6,7]. There are various types of fractional derivatives, including Riemann–Liouville, Caputo, Grünwald–Letnikov, Weyl, Marchaud, and Atangana. The history of the topic can be found in [8,9,10,11]. Fractional calculus applies to mathematical models of different phenomena, sometimes more effectively than ordinary calculus [12,13]. As a result, it can describe a wide range of dynamical and engineering models with greater precision. Applications have been developed and investigated in a variety of scientific and engineering fields over the last few decades, including bioengineering [14], mechanics [15], optics [16], physics [17], mathematical biology, electrical power systems [18,19,20] and signal processing [21,22,23].

    One definition of the fractional derivative is that of Caputo-Fabrizio, which adds a new dimension to the study of FDEs. Its distinguishing feature is a nonsingular kernel, built by combining an ordinary derivative with an exponential function, while it retains the same supplementary motivating properties, with various scales, as the Riemann-Liouville and Caputo fractional derivatives. The Caputo-Fabrizio fractional derivative has been used to solve real-world problems in numerous areas of mathematical modelling, for example numerical solutions for groundwater pollution, modelling the movement of waves on the surface of shallow water [24], RLC circuit modelling [25], and heat transfer modelling [26,27].

    Rach (1987) and Bellomo and Sarafyan (1987) first compared the Adomian decomposition method (ADM) [28,29,30,31,32] to the Picard method on a variety of examples. These methods have many benefits: they work effectively with various types of linear and nonlinear equations and provide an analytic solution for these equations with no linearization or discretization. Whereas many other numerical methods are each designed for a specific type of equation, ADM and Picard are useful for many types of equations. In the numerical examples provided, we compare ADM and Picard solutions of multidimensional fractional-order equations with the Caputo-Fabrizio derivative.

    The Caputo-Fabrizio fractional derivative of the function $x(t)$ is defined as [33]

    ${}^{CF}D_0^{\alpha}x(t)=\frac{B(\alpha)}{1-\alpha}\int_0^t \frac{d}{ds}\big(x(s)\big)\, e^{-\frac{\alpha}{1-\alpha}(t-s)}\,ds,$ (1.1)

    and its corresponding fractional integral is

    ${}^{CF}I^{\alpha}x(t)=\frac{1-\alpha}{B(\alpha)}x(t)+\frac{\alpha}{B(\alpha)}\int_0^t x(s)\,ds,\quad 0<\alpha<1,$ (1.2)

    where $x(t)$ is continuous and differentiable on $[0,T]$. In the above definition, $B(\alpha)>0$ is a normalization function satisfying $B(0)=B(1)=1$. The relation between the Caputo–Fabrizio fractional derivative and its corresponding integral is given by

    $\big({}^{CF}I_0^{\alpha}\big)\big({}^{CF}D_0^{\alpha}f(t)\big)=f(t)-f(a).$ (1.3)
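    The two operators can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes $B(\alpha)=1$ for all $\alpha$, a lower terminal of 0 (so the right-hand side of the relation above reads $f(t)-f(0)$), and trapezoid-rule quadrature. The function names `cf_integral` and `cf_derivative` are hypothetical.

```python
import math

def cf_integral(ts, xs, alpha, B=1.0):
    """Caputo-Fabrizio integral of samples xs on the grid ts.

    The running integral uses the cumulative trapezoid rule.
    B stands for B(alpha); B = 1 is an assumption of this sketch.
    """
    out, running = [], 0.0
    for k in range(len(ts)):
        if k > 0:
            running += 0.5 * (xs[k - 1] + xs[k]) * (ts[k] - ts[k - 1])
        out.append((1.0 - alpha) / B * xs[k] + alpha / B * running)
    return out

def cf_derivative(dx, t, alpha, B=1.0, n=2000):
    """Caputo-Fabrizio derivative at time t, given dx = x' as a callable.

    The integral over [0, t] is approximated by the trapezoid rule
    with n panels. Requires 0 < alpha < 1.
    """
    k = alpha / (1.0 - alpha)
    h = t / n
    total = 0.0
    for j in range(n + 1):
        s = j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * dx(s) * math.exp(-k * (t - s))
    return B / (1.0 - alpha) * total * h

if __name__ == "__main__":
    alpha = 0.5
    ts = [i / 1000 for i in range(1001)]
    # x(t) = t, so x'(s) = 1; integrating the derivative should return t - 0.
    d = [cf_derivative(lambda s: 1.0, t, alpha, n=200) for t in ts]
    back = cf_integral(ts, d, alpha)
    print(max(abs(b - t) for b, t in zip(back, ts)))
```

    For $x(t)=t$ the derivative has the closed form $\frac{B(\alpha)}{\alpha}\big(1-e^{-\frac{\alpha}{1-\alpha}t}\big)$, which gives an independent check of the quadrature.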

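    In this section, we introduce a multidimensional FDE subject to an initial condition. Let $\alpha\in(0,1]$, $0<\alpha_1<\alpha_2<\dots<\alpha_m<1$, where $m$ is a positive integer, and consider

    ${}^{CF}D^{\alpha}x=f(t,x,{}^{CF}D^{\alpha_1}x,{}^{CF}D^{\alpha_2}x,\dots,{}^{CF}D^{\alpha_m}x),\quad x(0)=c_0,$ (2.1)

    where $x=x(t)$, $t\in J=[0,T]$, $T\in\mathbb{R}^{+}$, $x\in C(J)$.

    To simplify the calculation, we let $x(t)=c_0+X(t)$, so Eq (2.1) can be written as

    ${}^{CF}D^{\alpha}X=f(t,c_0+X,{}^{CF}D^{\alpha_1}X,{}^{CF}D^{\alpha_2}X,\dots,{}^{CF}D^{\alpha_m}X),\quad X(0)=0.$ (2.2)

    The algorithm depends on converting the initial condition from a constant $c_0$ to $0$.

    Let ${}^{CF}D^{\alpha}X=y(t)$; then $X={}^{CF}I^{\alpha}y$, so we have

    ${}^{CF}D^{\alpha_i}X={}^{CF}I^{\alpha-\alpha_i}\,{}^{CF}D^{\alpha}X={}^{CF}I^{\alpha-\alpha_i}y,\quad i=1,2,\dots,m.$ (2.3)

    Substituting in Eq (2.2) we obtain

    $y=f(t,c_0+{}^{CF}I^{\alpha}y,{}^{CF}I^{\alpha-\alpha_1}y,\dots,{}^{CF}I^{\alpha-\alpha_m}y).$ (2.4)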

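    Assume $f$ satisfies a Lipschitz condition with Lipschitz constant $L$, given by

    $|f(t,y_0,y_1,\dots,y_m)-f(t,z_0,z_1,\dots,z_m)|\le L\sum_{i=0}^{m}|y_i-z_i|,$ (2.5)

    which implies

    $|f(t,c_0+{}^{CF}I^{\alpha}y,{}^{CF}I^{\alpha-\alpha_1}y,\dots,{}^{CF}I^{\alpha-\alpha_m}y)-f(t,c_0+{}^{CF}I^{\alpha}z,{}^{CF}I^{\alpha-\alpha_1}z,\dots,{}^{CF}I^{\alpha-\alpha_m}z)|\le L\sum_{i=0}^{m}|{}^{CF}I^{\alpha-\alpha_i}y-{}^{CF}I^{\alpha-\alpha_i}z|,$ (2.6)

    where the $i=0$ term corresponds to ${}^{CF}I^{\alpha}$, i.e., $\alpha_0=0$.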

    The solution algorithm of Eq (2.4) using ADM is

    $y_0(t)=a(t),\quad y_{n+1}(t)=A_n(t),\quad n\ge 0,$ (2.7)

    where $a(t)$ contains all free terms in Eq (2.4) and $A_n$ are the Adomian polynomials of the nonlinear term, which take the form [34]

    $A_n=f(S_n)-\sum_{i=0}^{n-1}A_i,$ (2.8)

    where $S_n=\sum_{i=0}^{n}y_i$ is the $n$-th partial sum, so that $f(S_n)=\sum_{i=0}^{n}A_i$. Later, this accelerated formula for the Adomian polynomials will be used in the convergence analysis and error estimation. The solution of Eq (2.4) can be written in the form

    $y(t)=\sum_{i=0}^{\infty}y_i(t).$ (2.9)

    Lastly, the solution of Eq (2.1) takes the form

    $x(t)=c_0+X(t)=c_0+{}^{CF}I^{\alpha}y(t).$ (2.10)

    Here Eq (2.10) converts the auxiliary unknown $y$ back to the original unknown $x$, which gives the solution of the original Eq (2.1).
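    The ADM recursion with the accelerated polynomials can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' code: functions are sampled on a uniform grid, $B(\alpha)=1$, and the test equation $y=t+\tfrac{1}{10}\big({}^{CF}I^{1/2}y\big)^2$ on $[0,1]$ is a hypothetical toy problem chosen only because its nonlinear term is mild. The names `adm_components` and `toy_nonlinear` are made up for this sketch.

```python
def cf_integral(ts, ys, alpha, B=1.0):
    """Caputo-Fabrizio integral on a grid (cumulative trapezoid rule, B = 1 assumed)."""
    out, running = [], 0.0
    for k in range(len(ts)):
        if k > 0:
            running += 0.5 * (ys[k - 1] + ys[k]) * (ts[k] - ts[k - 1])
        out.append((1.0 - alpha) / B * ys[k] + alpha / B * running)
    return out

def adm_components(a, nonlinear, n_terms):
    """Return the ADM components y_0, ..., y_{n_terms} for y = a + N(y).

    The Adomian polynomials use the accelerated form
    A_n = N(S_n) - (A_0 + ... + A_{n-1}), with S_n the n-th partial sum.
    """
    comps = [list(a)]                 # y_0 = a(t)
    partial = list(a)                 # S_0
    sum_prev_A = [0.0] * len(a)       # A_0 + ... + A_{n-1}
    for _ in range(n_terms):
        N_S = nonlinear(partial)
        A_n = [v - s for v, s in zip(N_S, sum_prev_A)]
        sum_prev_A = N_S              # the sum of A_0..A_n telescopes to N(S_n)
        comps.append(A_n)             # y_{n+1} = A_n
        partial = [p + v for p, v in zip(partial, A_n)]
    return comps

# Hypothetical toy problem: y = t + 0.1 * (CF I^{1/2} y)^2 on [0, 1].
ts = [i / 200 for i in range(201)]
a = list(ts)

def toy_nonlinear(y):
    Iy = cf_integral(ts, y, 0.5)
    return [0.1 * v * v for v in Iy]

comps = adm_components(a, toy_nonlinear, 10)
S = [sum(c[k] for c in comps) for k in range(len(ts))]   # partial sum S_10
```

    The size of the last component, `max(abs(v) for v in comps[-1])`, gives a quick indication of how fast the series is settling for this toy problem.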

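    Define a mapping $F:E\to E$, where $E=(C[J],\|\cdot\|)$ is the Banach space of all continuous functions on $J$ with the norm $\|x\|=\max_{t\in J}|x(t)|$.

    Theorem 3.1. Equation (2.4) has a unique solution whenever $0<\phi<1$, where $\phi=L\left(\sum_{i=0}^{m}\frac{[(\alpha-\alpha_i)(T-1)]+1}{B(\alpha-\alpha_i)}\right)$.

    Proof. First, we define the mapping $F:E\to E$ as

    $Fy=f(t,c_0+{}^{CF}I^{\alpha}y,{}^{CF}I^{\alpha-\alpha_1}y,\dots,{}^{CF}I^{\alpha-\alpha_m}y).$

    Let $y,z\in E$ be two different solutions of Eq (2.4). Then

    $Fy-Fz=f(t,c_0+{}^{CF}I^{\alpha}y,{}^{CF}I^{\alpha-\alpha_1}y,\dots,{}^{CF}I^{\alpha-\alpha_m}y)-f(t,c_0+{}^{CF}I^{\alpha}z,{}^{CF}I^{\alpha-\alpha_1}z,\dots,{}^{CF}I^{\alpha-\alpha_m}z),$

    which implies that

    $\begin{aligned}
    |Fy-Fz| &= |f(t,c_0+{}^{CF}I^{\alpha}y,{}^{CF}I^{\alpha-\alpha_1}y,\dots,{}^{CF}I^{\alpha-\alpha_m}y)-f(t,c_0+{}^{CF}I^{\alpha}z,{}^{CF}I^{\alpha-\alpha_1}z,\dots,{}^{CF}I^{\alpha-\alpha_m}z)| \\
    &\le L\sum_{i=0}^{m}\left|{}^{CF}I^{\alpha-\alpha_i}y-{}^{CF}I^{\alpha-\alpha_i}z\right| \\
    &\le L\sum_{i=0}^{m}\left|\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}(y-z)+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}\int_0^t(y-z)\,ds\right|, \\
    \|Fy-Fz\| &\le L\sum_{i=0}^{m}\left[\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}\max_{t\in J}|y-z|+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}\max_{t\in J}|y-z|\int_0^t ds\right] \\
    &\le L\sum_{i=0}^{m}\left[\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}\|y-z\|+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}\|y-z\|\,T\right] \\
    &\le L\|y-z\|\left(\sum_{i=0}^{m}\left[\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}T\right]\right) \\
    &\le L\|y-z\|\left(\sum_{i=0}^{m}\frac{[(\alpha-\alpha_i)(T-1)]+1}{B(\alpha-\alpha_i)}\right) \\
    &\le \phi\,\|y-z\|.
    \end{aligned}$

    Under the condition $0<\phi<1$, the mapping $F$ is a contraction, and hence there exists a unique solution $y\in C[J]$ of Eq (2.4). This completes the proof.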
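    The constant $\phi$ in Theorem 3.1 is easy to evaluate for given data. The helper below is a small sketch; it assumes the $i=0$ term of the sum corresponds to $\alpha_0=0$ (as suggested by the first argument ${}^{CF}I^{\alpha}y$ of $f$) and defaults to $B\equiv 1$. The function name and its parameters are hypothetical.

```python
def contraction_constant(L, alpha, alphas, T, B=lambda a: 1.0):
    """Evaluate phi = L * sum_i ((alpha - alpha_i)(T - 1) + 1) / B(alpha - alpha_i).

    alphas holds alpha_1, ..., alpha_m; alpha_0 = 0 is prepended (an assumption
    of this sketch). B is the normalization function, taken as 1 by default.
    """
    orders = [0.0] + list(alphas)
    return L * sum(((alpha - ai) * (T - 1.0) + 1.0) / B(alpha - ai) for ai in orders)

# Orders taken from Example 4.4; the Lipschitz constant L = 0.1 is illustrative only.
phi = contraction_constant(L=0.1, alpha=0.7, alphas=[0.1, 0.3, 0.5], T=1.0)
unique_solution_guaranteed = 0.0 < phi < 1.0
```

    With $T=1$ every summand equals $1/B(\alpha-\alpha_i)$, so for $B\equiv1$ the condition reduces to $(m+1)L<1$.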

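    Theorem 3.2. The series solution of Eq (2.4) converges if $|y_1(t)|<c$, where $c$ is finite.

    Proof. Define the sequence $\{S_p\}$, where $S_p=\sum_{i=0}^{p}y_i(t)$ is the sequence of partial sums of the series solution $\sum_{i=0}^{\infty}y_i(t)$. We have

    $f(t,c_0+{}^{CF}I^{\alpha}y,{}^{CF}I^{\alpha-\alpha_1}y,\dots,{}^{CF}I^{\alpha-\alpha_m}y)=\sum_{i=0}^{\infty}A_i,$

    so

    $f(t,c_0+{}^{CF}I^{\alpha}S_p,{}^{CF}I^{\alpha-\alpha_1}S_p,\dots,{}^{CF}I^{\alpha-\alpha_m}S_p)=\sum_{i=0}^{p}A_i.$

    From Eq (2.7) we have

    $\sum_{i=0}^{\infty}y_i(t)=a(t)+\sum_{i=1}^{\infty}A_{i-1}.$

    Let $S_p,S_q$ be two arbitrary partial sums with $p>q$. We now prove that $\{S_p\}$ is a Cauchy sequence in this Banach space. We have

    $S_p=\sum_{i=0}^{p}y_i(t)=a(t)+\sum_{i=1}^{p}A_{i-1},\quad S_q=\sum_{i=0}^{q}y_i(t)=a(t)+\sum_{i=1}^{q}A_{i-1}.$

    $\begin{aligned}
    S_p-S_q &= \sum_{i=1}^{p}A_{i-1}-\sum_{i=1}^{q}A_{i-1}=\sum_{i=q+1}^{p}A_{i-1}=\sum_{i=q}^{p-1}A_i \\
    &= f(t,c_0+{}^{CF}I^{\alpha}S_{p-1},{}^{CF}I^{\alpha-\alpha_1}S_{p-1},\dots,{}^{CF}I^{\alpha-\alpha_m}S_{p-1})-f(t,c_0+{}^{CF}I^{\alpha}S_{q-1},{}^{CF}I^{\alpha-\alpha_1}S_{q-1},\dots,{}^{CF}I^{\alpha-\alpha_m}S_{q-1}),
    \end{aligned}$

    $\begin{aligned}
    |S_p-S_q| &\le L\sum_{i=0}^{m}\left|{}^{CF}I^{\alpha-\alpha_i}S_{p-1}-{}^{CF}I^{\alpha-\alpha_i}S_{q-1}\right| \\
    &\le L\sum_{i=0}^{m}\left|\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}(S_{p-1}-S_{q-1})+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}\int_0^t(S_{p-1}-S_{q-1})\,ds\right| \\
    &\le L\sum_{i=0}^{m}\left[\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}|S_{p-1}-S_{q-1}|+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}\int_0^t|S_{p-1}-S_{q-1}|\,ds\right],
    \end{aligned}$

    $\begin{aligned}
    \|S_p-S_q\| &\le L\sum_{i=0}^{m}\left[\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}\max_{t\in J}|S_{p-1}-S_{q-1}|+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}\max_{t\in J}|S_{p-1}-S_{q-1}|\int_0^t ds\right] \\
    &\le L\|S_{p-1}-S_{q-1}\|\sum_{i=0}^{m}\left(\frac{1-(\alpha-\alpha_i)}{B(\alpha-\alpha_i)}+\frac{\alpha-\alpha_i}{B(\alpha-\alpha_i)}T\right) \\
    &\le L\|S_{p-1}-S_{q-1}\|\left(\sum_{i=0}^{m}\frac{[(\alpha-\alpha_i)(T-1)]+1}{B(\alpha-\alpha_i)}\right) \\
    &\le \phi\,\|S_{p-1}-S_{q-1}\|.
    \end{aligned}$

    Let $p=q+1$; then

    $\|S_{q+1}-S_q\|\le\phi\|S_q-S_{q-1}\|\le\phi^2\|S_{q-1}-S_{q-2}\|\le\dots\le\phi^q\|S_1-S_0\|.$

    From the triangle inequality we have

    $\begin{aligned}
    \|S_p-S_q\| &\le \|S_{q+1}-S_q\|+\|S_{q+2}-S_{q+1}\|+\dots+\|S_p-S_{p-1}\| \\
    &\le \left[\phi^q+\phi^{q+1}+\dots+\phi^{p-1}\right]\|S_1-S_0\| \\
    &\le \phi^q\left[1+\phi+\dots+\phi^{p-q-1}\right]\|S_1-S_0\| \\
    &\le \phi^q\left[\frac{1-\phi^{p-q}}{1-\phi}\right]\|y_1(t)\|.
    \end{aligned}$

    Since $0<\phi<1$ and $p>q$, we have $(1-\phi^{p-q})\le 1$. Consequently

    $\|S_p-S_q\|\le\frac{\phi^q}{1-\phi}\|y_1(t)\|\le\frac{\phi^q}{1-\phi}\max_{t\in J}|y_1(t)|.$ (3.1)

    But $|y_1(t)|<\infty$, so as $q\to\infty$ we have $\|S_p-S_q\|\to 0$. Hence $\{S_p\}$ is a Cauchy sequence in this Banach space, and the proof is complete.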

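    Theorem 3.3. The maximum absolute truncation error of the series solution of Eq (2.4) is estimated to be $\max_{t\in J}\left|y(t)-\sum_{i=0}^{q}y_i(t)\right|\le\frac{\phi^q}{1-\phi}\max_{t\in J}|y_1(t)|$.

    Proof. From inequality (3.1) of the convergence theorem we have

    $\|S_p-S_q\|\le\frac{\phi^q}{1-\phi}\max_{t\in J}|y_1(t)|,$

    but $S_p=\sum_{i=0}^{p}y_i(t)$, and as $p\to\infty$ we have $S_p\to y(t)$, so

    $\|y(t)-S_q\|\le\frac{\phi^q}{1-\phi}\max_{t\in J}|y_1(t)|.$

    Therefore the maximum absolute truncation error in the interval $J$ is

    $\max_{t\in J}\left|y(t)-\sum_{i=0}^{q}y_i(t)\right|\le\frac{\phi^q}{1-\phi}\max_{t\in J}|y_1(t)|,$ (3.2)

    and this completes the proof.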
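    The bound (3.2) can be turned into a simple stopping rule. The sketch below evaluates the bound and searches for the smallest number of terms $q$ that meets a tolerance; the function names and the sample values of $\phi$ and $\max|y_1|$ are hypothetical and serve only to illustrate the arithmetic.

```python
def truncation_bound(phi, q, y1_max):
    """Right-hand side of the a priori estimate: phi**q / (1 - phi) * max|y_1|."""
    if not 0.0 < phi < 1.0:
        raise ValueError("the estimate requires 0 < phi < 1")
    return phi ** q / (1.0 - phi) * y1_max

def terms_needed(phi, y1_max, tol, q_max=10_000):
    """Smallest q whose bound does not exceed tol (linear search up to q_max)."""
    for q in range(q_max + 1):
        if truncation_bound(phi, q, y1_max) <= tol:
            return q
    raise ValueError("tolerance not reached within q_max terms")

# Illustrative values: phi = 0.5 and max|y_1| = 1.
bound_q3 = truncation_bound(0.5, 3, 1.0)   # 0.125 / 0.5
q_min = terms_needed(0.5, 1.0, 0.01)
```

    Because the bound decays geometrically, halving the tolerance costs roughly $\log 2/\log(1/\phi)$ extra terms.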

    In this part, we introduce several numerical examples with unknown exact solutions, and we use inequality (3.2) to estimate the maximum absolute truncation error.

    Example 4.1. Application of a linear FDE

    ${}^{CF}Dx(t)+2a\,{}^{CF}D^{1/2}x(t)+bx(t)=0,\quad x(0)=1.$ (4.1)

    This is a Basset problem, a classical problem in fluid dynamics used to study the unsteady motion of an accelerating particle in a viscous fluid under the action of gravity [36].

    Set

    $X(t)=x(t)-1.$

    Equation (4.1) becomes

    ${}^{CF}DX(t)+2a\,{}^{CF}D^{1/2}X(t)+bX(t)=-b,\quad X(0)=0.$ (4.2)

    Applying Eq (2.3) to Eq (4.2), using the initial condition, and taking $a=1$, $b=1/2$, we obtain

    $y=-\frac{1}{2}-2\,{}^{CF}I^{1/2}y-\frac{1}{2}\,{}^{CF}I\,y.$ (4.3)
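    The reduction from Eq (4.1) to Eq (4.3) can be written out step by step. The derivation below is a sketch that follows the substitutions of Section 2 (with $c_0=1$, $\alpha=1$, $\alpha_1=1/2$); the free term $-b$ comes from $b\,x=b\,(1+X)$.

```latex
\begin{aligned}
&x = 1 + X,\qquad y = {}^{CF}D X,\qquad X = {}^{CF}I\,y,\qquad {}^{CF}D^{1/2}X = {}^{CF}I^{1/2}y, \\
&{}^{CF}D x + 2a\,{}^{CF}D^{1/2}x + b\,x = 0
  \;\Longrightarrow\; y + 2a\,{}^{CF}I^{1/2}y + b\,\bigl(1 + {}^{CF}I\,y\bigr) = 0, \\
&y = -b - 2a\,{}^{CF}I^{1/2}y - b\,{}^{CF}I\,y
  \;\overset{a=1,\;b=1/2}{=}\; -\tfrac{1}{2} - 2\,{}^{CF}I^{1/2}y - \tfrac{1}{2}\,{}^{CF}I\,y .
\end{aligned}
```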

    Applying ADM to Eq (4.3), the solution algorithm becomes

    $y_0(t)=-\frac{1}{2},\quad y_i(t)=-2\,{}^{CF}I^{1/2}y_{i-1}-\frac{1}{2}\,{}^{CF}I\,y_{i-1},\quad i\ge 1.$ (4.4)

    Applying the Picard method to Eq (4.3), the solution algorithm becomes

    $y_0(t)=-\frac{1}{2},\quad y_i(t)=-\frac{1}{2}-2\,{}^{CF}I^{1/2}y_{i-1}-\frac{1}{2}\,{}^{CF}I\,y_{i-1},\quad i\ge 1.$ (4.5)

    From Eq (4.4), the solution using ADM is given by $y(t)=\lim_{q\to\infty}\sum_{i=0}^{q}y_i(t)$, while from Eq (4.5) the solution using the Picard technique is given by $y(t)=\lim_{i\to\infty}y_i(t)$. Finally, the solution of the original problem Eq (4.1) is

    $x(t)=1+{}^{CF}I\,y(t).$

    On the same processor (q = 20), the time consumed using ADM is 0.037 seconds, while the time consumed using Picard is 7.955 seconds.

    Figure 1 gives a comparison between ADM and Picard solution of Ex. 4.1.

    Figure 1.  ADM and Picard solution of Ex. 4.1.

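    Example 4.2. Consider the following nonlinear FDE [35]

    ${}^{CF}D^{1/2}x=\frac{8t^{3/2}}{3\sqrt{\pi}}-\frac{t^{7/4}}{4\Gamma(\frac{11}{4})}-\frac{t^{4}}{4}+\frac{1}{8}\,{}^{CF}D^{1/4}x+\frac{1}{4}x^{2},\quad x(0)=0.$ (4.6)

    Applying Eq (2.3) to Eq (4.6) and using the initial condition,

    $y=\frac{8t^{3/2}}{3\sqrt{\pi}}-\frac{t^{7/4}}{4\Gamma(\frac{11}{4})}-\frac{t^{4}}{4}+\frac{1}{8}\,{}^{CF}I^{1/4}y+\frac{1}{4}\big({}^{CF}I^{1/2}y\big)^{2}.$ (4.7)

    Applying ADM to Eq (4.7), the solution algorithm becomes

    $y_0(t)=\frac{8t^{3/2}}{3\sqrt{\pi}}-\frac{t^{7/4}}{4\Gamma(\frac{11}{4})}-\frac{t^{4}}{4},\quad y_i(t)=\frac{1}{8}\,{}^{CF}I^{1/4}y_{i-1}+\frac{1}{4}A_{i-1},\quad i\ge 1,$ (4.8)

    where $A_i$ are the Adomian polynomials of the nonlinear term $\big({}^{CF}I^{1/2}y\big)^{2}$.

    Applying the Picard method to Eq (4.7), the solution algorithm becomes

    $y_0(t)=\frac{8t^{3/2}}{3\sqrt{\pi}}-\frac{t^{7/4}}{4\Gamma(\frac{11}{4})}-\frac{t^{4}}{4},\quad y_i(t)=y_0(t)+\frac{1}{8}\,{}^{CF}I^{1/4}y_{i-1}+\frac{1}{4}\big({}^{CF}I^{1/2}y_{i-1}\big)^{2},\quad i\ge 1.$ (4.9)

    From Eq (4.8), the solution using ADM is given by $y(t)=\lim_{q\to\infty}\sum_{i=0}^{q}y_i(t)$, while from Eq (4.9) the solution using the Picard technique is given by $y(t)=\lim_{i\to\infty}y_i(t)$. Finally, the solution of the original problem Eq (4.6) is

    $x(t)={}^{CF}I^{1/2}y.$

    On the same processor (q = 2), the time consumed using ADM is 65.13 seconds, while the time consumed using Picard is 544.787 seconds.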

    Table 1 shows the maximum absolute truncation error of the ADM solution (using Theorem 3.3) at different values of q (when t = 0.5, N = 2):

    Table 1.  Max. absolute error.
    q max. absolute error
    2 0.114548
    5 0.099186
    10 0.004363


    Figure 2 gives a comparison between ADM and Picard solution of Ex. 4.2.

    Figure 2.  ADM and Picard solution of Ex. 4.2.

    Example 4.3. Consider the following nonlinear FDE [35]

    ${}^{CF}D^{\alpha}x=3t^{2}-\frac{128}{125\pi}t^{5}+\frac{1}{10}\big({}^{CF}D^{1/2}x\big)^{2},\quad x(0)=0.$ (4.10)

    Applying Eq (2.3) to Eq (4.10) and using the initial condition,

    $y=3t^{2}-\frac{128}{125\pi}t^{5}+\frac{1}{10}\big({}^{CF}I^{1/2}y\big)^{2}.$ (4.11)

    Applying ADM to Eq (4.11), the solution algorithm becomes

    $y_0(t)=3t^{2}-\frac{128}{125\pi}t^{5},\quad y_i(t)=\frac{1}{10}A_{i-1},\quad i\ge 1,$ (4.12)

    where $A_i$ are the Adomian polynomials of the nonlinear term $\big({}^{CF}I^{1/2}y\big)^{2}$.

    Then, applying the Picard method to Eq (4.11), the solution algorithm becomes

    $y_0(t)=3t^{2}-\frac{128}{125\pi}t^{5},\quad y_i(t)=y_0(t)+\frac{1}{10}\big({}^{CF}I^{1/2}y_{i-1}\big)^{2},\quad i\ge 1.$ (4.13)

    From Eq (4.12), the solution using ADM is given by $y(t)=\lim_{q\to\infty}\sum_{i=0}^{q}y_i(t)$, while from Eq (4.13) the Picard solution is $y(t)=\lim_{i\to\infty}y_i(t)$. Finally, the solution of the original problem Eq (4.10) is

    $x(t)={}^{CF}I\,y(t).$

    On the same processor (q = 4), the time consumed using ADM is 2.09 seconds, while the time consumed using Picard is 44.725 seconds.
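    The Picard iteration for this example can be sketched on a grid. This is a numerical illustration, not the symbolic computation timed above: it assumes $\alpha=1$, $B\equiv1$ (so ${}^{CF}I^{1}$ reduces to the ordinary integral), the interval $[0,0.5]$, and trapezoid quadrature. The helper names `rhs_4_11` and `picard_example_4_3` are hypothetical.

```python
import math

def cf_integral(ts, ys, alpha, B=1.0):
    """Caputo-Fabrizio integral on a grid (cumulative trapezoid rule, B = 1 assumed)."""
    out, running = [], 0.0
    for k in range(len(ts)):
        if k > 0:
            running += 0.5 * (ys[k - 1] + ys[k]) * (ts[k] - ts[k - 1])
        out.append((1.0 - alpha) / B * ys[k] + alpha / B * running)
    return out

def rhs_4_11(ts, y):
    """Right-hand side 3t^2 - 128/(125*pi) t^5 + (1/10) (CF I^{1/2} y)^2."""
    Iy = cf_integral(ts, y, 0.5)
    return [3 * t**2 - 128 / (125 * math.pi) * t**5 + 0.1 * v * v
            for t, v in zip(ts, Iy)]

def picard_example_4_3(T=0.5, n_grid=501, n_iter=12):
    """Run n_iter Picard sweeps y_i = rhs(y_{i-1}) and return (ts, y, x)."""
    ts = [T * i / (n_grid - 1) for i in range(n_grid)]
    y = [3 * t**2 - 128 / (125 * math.pi) * t**5 for t in ts]   # y_0 = free terms
    for _ in range(n_iter):
        y = rhs_4_11(ts, y)
    x = cf_integral(ts, y, 1.0)   # x = CF I^1 y, the plain integral when B(1) = 1
    return ts, y, x

ts, y, x = picard_example_4_3()
residual = max(abs(u - v) for u, v in zip(y, rhs_4_11(ts, y)))
```

    The residual measures how well the final iterate satisfies the discretized fixed-point equation; it says nothing about the quadrature error, which is controlled by `n_grid`.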

    Table 2 shows the maximum absolute truncation error of the ADM solution (using Theorem 3.3) at different values of q (when t = 0.5, N = 4):

    Table 2.  Max. absolute error.
    q max. absolute error
    2 0.00222433
    5 0.0000326908
    10 $2.88273\times10^{-8}$


    Figure 3 gives a comparison between ADM and Picard solution of Ex. 4.3 with α=1.

    Figure 3.  ADM and Picard solution of Ex. 4.3 with α=1.

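    Example 4.4. Consider the following nonlinear FDE [35]

    ${}^{CF}D^{\alpha}x=t^{2}+\frac{1}{2}\,{}^{CF}D^{\alpha_1}x+\frac{1}{4}\,{}^{CF}D^{\alpha_2}x+\frac{1}{6}\,{}^{CF}D^{\alpha_3}x+\frac{1}{8}x^{4},\quad x(0)=0.$ (4.14)

    Applying Eq (2.3) to Eq (4.14) and using the initial condition,

    $y=t^{2}+\frac{1}{2}\big({}^{CF}I^{\alpha-\alpha_1}y\big)+\frac{1}{4}\big({}^{CF}I^{\alpha-\alpha_2}y\big)+\frac{1}{6}\big({}^{CF}I^{\alpha-\alpha_3}y\big)+\frac{1}{8}\big({}^{CF}I^{\alpha}y\big)^{4}.$ (4.15)

    Applying ADM to Eq (4.15), the solution algorithm becomes

    $y_0(t)=t^{2},\quad y_i(t)=\frac{1}{2}\big({}^{CF}I^{\alpha-\alpha_1}y_{i-1}\big)+\frac{1}{4}\big({}^{CF}I^{\alpha-\alpha_2}y_{i-1}\big)+\frac{1}{6}\big({}^{CF}I^{\alpha-\alpha_3}y_{i-1}\big)+\frac{1}{8}A_{i-1},\quad i\ge 1,$ (4.16)

    where $A_i$ are the Adomian polynomials of the nonlinear term $\big({}^{CF}I^{\alpha}y\big)^{4}$.

    Then, applying the Picard method to Eq (4.15), the solution algorithm becomes

    $y_0(t)=t^{2},\quad y_i(t)=t^{2}+\frac{1}{2}\big({}^{CF}I^{\alpha-\alpha_1}y_{i-1}\big)+\frac{1}{4}\big({}^{CF}I^{\alpha-\alpha_2}y_{i-1}\big)+\frac{1}{6}\big({}^{CF}I^{\alpha-\alpha_3}y_{i-1}\big)+\frac{1}{8}\big({}^{CF}I^{\alpha}y_{i-1}\big)^{4},\quad i\ge 1.$ (4.17)

    From Eq (4.16), the solution using ADM is given by $y(t)=\lim_{q\to\infty}\sum_{i=0}^{q}y_i(t)$, while from Eq (4.17) the solution using the Picard technique is $y(t)=\lim_{i\to\infty}y_i(t)$. Finally, the solution of the original problem Eq (4.14) is

    $x(t)={}^{CF}I^{\alpha}y(t).$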
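    A grid-based Picard iteration for this multi-order example can be sketched as follows. The orders $\alpha=0.7$, $\alpha_1=0.1$, $\alpha_2=0.3$, $\alpha_3=0.5$ are those quoted for Figure 4; everything else is an assumption of the sketch: $B\equiv1$, the interval $[0,0.5]$, trapezoid quadrature, and the hypothetical helper names `rhs_4_15` and `picard_example_4_4`.

```python
def cf_integral(ts, ys, alpha, B=1.0):
    """Caputo-Fabrizio integral on a grid (cumulative trapezoid rule, B = 1 assumed)."""
    out, running = [], 0.0
    for k in range(len(ts)):
        if k > 0:
            running += 0.5 * (ys[k - 1] + ys[k]) * (ts[k] - ts[k - 1])
        out.append((1.0 - alpha) / B * ys[k] + alpha / B * running)
    return out

def rhs_4_15(ts, y, alpha, orders):
    """Right-hand side of the reduced equation for given alpha and (a1, a2, a3)."""
    a1, a2, a3 = orders
    I1 = cf_integral(ts, y, alpha - a1)
    I2 = cf_integral(ts, y, alpha - a2)
    I3 = cf_integral(ts, y, alpha - a3)
    I0 = cf_integral(ts, y, alpha)
    return [t * t + I1[k] / 2 + I2[k] / 4 + I3[k] / 6 + I0[k] ** 4 / 8
            for k, t in enumerate(ts)]

def picard_example_4_4(alpha=0.7, orders=(0.1, 0.3, 0.5), T=0.5,
                       n_grid=201, n_iter=80):
    """Run n_iter Picard sweeps starting from y_0 = t^2 and return (ts, y, x)."""
    ts = [T * i / (n_grid - 1) for i in range(n_grid)]
    y = [t * t for t in ts]
    for _ in range(n_iter):
        y = rhs_4_15(ts, y, alpha, orders)
    x = cf_integral(ts, y, alpha)   # x = CF I^alpha y
    return ts, y, x

ts, y, x = picard_example_4_4()
residual = max(abs(u - v) for u, v in zip(y, rhs_4_15(ts, y, 0.7, (0.1, 0.3, 0.5))))
```

    More sweeps are used here than in a single-order problem because the linear terms contribute a contraction factor that is not small on this interval.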

    On the same processor (q = 3), the time consumed using ADM is 0.437 seconds, while the time consumed using Picard is 16.816 seconds. Figure 4 shows a comparison between the ADM and Picard solutions of Ex. 4.4 at α=0.7, α1=0.1, α2=0.3, α3=0.5.

    Figure 4.  ADM and Picard solution of Ex. 4.4.

    The Caputo-Fabrizio fractional derivative has a nonsingular kernel, and consequently this definition is appropriate for solving nonlinear multidimensional FDEs [37,38]. Since the selected numerical problems have unknown exact solutions, formula (3.2) can be used to estimate the maximum absolute truncation error. Comparing the time taken on the same processor (i7-2670QM), the time consumed by ADM is much smaller than that of the Picard technique. Furthermore, Picard gives a more accurate solution than ADM on the same interval with the same number of terms.

    The authors declare there is no conflict of interest.



    [1] Y. Chen, C. Liu, W. Huang, S. Cheng, R. Arcucci, Z. Xiong, Generative text-guided 3d vision-language pretraining for unified medical image segmentation, preprint, arXiv: 2306.04811. https://doi.org/10.48550/arXiv.2306.04811
    [2] Z. Wan, C. Liu, M. Zhang, J. Fu, B. Wang, S. Cheng, et al., Med-unic: Unifying cross-lingual medical vision-language pre-training by diminishing bias, preprint, arXiv: 2305.19894. https://doi.org/10.48550/arXiv.2305.19894
    [3] C. Liu, S. Cheng, C. Chen, M. Qiao, W. Zhang, A. Shah, et al., M-FLAG: medical vision-language pre-training with frozen language models and latent space geometry optimization, preprint, arXiv: 2307.08347. https://doi.org/10.48550/arXiv.2307.08347
    [4] Z. Guo, K. Yu, N. Kumar, W. Wei, S. Mumtaz, M. Guizani, Deep distributed learning-based poi recommendation under mobile edge networks, IEEE Internet Things J., 10 (2023), 303–317. https://doi.org/10.1109/JIOT.2022.3202628
    [5] Y. Jin, L. Hou, Y. Chen, A time series transformer based method for the rotating machinery fault diagnosis, Neurocomputing, 494 (2022), 379–395. https://doi.org/10.1016/j.neucom.2022.04.111
    [6] Q. Li, L. Liu, Z. Guo, P. Vijayakumar, F. Taghizadeh-Hesary, K. Yu, Smart assessment and forecasting framework for healthy development index in urban cities, Cities, 131 (2022), 103971. https://doi.org/10.1016/j.cities.2022.103971
    [7] J. Zhang, X. Liu, W. Liao, X. Li, Deep-learning generation of poi data with scene images, ISPRS J. Photogramm. Remote Sens., 188 (2022), 201–219. https://doi.org/10.1016/j.isprsjprs.2022.04.004
    [8] Z. Guo, Y. Shen, S. Wan, W. Shang, K. Yu, Hybrid intelligence-driven medical image recognition for remote patient diagnosis in internet of medical things, IEEE J. Biomed. Health. Inf., 26 (2022), 5817–5828. https://doi.org/10.1109/JBHI.2021.3139541
    [9] D. Zhang, X. Gao, A digital twin dosing system for iron reverse flotation, J. Manuf. Syst., 63 (2022), 238–249. https://doi.org/10.1016/j.jmsy.2022.03.006
    [10] Z. Guo, Q. Zhang, F. Ding, X. Zhu, K. Yu, A novel fake news detection model for context of mixed languages through multiscale transformer, IEEE Trans. Comput. Social Syst., 2023. https://doi.org/10.1109/TCSS.2023.3298480
    [11] X. Sun, Y. Zou, S. Wang, H. Su, B. Guan, A parallel network utilizing local features and global representations for segmentation of surgical instruments, Int. J. Comput. Assisted Radiol. Surg., 17 (2022), 1903–1913. https://doi.org/10.1007/s11548-022-02687-z
    [12] Z. Chen, J. Chen, S. Liu, Y. Feng, S. He, E. Xu, Multi-channel calibrated transformer with shifted windows for few-shot fault diagnosis under sharp speed variation, ISA Trans., 131 (2022), 501–515. https://doi.org/10.1016/j.isatra.2022.04.043
    [13] M. Sun, L. Xu, R. Luo, Y. Lu, W. Jia, Ghformer-net: Towards more accurate small green apple/begonia fruit detection in the nighttime, J. King Saud Univ.-Comput. Inf. Sci., 34 (2022), 4421–4432. https://doi.org/10.1016/j.jksuci.2022.05.005
    [14] D. Chen, J. Zheng, G. Wei, F. Pan, Extracting predictive representations from hundreds of millions of molecules, J. Phys. Chem. Lett., 12 (2021), 10793–10801.
    [15] N. P. Tigga, S. Garg, Efficacy of novel attention-based gated recurrent units transformer for depression detection using electroencephalogram signals, Health Inf. Sci. Syst., 11 (2023). https://doi.org/10.1007/s13755-022-00205-8
    [16] B. Wang, Q. Li, Z. You, Self-supervised learning based transformer and convolution hybrid network for one-shot organ segmentation, Neurocomputing, 527 (2023). https://doi.org/10.1016/j.neucom.2022.12.028
    [17] S. Xiao, S. Wang, Z. Huang, Y. Wang, H. Jiang, Two-stream transformer network for sensor-based human activity recognition, Neurocomputing, 512 (2022), 253–268. https://doi.org/10.1016/j.neucom.2022.09.099
    [18] M. Mao, R. Zhang, H. Zheng, T. Ma, Y. Peng, E. Ding, et al., Dual-stream network for visual recognition, Adv. Neural Inf. Process. Syst., 34 (2021), 25346–25358.
    [19] R. Kozik, M. Pawlicki, M. Choraś, A new method of hybrid time window embedding with transformer-based traffic data classification in iot-networked environment, Pattern Anal. Appl., 24 (2021), 1441–1449. https://doi.org/10.1007/s10044-021-00980-2
    [20] A. A. Khan, R. Jahangir, R. Alroobaea, S. Y. Alyahyan, A. H. Almulhi, M. Alsafyani, et al., An efficient text-independent speaker identification using feature fusion and transformer model, Comput. Mater. Contin., 75 (2023), 4085–4100.
    [21] D. Li, B. Li, S. Long, H. Feng, T. Xi, S. Kang, et al., Rice seedling row detection based on morphological anchor points of rice stems, Biosyst. Eng., 226 (2023), 71–85. https://doi.org/10.1016/j.biosystemseng.2022.12.012
    [22] Y. Yang, J. Yu, H. Jiang, W. Han, J. Zhang, W. Jiang, A contrastive triplet network for automatic chest x-ray reporting, Neurocomputing, 502 (2022), 71–83. https://doi.org/10.1016/j.neucom.2022.06.063
    [23] B. Zhang, J. Abbing, A. Ghanem, D. Fer, J. Barker, R. Abukhalil, et al., Towards accurate surgical workflow recognition with convolutional networks and transformers, Comput. Methods Biomech. Biomed. Eng.: Imaging Visualization, 10 (2022), 349–356. https://doi.org/10.1080/21681163.2021.2002191
    [24] X. Pan, X. Gao, H. Wang, W. Zhang, Y. Mu, X. He, Temporal-based swin transformer network for workflow recognition of surgical video, Int. J. Comput. Assisted Radiol. Surg., 18 (2023), 139–147.
    [25] Y. J. Shin, S. B. Jeong, H. I. Seo, W. Y. Kim, D. H. Seo, A study on handwritten parcel delivery invoice understanding model, J. Adv. Mar. Eng. Technol. (JAMET), 46 (2022), 430–438. https://doi.org/10.5916/jamet.2022.46.6.430
    [26] Y. Liu, T. Bai, Y. Tian, Y. Wang, J. Wang, X. Wang, et al., Segdq: Segmentation assisted multi-object tracking with dynamic query-based transformers, Neurocomputing, 481 (2022), 91–101. https://doi.org/10.1016/j.neucom.2022.01.073
    [27] L. Tang, X. Xiang, H. Zhang, M. Gong, J. Ma, Divfusion: Darkness-free infrared and visible image fusion, Inf. Fusion, 91 (2023), 477–493. https://doi.org/10.1016/j.inffus.2022.10.034
    [28] H. Jiang, M. Gao, H. Li, R. Jin, H. Miao, J. Liu, Multi-learner based deep meta-learning for few-shot medical image classification, IEEE J. Biomed. Health Inf., 27 (2023), 17–28. https://doi.org/10.1109/JBHI.2022.3215147
    [29] M. Luo, H. Wu, H. Huang, W. He, R. He, Memory-modulated transformer network for heterogeneous face recognition, IEEE Trans. Inf. Forensics Secur., 17 (2022), 2095–2109. https://doi.org/10.1109/TIFS.2022.3177960
    [30] J. Izquierdo-Domenech, J. Linares-Pellicer, J. Orta-Lopez, Towards achieving a high degree of situational awareness and multimodal interaction with ar and semantic ai in industrial applications, Multimedia Tools Appl., 82 (2023), 15875–15901. https://doi.org/10.1007/s11042-022-13803-1
    [31] Z. Yu, Y. Shen, J. Shi, H. Zhao, Y. Cui, J. Zhang, et al., Physformer++: Facial video-based physiological measurement with slowfast temporal difference transformer, Int. J. Comput. Vision, 131 (2023), 1307–1330.
    [32] H. Ji, X. Cui, W. Ren, L. Liu, W. Wang, Visual inspection for transformer insulation defects by a patrol robot fish based on deep learning, IET Sci. Meas. Technol., 15 (2021), 606–618. https://doi.org/10.1049/smt2.12062
    [33] Y. Wu, K. Liao, J. Chen, J. Wang, D. Z. Chen, H. Gao, et al., D-former: A u-shaped dilated transformer for 3d medical image segmentation, Neural Comput. Appl., 35 (2023), 1931–1944. https://doi.org/10.1007/s00521-022-07859-1
    [34] C. Liu, Z. Mao, A. Liu, T. Zhang, B. Wang, Y. Zhang, Focus your attention: A bidirectional focal attention network for image-text matching, in Proceedings of the 27th ACM International Conference on Multimedia, ACM, (2019), 3–11. https://doi.org/10.1145/3343031.3350869
    [35] C. Liu, Z. Mao, T. Zhang, H. Xie, B. Wang, Y. Zhang, Graph structured network for image-text matching, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, (2020), 10918–10927. https://doi.org/10.1109/CVPR42600.2020.01093
    [36] H. Diao, Y. Zhang, L. Ma, H. Lu, Similarity reasoning and filtration for image-text matching, in Proceedings of the AAAI Conference on Artificial Intelligence, 35 (2021), 1218–1226. https://doi.org/10.1609/aaai.v35i2.16209
  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)