Research article

Composition operators on Hardy-Sobolev spaces with bounded reproducing kernels

  • Received: 17 July 2022 Revised: 19 September 2022 Accepted: 10 October 2022 Published: 09 November 2022
  • MSC : 47B33, 47A53

  • For any real \beta let H^{2}_{\beta} be the Hardy-Sobolev space on the unit disc \mathbb{D}. H^{2}_{\beta} is a reproducing kernel Hilbert space and its reproducing kernel is bounded when \beta > 1/2. In this paper, we prove that C_{\varphi} has dense range in H^{2}_{\beta} if and only if the polynomials are dense in a certain Dirichlet space of the domain \varphi(\mathbb{D}) for 1/2 < \beta < 1. It follows that if the range of C_{\varphi} is dense in H^{2}_{\beta}, then \varphi is a weak-star generator of H^{\infty}, although the conclusion is false for the classical Dirichlet space \mathcal{D}. Moreover, we study the relation between the density of the range of C_{\varphi} and the cyclic vectors of the multiplier M^{\beta}_{\varphi}.

    Citation: Li He. Composition operators on Hardy-Sobolev spaces with bounded reproducing kernels[J]. AIMS Mathematics, 2023, 8(2): 2708-2719. doi: 10.3934/math.2023142




    With the rapid expansion of technologies for acquiring and processing information, much real-world data exists in tensor form, e.g., electroencephalography signal data, video volume data, hyperspectral image data, color image data, and functional magnetic resonance imaging data. Enormous and intricate data are ubiquitous in real-world application scenarios; see [1,2,3]. Typical dimensionality reduction methods, such as singular value decomposition [4], local linear embedding [5,6], vector quantization [7], principal component analysis [8,9], nonnegative matrix factorization (NMF) [10,11,12], etc., extract a low-dimensional representation from high-dimensional data. A common drawback of the abovementioned low-rank approximation methods is that converting the data matrices (tensors) of the samples into one large matrix may substantially corrupt the structure of the sample data space. To preserve the internal nature of tensor data and study low-dimensional representations of high-order data, many tensor decomposition methods [13,14,15] have been developed. Kim et al. introduced nonnegative Tucker decomposition (NTD) [16,17] by combining Tucker decomposition with nonnegativity constraints on the core tensor and factor matrices. By adding nonnegativity constraints, NTD not only obtains a parts-based representation like NMF, but also improves the uniqueness of the Tucker decomposition. Recent efforts have extended NTD to boost computational efficiency and meet different demands in practical applications by incorporating suitable constraints [18,19,20], including smoothness, graph Laplacian, sparsity, orthogonality, and supervision, to name a few.

    A model that adds effective and feasible constraints to the NTD model is called an NTD-like model. For example, Liu et al. [21] presented a graph regularized Lp smooth NTD method, adding graph regularization and an Lp smoothness constraint to NTD to obtain smoother and more accurate solutions of the objective function. Qiu et al. [22] proposed a graph Laplacian-regularized NTD (GNTD) method, which extracts a low-dimensional parts-based representation while preserving the geometrical information of the high-dimensional tensor data. Subsequently, Qiu et al. [23] developed an alternating proximal gradient descent method to solve the proposed GNTD framework. Chen et al. [24] designed an adaptive graph regularized NTD model, which adaptively learns the optimal graph to capture local manifold information. Li et al. [25] proposed a manifold regularization NTD (MR-NTD), employing a manifold regularization term for the core tensor to preserve geometric information in the tensor data. Huang et al. [26] gave a dynamic hypergraph regularized NTD method by incorporating the hypergraph structure and NTD in a unified framework. Jing et al. [27] introduced a label-constrained NTD that uses partial labels to construct a label matrix, and then embedded the label term and a graph regularization term into NTD to guide the algorithm toward more accurate categories in clustering tasks. To make use of the available label information of sample data, Qiu et al. [28] built a semi-supervised NTD (SNTD) model by propagating the limited label information and learning the nonnegative tensor representation. This overview can only cover a small subset of the many important and interesting ideas that have emerged; see [29,30,31] for other related studies.

    However, the aforementioned NTD-like models do not take the orthogonality constraint into account. In fact, the orthogonality structure of the factor matrices is meaningful in practice: in [32], the equivalence between the orthogonal nonnegative matrix factorization (ONMF) problem and K-means clustering is discussed in detail. To maintain this characteristic, Pan et al. [33] developed an orthogonal NTD model by imposing orthogonality on each factor matrix; it obtains the clustering information from the factor matrices and their joint connection weights from the core tensor. The orthogonal NTD model not only helps to keep the inherent tensor structure but also performs well in data compression. Lately, drawing on the idea of approximate orthogonality from [34], Qiu et al. [35] proposed a flexible multi-way clustering model called approximately orthogonal NTD for soft clustering; this model provides extra flexibility to handle crossed memberships while making the orthogonality of the factor matrices adjustable. However, these orthogonality-based methods still have shortcomings: they pay little attention to the geometrical structure of the data space and the potential connections between samples. In clustering problems, the factor matrix along the last mode is commonly regarded as an approximation of the original data, and applying a graph regularization constraint to this factor matrix can capture more of the intrinsic manifold information of the data tensor.

    The geometrical structure of the data space and flexible orthogonality are both indispensable. To balance the two purposes, in this paper we propose two approximately orthogonal graph-regularized NTD (AOGNTD) models by jointly blending graph regularization and approximately orthogonal constraints into the NTD framework. The proposed model is essentially an NTD-like model. The main contributions of this paper are fourfold:

    ● Coupling the graph regularization term, the approximate orthogonality term, and the objective function of NTD, we construct a novel tensor-based framework. Depending on whether the approximately orthogonal constraint is also imposed on the N-th factor matrix, two models are naturally generated.

    ● By regulating the quality of approximation and picking the appropriate graph for the learning task, the models allow us to detect more complex, latent, and structural correlations among sample data.

    ● The algorithm is sensitive to its input parameters, and this choice largely determines the clustering performance. To overcome this issue, we use a grid method to select competent parameters.

    ● Numerical experiments on frequently adopted datasets are conducted to illustrate the feasibility of the proposed methods for image clustering and classification tasks. Extensive experimental results show that the AOGNTD models could significantly improve the performance.

    The rest of this paper is organized as follows. In Section 2, we review the related models. In Section 3, the two AOGNTD methods are proposed. In Section 4, we discuss their theoretical convergence and provide the solution process and computational complexity analysis. Finally, experiments on clustering tasks are presented in Section 5, and conclusions are drawn in Section 6.

    NTD can be considered a special case of NMF with sparser and multi-linear basis vectors. Given a nonnegative data tensor

    \mathcal{X}\in\mathbb{R}_{+}^{I_{1}\times I_{2}\times\cdots\times I_{N-1}\times I_{N}},

    NTD aims at decomposing the nonnegative tensor \mathcal{X} into a nonnegative core tensor

    \mathcal{G}\in\mathbb{R}_{+}^{J_{1}\times J_{2}\times\cdots\times J_{N}}

    multiplied by N nonnegative factor matrices

    {\mathbf{A}}^{(r)}\in\mathbb{R}_{+}^{I_{r}\times J_{r}}\quad(r = 1,2,\ldots,N)

    along each mode. To achieve this goal, NTD minimizes the sum of squared residuals between the data tensor \mathcal{X} and the multi-linear product of the core tensor \mathcal{G} and the factor matrices {\mathbf{A}}^{(r)}, which can be formulated as

    \min_{\mathcal{G},{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}}O_{\mathrm{NTD}} = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2},\quad s.t.\ \mathcal{G}\geqslant0,\ {\mathbf{A}}^{(r)}\geqslant0,\ r = 1,2,\ldots,N, (2.1)

    where, in what follows, the operator \times_{r} denotes the r-mode product. The r-mode product of a tensor

    \mathcal{Y}\in\mathbb{R}^{J_{1}\times J_{2}\times\cdots\times J_{N}}

    and a matrix {\mathbf{U}}\in\mathbb{R}^{I_{r}\times J_{r}}, denoted by \mathcal{Y}\times_{r}{\mathbf{U}}, is

    (\mathcal{Y}\times_{r}{\mathbf{U}})_{j_{1}\cdots j_{r-1}i_{r}j_{r+1}\cdots j_{N}} = \sum_{j_{r} = 1}^{J_{r}}y_{j_{1}\cdots j_{r-1}j_{r}j_{r+1}\cdots j_{N}}\,u_{i_{r}j_{r}}.
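    For readers who want to experiment with these operations, the following is an illustrative NumPy sketch (not the authors' MATLAB implementation) of the r-mode product and of the multi-linear product used in (2.1); the names mode_product and tucker_reconstruct are ours.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """r-mode product Y x_r U: contract the `mode`-th axis of `tensor`
    (of size J_r) with a matrix of shape (I_r, J_r); the result has
    size I_r along that axis."""
    moved = np.moveaxis(tensor, mode, 0)                 # bring the contracted mode to the front
    prod = np.tensordot(matrix, moved, axes=([1], [0]))  # shape (I_r, ...)
    return np.moveaxis(prod, 0, mode)                    # move the mode back into place

def tucker_reconstruct(core, factors):
    """Multi-linear product G x_1 A^(1) x_2 A^(2) ... x_N A^(N)."""
    out = core
    for r, A_r in enumerate(factors):
        out = mode_product(out, A_r, r)
    return out

# Small usage example: a 3-way core and random nonnegative factors.
rng = np.random.default_rng(0)
G = rng.random((3, 4, 5))
A = [rng.random((8, 3)), rng.random((9, 4)), rng.random((10, 5))]
X_hat = tucker_reconstruct(G, A)
print(X_hat.shape)   # (8, 9, 10)
```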

    Equation (2.1) can be written in matrix form:

    \min_{{\mathbf{G}}_{(N)},{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}}O_{\mathrm{NTD}} = \dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}_{(N)}-{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big)^{T}\end{Vmatrix}}^{2},\quad s.t.\ {\mathbf{G}}_{(N)}\geqslant0,\ {\mathbf{A}}^{(r)}\geqslant0,\ r = 1,2,\ldots,N,

    where

    {\mathbf{A}}^{(N)}\in\mathbb{R}_{+}^{I_{N}\times J_{N}},\quad \otimes_{p\neq N}{\mathbf{A}}^{(p)} = {\mathbf{A}}^{(N-1)}\otimes{\mathbf{A}}^{(N-2)}\otimes\cdots\otimes{\mathbf{A}}^{(1)},

    \otimes denotes the Kronecker product, and

    {\mathbf{X}}_{(N)}\in\mathbb{R}_{+}^{I_{N}\times I_{1}I_{2}\cdots I_{N-1}}\quad\text{and}\quad{\mathbf{G}}_{(N)}\in\mathbb{R}_{+}^{J_{N}\times J_{1}J_{2}\cdots J_{N-1}}

    are the mode-N unfolding matrices of the data tensor \mathcal{X} and the core tensor \mathcal{G}, respectively. Therefore, NTD can be regarded as NMF with the encoding matrix

    {\mathbf{A}}^{(N)}\in\mathbb{R}_{+}^{I_{N}\times J_{N}}

    and the basis matrix

    {\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big)^{T},

    where I_{N} and J_{N} can be regarded as the number of samples and the dimension of the low-dimensional representation of \mathcal{X}, respectively.
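    The mode-N unfolding and this NMF-like reading of NTD can be checked numerically with the helpers above. This is only a sketch; the unfolding below fixes one particular (C-order) arrangement of the remaining modes, which affects only the column ordering.

```python
def unfold(tensor, mode):
    """Mode-n unfolding: the mode-n fibers of `tensor` become the columns of
    a matrix of shape (I_n, product of the other dimensions)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# NTD as an NMF-like factorization of the mode-N unfolding:
# X_(N) ~ A^(N) * basis, the basis being the mode-N unfolding of the core
# multiplied by the factor matrices along all modes except the last one.
N = len(A) - 1                       # index of the last (sample) mode
partial = G
for r in range(N):
    partial = mode_product(partial, A[r], r)
basis = unfold(partial, N)           # shape (J_N, I_1*...*I_{N-1})
print(np.allclose(unfold(X_hat, N), A[N] @ basis))   # True
```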

    The local geometric structure can be effectively modeled through a nearest-neighbor graph on the data points, an idea originating from spectral graph theory and manifold learning. A graph over the p nearest neighbors of each sample is generated and employed. The most commonly used weight matrices are the heat kernel and 0-1 weighting. The heat kernel is

    W_{ij} = \begin{cases} e^{-\frac{\|y_{i}-y_{j}\|^{2}}{\sigma^{2}}}, & y_{j}\in N(y_{i})\ \text{or}\ y_{i}\in N(y_{j}),\\ 0, & \text{otherwise}, \end{cases}

    where N(y_{i}) is composed of the p nearest neighbors of sample y_{i}, and \sigma is the Gaussian kernel parameter that controls the similarity values; in our experiments it was set to 1. Alternatively, 0-1 weighting gives a binarized weight definition:

    W_{ij} = \begin{cases} 1, & y_{j}\in N(y_{i})\ \text{or}\ y_{i}\in N(y_{j}),\\ 0, & \text{otherwise}. \end{cases}

    This 0-1 weighting is the cosine weight method, and it is used in the subsequent image experiments.

    To measure the dissimilarity of data points z_{i}, z_{j} in the low-dimensional representation {\mathbf{A}}^{(N)}, the distance

    d(z_{i},z_{j}) = \|z_{i}-z_{j}\|^{2}

    is considered. The graph regularization term is derived as

    \dfrac{1}{2}\sum_{i,j = 1}^{n}\|z_{i}-z_{j}\|^{2}W_{ij} = \sum_{i = 1}^{n}z_{i}^{T}z_{i}\sum_{j = 1}^{n}W_{ij}-\sum_{i,j = 1}^{n}z_{i}^{T}z_{j}W_{ij} = \sum_{i = 1}^{n}z_{i}^{T}z_{i}D_{ii}-\sum_{i,j = 1}^{n}z_{i}^{T}z_{j}W_{ij} = \mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{D}}{\mathbf{A}}^{(N)})-\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{W}}{\mathbf{A}}^{(N)}) = \mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}),

    where {\mathbf{W}} is the weight matrix, {\mathbf{L}} = {\mathbf{D}}-{\mathbf{W}} is called the graph Laplacian, and

    D_{ii} = \sum_{j}W_{ij}

    defines the diagonal matrix {\mathbf{D}} whose entries are the column sums of {\mathbf{W}}. By minimizing this term we require that if y_{i} and y_{j} are close, then z_{i} and z_{j} remain close as well; if z_{i} and z_{j} are similar, the value of W_{ij} is relatively large. The trace term thus characterizes the smoothness of the low-rank representation.
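    Continuing the NumPy sketches above, the graph construction and the regularizer Tr(A^(N)T L A^(N)) can be written as follows; knn_graph and laplacian_penalty are illustrative names, and the brute-force distance computation is only meant for small sample sizes.

```python
def knn_graph(Y, p=5, sigma=1.0, binary=False):
    """Symmetric p-nearest-neighbor weight matrix for the rows of Y,
    with heat-kernel weights by default or 0-1 weights if `binary`."""
    n = Y.shape[0]
    d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)   # squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:p + 1]                       # p nearest neighbors of y_i
        W[i, nbrs] = 1.0 if binary else np.exp(-d2[i, nbrs] / sigma ** 2)
    return np.maximum(W, W.T)        # symmetrize: y_j in N(y_i) or y_i in N(y_j)

def laplacian_penalty(W, A_N):
    """Graph regularizer Tr(A^(N)^T L A^(N)) with L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(A_N.T @ L @ A_N))
```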

    GNTD is obtained by minimizing the following objective function:

    \min_{\mathcal{G},{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}}O_{\mathrm{GNTD}} = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2}+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}),\quad s.t.\ \mathcal{G}\geqslant0,\ {\mathbf{A}}^{(r)}\geqslant0,\ r = 1,2,\ldots,N,

    where \mathrm{Tr}(\cdot) denotes the trace of a matrix, {\mathbf{L}} is the graph Laplacian matrix, and \lambda is a nonnegative parameter balancing the graph regularization term against the reconstruction error term. It is worthwhile to integrate the graph regularization with the mode-N low-dimensional representation {\mathbf{A}}^{(N)}.

    In this section, the approximately orthogonal graph-regularized NTD model is proposed, the specific updating rules are derived, the convergence of the stated algorithms is proved, and their computational complexity is investigated.

    As an orthogonal version of NMF, orthogonal nonnegative matrix factorization (ONMF) has been proven to be closely related to K-means clustering. For a given nonnegative data matrix {\mathbf{X}}\in\mathbb{R}_{+}^{M\times N}, ONMF seeks nonnegative factors {\mathbf{U}}\in\mathbb{R}_{+}^{M\times R} and {\mathbf{V}}\in\mathbb{R}_{+}^{N\times R} through the following optimization:

    \min_{{\mathbf{U}}\geqslant0,{\mathbf{V}}\geqslant0}O_{\mathrm{ONMF}} = \dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}-{\mathbf{U}}{\mathbf{V}}^{T}\end{Vmatrix}}^{2},\quad s.t.\ {\mathbf{V}}^{T}{\mathbf{V}} = {\mathbf{I}},

    where

    {\mathbf{V}} = [v_{1},v_{2},\ldots,v_{R}].

    The rigid orthogonality constraint on {\mathbf{V}} actually consists of two parts: the unit-norm constraints

    v_{r}^{T}v_{r} = 1,\quad r = 1,2,\ldots,R,

    and the orthogonality (zero) constraints

    v_{r}^{T}v_{j} = 0,\quad r\neq j,

    where v_{i} denotes the i-th column of the encoding matrix {\mathbf{V}}. It is natural to generalize the orthogonality of NMF to the orthogonality of each factor matrix in NTD:

    \min_{\mathcal{G},{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}}O_{\mathrm{ONTD}} = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2},\quad s.t.\ \mathcal{G},{\mathbf{A}}^{(r)}\geqslant0,\ {{\mathbf{A}}^{(r)}}^{T}{\mathbf{A}}^{(r)} = {\mathbf{I}},\ r = 1,2,\ldots,N. (3.1)

    Directly imposing orthogonality on the factor matrices makes the model harder to solve, so we look for an approximately orthogonal formulation. The unit-norm constraints are not obligatory, and in this paper we do not enforce them. Hence, we only need to focus on the orthogonality constraints, which express the independence between columns. Under the nonnegativity constraints, we have

    {a^{(r)}_{i}}^{T}a^{(r)}_{j}\geqslant0

    for any i and j. The trace form

    \sum_{i = 1}^{R}\sum_{j = 1,j\neq i}^{R}{a^{(r)}_{i}}^{T}a^{(r)}_{j} = \mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})

    is a convenient mathematical expression of the partial, approximate orthogonality of a matrix, where

    {\mathbf{Q}} = {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}-{\mathbf{I}},

    {\mathbf{1}}_{R} is an all-one R-by-1 column vector, {\mathbf{I}} is an identity matrix, and R is assigned different values depending on the corresponding quantity. The smaller the value of this term, the closer the columns of the matrix are to being mutually orthogonal.
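    The penalty can be evaluated directly from the column Gram matrix, as in the illustrative NumPy sketch below, which reads the term as the sum of inner products between distinct columns described above; the function name is ours.

```python
def orthogonality_penalty(A_r):
    """Sum of inner products between distinct (nonnegative) columns of A_r,
    i.e. Tr(Q A_r^T A_r) with Q = 1_R 1_R^T - I and R the number of columns;
    it vanishes exactly when the columns are mutually orthogonal."""
    gram = A_r.T @ A_r
    return float(np.sum(gram) - np.trace(gram))

print(orthogonality_penalty(np.eye(4)))        # 0.0  (orthogonal columns)
print(orthogonality_penalty(np.ones((4, 4))))  # 48.0 (maximally overlapping columns)
```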

    Notice that

    {\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2} = {\begin{Vmatrix}{\mathbf{X}}_{(N)}-{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big)^{T}\end{Vmatrix}}_{F}^{2} = {\begin{Vmatrix}{\mathbf{X}}_{(N)}-{\mathbf{U}}{\mathbf{S}}{\mathbf{V}}^{T}\end{Vmatrix}}_{F}^{2}

    holds, where

    {\mathbf{U}} = {\mathbf{A}}^{(N)},\quad{\mathbf{S}} = {\mathbf{G}}_{(N)},\quad{\mathbf{V}} = \otimes_{p\neq N}{\mathbf{A}}^{(p)}.

    We see that the nonnegative tri-factorization

    {\mathbf{X}}_{(N)}\approx{\mathbf{U}}{\mathbf{S}}{\mathbf{V}}^{T}

    gives a good framework for clustering the rows and columns of {\mathbf{X}}_{(N)}. We consider the uni-sided and bi-sided approximately orthogonal cases of this tri-factorization, which naturally gives rise to two models:

    ● Uni-sided approximate orthogonality:

    {\mathbf{V}}^{T}{\mathbf{V}} = {\mathbf{I}} is weakened to \sum_{i,j,i\neq j}{\mathbf{V}}_{i}^{T}{\mathbf{V}}_{j} ({\mathbf{V}}_{i} and {\mathbf{V}}_{j} stand for different columns of {\mathbf{V}}), or simply

    \sum_{r = 1}^{N-1}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})

    for convenience of calculation, since the original expression involves a mass of Kronecker product operations and is too expensive.

    ● Bi-sided approximate orthogonality:

    {\mathbf{U}}^{T}{\mathbf{U}} = {\mathbf{I}} and {\mathbf{V}}^{T}{\mathbf{V}} = {\mathbf{I}} are weakened to \sum_{i,j,i\neq j}{\mathbf{U}}_{i}^{T}{\mathbf{U}}_{j} and \sum_{i,j,i\neq j}{\mathbf{V}}_{i}^{T}{\mathbf{V}}_{j}, or simply

    \sum_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})

    for convenience of calculation.

    It is easy to see that the uni-sided case is a special case of the bi-sided case. Motivated by this, we embed the approximately orthogonal regularization into the regular GNTD model to develop AOGNTD (or bi-AOGNTD):

    \min_{\mathcal{G},{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}}\mathrm{Obj} = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2}+\dfrac{\mu_{r}}{2}\sum_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}),\quad s.t.\ \mathcal{G}\geqslant0,\ {\mathbf{A}}^{(r)}\geqslant0,\ r = 1,2,\ldots,N, (3.2)

    where {\mathbf{L}} is the graph Laplacian matrix,

    {\mathbf{Q}} = {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}-{\mathbf{I}},

    and \mu_{r}, \lambda are positive parameters. When \mu_{N} = 0 holds, the bi-AOGNTD model degenerates into uni-AOGNTD, whose objective function is

    \min_{\mathcal{G},{\mathbf{A}}^{(1)},\ldots,{\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2}+\dfrac{\mu_{r}}{2}\sum_{r = 1}^{N-1}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}),\quad s.t.\ \mathcal{G}\geqslant0,\ {\mathbf{A}}^{(r)}\geqslant0,\ r = 1,2,\ldots,N.

    In the AOGNTD model:

    ● {\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix}}^{2} is the main part, which is the key to feature extraction.

    ● \mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}}), r = 1,\ldots,N, is used to ensure that the factor matrices are partially (approximately) orthogonal, which makes the global structure more accurate.

    ● \mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) is the basis for graph optimization, achieved by the Laplacian regularizer to preserve the smooth structure of the low-rank representation.

    We expect that the proposed AOGNTD model can capture more of the global structure and local manifold information.
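    Putting the pieces together, the value of the objective (3.2) can be evaluated as in the sketch below, which reuses the helpers defined above and, for simplicity, assumes a single orthogonality weight mu shared by all modes (an assumption of ours, matching the experimental setting mu_r = mu later in the paper).

```python
def aogntd_objective(X, core, A, W, mu, lam, uni=True):
    """Objective (3.2): 0.5*||X - G x_1 A^(1) ... x_N A^(N)||^2
    + (mu/2) * sum_r Tr(Q A^(r)T A^(r)) + (lam/2) * Tr(A^(N)T L A^(N)).
    With uni=True the orthogonality term skips the last factor (uni-AOGNTD)."""
    resid = X - tucker_reconstruct(core, A)
    value = 0.5 * np.sum(resid ** 2)
    last = len(A) - 1
    for r, A_r in enumerate(A):
        if uni and r == last:
            continue
        value += 0.5 * mu * orthogonality_penalty(A_r)
    value += 0.5 * lam * laplacian_penalty(W, A[last])
    return value
```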

    Solving for the global optimum of (3.2) is difficult, so we adopt the block coordinate descent framework, which updates the core tensor or one factor matrix at a time while fixing the others. This update rule is also known as the multiplicative update rule; it is a well-established compromise between speed and ease of implementation.

    We use the Lagrange multiplier method and work with the mode-n unfolding form. The Lagrange function is

    L = \dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}_{(n)}-{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big)^{T}\end{Vmatrix}}_{F}^{2}+\dfrac{\mu_{r}}{2}\sum_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)})+\mathrm{Tr}({\mathbf{\Phi}}_{n}{\mathbf{G}}_{(n)}^{T})+\sum_{r = 1}^{N}\mathrm{Tr}({\mathbf{\Psi}}_{r}{{\mathbf{A}}^{(r)}}^{T}), (4.1)

    where {\mathbf{\Phi}}_{n} and {\mathbf{\Psi}}_{r} are the Lagrange multiplier matrices of {\mathbf{G}}_{(n)} and {\mathbf{A}}^{(r)}, respectively. The function (4.1) can be rewritten as

    L = \dfrac{1}{2}\mathrm{Tr}({\mathbf{X}}_{(n)}{\mathbf{X}}_{(n)}^{T})-\mathrm{Tr}\big({\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}{{\mathbf{A}}^{(n)}}^{T}\big)+\dfrac{1}{2}\mathrm{Tr}\big({\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}{{\mathbf{A}}^{(n)}}^{T}\big)+\dfrac{\mu_{r}}{2}\sum_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)})+\mathrm{Tr}({\mathbf{\Phi}}_{n}{\mathbf{G}}_{(n)}^{T})+\sum_{r = 1}^{N}\mathrm{Tr}({\mathbf{\Psi}}_{r}{{\mathbf{A}}^{(r)}}^{T}).

    The overall updating procedure is divided into three parts: the solution of the factor matrices {\mathbf{A}}^{(n)} (n = 1,2,\ldots,N-1), the solution of the factor matrix {\mathbf{A}}^{(N)}, and the solution of the core tensor \mathcal{G}.

    To update {\mathbf{A}}^{(r)}, r = 1,\ldots,N, we fix \mathcal{G} and the other factor matrices.

    The partial derivatives of L in (4.1) with respect to {\mathbf{A}}^{(n)} (n = 1,2,\ldots,N-1) are

    \dfrac{\partial L}{\partial{\mathbf{A}}^{(n)}} = -{\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{Q}}{\mathbf{A}}^{(n)}+{\mathbf{\Psi}}_{n}.

    We use the Karush-Kuhn-Tucker (KKT) conditions, i.e.,

    \partial L/\partial{\mathbf{A}}^{(n)} = {\mathbf{0}}\quad\text{and}\quad{\mathbf{A}}^{(n)}\odot{\mathbf{\Psi}}_{n} = {\mathbf{0}},

    where, in what follows, \odot denotes the Hadamard product. According to \partial L/\partial{\mathbf{A}}^{(n)} = {\mathbf{0}}, we obtain

    {\mathbf{\Psi}}_{n} = {\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}-{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}-\mu_{n}{\mathbf{Q}}{\mathbf{A}}^{(n)}.

    By calculating

    {\mathbf{A}}^{(n)}\odot{\mathbf{\Psi}}_{n} = {\mathbf{A}}^{(n)}\odot\big({\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}\big)-{\mathbf{A}}^{(n)}\odot\big({\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{Q}}{\mathbf{A}}^{(n)}\big),

    which together with

    {\mathbf{A}}^{(n)}\odot{\mathbf{\Psi}}_{n} = {\mathbf{0}}

    yields the following updating rule for {\mathbf{A}}^{(n)} (n = 1,2,\ldots,N-1):

    {\mathbf{A}}^{(n)}_{ij}\leftarrow{\mathbf{A}}^{(n)}_{ij}\,\mathcal{P}_{+}\!\left[\dfrac{\big[{\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{I}}{\mathbf{A}}^{(n)}\big]_{ij}}{\big[{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(n)}\big]_{ij}}\right], (4.2)

    where

    \mathcal{P}_{+}[\eta] = \max(0,\eta).

    The partial derivative of L in (4.1) with respect to {\mathbf{A}}^{(N)} is

    \dfrac{\partial L}{\partial{\mathbf{A}}^{(N)}} = -{\mathbf{X}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\mu_{N}{\mathbf{Q}}{\mathbf{A}}^{(N)}+\lambda{\mathbf{L}}{\mathbf{A}}^{(N)}+{\mathbf{\Psi}}_{N}.

    Similarly, we consider the KKT conditions

    \partial L/\partial{\mathbf{A}}^{(N)} = {\mathbf{0}}\quad\text{and}\quad{\mathbf{A}}^{(N)}\odot{\mathbf{\Psi}}_{N} = {\mathbf{0}}.

    As a result, we obtain the following updating rule for {\mathbf{A}}^{(N)}:

    {\mathbf{A}}^{(N)}_{ij}\leftarrow{\mathbf{A}}^{(N)}_{ij}\,\mathcal{P}_{+}\!\left[\dfrac{\big[{\mathbf{X}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\mu_{N}{\mathbf{I}}{\mathbf{A}}^{(N)}+\lambda{\mathbf{W}}{\mathbf{A}}^{(N)}\big]_{ij}}{\big[{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\mu_{N}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(N)}+\lambda{\mathbf{D}}{\mathbf{A}}^{(N)}\big]_{ij}}\right]. (4.3)

    When only the unilateral (uni-sided) orthogonality of the tri-factorization is considered, the iterative format changes slightly, and the updating rule for {\mathbf{A}}^{(N)} becomes

    {\mathbf{A}}^{(N)}_{ij}\leftarrow{\mathbf{A}}^{(N)}_{ij}\,\mathcal{P}_{+}\!\left[\dfrac{\big[{\mathbf{X}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\lambda{\mathbf{W}}{\mathbf{A}}^{(N)}\big]_{ij}}{\big[{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\lambda{\mathbf{D}}{\mathbf{A}}^{(N)}\big]_{ij}}\right]. (4.4)

    To update \mathcal{G}, we fix {\mathbf{A}}^{(r)}, r = 1,\ldots,N.

    The objective function in (4.1), viewed as a function of \mathcal{G}, can be written as

    L = \dfrac{1}{2}{\begin{Vmatrix}\mathrm{vec}(\mathcal{X})-{\mathbf{F}}\,\mathrm{vec}(\mathcal{G})\end{Vmatrix}}_{2}^{2}+\mathrm{vec}(\mathcal{G})^{T}\mathrm{vec}({\mathbf{\Phi}}), (4.5)

    where

    {\mathbf{F}} = {\mathbf{A}}^{(N)}\otimes{\mathbf{A}}^{(N-1)}\otimes\cdots\otimes{\mathbf{A}}^{(1)}\in\mathbb{R}^{I_{1}I_{2}\cdots I_{N}\times J_{1}J_{2}\cdots J_{N}},

    \mathrm{vec}(\mathcal{X}) stacks the tensor \mathcal{X} into a one-dimensional vector, and \mathrm{vec}({\mathbf{\Phi}}) represents the Lagrange multiplier of \mathrm{vec}(\mathcal{G}). The partial derivative of L in (4.5) with respect to \mathrm{vec}(\mathcal{G}) is

    \dfrac{\partial L}{\partial\,\mathrm{vec}(\mathcal{G})} = {\mathbf{F}}^{T}{\mathbf{F}}\,\mathrm{vec}(\mathcal{G})-{\mathbf{F}}^{T}\mathrm{vec}(\mathcal{X})+\mathrm{vec}({\mathbf{\Phi}}).

    By applying

    \partial L/\partial\,\mathrm{vec}(\mathcal{G}) = {\mathbf{0}}\quad\text{and}\quad(\mathrm{vec}(\mathcal{G}))_{i}(\mathrm{vec}({\mathbf{\Phi}}))_{i} = 0,

    we obtain the following updating rule:

    (\mathrm{vec}(\mathcal{G}))_{i}\leftarrow(\mathrm{vec}(\mathcal{G}))_{i}\,\mathcal{P}_{+}\!\left[\dfrac{({\mathbf{F}}^{T}\mathrm{vec}(\mathcal{X}))_{i}}{({\mathbf{F}}^{T}{\mathbf{F}}\,\mathrm{vec}(\mathcal{G}))_{i}}\right]. (4.6)

    At this point, the optimization problem has been solved. According to the above iteration rules, we summarize the process of the AOGNTD method in Algorithm 1.

    Algorithm 1. Algorithm of the AOGNTD method.
    Require: Data tensor \mathcal{X}, cluster number k; parameters \lambda, \mu.
    Ensure: Core tensor \mathcal{G}, nonnegative factor matrices {\mathbf{A}}^{(r)}, r = 1,2,\ldots,N.
    1: Initialize {\mathbf{A}}^{(r)} as random matrices and \mathcal{G} as an arbitrary positive tensor.
    2: Calculate the weight matrix {\mathbf{W}}.
    3: repeat
    4:  Update {\mathbf{A}}^{(n)} by (4.2), for n = 1,2,\ldots,N-1.
    5:  Update {\mathbf{A}}^{(N)} by (4.3) or (4.4).
    6:  Update \mathcal{G} by (4.6).
    7: until the convergence condition is satisfied.
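    Below is an illustrative NumPy sketch of Algorithm 1 for the uni-AOGNTD case, built on the helpers above; it is not the authors' MATLAB code. It follows updates (4.2), (4.4), and (4.6), computing the required products through mode products and unfoldings rather than explicit Kronecker products. One interpretive choice: the orthogonality term is applied through the column Gram matrix of each factor (numerator mu*A^(n), denominator mu*A^(n)*1*1^T), which is our reading of the printed rules; the function also records the objective value of each iteration.

```python
def aogntd_uni(X, ranks, W, mu=0.1, lam=0.1, n_iter=200, eps=1e-10, seed=0):
    """Sketch of Algorithm 1 (uni-AOGNTD).  X: nonnegative data tensor,
    ranks: size of the core tensor, W: sample graph over the last mode."""
    rng = np.random.default_rng(seed)
    N = X.ndim
    A = [rng.random((X.shape[r], ranks[r])) for r in range(N)]   # step 1: random init
    core = rng.random(ranks)
    D = np.diag(W.sum(axis=1))                                   # degree matrix of W
    history = []

    for _ in range(n_iter):
        for n in range(N):
            # B = G x_{p != n} A^(p), so that X_(n) ~ A^(n) B_(n).
            B = core
            for p in range(N):
                if p != n:
                    B = mode_product(B, A[p], p)
            Bn = unfold(B, n)
            num = unfold(X, n) @ Bn.T
            den = A[n] @ (Bn @ Bn.T)
            if n < N - 1:           # rule (4.2): approximate-orthogonality term
                num += mu * A[n]
                den += mu * A[n] @ np.ones((ranks[n], ranks[n]))
            else:                   # rule (4.4): graph term on the last factor
                num += lam * W @ A[n]
                den += lam * D @ A[n]
            A[n] *= num / (den + eps)
        # Core update, rule (4.6), with mode products replacing the
        # Kronecker-product matrix F = A^(N) (x) ... (x) A^(1).
        num, den = X, core
        for r in range(N):
            num = mode_product(num, A[r].T, r)
            den = mode_product(den, A[r].T @ A[r], r)
        core *= num / (den + eps)
        history.append(aogntd_objective(X, core, A, W, mu, lam))
    return core, A, history
```

    Working with mode products keeps every intermediate array no larger than the data tensor itself, which is the same idea used in the complexity analysis below to avoid forming the Kronecker products explicitly.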

    From [11], we know that if G(u,u') is an auxiliary function for F(u), then F(u) is nonincreasing under the updating rule

    u^{t+1} = \arg\min_{u}G(u,u^{t}). (4.7)

    The equality

    F(u^{t+1}) = F(u^{t})

    holds only if u^{t} is a local minimum of G(u,u^{t}). Each subproblem has an optimal solution, which is the theoretical basis of our algorithm. The construction of suitable auxiliary functions is of central importance in this part. Before stating the local convergence theorem, we give three lemmas about the corresponding auxiliary functions.

    For any element a^{(n)}_{ij} in {\mathbf{A}}^{(n)} (n = 1,2,\ldots,N-1), let F_{ij}(a^{(n)}_{ij}) denote the part of the objective function in (3.2) relevant to a^{(n)}_{ij}. The iterative process is element-wise, so it suffices to show that each F_{ij}(a^{(n)}_{ij}) is nonincreasing under the iteration rule. The first-order derivative of F_{ij}(a^{(n)}_{ij}) with respect to a^{(n)}_{ij} is

    F'_{ij}(a^{(n)}_{ij}) = \big[-{\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(n)}\big]_{ij},

    and

    F''_{ij}(a^{(n)t}_{ij}) = \big[{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}\big]_{jj}

    is the second-order derivative of F_{ij}(a^{(n)}_{ij}) at a^{(n)t}_{ij}.

    Lemma 1. The function

    G(a^{(n)}_{ij},a^{(n)t}_{ij}) = F_{ij}(a^{(n)t}_{ij})+F'_{ij}(a^{(n)t}_{ij})(a^{(n)}_{ij}-a^{(n)t}_{ij})+\dfrac{1}{2}\dfrac{\big[{\mathbf{A}}^{(n)t}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(n)t}\big]_{ij}}{a^{(n)t}_{ij}}(a^{(n)}_{ij}-a^{(n)t}_{ij})^{2} (4.8)

    is an auxiliary function for F_{ij}(a^{(n)}_{ij}), where the matrix {\mathbf{A}}^{(n)t} = (a^{(n)t}_{ij}).

    Proof. Recall that G(u,u') is an auxiliary function for F(u) if the conditions

    G(u,u')\geqslant F(u)\quad\text{and}\quad G(u,u) = F(u)

    are satisfied. Since

    G(a^{(n)t}_{ij},a^{(n)t}_{ij}) = F_{ij}(a^{(n)t}_{ij})

    clearly holds, we only need to verify that

    G(a^{(n)}_{ij},a^{(n)t}_{ij})\geqslant F_{ij}(a^{(n)}_{ij}).

    The Taylor expansion of F_{ij}(a^{(n)}_{ij}) at a^{(n)t}_{ij} is

    F_{ij}(a^{(n)}_{ij}) = F_{ij}(a^{(n)t}_{ij})+F'_{ij}(a^{(n)t}_{ij})(a^{(n)}_{ij}-a^{(n)t}_{ij})+\dfrac{1}{2}F''_{ij}(a^{(n)t}_{ij})(a^{(n)}_{ij}-a^{(n)t}_{ij})^{2}. (4.9)

    Comparing (4.8) with (4.9), we see that

    G(a^{(n)}_{ij},a^{(n)t}_{ij})\geqslant F_{ij}(a^{(n)}_{ij})

    is equivalent to

    \dfrac{\big[{\mathbf{A}}^{(n)t}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}\big]_{ij}}{a^{(n)t}_{ij}}\geqslant\big[{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}\big]_{jj}.

    We have

    \big[{\mathbf{A}}^{(n)t}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}\big]_{ij} = \sum_{l = 1}^{J_{n}}a^{(n)t}_{il}\big[{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}\big]_{lj}\geqslant a^{(n)t}_{ij}\big[{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}\big]_{jj},

    thus the inequality

    G(a^{(n)}_{ij},a^{(n)t}_{ij})\geqslant F_{ij}(a^{(n)}_{ij})

    holds.

    Let F_{ij}(a^{(N)}_{ij}) denote the part of the objective function in (3.2) relevant to a^{(N)}_{ij} in {\mathbf{A}}^{(N)}.

    Lemma 2. The function

    G(a^{(N)}_{ij},a^{(N)t}_{ij}) = F_{ij}(a^{(N)t}_{ij})+F'_{ij}(a^{(N)t}_{ij})(a^{(N)}_{ij}-a^{(N)t}_{ij})+\dfrac{1}{2}\dfrac{\big[{\mathbf{A}}^{(N)t}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\lambda{\mathbf{D}}{\mathbf{A}}^{(N)t}+\mu_{N}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(N)t}\big]_{ij}}{a^{(N)t}_{ij}}(a^{(N)}_{ij}-a^{(N)t}_{ij})^{2} (4.10)

    is an auxiliary function for F_{ij}(a^{(N)}_{ij}).

    Lemma 3. Let g_{i} denote the i-th element of \mathrm{vec}(\mathcal{G}) and let F_{i}(g_{i}) denote the part of the objective function in (4.1) relevant to g_{i}. The function

    G(g_{i},g_{i}^{t}) = F_{i}(g_{i}^{t})+F'_{i}(g_{i}^{t})(g_{i}-g_{i}^{t})+\dfrac{1}{2}\dfrac{({\mathbf{F}}^{T}{\mathbf{F}}\,\mathrm{vec}(\mathcal{G}^{t}))_{i}}{g_{i}^{t}}(g_{i}-g_{i}^{t})^{2} (4.11)

    is an auxiliary function for F_{i}(g_{i}).

    Since the proofs of Lemmas 2 and 3 are similar to that of Lemma 1, they are omitted here.

    Theorem 4. The objective function in (4.1) is nonincreasing under the updating rules (4.2), (4.3), and (4.6). The objective function is invariant under these updates if and only if {\mathbf{A}}^{(r)}, r = 1,2,\ldots,N, and \mathcal{G} are at a stationary point.

    Proof. Replacing G(u,u') in (4.7) by (4.8), we obtain

    \dfrac{\partial G(a^{(n)}_{ij},a^{(n)t}_{ij})}{\partial a^{(n)}_{ij}} = F'_{ij}(a^{(n)t}_{ij})+\dfrac{\big[{\mathbf{A}}^{(n)t}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(n)t}\big]_{ij}}{a^{(n)t}_{ij}}(a^{(n)}_{ij}-a^{(n)t}_{ij}) = 0,

    which yields

    a^{(n)t+1}_{ij} = a^{(n)t}_{ij}\dfrac{\big[{\mathbf{X}}_{(n)}\big(\otimes_{p\neq n}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{I}}{\mathbf{A}}^{(n)t}\big]_{ij}}{\big[{\mathbf{A}}^{(n)t}{\mathbf{G}}_{(n)}\big(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(n)}^{T}+\mu_{n}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(n)t}\big]_{ij}}.

    According to Lemma 1, F_{ij}(a^{(n)}_{ij}) is nonincreasing under the updating rule (4.2) for {\mathbf{A}}^{(n)}. Then, putting G(a^{(N)}_{ij},a^{(N)t}_{ij}) of (4.10) into (4.7), we obtain

    a^{(N)t+1}_{ij} = a^{(N)t}_{ij}\dfrac{\big[{\mathbf{X}}_{(N)}\big(\otimes_{p\neq N}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\lambda{\mathbf{W}}{\mathbf{A}}^{(N)t}+\mu_{N}{\mathbf{I}}{\mathbf{A}}^{(N)t}\big]_{ij}}{\big[{\mathbf{A}}^{(N)t}{\mathbf{G}}_{(N)}\big(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{T}{\mathbf{A}}^{(p)}\big){\mathbf{G}}_{(N)}^{T}+\lambda{\mathbf{D}}{\mathbf{A}}^{(N)t}+\mu_{N}{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(N)t}\big]_{ij}}.

    From Lemma 2, F_{ij}(a^{(N)}_{ij}) is nonincreasing under the updating rule (4.3) for {\mathbf{A}}^{(N)}. Finally, putting G(g_{i},g_{i}^{t}) of (4.11) into (4.7),

    g_{i}^{t+1} = g_{i}^{t}\dfrac{({\mathbf{F}}^{T}\mathrm{vec}(\mathcal{X}))_{i}}{({\mathbf{F}}^{T}{\mathbf{F}}\,\mathrm{vec}(\mathcal{G}^{t}))_{i}}.

    In accordance with Lemma 3, Fi(gi) is nonincreasing under the updating rule (4.6).

    The proof of Theorem 4 is completed.
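    Theorem 4 can also be checked numerically with the illustrative sketches above: on a small random problem the recorded objective values should be nonincreasing (up to floating-point tolerance). This is only a sanity check of our sketch implementation, under the same interpretive choices stated there.

```python
rng = np.random.default_rng(1)
X_toy = rng.random((8, 9, 10))
W_toy = knn_graph(unfold(X_toy, 2), p=3)     # graph over the 10 samples (last mode)
_, _, hist = aogntd_uni(X_toy, ranks=(3, 4, 5), W=W_toy,
                        mu=0.1, lam=0.1, n_iter=50, seed=1)
print(all(a >= b - 1e-9 for a, b in zip(hist, hist[1:])))   # True: nonincreasing
```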

    In this subsection, we discuss the computational cost of our proposed algorithms in comparison to other methods. The common way to express the complexity of an algorithm is big O notation, but it is not precise enough to differentiate the complexities here. Thus, we additionally count the arithmetic operations for each algorithm, including fladd (a floating-point addition), flmlt (a floating-point multiplication), and fldiv (a floating-point division). Specifically, we provide the computational complexity analysis of the uni-AOGNTD and bi-AOGNTD models. There are many Kronecker product operations in our update rules, which incur a high time cost, so we replace the Kronecker products with tensor operations during the calculation.

    To facilitate the estimation of the computational complexity, we consider the case N = 3 in the models. Taking uni-AOGNTD as an example, the update rules are, respectively,

    {\mathbf{A}}^{(1)}_{ij}\leftarrow{\mathbf{A}}^{(1)}_{ij}\dfrac{\big[\big(\mathcal{X}\times_{2}{{\mathbf{A}}^{(2)}}^{T}\times_{3}{{\mathbf{A}}^{(3)}}^{T}\big)_{(1)}{\mathbf{G}}_{(1)}^{T}+\mu{\mathbf{I}}{\mathbf{A}}^{(1)}\big]_{ij}}{\big[{\mathbf{A}}^{(1)}\big(\mathcal{G}\times_{2}{{\mathbf{A}}^{(2)}}^{T}{\mathbf{A}}^{(2)}\times_{3}{{\mathbf{A}}^{(3)}}^{T}{\mathbf{A}}^{(3)}\big)_{(1)}{\mathbf{G}}_{(1)}^{T}+\mu{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(1)}\big]_{ij}},

    {\mathbf{A}}^{(2)}_{ij}\leftarrow{\mathbf{A}}^{(2)}_{ij}\dfrac{\big[\big(\mathcal{X}\times_{1}{{\mathbf{A}}^{(1)}}^{T}\times_{3}{{\mathbf{A}}^{(3)}}^{T}\big)_{(2)}{\mathbf{G}}_{(2)}^{T}+\mu{\mathbf{I}}{\mathbf{A}}^{(2)}\big]_{ij}}{\big[{\mathbf{A}}^{(2)}\big(\mathcal{G}\times_{1}{{\mathbf{A}}^{(1)}}^{T}{\mathbf{A}}^{(1)}\times_{3}{{\mathbf{A}}^{(3)}}^{T}{\mathbf{A}}^{(3)}\big)_{(2)}{\mathbf{G}}_{(2)}^{T}+\mu{\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(2)}\big]_{ij}},

    {\mathbf{A}}^{(3)}_{ij}\leftarrow{\mathbf{A}}^{(3)}_{ij}\dfrac{\big[\big(\mathcal{X}\times_{1}{{\mathbf{A}}^{(1)}}^{T}\times_{2}{{\mathbf{A}}^{(2)}}^{T}\big)_{(3)}{\mathbf{G}}_{(3)}^{T}+\lambda{\mathbf{W}}{\mathbf{A}}^{(3)}\big]_{ij}}{\big[{\mathbf{A}}^{(3)}\big(\mathcal{G}\times_{1}{{\mathbf{A}}^{(1)}}^{T}{\mathbf{A}}^{(1)}\times_{2}{{\mathbf{A}}^{(2)}}^{T}{\mathbf{A}}^{(2)}\big)_{(3)}{\mathbf{G}}_{(3)}^{T}+\lambda{\mathbf{D}}{\mathbf{A}}^{(3)}\big]_{ij}}

    and

    \mathcal{G}_{ijk}\leftarrow\mathcal{G}_{ijk}\dfrac{\big[\mathcal{X}\times_{1}{{\mathbf{A}}^{(1)}}^{T}\times_{2}{{\mathbf{A}}^{(2)}}^{T}\times_{3}{{\mathbf{A}}^{(3)}}^{T}\big]_{ijk}}{\big[\mathcal{G}\times_{1}{{\mathbf{A}}^{(1)}}^{T}{\mathbf{A}}^{(1)}\times_{2}{{\mathbf{A}}^{(2)}}^{T}{\mathbf{A}}^{(2)}\times_{3}{{\mathbf{A}}^{(3)}}^{T}{\mathbf{A}}^{(3)}\big]_{ijk}},

    where

    {\mathbf{X}}_{(1)}\big(\otimes_{p\neq1}{\mathbf{A}}^{(p)}\big) = \big(\mathcal{X}\times_{2}{{\mathbf{A}}^{(2)}}^{T}\times_{3}{{\mathbf{A}}^{(3)}}^{T}\big)_{(1)}

    and so on.

    The frequently used expressions and the specific operation counts for the two methods are provided in Table 1. By adding up the computational cost of the four update formats and using the fact that I_{i}\gg J_{i}, we obtain the complexity of the proposed uni-AOGNTD as \mathcal{O}(I_{1}I_{2}I_{3}J), in which

    J = 3(J_{1}+J_{2}+J_{3}).

    Similarly, the complexity of the bi-AOGNTD method is \mathcal{O}(I_{1}I_{2}I_{3}J). According to the analyses of the other compared methods in the corresponding literature, Table 2 summarizes the complexity of the compared algorithms. The new terms inevitably add a certain amount of computation, but it is negligible in terms of the overall computational complexity.

    Table 1.  Computational operation counts.
    Fladd Flmlt Fldiv
    \mathcal{Y}\times_{r}{\mathbf{U}} I_{r}J_{1}J_{2}\cdots J_{N} I_{r}J_{1}J_{2}\cdots J_{N}
    \mathcal{X}\times_{2}{\mathbf{A}}_{2}^{T}\times_{3}{\mathbf{A}}_{3}^{T} I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3} I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}
    \mathcal{G}\times_{2}{\mathbf{A}}_{2}^{T}{\mathbf{A}}_{2}\times_{3}{\mathbf{A}}_{3}^{T}{\mathbf{A}}_{3} J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2} J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}
    uni-AOGNTD
    {\mathbf{A}}^{(1)} I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}
    +I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+2)
    I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}
    +I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+1)+I_{1}J_{1}
    I_{1}J_{1}
    {\mathbf{A}}^{(2)} I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}
    +I_{1}J_{1}^{2}+I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+2)
    I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}
    +I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+1)+I_{2}J_{2}
    I_{2}J_{2}
    {\mathbf{A}}^{(3)} I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}
    +I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+2I_{3}J_{3}(I_{3}+2)
    I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}
    +I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+2I_{3}J_{3}(I_{3}+2)+I_{3}J_{3}
    I_{3}J_{3}
    \mathcal{G} I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}
    +J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+I_{3}J_{1}J_{2}J_{3}
    I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}
    +I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+J_{1}J_{2}J_{3}
    J_{1}J_{2}J_{3}
    bi-AOGNTD
    {\mathbf{A}}^{(1)} I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}
    +I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+2)
    I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}
    +I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+1)+I_{1}J_{1}
    I_{1}J_{1}
    {\mathbf{A}}^{(2)} I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}
    +I_{1}J_{1}^{2}+I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+2)
    I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}
    +I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+1)+I_{2}J_{2}
    I_{2}J_{2}
    {\mathbf{A}}^{(3)} I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}
    +I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+4I_{3}J_{3}(I_{3}+2)
    I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}
    +I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+4I_{3}J_{3}(I_{1}+1)+I_{3}J_{3}
    I_{3}J_{3}
    \mathcal{G} I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}
    +J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+I_{3}J_{1}J_{2}J_{3}
    I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}
    +I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+J_{1}J_{2}J_{3}
    J_{1}J_{2}J_{3}

    Table 2.  Objective functions and overall computational complexity of the compared models.
    Model Objective function Overall Supplement
    NMF \min_{{\mathbf{U}}, {\mathbf{V}}, }\dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}-{\mathbf{U}}{\mathbf{V}}\end{Vmatrix} }^{2} s.t. {\mathbf{U}}, {\mathbf{V}}\geqslant0 \mathcal{O} (MNK)
    NTD \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}
    s.t.\mathcal{G}\geqslant0,{\mathbf{A}}^{(r)}\geqslant0
    \mathcal{O} (I_{1}I_{2}I_{3}J) J=3(J_{1}+J_{2}+J_{3})
    GNTD \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}
    +\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) s.t.\mathcal{G}\geqslant0,{\mathbf{A}}^{(r)}\geqslant0
    \mathcal{O} (I_{1}I_{2}I_{3}J) J=3(J_{1}+J_{2}+J_{3})
    uni-AOGNTD \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}
    +\dfrac{\mu_{r}}{2}\sum_{r=1}^{N-1}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)})
    s.t.\mathcal{G}\geqslant0,{\mathbf{A}}^{(r)}\geqslant0
    \mathcal{O} (I_{1}I_{2}I_{3}J) J=3(J_{1}+J_{2}+J_{3})
    bi-AOGNTD \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}
    +\dfrac{\mu_{r}}{2}\sum_{r=1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)})
    s.t.\mathcal{G}\geqslant0,{\mathbf{A}}^{(r)}\geqslant0
    \mathcal{O} (I_{1}I_{2}I_{3}J) J=3(J_{1}+J_{2}+J_{3})


    In this section, following [36], two indicators, accuracy (AC) and normalized mutual information (NMI), are used to evaluate the clustering performance. The AC is defined as

    {\rm{AC}} = \dfrac{\sum_{i = 1}^{n}\delta(k_{i},{\rm{map}}(c_{i}))}{n},

    where c_{i} denotes the label produced by the clustering method, k_{i} is the true label of object x_{i}, \delta(x,y) is the indicator function that equals 1 if x = y and 0 otherwise, n is the total number of objects, and {\rm{map}}(\cdot) is the mapping function that permutes the cluster labels to best match the true labels. The NMI is defined as

    {\rm{NMI}} = \dfrac{\sum_{u = 1}^{c}\sum_{v = 1}^{k}n_{uv}\log\big(\frac{n\,n_{uv}}{n_{u}n^{'}_{v}}\big)}{\sqrt{\big(\sum_{u = 1}^{c}n_{u}\log(\frac{n_{u}}{n})\big)\big(\sum_{v = 1}^{k}n^{'}_{v}\log(\frac{n^{'}_{v}}{n})\big)}},

    where n_{uv} is the number of data points shared by the u-th true cluster and the v-th predicted cluster, and n_{u} and n^{'}_{v} are the numbers of data points in the u-th true cluster and the v-th predicted cluster, respectively.
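    A hedged Python sketch of both indicators follows (illustrative, not the authors' evaluation code): the map function in AC is realized here with the Hungarian algorithm via scipy's linear_sum_assignment, and NMI with the geometric normalization matches the square-root denominator above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(true_labels, pred_labels):
    """AC: fraction of samples whose predicted cluster, after the best
    one-to-one relabeling (the map function), equals the true label."""
    true_ids, pred_ids = np.unique(true_labels), np.unique(pred_labels)
    overlap = np.zeros((len(pred_ids), len(true_ids)))
    for i, p in enumerate(pred_ids):
        for j, t in enumerate(true_ids):
            overlap[i, j] = np.sum((pred_labels == p) & (true_labels == t))
    rows, cols = linear_sum_assignment(-overlap)          # maximize total overlap
    mapping = {pred_ids[r]: true_ids[c] for r, c in zip(rows, cols)}
    mapped = np.array([mapping.get(p, -1) for p in pred_labels])
    return float(np.mean(mapped == true_labels))

def nmi(true_labels, pred_labels):
    """NMI with the sqrt (geometric) normalization used in the formula above."""
    return normalized_mutual_info_score(true_labels, pred_labels,
                                        average_method="geometric")
```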

    All the experiments are run with MATLAB on a PC equipped with Intel i5-10210U and 8GB of RAM.

    To evaluate the effectiveness of the proposed AOGNTD methods, we compare the clustering performance of our methods with six state-of-the-art and previously mentioned methods, namely k-means [36], NMF [11], graph regularized NMF (GNMF) [36], graph dual regularized NMF (GDNMF) [37,38], NTD [17], and GNTD [22,23]. To distinguish the DNMF of the original articles from the DNMF with label information [27], we write DNMF [37,38] as GDNMF.

    Three databases are used to conduct the experiments in Figure 1, and the details of the public datasets are stated as follows (COIL20 is the Columbia Object Image Library dataset, and ORL is the Olivetti Research Laboratory face dataset):

    Figure 1.  COIL20 (left), Yale (middle), and ORL dataset (right).

    The COIL20 dataset collects 1440 grayscale images of 20 subjects. The objects were placed on a motorized turntable, which was rotated through 360 degrees to vary object pose with respect to a fixed camera. Images of the objects were taken at pose intervals of 5 degrees, which corresponds to 72 images per object. In our experiments, each image was resized to 32 \times 32 pixels and all images are stacked into a tensor \mathcal{X}\in\mathbb{R}^{32\times32\times1440} .

    The Yale faces dataset contains 165 grayscale images in GIF format of 15 individuals. There are 11 images per subject, one per different facial expression or configuration. Each image was resized into 32 \times 32 pixels. In the experiments, all images form a tensor \mathcal{X}\in\mathbb{R}^{32\times32\times165} .

    The ORL dataset collects 400 grayscale 112 \times 92 face images of 40 different subjects, with 10 distinct images per subject. For some subjects, the images were taken at different times and under varying lighting and facial expressions. Each image was resized to 32 \times 32 pixels in our experiment. These images construct a tensor \mathcal{X}\in\mathbb{R}^{32\times32\times400} .

    Identifying and adjusting the parameter combination of an algorithm has always been a challenging issue in machine learning. The size of the core tensor plays a prominent role in our algorithms. For the NTD-based methods, the size of the core tensor is set to \{10\times10\times s, 20\times20\times s, 30\times30\times s\} on the three datasets, where s is a positive integer.

    There are multiple regularization parameters, \lambda and \mu_{r}, r = 1, 2, \dots, N, balancing the effects of the reconstruction term, the graph Laplacian term, and the approximately orthogonal term in our model. To reduce the parameter tuning burden, we study the influence on the clustering performance when the parameters \lambda and \mu_{r} = \mu and the sampling ratio vary. Since it is infeasible to examine all changes of these parameters at once, we use a grid method to determine relatively optimal parameters.

    To begin, the numerical range of the two parameters is set to [5e-5, 1e+4] ; we then select t values within this range and evaluate the t^{2} combinations to find the best clustering result. For the COIL20 dataset, we set \{1e-1, 1, 1e+1, 1e+2\} as the candidate values of the parameters \lambda and \mu in the uni-AOGNTD format, where t = 4 , and choose the best among the 16 combinations. It is crucial to weigh the two measurement criteria reasonably when comparing results.

    In principle, the numerical value of AC (accuracy) is the main factor, supplemented by NMI. More details can be found in Figures 2–7, which show the specific parameter selections; the optimal parameter setting is circled in black. The symbol dataset- k , say COIL20-4, denotes k categories extracted from the corresponding dataset. During the parameter selection procedure, we found that the AOGNTD models favor strong graph regularization and weak orthogonality.

    Figure 2.  The clustering performance of the uni-AOGNTD method on the COIL20 datasets.
    Figure 3.  The clustering performance of the bi-AOGNTD method on the COIL20 datasets.
    Figure 4.  The clustering performance of the uni-AOGNTD method on the Yale datasets.
    Figure 5.  The clustering performance of the bi-AOGNTD method on the Yale datasets.
    Figure 6.  The clustering performance of the uni-AOGNTD method on the ORL datasets.
    Figure 7.  The clustering performance of the bi-AOGNTD method on the ORL datasets.

    Tables 3–8 show the average clustering results of the compared algorithms on the three datasets. The best result for each clustering number k on each dataset is marked in bold. In each experiment, we randomly select k categories as the evaluated data. We construct the graph Laplacian using p -nearest neighbors, where the neighborhood size p is set to 5. We run each experiment by applying K -means 10 times to the low-dimensional representation matrix of each method. To validate the experimental results, we repeat the above procedure 10 times and report the averages, so there are 100 runs in total.
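    This protocol can be mirrored with a short sketch (illustrative, using scikit-learn's KMeans and the metric helpers above): K-means is applied several times to the rows of the low-dimensional representation A^(N) and the scores are averaged.

```python
from sklearn.cluster import KMeans

def evaluate_representation(A_N, true_labels, k, runs=10, seed=0):
    """Average AC and NMI of K-means applied `runs` times to the rows of A^(N)."""
    accs, nmis = [], []
    for r in range(runs):
        pred = KMeans(n_clusters=k, n_init=10, random_state=seed + r).fit_predict(A_N)
        accs.append(clustering_accuracy(true_labels, pred))
        nmis.append(nmi(true_labels, pred))
    return float(np.mean(accs)), float(np.mean(nmis))
```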

    Table 3.  AC (%) of different algorithms on COIL20 dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 89.10±12.07 97.60±4.96 97.65±4.86 98.33±5.03 90.55±12.09 98.33±5.03 100.00 100.00
    4 75.86±15.01 71.68±16.99 72.47±17.14 77.31±14.95 71.42±13.90 76.29±18.45 81.94±15.35 81.46±17.55
    6 65.63±9.12 68.11±11.97 68.65±13.00 68.58±11.39 70.79±12.56 73.61±13.04 75.73±14.88 75.10±13.77
    8 66.61±10.32 64.37±10.37 65.60±10.77 65.90±8.09 69.49±8.54 71.00±10.92 74.73±10.22 73.80±10.85
    10 64.92±8.72 63.01±8.14 65.58±8.19 65.63±7.19 66.41±7.64 66.01±10.52 71.64±9.21 72.29±10.04
    12 62.38±8.40 63.84±8.26 64.16±7.52 62.73±7.78 62.37±8.93 66.59±9.24 70.55±8.57 70.21±9.82
    14 62.08±7.40 60.09±5.76 60.52±5.85 62.66±5.80 60.92±7.08 68.92±8.03 70.46±8.12 70.21±7.94
    16 59.84±5.81 62.55±5.64 63.33±5.73 63.70±6.06 61.38±6.09 69.24±7.52 70.69±7.05 70.04±7.39
    18 60.41±5.19 60.03±4.91 60.56±4.64 62.54±5.37 59.28±4.88 65.89±6.89 69.16±7.27 68.91±6.34
    20 57.49±4.71 60.29±4.49 59.22±4.70 60.01±4.64 56.90±4.03 64.47±5.75 69.35±5.20 68.79±6.28
    Avg. 66.43 67.16 67.77 68.74 66.95 72.04 75.43 75.08

    Table 4.  NMI (%) of different algorithms on COIL20 dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 65.77±30.18 91.29±17.74 91.38±17.53 94.59±16.31 72.77±31.77 94.59±16.31 100.00 100.00
    4 69.61±15.08 69.05±16.24 69.99±17.15 74.89±11.95 69.64±13.21 71.24±17.77 81.51±12.92 85.18±13.08
    6 66.49±8.37 69.69±11.15 70.68±12.54 66.91±10.45 70.39±12.97 78.68±10.49 81.21±10.79 80.28±10.74
    8 71.64±7.72 69.25±9.14 70.60±9.13 70.62±6.76 73.10±7.25 79.90±7.88 81.54±7.51 82.70±6.87
    10 73.85±6.80 70.39±7.17 72.18±7.40 72.95±5.68 72.72±6.05 76.73±10.00 81.72±5.41 83.51±6.06
    12 72.32±6.53 71.76±6.11 72.15±5.94 71.83±6.84 70.77±7.14 79.16±7.36 81.94±5.37 81.80±6.01
    14 74.36±5.58 70.62±4.84 71.11±4.53 72.82±4.20 71.16±5.16 82.12±4.70 82.89±4.27 81.45±4.77
    16 72.96±3.57 73.17±3.64 73.99±3.57 74.48±3.75 71.84±4.21 82.81±3.99 82.85±3.72 82.84±3.97
    18 74.37±3.02 71.92±2.81 72.27±2.62 74.66±3.37 71.42±2.85 81.00±3.86 82.31±4.54 82.28±3.54
    20 73.43±2.42 72.40±2.36 72.64±2.54 73.59±2.43 69.76±2.39 81.00±3.14 82.72±2.58 81.94±3.32
    Avg. 71.48 72.95 73.70 74.73 71.36 80.72 83.87 83.91

    Table 5.  AC (%) of different algorithms on Yale dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    3 62.03±10.93 59.73±9.23 59.97±9.35 62.12±9.66 60.82±11.96 61.55±12.06 67.48±12.52 66.06±13.70
    5 51.42±8.72 54.42±8.37 55.56±8.64 56.00±6.62 56.02±11.44 56.16±11.52 58.82±8.18 58.76±8.16
    7 47.58±7.31 48.91±6.24 49.03±6.05 48.56±6.94 49.45±7.26 50.83±7.78 53.26±10.38 52.74±9.37
    9 42.70±5.72 45.03±5.05 45.16±5.33 45.32±5.91 47.44±6.05 47.59±5.55 48.02±6.16 48.23±5.83
    11 40.41±4.78 41.51±4.55 41.74±4.67 43.31±4.80 41.94±4.60 43.96±4.57 44.69±4.68 44.02±4.62
    13 39.08±4.53 41.54±3.67 41.72±3.86 41.59±4.01 41.80±4.67 41.60±4.52 42.45±4.97 41.92±4.02
    15 38.43±3.71 38.64±3.80 38.73±3.68 38.88±3.10 39.25±4.00 39.94±3.64 41.02±2.73 40.36±2.97
    Avg. 45.95 47.11 47.42 47.97 48.10 48.80 50.82 50.30

    Table 6.  NMI (%) of different algorithms on Yale dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    3 34.72±15.81 30.16±13.95 30.37±14.32 38.05±14.35 31.74±17.86 32.85±16.57 39.26±17.48 38.44±19.12
    5 38.18±9.08 39.38±8.69 40.46±8.64 40.90±8.35 40.66±13.87 40.80±14.01 45.91±12.66 46.92±11.59
    7 40.82±7.59 39.88±6.68 39.99±6.64 40.44±7.54 40.79±7.32 44.21±9.39 46.67±11.39 46.14±10.51
    9 41.03±5.87 41.75±4.54 41.82±4.27 41.47±5.64 45.16±6.19 44.82±5.55 45.26±6.01 45.36±5.80
    11 41.51±4.42 41.79±4.35 42.28±4.16 43.61±4.49 42.65±4.43 43.99±4.20 44.81±3.99 44.05±4.28
    13 43.11±4.13 44.55±3.18 44.56±3.38 43.63±3.53 44.38±4.24 44.74±3.99 45.63±4.21 45.26±3.70
    15 38.43±3.71 43.70±3.19 43.78±3.20 43.85±2.54 44.58±2.96 45.53±3.05 46.06±2.59 45.68±2.58
    Avg. 39.69 40.17 40.47 41.71 41.42 42.42 44.80 44.55

    Table 7.  AC (%) of different algorithms on ORL dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 88.65±16.89 90.35±12.84 90.25±13.58 91.35±10.80 86.45±15.46 98.20±5.80 99.00±3.02 98.70±3.23
    4 73.73±14.28 82.08±13.92 82.48±14.08 80.23±16.09 71.43±12.83 83.10±16.65 83.40±16.44 83.80±16.16
    6 68.57±10.69 72.20±11.02 72.37±11.83 72.75±11.64 61.32±13.98 72.68±10.59 75.30±12.14 76.35±10.86
    8 62.29±10.23 69.91±10.22 68.49±9.91 67.98±10.06 54.05±13.90 70.48±9.09 70.55±9.87 70.88±9.39
    10 62.22±7.44 65.46±7.37 66.89±8.69 67.88±7.73 58.00±11.64 68.78±8.97 69.15±8.91 69.05±9.14
    15 57.78±5.38 64.19±6.03 64.83±6.80 65.03±7.26 56.13±11.52 65.82±6.07 65.99±6.88 65.99±6.31
    20 54.68±5.25 61.11±6.01 61.92±6.36 62.27±5.24 50.25±11.76 62.36±5.63 62.49±6.80 62.79±5.67
    25 55.38±4.56 61.10±4.23 61.70±4.82 62.21±5.44 52.83±10.18 61.91±5.63 62.05±5.14 62.20±5.68
    30 52.07±3.78 58.80±4.44 59.56±4.46 59.06±4.21 45.35±14.04 59.55±5.04 60.44±4.03 60.23±3.71
    35 51.86±3.07 57.15±3.52 57.28±3.63 58.69±3.59 49.01±10.39 57.00±4.35 59.31±3.85 58.91±3.99
    40 51.48±3.35 56.46±3.42 58.03±3.73 58.17±3.36 45.58±9.31 58.25±3.27 58.38±3.15 58.36±3.36
    Avg. 61.70 67.16 67.62 67.78 57.31 68.92 69.64 69.75

    Table 8.  NMI(%) of different algorithms on ORL dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 70.82±40.76 70.64±34.38 71.39±34.53 73.87±32.41 62.85±35.80 93.53±15.58 96.10±11.76 94.62±12.59
    4 68.42±16.36 79.91±12.97 80.06±13.49 77.86±15.48 64.81±14.60 80.18±19.54 82.13±15.94 82.46±16.15
    6 69.52±9.93 74.09±9.67 74.01±11.38 74.10±11.53 60.51±17.75 76.41±8.52 79.18±8.17 79.69±8.01
    8 65.60±10.31 74.99±8.75 73.82±8.16 72.66±9.05 55.10±17.20 75.27±7.38 75.52±7.61 75.73±7.08
    10 68.35±6.79 73.18±5.33 73.80±6.37 75.33±5.98 63.76±13.26 75.73±7.47 77.98±7.79 77.82±8.17
    15 69.53±4.25 76.00±4.69 76.38±5.10 75.83±4.99 66.67±12.51 76.64±4.86 76.64±5.09 76.93±4.93
    20 69.21±4.42 74.10±4.32 74.67±4.72 75.24±3.62 64.33±11.76 75.59±4.15 75.98±4.65 76.01±4.15
    25 71.68±3.25 75.77±2.66 76.62±3.26 76.92±3.62 68.37±9.31 77.04±3.91 77.11±3.69 77.15±3.59
    30 70.58±2.41 75.39±2.97 76.13±2.65 75.66±2.68 62.90±12.97 76.28±2.88 76.66±2.81 76.55±2.45
    35 71.17±1.94 74.71±2.11 75.35±2.09 76.02±2.11 67.46±9.66 75.39±2.77 76.86±2.18 76.48±2.30
    40 71.84±2.01 74.91±1.90 76.03±1.90 76.55±1.83 66.07±7.95 76.79±1.79 76.88±1.63 76.90±1.85
    Avg. 69.70 74.87 75.30 75.46 63.89 78.08 79.18 79.12


    From Tables 3–8, our AOGNTD algorithms present better performance on the three datasets with varying category counts. In the comparison of the data, the following phenomena can also be observed:

    (1) For the COIL20 dataset, its performance is the most outstanding. The proposed uni-AOGNTD method attains 9\% , 8.27\% , 7.66\% , 6.69\% , 8.48\% , 3.39\% improvement corresponding to AC and 12.39\% , 10.92\% , 10.17\% , 9.14\% , 12.51\% , 3.15\% improvement corresponding to NMI in comparison with K -means, NMF, GNMF, GDNMF, NTD, GNTD, respectively. In the same way, the proposed bi-AOGNTD algorithm attains 8.65\% , 7.92\% , 7.31\% , 6.34\% , 8.13\% , 3.04\% improvement corresponding to AC and 12.41\% , 10.96\% , 10.22\% , 9.18\% , 12.55\% , 3.19\% improvement corresponding to NMI in comparison with K -means, NMF, GNMF, GDNMF, NTD, GNTD, respectively.

    (2) For the Yale dataset, the uni-AOGNTD method attains 4.87\% , 3.71\% , 3.4\% , 2.85\% , 2.72\% , and 2.02\% improvement corresponding to AC and 5.11\% , 4.63\% , 4.33\% , 3.09\% , 3.38\% , and 2.38\% improvement corresponding to NMI in comparison with K -means, NMF, GNMF, GDNMF, NTD, and GNTD, respectively. The uni-AOGNTD and bi-AOGNTD models perform about equally well.

    (3) For the ORL dataset, the uni-AOGNTD method attains 7.94\% , 2.48\% , 2.02\% , 1.86\% , 12.33\% , and 0.72\% improvement corresponding to AC and 9.48\% , 4.31\% , 3.88\% , 3.72\% , 15.29\% , and 1.1\% improvement corresponding to NMI in comparison with K -means, NMF, GNMF, GDNMF, NTD, and GNTD, respectively. The clustering performance of the uni-AOGNTD and bi-AOGNTD methods is roughly the same.

    As can be seen, the uni-AOGNTD and bi-AOGNTD methods are superior to the compared methods, because they introduce and balance the approximately orthogonal regularization and the graph regularization within the unified NTD framework for tensor data representation. Under suitable parameter settings the local minimizers are confined to a better region of the solution space; intuitively, the nonnegativity of the factor matrices together with the model constraints shrinks the feasible domain around good local minima.

    The clustering results of the methods containing a graph regularization term on {\mathbf{A}}^{(N)} are further compared in Figure 8. In short, the new approaches produce the highest and most stable AC and NMI.

    Figure 8.  The clustering performance of four methods on COIL-20, Yale, and ORL datasets.

    As described in Section 3, the convergence of the proposed algorithms has been theoretically proved. In this subsection, we experimentally study the convergence of the proposed algorithms by plotting the value of the objective function against the number of iterations in Figure 9. The trend intuitively suggests that the objective function converges effectively under the multiplicative iteration rules and corroborates Theorem 4.

    Figure 9.  Convergence curves of AOGNTD on COIL-20, Yale, and ORL datasets.

    In this paper, we developed two novel AOGNTD methods for tensor data representation and provided their convergence analysis. Through two adjustable regularization parameters, the AOGNTD methods produce a more competent representation and achieve better clustering performance on publicly available real-world datasets. Although the AOGNTD methods perform well in image clustering tasks, they can be further improved in two possible directions in the future. First, the graph in our algorithm is fixed; it is worth exploring whether joint information from multiple graphs can be used or the optimal graph can be learned. In addition, a large number of NTD-based methods rely on K -means, and designing methods that produce cluster assignments without any additional clustering procedure is a main direction for further research.

    Xiang Gao: material preparation, data collection and analysis, writing the first draft, commenting; Linzhang Lu: material preparation, data collection and analysis, commenting; Qilong Liu: material preparation, data collection and analysis, commenting. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    Linzhang Lu is supported by the National Natural Science Foundation of China under Grant (12161020 and 12061025), and Guizhou Normal University Academic Talent Foundation under Grant (QSXM[2022]04). Qilong Liu is supported by Guizhou Provincial Basis Research Program (Natural Science) under Grant (QKHJC-ZK[2023]YB245).

    The authors declare that they have no conflicts of interest in this paper.



    [1] A. Abkar, G. Cao, K. Zhu, The commutant of some shift operators, Complex Anal. Oper. Th., 14 (2020), 58–69. https://doi.org/10.1007/s11785-020-01015-0
    [2] J. Akeroyd, S. Fulmer, Closed-range composition operators on weighted Bergman spaces, Integr. Equat. Oper. Th., 72 (2012), 103–114. https://doi.org/10.1007/s00020-011-1912-1
    [3] P. Bourdon, Density of the polynomials in Bergman spaces, Pac. J. Math., 130 (1987), 215–221. https://doi.org/10.2140/pjm.1987.130.215
    [4] P. Bourdon, Fredholm multiplication and composition operators on the Hardy space, Integr. Equat. Oper. Th., 13 (1990), 607–610. https://doi.org/10.1007/BF01210404
    [5] G. Cao, Composition and Toeplitz operators on general domains, Tohoku Math. J., 57 (2005), 11–22. https://doi.org/10.2748/tmj/1113234831
    [6] G. Cao, L. He, K. H. Zhu, Spectral theory of multiplication operators on Hardy-Sobolev spaces, J. Funct. Anal., 275 (2018), 1259–1279. https://doi.org/10.1016/j.jfa.2018.05.017
    [7] G. Cao, L. He, K. H. Zhu, Polynomial approximation and composition operators, Proc. Amer. Math. Soc., 149 (2021), 3715–3724. https://doi.org/10.1090/proc/14455
    [8] G. Cao, S. Sun, On composition operators on H^p-spaces in several variables, Acta Math. Sin., 13 (1997), 281–288.
    [9] J. Cima, A theorem on composition operators, Lecture Notes in Mathematics, Springer-Verlag, 1976, 21–24.
    [10] J. Cima, J. Thomson, W. Wogen, On some properties of composition operators, Indiana Univ. Math. J., 24 (1974), 215–220. https://doi.org/10.1512/iumj.1975.24.24018
    [11] C. Cowen, B. MacCluer, Composition operators on spaces of analytic functions, CRC Press, Boca Raton, Florida, 1995.
    [12] P. Duren, A. Schuster, Bergman spaces, Mathematical Surveys and Monographs, Amer. Math. Soc., 2004.
    [13] P. Galindo, T. Gamelin, M. Lindstrom, Fredholm composition operators on algebras of analytic functions on Banach spaces, J. Funct. Anal., 258 (2010), 1504–1512. https://doi.org/10.1016/j.jfa.2009.10.020
    [14] J. Garnett, Bounded analytic functions, Academic Press, 1981.
    [15] A. Kumar, Fredholm composition operators, Proc. Amer. Math. Soc., 79 (1980), 233–236. https://doi.org/10.1090/S0002-9939-1980-0565345-0
    [16] B. MacCluer, Fredholm composition operators, Proc. Amer. Math. Soc., 125 (1997), 163–166. https://doi.org/10.1090/S0002-9939-97-03743-X
    [17] S. Mergeljan, A. Talmadjan, On completeness in a class of non-Jordan regions, Amer. Math. Soc. Transl., 35 (1964), 79–94. https://doi.org/10.1090/trans2/035
    [18] O. Hatori, Fredholm composition operators on spaces of holomorphic functions, Integr. Equat. Oper. Th., 18 (1994), 202–210. https://doi.org/10.1007/BF01192459
    [19] J. Ortega, J. Fábrega, Multipliers in Hardy-Sobolev spaces, Integr. Equat. Oper. Th., 55 (2006), 535–560. https://doi.org/10.1007/s00020-005-1403-3
    [20] J. Pau, P. Perez, Composition operators acting on weighted Dirichlet spaces, J. Math. Anal. Appl., 401 (2013), 682–694. https://doi.org/10.1016/j.jmaa.2012.12.052
    [21] R. Roan, Composition operators on H^{p} with dense range, Indiana Univ. Math. J., 27 (1978), 159–162.
    [22] W. Rudin, Function theory in the unit ball of {\mathbb C}^n, Springer, New York, 1980.
    [23] D. Sarason, Weak-star generators of H^{\infty}, Pac. J. Math., 17 (1966), 519–528.
    [24] J. Shapiro, Composition operators and classical function theory, Springer Science+Business Media, LLC, 1991.
    [25] A. Shields, Weighted shift operators and analytic function theory, Math. Surv., 13 (1974), 48–128.
    [26] R. Zhao, K. Zhu, Theory of Bergman spaces on the unit ball in {\mathbb C}^n, Mem. Soc. Math., France, 2008.
    [27] K. Zhu, Operator theory in function spaces, American Mathematical Society, 2007.
    [28] K. Zhu, Ten problems, A talk at a meeting on operator theory in Jiaxin, China.
    [29] N. Zorboska, Composition operators on weighted Dirichlet spaces, Proc. Amer. Math. Soc., 126 (1998), 2013–2023.
    [30] N. Zorboska, Composition operators with closed range, Trans. Amer. Math. Soc., 344 (1994), 791–801. https://doi.org/10.1090/S0002-9947-1994-1236226-9