
Positivity analysis for mixed order sequential fractional difference operators

  • Received: 30 August 2022 Revised: 03 October 2022 Accepted: 13 October 2022 Published: 08 November 2022
  • MSC : 26A48, 26A51, 33B10, 39A12, 39B62

  • We consider the positivity of the discrete sequential fractional operators \left({}^{\mathrm{RL}}_{a_{0}+1}\nabla^{\nu_{1}}\,{}^{\mathrm{RL}}_{a_{0}}\nabla^{\nu_{2}}f\right)(\tau) defined on the set D_{1} (see (1.1) and Figure 1) and \left({}^{\mathrm{RL}}_{a_{0}+2}\nabla^{\nu_{1}}\,{}^{\mathrm{RL}}_{a_{0}}\nabla^{\nu_{2}}f\right)(\tau) of mixed order defined on the set D_{2} (see (1.2) and Figure 2) for \tau\in\mathbb{N}_{a_{0}}. By analysing the first sequential operator, we show that (\nabla f)(\tau)\geqslant 0 for each \tau\in\mathbb{N}_{a_{0}+1}. Besides, we obtain (\nabla f)(3)\geqslant 0 by analysing the second sequential operator. Furthermore, some conditions to obtain the proposed monotonicity results are summarized. Finally, two practical applications are provided to illustrate the efficiency of the main theorems.

    Citation: Pshtiwan Othman Mohammed, Dumitru Baleanu, Thabet Abdeljawad, Soubhagya Kumar Sahoo, Khadijah M. Abualnaja. Positivity analysis for mixed order sequential fractional difference operators[J]. AIMS Mathematics, 2023, 8(2): 2673-2685. doi: 10.3934/math.2023140




    As technologies for acquiring and processing information expand rapidly, much real-world data exists in tensor form, e.g., electroencephalography signal data, video volume data, hyperspectral image data, color image data, and functional magnetic resonance imaging data. Enormous and intricate data are ubiquitous in real-world application scenarios; see [1,2,3]. Typical dimension reduction methods, such as singular value decomposition [4], local linear embedding [5,6], vector quantization [7], principal component analysis [8,9], nonnegative matrix factorization (NMF) [10,11,12], etc., extract a low-dimensional representation from high-dimensional data. A commonality of the abovementioned low-rank approximation methods is that they convert the data matrices (tensors) of samples into a new large matrix, which may substantially corrupt the structure of the sample data space. To preserve the internal nature of tensor data and to study low-dimensional representations of high-order data, many tensor decomposition methods [13,14,15] have been developed. Kim et al. introduced nonnegative Tucker decomposition (NTD) [16,17] by combining Tucker decomposition with nonnegativity constraints on the core tensor and factor matrices. With these nonnegativity constraints, NTD not only obtains a parts-based representation like NMF, but also improves the uniqueness of Tucker decomposition. Recent efforts have extended NTD to boost computational efficiency and meet different demands in actual applications by incorporating suitable constraints [18,19,20], including smoothness, graph Laplacian, sparsity, orthogonality, and supervision, to name a few.

    A model that adds effective and feasible constraints to the NTD model is called an NTD-like model. For example, Liu et al. [21] proposed a graph regularized Lp smooth NTD method by adding graph regularization and an Lp smoothness constraint to NTD to obtain smooth and more accurate solutions of the objective function. Qiu et al. [22] proposed a graph Laplacian-regularized NTD (GNTD) method, which extracts a low-dimensional parts-based representation while preserving the geometrical information of the high-dimensional tensor data. Subsequently, Qiu et al. [23] developed an alternating proximal gradient descent method to solve the proposed GNTD framework. Chen et al. [24] designed an adaptive graph regularized NTD model, which adaptively learns the optimal graph to capture local manifold information. Li et al. [25] presented a manifold regularization NTD (MR-NTD) by employing a manifold regularization term for the core tensor constructed in the NTD to preserve geometric information in tensor data. Huang et al. [26] gave a dynamic hypergraph regularized NTD method by incorporating the hypergraph structure and NTD in a unified framework. Jing et al. [27] proposed a label-constrained NTD that uses partial labels to construct a label matrix; the label term and a graph regularization term are then embedded into NTD to guide the algorithm toward more correct categories in clustering tasks. To make use of the available label information of sample data, Qiu et al. [28] built a semi-supervised NTD (SNTD) model by propagating the limited label information and learning the nonnegative tensor representation. This overview can only cover a small subset of the many important and interesting ideas that have emerged; see [29,30,31] for other related studies.

    However, the aforementioned NTD-like models do not take the orthogonality constraint into account. In fact, the orthogonality structure of the factor matrices is meaningful in practice. In [32], the equivalence between the orthogonal nonnegative matrix factorization (ONMF) problem and K-means clustering is discussed in detail. To maintain this property, Pan et al. [33] developed an orthogonal NTD model by imposing orthogonality on each factor matrix; it obtains clustering information from the factor matrices and their joint connection weights from the core tensor. The orthogonal NTD model not only helps to keep the inherent tensor structure but also performs well in data compression. Lately, drawing on the idea of approximate orthogonality from [34], Qiu et al. [35] proposed a flexible multi-way clustering model called approximately orthogonal NTD for soft clustering. This model provides extra flexibility to handle crossed memberships while making the orthogonality of the factor matrices adjustable. However, these orthogonality-based methods still have shortcomings: they fail to pay attention to the geometrical structure of the data space and the potential connections between samples. In clustering problems, it is common to regard the factor matrix of the last mode as an approximation of the original data, and applying a graph regularization constraint to this factor matrix can capture more of the internal manifold information of the data tensor.

    The geometrical structure of the data space and flexible orthogonality are both indispensable. To balance the two purposes, in this paper we propose two approximately orthogonal graph regularized NTD (AOGNTD) models by jointly blending graph regularization and approximately orthogonal constraints into the NTD framework. The proposed models are essentially NTD-like models. The main contributions of this paper are fourfold:

    ● Coupling the graph regularization term, the approximately orthogonal term, and the objective function of NTD, we construct a novel tensor-based framework. Depending on whether the approximately orthogonal constraint is also imposed on the N-th factor matrix, two models are naturally generated.

    ● By regulating the quality of approximation and picking the appropriate graph for the learning task, the models allow us to detect more complex, latent, and structural correlations among sample data.

    ● The algorithm is sensitive to its parameter settings, and this choice largely determines the clustering performance. To overcome this issue, we use a grid method to select suitable parameters.

    ● Numerical experiments on frequently adopted datasets are conducted to illustrate the feasibility of the proposed methods for image clustering and classification tasks. Extensive experimental results show that the AOGNTD models significantly improve clustering performance.

    The rest of this paper is organized as follows. In Section 2, we review the related models. In Section 3, the two AOGNTD methods are proposed. In Section 4, we discuss their theoretical convergence, provide the solution process, and analyze the computational complexity. Finally, experiments for clustering tasks are presented in Section 5, and conclusions are drawn in Section 6.

    NTD can be considered a special case of NMF with sparser and multi-linear basis vectors. Given a nonnegative data tensor

    \mathcal{X}\in\mathbb{R}_{+}^{I_{1}\times I_{2}\times\cdots\times I_{N-1}\times I_{N}},

    NTD aims at decomposing the nonnegative tensor X into a nonnegative core tensor

    \mathcal{G}\in\mathbb{R}_{+}^{J_{1}\times J_{2}\times\cdots\times J_{N}}

    multiplied by N nonnegative factor matrices

    {\mathbf{A}}^{(r)}\in\mathbb{R}_{+}^{I_{r}\times J_{r}}\ \ (r = 1, 2, \ldots, N)

    along each mode. To achieve this goal, NTD minimizes the sum of squared residues between the data tensor X and the multi-linear product of core tensor G and factor matrices A(r), which can be formulated as

    \begin{equation} \begin{split} \min\limits_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}} \mathcal{O}_{NTD}& = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)} \end{Vmatrix}}^{2}, \\ s.t. \ \mathcal{G}&\geqslant0, \ \ {\mathbf{A}}^{(r)}\geqslant0, \ \ \ \ r = 1, 2, \ldots, N, \end{split} \end{equation} (2.1)

    where in the following, the operator \times_{r} is referred to as the r -mode product. The r -mode product of a tensor

    \mathcal{Y}\in\mathbb{R}^{J_{1}\times J_{2} \times {\ldots} \times {J_{N}}}

    and a matrix {\mathbf{U}}\in\mathbb{R}^{I_{r}\times J_{r}} , denoted by \mathcal{Y}\times_{r}{\mathbf{U}} , is

    (\mathcal{Y}\times_{r}{\mathbf{U}})_{j_{1}\ldots j_{r-1}i_{r}j_{r+1}\ldots j_{N}} = \sum\limits_{j_{r} = 1}^{J_{r}}y_{j_{1}\ldots j_{r-1}j_{r}j_{r+1}\ldots j_{N}} u_{i_{r}j_{r}}.
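    As a concrete illustration of this definition, the following minimal NumPy sketch computes the r -mode product; the function name mode_product and the use of np.tensordot are our own choices for illustration, not notation from the text.

```python
import numpy as np

def mode_product(T, U, r):
    """r-mode product T x_r U: contracts mode r of T (size J_r) with the
    second dimension of U (shape I_r x J_r), so mode r of the result has size I_r."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, r)), 0, r)

# example: multiply a 3-way tensor along its second mode (0-based index 1)
T = np.random.rand(4, 5, 6)
U = np.random.rand(7, 5)
print(mode_product(T, U, 1).shape)   # (4, 7, 6)
```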

    Equation (2.1) can be represented as a matrix form:

    \begin{equation*} \begin{split} \begin{split} \min\limits_{{\mathbf{G}}_{(N)}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}} \mathcal{O}_{NTD}& = \dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}_{(N)}-{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}({\otimes_{p\neq N}}{{\mathbf{A}}^{(p)}})^{\top}\end{Vmatrix}}^{2}, \\ s.t. {\mathbf{G}}_{(N)}&\geqslant0, \ \ {\mathbf{A}}^{(r)}\geqslant0, \ \ \ \ r = 1, 2, \dots, N, \end{split} \end{split} \end{equation*}

    where

    {\mathbf{A}}^{(N)}\in\mathbb{R}_{+}^{I_{N}\times J_{N}} , \ \ \ {\otimes_{p\neq N}}{{\mathbf{A}}^{(p)}} = {{\mathbf{A}}^{(N-1)}}\otimes{{\mathbf{A}}^{(N-2)}}\otimes\cdots\otimes{{\mathbf{A}}^{(1)}},

    \otimes denotes the Kronecker product, and

    {\mathbf{X}}_{(N)}\in\mathbb{R}_{+}^{I_{N}\times{I_{1}I_{2}\cdots I_{N-1}}}\; \; \; \text{and}\; \; \; {\mathbf{G}}_{(N)}\in\mathbb{R}_{+}^{J_{N}\times{J_{1}J_{2}\cdots J_{N-1}}}

    are the mode- N unfolding matrices of the data tensor \mathcal{X} and the core tensor \mathcal{G} , respectively. Therefore, NTD can be regarded as NMF with the encoding matrix

    {\mathbf{A}}^{(N)}\in\mathbb{R}_{+}^{I_{N}\times J_{N}}

    and the basis matrix

    {\mathbf{G}}_{(N)}({\otimes_{p\neq N}}{{\mathbf{A}}^{(p)}})^{\top},

    where I_{N} and J_{N} can be regarded as the number of samples and the dimension of the low-dimensional representation of \mathcal{X} , respectively.
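    The unfolded form above can also be checked numerically. The sketch below is our own illustration: the unfold helper uses the ordering in which the first remaining mode varies fastest, and the Kronecker factor ordering follows the definition of {\otimes_{p\neq N}}{\mathbf{A}}^{(p)} given above; under these assumptions, the mode- N unfolding of the Tucker product coincides with {\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}({\otimes_{p\neq N}}{\mathbf{A}}^{(p)})^{\top} on a small random example.

```python
import numpy as np

def unfold(T, n):
    # mode-n unfolding; columns ordered with the first remaining mode varying fastest
    return np.reshape(np.moveaxis(T, n, 0), (T.shape[n], -1), order='F')

def mode_product(T, U, r):
    # r-mode product T x_r U
    return np.moveaxis(np.tensordot(U, T, axes=(1, r)), 0, r)

I, J = (4, 5, 6), (2, 3, 2)                        # N = 3, small sizes
G = np.random.rand(*J)
A = [np.random.rand(I[r], J[r]) for r in range(3)]

X = G
for r in range(3):                                 # X = G x_1 A^(1) x_2 A^(2) x_3 A^(3)
    X = mode_product(X, A[r], r)

lhs = unfold(X, 2)                                 # X_(N)
rhs = A[2] @ unfold(G, 2) @ np.kron(A[1], A[0]).T  # A^(N) G_(N) (A^(2) kron A^(1))^T
print(np.allclose(lhs, rhs))                       # True
```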

    The local geometric structure can be effectively modeled through a nearest neighbor graph on data points, which originates from spectral graph theory and manifold learning. A p-nearest-neighbor graph is generated and employed, and the most commonly used weighting schemes are the heat kernel and 0-1 weighting. The heat kernel is

    \begin{align*} \begin{split} {\mathbf{W}}_{ij} = \left \{ \begin{array}{ll} e^{-\dfrac{\begin{Vmatrix}{\mathbf{y}}_{i}-{\mathbf{y}}_{j}\end{Vmatrix}^{2}}{\sigma^{2}}}, & {\mathbf{y}}_{j}\in \mathcal{N}({\mathbf{y}}_{i})\ or\ {\mathbf{y}}_{i}\in \mathcal{N} ({\mathbf{y}}_{j}), \\ 0, & \mathrm{otherwise} , \\ \end{array} \right. \end{split} \end{align*}

    where \mathcal{N}({\mathbf{y}}_{i}) is composed of the p -nearest neighbors of sample {\mathbf{y}}_{i} , and \sigma is the Gaussian kernel parameter controlling the similarity values; in our experiments it is set to 1. Alternatively, 0-1 weighting gives a binary weight definition:

    \begin{align*} \begin{split} {\mathbf{W}}_{ij} = \left \{ \begin{array}{ll} 1, & {\mathbf{y}}_{j}\in \mathcal{N}({\mathbf{y}}_{i})\ or\ {\mathbf{y}}_{i}\in \mathcal{N} ({\mathbf{y}}_{j}), \\ 0, & \mathrm{otherwise}.\\ \end{array} \right. \end{split} \end{align*}

    This 0-1 weighting is the scheme used in the subsequent image experiments.

    To measure the dissimilarity of data points {\mathbf{z}}_{i}, {\mathbf{z}}_{j} in the low-dimensional representation {\mathbf{A}}^{(N)} ,

    d({\mathbf{z}}_{i}, {\mathbf{z}}_{j}) = ||{\mathbf{z}}_{i}-{\mathbf{z}}_{j}||^{2}

    is under consideration. The graph regularization term is defined as

    \begin{equation*} \begin{split} \frac{1}{2}\sum\limits_{i, j = 1}^{n}||{\mathbf{z}}_{i}-{\mathbf{z}}_{j}||^{2}{\mathbf{W}}_{ij}& = \sum\limits_{i = 1}^{n}{\mathbf{z}}_{i}^{\top}{\mathbf{z}}_{i}\sum\limits_{j = 1}^{n}{\mathbf{W}}_{ij}-\sum\limits_{i, j = 1}^{n}{\mathbf{z}}_{i}^{\top}{\mathbf{z}}_{j}{\mathbf{W}}_{ij}\\& = \sum\limits_{i = 1}^{n}{\mathbf{z}}_{i}^{\top}{\mathbf{z}}_{i}{\mathbf{D}}_{ii}-\sum\limits_{i, j = 1}^{n}{\mathbf{z}}_{i}^{\top}{\mathbf{z}}_{j}{\mathbf{W}}_{ij}\\& = \mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{D}}{\mathbf{A}}^{(N)})-\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{W}}{\mathbf{A}}^{(N)}) \\& = \mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) , \end{split} \end{equation*}

    where {\mathbf{W}} is a weight matrix, {\mathbf{L}} = {\mathbf{D}}-{\mathbf{W}} is called the graph Laplacian, and

    {\mathbf{D}}_{ii} = \sum\limits_{j}{\mathbf{W}}_{ij}

    is a diagonal matrix whose entries are the column sums of {\mathbf{W}} . Minimizing the above quantity encourages {\mathbf{z}}_{i} and {\mathbf{z}}_{j} to stay close whenever {\mathbf{y}}_{i} and {\mathbf{y}}_{j} are close: if {\mathbf{y}}_{i} and {\mathbf{y}}_{j} are similar, the value of {\mathbf{W}}_{ij} is relatively large, so a large distance between {\mathbf{z}}_{i} and {\mathbf{z}}_{j} is heavily penalized. The trace term thus characterizes the smoothness of the low-rank representation.
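    As a quick numerical check, the following sketch (with hypothetical sample data Y and representation Z ; not part of the original derivation) builds the symmetrised p -nearest-neighbour heat-kernel weights, forms {\mathbf{L}} = {\mathbf{D}}-{\mathbf{W}} , and verifies the trace identity above.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.random((50, 8))        # 50 samples y_i (hypothetical data)
Z = rng.random((50, 4))        # a low-dimensional representation, rows z_i
p, sigma = 5, 1.0

# heat-kernel weights over p nearest neighbours, symmetrised (y_j in N(y_i) or y_i in N(y_j))
dist2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)
nn = np.argsort(dist2, axis=1)[:, 1:p + 1]          # p nearest neighbours, excluding the point itself
W = np.zeros_like(dist2)
rows = np.repeat(np.arange(Y.shape[0]), p)
W[rows, nn.ravel()] = np.exp(-dist2[rows, nn.ravel()] / sigma ** 2)
W = np.maximum(W, W.T)

D = np.diag(W.sum(axis=1))
L = D - W

lhs = 0.5 * np.sum(W * np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=2))
rhs = np.trace(Z.T @ L @ Z)
print(np.allclose(lhs, rhs))   # True: the regularizer equals Tr(Z^T L Z)
```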

    GNTD is obtained by minimizing the following objective function:

    \begin{equation*} \begin{split} \min\limits_{{\mathbf{G}}_{(N)}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}} \mathcal{O}_{GNTD}& = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\ldots\times_{N}{\mathbf{A}}^{(N)} \end{Vmatrix}}^{2} +\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}), \\ s.t. \mathcal{G}&\geqslant0, \ \ \ {\mathbf{A}}^{(r)}\geqslant0, \ \ \ \ r = 1, 2, \ldots, N, \end{split} \end{equation*}

    where \mathrm{Tr} ( \cdot ) represents the trace of a matrix, {\mathbf{L}} is called the graph Laplacian matrix, and \lambda is a nonnegative parameter for balancing the importance of a graph regularization term and reconstruction error term. It is worthwhile to integrate the graph regularization into mode- N low-dimensional representation {\mathbf{A}}^{(N)} .

    In this section, the approximately orthogonal NTD with graph regularization (AOGNTD) models are proposed, the specific updating rules are introduced, the convergence of the stated algorithms is proved, and their computational complexity is investigated.

    As an orthogonal version of NMF, an orthogonal nonnegative matrix factorization has been proven to be closely related to k -means clustering. For a given nonnegative data matrix {\mathbf{X}}\in\mathbb{R}_{+}^{M\times N} , the ONMF seeks nonnegative factors {\mathbf{U}}\in\mathbb{R}_{+}^{M\times R} and {\mathbf{V}}\in\mathbb{R}_{+}^{N\times R} through the following optimization:

    \begin{equation*} \begin{split} \min\limits_{{\mathbf{U}}\geqslant0, {\mathbf{V}}\geqslant0} \mathcal{O}_{ONMF}& = \dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}-{\mathbf{U}}{\mathbf{V}}^{\top}\end{Vmatrix}}^{2} , \\ s.t. {\mathbf{V}}^{\top}{\mathbf{V}}& = {\mathbf{I}}, \end{split} \end{equation*}

    where

    {\mathbf{V}} = [v_{1}, v_{2}, \dots, v_{R}].

    The rigid orthogonality constraint of {\mathbf{V}} actually consists of two parts: the unit-norm constraints

    v_{r}^{T}v_{r} = 1

    and the orthogonality (zero) constraints

    v_{r}^{T}v_{j} = 0, \ \ r\neq j,

    where v_{i} denotes the i -th column of the encoding matrix {\mathbf{V}} . It is reasonable to generalize the orthogonality of NMF to the orthogonality of each directional matrix in NTD:

    \begin{equation} \begin{split} \min\limits_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}} \mathcal{O}_{ONTD}& = \dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)} \end{Vmatrix}}^{2}, \\ s.t. \mathcal{G}, {\mathbf{A}}^{(r)}&\geqslant0, \ \ \ {{\mathbf{A}}^{(r)}}^{\top}{\mathbf{A}}^{(r)} = {\mathbf{I}}, \ \ \ \ r = 1, 2, \dots, N. \end{split} \end{equation} (3.1)

    Directly enforcing exact orthogonality on the factor matrices makes the model harder to solve, so we look for an approximately orthogonal form. The unit-norm constraints are not obligatory, and we do not impose them in this paper. Hence, we only need to focus on the orthogonality (zero) constraints, which express the independence between columns. Under the non-negativity constraints, we have

    {{\mathbf{a}}^{(r)}_{i}}^{\top}{\mathbf{a}}^{(r)}_{j}\geqslant0

    for any i and j . The form of trace

    \sum\limits_{i = 1}^{R}\sum\limits_{j = 1, j\neq i}^{R}{{\mathbf{a}}^{(r)}_{i}}^{\top}{\mathbf{a}}^{(r)}_{j} = \mathrm{Tr}({{\mathbf{A}}^{(r)}}^{\top}{\mathbf{Q}}{\mathbf{A}}^{(r)})

    is a convenient mathematical expression of the partial, approximate orthogonality of a matrix, where

    {\mathbf{Q}} = {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}-{\mathbf{I}},

    {\mathbf{1}}_{R} is an all-ones R -by-1 column vector, {\mathbf{I}} is an identity matrix, and R takes different values depending on the context. The smaller the value of this term, the closer the matrix is to being orthogonal.

    Notice that

    \begin{equation*} \begin{split} \begin{split} &{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)} \end{Vmatrix}}^{2} \\ & = {\begin{Vmatrix}{\mathbf{X}}_{(N)}-{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}({\otimes_{p\neq N}}{{\mathbf{A}}^{(p)}})^{\top}\end{Vmatrix}}_{F}^{2} \\ & = {\begin{Vmatrix}{\mathbf{X}}_{(N)}-{\mathbf{U}}{\mathbf{S}}{\mathbf{V}}^{\top} \end{Vmatrix}}_{F}^{2} \\ \end{split} \end{split} \end{equation*}

    holds, where

    {\mathbf{U}} = {\mathbf{A}}^{(N)}, \ \ \ {\mathbf{S}} = {\mathbf{G}}_{(N)}, \ \ \ {\mathbf{V}} = {\otimes_{p\neq N}}{{\mathbf{A}}^{(p)}}.

    We see that the nonnegative 3-factorization:

    {\mathbf{X}}_{(N)}\approx{\mathbf{U}}{\mathbf{S}}{\mathbf{V}}^{\top}

    gives a good framework for clustering the rows and columns of {\mathbf{X}}_{(N)} . We consider the uni-sided and bi-sided approximately orthogonal cases of this tri-factorization, which naturally yields two models:

    ● Uni-sided approximately orthogonality:

    {\mathbf{V}}^{\top}{\mathbf{V}} = {\mathbf{I}} is weakened to penalizing \sum_{i, j, i\neq j}{\mathbf{V}}_{i}^{T}{\mathbf{V}}_{j} ( {\mathbf{V}}_{i} and {\mathbf{V}}_{j} denote different columns of {\mathbf{V}} ), or simply

    \sum\limits_{r = 1}^{N-1}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})

    for convenience of calculation. The original expression involves a large number of Kronecker product operations, which is too expensive.

    ● Bi-sided approximately orthogonality:

    {\mathbf{U}}^{\top}{\mathbf{U}} = {\mathbf{I}} and {\mathbf{V}}^{\top}{\mathbf{V}} = {\mathbf{I}} are weakened to penalizing \sum_{i, j, i\neq j}{\mathbf{U}}_{i}^{T}{\mathbf{U}}_{j} and \sum_{i, j, i\neq j}{\mathbf{V}}_{i}^{T}{\mathbf{V}}_{j} , or simply

    \sum\limits_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})

    for convenience of calculation.

    It is easy to see that the uni-sided case is a special case of the bi-sided case. Motivated by this, we embed the approximately orthogonal regularization into the regular GNTD to develop AOGNTD (or bi-AOGNTD):

    \begin{equation} \begin{split} \min\limits_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}} Obj = &\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)} \end{Vmatrix}}^{2}\\&+\dfrac{\mu_{r}}{2}\sum\limits_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}), \\ s.t. \mathcal{G}\geqslant&0, \ \ {\mathbf{A}}^{(r)}\geqslant0, \ \ \ r = 1, 2, \dots, N, \end{split} \end{equation} (3.2)

    where {\mathbf{L}} is the graph Laplacian matrix,

    {\mathbf{Q}} = {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}-{\mathbf{I}},

    and \mu_{r} and \lambda are positive regularization parameters. When \mu_{N} = 0 holds, the bi-AOGNTD model degenerates into uni-AOGNTD, whose objective function is:

    \begin{equation*} \begin{split} \min\limits_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}} \frac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)} \end{Vmatrix}}^{2} +\dfrac{\mu_{r}}{2}\sum\limits_{r = 1}^{N-1}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}), \\ s.t. \ \mathcal{G}\geqslant0, \ \ {\mathbf{A}}^{(r)}\geqslant0, \ \ \ r = 1, 2, \dots, N. \end{split} \end{equation*}

    In the AOGNTD model,

    {\|\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\ldots\times_{N}{\mathbf{A}}^{(N)} \|}^{2} is the main part, which is the key to feature extraction.

    \mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}}) is used to ensure that the factor matrices (the first N-1 of them in uni-AOGNTD, and all N in bi-AOGNTD) are approximately orthogonal, which makes the global structure more accurate.

    \mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) is the graph regularization term, realized by the Laplacian regularizer, which preserves the smooth structure of the low-rank representation.

    We expect that the proposed AOGNTD model can capture more global structure and local manifold information.

    Finding the global optimal solution of (3.2) is difficult, so we adopt a block coordinate descent framework that updates the core tensor or one factor matrix at a time while fixing the others. The resulting update rules are multiplicative update rules, a well-validated compromise between speed and ease of implementation.

    We use the Lagrange multiplier method and apply the mode- n unfolding form. The Lagrange function is

    \begin{equation} \begin{split} \mathcal{L} = &\dfrac{1}{2}\begin{Vmatrix} {\mathbf{X}}_{(n)}-{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}({\otimes_{p\neq n}}{\mathbf{A}}^{(p)})^{\top} \end{Vmatrix}^{2}_{F}+\dfrac{\mu_{r}}{2}\sum\limits_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})\\ &+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) +\mathrm{Tr}( {\varPhi}_{n}{\mathbf{G}}_{(n)}^{T})+\sum\limits_{r = 1}^{N}\mathrm{Tr}( {\varPsi}_{r}{{\mathbf{A}}^{(r)}}^{\top}), \end{split} \end{equation} (4.1)

    where {\varPhi}_{n} and {\varPsi}_{r} are the Lagrange multipliers matrices of {\mathbf{G}}_{(n)} and {\mathbf{A}}^{(r)} , respectively. The function (4.1) can be rewritten as

    \begin{equation*} \begin{split} \mathcal{L} = &\dfrac{1}{2}\mathrm{Tr}({\mathbf{X}}_{(n)}{\mathbf{X}}_{(n)}^{\top})-\mathrm{Tr}({\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}{{\mathbf{A}}^{(n)}}^{\top})+\dfrac{1}{2}\mathrm{Tr}({\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{T}{{\mathbf{A}}^{(n)}}^{\top})\\&+\dfrac{\mu_{r}}{2}\sum\limits_{r = 1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)})+\mathrm{Tr}( {\varPhi}_{n}{\mathbf{G}}_{(n)}^{\top})+\sum\limits_{r = 1}^{N} \mathrm{Tr}( {\varPsi}_{r}{{\mathbf{A}}^{(r)}}^{\top}). \end{split} \end{equation*}

    The entire updating procedure is divided into three parts: the updates of the factor matrices {\mathbf{A}}^{(n)} (n = 1, 2, \ldots, N-1) , the update of the factor matrix {\mathbf{A}}^{(N)} , and the update of the core tensor \mathcal{G} .

    To update {\mathbf{A}}^{(r)} , r = 1, \ldots, N , we fix \mathcal{G} and the other factor matrices.

    The partial derivatives of \mathcal{L} in (4.1) with respect to {\mathbf{A}}^{(n)}, (n = 1, 2, \ldots, N-1) are

    \begin{equation*} \begin{split} \frac{\partial\mathcal{L}}{\partial{{\mathbf{A}}^{(n)}}} = -{\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{Q}}{{\mathbf{A}}}^{(n)}+ {\varPsi}_{n}. \end{split} \end{equation*}

    By using the Karush-Kuhn-Tucker (KKT) condition, i.e.,

    {\partial\mathcal{L}}/{\partial{{\mathbf{A}}^{(n)}}} = 0\; \; \; \text{and}\; \; \; {\mathbf{A}}^{(n)} \odot {\varPsi}_{n} = 0 ,

    where in the following, \odot denotes the Hadamard product. According to {\partial\mathcal{L}}/{\partial{{\mathbf{A}}^{(n)}}} = 0 , we obtain

    \begin{equation*} \begin{split} {\varPsi}_{n} = {\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}-{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}-\mu_{n} {\mathbf{Q}}{{\mathbf{A}}}^{(n)}. \end{split} \end{equation*}

    By calculating

    \begin{equation*} \begin{split} {\mathbf{A}}^{(n)} \odot {\varPsi}_{n} = {\mathbf{A}}^{(n)} \odot ({\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top})-{\mathbf{A}}^{(n)} \odot ({\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{Q}}{{\mathbf{A}}}^{(n)}), \end{split} \end{equation*}

    which together with

    {\mathbf{A}}^{(n)} \odot {\varPsi}_{n} = 0

    yields the following updating rule for {\mathbf{A}}^{(n)}(n = 1, 2, \ldots, N-1) :

    \begin{equation} {\mathbf{A}}_{ij}^{(n)} \leftarrow {\mathbf{A}}_{ij}^{(n)} \frac{P_{+}[[{\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{I}}{\mathbf{A}}^{(n)} ]_{ij}]}{[{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T} {\mathbf{A}}^{(n)}]_{ij}}, \end{equation} (4.2)

    where

    P_{+}[\eta] = \max(0, \eta).

    The partial derivative of \mathcal{L} in (4.1) with respect to {\mathbf{A}}^{(N)} is

    \begin{equation*} \begin{split} \frac{\partial\mathcal{L}}{\partial{{\mathbf{A}}^{(N)}}} = &-{\mathbf{X}}_{(N)}(\otimes_{p\neq N}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}\\&+\mu_{N} {\mathbf{Q}}{\mathbf{A}}^{(N)}+\lambda {\mathbf{L}}{\mathbf{A}}^{(N)}+ {\varPsi}_{N}. \end{split} \end{equation*}

    Similarly, we consider the KKT condition

    {\partial\mathcal{L}}/{\partial{{\mathbf{A}}^{(N)}}} = 0\; \; \; \text{and}\; \; \; {\mathbf{A}}^{(N)} \odot {\varPsi}_{N} = 0.

    As a result, we obtain the following updating rule for {\mathbf{A}}^{(N)} :

    \begin{equation} {\mathbf{A}}_{ij}^{(N)} \leftarrow {\mathbf{A}}_{ij}^{(N)} \frac{P_{+}[[{\mathbf{X}}_{(N)}(\otimes_{p\neq N}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\mu_{N} {\mathbf{I}}{\mathbf{A}}^{(N)}+\lambda {\mathbf{W}}{\mathbf{A}}^{(N)}]_{ij}]}{[{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\mu_{N} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(N)}+\lambda {\mathbf{D}}{\mathbf{A}}^{(N)}]_{ij}}. \end{equation} (4.3)

    When only the uni-sided case of the tri-factorization is considered, the iterative scheme changes slightly, and the updating rule for {\mathbf{A}}^{(N)} becomes

    \begin{equation} {\mathbf{A}}_{ij}^{(N)} \leftarrow {\mathbf{A}}_{ij}^{(N)} \frac{P_{+}[[{\mathbf{X}}_{(N)}(\otimes_{p\neq N}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\lambda {\mathbf{W}}{\mathbf{A}}^{(N)}]_{ij}]}{[{\mathbf{A}}^{(N)}{\mathbf{G}}_{(N)}(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\lambda {\mathbf{D}}{\mathbf{A}}^{(N)}]_{ij}}. \end{equation} (4.4)

    To update \mathcal{G} , we fix {\mathbf{A}}^{(r)} , r = 1, \ldots, N .

    The objective function in (4.1) about \mathcal{G} can be changed into

    \begin{equation} \mathcal{L} = \frac{1}{2}\begin{Vmatrix} vec(\mathcal{X})-{\mathbf{F}}vec(\mathcal{G}) \end{Vmatrix}^{2}_{2}+vec(\mathcal{G})^{T}vec(\varPhi), \end{equation} (4.5)

    where

    {\mathbf{F}} = {{\mathbf{A}}^{(N)}}\otimes{{\mathbf{A}}^{(N-1)}}\otimes\ldots\otimes{{\mathbf{A}}^{(1)}} \in \mathbb{R}^{I_{1}I_{2}\ldots I_{N}\times{J_{1}J_{2}\ldots J_{N}}},

    vec(\mathcal{X}) expands the tensor \mathcal{X} into a 1-dimension vector and vec({\varPhi}) represents the Lagrange multiplier of vec(\mathcal{G}) . The partial derivative of \mathcal{L} in (4.5) with respect to vec(\mathcal{G}) is

    \begin{equation*} \frac{\partial\mathcal{L}}{\partial{vec(\mathcal{G})}} = {\mathbf{F}}^{\top}{\mathbf{F}}vec(\mathcal{G})-{\mathbf{F}}^{\top}vec(\mathcal{X})+vec( {\varPhi}). \end{equation*}

    By applying

    {\partial\mathcal{L}}/{\partial{vec(\mathcal{G})}} = 0 \; \; \; \text{and}\; \; \; (vec(\mathcal{G}))_{i}(vec( {\varPhi}))_{i} = 0,

    we obtain the following updating rule:

    \begin{equation} (vec(\mathcal{G}))_{i} \leftarrow (vec(\mathcal{G}))_{i} \frac{P_{+}[({\mathbf{F}}^{\top}vec(\mathcal{X}))_{i}]}{({\mathbf{F}}^{\top}{\mathbf{F}}vec(\mathcal{G}))_{i}}. \end{equation} (4.6)

    At this point, the optimization problem has been solved. Based on the above iteration rules, the AOGNTD procedure is summarized in Algorithm 1.

    Algorithm 1. Algorithm of the AOGNTD method.
    Require: Data tensor \mathcal{X} , cluster number k ; parameters \lambda , \mu .
    Ensure: Core tensor \mathcal{G} , nonnegative factor matrices \textbf{A}^{(r)}, r = 1, 2, \dots, N .
    1: Initialize \textbf{A}^{(r)} as random matrices and \mathcal{G} as an arbitrary positive tensor.
    2: Calculate the weight matrix \textbf{W} .
    3: repeat
    4:  Update \textbf{A}^{(n)}, by (4.2), where n = 1, 2, \dots, N-1 .
    5:  Update \textbf{A}^{(N)} by (4.3) or (4.4).
    6:  Update \mathcal{G} by (4.6).
    7: until the algorithm convergence condition is satisfied.

    From [11], we know that if G(u, u^{'}) is an auxiliary function for F(u) , then F(u) is nonincreasing under the updating rule

    \begin{equation} u^{t+1} = \underset{u}{\mathrm{argmin}} G(u, u^{t}). \end{equation} (4.7)

    The equality

    F(u^{t+1}) = F(u^{t})

    holds only if u^{t} is a local minimum of G(u, u^{t}) . Each subproblem has an optimal solution, which is the theoretical foundation of our algorithm. The construction and design of suitable auxiliary functions are therefore of central importance in this part. Before stating the local convergence theorem, we give three lemmas about the corresponding auxiliary functions.

    For any element a_{ij}^{(n)} in {\mathbf{A}}^{(n)} (n = 1, 2, \dots, N-1) , let F_{ij}(a_{ij}^{(n)}) denote the part of the objective function in (3.2) relevant to a_{ij}^{(n)} . The iterative process is element-wise, so it suffices to show that each F_{ij}(a_{ij}^{(n)}) is nonincreasing under the iteration rule. The first-order derivative of F_{ij}(a_{ij}^{(n)}) with respect to a_{ij}^{(n)} is

    \begin{equation*} \begin{split} F_{ij}^{'}(a_{ij}^{(n)}) = [-{\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+{\mathbf{A}}^{(n)}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T} {{\mathbf{A}}^{(n)}}]_{ij}, \end{split} \end{equation*}

    and

    \begin{equation*} F_{ij}^{''}({a^{(n)}_{ij}}^{t}) = [{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}]_{jj} \end{equation*}

    is the second-order derivative of F_{ij}(a_{ij}^{(n)}) evaluated at {a_{ij}^{(n)}}^{t} .

    Lemma 1. The function

    \begin{equation} \begin{split} G(a_{ij}^{(n)}, {a_{ij}^{(n)}}^{t}) = &F_{ij}({a_{ij}^{(n)}}^{t})+F_{ij}^{'}({a_{ij}^{(n)}}^{t})(a_{ij}^{(n)}-{a_{ij}^{(n)}}^{t})\\&+\frac{1}{2}\frac{[{{\mathbf{A}}^{(n)}}^{t}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top} +\mu_{n} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T} {{\mathbf{A}}^{(n)}}^{t}]_{ij}}{{a^{(n)}_{ij}}^{t}} \times(a_{ij}^{(n)}-{a_{ij}^{(n)}}^{t})^{2} \end{split} \end{equation} (4.8)

    is an auxiliary function for F_{ij}(a_{ij}^{(n)}) , with the matrix

    {{\mathbf{A}}^{(n)}}^{t} = ({a_{ij}^{(n)}}^{t}).

    Proof. Basically, G(u, u^{'}) is an auxiliary function for F(u) if the conditions

    G(u, u^{'})\geq F(u)\ \ \ \text{and}\; \; \; G(u, u) = F(u)

    are satisfied.

    G({a_{ij}^{(n)}}^{t}, {a_{ij}^{(n)}}^{t}) = F_{ij}({a_{ij}^{(n)}}^{t})

    clearly holds, so we only need to verify that

    G(a_{ij}^{(n)}, {a_{ij}^{(n)}}^{t}) \geq F_{ij}(a_{ij}^{(n)}).

    The Taylor expansion of F_{ij}(a_{ij}^{(n)}) at {a_{ij}^{(n)}}^{t} is

    \begin{equation} \begin{split} F_{ij}(a_{ij}^{(n)}) = F_{ij}({a_{ij}^{(n)}}^{t})+F_{ij}^{'}({a_{ij}^{(n)}}^{t})(a_{ij}^{(n)}-{a_{ij}^{(n)}}^{t})+\frac{1}{2}F_{ij}^{''}({a^{(n)}_{ij}}^{t})(a_{ij}^{(n)}-{a_{ij}^{(n)}}^{t})^{2}. \end{split} \end{equation} (4.9)

    Comparing (4.8) with (4.9), we can see that

    G(a_{ij}^{(n)}, {a_{ij}^{(n)}}^{t}) \geq F_{ij}(a_{ij}^{(n)})

    is equivalent to

    \begin{equation*} \frac{[{{\mathbf{A}}^{(n)}}^{t}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}]_{ij}}{{a^{(n)}_{ij}}^{t}}\geq [{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}]_{jj}. \end{equation*}

    We have

    \begin{equation*} \begin{split} [{{\mathbf{A}}^{(n)}}^{t}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}]_{ij}& = \sum\limits_{l = 1}^{J_{n}}{a^{(n)}_{il}}^{t}[{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}]_{lj}\\ & \ge {a^{(n)}_{ij}}^{t}[{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}]_{jj}, \end{split} \end{equation*}

    thus, the inequality

    G(a_{ij}^{(n)}, {a_{ij}^{(n)}}^{t}) \geq F_{ij}(a_{ij}^{(n)})

    holds.

    Let F_{ij}(a_{ij}^{(N)}) denote the part of the objective function in (3.2) relevant to a_{ij}^{(N)} in {\mathbf{A}}^{(N)} .

    Lemma 2. The function

    \begin{equation} \begin{aligned} G(a_{ij}^{(N)}, {a_{ij}^{(N)}}^{t}) = &F_{ij}({a_{ij}^{(N)}}^{t})+F_{ij}^{'}({a_{ij}^{(N)}}^{t})(a_{ij}^{(N)}-{a_{ij}^{(N)}}^{t})\\&+\frac{1}{2}\frac{[{{\mathbf{A}}^{(N)}}^{t}{\mathbf{G}}_{(N)}(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\lambda{{\mathbf{D}}{\mathbf{A}}^{(N)}}^{t}+\mu_{N} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{{{\mathbf{A}}^{(N)}}^{t}}]_{ij}}{{a^{(N)}_{ij}}^{t}} \times(a_{ij}^{(N)}-{a_{ij}^{(N)}}^{t})^{2}\end{aligned} \end{equation} (4.10)

    is an auxiliary function for F_{ij}(a_{ij}^{(N)}) .

    Lemma 3. Let g_{i} denote the element of vec(\mathcal{G}) and F_{i}(g_{i}) denote the part of the objective function in (4.1) relevant to {g_{i}} . The function

    \begin{equation} G(g_{i}, g_{i}^{t}) = F_{i}({g_{i}^{t}})+F_{i}^{'}({g_{i}^{t}})(g_{i}-{g_{i}^{t}})+\frac{({\mathbf{F}}^{\top}{\mathbf{F}}vec(\mathcal{G}^{t}))_{i}}{g_{i}^{t}}(g_{i}-g_{i}^{t})^{2} \end{equation} (4.11)

    is an auxiliary function for F_{i}(g_{i}) .

    Since the proofs of Lemmas 2 and 3 are similar to that of Lemma 1, they are omitted here.

    Theorem 4. The objective function in (4.1) is nonincreasing under the updating rules in (4.2), (4.3), and (4.6). The objective function is invariant under these updates if, and only if, {\mathbf{A}}^{(r)} , r = 1, 2, \dots, N , \mathcal{G} are at a stationary point.

    Proof. Replacing G(u, u^{'}) in (4.7) by (4.8), we obtain

    \begin{equation*} \begin{split} \frac{\partial G(a_{ij}^{(n)}, {a_{ij}^{(n)}}^{t})}{\partial a_{ij}^{(n)}} = F^{'}_{ij}({a_{ij}^{(n)}}^{t})+\frac{[{{\mathbf{A}}^{(n)}}^{t}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T} {{\mathbf{A}}^{(n)}}^{t}]_{ij}}{{a^{(n)}_{ij}}^{t}}(a_{ij}^{(n)}-{a_{ij}^{(n)}}^{t}) = 0, \end{split} \end{equation*}

    which yields

    \begin{equation*} a_{ij}^{(n)^{t+1}} = a_{ij}^{(n)^{t} } \frac{[{\mathbf{X}}_{(n)}(\otimes_{p\neq n}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{I}}{\mathbf{A}}^{(n)^{t}} ]_{ij}}{[{\mathbf{A}}^{(n)^{t}}{\mathbf{G}}_{(n)}(\otimes_{p\neq n}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(n)}^{\top}+\mu_{n} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T} {\mathbf{A}}^{(n)^{t}}]_{ij}}. \end{equation*}

    According to Lemma 1, F_{ij}({a_{ij}^{(n)}}) is nonincreasing under the updating rules (4.2) for {\mathbf{A}}^{(n)} . Then, putting G(a_{ij}^{(N)}, {a_{ij}^{(N)}}^{t}) of (4.10) into (4.7), we can obtain

    \begin{equation*} a_{ij}^{(N)^{t+1} } = a_{ij}^{(N)^{t}} \frac{[{\mathbf{X}}_{(N)}(\otimes_{p\neq N}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\lambda {\mathbf{W}}{\mathbf{A}}^{(N)^{t}}+\mu_{N}{\mathbf{I}}{\mathbf{A}}^{(N)^{t}}]_{ij}}{[{\mathbf{A}}^{(N)^{t}}{\mathbf{G}}_{(N)}(\otimes_{p\neq N}{{\mathbf{A}}^{(p)}}^{\top}{\mathbf{A}}^{(p)}){\mathbf{G}}_{(N)}^{\top}+\lambda {\mathbf{D}}{\mathbf{A}}^{(N)^{t}}+\mu_{N} {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(N)^{t}}]_{ij}}. \end{equation*}

    From Lemma 2, F_{ij}({a_{ij}^{(N)}}) is nonincreasing under the updating rules (4.3) for {\mathbf{A}}^{(N)} . Finally, putting G(g_{i}, g_{i}^{t}) of (4.11) into (4.7),

    \begin{equation*} g_{i}^{t+1} = g_{i}^{t}\frac{({\mathbf{F}}^{\top}vec(\mathcal{X}))_{i}}{({\mathbf{F}}^{\top}{\mathbf{F}}vec(\mathcal{G}^{t}))_{i}}. \end{equation*}

    In accordance with Lemma 3, F_{i}(g_{i}) is nonincreasing under the updating rule (4.6).

    The proof of Theorem 4 is completed.

    In this subsection, we discuss the computational cost of our proposed algorithms in comparison with other methods. The common way to express the complexity of an algorithm is big \mathcal{O} notation, but it is not precise enough to differentiate the complexities here. Thus, we additionally count the arithmetic operations of each algorithm, including fladd (a floating-point addition), flmlt (a floating-point multiplication), and fldiv (a floating-point division). Specifically, we provide the computational complexity analysis of the uni-AOGNTD and bi-AOGNTD models. The update rules contain many Kronecker product operations, which are expensive, so we replace the Kronecker products with tensor operations during the calculation.

    To facilitate the estimation of computational complexity, we consider the case N = 3 in the models. Taking uni-AOGNTD as an example, the update rules are

    \begin{align*} {\mathbf{A}}_{ij}^{(1)}& \leftarrow {\mathbf{A}}_{ij}^{(1)} \frac{[(\mathcal{X}\times_{2}{{\mathbf{A}}^{(2)}}^{\top}\times_{3}{{\mathbf{A}}^{(3)}}^{\top})_{(1)}{\mathbf{G}}_{(1)}^{\top}+\mu {\mathbf{I}}{\mathbf{A}}^{(1)}]_{ij}}{[{\mathbf{A}}^{(1)}(\mathcal{G}\times_{2}{{\mathbf{A}}^{(2)}}^{\top}{\mathbf{A}}^{(2)}\times_{3}{{\mathbf{A}}^{(3)}}^{\top}{\mathbf{A}}^{(3)})_{(1)}{\mathbf{G}}_{(1)}^{\top}+\mu {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(1)}]_{ij}}, \\ {\mathbf{A}}_{ij}^{(2)}& \leftarrow {\mathbf{A}}_{ij}^{(2)} \frac{[(\mathcal{X}\times_{1}{{\mathbf{A}}^{(1)}}^{\top}\times_{3}{{\mathbf{A}}^{(3)}}^{\top})_{(2)}{\mathbf{G}}_{(2)}^{\top}+\mu {\mathbf{I}}{\mathbf{A}}^{(2)}]_{ij}}{[{\mathbf{A}}^{(2)}(\mathcal{G}\times_{1}{{\mathbf{A}}^{(1)}}^{\top}{\mathbf{A}}^{(1)}\times_{3}{{\mathbf{A}}^{(3)}}^{\top}{\mathbf{A}}^{(3)})_{(2)}{\mathbf{G}}_{(2)}^{\top}+\mu {\mathbf{1}}_{R}{\mathbf{1}}_{R}^{T}{\mathbf{A}}^{(2)}]_{ij}}, \\ {\mathbf{A}}_{ij}^{(3)} &\leftarrow {\mathbf{A}}_{ij}^{(3)} \frac{[(\mathcal{X}\times_{1}{{\mathbf{A}}^{(1)}}^{\top}\times_{2}{{\mathbf{A}}^{(2)}}^{\top})_{(3)}{\mathbf{G}}_{(3)}^{\top}+\lambda {\mathbf{W}}{\mathbf{A}}^{(3)}]_{ij}}{[{\mathbf{A}}^{(3)}(\mathcal{G}\times_{1}{{\mathbf{A}}^{(1)}}^{\top}{\mathbf{A}}^{(1)}\times_{2}{{\mathbf{A}}^{(2)}}^{\top}{\mathbf{A}}^{(2)})_{(3)}{\mathbf{G}}_{(3)}^{\top}+\lambda {\mathbf{D}}{\mathbf{A}}^{(3)}]_{ij}} \end{align*}

    and

    \begin{equation*} \mathcal{G}_{ijk} \leftarrow \mathcal{G}_{ijk} \frac{[\mathcal{X}\times_{1}{{\mathbf{A}}^{(1)}}^{\top}\times_{2}{{\mathbf{A}}^{(2)}}^{\top}\times_{3}{{\mathbf{A}}^{(3)}}^{\top}]_{ijk}}{[\mathcal{G}\times_{1}{{\mathbf{A}}^{(1)}}^{\top}{\mathbf{A}}^{(1)}\times_{2}{{\mathbf{A}}^{(2)}}^{\top}{\mathbf{A}}^{(2)}\times_{3}{{\mathbf{A}}^{(3)}}^{\top}{\mathbf{A}}^{(3)}]_{ijk}}. \end{equation*}

    where

    {\mathbf{X}}_{(1)}({\otimes_{p\neq 1}}{{\mathbf{A}}^{(p)}}) = (\mathcal{X}\times_{2}{{\mathbf{A}}^{(2)}}^{\top}\times_{3}{{\mathbf{A}}^{(3)}}^{\top})_{(1)}

    and so on.
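    To make these 3-way rules concrete, the following NumPy sketch implements one possible realisation of the uni-AOGNTD iterations of Algorithm 1, i.e., updates (4.2), (4.4), and (4.6) for N = 3 , written with mode products instead of explicit Kronecker products. The helper names (unfold, mode_product, uni_aogntd), the random initialisation, and the fixed iteration count are our own assumptions; this is a sketch, not the authors' reference implementation.

```python
import numpy as np

def unfold(T, n):
    # mode-n unfolding, first remaining mode varying fastest
    return np.reshape(np.moveaxis(T, n, 0), (T.shape[n], -1), order='F')

def mode_product(T, U, r):
    # r-mode product T x_r U
    return np.moveaxis(np.tensordot(U, T, axes=(1, r)), 0, r)

def multi_mode(T, mats, skip=None):
    # apply A^(r)^T along every mode except `skip`
    for r, U in enumerate(mats):
        if r != skip:
            T = mode_product(T, U.T, r)
    return T

def uni_aogntd(X, W, ranks, lam=1.0, mu=1.0, n_iter=200, eps=1e-10, seed=0):
    """Multiplicative updates (4.2), (4.4), (4.6) for a 3-way tensor X;
    W is the sample-graph weight matrix over the last (sample) mode."""
    rng = np.random.default_rng(seed)
    A = [rng.random((X.shape[r], ranks[r])) for r in range(3)]
    G = rng.random(ranks)
    D = np.diag(W.sum(axis=1))

    for _ in range(n_iter):
        # A^(1), A^(2): approximately orthogonal updates (4.2); Q = 11^T - I enters
        # as the split I A^(n) (numerator) and 1 1^T A^(n) (denominator)
        for n in (0, 1):
            gram = [Ar.T @ Ar for Ar in A]
            num = unfold(multi_mode(X, A, skip=n), n) @ unfold(G, n).T + mu * A[n]
            core = mode_product(mode_product(G, gram[(n + 1) % 3], (n + 1) % 3),
                                gram[(n + 2) % 3], (n + 2) % 3)
            ones = np.ones((A[n].shape[0], A[n].shape[0]))
            den = A[n] @ unfold(core, n) @ unfold(G, n).T + mu * ones @ A[n]
            A[n] *= np.maximum(num, 0) / (den + eps)

        # A^(3): graph-regularised update (4.4)
        num = unfold(multi_mode(X, A, skip=2), 2) @ unfold(G, 2).T + lam * W @ A[2]
        core = mode_product(mode_product(G, A[0].T @ A[0], 0), A[1].T @ A[1], 1)
        den = A[2] @ unfold(core, 2) @ unfold(G, 2).T + lam * D @ A[2]
        A[2] *= np.maximum(num, 0) / (den + eps)

        # core tensor: update (4.6) written with mode products instead of vec/F
        num = multi_mode(X, A)                       # X x_1 A1^T x_2 A2^T x_3 A3^T
        den = G
        for r in range(3):
            den = mode_product(den, A[r].T @ A[r], r)
        G *= np.maximum(num, 0) / (den + eps)

    return G, A
```

    With \mathcal{X} of size 32\times32\times(\text{number of images}), as in the later experiments, the returned A[2] plays the role of the low-dimensional sample representation {\mathbf{A}}^{(3)} that is passed to K -means.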

    The frequently used expressions and the specific operation counts for the two methods are provided in Table 1. By adding up the operation counts of the four update formulas and using the fact that I_{i}\gg J_{i} , we obtain the complexity of the proposed uni-AOGNTD as \mathcal{O}(I_{1}I_{2}I_{3}J) , in which

    J = 3(J_{1}+J_{2}+J_{3}).

    Similarly, the complexity of the bi-AOGNTD method is \mathcal{O}(I_{1}I_{2}I_{3}J) . Based on the analyses of the compared methods in the corresponding literature, Table 2 summarizes the complexity of the compared algorithms. The new regularization terms inevitably add some computation, but it is negligible in terms of the overall computational complexity.

    Table 1.  Computational operation counts.

    \mathcal{Y}\times_{r}{\mathbf{U}} : Fladd I_{r}J_{1}J_{2}\dots J_{N} ; Flmlt I_{r}J_{1}J_{2}\dots J_{N} ; Fldiv -
    \mathcal{X}\times_{2}{\mathbf{A_{2}^{T}}}\times_{3}{\mathbf{A_{3}^{T}}} : Fladd I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3} ; Flmlt I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3} ; Fldiv -
    \mathcal{G}\times_{2}{\mathbf{A}}_{2}^{T}{\mathbf{A}}_{2}\times_{3}{\mathbf{A}}_{3}^{T}{\mathbf{A}}_{3} : Fladd J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2} ; Flmlt J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2} ; Fldiv -

    uni-AOGNTD
    {\mathbf{A}}^{(1)} : Fladd I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+2) ; Flmlt I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+1)+I_{1}J_{1} ; Fldiv I_{1}J_{1}
    {\mathbf{A}}^{(2)} : Fladd I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+2) ; Flmlt I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+1)+I_{2}J_{2} ; Fldiv I_{2}J_{2}
    {\mathbf{A}}^{(3)} : Fladd I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+2I_{3}J_{3}(I_{3}+2) ; Flmlt I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+2I_{3}J_{3}(I_{3}+2)+I_{3}J_{3} ; Fldiv I_{3}J_{3}
    \mathcal{G} : Fladd I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+I_{3}J_{1}J_{2}J_{3} ; Flmlt I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+J_{1}J_{2}J_{3} ; Fldiv J_{1}J_{2}J_{3}

    bi-AOGNTD
    {\mathbf{A}}^{(1)} : Fladd I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+2) ; Flmlt I_{1}I_{2}I_{3}J_{2}+I_{1}I_{3}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+3I_{1}J_{1}J_{2}J_{3}+2I_{1}J_{1}(I_{1}+1)+I_{1}J_{1} ; Fldiv I_{1}J_{1}
    {\mathbf{A}}^{(2)} : Fladd I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+2) ; Flmlt I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{3}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{3}J_{3}^{2}+3I_{2}J_{1}J_{2}J_{3}+2I_{2}J_{2}(I_{2}+1)+I_{2}J_{2} ; Fldiv I_{2}J_{2}
    {\mathbf{A}}^{(3)} : Fladd I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+4I_{3}J_{3}(I_{3}+2) ; Flmlt I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+4I_{3}J_{3}(I_{1}+1)+I_{3}J_{3} ; Fldiv I_{3}J_{3}
    \mathcal{G} : Fladd I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+J_{1}J_{2}J_{3}^{2}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+I_{3}J_{3}^{2}+I_{3}J_{1}J_{2}J_{3} ; Flmlt I_{1}I_{2}I_{3}J_{1}+I_{2}I_{3}J_{1}J_{2}+J_{1}^{2}J_{2}J_{3}+J_{1}J_{2}^{2}J_{3}+I_{1}J_{1}^{2}+I_{2}J_{2}^{2}+3I_{3}J_{1}J_{2}J_{3}+J_{1}J_{2}J_{3} ; Fldiv J_{1}J_{2}J_{3}
    Table 2.  Computational complexity of the compared algorithms.

    NMF : objective \min_{{\mathbf{U}}, {\mathbf{V}}}\dfrac{1}{2}{\begin{Vmatrix}{\mathbf{X}}-{\mathbf{U}}{\mathbf{V}}\end{Vmatrix} }^{2} , s.t. {\mathbf{U}}, {\mathbf{V}}\geqslant0 ; overall \mathcal{O} (MNK)
    NTD : objective \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2} , s.t. \mathcal{G}\geqslant0, {\mathbf{A}}^{(r)}\geqslant0 ; overall \mathcal{O} (I_{1}I_{2}I_{3}J) with J=3(J_{1}+J_{2}+J_{3})
    GNTD : objective \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) , s.t. \mathcal{G}\geqslant0, {\mathbf{A}}^{(r)}\geqslant0 ; overall \mathcal{O} (I_{1}I_{2}I_{3}J) with J=3(J_{1}+J_{2}+J_{3})
    uni-AOGNTD : objective \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}+\dfrac{\mu_{r}}{2}\sum_{r=1}^{N-1}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) , s.t. \mathcal{G}\geqslant0, {\mathbf{A}}^{(r)}\geqslant0 ; overall \mathcal{O} (I_{1}I_{2}I_{3}J) with J=3(J_{1}+J_{2}+J_{3})
    bi-AOGNTD : objective \min_{\mathcal{G}, {\mathbf{A}}^{(1)}, \ldots, {\mathbf{A}}^{(N)}}\dfrac{1}{2}{\begin{Vmatrix}\mathcal{X}-\mathcal{G}\times_{1}{\mathbf{A}}^{(1)}\times_{2}{\mathbf{A}}^{(2)}\cdots\times_{N}{\mathbf{A}}^{(N)}\end{Vmatrix} }^{2}+\dfrac{\mu_{r}}{2}\sum_{r=1}^{N}\mathrm{Tr}({{\mathbf{A}}^{(r)}}^{T}{\mathbf{Q}}{{\mathbf{A}}^{(r)}})+\dfrac{\lambda}{2}\mathrm{Tr}({{\mathbf{A}}^{(N)}}^{T}{\mathbf{L}}{\mathbf{A}}^{(N)}) , s.t. \mathcal{G}\geqslant0, {\mathbf{A}}^{(r)}\geqslant0 ; overall \mathcal{O} (I_{1}I_{2}I_{3}J) with J=3(J_{1}+J_{2}+J_{3})

    In this section, following [36], two indicators, accuracy (AC) and normalized mutual information (NMI), are used to evaluate the clustering performance. The AC is defined as

    {\mathrm{AC}} = \frac{\sum\nolimits_{i = 1}^{n}\delta(k_{i}, \mathrm{map}(c_{i}))}{n},

    where c_{i} denotes the label produced by the clustering method, k_{i} is the true label of object x_{i} , \delta(x, y) is the indicator function that equals 1 if x = y and 0 otherwise, n is the total number of objects, and \mathrm{map}(c_{i}) is the mapping function that maps each cluster label to the best matching true label. The NMI is defined as

    {\mathrm{NMI}} = \frac{\sum\nolimits_{u = 1}^{c}\sum\nolimits_{v = 1}^{k} n_{uv}\,\mathrm{log}\left(\frac{n\, n_{uv}}{n_{u}n^{'}_{v}}\right)}{\sqrt{\left(\sum\nolimits_{u = 1}^{c}n_{u}\mathrm{log}\left(\frac{n_{u}}{n}\right)\right)\left(\sum\nolimits_{v = 1}^{k}n^{'}_{v}\mathrm{log}\left(\frac{n^{'}_{v}}{n}\right)\right)}},

    where n_{uv} is the number of data points belonging to both the u -th true class and the v -th predicted cluster, and n_{u} and n^{'}_{v} are the numbers of data points in the u -th true class and the v -th predicted cluster, respectively.
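    For reference, these two scores can be computed as in the following sketch; the clustering_accuracy helper and the toy label arrays are our own illustration, assuming SciPy's linear_sum_assignment for the \mathrm{map}(\cdot) step and scikit-learn's normalized_mutual_info_score for NMI.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(true_labels, pred_labels):
    """AC: best one-to-one mapping of predicted clusters to true classes
    (the map(.) in the formula), found with the Hungarian algorithm."""
    true_labels = np.asarray(true_labels)
    pred_labels = np.asarray(pred_labels)
    classes, clusters = np.unique(true_labels), np.unique(pred_labels)
    cost = np.zeros((clusters.size, classes.size))
    for i, c in enumerate(clusters):
        for j, k in enumerate(classes):
            cost[i, j] = -np.sum((pred_labels == c) & (true_labels == k))
    row, col = linear_sum_assignment(cost)           # maximise total overlap
    mapping = {clusters[r]: classes[c] for r, c in zip(row, col)}
    mapped = np.array([mapping.get(c, -1) for c in pred_labels])
    return np.mean(mapped == true_labels)

# toy example with hypothetical labels
true = [0, 0, 1, 1, 2, 2]
pred = [1, 1, 0, 0, 2, 2]
print(clustering_accuracy(true, pred))               # 1.0
print(normalized_mutual_info_score(true, pred))      # 1.0
```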

    All the experiments are run with MATLAB on a PC equipped with Intel i5-10210U and 8GB of RAM.

    To evaluate the effectiveness of the proposed AOGNTD methods, we compare the clustering performance of our methods with six state-of-the-art and previously mentioned methods: k -means [36], NMF [11], graph regularized NMF (GNMF) [36], graph dual regularized NMF (GDNMF) [37,38], NTD [17], and GNTD [22,23]. In order to distinguish the DNMF of the original articles from the DNMF with label information [27], we write DNMF [37,38] as GDNMF.

    Three databases, shown in Figure 1, are used to conduct the experiments (COIL20 is the Columbia Object Image Library dataset and ORL is the Olivetti Research Laboratory face dataset). The details of these public datasets are as follows:

    Figure 1.  COIL20 (left), Yale (middle), and ORL dataset (right).

    The COIL20 dataset collects 1440 grayscale images of 20 subjects. The objects were placed on a motorized turntable, which was rotated through 360 degrees to vary object pose with respect to a fixed camera. Images of the objects were taken at pose intervals of 5 degrees, which corresponds to 72 images per object. In our experiments, each image was resized to 32 \times 32 pixels and all images are stacked into a tensor \mathcal{X}\in\mathbb{R}^{32\times32\times1440} .

    The Yale faces dataset contains 165 grayscale images in GIF format of 15 individuals. There are 11 images per subject, one per different facial expression or configuration. Each image was resized into 32 \times 32 pixels. In the experiments, all images form a tensor \mathcal{X}\in\mathbb{R}^{32\times32\times165} .

    The ORL dataset collects 400 grayscale 112 \times 92 faces images which consist of 40 different subjects with 10 distinct images. For some subjects, the images were taken at different lighting, times, and facial expressions. For each image, we resized it to be 32 \times 32 pixels in our experiment. These images construct a tensor of \mathcal{X}\in\mathbb{R}^{32\times32\times400} .

    Identifying and adjusting the parameter combination of an algorithm has always been an important issue in machine learning. The size of the core tensor plays a prominent role in our algorithms. For the NTD-based methods, the size of the core tensor is set to \{10\times10\times s, 20\times20\times s, 30\times30\times s\} on the three datasets, where s is a positive integer.

    There are multiple regularization parameters \lambda and \mu_{r}, r = 1, 2, \dots, N , balancing the effects of the reconstruction term, the graph Laplacian term, and the approximately orthogonal term in our model. To reduce the parameter tuning burden, we study the influence on the clustering performance when the parameters \lambda and \mu_{r} = \mu and the sampling ratio vary. Since it is infeasible to examine all parameter changes at once, we use a grid search to determine relatively optimal parameters.

    To begin, the numerical range of the two parameters is set to [5e-5, 1e+4] ; we then select t values within this range and evaluate the t^{2} combinations to find the best clustering result. For the COIL20 dataset, we set \{1e-1, 1, 1e+1, 1e+2\} as the candidate values of the parameters \lambda and \mu in the uni-AOGNTD format, where t = 4 , and choose the best among the 16 combinations. It is crucial to weigh the two measurement criteria reasonably when comparing results.

    In principle, the numerical value of AC (accuracy) is the main factor, supplemented by NMI. More details can be found in Figures 2–7, which show the specific parameter selections; the optimal parameter setting is circled in black. The symbol dataset- k , e.g., COIL20-4, denotes k categories extracted from the corresponding dataset. During parameter selection, we found that the AOGNTD models favor strong graph regularization and weak orthogonality.

    Figure 2.  The clustering performance of the uni-AOGNTD method on the COIL20 datasets.
    Figure 3.  The clustering performance of the bi-AOGNTD method on the COIL20 datasets.
    Figure 4.  The clustering performance of the uni-AOGNTD method on the Yale datasets.
    Figure 5.  The clustering performance of the bi-AOGNTD method on the Yale datasets.
    Figure 6.  The clustering performance of the uni-AOGNTD method on the ORL datasets.
    Figure 7.  The clustering performance of the bi-AOGNTD method on the ORL datasets.

    Tables 3–8 show the average clustering results of the compared algorithms on the three datasets. We mark the best result for each clustering number k on each dataset in bold. In each experiment, we randomly select k categories as the evaluated data. We construct the graph Laplacian using p -nearest neighbors, where the neighborhood size p is set to 5. We run the experiment and apply K -means 10 times on the low-dimensional representation matrix of each method. To validate the experimental results, we repeat the above operations 10 times and report the averages, so there are 100 runs in total.

    Table 3.  AC (%) of different algorithms on COIL20 dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 89.10±12.07 97.60±4.96 97.65±4.86 98.33±5.03 90.55±12.09 98.33±5.03 100.00 100.00
    4 75.86±15.01 71.68±16.99 72.47±17.14 77.31±14.95 71.42±13.90 76.29±18.45 81.94±15.35 81.46±17.55
    6 65.63±9.12 68.11±11.97 68.65±13.00 68.58±11.39 70.79±12.56 73.61±13.04 75.73±14.88 75.10±13.77
    8 66.61±10.32 64.37±10.37 65.60±10.77 65.90±8.09 69.49±8.54 71.00±10.92 74.73±10.22 73.80±10.85
    10 64.92±8.72 63.01±8.14 65.58±8.19 65.63±7.19 66.41±7.64 66.01±10.52 71.64±9.21 72.29±10.04
    12 62.38±8.40 63.84±8.26 64.16±7.52 62.73±7.78 62.37±8.93 66.59±9.24 70.55±8.57 70.21±9.82
    14 62.08±7.40 60.09±5.76 60.52±5.85 62.66±5.80 60.92±7.08 68.92±8.03 70.46±8.12 70.21±7.94
    16 59.84±5.81 62.55±5.64 63.33±5.73 63.70±6.06 61.38±6.09 69.24±7.52 70.69±7.05 70.04±7.39
    18 60.41±5.19 60.03±4.91 60.56±4.64 62.54±5.37 59.28±4.88 65.89±6.89 69.16±7.27 68.91±6.34
    20 57.49±4.71 60.29±4.49 59.22±4.70 60.01±4.64 56.90±4.03 64.47±5.75 69.35±5.20 68.79±6.28
    Avg. 66.43 67.16 67.77 68.74 66.95 72.04 75.43 75.08

    Table 4.  NMI (%) of different algorithms on COIL20 dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 65.77±30.18 91.29±17.74 91.38±17.53 94.59±16.31 72.77±31.77 94.59±16.31 100.00 100.00
    4 69.61±15.08 69.05±16.24 69.99±17.15 74.89±11.95 69.64±13.21 71.24±17.77 81.51±12.92 85.18±13.08
    6 66.49±8.37 69.69±11.15 70.68±12.54 66.91±10.45 70.39±12.97 78.68±10.49 81.21±10.79 80.28±10.74
    8 71.64±7.72 69.25±9.14 70.60±9.13 70.62±6.76 73.10±7.25 79.90±7.88 81.54±7.51 82.70±6.87
    10 73.85±6.80 70.39±7.17 72.18±7.40 72.95±5.68 72.72±6.05 76.73±10.00 81.72±5.41 83.51±6.06
    12 72.32±6.53 71.76±6.11 72.15±5.94 71.83±6.84 70.77±7.14 79.16±7.36 81.94±5.37 81.80±6.01
    14 74.36±5.58 70.62±4.84 71.11±4.53 72.82±4.20 71.16±5.16 82.12±4.70 82.89±4.27 81.45±4.77
    16 72.96±3.57 73.17±3.64 73.99±3.57 74.48±3.75 71.84±4.21 82.81±3.99 82.85±3.72 82.84±3.97
    18 74.37±3.02 71.92±2.81 72.27±2.62 74.66±3.37 71.42±2.85 81.00±3.86 82.31±4.54 82.28±3.54
    20 73.43±2.42 72.40±2.36 72.64±2.54 73.59±2.43 69.76±2.39 81.00±3.14 82.72±2.58 81.94±3.32
    Avg. 71.48 72.95 73.70 74.73 71.36 80.72 83.87 83.91

    Table 5.  AC (%) of different algorithms on Yale dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    3 62.03±10.93 59.73±9.23 59.97±9.35 62.12±9.66 60.82±11.96 61.55±12.06 67.48±12.52 66.06±13.70
    5 51.42±8.72 54.42±8.37 55.56±8.64 56.00±6.62 56.02±11.44 56.16±11.52 58.82±8.18 58.76±8.16
    7 47.58±7.31 48.91±6.24 49.03±6.05 48.56±6.94 49.45±7.26 50.83±7.78 53.26±10.38 52.74±9.37
    9 42.70±5.72 45.03±5.05 45.16±5.33 45.32±5.91 47.44±6.05 47.59±5.55 48.02±6.16 48.23±5.83
    11 40.41±4.78 41.51±4.55 41.74±4.67 43.31±4.80 41.94±4.60 43.96±4.57 44.69±4.68 44.02±4.62
    13 39.08±4.53 41.54±3.67 41.72±3.86 41.59±4.01 41.80±4.67 41.60±4.52 42.45±4.97 41.92±4.02
    15 38.43±3.71 38.64±3.80 38.73±3.68 38.88±3.10 39.25±4.00 39.94±3.64 41.02±2.73 40.36±2.97
    Avg. 45.95 47.11 47.42 47.97 48.10 48.80 50.82 50.30

    Table 6.  NMI (%) of different algorithms on Yale dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    3 34.72±15.81 30.16±13.95 30.37±14.32 38.05±14.35 31.74±17.86 32.85±16.57 39.26±17.48 38.44±19.12
    5 38.18±9.08 39.38±8.69 40.46±8.64 40.90±8.35 40.66±13.87 40.80±14.01 45.91±12.66 46.92±11.59
    7 40.82±7.59 39.88±6.68 39.99±6.64 40.44±7.54 40.79±7.32 44.21±9.39 46.67±11.39 46.14±10.51
    9 41.03±5.87 41.75±4.54 41.82±4.27 41.47±5.64 45.16±6.19 44.82±5.55 45.26±6.01 45.36±5.80
    11 41.51±4.42 41.79±4.35 42.28±4.16 43.61±4.49 42.65±4.43 43.99±4.20 44.81±3.99 44.05±4.28
    13 43.11±4.13 44.55±3.18 44.56±3.38 43.63±3.53 44.38±4.24 44.74±3.99 45.63±4.21 45.26±3.70
    15 38.43±3.71 43.70±3.19 43.78±3.20 43.85±2.54 44.58±2.96 45.53±3.05 46.06±2.59 45.68±2.58
    Avg. 39.69 40.17 40.47 41.71 41.42 42.42 44.80 44.55

    Table 7.  AC (%) of different algorithms on ORL dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 88.65±16.89 90.35±12.84 90.25±13.58 91.35±10.80 86.45±15.46 98.20±5.80 99.00±3.02 98.70±3.23
    4 73.73±14.28 82.08±13.92 82.48±14.08 80.23±16.09 71.43±12.83 83.10±16.65 83.40±16.44 83.80±16.16
    6 68.57±10.69 72.20±11.02 72.37±11.83 72.75±11.64 61.32±13.98 72.68±10.59 75.30±12.14 76.35±10.86
    8 62.29±10.23 69.91±10.22 68.49±9.91 67.98±10.06 54.05±13.90 70.48±9.09 70.55±9.87 70.88±9.39
    10 62.22±7.44 65.46±7.37 66.89±8.69 67.88±7.73 58.00±11.64 68.78±8.97 69.15±8.91 69.05±9.14
    15 57.78±5.38 64.19±6.03 64.83±6.80 65.03±7.26 56.13±11.52 65.82±6.07 65.99±6.88 65.99±6.31
    20 54.68±5.25 61.11±6.01 61.92±6.36 62.27±5.24 50.25±11.76 62.36±5.63 62.49±6.80 62.79±5.67
    25 55.38±4.56 61.10±4.23 61.70±4.82 62.21±5.44 52.83±10.18 61.91±5.63 62.05±5.14 62.20±5.68
    30 52.07±3.78 58.80±4.44 59.56±4.46 59.06±4.21 45.35±14.04 59.55±5.04 60.44±4.03 60.23±3.71
    35 51.86±3.07 57.15±3.52 57.28±3.63 58.69±3.59 49.01±10.39 57.00±4.35 59.31±3.85 58.91±3.99
    40 51.48±3.35 56.46±3.42 58.03±3.73 58.17±3.36 45.58±9.31 58.25±3.27 58.38±3.15 58.36±3.36
    Avg. 61.70 67.16 67.62 67.78 57.31 68.92 69.64 69.75

    Table 8.  NMI (%) of different algorithms on ORL dataset.
    k k-means NMF GNMF GDNMF NTD GNTD uni-AOGNTD bi-AOGNTD
    2 70.82±40.76 70.64±34.38 71.39±34.53 73.87±32.41 62.85±35.80 93.53±15.58 96.10±11.76 94.62±12.59
    4 68.42±16.36 79.91±12.97 80.06±13.49 77.86±15.48 64.81±14.60 80.18±19.54 82.13±15.94 82.46±16.15
    6 69.52±9.93 74.09±9.67 74.01±11.38 74.10±11.53 60.51±17.75 76.41±8.52 79.18±8.17 79.69±8.01
    8 65.60±10.31 74.99±8.75 73.82±8.16 72.66±9.05 55.10±17.20 75.27±7.38 75.52±7.61 75.73±7.08
    10 68.35±6.79 73.18±5.33 73.80±6.37 75.33±5.98 63.76±13.26 75.73±7.47 77.98±7.79 77.82±8.17
    15 69.53±4.25 76.00±4.69 76.38±5.10 75.83±4.99 66.67±12.51 76.64±4.86 76.64±5.09 76.93±4.93
    20 69.21±4.42 74.10±4.32 74.67±4.72 75.24±3.62 64.33±11.76 75.59±4.15 75.98±4.65 76.01±4.15
    25 71.68±3.25 75.77±2.66 76.62±3.26 76.92±3.62 68.37±9.31 77.04±3.91 77.11±3.69 77.15±3.59
    30 70.58±2.41 75.39±2.97 76.13±2.65 75.66±2.68 62.90±12.97 76.28±2.88 76.66±2.81 76.55±2.45
    35 71.17±1.94 74.71±2.11 75.35±2.09 76.02±2.11 67.46±9.66 75.39±2.77 76.86±2.18 76.48±2.30
    40 71.84±2.01 74.91±1.90 76.03±1.90 76.55±1.83 66.07±7.95 76.79±1.79 76.88±1.63 76.90±1.85
    Avg. 69.70 74.87 75.30 75.46 63.89 78.08 79.18 79.12


    From Tables 3–8, our AOGNTD algorithms deliver better performance on the three datasets across varying category counts. From the comparison, the following phenomena can also be observed:

    (1) The performance on the COIL20 dataset is the most outstanding. The proposed uni-AOGNTD method attains 9\% , 8.27\% , 7.66\% , 6.69\% , 8.48\% , and 3.39\% improvements in AC and 12.39\% , 10.92\% , 10.17\% , 9.14\% , 12.51\% , and 3.15\% improvements in NMI compared with K -means, NMF, GNMF, GDNMF, NTD, and GNTD, respectively; these gaps correspond to differences between the Avg. rows, as recomputed in the sketch after this list. In the same way, the proposed bi-AOGNTD algorithm attains 8.65\% , 7.92\% , 7.31\% , 6.34\% , 8.13\% , and 3.04\% improvements in AC and 12.41\% , 10.96\% , 10.22\% , 9.18\% , 12.55\% , and 3.19\% improvements in NMI compared with K -means, NMF, GNMF, GDNMF, NTD, and GNTD, respectively.

    (2) For the Yale dataset, the uni-AOGNTD method attains 4.87\% , 3.71\% , 3.4\% , 2.85\% , 2.72\% , and 2.02\% improvements in AC and 5.11\% , 4.63\% , 4.33\% , 3.09\% , 3.38\% , and 2.38\% improvements in NMI compared with K -means, NMF, GNMF, GDNMF, NTD, and GNTD, respectively. The uni-AOGNTD and bi-AOGNTD models perform comparably.

    (3) For the ORL dataset, the uni-AOGNTD method attains 7.94\% , 2.48\% , 2.02\% , 1.86\% , 12.33\% , and 0.72\% improvements in AC and 9.48\% , 4.31\% , 3.88\% , 3.72\% , 15.29\% , and 1.1\% improvements in NMI compared with K -means, NMF, GNMF, GDNMF, NTD, and GNTD, respectively. The clustering performance of the uni-AOGNTD and bi-AOGNTD methods is roughly the same.
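    As a quick check of how the figures in item (1) arise, the improvement of uni-AOGNTD over each baseline is the difference between the corresponding Avg. entries; the values below are copied from the Avg. row of Table 3 (AC on COIL20).

```python
# Recompute the uni-AOGNTD AC improvements on COIL20 from the Avg. row of Table 3.
avg_ac = {"k-means": 66.43, "NMF": 67.16, "GNMF": 67.77,
          "GDNMF": 68.74, "NTD": 66.95, "GNTD": 72.04, "uni-AOGNTD": 75.43}
for method, value in avg_ac.items():
    if method != "uni-AOGNTD":
        gap = avg_ac["uni-AOGNTD"] - value
        print(f"{method}: +{gap:.2f}%")   # 9.00, 8.27, 7.66, 6.69, 8.48, 3.39
```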

    As can be seen, the uni-AOGNTD and bi-AOGNTD methods are superior to the compared methods, because they introduce and balance the approximately orthogonal regularization and the graph regularization within the unified NTD framework for tensor data representation. Under the adjustment of the parameters, the solution is driven into a better region of the solution space. Intuitively, the nonnegativity of the factor matrices and the model constraints restrict the feasible domain to a neighborhood of good local minima.

    The clustering results of the methods containing graph regularization terms on {\mathbf{A}}^{(N)} are further compared in Figure 8. In short, the proposed approaches produce the highest and most stable AC and NMI values.

    Figure 8.  The clustering performance of four methods on COIL-20, Yale, and ORL datasets.

    As described in Section 3, the convergence of the proposed algorithms has been proved theoretically. In this subsection, we study the convergence experimentally by examining the relation between the number of iterations and the value of the objective function in Figure 9. The trend intuitively shows that the objective function converges effectively under the multiplicative update rules and corroborates Theorem 4.
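    A convergence curve of this kind can be produced with a few lines of code. Since the AOGNTD update rules are not restated in this section, the sketch below uses the standard multiplicative NMF updates as a stand-in to illustrate the monotone decrease of the objective that such curves visualize.

```python
# Hedged sketch: track the objective value per iteration under multiplicative
# updates (plain NMF used as a stand-in for the AOGNTD update rules).
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = rng.random((100, 400))                 # nonnegative data matrix
r = 20
W, H = rng.random((100, r)), rng.random((r, 400))

objective = []
for _ in range(200):
    H *= (W.T @ X) / (W.T @ W @ H + 1e-10)          # multiplicative update for H
    W *= (X @ H.T) / (W @ H @ H.T + 1e-10)          # multiplicative update for W
    objective.append(np.linalg.norm(X - W @ H) ** 2)

plt.plot(objective)                        # the curve should be non-increasing
plt.xlabel("Iteration")
plt.ylabel("Objective value")
plt.show()
```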

    Figure 9.  Convergence curves of AOGNTD on COIL-20, Yale, and ORL datasets.

    In this paper, we developed two novel AOGNTD methods for tensor data representation and provided a convergence analysis. Through two adjustable regularization parameters, the AOGNTD methods yield a more competent representation and achieve better clustering performance on publicly available real-world datasets. Although the AOGNTD methods perform well in image clustering tasks, they can be improved further in two possible directions. First, the graph in our algorithm is fixed, and it remains worth exploring whether joint information can be obtained from multiple graphs or whether an optimal graph can be learned. In addition, a large number of NTD-based methods rely on K -means; designing independent classification methods without any additional clustering procedure is a main point of further research.

    Xiang Gao: material preparation, data collection and analysis, writing the first draft, commenting; Linzhang Lu: material preparation, data collection and analysis, commenting; Qilong Liu: material preparation, data collection and analysis, commenting. All authors have read and approved the final version of the manuscript for publication.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    Linzhang Lu is supported by the National Natural Science Foundation of China under Grant (12161020 and 12061025), and Guizhou Normal University Academic Talent Foundation under Grant (QSXM[2022]04). Qilong Liu is supported by Guizhou Provincial Basis Research Program (Natural Science) under Grant (QKHJC-ZK[2023]YB245).

    The authors declare that they have no conflicts of interest in this paper.


