
Non-negative matrix factorization (NMF) has been widely used in machine learning and data mining. As an extension of NMF, non-negative matrix tri-factorization (NMTF) provides more degrees of freedom than NMF. However, the standard NMTF algorithm uses the Frobenius norm to measure the residual error, which can be dramatically affected by noise and outliers. Moreover, the hidden geometric information in the feature manifold and the sample manifold is rarely learned. Hence, a novel robust capped norm dual hyper-graph regularized non-negative matrix tri-factorization (RCHNMTF) is proposed. First, a robust capped norm is adopted to handle extreme outliers. Second, dual hyper-graph regularization is applied to exploit the intrinsic geometric information in the feature manifold and the sample manifold. Third, orthogonality constraints are added to learn a unique data representation and improve clustering performance. Experiments on seven datasets verify the robustness and superiority of RCHNMTF.
Citation: Jiyang Yu, Baicheng Pan, Shanshan Yu, Man-Fai Leung. Robust capped norm dual hyper-graph regularized non-negative matrix tri-factorization[J]. Mathematical Biosciences and Engineering, 2023, 20(7): 12486-12509. doi: 10.3934/mbe.2023556
As the dimension of data in machine learning and data mining is often too high to handle directly, matrix factorization algorithms are adopted to decompose high-dimensional data into several low-dimensional matrices and to explore the hidden structure of the low-dimensional data. Widely used matrix factorization approaches include principal component analysis (PCA) [1], singular value decomposition (SVD) [2], vector quantization (VQ) [3] and non-negative matrix factorization (NMF).
Different from PCA, SVD and VQ, the main motivation of NMF is to decompose a high-dimensional data matrix into two low-rank non-negative matrices whose product approximates the original data matrix. NMF is able to obtain a parts-based representation of data because the non-negativity constraints allow only additive, not subtractive, combinations [4]. The NMF algorithm and its extensions have been applied in several areas, e.g., medical research [5], community discovery [6], gene expression [7,8,9,10], text mining [11] and neural networks [12,13,14,15].
As an extension of NMF, non-negative matrix tri-factorization (NMTF) aims to decompose a high-dimensional data matrix into three low-rank non-negative matrices [16,17]. With an extra factor matrix, NMTF obtains more degrees of freedom than NMF [18]. NMTF is helpful for co-clustering tasks as it can categorize the feature space and the sample space simultaneously. Nevertheless, Ding et al. [18] point out that unconstrained NMTF is equivalent to NMF while constrained NMTF brings new capabilities, and hereby propose orthogonal non-negative matrix tri-factorization (ONMTF). ONMTF obtains a rigorous clustering representation because the non-negativity and orthogonality constraints lead to a sparse solution. Abundant research shows the benefits of sparseness [14,19,20,21,22,23,24,25,26,27].
The standard NMF and NMTF algorithms rarely consider the hidden geometrical information in the feature space and the sample space. However, it has been pointed out that observed data lie on a nonlinear low-dimensional manifold embedded in a high-dimensional ambient space [28,29]. Several manifold learning algorithms have been proposed to exploit intrinsic geometrical information [30,31,32], such as ISOMAP [33] and Laplacian Eigenmap (LE) [34]. Inspired by recent progress in manifold learning, Cai et al. propose graph regularized NMF (GNMF) [35]. GNMF first constructs a nearest neighbor graph to encode the geometrical information of the data space and incorporates the graph regularization into its objective function. To explore the geometrical information in the feature space, which is neglected in GNMF, Shang et al. further propose graph dual regularized NMF (DNMF) [17]. DNMF constructs dual graphs to discover the manifolds embedded in the sample space and the feature space simultaneously. Unfortunately, high-order relations among samples are seldom considered in graph-based learning methods, as the constructed graph only encodes pairwise relationships between two samples or two features. This problem has been solved by hyper-graph learning [36,37]. In contrast to a graph, a hyper-graph is built from hyper-edges that connect multiple samples; therefore, a hyper-graph can explore high-order relationships among data points and features. Hyper-graphs are popular in machine learning [38,39]. Feng et al. propose hyper-graph neural networks (HNN) [40], which utilize hyper-edge convolution operations to learn hidden-layer representations while accounting for the high-order data structure [40,42]. Jiang et al. propose dynamic hyper-graph neural networks [41]. Zeng et al. first combine hyper-graph learning with NMF and propose hyper-graph regularized NMF (HNMF) [43]. HNMF utilizes a single hyper-graph regularization to capture the geometrical information of the sample space.
Standard NMF and NMTF algorithms utilize the Frobenius norm to calculate the residual error; they perform well on clean data, but their performance can be dramatically degraded by noise and outliers, since the Frobenius norm squares the residual error. To alleviate the impact of noise and outliers, several robust M-estimator based NMF algorithms have been proposed [44]. Kong et al. propose a robust NMF using the l2,1-norm to calculate the residual error of each sample point without squaring it [45]. Gao et al. design a robust capped norm NMF [46], whose main contribution is that it caps the residuals of extreme outliers to further enhance robustness. Li et al. propose a robust NMF using the l2,p-norm to further reduce the impact of noise and outliers [47]. Guan et al. develop truncated Cauchy non-negative matrix factorization, which utilizes the truncated Cauchy loss to truncate large errors and handle outliers [48]. Guan et al. also propose Manhattan non-negative matrix factorization (MahNMF), which uses the Manhattan distance instead of the Euclidean distance to model heavy-tailed Laplacian noise [49].
Motivated by the good robustness of the l2,p-norm and the capped l2,1-norm, and by the excellent performance of manifold learning based NMF algorithms, a robust capped norm dual hyper-graph regularized NMTF (RCHNMTF) is proposed. First, RCHNMTF utilizes the capped l2,p-norm to enhance robustness. Second, RCHNMTF constructs dual hyper-graphs to encode the intrinsic geometrical information of the sample manifold and the feature manifold. Third, the ONMTF framework is incorporated into RCHNMTF to improve clustering performance. The main contributions of this article are summarized as follows:
1) A novel method called RCHNMTF is proposed. RCHNMTF utilizes the capped l2,p-norm to improve robustness; dual hyper-graph regularization and the ONMTF framework are added to further improve clustering performance.
2) The optimization problem of RCHNMTF is reformulated, and an alternative iteration algorithm is designed thereafter to simplify the iteration steps. The computational complexity of RCHNMTF and its comparison methods is calculated thoroughly in terms of three arithmetic operations and big O notation.
3) Experiments on seven real-world datasets and four noised datasets verify the superior clustering performance and robustness of RCHNMTF compared with eight state-of-the-art algorithms.
The rest of the paper is arranged as follows. Section 2 introduces the NMF algorithm, the lq,p-norm, capped l2,1-norm NMF and hyper-graph regularization. Section 3 first describes the formulation of RCHNMTF, then reformulates its optimization problem to simplify the iteration steps, and thoroughly analyzes the computational complexity of RCHNMTF and the comparison algorithms. Section 4 demonstrates the clustering performance and robustness of RCHNMTF on seven real-world datasets and four contaminated datasets, compared with eight state-of-the-art algorithms. Section 5 concludes the paper.
Given a data matrix $X=[x_1,x_2,...,x_n]\in\mathbb{R}^{m\times n}$, where each column of $X$ denotes a sample vector $x_i$, NMF aims to decompose $X$ into two low-dimensional non-negative matrices $U=[u_1,u_2,...,u_k]\in\mathbb{R}^{m\times k}$ and $V^T=[v_1,v_2,...,v_k]\in\mathbb{R}^{k\times n}$ whose product $UV^T$ approximates the original data matrix $X$. The residual error between $X$ and $UV^T$ is measured by the Frobenius norm. The optimization problem of NMF is defined as [4]:
$$\min_{U,V}\ \|X-UV^T\|_F^2 \quad \mathrm{s.t.}\ U\ge 0,\ V\ge 0 \tag{2.1}$$
where $\|\cdot\|_F$ represents the Frobenius norm of a matrix. Optimization problem (2.1) can be solved by the following multiplicative update rules [4]:
$$U \leftarrow U \odot \frac{XV}{UV^TV} \tag{2.2}$$
$$V \leftarrow V \odot \frac{X^TU}{VU^TU} \tag{2.3}$$
where the multiplication $\odot$ and the division are element-wise.
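As a concrete illustration, the following is a minimal numpy sketch of rules (2.2) and (2.3); the small constant `eps`, the random initialization and the fixed iteration count are implementation details, not part of the original formulation:

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-10, seed=0):
    """Minimal sketch of NMF with the multiplicative rules (2.2)-(2.3)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))  # basis matrix, m x k
    V = rng.random((n, k))  # coefficient matrix, n x k
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)    # rule (2.2), element-wise
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)  # rule (2.3), element-wise
    return U, V
```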
The $l_{q,p}$-norm of $X$ is defined as follows [47]:
$$\|X\|_{q,p}=\left(\sum_{i=1}^{N}\|x_i\|_q^p\right)^{1/p},\quad p\in(0,1] \tag{2.4}$$
where $\|\cdot\|_{q,p}$ represents the $l_{q,p}$-norm. Letting $q=2$, the $l_{2,p}$-norm of $X$ is defined as follows:
$$\|X\|_{2,p}=\left(\sum_{i=1}^{N}\|x_i\|_2^p\right)^{1/p},\quad p\in(0,1] \tag{2.5}$$
where $\|\cdot\|_{2,p}$ represents the $l_{2,p}$-norm. It should be noted that the $l_{2,p}$-norm ($0<p<1$) is not a valid matrix norm, as it does not satisfy the triangle inequality; for simplicity, however, we still call it a matrix norm. Moreover, the $l_{2,p}$ matrix pseudo-norm is neither convex nor Lipschitz continuous, since the $l_p$-norm ($0<p<1$) is neither convex nor Lipschitz continuous.
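For illustration, the $l_{2,p}$ pseudo-norm of (2.5) can be evaluated directly; below is a short numpy sketch, assuming the columns of $X$ are the sample vectors as in the NMF formulation above:

```python
import numpy as np

def l2p_norm(X, p=0.5):
    """l_{2,p} pseudo-norm of Eq (2.5): l2 norms of the columns,
    raised to the p-th power, summed, then taken to the 1/p power."""
    col_norms = np.linalg.norm(X, axis=0)  # ||x_i||_2 for each column x_i
    return np.sum(col_norms ** p) ** (1.0 / p)
```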
To decrease the influence of extreme outliers, Gao et al. propose capped l2,1-norm based NMF, whose optimization problem is defined as follows [46]:
$$\min_{U,V}\ \sum_{i=1}^{n}\min\{\|x_i-Uv_i\|_2,\ \theta\}\quad \mathrm{s.t.}\ U\ge 0,\ V\ge 0 \tag{2.6}$$
where $\theta>0$ is a thresholding parameter used to identify extreme data outliers. In capped $l_{2,1}$-norm based NMF, if the residual error of a data point satisfies $\|x_i-Uv_i\|_2>\theta$, then the data point $x_i$ is identified as an extreme outlier and its residual is capped at $\theta$ to limit its effect on the whole model. For the other data points with $\|x_i-Uv_i\|_2<\theta$, the algorithm calculates the unsquared $l_2$ residual of each sample (i.e., the $l_{2,1}$-norm over the matrix), which is also robust to ordinary outliers. Moreover, $\theta$ is set according to the ratio of outliers. With the extra thresholding parameter $\theta$ to select extreme data outliers and cap their influence, capped $l_{2,1}$-norm based NMF is more robust than $l_{2,1}$-norm based NMF.
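The capped objective of (2.6) is straightforward to evaluate; the sketch below (our helper, not code from [46]) computes the loss and flags which samples were capped as extreme outliers:

```python
import numpy as np

def capped_l21_loss(X, U, V, theta):
    """Capped l_{2,1} objective of Eq (2.6). Returns the loss and a
    boolean mask of the samples whose residuals were capped at theta."""
    residuals = np.linalg.norm(X - U @ V.T, axis=0)  # ||x_i - U v_i||_2
    capped = residuals > theta                       # extreme outliers
    return np.sum(np.minimum(residuals, theta)), capped
```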
With the development of graph theory, hyper-graph regularization has been widely used. Unlike graph regularization, which only considers pairwise relationships between two samples or two features, hyper-graph regularization considers relationships among multiple samples or multiple features. Specifically, an edge of a graph connects two nodes, while a hyper-edge of a hyper-graph connects multiple nodes. Therefore, using a hyper-graph prevents the forced conversion of multivariate relationships into binary relationships and explores the high-order information in the sample space and the feature space [43].
A hyper-graph $G=(V,E,W)$ consists of a vertex set $V$, a hyper-edge set $E$ and a diagonal hyper-edge weight matrix $W$. The incidence matrix $H\in\mathbb{R}^{|V|\times|E|}$ is defined as follows:
$$H(v,e)=\begin{cases}1, & \text{if } v\in e;\\ 0, & \text{if } v\notin e.\end{cases} \tag{2.7}$$
$W_i$ represents the weight of hyper-edge $e_i$; how $W_i$ is computed depends on the specific application. Let $V_x$ and $V_y$ denote the vertices corresponding to the sample points $X_x$ and $X_y$, respectively. In this paper, $W_i$ is defined as:
$$W_i=\sum_{V_x,V_y\in e_i}\exp\left(-\frac{\|V_x-V_y\|_2^2}{\sigma^2}\right) \tag{2.8}$$
where $\sigma^2$ is defined as:
$$\sigma^2=\frac{1}{k}\sum_{V_x,V_y\in e_i}\|V_x-V_y\|_2^2 \tag{2.9}$$
where the parameter $k$ is the number of nearest neighbors used for each vertex.
The degree of a vertex $v$ is defined as:
$$d(v)=\sum_{e\in E} w(e)H(v,e) \tag{2.10}$$
The degree of a hyper-edge $e$ is defined as:
$$f(e)=\sum_{v\in V} H(v,e) \tag{2.11}$$
Given the diagonal matrix $D_v$ composed of $d(v)$ and the diagonal matrix $D_e$ composed of $f(e)$, the unnormalized hyper-graph Laplacian matrix $L_{hyper}$ is defined as:
$$S=HWD_e^{-1}H^T,\qquad L_{hyper}=D_v-S \tag{2.12}$$
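To make the construction concrete, the following sketch builds the incidence matrix, the hyper-edge weights of (2.8)–(2.9) and the Laplacian of (2.12). It assumes the common construction, consistent with the k-nearest-neighbor setup described later in the paper, in which each vertex spawns one hyper-edge containing itself and its k nearest neighbours; summing (2.8)–(2.9) over ordered vertex pairs within an edge is our reading of the notation:

```python
import numpy as np

def hypergraph_laplacian(X, k=5):
    """Sketch of the unnormalized hyper-graph Laplacian in Eq (2.12).

    Columns of X are vertices. Each vertex spawns one hyper-edge formed by
    itself and its k nearest neighbours (an assumed construction). Weights
    follow Eqs (2.8)-(2.9), summing over ordered vertex pairs in the edge.
    """
    n = X.shape[1]
    # pairwise squared Euclidean distances between vertices (columns)
    sq = np.sum((X[:, :, None] - X[:, None, :]) ** 2, axis=0)
    H = np.zeros((n, n))                       # incidence matrix, |V| x |E|
    w = np.zeros(n)                            # hyper-edge weights
    for e in range(n):
        members = np.argsort(sq[e])[:k + 1]    # vertex e plus k neighbours
        H[members, e] = 1.0                    # Eq (2.7)
        d = sq[np.ix_(members, members)]       # intra-edge pairwise distances
        sigma2 = d.sum() / k + 1e-12           # Eq (2.9)
        w[e] = np.exp(-d / sigma2).sum()       # Eq (2.8)
    Dv = np.diag(H @ w)                        # vertex degrees, Eq (2.10)
    De_inv = np.diag(1.0 / H.sum(axis=0))      # inverse edge degrees, Eq (2.11)
    S = H @ np.diag(w) @ De_inv @ H.T
    return Dv - S                              # L_hyper = Dv - S, Eq (2.12)
```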
In this section, a novel algorithm called RCHNMTF is proposed. After that, an efficient optimization algorithm is designed. The computational complexity of RCHNMTF and the comparison algorithms is then analyzed. Figure 1 shows the framework of RCHNMTF.
The optimization problem of NMTF with capped l2,p-norm is formulated as follows:
$$\min_{U,B,V}\ \sum_{i=1}^{n}\min\{\|(X-UBV^T)_i\|_2^p,\ \theta\}\quad \mathrm{s.t.}\ U\ge 0,\ B\ge 0,\ V\ge 0,\ p\in(0,1] \tag{3.1}$$
where θ denotes a thresholding parameter.
In optimization problem (3.1), $\sum_{i=1}^{n}\min\{\|(X-UBV^T)_i\|_2^p,\theta\}$ measures the residual error of each sample point with the capped $l_{2,p}$-norm. If the residual error $\|(X-UBV^T)_i\|_2^p$ of data point $x_i$ exceeds $\theta$, the data point is identified as an extreme outlier and its residual error is capped, so its influence on the whole model is reduced. For the other data points with $\|(X-UBV^T)_i\|_2^p<\theta$, the optimization problem minimizes $\sum_{i}\|(X-UBV^T)_i\|_2^p$, which is also robust since it raises the residual error of each sample point only to the $p$-th power, whereas the Frobenius norm in the original NMF squares it. Therefore, by utilizing the capped $l_{2,p}$-norm, the proposed algorithm decreases the influence of extreme outliers and achieves good robustness.
To exploit the geometric information of the feature space and the sample space, dual hyper-graph regularization is incorporated into optimization problem (3.1). Let $L^U_{hyper}$ and $L^V_{hyper}$ represent the unnormalized hyper-graph Laplacian matrices associated with $U$ and $V$, respectively. With these Laplacian matrices, we extend the optimization problem as:
$$\min_{U,B,V}\ \sum_{i=1}^{n}\min\{\|(X-UBV^T)_i\|_2^p,\ \theta\}+\alpha\,\mathrm{Tr}(V^TL^V_{hyper}V)+\alpha\,\mathrm{Tr}(U^TL^U_{hyper}U)\quad \mathrm{s.t.}\ U\ge 0,\ B\ge 0,\ V\ge 0,\ p\in(0,1] \tag{3.2}$$
where $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix and $\alpha$ is the dual hyper-graph regularization parameter. The dual hyper-graph regularization term is $\alpha\,\mathrm{Tr}(V^TL^V_{hyper}V)+\alpha\,\mathrm{Tr}(U^TL^U_{hyper}U)$; it utilizes two nearest neighbor hyper-graphs to exploit the geometric structure of the sample manifold and the feature manifold.
Orthogonality constraints are important in NMTF because they lead to a sparse and unique solution. To encourage orthogonality of the matrices $U$ and $V$, the orthogonality penalty terms $\beta\|U^TU-I_k\|_F^2$ and $\beta\|V^TV-I_k\|_F^2$ are added to the optimization problem:
$$\min_{U,B,V}\ \sum_{i=1}^{n}\min\{\|(X-UBV^T)_i\|_2^p,\ \theta\}+\alpha\,\mathrm{Tr}(V^TL^V_{hyper}V)+\alpha\,\mathrm{Tr}(U^TL^U_{hyper}U)+\beta\|U^TU-I_k\|_F^2+\beta\|V^TV-I_k\|_F^2\quad \mathrm{s.t.}\ U\ge 0,\ B\ge 0,\ V\ge 0,\ p\in(0,1] \tag{3.3}$$
where $\beta\ge 0$ denotes the orthogonality parameter.
First, the following equation can be easily proved:
$$\|X-UBV^T\|_{2,p}^p=\mathrm{Tr}\big((X-UBV^T)H(X-UBV^T)^T\big),\quad p\in(0,1] \tag{3.4}$$
where $H$ is the diagonal matrix whose $i$-th diagonal element is calculated as:
$$h_{ii}=\frac{1}{\|(X-UBV^T)_i\|_2^{2-p}} \tag{3.5}$$
Optimization problem (3.3) is clearly non-convex, so it is not easy to solve directly. It is therefore reformulated as follows:
$$\min_{U,B,V}\ \mathrm{Tr}\big((X-UBV^T)D(X-UBV^T)^T\big)+\alpha\,\mathrm{Tr}(V^TL^V_{hyper}V)+\alpha\,\mathrm{Tr}(U^TL^U_{hyper}U)+\beta\|U^TU-I_k\|_F^2+\beta\|V^TV-I_k\|_F^2\quad \mathrm{s.t.}\ U\ge 0,\ B\ge 0,\ V\ge 0,\ p\in(0,1] \tag{3.6}$$
where $I_k$ is the $k\times k$ identity matrix and $D$ is the diagonal matrix whose $i$-th diagonal element is calculated as:
$$d_{ii}=\begin{cases}\dfrac{1}{\|(X-UBV^T)_i\|_2^{2-p}}, & \text{if } \|(X-UBV^T)_i\|_2^{p}<\theta;\\[1mm] 0, & \text{otherwise.}\end{cases} \tag{3.7}$$
Considering $\|X\|_F^2=\mathrm{Tr}(XX^T)$, optimization problem (3.6) is rewritten as follows:
$$\min_{U,B,V}\ \mathrm{Tr}(XDX^T)-2\,\mathrm{Tr}(UBV^TDX^T)+\mathrm{Tr}(UBV^TDVB^TU^T)+\alpha\,\mathrm{Tr}(V^TL^V_{hyper}V)+\alpha\,\mathrm{Tr}(U^TL^U_{hyper}U)+\beta\,\mathrm{Tr}(U^TUU^TU-2U^TU+I_k)+\beta\,\mathrm{Tr}(V^TVV^TV-2V^TV+I_k) \tag{3.8}$$
Optimization problem (3.8) can be solved by a multiplicative iteration method. Let $\phi=[\phi_{jk}]\in\mathbb{R}^{m\times k}$, $\omega=[\omega_{kk'}]\in\mathbb{R}^{k\times k}$ and $\psi=[\psi_{ik}]\in\mathbb{R}^{n\times k}$ be the Lagrange multipliers for the constraints $U\ge 0$, $B\ge 0$ and $V\ge 0$, respectively. The Lagrange function $L$ is obtained as follows:
$$L=\mathrm{Tr}(XDX^T)-2\,\mathrm{Tr}(UBV^TDX^T)+\mathrm{Tr}(UBV^TDVB^TU^T)+\alpha\,\mathrm{Tr}(V^TL^V_{hyper}V)+\alpha\,\mathrm{Tr}(U^TL^U_{hyper}U)+\beta\,\mathrm{Tr}(U^TUU^TU-2U^TU+I_k)+\beta\,\mathrm{Tr}(V^TVV^TV-2V^TV+I_k)+\mathrm{Tr}(\phi U^T)+\mathrm{Tr}(\omega B^T)+\mathrm{Tr}(\psi V^T) \tag{3.9}$$
The partial derivatives of L with respect to U, B and V are as follows:
$$\frac{\partial L}{\partial U}=-2XDVB^T+2UBV^TDVB^T+2\alpha L^U_{hyper}U+4\beta(UU^TU-U)+\phi \tag{3.10}$$
$$\frac{\partial L}{\partial B}=-2U^TXDV+2U^TUBV^TDV+\omega \tag{3.11}$$
$$\frac{\partial L}{\partial V}=-2DX^TUB+2DVB^TU^TUB+2\alpha L^V_{hyper}V+4\beta(VV^TV-V)+\psi \tag{3.12}$$
Considering that $L^U_{hyper}$ and $L^V_{hyper}$ are not non-negative matrices, we refer to (2.12) and define the non-negative matrices $D^U_{hyper}$, $S^U_{hyper}$, $D^V_{hyper}$, $S^V_{hyper}$ such that:
$$L^U_{hyper}=D^U_{hyper}-S^U_{hyper},\qquad L^V_{hyper}=D^V_{hyper}-S^V_{hyper} \tag{3.13}$$
Using the KKT conditions $\phi_{jk}u_{jk}=0$, $\omega_{kk'}b_{kk'}=0$ and $\psi_{ik}v_{ik}=0$, we have:
$$-\big(XDVB^T+\alpha S^U_{hyper}U+2\beta U\big)_{jk}u_{jk}+\big(UBV^TDVB^T+\alpha D^U_{hyper}U+2\beta UU^TU\big)_{jk}u_{jk}=0 \tag{3.14}$$
$$-\big(U^TXDV\big)_{kk'}b_{kk'}+\big(U^TUBV^TDV\big)_{kk'}b_{kk'}=0 \tag{3.15}$$
$$-\big(DX^TUB+\alpha S^V_{hyper}V+2\beta V\big)_{ik}v_{ik}+\big(DVB^TU^TUB+\alpha D^V_{hyper}V+2\beta VV^TV\big)_{ik}v_{ik}=0 \tag{3.16}$$
According to the above equations, the multiplicative update rules of RCHNMTF are as follows:
$$u_{jk}\leftarrow u_{jk}\,\frac{\big(XDVB^T+\alpha S^U_{hyper}U+2\beta U\big)_{jk}}{\big(UBV^TDVB^T+\alpha D^U_{hyper}U+2\beta UU^TU\big)_{jk}} \tag{3.17}$$
$$b_{kk'}\leftarrow b_{kk'}\,\frac{\big(U^TXDV\big)_{kk'}}{\big(U^TUBV^TDV\big)_{kk'}} \tag{3.18}$$
$$v_{ik}\leftarrow v_{ik}\,\frac{\big(DX^TUB+\alpha S^V_{hyper}V+2\beta V\big)_{ik}}{\big(DVB^TU^TUB+\alpha D^V_{hyper}V+2\beta VV^TV\big)_{ik}} \tag{3.19}$$
Algorithm 1 Alternative iteration algorithm for RCHNMTF |
Input: Data matrix $X\in\mathbb{R}^{m\times n}$; parameters $\alpha$, $\beta$, $k$. |
Output: $U\in\mathbb{R}^{m\times k}$, $V\in\mathbb{R}^{n\times k}$ |
1: Initialize $U\ge 0$, $B\ge 0$, $V\ge 0$ |
2: Initialize matrix D based on Eq (3.7) |
3: Repeat: |
4: Calculate matrix D based on Eq (3.7) |
5: Update U based on Eq (3.17) |
6: Update B based on Eq (3.18) |
7: Update V based on Eq (3.19) |
8: Until convergence |
9: Return U,V |
10: Apply the k-means clustering method to $U$, $V$ to obtain the learned cluster labels.
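For readers who want to reproduce Algorithm 1, the following is a compact numpy sketch of one possible implementation. The Laplacians of the feature and sample hyper-graphs are assumed precomputed (e.g., by a routine like `hypergraph_laplacian` above applied to $X^T$ and $X$); `eps` and the random initialization are implementation details, and for brevity $\theta$ is re-estimated from the outlier ratio $s$ at every iteration instead of being fixed after the first five iterations as described in Section 4:

```python
import numpy as np

def rchnmtf(X, k, Lu, Lv, alpha=100.0, beta=0.01, p=0.5, s=0.05,
            n_iter=200, eps=1e-10, seed=0):
    """Sketch of Algorithm 1 with the updates (3.17)-(3.19).

    Lu (m x m) and Lv (n x n) are the hyper-graph Laplacians of the
    feature space and the sample space, respectively.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U, B, V = rng.random((m, k)), rng.random((k, k)), rng.random((n, k))
    # split the Laplacians into non-negative parts, Eq (3.13)
    Du, Su = np.diag(np.diag(Lu)), np.diag(np.diag(Lu)) - Lu
    Dv, Sv = np.diag(np.diag(Lv)), np.diag(np.diag(Lv)) - Lv
    for _ in range(n_iter):
        R = np.linalg.norm(X - U @ B @ V.T, axis=0)  # per-sample residuals
        theta = np.quantile(R ** p, 1.0 - s)         # cap threshold from ratio s
        d = np.where(R ** p < theta, 1.0 / (R ** (2.0 - p) + eps), 0.0)
        D = np.diag(d)                               # reweighting matrix, Eq (3.7)
        U *= (X @ D @ V @ B.T + alpha * Su @ U + 2 * beta * U) / (
            U @ B @ V.T @ D @ V @ B.T + alpha * Du @ U
            + 2 * beta * U @ U.T @ U + eps)          # Eq (3.17)
        B *= (U.T @ X @ D @ V) / (
            U.T @ U @ B @ V.T @ D @ V + eps)         # Eq (3.18)
        V *= (D @ X.T @ U @ B + alpha * Sv @ V + 2 * beta * V) / (
            D @ V @ B.T @ U.T @ U @ B + alpha * Dv @ V
            + 2 * beta * V @ V.T @ V + eps)          # Eq (3.19)
    return U, B, V
```

As in step 10 of Algorithm 1, k-means can then be applied to the learned $U$ and $V$ to obtain the cluster labels.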
In this part, the computational complexity of the proposed RCHNMTF algorithm and the comparison algorithms is calculated and presented; the computational complexity of NMF is also added for reference. To present the complexity accurately, big O notation and three arithmetic operations, namely addition, multiplication and division, are adopted. Table 1 shows the detailed per-iteration computational complexity of each algorithm. It can be seen that the extra factor matrix, graph regularization, hyper-graph regularization and orthogonality constraints all increase the computational complexity. However, these extra terms do not dramatically affect the runtime of the algorithms, for two reasons. First, measured by big O notation, the computational complexity of all algorithms is O(mnk). Second, considering k<<min(m,n), the mnk term dominates the complexity, and the coefficient of the mnk term in RCHNMTF is not high, which indicates that the complexity of RCHNMTF is moderate. Moreover, compared with other M-estimator based NMF algorithms, the cost of the capped lq,p-norm is small (e.g., correntropy based NMF algorithms require exponential operations). If the computational complexity of RCHNMTF is counted from the beginning through t iterations, it is O(tmnk+mn^2+nm^2), where the extra O(mn^2+nm^2) comes from constructing the dual hyper-graphs in the feature space and the sample space.
Algorithm | floating point addition | floating point multiplication | floating point division | overall |
CNMTF | 5mnk+9mk^2+6nk^2+2k^3+11mk+5nk+(m+n)pk | 5mnk+9mk^2+6nk^2+2k^3+8mk+2nk+k^2+(m+n)pk | mk+nk+k^2 | O(mnk) |
CHNMF | 2mnk+2mk^2+2nk^2+4mk+3nk+npk | 2mnk+2mk^2+2nk^2+5mk+2nk+npk | mk+nk | O(mnk) |
DNMF | 2mnk+2mk^2+2nk^2+3mk+3nk+(m+n)pk | 2mnk+2mk^2+2nk^2+2mk+2nk+(m+n)pk | mk+nk | O(mnk) |
GNMF | 2mnk+2mk^2+2nk^2+mk+3nk+npk | 2mnk+2mk^2+2nk^2+mk+2nk+npk | mk+nk | O(mnk) |
HNMF | 2mnk+2mk^2+2nk^2+mk+3nk+npk | 2mnk+2mk^2+2nk^2+mk+2nk+npk | mk+nk | O(mnk) |
RHNMF | 2mnk+2mk^2+2nk^2+7nk+npk | 2mnk+2mk^2+2nk^2+mk+6nk+npk | mk+nk | O(mnk) |
CaNMF | 3mnk+mk^2+3nk^2+4nk | 3mnk+mk^2+3nk^2+mk+5nk | mk+nk | O(mnk) |
DHSNMF | 2mnk+2mk^2+2nk^2+3mk+3nk+(m+n)pk | 2mnk+2mk^2+2nk^2+2mk+2nk+(m+n)pk | mk+nk | O(mnk) |
RCHNMTF | 3mnk+8mk^2+5nk^2+6k^3+5mk+11nk+(m+n)pk | 3mnk+8mk^2+5nk^2+6k^3+2mk+8nk+k^2+(m+n)pk | mk+nk+k^2 | O(mnk) |
In this section, the RCHNMTF algorithm is compared with eight state-of-the-art algorithms on seven datasets (i.e., AR100*, ORL†, YALE‡, PIE§, MSRA25¶, PENDIGITS|| and COIL100**) to verify its effectiveness and robustness.
*http://www2.ece.ohio-state.edu/aleix/ARdatabase.html
†http://www.cl.cam.ac.uk/Research/DTG/attarchive:pub/data/att_faces.tar.Z
‡http://cvc.cs.yale.edu/cvc/projects/yalefaces/yalefaces.html
¶http://www.escience.cn/people/fpnie/index.html
||http://archive.ics.uci.edu/ml/datasets/pen-based+recognition+of+handwritten+digits
**https://www.cs.columbia.edu/CAVE/software/softlib/coil-100.php
Seven real-world datasets are used in the experiments; their descriptions can be found in Table 2. Besides, to simulate real situations in which parts of the sample points are affected by noise and outliers, four contaminated datasets are produced by adding noise to parts of the sample points of the original datasets while leaving the other data points unchanged. Specifically, the noise settings are: 1) the last 5 images of each class in the PIE dataset are corrupted by Gaussian noise with mean 0 and variance V = 0.1, 0.2, 0.3, 0.4, 0.5; 2) the last 4 images of each class in the YALE dataset are contaminated by speckle noise with mean 0 and variance V = 0.08, 0.16, 0.24, 0.32, 0.4; 3) the last 2 images of each class in the ORL dataset are occluded by a×a-block noise with a = 6, 7, 8, 9, 10; 4) the last 2 images of each class in the ORL dataset are contaminated by salt & pepper noise with noise density D = 10, 20, 30, 40, 50%. Figure 2 shows an example class from each polluted dataset.
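As an illustration, the four contamination schemes can be injected with a few lines of numpy; the [0, 1] pixel range, the zero-valued occlusion blocks and the final clipping are our assumptions rather than details given in the paper:

```python
import numpy as np

def add_noise(imgs, kind, level, rng=np.random.default_rng(0)):
    """Sketch of the four contamination schemes described above.

    imgs : (n_samples, h, w) stack of the pre-selected images, values
    assumed in [0, 1]; only the chosen per-class samples are passed in.
    """
    out = imgs.astype(float).copy()
    if kind == "gaussian":        # additive Gaussian, mean 0, variance = level
        out += rng.normal(0.0, np.sqrt(level), out.shape)
    elif kind == "speckle":       # multiplicative speckle, mean 0, variance = level
        out += out * rng.normal(0.0, np.sqrt(level), out.shape)
    elif kind == "block":         # a x a occlusion block, a = int(level)
        a = int(level)
        for img in out:
            r = rng.integers(0, img.shape[0] - a + 1)
            c = rng.integers(0, img.shape[1] - a + 1)
            img[r:r + a, c:c + a] = 0.0   # assumed zero-valued block
    elif kind == "salt_pepper":   # density = level; half salt, half pepper
        u = rng.random(out.shape)
        out[u < level / 2] = 0.0
        out[u > 1.0 - level / 2] = 1.0
    return np.clip(out, 0.0, 1.0)
```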
Dataset | Samples | Features | Classes | Data type |
AR100 | 1300 | 1024 | 100 | Face Image |
ORL | 400 | 1024 | 40 | Face Image |
YALE | 165 | 1024 | 15 | Face Image |
PIE | 1166 | 1024 | 53 | Face Image |
MSRA25 | 1799 | 256 | 12 | Face Image |
PENDIGITS | 10992 | 16 | 10 | Handwritten digits |
COIL100 | 7200 | 1024 | 100 | Object image |
The proposed RCHNMTF algorithm is compared with eight state-of-the-art algorithms, listed as follows, to verify its clustering performance and robustness:
1) CNMTF [50]: a correntropy based NMTF that combines dual graph regularization and orthogonality constraints.
2) CHNMF [7]: incorporates correntropy and hyper-graph regularization into NMF.
3) DNMF [17]: utilizes dual graph regularization to learn the geometrical information of the sample manifold and the feature manifold.
4) GNMF [35]: constructs a single graph to consider the geometrical information of the sample manifold.
5) HNMF [43]: a hyper-graph regularized NMF that constructs a hyper-graph to explore the geometrical information of the sample manifold.
6) RHNMF [8]: a robust NMF that utilizes the l2,1-norm and hyper-graph regularization.
7) CaNMF [46]: a robust NMF that adopts a capped norm to cap the residual error of extreme outliers.
8) DHSNMF [51]: a robust dual hyper-graph regularized supervised NMF. To compare DHSNMF with the other unsupervised algorithms fairly, its label information parameter is set to 0.
To evaluate the performance and robustness of RCHNMTF and the comparison algorithms soundly, three evaluation metrics are introduced: purity (PUR), normalized mutual information (NMI) and accuracy (ACC).
Purity measures how well each cluster contains samples from primarily one class; it is defined as follows:
$$\mathrm{Purity}=\frac{1}{N}\sum_{j=1}^{N}\max_i\big(n_j^i\big) \tag{4.1}$$
where $n_j^i$ denotes the number of samples in cluster $i$ that also belong to the original class $j$, and $N$ is the total number of samples.
NMI calculates the information shared by two clusterings and expresses the degree of agreement between them. Given the ground truth clustering $C$ and the clustering $\bar{C}$ produced by the algorithm, the NMI of $C$ and $\bar{C}$ is calculated as follows:
$$\mathrm{NMI}=\frac{\mathrm{MI}(C,\bar{C})}{\max\big(H(C),H(\bar{C})\big)} \tag{4.2}$$
where $\mathrm{MI}(C,\bar{C})$ denotes the mutual information of the clusterings $C$ and $\bar{C}$, and $H(C)$ and $H(\bar{C})$ are the entropies of $C$ and $\bar{C}$, respectively.
Given the ground truth label $l_i$ and the learned cluster label $r_i$, accuracy is defined as follows:
$$\mathrm{ACC}=\frac{\sum_{n=1}^{N}\delta\big(l_n,\mathrm{map}(r_n)\big)}{N} \tag{4.3}$$
where $\delta(x,y)=1$ if $x=y$ and $\delta(x,y)=0$ otherwise, and the mapping function $\mathrm{map}(\cdot)$ is obtained by the Hungarian algorithm [52]. For all three evaluation metrics (PUR, NMI and ACC), higher values indicate better clustering performance.
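A short sketch of Eq (4.3): the confusion matrix between predicted clusters and true classes is built, and scipy's `linear_sum_assignment` (an implementation of the Hungarian algorithm) finds the label mapping that maximizes the number of matched samples:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(labels_true, labels_pred):
    """ACC of Eq (4.3): best one-to-one mapping between predicted clusters
    and ground-truth classes, found with the Hungarian algorithm."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    classes = np.unique(labels_true)
    clusters = np.unique(labels_pred)
    # counts[i, j] = samples assigned to cluster i whose true class is j
    counts = np.zeros((len(clusters), len(classes)), dtype=int)
    for i, c in enumerate(clusters):
        for j, t in enumerate(classes):
            counts[i, j] = np.sum((labels_pred == c) & (labels_true == t))
    rows, cols = linear_sum_assignment(-counts)  # maximize matched samples
    return counts[rows, cols].sum() / len(labels_true)
```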
In this part, the experimental setup is reported in detail. The dimension $k$ of $B\in\mathbb{R}^{k\times k}$ is set to the number of true classes for all datasets. The k-nearest neighbor method is applied to construct the graphs and hyper-graphs of the graph regularized and hyper-graph regularized algorithms, with the number of nearest neighbors set to 5 empirically. The heat kernel method is adopted to assign weights to the edges of a graph; for a hyper-graph, the weight of a hyper-edge is calculated by (2.8). The parameter $\theta$ is not fixed but set according to the outlier ratio $s$, which denotes the ratio of outlier samples to the whole sample set. Specifically, in the first five iterations, the outlier data with the largest $l_{2,p}$ loss at a ratio of $s$ are selected to determine $\theta$. To better reduce the influence of extreme sample points, $s$ is set to 0.05 and 0.1 for the unpolluted and polluted real-world datasets, respectively. Additionally, the dual hyper-graph regularization parameter $\alpha$ in RCHNMTF is tuned to 100 and the orthogonality parameter $\beta$ is set to 0.01. Moreover, the parameter $p$ in the capped $l_{2,p}$-norm is set to 0.5 to achieve good clustering performance and robustness. The parameters of the comparison algorithms are set to their default values. The matrices $U$, $B$, $V$ are initialized randomly, and each method is run 20 times on each dataset. After decomposition, the k-means method is applied to the low-dimensional representation matrix $V$ to obtain the clustering performance measured by PUR, NMI and ACC.
To test the clustering performance of RCHNMTF, it and the eight comparison algorithms are first evaluated on seven real-world datasets. Tables 3–5 report the clustering performance on the seven real-world datasets measured by PUR, NMI and ACC. Figures 3–6 present the clustering performance on the polluted datasets, and Figure 7 shows the convergence results on the seven datasets. To validate the robustness of RCHNMTF, the algorithms are further evaluated on four contaminated datasets. From the experimental results on the seven real-world datasets and the four polluted datasets, the following conclusions are obtained:
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
CNMTF [50] | 48.78±1.24 | 70.00±2.18 | 63.92±2.81 | 57.31±4.16 | 47.03±2.42 | 73.44±3.17 | 47.72±1.17 |
CHNMF [7] | 52.28±1.37 | 74.83±1.60 | 62.56±1.54 | 57.42±2.16 | 46.42±2.10 | 67.12±4.06 | 47.86±1.55 |
DNMF [17] | 59.70±1.47 | 77.87±2.71 | 60.94±1.88 | 54.78±2.84 | 42.42±3.18 | 73.64±4.45 | 54.81±2.14 |
GNMF [35] | 58.14±1.63 | 78.73±1.95 | 63.44±2.99 | 53.00±2.66 | 42.24±3.12 | 71.00±5.19 | 55.33±1.10 |
HNMF [43] | 55.43±1.57 | 76.44±2.81 | 64.05±1.67 | 55.04±1.87 | 42.60±2.42 | 73.24±5.46 | 58.00±1.29 |
RHNMF [8] | 59.05±1.22 | 78.49±2.21 | 63.05±1.50 | 54.37±1.78 | 47.93±1.39 | 71.46±6.22 | 50.62±1.58 |
DHSNMF [51] | 55.96±1.77 | 75.28±2.58 | 65.12±1.58 | 52.57±1.998 | 43.15±2.58 | 70.27±4.32 | 51.91±1.34 |
RCHNMTF | 60.08±1.91 | 80.04±2.44 | 65.50±2.31 | 59.13±2.72 | 45.33±2.39 | 74.24±2.91 | 58.26±1.64 |
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
CNMTF [50] | 51.98±1.37 | 74.10±1.63 | 69.12±1.96 | 58.84±3.63 | 48.00±2.70 | 75.13±2.71 | 51.91±1.06 |
CHNMF [7] | 55.79±1.42 | 78.30±1.41 | 67.36±1.57 | 59.57±1.49 | 47.21±2.25 | 70.22±3.02 | 52.73±1.34 |
DNMF [17] | 62.99±1.23 | 81.50±2.02 | 67.02±1.50 | 56.85±2.31 | 43.87±2.53 | 75.56±3.37 | 58.83±1.59 |
GNMF [35] | 61.20±1.45 | 82.50±1.85 | 68.34±1.73 | 55.05±2.03 | 43.75±2.90 | 73.61±3.31 | 59.12±0.84 |
HNMF [43] | 58.37±1.54 | 80.64±2.07 | 68.25±1.55 | 57.67±1.25 | 43.87±2.06 | 74.86±3.58 | 62.35±0.93 |
RHNMF [8] | 61.52±0.78 | 79.53±2.39 | 68.10±1.56 | 61.11±3.73 | 44.97±1.81 | 74.38±2.78 | 55.78±1.16 |
CaNMF [46] | 61.96±0.99 | 81.97±1.81 | 66.75±1.31 | 56.81±1.63 | 48.66±1.64 | 73.42±4.53 | 53.36±1.18 |
DHSNMF [51] | 58.89±1.46 | 79.19±1.95 | 69.20±1.33 | 55.53±1.88 | 44.06±2.08 | 72.48±2.84 | 56.30±1.21 |
RCHNMTF | 62.50±1.64 | 83.35±1.68 | 69.61±2.01 | 61.22±2.82 | 46.48±2.40 | 75.87±2.67 | 62.75±1.44 |
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
CNMTF [50] | 75.85±1.24 | 83.37±1.22 | 82.35±1.28 | 61.77±3.72 | 51.68±2.37 | 71.10±2.55 | 74.99±0.55 |
CHNMF [7] | 78.41±0.72 | 86.29±1.11 | 80.48±0.79 | 63.61±2.46 | 51.70±2.24 | 65.52±2.22 | 75.43±0.31 |
DNMF [17] | 82.54±0.54 | 88.48±1.11 | 82.56±0.37 | 60.87±2.19 | 48.26±2.11 | 69.61±2.98 | 81.77±0.32 |
GNMF [35] | 82.15±0.78 | 89.52±1.47 | 81.70±1.29 | 59.03±1.71 | 48.02±1.72 | 69.63±1.23 | 81.65±0.24 |
HNMF [43] | 80.62±0.96 | 88.54±1.27 | 80.87±0.88 | 60.85±1.84 | 47.83±1.38 | 70.33±1.31 | 78.29±0.81 |
RHNMF [8] | 82.31±0.20 | 87.21±1.42 | 80.69±0.93 | 63.92±3.28 | 49.38±2.10 | 69.81±1.43 | 78.27±0.18 |
CaNMF [46] | 82.08±0.50 | 88.53±1.20 | 79.74±0.70 | 59.68±1.95 | 51.51±1.41 | 68.85±3.42 | 76.18±0.46 |
DHSNMF [51] | 79.8±0.64 | 87.54±1.03 | 81.84±1.13 | 60.45±2.79 | 48.33±1.43 | 68.10±1.71 | 77.00±0.45 |
RCHNMTF | 82.56±0.77 | 89.64±0.99 | 81.84±0.56 | 64.92±2.28 | 50.46±1.08 | 67.46±2.36 | 80.78±0.67 |
1) RCHNMTF outperforms the other algorithms on most original datasets (e.g., AR100, PIE, ORL, MSRA25, PENDIGITS, COIL100), for the following reasons: i) the capped l2,p-norm alleviates the impact of the noise and outliers inherent in the original datasets; ii) dual hyper-graph regularization helps RCHNMTF capture the intrinsic geometrical information of the feature manifold and the data manifold; iii) the orthogonal NMTF framework provides a decomposition with more degrees of freedom and enables RCHNMTF to learn a unique decomposition result.
2) The algorithms that utilize the lq,p-norm or the capped lq,p-norm (e.g., RHNMF, CaNMF, RCHNMTF) are more robust than the other algorithms, since they calculate the residual error of each sample point individually. When parts of the sample points are contaminated, the lq,p-norm and the capped lq,p-norm can distinguish the polluted sample points and reduce their influence. Moreover, RCHNMTF is more robust than CaNMF and RHNMF because the capped l2,p-norm caps the residual error of extreme outlier points and uses the l2,0.5-norm to calculate the residual error of the other data, which alleviates the impact of outliers as much as possible. Therefore, RCHNMTF outperforms the other algorithms when the dataset contains extreme outliers.
3) The convergence curves in Figure 7 demonstrate the convergence behavior of RCHNMTF on the seven datasets.
Parameter selection is crucial to the RCHNMTF algorithm. The main parameters in RCHNMTF are $p$ in the capped $l_{2,p}$-norm, the outlier ratio $s$, the thresholding parameter $\theta$, the hyper-graph regularization parameter $\alpha$ and the orthogonality constraint parameter $\beta$. During the experiments, $\alpha$ and $\beta$ are fixed at 100 and 0.01 [8,50]. The value of $\theta$ is set according to the outlier ratio $s$: in the first five iterations, the outlier data with the largest $l_{2,p}$ loss at a ratio of $s$ are selected, and the corresponding $l_{2,p}$ loss value is assigned to $\theta$. By this means, the outlier data with the largest $l_{2,p}$ loss are capped and their impact on the whole model is reduced. In the parameter selection experiments, $s$ and $p$ are adjusted to find their optimal values. Figure 8 shows the clustering accuracy on different datasets with $s$ ranging over {0, 0.05, 0.1, 0.15, 0.2}. Table 6 reports the clustering performance with $p$ ranging over {0.25, 0.5, 0.75}. The conclusions are as follows:
Accuracy under different values of p | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
p=0.75 | 58.67 | 77.83 | 63.25 | 52.15 | 44.72 | 66.88 | 54.975 |
p=0.5 | 60.08 | 80.04 | 65.50 | 59.13 | 45.33 | 74.24 | 58.26 |
p=0.25 | 59.09 | 77.39 | 63.0 | 54.43 | 43.515 | 70.63 | 58.50 |
Purity under different values of p | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
p=0.75 | 61.40 | 81.32 | 67.7 | 55.23 | 46.42 | 69.22 | 59.48 |
p=0.5 | 62.50 | 83.35 | 69.61 | 61.22 | 46.48 | 75.87 | 62.75 |
p=0.25 | 61.75 | 81.02 | 67.95 | 56.77 | 44.24 | 71.78 | 62.91 |
NMI under different values of p | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
p=0.75 | 82.68 | 87.90 | 80.665 | 58.24 | 49.94 | 65.61 | 78.83 |
p=0.5 | 82.56 | 89.64 | 81.84 | 64.92 | 50.46 | 67.46 | 81.78 |
p=0.25 | 83.24 | 88.50 | 80.67 | 60.42 | 48.18 | 66.29 | 80.61 |
1) Compared with the uncapped l2,p-norm, a proper value of s in the capped l2,p-norm improves the clustering performance on both original and noised datasets (e.g., s=0.05 for the original datasets and s=0.1 for the noised datasets), because the residual errors of the intrinsically noisy points in the original real-world datasets and of the injected outlier sample points in the noised datasets are capped, so the influence of noise and outliers is decreased. However, a relatively large s reduces the clustering performance, as unpolluted sample points are then misidentified as outliers.
2) RCHNMTF with p=0.5 performs better than with p=0.75 or p=0.25. Given that the l2,p-norm raises the residual error of each sample point to the p-th power, a relatively large p decreases the robustness of the capped l2,p-norm, while a relatively small p blurs the difference between normal sample points and outlier sample points.
In this subsection, an ablation study is conducted to demonstrate the benefit of dual hyper-graph regularization and orthogonality constraints. Table 7 records the ablation study of RCHNMTF for three cases: 1) without dual hyper-graph regularization; 2) without orthogonality constraints; 3) with both dual hyper-graph regularization and orthogonality constraints. Accordingly, (α, β) is set to: 1) (0, 0.01); 2) (100, 0); 3) (100, 0.01). From Table 7, we can see that both dual hyper-graph regularization and orthogonality constraints improve the clustering performance of RCHNMTF.
RCHNMTF without dual hyper-graph regularization | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
PUR | 53.93 | 79.14 | 58.85 | 55.22 | 44.84 | 70.34 | 26.05 |
NMI | 76.76 | 86.69 | 73.89 | 58.14 | 49.43 | 64.54 | 58.28 |
ACC | 51.33 | 75.48 | 55.40 | 53.61 | 43.75 | 69.62 | 24.02 |
RCHNMTF without orthogonality constraints | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
PUR | 33.82 | 78.21 | 67.10 | 46.51 | 39.39 | 61.50 | 29.03 |
NMI | 61.54 | 86.31 | 80.15 | 51.28 | 45.32 | 56.48 | 48.43 |
ACC | 30.32 | 74.97 | 62.75 | 43.46 | 37.57 | 60.78 | 24.33 |
RCHNMTF with dual hyper-graph regularization and orthogonality constraints | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
PUR | 62.50 | 83.35 | 69.61 | 61.22 | 46.48 | 75.87 | 62.75 |
NMI | 82.56 | 89.64 | 81.84 | 64.92 | 50.46 | 67.46 | 81.78 |
ACC | 60.08 | 80.04 | 65.50 | 59.13 | 45.33 | 74.24 | 58.26 |
In this paper, the robustness of NMTF is improved by introducing the capped l2,p-norm to cap extreme outlier points. A dual hyper-graph is constructed to encode the high-order geometrical information of the sample space and the feature space. The ONMTF framework is incorporated into RCHNMTF to obtain a unique clustering solution and improve clustering performance. To solve the formulated problem, the optimization problem is rewritten and an alternative iteration algorithm is designed. The computational complexity of RCHNMTF and the comparison algorithms is thoroughly analyzed. Abundant experiments verify that RCHNMTF performs well on real-world datasets and is especially advantageous on datasets with extreme outliers.
Although RCHNMTF is robust and has good clustering performance, it still has some limitations. First, RCHNMTF has many parameters, which makes parameter tuning cumbersome. Second, although the first term in (3.3) can distinguish and cap outlier sample points, the orthogonality penalty term and the hyper-graph regularization term treat outlier sample points as normal samples and cannot cap the influence of outliers. In the future, several investigations will be conducted to improve RCHNMTF:
1) Systematic mechanisms for simplifying parameter selection are needed.
2) The orthogonality penalty term and hyper-graph regularization term also need to be optimized for finding and capping outlier sample points.
3) Multi-view clustering is popular in machine learning, as it can analyze multi-view data [53,54,55,56,57,58,59,60,61,62]. It is worthwhile to extend RCHNMTF to the multi-view clustering setting.
The authors are grateful to the reviewers and the editor for their beneficial suggestions.
The authors declare that there is no conflict of interest.
[1] | I. T. Jolliffe, J. Cadima, Principal component analysis: a review and recent developments, Philos. Trans. R. Soc. A: Math. Phys. Eng. Sci., 374 (2016), 20150202. https://doi.org/10.1098/rsta.2015.0202 |
[2] | R. O. Duda, P. E. Hart, Pattern Classification, John Wiley & Sons, 2006. |
[3] | A. Gersho, R. M. Gray, Vector Quantization and Signal Compression, Springer Science & Business Media, 2012. |
[4] | D. Seung, L. Lee, Algorithms for non-negative matrix factorization, Adv. Neural Inf. Process. Syst., 13 (2001), 556–562. |
[5] | D. Li, S. Zhang, X. Ma, Dynamic module detection in temporal attributed networks of cancers, IEEE/ACM Trans. Comput. Biol. Bioinf., 19 (2021), 2219–2230. https://doi.org/10.1109/TCBB.2021.3069441 |
[6] | Z. Zhao, Z. Ke, Z. Gou, H. Guo, K. Jiang, R. Zhang, The trade-off between topology and content in community detection: An adaptive encoder–decoder-based nmf approach, Expert Syst. Appl., 209 (2022), 118230. https://doi.org/10.1016/j.eswa.2022.118230 |
[7] | N. Yu, M. J. Wu, J. X. Liu, C. H. Zheng, Y. Xu, Correntropy-based hypergraph regularized nmf for clustering and feature selection on multi-cancer integrated data, IEEE Trans. Cybern., 51 (2020), 3952–3963. https://doi.org/10.1109/TCYB.2020.3000799 |
[8] | N. Yu, Y. L. Gao, J. X. Liu, J. Wang, J. Shang, Hypergraph regularized nmf by l2,1-norm for clustering and com-abnormal expression genes selection, in 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), (2018), 578–582. |
[9] | M. Venkatasubramanian, K. Chetal, D. J. Schnell, G. Atluri, N. Salomonis, Resolving single-cell heterogeneity from hundreds of thousands of cells through sequential hybrid clustering and nmf, Bioinformatics, 36 (2020), 3773–3780. https://doi.org/10.1093/bioinformatics/btaa201 |
[10] | W. Wu, X. Ma, Network-based structural learning nonnegative matrix factorization algorithm for clustering of scrna-seq data, IEEE/ACM Trans. Comput. Biol. Bioinf., 20 (2022), 566–575. https://doi.org/10.1038/s41579-022-00790-1 |
[11] | R. Egger, J. Yu, A topic modeling comparison between lda, nmf, top2vec, and bertopic to demystify twitter posts, Front. Soc., 7 (2022). |
[12] | H. Che, J. Wang, Nonnegative matrix factorization algorithm based on a discrete-time projection neural network, Neural Networks, 103 (2018), 63–71. https://doi.org/10.1016/j.neunet.2018.03.003 |
[13] | H. Che, J. Wang, A. Cichocki, Bicriteria sparse nonnegative matrix factorization via two-timescale duplex neurodynamic optimization, IEEE Trans. Neural Networks Learn. Syst., 2021 (2021). |
[14] | H. Che, J. Wang, A two-timescale duplex neurodynamic approach to mixed-integer optimization, IEEE Trans. Neural Networks Learn. Syst., 32 (2020), 36–48. https://doi.org/10.1109/TNNLS.2020.2973760 |
[15] | X. Ma, W. Zhao, W. Wu, Layer-specific modules detection in cancer multi-layer networks, IEEE/ACM Trans. Comput. Biol. Bioinf., 2022 (2022). |
[16] | S. Wang, A. Huang, Penalized nonnegative matrix tri-factorization for co-clustering, Expert Syst. Appl., 78 (2017), 64–73. |
[17] | F. Shang, L. Jiao, F. Wang, Graph dual regularization non-negative matrix factorization for co-clustering, Pattern Recognit., 45 (2012), 2237–2250. https://doi.org/10.1016/j.patcog.2011.12.015 |
[18] | C. Ding, T. Li, W. Peng, H. Park, Orthogonal nonnegative matrix t-factorizations for clustering, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (2006), 126–135. |
[19] | J. Li, H. Che, X. Liu, Circuit design and analysis of smoothed l0 norm approximation for sparse signal reconstruction, Circuits Syst. Signal Process., (2022), 1–25. |
[20] | X. Ju, H. Che, C. Li, X. He, Solving mixed variational inequalities via a proximal neurodynamic network with applications, Neural Process. Lett., 54 (2022), 207–226. https://doi.org/10.1007/s11063-021-10628-1 |
[21] | H. Che, J. Wang, A collaborative neurodynamic approach to global and combinatorial optimization, Neural Networks, 114 (2019), 15–27. https://doi.org/10.1016/j.neunet.2019.02.002 |
[22] | X. Ju, H. Che, C. Li, X. He, G. Feng, Exponential convergence of a proximal projection neural network for mixed variational inequalities and applications, Neurocomputing, 454 (2021), 54–64. https://doi.org/10.1016/j.neucom.2021.04.059 |
[23] | C. Dai, H. Che, M.-F. Leung, A neurodynamic optimization approach for l1 minimization with application to compressed image reconstruction, Int. J. Artif. Intell. Tools, 30 (2021), 2140007. https://doi.org/10.1142/S0218213021400078 |
[24] | H. Che, J. Wang, A. Cichocki, Sparse signal reconstruction via collaborative neurodynamic optimization, Neural Networks, 154 (2022), 255–269. https://doi.org/10.1016/j.neunet.2022.07.018 |
[25] | H. Che, J. Wang, A. Cichocki, Neurodynamics-based iteratively reweighted convex optimization for sparse signal reconstruction, in 2022 12th International Conference on Information Science and Technology (ICIST), IEEE, (2022), 45–51. |
[26] | Y. Wang, J. Wang, H. Che, Two-timescale neurodynamic approaches to supervised feature selection based on alternative problem formulations, Neural Networks, 142 (2021), 180–191. https://doi.org/10.1016/j.neunet.2021.04.038 |
[27] | X. Ju, C. Li, H. Che, X. He, G. Feng, A proximal neurodynamic network with fixed-time convergence for equilibrium problems and its applications, IEEE Trans. Neural Networks Learn. Syst., 2022 (2022). |
[28] | F. Shang, L. Jiao, J. Shi, J. Chai, Robust positive semidefinite l-isomap ensemble, Pattern Recognit. Lett., 32 (2011), 640–649. https://doi.org/10.1016/j.patrec.2010.12.005 |
[29] | M. Belkin, P. Niyogi, V. Sindhwani, Manifold regularization: A geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res., 7 (2006). |
[30] | K. Chen, H. Che, X. Li, M. F. Leung, Graph non-negative matrix factorization with alternative smoothed l0 regularizations, Neural Comput. Appl., 2022 (2022), 1–15. |
[31] | X. Yang, H. Che, M. F. Leung, C. Liu, Adaptive graph nonnegative matrix factorization with the self-paced regularization, Appl. Intell., 2022 (2022), 1–18. |
[32] | Z. Huang, Y. Wang, X. Ma, Clustering of cancer attributed networks by dynamically and jointly factorizing multi-layer graphs, IEEE/ACM Trans. Comput. Biol. Bioinf., 19 (2021), 2737–2748. https://doi.org/10.1137/19M1301746 |
[33] | J. B. Tenenbaum, V. de Silva, J. C. Langford, A global geometric framework for nonlinear dimensionality reduction, Science, 290 (2000), 2319–2323. https://doi.org/10.1126/science.290.5500.2319 |
[34] | M. Belkin, P. Niyogi, Laplacian eigenmaps and spectral techniques for embedding and clustering, Adv. Neural Inf. Process. Syst., 14 (2001). |
[35] | D. Cai, X. He, J. Han, T. S. Huang, Graph regularized nonnegative matrix factorization for data representation, IEEE Trans. Pattern Anal. Mach. Intell., 33 (2010), 1548–1560. |
[36] | D. Zhou, J. Huang, B. Schölkopf, Learning with hypergraphs: Clustering, classification, and embedding, Adv. Neural Inf. Process. Syst., 19 (2006). |
[37] | J. Yu, D. Tao, M. Wang, Adaptive hypergraph learning and its application in image classification, IEEE Trans. Image Process., 21 (2012), 3262–3272. |
[38] | P. Zhou, X. Wang, L. Du, X. Li, Clustering ensemble via structured hypergraph learning, Inf. Fusion, 78 (2022), 171–179. https://doi.org/10.1016/j.inffus.2021.09.003 |
[39] | L. Xia, C. Huang, Y. Xu, J. Zhao, D. Yin, J. Huang, Hypergraph contrastive collaborative filtering, in Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, (2022), 70–79. |
[40] | Y. Feng, H. You, Z. Zhang, R. Ji, Y. Gao, Hypergraph neural networks, in Proceedings of the AAAI Conference on Artificial Intelligence, 33 (2019), 3558–3565. https://doi.org/10.1609/aaai.v33i01.33013558 |
[41] | J. Jiang, Y. Wei, Y. Feng, J. Cao, Y. Gao, Dynamic hypergraph neural networks, in International Joint Conference on Artificial Intelligence, (2019), 2635–2641. |
[42] | X. Liao, Y. Xu, H. Ling, Hypergraph neural networks for hypergraph matching, in Proceedings of the IEEE/CVF International Conference on Computer Vision, (2021), 1266–1275. |
[43] | K. Zeng, J. Yu, C. Li, J. You, T. Jin, Image clustering by hyper-graph regularized non-negative matrix factorization, Neurocomputing, 138 (2014), 209–217. https://doi.org/10.1016/j.neucom.2014.01.043 |
[44] | L. Du, X. Li, Y. D. Shen, Robust nonnegative matrix factorization via half-quadratic minimization, in 2012 IEEE 12th International Conference on Data Mining, (2012), 201–210. |
[45] | D. Kong, C. Ding, H. Huang, Robust nonnegative matrix factorization using l21-norm, in Proceedings of the 20th ACM International Conference on Information and Knowledge Management, (2011), 673–682. |
[46] | H. Gao, F. Nie, W. Cai, H. Huang, Robust capped norm nonnegative matrix factorization: Capped norm nmf, in Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, (2015), 871–880. |
[47] | Z. Li, J. Tang, X. He, Robust structured nonnegative matrix factorization for image representation, IEEE Trans. Neural Networks Learn. Syst., 29 (2017), 1947–1960. https://doi.org/10.1109/TNNLS.2017.2691725 |
[48] | N. Guan, T. Liu, Y. Zhang, D. Tao, L. S. Davis, Truncated cauchy non-negative matrix factorization, IEEE Trans. Pattern Anal. Mach. Intell., 41 (2017), 246–259. |
[49] | N. Guan, D. Tao, Z. Luo, J. Shawe-Taylor, Mahnmf: Manhattan non-negative matrix factorization, Statics, 1050 (2012), 14. |
[50] | S. Peng, W. Ser, B. Chen, Z. Lin, Robust orthogonal nonnegative matrix tri-factorization for data representation, Knowl. Based Syst., 201 (2020), 106054. https://doi.org/10.1016/j.knosys.2020.106054 |
[51] | C. Y. Wang, N. Yu, M. J. Wu, Y. L. Gao, J. X. Liu, J. Wang, Dual hyper-graph regularized supervised nmf for selecting differentially expressed genes and tumor classification, IEEE/ACM Trans. Comput. Biol. Bioinf., 18 (2020), 2375–2383. https://doi.org/10.1109/TCBB.2020.2975173 |
[52] | L. Lovász, M. D. Plummer, Matching Theory, American Mathematical Society, 2009. https://doi.org/10.1090/chel/367 |
[53] | X. Gao, X. Ma, W. Zhang, J. Huang, H. Li, Y. Li, J. Cui, Multi-view clustering with self-representation and structural constraint, IEEE Trans. Big Data, 8 (2021), 882–893. https://doi.org/10.1109/TBDATA.2021.3128906 |
[54] | C. Liu, W. Cao, S. Wu, W. Shen, D. Jiang, Z. Yu, H. S. Wong, Supervised graph clustering for cancer subtyping based on survival analysis and integration of multi-omic tumor data, IEEE/ACM Trans. Comput. Biol. Bioinf., 19 (2020), 1193–1202. |
[55] | C. Liu, S. Wu, R. Li, D. Jiang, H. S. Wong, Self-supervised graph completion for incomplete multi-view clustering, IEEE Trans. Knowl. Data Eng., 2023 (2023), forthcoming. |
[56] | C. Liu, R. Li, S. Wu, H. Che, D. Jiang, Z. Yu, H. S. Wong, Self-guided partial graph propagation for incomplete multiview clustering, IEEE Trans. Neural Networks Learn. Syst., 2023 (2023). |
[57] | C. Li, H. Che, M. F. Leung, C. Liu, Z. Yan, Robust multi-view non-negative matrix factorization with adaptive graph and diversity constraints, Inf. Sci., 2023 (2023). |
[58] | B. Pan, C. Li, H. Che, Nonconvex low-rank tensor approximation with graph and consistent regularizations for multi-view subspace learning, Neural Networks, 161 (2023), 638–658. |
[59] | S. Wang, Z. Chen, S. Du, Z. Lin, Learning deep sparse regularizers with applications to multi-view clustering and semi-supervised classification, IEEE Trans. Pattern Anal. Mach. Intell., 44 (2021), 5042–5055. |
[60] | S. Du, Z. Liu, Z. Chen, W. Yang, S. Wang, Differentiable bi-sparse multi-view co-clustering, IEEE Trans. Signal Process., 69 (2021), 4623–4636. https://doi.org/10.1109/TSP.2021.3101979 |
[61] | S. Wang, X. Lin, Z. Fang, S. Du, G. Xiao, Contrastive consensus graph learning for multi-view clustering, IEEE/CAA J. Autom. Sin., 9 (2022), 2027–2030. https://doi.org/10.1109/JAS.2022.105959 |
[62] | Z. Fang, S. Du, X. Lin, J. Yang, S. Wang, Y. Shi, Dbo-net: Differentiable bi-level optimization network for multi-view clustering, Inf. Sci., 2023 (2023). |
1. | Panpan Zhou, Qianwen Zhou, Xuezhang Xiao, Xiulin Fan, Yongjin Zou, Lixian Sun, Jinghua Jiang, Dan Song, Lixin Chen, Machine Learning in Solid‐State Hydrogen Storage Materials: Challenges and Perspectives, 2024, 0935-9648, 10.1002/adma.202413430 |
floating point addition | floating point multiplication | floating point division | overall | |
CNMTF | 5mnk+9mk2+6nk2+2k3+11mk+5nk+(m+n)pk | 5mnk+9mk2+6nk2+2k3+8mk+2nk+k2+(m+n)pK | mk+nk+k2 | O(mnk) |
CHNMF | 2mnk+2mk2+2nk2+4mk+3nk+npk | 2mnk+2mk2+2nk2+5mk+2nk+npk | mk+nk | O(mnk) |
DNMF | 2mnk+2mk2+2nk2+3mk+3nk+(m+n)pk | 2mnk+2mk2+2nk2+2mk+2nk+(m+n)pk | mk+nk | O(mnk) |
GNMF | 2mnk+2mk2+2nk2+mk+3nk+npk | 2mnk+2mk2+2nk2+mk+2nk+npk | mk+nk | O(mnk) |
HNMF | 2mnk+2mk2+2nk2+mk+3nk+mpk | 2mnk+2mk2+2nk2+mk+2nk+npk | mk+nk | O(mnk) |
RHNMF | 2mnk+2mk2+2nk2+7nk+npk | 2mnk+2mk2+2nk2+mk+6nk+npp | mk+nk | O(mnk) |
CaNMF | 3mnk+mk2+3nk2+4nk | 3mnk+mk2+3nk2+mk+5nk | mk+nk | O(mnk) |
DHSNMF | 2mnk+2mk2+2nk2+3mk+3nk+(m+n)pk | 2mnk+2mk2+2nk2+2mk+2nk+(m+n)pk | mk+nk | O(mnk) |
RCHNMTF | 3mnk+8mk2+5nk2+6k3+5mk+11nk+(m+n)pk | 3mnk+8mk2+5nk2+6k3+2mk+8nk+k2+(m+n)pk | mk+nk+k2 | O(mnk) |
Datasets | Sample | Feature | Class | Data types |
AR100 | 1300 | 1024 | 100 | Face Image |
ORL | 400 | 1024 | 40 | Face Image |
YALE | 165 | 1024 | 15 | Face Image |
PIE | 1166 | 1024 | 53 | Face Image |
MSRA25 | 1799 | 256 | 12 | Face Image |
PENDIGITS | 10992 | 16 | 10 | Handwritten digits |
COIL100 | 7200 | 1024 | 100 | Object image |
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
CNMTF [50] | 48.78±1.24 | 70.00±2.18 | 63.92±2.81 | 57.31±4.16 | 47.03±2.42 | 73.44±3.17 | 47.72±1.17 |
CHNMF [7] | 52.28±1.37 | 74.83±1.60 | 62.56±1.54 | 57.42±2.16 | 46.42±2.10 | 67.12±4.06 | 47.86±1.55 |
DNMF [17] | 59.70±1.47 | 77.87±2.71 | 60.94±1.88 | 54.78±2.84 | 42.42±3.18 | 73.64±4.45 | 54.81±2.14 |
GNMF [35] | 58.14±1.63 | 78.73±1.95 | 63.44±2.99 | 53.00±2.66 | 42.24±3.12 | 71.00±5.19 | 55.33±1.10 |
HNMF [43] | 55.43±1.57 | 76.44±2.81 | 64.05±1.67 | 55.04±1.87 | 42.60±2.42 | 73.24±5.46 | 58.00±1.29 |
RHNMF [8] | 59.05±1.22 | 78.49±2.21 | 63.05±1.50 | 54.37±1.78 | 47.93±1.39 | 71.46±6.22 | 50.62±1.58 |
DHSNMF [51] | 55.96±1.77 | 75.28±2.58 | 65.12±1.58 | 52.57±1.998 | 43.15±2.58 | 70.27±4.32 | 51.91±1.34 |
RCHNMTF | 60.08±1.91 | 80.04±2.44 | 65.50±2.31 | 59.13±2.72 | 45.33±2.39 | 74.24±2.91 | 58.26±1.64 |
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
CNMTF [50] | 51.98±1.37 | 74.10±1.63 | 69.12±1.96 | 58.84±3.63 | 48.00±2.70 | 75.13±2.71 | 51.91±1.06 |
CHNMF [7] | 55.79±1.42 | 78.30±1.41 | 67.36±1.57 | 59.57±1.49 | 47.21±2.25 | 70.22±3.02 | 52.73±1.34 |
DNMF [17] | 62.99±1.23 | 81.50±2.02 | 67.02±1.50 | 56.85±2.31 | 43.87±2.53 | 75.56±3.37 | 58.83±1.59 |
GNMF [35] | 61.20±1.45 | 82.50±1.85 | 68.34±1.73 | 55.05±2.03 | 43.75±2.90 | 73.61±3.31 | 59.12±0.84 |
HNMF [43] | 58.37±1.54 | 80.64±2.07 | 68.25±1.55 | 57.67±1.25 | 43.87±2.06 | 74.86±3.58 | 62.35±0.93 |
RHNMF [8] | 61.52±0.78 | 79.53±2.39 | 68.10±1.56 | 61.11±3.73 | 44.97±1.81 | 74.38±2.78 | 55.78±1.16 |
CaNMF [46] | 61.96±0.99 | 81.97±1.81 | 66.75±1.31 | 56.81±1.63 | 48.66±1.64 | 73.42±4.53 | 53.36±1.18 |
DHSNMF [51] | 58.89±1.46 | 79.19±1.95 | 69.20±1.33 | 55.53±1.88 | 44.06±2.08 | 72.48±2.84 | 56.30±1.21 |
RCHNMTF | 62.50±1.64 | 83.35±1.68 | 69.61±2.01 | 61.22±2.82 | 46.48±2.40 | 75.87±2.67 | 62.75±1.44 |
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
CNMTF [50] | 75.85±1.24 | 83.37±1.22 | 82.35±1.28 | 61.77±3.72 | 51.68±2.37 | 71.10±2.55 | 74.99±0.55 |
CHNMF [7] | 78.41±0.72 | 86.29±1.11 | 80.48±0.79 | 63.61±2.46 | 51.70±2.24 | 65.52±2.22 | 75.43±0.31 |
DNMF [17] | 82.54±0.54 | 88.48±1.11 | 82.56±0.37 | 60.87±2.19 | 48.26±2.11 | 69.61±2.98 | 81.77±0.32 |
GNMF [35] | 82.15±0.78 | 89.52±1.47 | 81.70±1.29 | 59.03±1.71 | 48.02±1.72 | 69.63±1.23 | 81.65±0.24 |
HNMF [43] | 80.62±0.96 | 88.54±1.27 | 80.87±0.88 | 60.85±1.84 | 47.83±1.38 | 70.33±1.31 | 78.29±0.81 |
RHNMF [8] | 82.31±0.20 | 87.21±1.42 | 80.69±0.93 | 63.92±3.28 | 49.38±2.10 | 69.81±1.43 | 78.27±0.18 |
CaNMF [46] | 82.08±0.50 | 88.53±1.20 | 79.74±0.70 | 59.68±1.95 | 51.51±1.41 | 68.85±3.42 | 76.18±0.46 |
DHSNMF [51] | 79.8±0.64 | 87.54±1.03 | 81.84±1.13 | 60.45±2.79 | 48.33±1.43 | 68.10±1.71 | 77.00±0.45 |
RCHNMTF | 82.56±0.77 | 89.64±0.99 | 81.84±0.56 | 64.92±2.28 | 50.46±1.08 | 67.46±2.36 | 80.78±0.67 |
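For reference, the three metrics reported above are standard: ACC measures accuracy under the best one-to-one matching between predicted clusters and ground-truth classes, PUR credits each cluster with its majority class, and NMI is normalized mutual information. The sketch below shows a common way to compute them (our own illustration, assuming integer-coded labels, NumPy, SciPy, and scikit-learn; not the authors' evaluation code):

```python
# A sketch of the three clustering metrics used in the tables above
# (an illustration, not the authors' evaluation code). Labels are assumed
# to be integer-coded from 0.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_acc(y_true, y_pred):
    """ACC: accuracy under the best one-to-one cluster-to-class matching."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    k = int(max(y_true.max(), y_pred.max())) + 1
    cost = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1                        # co-occurrence counts
    rows, cols = linear_sum_assignment(-cost)  # Hungarian algorithm, maximizing
    return cost[rows, cols].sum() / len(y_true)

def purity(y_true, y_pred):
    """PUR: every cluster is credited with its majority ground-truth class."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    hits = sum(np.bincount(y_true[y_pred == c]).max() for c in np.unique(y_pred))
    return hits / len(y_true)

# Sanity check: a perfect partition with permuted cluster ids scores 1.0 everywhere.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])
print(clustering_acc(y_true, y_pred),
      purity(y_true, y_pred),
      normalized_mutual_info_score(y_true, y_pred))  # -> 1.0 1.0 1.0
```

All three scores lie in [0, 1] and are reported as percentages (mean ± standard deviation over repeated runs) in the tables.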
Accuracy under different values of p | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
p=0.75 | 58.67 | 77.83 | 63.25 | 52.15 | 44.72 | 66.88 | 54.98
p=0.5 | 60.08 | 80.04 | 65.50 | 59.13 | 45.33 | 74.24 | 58.26
p=0.25 | 59.09 | 77.39 | 63.00 | 54.43 | 43.52 | 70.63 | 58.50
Purity under different values of p | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
p=0.75 | 61.40 | 81.32 | 67.70 | 55.23 | 46.42 | 69.22 | 59.48
p=0.5 | 62.50 | 83.35 | 69.61 | 61.22 | 46.48 | 75.87 | 62.75 |
p=0.25 | 61.75 | 81.02 | 67.95 | 56.77 | 44.24 | 71.78 | 62.91 |
NMI under different values of p | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
p=0.75 | 82.68 | 87.90 | 80.67 | 58.24 | 49.94 | 65.61 | 78.83
p=0.5 | 82.56 | 89.64 | 81.84 | 64.92 | 50.46 | 67.46 | 81.78 |
p=0.25 | 83.24 | 88.50 | 80.67 | 60.42 | 48.18 | 66.29 | 80.61 |
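Across all three metrics, p = 0.5 gives the best or near-best scores on almost every dataset (COIL100 marginally favors p = 0.25 on ACC and PUR, as does AR100 on NMI), which motivates the setting p = 0.5 used in the comparison tables above. A minimal harness for reproducing such a sweep might look as follows, where rchnmtf() is a hypothetical stand-in for the paper's solver (its implementation is not shown in this section) and clustering_acc() is the helper from the metrics sketch above:

```python
# A minimal sweep harness for the p-sensitivity study above (an illustration,
# not the authors' script). rchnmtf() is a HYPOTHETICAL solver assumed to
# return one integer cluster label per sample.
import numpy as np

def sweep_p(X, y_true, k, p_grid=(0.25, 0.5, 0.75), runs=10, seed=0):
    """Mean/std clustering accuracy for each candidate capped-norm parameter p."""
    rng = np.random.default_rng(seed)
    results = {}
    for p in p_grid:
        accs = [clustering_acc(y_true,
                               rchnmtf(X, k=k, p=p,  # hypothetical solver
                                       seed=int(rng.integers(1 << 31))))
                for _ in range(runs)]
        results[p] = (float(np.mean(accs)), float(np.std(accs)))
    return results
```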
RCHNMTF without dual hyper-graph regularization | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
PUR | 53.93 | 79.14 | 58.85 | 55.22 | 44.84 | 70.34 | 26.05 |
NMI | 76.76 | 86.69 | 73.89 | 58.14 | 49.43 | 64.54 | 58.28 |
ACC | 51.33 | 75.48 | 55.40 | 53.61 | 43.75 | 69.62 | 24.02 |
RCHNMTF without orthogonality constraints | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
PUR | 33.82 | 78.21 | 67.10 | 46.51 | 39.39 | 61.50 | 29.03 |
NMI | 61.54 | 86.31 | 80.15 | 51.28 | 45.32 | 56.48 | 48.43 |
ACC | 30.32 | 74.97 | 62.75 | 43.46 | 37.57 | 60.78 | 24.33 |
RCHNMTF with dual hyper-graph regularization and orthogonality constraints | |||||||
Datasets | AR100 | PIE | ORL | MSRA | YALE | PENDIGITS | COIL100 |
PUR | 62.50 | 83.35 | 69.61 | 61.22 | 46.48 | 75.87 | 62.75 |
NMI | 82.56 | 89.64 | 81.84 | 64.92 | 50.46 | 67.46 | 81.78 |
ACC | 60.08 | 80.04 | 65.50 | 59.13 | 45.33 | 74.24 | 58.26 |
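Taken together, the two ablations show that both components matter: removing the dual hyper-graph regularization is most damaging on COIL100 (PUR falls from 62.75 to 26.05), removing the orthogonality constraints is most damaging on AR100 (PUR falls from 62.50 to 33.82), and the full model is the strongest variant on every dataset and metric.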