1. Introduction
Structural covariance/correlation matrices appear in diverse fields. For example, in array signal processing, the covariance matrix of the received array signal is the noise covariance plus a weighted sum of rank-one matrices [1,2,3]; in time-series analysis, a common form is the Toeplitz structural autocovariance/autocorrelation matrix [4]; and in certain high-dimensional problems, the covariance matrix is often sparse or banded [5]. Covariance estimation under structure constraints is common in the literature, e.g., [5,6,7,8]. Structural estimates are generally more accurate than estimates that ignore structural information, because exploiting structure usually reduces the number of parameters to be estimated. Structural covariance/correlation estimation plays an important role in several applications, such as low-rank signal detection [9,10], direction-of-arrival (DOA) estimation [2,3], and spectral analysis of time-series data [4].
Weighted sums of known Rank-one matrices with unknown Weights (W-Rank1-W) are often used to describe the covariance of the received array signal [2]. In [11], a weighted covariance fitting criterion was proposed, providing a W-Rank1-W structural covariance estimate of the received array signals. Based on the maximum likelihood principle, [12] estimated the W-Rank1-W structural covariance of received array signals assumed to follow a circular Gaussian distribution. The weight estimates in [11,12] are widely known to play an important role in DOA estimation. Subsequently, many researchers noted that the weights of W-Rank1-W are sparse in DOA estimation, and similar DOA estimation ideas were implemented in [2,13], where the W-Rank1-W structural covariance estimation was addressed under sparse weight constraints. We estimate the W-Rank1-W structural covariance by minimizing the relative entropy (also known as the Kullback-Leibler divergence) between two circular Gaussian distributions [14], yielding a low-complexity algorithm based on the Majorization-Minimization (MM) algorithm framework [15]. The entropy loss measures the difference between a pair of distributions with two different covariance matrices. The proposed MM algorithm is based on the result that the entropy loss function is bounded above by a sequence of separable functions. Our numerical experiments reveal that weight estimates based on minimizing the entropy loss are more accurate for DOA estimation than the estimates reported in [11,12] using the weighted covariance fitting criterion and the maximum-likelihood principle.
An auto-covariance matrix in a stationary time series always exhibits a Toeplitz structure [4]. Moreover, in certain cases, the autocorrelation coefficient decreases in magnitude as the time lag increases, e.g., in auto-regressive time series, some moving-average time series, and some auto-regressive moving-average time series. [16] approached Toeplitz covariance estimation by minimizing the Frobenius-norm loss to the sample covariance matrix, and [6,17] proposed maximum-likelihood Toeplitz estimates under circularly-symmetric complex normal and complex elliptical distribution assumptions, respectively. [18] provided a banded Toeplitz estimator by projecting a given positive definite matrix onto the convex cone of positive semi-definite banded Toeplitz matrices. [19,20] proposed Toeplitz estimates by minimizing the entropy loss and a modified entropy loss, respectively. [21] proposed a Toeplitz estimate based on the entropy loss that accounts for possible sparsity of the covariance matrix. This study considers the inverse relationship between the autocorrelation coefficient and the time lag in time-series analysis and proposes a Toeplitz autocorrelation matrix estimate obtained by minimizing the entropy loss between two normal distributions. As Toeplitz matrices can be expressed in the W-Rank1-W structure, an MM algorithm similar to that for the W-Rank1-W structural estimation is proposed; however, each iteration of the algorithm requires solving a second-order cone programming (SOCP) problem. Numerical experiments reveal that our estimates are more accurate than those that ignore the inverse relationship between the autocorrelation coefficient and the time lag.
The remainder of this paper is organized as follows: In Section 2, we formulate the estimation problem of interest. Section 3 presents two different algorithms for the W-Rank1-W structural covariance estimation, and Section 4 presents an MM algorithm for the autocorrelation matrix estimation involving the inverse relationship between the autocorrelation coefficient and the time lag. Finally, Section 5 describes the results of our numerical experiments, revealing the characteristics and applications of the proposed algorithms.
Notations: $\mathbb{C}^v$ ($\mathbb{R}^v$) denotes the set of $v$-dimensional vectors over the complex (real) field. $\mathbb{S}^v_{+}$ ($\mathbb{S}^v_{++}$) denotes the set of $v\times v$ symmetric (real field) or Hermitian (complex field) positive semi-definite (definite) matrices. $\mathbf{0}$ denotes the zero matrix and $I_v$ denotes the $v\times v$ identity matrix. $X_1-X_2\succeq 0$ ($\succ 0$) indicates that the matrix $X_1-X_2$ is positive semi-definite (definite). $x\succeq 0$ ($\succ 0$) indicates that all elements of the vector $x$ are non-negative (positive). $X_{i,j}$ denotes the $(i,j)$-th entry of the matrix $X$, where the row and column indices begin from 0. $\mathrm{diag}(x)$ denotes a diagonal matrix whose diagonal elements are given by the vector $x$. The superscripts $(\cdot)^T$ and $(\cdot)^H$ represent the transpose and the conjugate transpose, respectively. $\|\cdot\|$ denotes the $l_2$-norm of a vector, and $|\cdot|$ denotes the determinant of a square matrix.
2. Problem formulation
Let $CN_1$ and $CN_2$ denote circularly-symmetric complex normal distributions with zero mean and covariance matrices $\Sigma_1\in\mathbb{S}^v_{++}$ and $\Sigma_2\in\mathbb{S}^v_{++}$, respectively. The probability densities of $CN_1$ and $CN_2$ are
\[
f_i(z)=\frac{1}{\pi^v|\Sigma_i|}\exp\left(-z^H\Sigma_i^{-1}z\right),\qquad z\in\mathbb{C}^v,\ i=1,2. \tag{2.1}
\]
The relative entropy from $CN_1$ to $CN_2$ is defined [14] as follows:
\[
D(CN_2\Vert CN_1)=\mathrm{E}_{z\sim CN_1}\left[\log f_1(z)-\log f_2(z)\right], \tag{2.2}
\]
where $\mathrm{E}_{z\sim CN_1}(\cdot)$ denotes the expectation operator, given that $z$ follows the distribution $CN_1$. By a brief proof (Appendix A), (2.2) can be rewritten as
\[
D(CN_2\Vert CN_1)=\log\frac{|\Sigma_2|}{|\Sigma_1|}+\operatorname{tr}\left(\Sigma_2^{-1}\Sigma_1\right)-v. \tag{2.3}
\]
This relative entropy measures the difference between the distributions $CN_1$ and $CN_2$.
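For concreteness, (2.3) is straightforward to evaluate numerically; the following Python sketch (the function name and test covariance are illustrative, not part of any toolbox) computes the relative entropy from the two covariance matrices:

```python
import numpy as np

def relative_entropy(sigma1, sigma2):
    """Evaluate (2.3): log(|S2|/|S1|) + tr(S2^{-1} S1) - v."""
    v = sigma1.shape[0]
    # slogdet is numerically safer than det for log-determinants
    _, logdet1 = np.linalg.slogdet(sigma1)
    _, logdet2 = np.linalg.slogdet(sigma2)
    return (logdet2 - logdet1
            + np.trace(np.linalg.solve(sigma2, sigma1)).real - v)

# sanity check: the divergence vanishes when the two covariances coincide
sigma = np.eye(4) + 0.3 * np.ones((4, 4))
assert abs(relative_entropy(sigma, sigma)) < 1e-10
```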
In applications such as array signal processing and time-series analysis, the covariance is structural, and structural covariance estimation is a common problem. Let $\Sigma_1\triangleq\Sigma$ be an unknown structural covariance matrix and $\Sigma_2\triangleq S$ be a given covariance matrix. By minimizing the entropy loss $D(CN_2\Vert CN_1)$,
\[
\min_{\Sigma\in\Omega}\ L_0(\Sigma)\triangleq\log\frac{|S|}{|\Sigma|}+\operatorname{tr}\left(S^{-1}\Sigma\right)-v, \tag{2.4}
\]
where $\Omega$ denotes a nonempty set that is the intersection of the closed set characterizing the covariance structure and the positive semi-definite cone $\mathbb{S}^v_+$, we achieve structural covariance estimation.
Throughout this study, we adopt the following assumption: the loss function $L_0(\Sigma_k)\to+\infty$ whenever the sequence $\{\Sigma_k\}$ tends towards the boundary of the positive semi-definite cone $\mathbb{S}^v_+$. Under this assumption, the case in which $\Sigma$ is singular can be excluded from the algorithm analysis.
3. W-Rank1-W structural covariance matrix estimation
Estimation of the W-Rank1-W structural covariance matrix is described in this section, i.e., the estimation of $\Sigma$ in the set $\Omega_1$:
\[
\Omega_1\triangleq\left\{\Sigma\ \middle|\ \Sigma=\sum_{i=0}^{m}p_ia_ia_i^H,\ p_i\geq0,\ i=0,1,\cdots,m\right\}, \tag{3.1}
\]
where $a_0,\cdots,a_m$ denote vectors that are known in advance. The W-Rank1-W structural covariance commonly occurs in array signal processing and can often be applied to DOA estimation.
Given a covariance matrix $S$, which is often the sample covariance matrix, we solve problem (2.4) with $\Omega=\Omega_1$:
\[
\min_{\Sigma\in\Omega_1}\ L_0(\Sigma), \tag{3.2}
\]
yielding a W-Rank1-W structural covariance estimate. Any $\Sigma\in\Omega_1$ can be rewritten as
\[
\Sigma=A\,\mathrm{diag}(p)\,A^H, \tag{3.3}
\]
where $A=(a_0,a_1,\cdots,a_m)$ and $p=(p_0,p_1,\cdots,p_m)^T$. Because $\Sigma$ is a function of $p$ ($A$ is known in advance), the objective function in (3.2) can be rewritten as the following function of $p$:
\[
L(p)\triangleq L_0\left(A\,\mathrm{diag}(p)\,A^H\right)=\log\frac{|S|}{\left|A\,\mathrm{diag}(p)\,A^H\right|}+\operatorname{tr}\left(S^{-1}A\,\mathrm{diag}(p)\,A^H\right)-v. \tag{3.4}
\]
Using the objective function expression $L(p)$, the optimization problem (3.2) can be rewritten as follows:
\[
\min_{p\succeq 0}\ L(p). \tag{3.5}
\]
The objective function in (3.2) is a convex function of $\Sigma$ [22], and $\Sigma$ is an affine map of $p$; therefore, it is convex in $p$, i.e., $L(p)$ is convex. Moreover, the constraint set $\{p\mid p\succeq0\}$ is convex. Thus, (3.5) is a convex optimization problem.
The following algorithms for W-Rank1-W covariance estimation solve problem (3.5). When $p^*$ denotes the minimum point of problem (3.5), i.e.,
\[
p^*=\arg\min_{p\succeq 0}\ L(p), \tag{3.6}
\]
the W-Rank1-W structural covariance estimate is given by
\[
\Sigma^*=A\,\mathrm{diag}(p^*)\,A^H. \tag{3.7}
\]
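Problem (3.5) can also be prototyped with a general-purpose solver, which provides a reference solution for the two algorithms below. A minimal sketch based on the loss (3.4), with randomly generated real test data (all names and sizes are illustrative):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
v, m1 = 8, 12                               # dimension v, number of rank-one terms m+1
A = rng.standard_normal((v, m1))            # known vectors a_0, ..., a_m as columns
S = np.cov(rng.standard_normal((v, 200)))   # a sample covariance playing the role of S
S_inv = np.linalg.inv(S)

def loss(p):
    """L(p) in (3.4), up to an additive constant."""
    sigma = (A * p) @ A.T                   # Sigma = A diag(p) A^T
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        return np.inf                       # outside the positive definite cone
    return np.trace(S_inv @ sigma) - logdet

res = minimize(loss, np.ones(m1), bounds=[(1e-9, None)] * m1, method="L-BFGS-B")
p_star = res.x                              # numerical minimizer of (3.5)
```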
We present two algorithms in the following subsections: the primal-dual interior-point method and an algorithm designed within the MM framework. These two algorithms have different computational complexities, which we briefly discuss and compare.
3.1. Primal-dual interior-point algorithm
Problem (3.5) is a convex optimization problem with linear inequality constraints that can be solved effectively using the primal-dual interior-point method [23].
Algorithm 1 outlines the primal-dual interior-point method in the context of (3.5). In the algorithm, the primal-dual search direction $(\Delta p,\Delta\lambda)$ at the point $(p,\lambda)$ is defined as the solution of the linear equation:
\[
\begin{pmatrix} H & -I\\ \mathrm{diag}(\lambda) & \mathrm{diag}(p)\end{pmatrix}\begin{pmatrix}\Delta p\\ \Delta\lambda\end{pmatrix}=-\begin{pmatrix} h-\lambda\\ \mathrm{diag}(\lambda)\,p-\frac{1}{\tau}\mathbf{1}\end{pmatrix}, \tag{3.8}
\]
where $\tau>0$ denotes a preset parameter, and $h=(h_i)$ and $H=(H_{i,j})$ with $i,j=0,1,\cdots,m$ denote the gradient and the Hessian matrix of $L(p)$ at the point $p$, respectively. Specifically, with $\Sigma=A\,\mathrm{diag}(p)\,A^H$,
\[
h_i=a_i^HS^{-1}a_i-a_i^H\Sigma^{-1}a_i, \tag{3.9}
\]
and
\[
H_{i,j}=\left|a_i^H\Sigma^{-1}a_j\right|^2. \tag{3.10}
\]
3.2. MM algorithm
In each iteration of the primal-dual interior-point method given by Algorithm 1, computing the Hessian matrix has a computational complexity of $O(m^5)$. This section presents a method that decreases the per-iteration computational complexity by using the MM algorithm framework [15].
In the MM algorithm framework, if a continuously differentiable function $g(p\mid p_k)$ satisfies the following condition:
\[
g(p\mid p_k)\geq L(p)\quad\text{for all }p\succeq 0, \tag{3.11}
\]
where equality is achieved at $p=p_k$, then the sequence $\{p_k\}$ generated by
\[
p_{k+1}=\arg\min_{p\succeq 0}\ g(p\mid p_k) \tag{3.12}
\]
converges to a stationary point of problem (3.5). Because the optimization problem (3.5) is convex, its stationary point is actually the optimal point $p^*$.
$\log|X|$ is a concave function that is bounded above by its first-order Taylor expansion at any $X_k$ [15]:
\[
\log|X|\leq \log|X_k|+\operatorname{tr}\left(X_k^{-1}(X-X_k)\right), \tag{3.13}
\]
with equality achieved at $X=X_k$. Let $X\triangleq(A\,\mathrm{diag}(p)\,A^H)^{-1}$ and $X_k\triangleq(A\,\mathrm{diag}(p_k)\,A^H)^{-1}$; inserting them into (3.13) yields
\[
\log\left|\left(A\,\mathrm{diag}(p)\,A^H\right)^{-1}\right|\leq s_0+\operatorname{tr}\left(A\,\mathrm{diag}(p_k)\,A^H\left(A\,\mathrm{diag}(p)\,A^H\right)^{-1}\right), \tag{3.14}
\]
where the equality in (3.14) is achieved at $p=p_k$ and $s_0\triangleq-\log\left|A\,\mathrm{diag}(p_k)\,A^H\right|-v$ is a constant. Applying the Schur complement (a brief proof is provided in Appendix B), for any $p_k\succ 0$, we obtain
\[
\operatorname{tr}\left(A\,\mathrm{diag}(p_k)\,A^H\left(A\,\mathrm{diag}(p)\,A^H\right)^{-1}\right)\leq\sum_{i=0}^{m}\frac{(W_k)_{i,i}}{p_i}, \tag{3.15}
\]
where
\[
W_k\triangleq \mathrm{diag}(p_k)\,A^H\left(A\,\mathrm{diag}(p_k)\,A^H\right)^{-1}A\,\mathrm{diag}(p_k), \tag{3.16}
\]
and equality is achieved at $p=p_k$. By substituting (3.15) into (3.14), we obtain
\[
\log\left|\left(A\,\mathrm{diag}(p)\,A^H\right)^{-1}\right|\leq s_0+\sum_{i=0}^{m}\frac{(W_k)_{i,i}}{p_i}, \tag{3.17}
\]
where equality is achieved at $p=p_k$. Because $\log|(A\,\mathrm{diag}(p)\,A^H)^{-1}|=-\log|A\,\mathrm{diag}(p)\,A^H|$, (3.17) gives
\[
-\log\left|A\,\mathrm{diag}(p)\,A^H\right|\leq s_0+\sum_{i=0}^{m}\frac{(W_k)_{i,i}}{p_i}, \tag{3.18}
\]
where equality is achieved at $p=p_k$. Let us denote
\[
A^HS^{-1}A \tag{3.19}
\]
by
\[
M, \tag{3.20}
\]
so that $\operatorname{tr}(S^{-1}A\,\mathrm{diag}(p)\,A^H)=\sum_{i=0}^{m}M_{i,i}\,p_i$. By substituting (3.18) into (3.4), for any $p_k\succ 0$, we obtain
\[
L(p)\leq g(p\mid p_k)\triangleq\sum_{i=0}^{m}\left(\frac{(W_k)_{i,i}}{p_i}+M_{i,i}\,p_i\right)+s_1, \tag{3.21}
\]
where $s_1\triangleq s_0+\log|S|-v$ is a constant, with equality achieved at $p=p_k$.
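The majorization (3.21) can be verified numerically using $W_k$ in (3.16) and $M$ in (3.20); the following sketch (random real test data; names are our own) checks that the separable surrogate dominates the loss and touches it at $p=p_k$:

```python
import numpy as np

rng = np.random.default_rng(1)
v, m1 = 6, 10
A = rng.standard_normal((v, m1))
S = np.cov(rng.standard_normal((v, 100)))
S_inv = np.linalg.inv(S)
M_diag = np.einsum('ji,jk,ki->i', A, S_inv, A)        # diagonal of M = A^T S^{-1} A

def loss(p):                                          # L(p), up to a constant
    sigma = (A * p) @ A.T
    return np.trace(S_inv @ sigma) - np.linalg.slogdet(sigma)[1]

def surrogate(p, pk):                                 # g(p | p_k) in (3.21)
    sigma_k = (A * pk) @ A.T
    Wk_diag = pk**2 * np.einsum('ji,jk,ki->i', A, np.linalg.inv(sigma_k), A)
    s0 = -np.linalg.slogdet(sigma_k)[1] - v           # constant term
    return np.sum(Wk_diag / p + M_diag * p) + s0

pk = rng.uniform(0.5, 1.5, m1)
for _ in range(5):
    p = rng.uniform(0.1, 2.0, m1)
    assert surrogate(p, pk) >= loss(p) - 1e-8         # g majorizes L
assert abs(surrogate(pk, pk) - loss(pk)) < 1e-8       # equality at p = p_k
```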
By applying (3.21) under the MM algorithm framework, we derive the following proposition:
Proposition 3.1. For any $p_0\succ0$, sequentially solving
\[
p_{k+1}=\arg\min_{p\succeq0}\ \sum_{i=0}^{m}\left(\frac{(W_k)_{i,i}}{p_i}+M_{i,i}\,p_i\right) \tag{3.22}
\]
generates a sequence $\{p_k\}$ that converges to the global optimal point $p^*$ of problem (3.5).
Proof. Because (3.21) holds only for $p_k\succ0$, and not for all $p_k\succeq0$ as required in (3.11), the general convergence result for MM algorithms cannot be applied directly. Therefore, we analyze the convergence further.
Consider the following $\epsilon$-approximation of problem (3.5):
\[
\min_{p\succeq 0}\ L_\epsilon(p)\triangleq L(p+\epsilon\mathbf{1}), \tag{3.23}
\]
with $\epsilon>0$.
Now, applying (3.22) to $\tilde{p}\triangleq p+\epsilon\mathbf{1}\succ 0$, we conclude that the sequence $\{p^{\epsilon}_k\}$ converges to the optimal point $(p^{\epsilon})^*$ of (3.23), and
\[
\nabla L_\epsilon\left((p^{\epsilon})^*\right)^Td\geq 0 \tag{3.24}
\]
for any feasible direction $d$, where $\nabla L_\epsilon((p^{\epsilon})^*)$ denotes the gradient of the objective function $L_\epsilon(p)$ in (3.23) at $(p^{\epsilon})^*$.
Let $\{\epsilon_{k'}\}$ be a positive sequence with $\lim_{k'\to+\infty}\epsilon_{k'}=0$. Then, as $\nabla L_\epsilon((p^{\epsilon})^*)$ is continuous around $(p^{\epsilon})^*$ and $\epsilon$, the limit point $p^*$ of the sequence $\{(p^{\epsilon_{k'}})^*\}$ is the optimal point of problem (3.5).
In practice, $\epsilon$ can be chosen to be an arbitrarily small number; directly applying (3.22) and adapting it to solve the $\epsilon$-approximation problem are then essentially identical. Therefore, the sequence $\{p_k\}$ in (3.22) converges to the optimal point of problem (3.5). □
Problem (3.22), which generates $\{p_k\}$, is a simple separable optimization problem; thus, an analytical solution is easy to obtain. Denoting the $i$-th element of $p_{k+1}$ by $(p_{k+1})_i$, we have
\[
(p_{k+1})_i=\sqrt{\frac{(W_k)_{i,i}}{M_{i,i}}},\qquad i=0,1,\cdots,m, \tag{3.25}
\]
where $(W_k)_{i,i}$ and $M_{i,i}$ denote the $i$-th diagonal elements of $W_k$ in (3.16) and $M$ in (3.20), respectively.
We outline the MM algorithm for (3.5) in Algorithm 2.
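In code form, each iteration of Algorithm 2 amounts to one matrix inversion followed by the elementwise update (3.25). A compact sketch (the stopping rule and helper name are our own, not part of the original algorithm description):

```python
import numpy as np

def mm_wrank1w(A, S, n_iter=500, tol=1e-9):
    """MM iteration for (3.5): (p_{k+1})_i = sqrt((W_k)_{i,i} / M_{i,i})."""
    S_inv = np.linalg.inv(S)
    M_diag = np.einsum('ji,jk,ki->i', A.conj(), S_inv, A).real   # M = A^H S^{-1} A
    p = np.ones(A.shape[1])                                      # p_0 > 0
    for _ in range(n_iter):
        sigma_inv = np.linalg.inv((A * p) @ A.conj().T)          # Sigma_k^{-1}
        Wk_diag = p**2 * np.einsum('ji,jk,ki->i',
                                   A.conj(), sigma_inv, A).real  # diagonal of (3.16)
        p_new = np.sqrt(Wk_diag / M_diag)                        # update (3.25)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    return p
```

Per iteration, the dominant costs are the matrix inversion and the matrix products, in line with the complexity count below.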
The computational complexity of each iteration is $O(m^3)$, arising mainly from matrix inversion and multiplication. To demonstrate the difference in computational complexity between Algorithms 1 and 2, we compare their iteration numbers and computation times in the numerical experiments presented in Section 5.
4. Real Toeplitz correlation matrix estimation
This section describes the estimation of Toeplitz matrices. Complex Toeplitz covariance matrices appear commonly in linear array signal processing [17,24]. When sparse signal models are applied, complex Toeplitz covariance matrices can always be formulated as the W-Rank1-W structural matrices discussed in Section 3 [2]. Thus, in this section, we consider only real Toeplitz matrices, specifically the Toeplitz auto-correlation matrix, arising in time-series analysis, whose correlation coefficients decrease as the time lag increases. Note that this auto-correlation matrix is composed of the auto-correlation coefficients at different lags.
Let us denote
\[
c\triangleq\left(c_0,c_1,\cdots,c_{v-1}\right)^T \tag{4.1}
\]
and define $\mathrm{toep}(c)$ to be the symmetric Toeplitz matrix with $c$ as its first column:
\[
\mathrm{toep}(c)\triangleq\begin{pmatrix}
c_0 & c_1 & \cdots & c_{v-1}\\
c_1 & c_0 & \cdots & c_{v-2}\\
\vdots & \vdots & \ddots & \vdots\\
c_{v-1} & c_{v-2} & \cdots & c_0
\end{pmatrix}. \tag{4.2}
\]
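In code, toep(c) coincides with the standard symmetric Toeplitz constructor; a brief illustration with assumed coefficients:

```python
import numpy as np
from scipy.linalg import toeplitz

c = np.array([1.0, 0.6, 0.3, 0.1])     # illustrative correlation coefficients
C = toeplitz(c)                        # toep(c): C[i, j] = c[|i - j|]
assert np.allclose(C, C.T) and np.all(np.diag(C) == 1.0)
```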
This section estimates the Toeplitz correlation matrix $C\in\Omega_2$, where
\[
\Omega_2\triangleq\left\{C\ \middle|\ C=\mathrm{toep}(c),\ c_0=1,\ |c_0|\geq|c_1|\geq\cdots\geq|c_{v-1}|,\ \mathrm{toep}(c)\succ0\right\}. \tag{4.3}
\]
Note that the constraint $c_0=1$ can be rewritten as follows:
\[
ec=1, \tag{4.4}
\]
where $e$ denotes a row vector with 1 as its first element and 0 as its other elements. The constraint $|c_0|\geq|c_1|\geq\cdots\geq|c_{v-1}|$ can be rewritten as the linear constraints:
\[
E_1c\preceq0,\qquad E_2c\preceq0, \tag{4.5}
\]
where $E_1$ denotes a $(v-1)\times v$ matrix with diagonal elements $-1$ and first superdiagonal elements $1$, and $E_2$ denotes a $(v-1)\times v$ matrix with diagonal elements $-1$ and first superdiagonal elements $-1$:
\[
E_1=\begin{pmatrix}
-1 & 1 & & \\
 & \ddots & \ddots & \\
 & & -1 & 1
\end{pmatrix},\qquad
E_2=\begin{pmatrix}
-1 & -1 & & \\
 & \ddots & \ddots & \\
 & & -1 & -1
\end{pmatrix}. \tag{4.6}
\]
We denote a normal distribution with correlation matrix $R\in\mathbb{S}^v_{++}$ by $N_0$ and a normal distribution with Toeplitz correlation matrix $C\in\Omega_2$ by $N$. Assuming the variances are equal, the relative entropy from $N$ to $N_0$ is:
\[
D(N_0\Vert N)=\frac{1}{2}\left(\log\frac{|R|}{|C|}+\operatorname{tr}\left(R^{-1}C\right)-v\right). \tag{4.7}
\]
Given a correlation matrix estimate $R$, such as the sample correlation matrix, by minimizing $D(N_0\Vert N)$ in terms of $C$:
\[
\min_{C\in\Omega_2}\ D(N_0\Vert N), \tag{4.8}
\]
we deduce the Toeplitz correlation estimate $C^*\in\Omega_2$. The optimization problem (4.8) is convex and can be solved using the primal-dual interior-point algorithm with the SDPT3 toolbox in Matlab. This problem involves a nonlinear objective function and a positive-definiteness constraint on $C$. Applying the primal-dual interior-point algorithm requires computing the Hessian matrix and ensuring positive-definiteness in each iteration. In the following subsection, we present an estimation algorithm that avoids these time-consuming computations.
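Problem (4.8) can also be prototyped directly in a modeling toolbox before resorting to the specialized algorithm of Subsection 4.1. A minimal CVXPY sketch, assuming the objective reduces (up to constants) to $\operatorname{tr}(R^{-1}C)-\log|C|$ and using the constraints (4.4)–(4.6); the data are illustrative:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
v = 6
R = np.corrcoef(rng.standard_normal((v, 200)))   # given correlation matrix estimate
R_inv = np.linalg.inv(R)

c = cp.Variable(v)
# C = toep(c): assemble the symmetric Toeplitz matrix as an affine map of c
C = cp.vstack([cp.hstack([c[abs(i - j)] for j in range(v)]) for i in range(v)])
constraints = [c[0] == 1,
               c[1:] <= c[:-1],                  # E_1 c <= 0
               -c[:-1] <= c[1:]]                 # E_2 c <= 0
prob = cp.Problem(cp.Minimize(cp.trace(R_inv @ C) - cp.log_det(C)), constraints)
prob.solve()                                     # log_det keeps C positive definite
print(np.round(c.value, 3))
```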
4.1. Estimation of the real Toeplitz correlation matrix in Ω2
In this subsection, we first transform the estimation of the real Toeplitz correlation matrix into the estimation of a W-Rank1-W structural matrix via a special variable substitution, and then deduce an estimation algorithm for it within the MM algorithm framework.
Given a circulant matrix $T=\mathrm{toep}(t)$ with
\[
t\triangleq\left(c_0,c_1,\cdots,c_{v-1},d_1,\cdots,d_l,c_{v-1},c_{v-2},\cdots,c_1\right)^T \tag{4.9}
\]
and $d_i=d_{l+1-i}$, $i=1,\cdots,l$ (specifically, $t=(c_0,c_1,\cdots,c_{v-1},c_{v-1},c_{v-2},\cdots,c_1)^T$ if $l=0$), i.e.,
where
Then, the upper-left $v$-order block of $T$ is the Toeplitz matrix:
\[
\mathrm{toep}(c)=\left(I_v,\ \mathbf{0}\right)T\left(I_v,\ \mathbf{0}\right)^T. \tag{4.13}
\]
The circulant matrix $T$ has the Fourier eigendecomposition [25]:
\[
T=F^H\,\mathrm{diag}\left(p_{(l)}\right)F, \tag{4.14}
\]
where $F$ is the $m_l$-order Fourier matrix with $m_l=l+2v-1$, and
\[
p_{(l)}\triangleq\sqrt{m_l}\,F\,t. \tag{4.15}
\]
Specifically, $F_{i,j}=\frac{1}{\sqrt{m_l}}e^{\frac{ij2\pi\iota}{m_l}}$ with $\iota$ as the imaginary unit and $i,j=0,1,\cdots,m_l-1$, and $T\succeq0$ if and only if $p_{(l)}\succeq0$.
By substituting (4.14) into (4.13), we derive
\[
\mathrm{toep}(c)=A_l\,\mathrm{diag}\left(p_{(l)}\right)A_l^H, \tag{4.16}
\]
where $A_l=(I_v,\ \mathbf{0})F^H$.
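The circulant embedding can be sanity-checked numerically: the eigenvalues of the symmetric circulant toep(t) are the Fourier coefficients of t, and its upper-left v×v block recovers toep(c). A sketch with illustrative numbers (the padding d is an assumption satisfying d_i = d_{l+1-i}):

```python
import numpy as np
from scipy.linalg import circulant, toeplitz

v, l = 4, 3
c = np.array([1.0, 0.5, 0.2, 0.1])
d = np.array([0.05, 0.02, 0.05])            # palindromic padding, d_1 = d_3
t = np.concatenate([c, d, c[:0:-1]])        # first column of toep(t), length m_l = l + 2v - 1
T = circulant(t)                            # symmetric circulant since t is palindromic
p_l = np.fft.fft(t).real                    # eigenvalues of T, cf. (4.14)-(4.15)

assert np.allclose(T, T.T)
assert np.allclose(T[:v, :v], toeplitz(c))  # upper-left block is toep(c), cf. (4.13)
print("p_(l) >= 0 (i.e., T is PSD):", bool(np.all(p_l >= -1e-12)))
```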
Corollary C3 in [25] relates the positive semi-definiteness of $\mathrm{toep}(c)$ to that of the weight vector $p_{(l)}$, where $p_{(l)}$ is a function of $t$ through (4.15) and $t$ is a function of $c$ through (4.9). Moreover, item 2 of the Notes in [25] states that $\mathrm{toep}(c)\succ0$ implies $\mathrm{toep}(c)\in\{C\mid C=\mathrm{toep}(c),\ p_{(l)}\succeq0\}$ for a sufficiently large $l$, meaning that if $l$ is sufficiently large,
\[
\left\{C\mid C=\mathrm{toep}(c),\ \mathrm{toep}(c)\succ0\right\}\subseteq\left\{C\mid C=\mathrm{toep}(c),\ p_{(l)}\succeq0\right\}. \tag{4.18}
\]
In the following, we relax the constraint $\mathrm{toep}(c)\succ0$ in $\Omega_2$ to $p_{(l)}\succeq0$ for a certain $l$. If $l$ is sufficiently large, then compared with $\{C\mid C=\mathrm{toep}(c),\ \mathrm{toep}(c)\succ0\}$, which contains only positive definite matrices, the set $\{C\mid C=\mathrm{toep}(c),\ p_{(l)}\succeq0\}$ contains more positive semi-definite matrices by (4.18). Moreover, as $C$ tends toward the boundary of the positive semi-definite cone, the value of the objective function in (4.8) tends to $+\infty$. Thus, although we change the constraint set, the modification leaves the optimal point unchanged if $l$ is sufficiently large. The numerical experiments reported in Section 5 demonstrate that the estimation accuracy remains good for different $l$.
Now, we present an algorithm for the problem with the constraint $\mathrm{toep}(c)\succ0$ modified to $p_{(l)}\succeq0$ for a certain $l$:
\[
\min_{C\in\widetilde{\Omega}_2}\ \operatorname{tr}\left(R^{-1}C\right)-\log|C|, \tag{4.19}
\]
where
\[
\widetilde{\Omega}_2\triangleq\left\{C\ \middle|\ C=\mathrm{toep}(c),\ ec=1,\ E_1c\preceq0,\ E_2c\preceq0,\ p_{(l)}\succeq0\right\}. \tag{4.20}
\]
In particular, by applying the expressions (4.4), (4.5), (4.10), (4.11), (4.15) and (4.16), the problem to be solved, i.e., problem (4.19), becomes
It is easy to see that $F^HF=I$, and thus (4.21c) can be rewritten as follows:
\[
t=\frac{1}{\sqrt{m_l}}F^Hp_{(l)}. \tag{4.22}
\]
By substituting (4.22) into (4.21e), we rewrite (4.21e) as follows:
Now, we apply the equality constraints (4.21c) and (4.21e), i.e., (4.22) and (4.23), to transform the optimization problem (4.21) into a new optimization problem involving only the variable $p_{(l)}$.
It is easy to see that $I^{(-1)}F^H=F^HI^{(-1)}$. Then, by substituting (4.22) into (4.21d), we rewrite the constraint (4.21d) as follows:
By substituting (4.23) into (4.21f)–(4.21h), we obtain the constraints (4.21f)–(4.21h) as follows:
Most importantly, by applying (4.24)–(4.27), the optimization problem can be rewritten as a problem involving only the variable $p_{(l)}$, i.e., as an estimation problem for W-Rank1-W structural matrices with an unknown weight vector $p_{(l)}$:
Denoting the optimal point of (4.28) by $p^*_{(l)}$, we have $C^*=A_l\,\mathrm{diag}(p^*_{(l)})\,A_l^H$ as an estimate of the correlation matrix.
In the following subsection, we present an MM algorithm for problem (4.28) by applying the design concept of Algorithm 2.
4.1.1. MM algorithm for the real Toeplitz correlation matrix estimation
By substituting (4.28b) into the objective function in (4.28a) and applying (3.18), we obtain the following result. For any $(p_{(l)})_k\succ0$, we have
where s1 is a constant and
and
and
and the equality in (4.29) is achieved at p(l)=(p(l))k.
Applying (4.30) and convergence analysis similar to that for Proposition 3.1, we have the following proposition under the MM algorithm framework.
Proposition 4.1. Given (p(l))0≻0, sequentially solving the optimization problem
generates a sequence {(p(l))k} that converges to the optimal point p∗(l) of the problem (4.28).
Denoting $r_i=1/q_i$ and using the epigraph form, we can rewrite problem (4.35) as an SOCP problem:
This SOCP problem can be solved using the Matlab toolbox, SDPT3.
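The key modeling step is that each reciprocal term $r_i=1/q_i$ corresponds to the hyperbolic constraint $r_iq_i\geq1$ (with $r_i,q_i\geq0$), which is representable as the second-order cone $\|(2,\ r_i-q_i)\|\leq r_i+q_i$. A minimal CVXPY sketch of this reformulation pattern (the weights $w$ are illustrative, and this is not the full subproblem (4.35), whose linear constraints on $p_{(l)}$ carry over unchanged):

```python
import numpy as np
import cvxpy as cp

n = 5
w = np.arange(1.0, n + 1.0)          # illustrative positive weights
q = cp.Variable(n, nonneg=True)
r = cp.Variable(n, nonneg=True)      # epigraph variables standing in for 1/q_i

# ||(2, r_i - q_i)||_2 <= r_i + q_i  <=>  r_i q_i >= 1 with r_i, q_i >= 0
socs = [cp.SOC(r[i] + q[i], cp.hstack([2.0, r[i] - q[i]])) for i in range(n)]
prob = cp.Problem(cp.Minimize(w @ r + cp.sum(q)), socs)
prob.solve()
assert np.allclose(r.value * q.value, 1.0, atol=1e-4)   # r_i = 1/q_i at the optimum
```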
In conclusion, the MM algorithm for problem (4.28) is presented as Algorithm 3 and is referred to as the MM-T algorithm in the following. In each iteration, Algorithm 3 involves an SOCP problem that can be solved using the Matlab toolbox SDPT3.
5. Numerical experiments
In this section, we evaluate the performance of Algorithms 1–3 numerically. First, Algorithms 1 and 2 are utilized for DOA estimation. Then, Algorithm 3 is used to estimate Toeplitz autocorrelation matrices in time-series analysis, and its performance is compared with those of other estimation methods. All computations were performed using Matlab 2018a on a system equipped with an Intel(R) Xeon(R) Platinum 8372HC CPU at 3.40 GHz.
5.1. DOA estimation
The observation $x(t)\in\mathbb{C}^v$ received by a uniform linear array at time $t$ is commonly modeled as [2]:
\[
x(t)=\sum_{i=0}^{360}s_i(t)\,a(\theta_i)+n(t)=A_d\,s(t)+n(t), \tag{5.1}
\]
where $\theta_i=(i/2)^\circ$ denotes the grid point in the 180-degree coverage $0^\circ$–$180^\circ$, $s_i(t)\in\mathbb{C}$ denotes the source signal from direction $\theta_i$, $a(\theta_i)=(1,e^{\iota\pi\cos(\theta_i)},\cdots,e^{(v-1)\iota\pi\cos(\theta_i)})^T$ denotes the steering vector, and $n(t)\in\mathbb{C}^v$ denotes the noise vector. In addition, let us assume that $s(t)=(s_i(t))$ is CN-distributed with zero mean and covariance $\mathrm{diag}(p_d)$, $n(t)$ is CN-distributed with zero mean and covariance $\mathrm{diag}(\sigma)$, and $s(t)$ and $n(t)$ are both temporally and mutually independent. In particular, the noise energy vector is $\sigma=\sigma^2\mathbf{1}$ with $\mathbf{1}$ as the all-ones vector, and $\sigma^2$ can be designed to vary the signal-to-noise ratio (SNR) as follows:
where $A_d=(a(\theta_0),a(\theta_1),\cdots,a(\theta_{360}))$. The covariance matrix of $x(t)$ has the W-Rank1-W structure:
\[
\Sigma=A_d\,\mathrm{diag}(p_d)\,A_d^H+\mathrm{diag}(\sigma)=A\,\mathrm{diag}(p)\,A^H, \tag{5.3}
\]
where $A=[A_d,\ I_v]$ and $p=((p_d)^T,\sigma^T)^T$. We denote $p_d=(p'_i)$, where $p'_i$ is the signal energy in the direction $\theta_i$. Peak detection is performed on the energy vector $p_d$; the peaks occur at the source-signal directions, enabling DOA estimation.
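Given a weight estimate on the angular grid, the DOA post-processing reduces to peak picking; a sketch (the spectrum below is synthetic, for illustration only):

```python
import numpy as np
from scipy.signal import find_peaks

theta = np.arange(361) / 2.0                       # grid theta_i = (i/2) degrees
p_d = (np.exp(-0.5 * ((theta - 60.0) / 0.8) ** 2)  # illustrative spectrum with
       + np.exp(-0.5 * ((theta - 120.0) / 0.8) ** 2))  # peaks at 60 and 120 degrees

peaks, props = find_peaks(p_d, height=0.0)
strongest = peaks[np.argsort(props["peak_heights"])[::-1]]
n_sources = 2                                      # number of true DOAs, assumed known
print(np.sort(theta[strongest[:n_sources]]))       # -> [ 60. 120.]
```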
In this experimental scenario, 4 source signals arrive from the directions $\xi_1=30^\circ$, $\xi_2=60^\circ$, $\xi_3=120^\circ$, and $\xi_4=160^\circ$, with corresponding signal energies of 15, 5, 5, and 15, respectively. We consider the case in which the linear array consists of $v=15$ sensors, and $n$ independent array observations are simulated using model (5.1) to obtain the sample covariance matrix. By applying Algorithms 1 and 2 to estimate $\Sigma$ in (5.3), we obtain an estimate of $p_d$, denoted by $p^*_d$. By performing peak detection on $p^*_d$, we obtain the DOA estimates $\xi^*_1,\cdots,\xi^*_4$, assuming that the number of true DOAs is known in advance.
Table 1 presents a performance comparison of Algorithms 1 and 2 in terms of computational complexity and estimation accuracy, where
\[
\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\|p^{*(i)}-p\right\|^2},
\]
where $p^{*(i)}$ denotes the estimate of $p$ in (5.3) in the $i$-th Monte Carlo simulation and $N$ denotes the number of Monte Carlo simulations. These results demonstrate that the MM algorithm (Algorithm 2) exhibits better computational efficiency than the PDIP (primal-dual interior-point) algorithm (Algorithm 1), and that their estimation accuracies are comparable. In Figure 1, the DOA estimation performance of the MM method (Algorithm 2) is compared with that of the classical DOA estimation methods, likelihood-based estimation of sparse parameters (LIKES) in [12] and sparse iterative covariance-based estimation (SPICE) in [11], where
\[
\mathrm{RMSE}_{\mathrm{DOA}}=\sqrt{\frac{1}{4N}\sum_{i=1}^{N}\sum_{j=1}^{4}\left(\left(\xi^*_j\right)^{(i)}-\xi_j\right)^2},
\]
and $(\xi^*_j)^{(i)}$ is the estimate of $\xi_j$ in the $i$-th Monte Carlo simulation. It is observed that the MM method offers higher DOA estimation accuracy. In Figure 2, the normalized spectrum is defined as $p^*/\max(p^*)$, where $\max(p^*)$ denotes the maximal element of the vector $p^*$. The black dashed lines represent the true DOAs, and the normalized spectrum curves show the results of a randomly selected realization. The locations of the spectral peaks lie very close to the true DOAs, and a higher SNR corresponds to a higher angular resolution.
5.2. Toeplitz correlation matrix estimation
In a $\hat{q}$-order moving average time series (MA($\hat{q}$)),
\[
y_t=\varepsilon_t+\phi_1\varepsilon_{t-1}+\cdots+\phi_{\hat{q}}\,\varepsilon_{t-\hat{q}},
\]
where $\phi_i=\phi^i$, $i=1,\cdots,\hat{q}$, and $\varepsilon_t,\cdots,\varepsilon_{t-\hat{q}}$ are independently and identically distributed with zero mean and variance 1. A simple calculation shows that the $v\times v$ autocorrelation matrix $C$ exhibits a Toeplitz structure with autocorrelation coefficients that decrease as the time lag increases. Denote the first column of $C$ by $c$, and set $\hat{q}=5$. This experiment assesses the performance of the MM-T algorithm in estimating this autocorrelation matrix, focusing on its estimation accuracy and computation time, and compares them with those of the following methods:
● PDIP-T: the primal-dual interior-point algorithm solving (4.8);
● Samp: the estimator of Toeplitz covariance matrices mentioned in [16] that minimizes the Frobenius-norm loss to the sample covariance matrix;
● HY: the estimation of Toeplitz covariances via the entropy loss function using the alternating direction method of multipliers algorithm given in [21].
Note that the sample autocorrelation matrix is used as the given correlation matrix $R$ in the MM-T and PDIP-T algorithms, the sample autocovariance matrix is used as the given covariance matrix in the HY method, and the sample size is $n=1000$.
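For reproducibility, the MA($\hat q$) data and the sample autocorrelation input can be generated as follows (a sketch assuming $\phi_i=\phi^i$; the lag-product sample estimator and all names are our own):

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(0)
n, v, q_hat, phi = 1000, 10, 5, 0.5
coeffs = np.concatenate([[1.0], phi ** np.arange(1, q_hat + 1)])   # 1, phi, ..., phi^5

eps = rng.standard_normal(n + q_hat)                # i.i.d. innovations, variance 1
y = np.convolve(eps, coeffs, mode="valid")          # MA(q_hat) series of length n

y0 = y - y.mean()                                   # sample autocovariances, lags 0..v-1
acov = np.array([y0[: len(y0) - k] @ y0[k:] for k in range(v)]) / len(y0)
R = toeplitz(acov / acov[0])                        # sample autocorrelation matrix
```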
Table 2 lists the $\mathrm{RMSE}_{\mathrm{toep}}$ values corresponding to different $l$ under different $v$ and $\phi$. The $\mathrm{RMSE}_{\mathrm{toep}}$ is defined as
\[
\mathrm{RMSE}_{\mathrm{toep}}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left\|c^{*(i)}-c\right\|^2},
\]
where $c^{*(i)}$ denotes the estimate of $c$ achieved during the $i$-th Monte Carlo repetition and $N$ denotes the number of repetitions. For different $l$, the $\mathrm{RMSE}_{\mathrm{toep}}$ values remain stable. Therefore, we set $l=10$ for the remainder of this experiment.
Figure 3(a) depicts the computation times of the MM-T and PDIP-T algorithms. The results indicate that as the order of the autocorrelation matrix increases (i.e., as the considered time lag $v-1$ increases), the computation time of MM-T grows more slowly than that of PDIP-T, and when $v\geq50$, MM-T requires less time.
Figure 3(b) compares the estimation accuracies of the different methods. The MM-T and PDIP-T algorithms exhibit lower RMSEtoep values, which is reasonable as they utilize a decreasing correlation structure, unlike the Samp and HY methods. The estimation accuracies of the MM-T and PDIP-T algorithms are essentially equal; however, when v is moderate to large, the computation time required by MM-T is shorter. Therefore, the MM-T algorithm is recommended over PDIP-T.
5.3. Real data analysis
We collected the closing stock prices of 511 companies in the technology field on Fridays during the period between 2023-09-15 and 2023-11-17 (8 weeks). We compute the volatility series as follows:
\[
\mathrm{Volatility}_{\text{current Friday}}=\frac{\mathrm{Close}_{\text{current Friday}}-\mathrm{Close}_{\text{previous Friday}}}{\mathrm{Close}_{\text{previous Friday}}},
\]
where $\mathrm{Close}_{\text{current Friday}}$ and $\mathrm{Close}_{\text{previous Friday}}$ denote the closing stock prices on the current and previous Fridays, respectively, and $\mathrm{Volatility}_{\text{current Friday}}$ denotes the volatility of the closing stock price on the current Friday.
Subsequently, to investigate the temporal dependencies in the volatility series, we assume that the autocorrelation matrix of the volatility series exhibits a Toeplitz structure, and we apply the MM-T and HY algorithms to estimate this autocorrelation matrix. Table 3 presents the first column of the estimated autocorrelation matrix, i.e., the autocorrelation coefficients corresponding to different time lags.
Table 3 shows that the results of the MM-T and HY methods are similar. Moreover, the autocorrelation coefficients corresponding to different lags are nearly identical, lying around the small value of 0.22. The small autocorrelation coefficients indicate no obvious dependency between the volatilities on different Fridays, which is consistent with the unpredictable nature of stock price fluctuations.
6. Conclusions
We focus on estimating the W-Rank1-W structural covariance matrix, which appears commonly in signal processing, and the auto-correlation matrix characterized by decreasing correlation coefficients, which appears commonly in time-series analysis. To this end, novel algorithms are proposed within the MM algorithm framework. Numerical experiments reveal that the proposed MM algorithms exhibit superior computational efficiency. The proposed strategy for constructing the surrogate function under the MM framework can be applied to other similar convex optimization problems.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 62203451, the Sichuan Province Science and Technology Support Program under Grants 2022JDRC0068 and 2021JDRC0080, and the Fundamental Research Funds for the Central Universities under Grant 24CAFUC03055.
Conflict of interest
The authors declare that they have no competing interests.