
The parameter-Newton iteration for the second-order cone linear complementarity problem

  • In this paper, we propose the parameter-Newton (PN) method to solve the second-order cone linear complementarity problem (SOCLCP). The key idea of the PN method is to transform the SOCLCP into a system of nonlinear equations by introducing a parameter, and then to solve this system by Newton's method. Finally, we prove that the PN method converges quadratically. Compared with the bisection-Newton (BN) method, the PN method requires less CPU time and achieves higher accuracy in numerical tests.

    Citation: Peng Zhou, Teng Wang. The parameter-Newton iteration for the second-order cone linear complementarity problem[J]. Electronic Research Archive, 2022, 30(4): 1454-1462. doi: 10.3934/era.2022076




    Video summarization, the task of condensing an original video into a concise summary, captures the most salient video information. Since the 1990s, video summarization technology has received considerable attention both domestically and internationally. It contributes significantly to the rapid understanding of video content and is an efficient tool for fast browsing and retrieval of videos [1].

    Key-frame extraction is an indispensable component of static video summarization: it characterizes the principal video contents with a set of representative frames, providing a convenient way to grasp video information quickly and comprehensively. The goal is to extract the video summary accurately and automatically. Key-frame extraction methods can be divided into three categories: shot based, content based and cluster based [2]. Several shot-based techniques have been developed in computer vision and image processing [3]. Huang C. extracts representative frames from each shot by computing frame-image differences in saliency and edge-map features [4]. Mehmood I. analyzes the differences between frame images within a shot by modeling auditory and perceptual attention features [5]. Song G. H. computes the color difference within one shot by employing the average histogram method [6]. Shot-based methods typically segment the original video into several shots first; however, the shot segmentation process is computationally expensive. Content-based methods avoid this problem. Rachida H. proposes MSKVS, a content-based method that measures the inter-frame distance by temporal and visual features; MSKVS outperforms other content-based methods [7]. Gianluigi C. conducts experiments on six news and sports competition videos with his content-based method, and the results demonstrate that it extracts key frames effectively [8]. Generally, content-based methods analyze the video content by extracting color, texture or motion features. Their limiting factor is the computational cost incurred in processing the frame-image features [9]. This limitation encourages the development of cluster-based methods, which group similar frames together and extract one representative frame from each class. They avoid the shot-segmentation errors of shot-based methods, reduce the frequency of inter-frame difference analysis required by content-based methods, and are well suited to key-frame extraction in related fields [10,11,12].

    These limitations have encouraged the development of cluster-based key-frame extraction algorithms [13]. The prevailing steps are to prepare the feature data, aggregate the video frames into multiple clusters, and extract a representative frame from each cluster to compose the video summary [14]. Cluster-based key-frame extraction not only avoids the errors and complexity of shot segmentation, but also decreases the frequency of inter-frame difference analysis. However, cluster-based methods still suffer from several problems: a single image feature cannot fully represent a frame image; manually determining the number of clusters and the initial cluster centers leads to a low degree of automation; and the extracted key frames may not represent the original video well. This paper therefore aims to improve both the accuracy and the automation of key-frame extraction. We propose a novel cluster-based key-frame extraction approach with the following benefits:

    1) A video content analysis method is proposed to improve the representativeness of the video feature data by using three visual features: color, texture and information complexity.

    2) We develop a threshold optimization method that avoids manual selection of the clustering threshold, improving the automation of cluster-based key-frame extraction.

    3) We fuse the frame density, inter-cluster distance and intra-cluster distance to filter the key-frame candidates, and employ the maximum weight factor to further refine them. This improves the frame representativeness and the overall fidelity.

    The rest of this paper is organized as follows. Section 2 describes the details of cluster-based methods. Section 3 presents the implementation of the feature extraction method. Section 4 provides a detailed description of the proposed cluster-based key-frame extraction method. Section 5 explores the performance of the proposed approach. Finally, the major findings are discussed and summarized in Section 6.

    Cluster-based methods typically transform the frame images into data points in a feature space and cluster these points to extract key frames. Like general clustering algorithms, they gather similar elements together and take the cluster centers as the representatives of the clusters. Recently, several cluster-based key-frame extraction methods have been proposed in the literature [15,16].

    Cluster-based methods can be divided into two basic categories: automatic and semi-automatic [17]. In general, a semi-automatic cluster-based method requires manual determination of the initial cluster centers and the number of clusters. Setting the number of clusters in advance may degrade the key-frame extraction results [14]; it is more reasonable to determine the number of clusters from the video content during the clustering process. Automatic key-frame extraction is therefore more practical. Among the automatic schemes, Kuanar extracts color and texture features and proposes an automated key-frame extraction method using dynamic Delaunay graph clustering [10]. In [11], a fused key-frame extraction framework is proposed that generates video summaries by combining sparse selection with agglomerative hierarchical clustering based on mutual information; it extracts candidate key frames with an improved MIAHC algorithm, improving both fidelity and operational efficiency. In [12], the authors propose coarse-to-fine clustering for key-frame extraction: traditional spectral clustering on simple histogram features first removes most of the redundant frames, and image classification based on SIFT-feature sparse coding then performs fine clustering within each time period. In [19], Liu and his colleagues propose a key-frame extraction method combining the k-means algorithm and hierarchical clustering: they obtain initial clusters with an improved hierarchical clustering algorithm and then optimize them with k-means. In [20], the authors propose a novel cluster-based algorithm inspired by density-peak clustering, which gathers similar frames into classes by integrating the important attributes of the video. In [21], the proposed cluster-based method combines image information entropy with a density clustering algorithm to extract key frames from gesture videos. In cluster-based key-frame extraction, the calculation of the clustering threshold strongly affects the fidelity and compression rate of the key frames; the core idea behind such automatic methods is to set a favorable threshold. In the literature, researchers usually obtain the threshold from a defined formula or a fixed value: Kuanar S. K. computes the cluster threshold with the formula 2(1 - ε) [10], Jeong D. J. fixes the cluster threshold at 0.0001 [12], and other researchers compute it with self-defined formulas [11,19]. For representative frame extraction, some cluster-based techniques take the cluster centers or centroids as the representative frames: in [10], the author selects the frames closest to the cluster centroids, while in [19,20] the authors directly extract the cluster centers.

    In general, these cluster-based methods may retain redundant frames because the clustering threshold setting influences the optimality of key-frame extraction. Moreover, they evaluate the representativeness of the frames in a cluster with a single image feature, which cannot fully characterize the frame content and complexity. A more reasonable approach is to compute an optimal threshold and fuse multiple features in cluster-based key-frame extraction.

    In this section, we provide a detailed description of the proposed OTMW method, which comprises feature extraction and key-frame extraction. First, the color, texture and information complexity features are computed to express the video content. Then, an optimization function is developed to compute the optimal clustering threshold. Next, the frame density, inter-cluster distance and intra-cluster distance are computed and fused into the clustering weight factor. Finally, a Max Weight method is proposed to extract the representative frame of each cluster. The proposed approach is summarized in Figure 1.

    Figure 1.  Framework of the proposed cluster-based method.

    In this section, we describe the proposed video content analysis method, which distinguishes frames by computing feature data [22]. We extract color, texture and information complexity features to discriminate between different frame images. The video frame feature vector is constructed as

    $FFV = [C \;\; T \;\; E]$ (1)

    where C, T and E represent the color, texture and information complexity, respectively.

    We take the color feature as the first feature to characterize the differences between frame images. We compute the first, second and third color moments in the H, S and V channels to construct the color feature vector of each frame image. The first color moment reflects the brightness, and is calculated by

    $C_m = \frac{1}{w \times h}\sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q)$ (2)

    where $w$ and $h$ are the image width and height in pixels, $f_i(x_p, y_q)$ is the pixel value at position $(x_p, y_q)$, and $1 \le p \le w$, $1 \le q \le h$. The second color moment reflects the range of the color distribution, and is computed by

    $C_v = \left(\frac{1}{w \times h}\sum_{p=1}^{w}\sum_{q=1}^{h}\left(f_i(x_p, y_q) - C_m\right)^2\right)^{\frac{1}{2}}$ (3)

    The third color moment represents the color distribution symmetry, which is computed by

    $C_s = \left(\frac{1}{w \times h}\sum_{p=1}^{w}\sum_{q=1}^{h}\left(f_i(x_p, y_q) - C_m\right)^3\right)^{\frac{1}{3}}$ (4)

    where $C_m$, $C_v$ and $C_s$ are the first moment (mean), second moment (variance) and third moment (skewness), respectively.
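    As a minimal sketch of Eqs. (2)-(4) (assuming a frame already converted to the HSV color space and held as a NumPy array; the function name color_moments is ours, not from the paper), the nine-dimensional color feature might be computed as:

    import numpy as np

    def color_moments(hsv_frame: np.ndarray) -> np.ndarray:
        """First, second and third color moments (Eqs. (2)-(4)) for each of
        the H, S and V channels; returns a 9-dimensional feature vector."""
        feats = []
        for ch in range(3):
            f = hsv_frame[:, :, ch].astype(np.float64)
            c_m = f.mean()                          # Eq. (2): mean
            c_v = np.sqrt(np.mean((f - c_m) ** 2))  # Eq. (3): standard deviation
            s3 = np.mean((f - c_m) ** 3)            # Eq. (4): skewness term
            c_s = np.sign(s3) * np.abs(s3) ** (1.0 / 3.0)  # signed cube root
            feats.extend([c_m, c_v, c_s])
        return np.array(feats)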

    We take the texture feature as the second feature to characterize the differences between frame images in terms of surface structural organization [23]. We compute the means of the angular second moment, contrast, correlation and homogeneity texture features over the 0°, 45°, 90° and 135° directions to construct the texture feature vector of each frame. The angular second moment characterizes the coarseness and gray-distribution uniformity of an image, and can be calculated by

    $T_A = \frac{1}{w \times h}\sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q)^2$ (5)

    The contrast characterizes the groove depth and clarity of an image; it can be calculated by

    $T_{Con} = \frac{1}{w \times h}\sum_{p=1}^{w}\sum_{q=1}^{h} (p - q)^2 f_i(x_p, y_q)$ (6)

    The correlation characterizes the local gray-level similarity in the row or column direction; it can be calculated by

    $T_{Cor} = \frac{\sum_{p=1}^{w}\sum_{q=1}^{h}\left(x_p - \sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q)\, p\right)\left(y_q - \sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q)\, q\right) f_i(x_p, y_q)^2}{\sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q)\left(x_p - \sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q)\, p\right)^2}$ (7)

    The homogeneity characterizes the local gray-level uniformity of an image; it can be calculated by

    $T_H = \sum_{p=1}^{w}\sum_{q=1}^{h} f_i(x_p, y_q) \times \frac{1}{1 + (p - q)^2}$ (8)
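    In practice these four descriptors are commonly computed from the gray-level co-occurrence matrix rather than directly from the pixel values of Eqs. (5)-(8). A hedged sketch using scikit-image (an assumption on our part, not necessarily the authors' implementation; requires scikit-image ≥ 0.19 for the graycomatrix/graycoprops names) might look like:

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    def texture_features(gray_frame: np.ndarray) -> np.ndarray:
        """Mean ASM, contrast, correlation and homogeneity over the
        0, 45, 90 and 135 degree directions (cf. Eqs. (5)-(8))."""
        levels = 32  # quantize to 32 gray levels to keep the matrix small
        img = (gray_frame.astype(np.float64) / 256.0 * levels).astype(np.uint8)
        glcm = graycomatrix(img, distances=[1],
                            angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                            levels=levels, symmetric=True, normed=True)
        props = ["ASM", "contrast", "correlation", "homogeneity"]
        # graycoprops returns one value per (distance, angle); average over angles
        return np.array([graycoprops(glcm, p).mean() for p in props])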

    We take the information complexity as the last feature, characterizing the differences between frame images in terms of aggregated and spatial properties. Information entropy, proposed by Shannon, measures information complexity from a holistic perspective [24] and can characterize the aggregated and spatial features of an image. A larger information entropy indicates a greater degree of internal non-uniformity and a higher diversity level. The two-dimensional information entropy $E_{f_i}$ can be calculated by

    $E = -\sum_{f} C_f \times \log_2 C_f$ (9)

    where $C_f$ is the occurrence probability of each gray level in the $i$-th frame image, i.e., the number of pixels at that gray level divided by $w \times h$.
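    Completing the feature pipeline of Eq. (1), a sketch of the entropy term of Eq. (9) and of the 14-dimensional FFV assembly (9 color moments + 4 texture means + 1 entropy, reusing the two sketches above; the helper names are ours) might be:

    import numpy as np

    def entropy_feature(gray_frame: np.ndarray) -> float:
        """Shannon information entropy of the gray-level distribution (Eq. (9))."""
        hist = np.bincount(gray_frame.ravel(), minlength=256).astype(np.float64)
        c_f = hist / hist.sum()          # occurrence probability of each gray level
        c_f = c_f[c_f > 0]               # drop empty bins to avoid log2(0)
        return float(-(c_f * np.log2(c_f)).sum())

    def frame_feature_vector(hsv_frame, gray_frame) -> np.ndarray:
        """FFV = [C T E] (Eq. (1)): the 14-dimensional vector used for clustering."""
        return np.concatenate([color_moments(hsv_frame),
                               texture_features(gray_frame),
                               [entropy_feature(gray_frame)]])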

    In this section, we describe the proposed cluster based key frame extraction method, which develops a new optimization function to compute the optimal threshold.

    We narrow the search interval of the optimal threshold by computing the function values at trial points. In cluster-based methods, the fidelity [8] and ratio [7] are negatively and positively correlated with the threshold, respectively. We therefore infer that the quality of the key frames is optimal when fidelity and ratio are as close as possible, and we introduce a new parameter FR to characterize this relationship and obtain the optimal key frames. The distance between frame $x^{(i)}$ and frame $x^{(j)}$ is computed by

    $d_{ij} = \left\|x^{(i)} - x^{(j)}\right\|_2 = \left(\sum_{u=1}^{n}\left|x^{(i)}_u - x^{(j)}_u\right|^2\right)^{\frac{1}{2}}$ (10)

    where $i, j \in \{1, 2, \ldots, m\}$.

    The average distance is computed by

    $d_c = \frac{2}{m(m-2)}\sum_{i=1}^{m}\sum_{j=1}^{m} d_{ij}$ (11)

    The threshold is defined as

    $t = d_c \pm \varepsilon \times std_{ij}$ (12)

    where $\varepsilon$ is a variable factor and $std_{ij}$ is the standard deviation of $d_{ij}$. We define the new parameter FR as

    $FR(t) = \frac{fidelity(t)}{ratio(t)}$ (13)

    The threshold optimization function is defined as

    $f(t) = \left|FR(t) - 1\right|$ (14)

    Given $a$ and $b$ ($a < b$, with $a = d_c - 3 \times std_{ij}$ and $b = d_c + 3 \times std_{ij}$), we compute the trial point $c$ and evaluate $f(a)$, $f(b)$ and $f(c)$, then update $c$ by comparing these values. When $f(c) < 0$, fidelity is less than ratio, so we change $c$ to increase fidelity and decrease ratio. When $f(c) \ge 0$, fidelity is greater than ratio, so we change $c$ to decrease fidelity and increase ratio. If $f(c) = f(a) = f(b)$ for three consecutive iterations and $(b - a) \le 0.001$, we take the optimal cluster threshold to be $\rho = \frac{a+b}{2}$. The calculation process is given in Algorithm 1.

    Algorithm 1 Compute the optimal cluster threshold
    Input: parameters a and b
    Output: optimal threshold ρ
    compute c = a + 0.618 × (b − a);
    compute f(a), f(b), f(c);
    if not (f(a) = f(b) = f(c)) and (b − a) > 0.001 then
      while (b − a) > 0.001 do
        compute c = a + 0.382 × (b − a);
        compute f(a), f(b), f(c);
        while not (f(a) = f(b) = f(c)) do
          if f(c) < 0 then
            change c to increase fidelity and decrease ratio, b = c;
          else
            change c to decrease fidelity and increase ratio, a = c;
          end if
        end while
      end while
    end if
    ρ = (a + b) / 2;
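    As a runnable illustration (a minimal sketch, not the authors' code: fidelity(t) and ratio(t) are hypothetical callables that cluster the frames at threshold t and score the resulting summary; the paper alternates 0.618 and 0.382 trial points and tests the equality of f(a), f(b) and f(c), whereas this sketch keeps a single 0.618 step and a plain bracket-width test for brevity), the bracket of Eq. (12) and the search of Algorithm 1 might look as follows:

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def threshold_bracket(ffv: np.ndarray):
        """Search interval [a, b] with epsilon = 3 (Eqs. (10)-(12))."""
        m = ffv.shape[0]
        d = squareform(pdist(ffv))              # pairwise distances d_ij, Eq. (10)
        d_c = 2.0 / (m * (m - 2)) * d.sum()     # average distance, Eq. (11)
        std_ij = pdist(ffv).std()               # standard deviation of d_ij
        return d_c - 3 * std_ij, d_c + 3 * std_ij

    def optimal_threshold(fidelity, ratio, a, b, tol=1e-3, max_iter=100):
        """Shrink [a, b] toward the threshold where FR(t) = fidelity(t)/ratio(t)
        is closest to 1 (Eqs. (13)-(14)); return rho = (a + b) / 2."""
        def g(t):                               # signed objective FR(t) - 1
            return fidelity(t) / ratio(t) - 1.0
        for _ in range(max_iter):
            if b - a <= tol:
                break
            c = a + 0.618 * (b - a)
            if g(c) < 0:        # fidelity below ratio: lower the threshold
                b = c
            else:               # fidelity above ratio: raise the threshold
                a = c
        return (a + b) / 2.0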


    In this section, we compute the initial cluster centers and the number of clusters by clustering the FFV data. The frame feature vector data set $FFV = \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}$ consists of 14-dimensional frame feature vectors comprising the color, texture and information entropy features.

    The density of sample frame $i$ in the video stream is defined as:

    $\rho_i = \sum_{j} e^{-\left(\frac{d_{ij}}{d_c}\right)^2}$ (15)

    The distance of the j-th frame image from the i-th frame image in the same cluster is defined as:

    $\eta_i = dist\left(x^{(i)}, x^{(j)}\right), \quad i \in D,\; j \neq i$ (16)

    where i and j are in the same cluster.

    The distance between the i-th frame element and the element j of another cluster is defined as:

    $\varphi_i = dist\left(x^{(i)}, x^{(j)}\right), \quad i \in D,\; j \in \mu^{(k)}$ (17)

    where $i$ is a frame in one cluster and $j$ ranges over the centers of the clusters that have already been determined.

    According to Eq. (17), $\varphi_i$ may contain multiple elements. The weight factor is defined as the product of $\rho_i$, $\eta_i^{-1}$ and $\varphi_i$:

    $\omega(x) = \prod\left(\rho_i \times \eta_i^{-1} \times \varphi_i\right)$ (18)
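    A hedged sketch of the per-frame weight of Eq. (18) follows. The paper leaves the aggregation of $\eta_i$ implicit, so taking the minimum intra-distance is our assumption, as is taking the product over the already-fixed centers for $\varphi_i$; the function name and signature are ours:

    import numpy as np

    def weight_factors(ffv, remaining, centers, d, d_c):
        """Weights of Eq. (18) for the frames still in D.
        ffv: (m, 14) feature matrix; remaining: indices still in D;
        centers: indices of cluster centers fixed so far;
        d: precomputed m x m distance matrix; d_c: average distance, Eq. (11)."""
        w = {}
        for i in remaining:
            rho = np.exp(-(d[i, remaining] / d_c) ** 2).sum()   # density, Eq. (15)
            others = [j for j in remaining if j != i]
            eta = d[i, others].min() if others else 1.0          # intra-distance, Eq. (16)
            phi = np.prod(d[i, centers]) if len(centers) else 1.0  # inter-distances, Eq. (17)
            w[i] = rho * (1.0 / eta) * phi                       # Eq. (18)
        return w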

    The initial cluster centers and the cluster number $k$ are directly determined by $\omega(x)$; the solution is shown in Algorithm 2. First, the density of each sample is calculated using Eq. (15), and the maximum-density frame is selected as the first initial clustering center $\mu^{(1)}$. The frames whose distance to $\mu^{(1)}$ is less than $t$ are then classified into the first cluster and removed from $D$. Next, $\omega_i$ is calculated for the remaining frames in $D$ using Eq. (18), and the second initial cluster center $\mu^{(2)}$ is obtained as the maximizer of $\omega(x)$. Similarly, the samples satisfying the same condition are assigned to the second, third, ..., $k$-th clusters and removed from $D$. Finally, all samples are assigned to clusters, yielding the initial cluster centers $\mu^{(1)}, \mu^{(2)}, \ldots, \mu^{(k)}$ and the number $k$. The computation process is shown in Figure 2.

    Figure 2.  The computation of initial cluster centers and the number of clusters.
    Figure 3.  Results of the video from Open video dataset.
    Algorithm 2 Compute the initial cluster centers and the number of clusters
    Input: FFV = {x(1), x(2), ..., x(m)}
    Output: the initial cluster centers {μ(1), μ(2), ..., μ(k)} and the number of clusters k
    for each sample i ∈ D do
      compute the density ρi;
    end for
    while D ≠ ∅ do
      set the frame with maximum ρi as the initial cluster center μ(1);
      if d(μ(1), i) ≤ t then
        i ∈ C1, remove i from D;
      end if
      for each sample i ∈ D (Ci ⊆ D) do
        compute ηi, φi, ωi;
        set the next cluster center μ(i) = argmax_i ωi;
        i ∈ Ci, remove i from D;
      end for
    end while
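    A minimal runnable sketch of Algorithm 2 (reusing weight_factors from the sketch above; the peel-off of all frames within threshold t of each new center follows the description in the text, and the function name is ours):

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    def initial_centers(ffv, t, d_c):
        """Pick density/weight peaks as initial centers; peel off all frames
        within threshold t of each new center until D is empty."""
        m = ffv.shape[0]
        d = squareform(pdist(ffv))
        remaining = list(range(m))
        centers = []
        while remaining:
            if not centers:
                rho = {i: np.exp(-(d[i, remaining] / d_c) ** 2).sum()
                       for i in remaining}
                c = max(rho, key=rho.get)      # maximum-density frame, Eq. (15)
            else:
                w = weight_factors(ffv, remaining, centers, d, d_c)
                c = max(w, key=w.get)          # maximum-weight frame, Eq. (18)
            centers.append(c)
            # keep only frames farther than t from the new center
            remaining = [i for i in remaining if d[i, c] > t]
        return centers, len(centers)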


    In this section, we classify the frame images into $k$ clusters $C = \{C_1, C_2, \ldots, C_k\}$ and extract a representative frame from each cluster. The sum-of-squared-errors criterion is used as the criterion function, and the frame images are assigned to clusters by Algorithm 3. Given the resulting clusters $C = \{C_1, C_2, \ldots, C_k\}$, the specific steps are as follows:

    Algorithm 3 Cluster the frame feature values
    Input: frame feature values FFV = {x(1), x(2), ..., x(m)}, the initial cluster centers {μ(1), μ(2), ..., μ(k)} and the number of clusters k.
    Output: k clusters C = {C1, C2, ..., Ck}
      Let Ci = ∅ (1 ≤ i ≤ k);
    repeat
      for j = 1, 2, ..., m do
        compute the distance d_ji between the sample x(j) and each cluster center μ(i) (1 ≤ i ≤ k);
        compute λj = argmin_i d_ji;
        assign the sample x(j) to the nearest cluster: Cλj = Cλj ∪ {x(j)};
      end for
      for i = 1, 2, ..., k do
        calculate the new cluster center (μ(i))′ = (1/|Ci|) Σ_{x∈Ci} x;
        if (μ(i))′ ≠ μ(i) then
          update the current cluster center μ(i) to (μ(i))′;
        else
          keep the current mean vector unchanged;
        end if
      end for
    until the current cluster center vectors are no longer updated.
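    A compact runnable sketch of Algorithm 3 (k-means refinement starting from the centers found by Algorithm 2; the function name and the max_iter safeguard are ours):

    import numpy as np

    def refine_clusters(ffv, centers_idx, max_iter=100):
        """Assign frames to nearest centers and recompute means until stable."""
        mu = ffv[centers_idx].copy()               # initial cluster centers
        k = len(centers_idx)
        for _ in range(max_iter):
            # distance of every frame to every center, then nearest assignment
            d = np.linalg.norm(ffv[:, None, :] - mu[None, :, :], axis=2)
            labels = d.argmin(axis=1)
            # recompute centers as cluster means (keep old center if cluster empty)
            new_mu = np.array([ffv[labels == i].mean(axis=0)
                               if np.any(labels == i) else mu[i]
                               for i in range(k)])
            if np.allclose(new_mu, mu):            # centers no longer updated
                break
            mu = new_mu
        return labels, mu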


    Step 1: Calculate $\rho_i$, $\eta_i$ and $\varphi_i$ for the frames in cluster $C_1$ by Eqs. (15), (16) and (17).

    Step 2: Compute the maximum weight factor by Eq. (18) and select key frame $f_1$ from cluster $C_1$.

    Step 3: Similarly, obtain the key frame of the next cluster by computing the maximum weight factor in Eq. (18).

    Step 4: Repeat Step 3 until the representative frames of all clusters are selected.

    Step 5: The key frames $f = \{f_1, f_2, \ldots, f_k\}$ are extracted, and the video summary is generated.
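    A hedged sketch of this Max Weight selection (reusing weight_factors from above; taking each cluster's medoid, the frame closest to the cluster mean, as its center for the inter-distance term is our assumption, and the function name is ours):

    import numpy as np

    def extract_key_frames(ffv, labels, d, d_c):
        """In each cluster, take the frame with the largest weight factor
        (Eq. (18)) as the representative frame (Steps 1-5)."""
        k = labels.max() + 1
        centers = []
        for ci in range(k):
            members = np.where(labels == ci)[0]
            mean = ffv[members].mean(axis=0)
            # medoid: cluster member closest to the cluster mean
            centers.append(members[np.linalg.norm(ffv[members] - mean,
                                                  axis=1).argmin()])
        key_frames = []
        for ci in range(k):
            members = list(np.where(labels == ci)[0])
            other_centers = [centers[cj] for cj in range(k) if cj != ci]
            w = weight_factors(ffv, members, other_centers, d, d_c)
            key_frames.append(max(w, key=w.get))   # maximum-weight frame
        return sorted(key_frames)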

    In this section, we turn to an empirical study of the proposed OTMW method on key-frame extraction tasks, comparing it with state-of-the-art cluster-based key-frame extraction methods, namely HVM [25], ACSC [26], RGPH [27] and FCME [28]. We report improved performance on the Open Video dataset. We conduct experiments on four different types of videos: surveillance, documentary, TV lecture and phone recording. These videos are publicly available at https://open-video.org/. Below we take the Hcil2000_01 video, a randomly chosen video from the Open Video dataset, as an example to report the performance of the OTMW method.

    The parameters of the Hcil2000_01 video during threshold optimization are shown in Table 1. For this video, the average distance $d_c$ and the standard deviation $std_{ij}$ of $d_{ij}$ are 2.3107 and 0.2259, respectively, so $a = 1.6442$ and $b = 2.9993$, and the variable interval of $c$ is [1.6442, 2.9993]. The trial point is computed as $c = a + 0.618 \times (b - a) = 2.481652$. In the sixth iteration, $f(a) = f(b) = f(c) = -0.0063$, and the calculation of $c$ switches to $c = a + 0.382 \times (b - a)$ in subsequent iterations. As shown in Table 1, the values $f(a) = f(b) = f(c) = -0.0063$ do not change over the next two iterations, but $b - a = 0.122156 > 0.001$. Finally, $b - a = 0.000509 < 0.001$ at the 13th iteration, so the optimal threshold is $\rho = \frac{a+b}{2} = 1.644709$.

    Table 1.  The parameters of Hcil2000_01 video in threshold optimization.
    iterations a b c f'(a) f'(b) f'(c)
    1 1.6442 2.9993 2.4817 -0.63 -7.6 -6.95
    2 1.6442 2.4817 2.1617 -0.63 -6.95 -3.33
    3 1.6442 2.1617 1.9640 -0.63 -3.33 -2.11
    4 1.6442 1.9640 1.8419 -0.63 -2.11 -2.11
    5 1.6442 1.8419 1.7664 -0.63 -2.11 -0.63
    6 1.6442 1.7663 1.7197 -0.63 -0.63 -0.63
    7 1.6442 1.7197 1.6720 -0.63 -0.63 -0.63
    8 1.6442 1.6720 1.6544 -0.63 -0.63 -0.63
    9 1.6442 1.6544 1.6479 -0.63 -0.63 -0.63
    10 1.6442 1.6479 1.6455 -0.63 -0.63 -0.63
    11 1.6442 1.6455 1.6447 -0.63 -0.63 -0.63
    12 1.6442 1.6447 1.6443 -0.63 -0.63 -0.63
    13 1.6442 1.6447 1.6445 -0.63 -0.63 -0.63
    Note: $f'(a) = 100 f(a)$, $f'(b) = 100 f(b)$ and $f'(c) = 100 f(c)$.


    Key frames are extracted with the OTMW method across the Open Video dataset. The key frames of the Hcil2000_01 video are shown in Figure 3, and the fidelity and ratio results are shown in Table 2, where Nf is the number of frames and Nrf is the number of key frames. The fidelity measures of the different videos range from 93 to 98, with an average of 96.12; the ratio measures range from 95 to 98, with an average of 97.13. The extracted key frames are consistent with human judgment.

    Table 2.  The fidelity and ratio performance of videos.
    video name Nf Nrf fidelity ratio
    Traffic monitoring 120 5 95.94 95.83
    Office video_01 161 5 96.84 96.89
    Entertainment_01 151 5 97.14 96.69
    Entertainment_02 191 7 98.01 96.34
    Entertainment_03 101 4 97.79 96.04
    Office video_02 77 3 96.34 96.10
    Indi012 201 6 94.53 97.00
    UGS01_008 301 5 93.58 98.34
    UGS07_005 400 8 95.07 98.00
    UGS01_006 360 8 95.81 97.78
    UGS01_001 331 7 94.26 97.89
    Hcil2000_01 210 7 95.58 96.67
    Marchionini 100 4 96.96 96.00
    RayDiessenIBM 230 6 96.73 97.39
    Entertainment_04 201 3 97.63 98.51
    Daily life 283 5 97.45 98.23


    We compare OTMW with state-of-the-art cluster-based algorithms in terms of the fidelity and ratio measures. In the experiments, the number of clusters for the semi-automatic cluster-based methods is set to be the same as for OTMW. The fidelity and ratio results are shown in Figure 4 and Table 3. The HVM method achieves an average fidelity of 83.75 and an average ratio of 95.59; ACSC achieves an average fidelity of 85.75 and an average ratio of 94.79; the average fidelities of RGPH and FCME are 85.61 and 85.38, respectively. The proposed OTMW method achieves an average fidelity of 96.24 and an average ratio of 97.15, a fidelity improvement of 10.63-12.49 over the other cluster-based methods. The fluctuations of the ratio measure across videos are shown in Figure 5. OTMW has a ratio variance of 0.73, whereas the ratio variances of the HVM and ACSC methods are 22.11 and 11.12, about 30 and 15 times larger than that of OTMW, respectively. OTMW achieves a ratio improvement of 1.56-2.24 over the other cluster-based methods with small fluctuation.

    Figure 4.  The fidelity measure performance of cluster-based methods.
    Table 3.  The ratio measure performance of cluster-based methods.
    video name HVM ACSC OTMW
    Traffic monitoring 97.5 95.0 95.83
    Officevideo_01 98.76 88.20 96.8
    Entertainment_01 81.46 92.05 96.67
    Entertainment_02 98.43 93.72 96.34
    Entertainment_03 98.14 94.06 96.04
    Officevideo_02 98.70 89.61 96.10
    Indi012 99.50 94.50 97.02
    UGS01_008 95.26 98.50 98.34
    UGS07_005 93.75 98.50 98.00
    UGS01_006 96.11 97.78 97.78
    UGS01_001 89.43 98.19 97.89
    Hcil2000_01 90.95 97.14 96.67
    Marchionini 99.00 96.00 96.96
    RayDiessenIBM 94.35 99.13 97.39
    Entertainment_04 99.50 89.55 98.51
    Daily life 98.58 94.70 98.23

    Figure 5.  The ratio error-bar of cluster-based methods.

    To assess the performance of the OTMW method, we consider key-frame extraction tasks on the surveillance, documentary, TV lecture and phone recording datasets. The HVM method achieves average fidelities of 82.42, 86.90, 84.36 and 81.22, and average ratios of 95.76, 92.73, 94.77 and 99.05, respectively. The ACSC method achieves average fidelities of 88.40, 84.07, 81.02 and 89.57, and average ratios of 92.19, 97.37, 97.42 and 92.13. The average fidelities of the FCME method are 86.39, 83.77, 85.51 and 87.62, and the average ratios of the RGPH method are 86.39, 83.18, 85.94 and 86.45. The OTMW method achieves average fidelities of 97.07, 94.40, 95.87 and 97.54 and average ratios of 96.43, 97.83, 97.01 and 98.37. The fidelity measures on the various datasets are shown in Figure 6. OTMW achieves a fidelity improvement of 9.91-11.66 and a ratio improvement of 0.91-2.77 over HVM, ACSC, FCME and RGPH.

    Figure 6.  The fidelity results of cluster-based methods on different datasets.

    In this paper, an innovative cluster-based key-frame extraction method is presented for multi-type videos. The method analyzes the video content by extracting color, texture and information complexity features. The threshold optimization function is constrained by the fidelity and ratio measures; by computing the optimal threshold, it avoids the dependence on a fixed threshold found in traditional cluster-based methods. The density $\rho_i$, intra-cluster distance $\eta_i$, inter-cluster distance $\varphi_i$ and weight factor $\omega_i$ are used to compute the initial cluster centers and the number of clusters, and to extract the representative frames from the $k$ clusters. The method shows promising results on different video datasets, and OTMW achieves competitive and even better fidelity and ratio performance compared with several state-of-the-art cluster-based methods. Overall, we find that OTMW is well suited to key-frame extraction in static video summarization. Whether the proposed method also applies in real-life production environments remains to be verified. In the future, we will explore applying the method to real-time video surveillance, and investigate how to integrate it into camera clients and apply it to daily production and life.

    This work was sponsored by the National Natural Science Foundation of P. R. China (No. 61872196, No. 61872194, No. 61902196, No. 62102194 and No. 62102196), the Scientific and Technological Support Project of Jiangsu Province (No. BE2019740, No. BK20200753 and No. 20KJB520001), the Major Natural Science Research Projects in Colleges and Universities of Jiangsu Province (No. 18KJA520008), the Major Natural Science Research Projects in Colleges and Universities of Anhui Province (No. KJ2019ZD20), the Six Talent Peaks Project of Jiangsu Province (No. RJFW-111), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province (No. KYCX19_0909, No. KYCX19_0911, No. KYCX20_0759, No. KYCX21_0787, No. KYCX21_0788 and No. KYCX21_0799).

    The authors declare there is no conflict of interest.



    © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
