Emotion recognition in talking-face videos using persistent entropy and neural networks

Eduardo Paluzo-Hidalgo; Rocio Gonzalez-Diaz; Guillermo Aguirre-Carrazana

doi:10.3934/era.2022034

Electronic Research Archive

2022, Volume 30, Issue 2: 644-660. doi: 10.3934/era.2022034

Research article

Department of Applied Mathematics I, University of Seville, Seville, Spain

Academic Editor: Qing Tian

Received: 16 November 2021 Revised: 07 February 2022 Accepted: 14 February 2022 Published: 18 February 2022

The automatic recognition of a person's emotional state has become a very active research field that involves scientists specialized in different areas such as artificial intelligence, computer vision, or psychology, among others. Our main objective in this work is to develop a novel approach, using persistent entropy and neural networks as main tools, to recognise and classify emotions from talking-face videos. Specifically, we combine audio-signal and image-sequence information to compute a topology signature (a 9-dimensional vector) for each video. We prove that small changes in the video produce small changes in the signature, ensuring the stability of the method. These topological signatures are used to feed a neural network to distinguish between the following emotions: calm, happy, sad, angry, fearful, disgust, and surprised. The results reached are promising and competitive, beating the performances achieved in other state-of-the-art works found in the literature.

Citation: Eduardo Paluzo-Hidalgo, Rocio Gonzalez-Diaz, Guillermo Aguirre-Carrazana. Emotion recognition in talking-face videos using persistent entropy and neural networks[J]. Electronic Research Archive, 2022, 30(2): 644-660. doi: 10.3934/era.2022034

1. Introduction

Discrete lifetime data—such as the number of appliance failures of a particular brand within a given time frame, the total number of machine operations prior to a failure, the number of bullets fired by a weapon before the first malfunction, and the anticipated lifespan of humans (in years)—are frequently handled in reliability lifetime studies. For more classic examples, see Szymkowiak and Iwinska ^[1]. Data scientists typically employ discrete models as analysis tools, such as the Poisson distribution, negative binomial distribution, and geometric distribution, in order to more correctly define, analyze, and model these data. But in many situations, these discrete distribution functions are not the best options. For instance, seasonal or periodic data cannot be handled by the Poisson distribution, while underdispersed data cannot be described by the negative binomial distribution. More suitable discrete lifetime distributions are required to explore many additional kinds of complex discrete lifespan data. Discretizing continuous random variables is a useful strategy that yields a discrete life model with characteristics that are comparable to the continuous model.

The essential concept of discretizing continuous random variables was first presented by Roy ^[2]. Specifically, let $Y$ be a continuous random variable with survival function $S_Y(y)$ . Define the random variable $Z = [Y]$ as the largest integer less than or equal to $Y$ . The probability mass function (PMF) $P(Z = z)$ of $Z$ can be expressed as

$\begin{align*} P(Z = z) = S_Y(z)-S_Y(z+1). \end{align*}$
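The construction above can be sketched in a few lines of Python. This is only a minimal illustration: the helper name `discretize` and the choice of an exponential lifetime are assumptions made for the example and do not come from the cited works. For an exponential $Y$ the discretized variable is geometric on $\{0, 1, 2, \dots\}$, which gives a simple check of the formula.

```python
import math

def discretize(survival):
    """Return the PMF of Z = [Y], given the survival function S_Y of Y."""
    def pmf(z):
        # P(Z = z) = S_Y(z) - S_Y(z + 1) for integer z >= 0
        return survival(z) - survival(z + 1)
    return pmf

# Illustrative choice (an assumption): Y ~ Exponential(rate), S_Y(y) = exp(-rate * y).
rate = 0.7
pmf = discretize(lambda y: math.exp(-rate * y))

# The discretized exponential is geometric on {0, 1, 2, ...}
# with success probability p = 1 - exp(-rate).
p = 1.0 - math.exp(-rate)
geometric = [(1.0 - p) ** z * p for z in range(5)]
discretized = [pmf(z) for z in range(5)]
```

Passing the Weibull survival function $e^{-\lambda y^\beta}$ to the same helper gives a discrete Weibull PMF supported on $\{0, 1, 2, \dots\}$.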

Many researchers have introduced new discrete lifetime distributions using this approach. For instance, the discrete normal distribution was first introduced by Roy ^[3]. Using the generic method of discretizing a continuous distribution, Krishna and Pundir ^[4] introduced the discrete Burr and Pareto distributions. In addition, Bracquemond and Gaudoin ^[5] provided an extensive overview of discrete distributions, such as the Weibull distribution, that are employed in reliability to describe discrete lifetimes of nonrepairable systems. It is well known that the Weibull distribution has become the most commonly used distribution for analyzing continuous life data due to its ability to fit various types of data and its relatively simple structure (Johnson et al. ^[6]). At least three versions of the corresponding discrete Weibull distribution exist: (a) the Type Ⅰ discrete Weibull distribution, which maintains the form of the continuous survival function (SF), as introduced by Nakagawa and Osaki ^[7]; (b) the Type Ⅱ discrete Weibull distribution, as suggested by Stein and Dattero ^[8]; and (c) the three-parameter discrete Weibull distribution, as introduced by Padgett and Spurrier ^[9]. The most popular of them is the Type Ⅰ discrete Weibull distribution, whose features have been extensively researched. Englehardt and Li ^[10] employed the discrete Weibull distribution to analyze pathogen counts in treated water over time. Barbiero ^[11,12] compared several parameter estimation methods for this distribution, and addressed minimum chi-square and least squares estimation. Vila et al. ^[13] studied in detail the basic theoretical properties of the Type Ⅰ discrete Weibull distribution and analyzed censored data. Yoo ^[14] extended the application of the discrete Weibull regression model to accommodate missing data. In addition, El-Morshedy et al. ^[15] conducted a detailed study on a new bivariate exponential discrete Weibull distribution.

The primary goal of this study is to improve current techniques for parameter estimation in complex discrete probability distribution models. The score equation of such a distribution typically has no explicit analytical solution, so the Newton algorithm is usually used to solve it numerically. Nevertheless, the algorithm's unreliable convergence and strong dependence on the initial value make it challenging to achieve the best estimation outcomes. Recently, Liu et al. ^[16] employed the majorization-minimization (MM) algorithm to improve the computation of the maximum likelihood estimates for the simplex distribution. Li and Tian ^[17] introduced a novel root-finding method known as the upper-crossing/solution (US) algorithm. In contrast to conventional iterative algorithms (like Newton's algorithm), the US algorithm can lessen the influence of initial values and converges strongly and stably to the real root of the objective equation at each iteration. The benefits of this technique have been illustrated on a few classic models, such as the Weibull distribution, gamma distribution, zeta distribution, and generalized Poisson distribution. Cai ^[18] improved the maximum likelihood estimation of the generalized gamma distribution parameters by combining the US algorithm with the second-derivative lower-bound function (SeLF) algorithm.

The essence of the US algorithm is to identify a $U$ -function $U(\theta|\theta^{(t)})$ , which reduces the solution of the complicated nonlinear equation $h(\theta) = 0$ to the solution of the equation $U(\theta|\theta^{(t)}) = 0$ , which has an explicit solution. Li and Tian ^[17] presented a variety of approaches for finding the $U$ -function, among which the first-derivative lower bound (FLB) function method requires only the first derivative of the objective function $h(\theta)$ , thereby reducing algorithmic complexity. In previous research, the US algorithm was generally used to find the roots of univariate nonlinear equations or to solve the maximum likelihood estimation problem of a multi-parameter probability distribution with an explicit partial score function. Specifically, for a probability distribution with two parameters $(\alpha, \beta)$ , the maximum likelihood estimator of one parameter could be expressed explicitly in terms of the other, for example as $\beta(\alpha)$ . However, Li and Tian ^[17] do not discuss whether this approach is applicable to more complex multi-parameter distributions, where the estimator of one parameter cannot be expressed explicitly in terms of the other; this issue deserves further investigation.

The rest of the paper will proceed as follows. Section 2 provides a detailed introduction to the US algorithm and FLB function method. The application of the suggested approach to the Type Ⅰ discrete Weibull distribution's maximum likelihood parameter estimation is covered in Section 3. Numerical simulation experiments will be conducted in Section 4 in order to evaluate the performance of the employed methods and compare them with alternative estimation approaches. Section 5 will demonstrate the applicability of the US algorithm through the analysis of two real data sets. Conclusions and discussions will be provided in Section 6.

2. The US algorithm

One of the most frequent problems in numerical computation is finding the zero of a function, that is, the root of an equation. Computing the maximum likelihood estimate (MLE) of parameters in classical statistics, or the maximum a posteriori estimate in Bayesian statistics, can typically be reduced to finding the zero of a nonlinear function $h (\theta)$ . Suppose that $h(\theta)$ is a nonlinear function of a single variable $\theta$ ; we must identify the unique root $\theta^\star$ such that

$\begin{equation} h(\theta) = 0,\qquad \theta \in \Theta \subseteq \mathbb{R}. \end{equation}$

(2.1)

The US algorithm is a recent root-finding method. Its procedure is similar to that of the commonly used EM (expectation-maximization) and MM (majorization-minimization) algorithms ^[17]. Each iteration consists of two primary steps: the upper-crossing step (U-step) and the solution step (S-step). The two primary advantages of this algorithm are as follows:

(a) It converges strongly and stably to the root $\theta^\star$ of Eq (2.1), that is, the sequence of iterates $\left\{\theta^{(t)} \right\}^{\infty}_{t = 0}$ satisfies

$\theta^{(0)} < \theta^{(1)} < \cdots < \theta^{(t)} < \cdots\leq \theta^\star \quad \text{or} \quad \theta^\star \leq \cdots < \theta^{(t)} < \cdots < \theta^{(1)} < \theta^{(0)}.$

(b) The Newton algorithm's sensitivity to the initial value is decreased.

Two new symbols, $\overset{sgn(\alpha)}{\leq}$ and $\overset{sgn(\alpha)}{\geq}$ , are introduced regarding the changing direction (CD) inequalities in order to simplify the explanation of the US algorithm. The specific definition is presented as follows: for two functions $f_1(x)$ and $f_2(x)$ on the same domain $\mathbb{Q}$ ,

$f_1(x)\overset{sgn(\alpha)}{\leq}f_2(x) \Leftrightarrow \left\{\begin{aligned} f_1(x)\leq f_2(x), \qquad \alpha > 0,\\ f_1(x) = f_2(x), \qquad \alpha = 0,\\ f_1(x)\geq f_2(x), \qquad \alpha < 0, \end{aligned}\right.$

and

$f_1(x)\overset{sgn(\alpha)}{\geq}f_2(x) \Leftrightarrow \left\{\begin{aligned} f_1(x)\geq f_2(x), \qquad \alpha > 0,\\ f_1(x) = f_2(x), \qquad \alpha = 0,\\ f_1(x)\leq f_2(x), \qquad \alpha < 0. \end{aligned}\right.$

2.1. Definition of the U-function

It is typically challenging to locate the root $\theta^\star$ of the nonlinear equation $h(\theta) = 0$ directly. The US algorithm aims to create a surrogate function $U(\theta|\theta^{(t)})$ to replace $h (\theta)$ , transforming the challenge of solving a complex nonlinear equation into solving the equation $U(\theta|\theta^{(t)}) = 0$ , which has an explicit solution. First, we assume that

$\begin{equation} \begin{aligned} h(\theta) < 0,\quad \forall \theta > \theta^\star \quad \text{and} \quad h(\theta) > 0,\quad \forall \theta < \theta^\star. \end{aligned} \end{equation}$

(2.2)

If instead $h(\theta) < 0$ when $\theta < \theta^\star$ , then both sides of the equation $h(\theta) = 0$ can be multiplied by $-1$ ; the resulting equation has the same root $\theta^\star$ and satisfies assumption (2.2). Let $\theta^{(t)}$ represent the solution after the $t$ -th iteration. A function $U(\theta|\theta^{(t)})$ satisfying the following criteria is called a U-function of $h(\theta)$ at $\theta = \theta^{(t)}$ :

$\begin{equation} \begin{aligned} h(\theta)\quad\leq \quad&U(\theta|\theta^{(t)}),\qquad&\theta < \theta^{(t)},\\ h(\theta^{(t)})\quad = \quad&U(\theta^{(t)}|\theta^{(t)}),\qquad& \theta = \theta^{(t)},\\ h(\theta)\quad\geq \quad&U(\theta|\theta^{(t)}),\qquad&\theta > \theta^{(t)}. \end{aligned} \end{equation}$

(2.3)

According to the definition of the CD inequalities symbol, the above condition may be represented as

$\begin{equation} h(\theta)\overset{sgn(\theta-\theta^{(t)})}{\geq}U(\theta|\theta^{(t)}),\qquad \forall \theta,\theta^{(t)}\in \Theta. \end{equation}$

(2.4)

2.2. The U-equation

As described above, the US algorithm is an iterative approach for solving nonlinear equations, with each iteration including a U-step and an S-step. The purpose of the U-step is to find a U-function that satisfies the condition (2.4), whereas the S-step involves solving the simplified U-equation: $U(\theta|\theta^{(t)}) = 0$ to obtain its root $\theta^{(t+1)}$ ,

$\begin{equation} \theta^{(t+1)} = sol\left\{U(\theta|\theta^{(t)}) = 0,\quad\forall\theta,\theta^{(t)}\in\Theta \right\}. \end{equation}$

(2.5)

In typical scenarios, $\theta^{(t+1)}$ can be expressed explicitly, sometimes even as the root of a linear equation. Through the iterative execution of these two steps, $\left\{\theta^{(t)} \right\}^{\infty}_{t = 0}$ gradually converges to the real root $\theta^\star$ of Eq (2.1).

2.3. The first-derivative lower bound function method

There are numerous U-functions for a given objective function $h(\theta)$ ; as Eq (2.4) illustrates, distinct U-functions correspond to distinct US algorithms. The U-function can be expressed using the lower-order derivatives of the objective function $h(\theta)$ . This can be accomplished by a variety of techniques, such as the first-derivative lower bound (FLB) function method, the second-derivative lower-upper bound (SLUB) constants method, and the third-derivative lower bound (TLB) constant method ^[17]. These three methods provide efficient solutions when the objective function is complex and its root has no closed form, each with a distinct convergence speed. When used to maximize an objective function, the US algorithm based on the FLB approach shares the linear convergence of the EM and MM algorithms. The FLB function method, which depends only on the first-order derivative of the objective function, is the main tool used in this article to generate the required U-function. First, for the parameter space $\Theta$ , we suppose that there exists a first-derivative lower bound function $b(\theta)$ for the first derivative of $h(\theta)$ , i.e.,

$\begin{equation} \begin{aligned} h^\prime(\theta)\geq b(\theta),\quad \forall\theta \in\Theta. \end{aligned} \end{equation}$

(2.6)

The U-function of $h(\theta)$ at $\theta = \theta^{(t)}$ can be formally defined as follows

$\begin{equation} \begin{aligned} U(\theta|\theta^{(t)})\overset{\Delta}{ = }h(\theta^{(t)})+\int^\theta_{\theta^{(t)}}b(z)\,\mathrm{d}z,\quad\forall \theta,\theta^{(t)}\in\Theta. \end{aligned} \end{equation}$

(2.7)

In fact,

$\begin{equation} \begin{aligned} h(\theta)-U(\theta|\theta^{(t)})\quad& = \quad [h(\theta)-h(\theta^{(t)})]-\int^\theta_{\theta^{(t)}}b(z)\,\mathrm{d}z = \int^\theta_{\theta^{(t)}}h^\prime(z)\,\mathrm{d}z-\int^\theta_{\theta^{(t)}}b(z)\,\mathrm{d}z \\[0.5cm]& = \quad \int^\theta_{\theta^{(t)}}[h^\prime(z)-b(z)]\,\mathrm{d}z \quad\overset{sgn(\theta-\theta^{(t)})}{\geq} 0 , \quad \forall \theta,\theta^{(t)}\in\Theta. \end{aligned} \end{equation}$

Let $\theta^\star$ be the unique root of the equation $h(\theta) = 0$ , and then the corresponding US iteration is as follows:

$\begin{equation} \begin{aligned} \theta^{(t+1)} = \quad sol\left\{U(\theta|\theta^{(t)}) = h(\theta^{(t)})+\int_{\theta^{(t)}}^{\theta}b(z)\,\mathrm{d}z = 0,\forall \theta,\theta^{(t)} > 0\right\}\overset{\Delta}{ = }g(\theta^{(t)}), \end{aligned} \end{equation}$

(2.8)

where $g(\theta^{(t)}) = g(\theta^\star)+(\theta^{(t)}-\theta^\star)g^\prime(\theta^\star)+0.5(\theta^{(t)}-\theta^\star)^2g^{\prime\prime}(\hat{\theta})$ is the first-order Taylor expansion of $g$ around $\theta^\star$ with a second-order remainder term, and $\hat{\theta}$ is a point between $\theta^{(t)}$ and $\theta^\star$ .
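To make the U-step and S-step concrete, the following sketch applies the FLB construction to a toy equation. The objective $h(\theta) = 2-\theta-e^{\theta}$ on $\Theta = [0, 2]$ and the constant bound $b(\theta) = -(1+e^2)$ are assumptions chosen for illustration only; they are not taken from Li and Tian ^[17]. With a constant $b$ , the U-function in (2.7) is linear in $\theta$ and the S-step (2.8) has an explicit solution.

```python
import math

def h(theta):
    # Toy objective (an assumption): decreasing, with a single root in [0, 2].
    return 2.0 - theta - math.exp(theta)

# On Theta = [0, 2], h'(theta) = -1 - exp(theta) >= -(1 + e^2),
# so a constant first-derivative lower bound is b(theta) = B.
B = -(1.0 + math.exp(2.0))

def us_step(theta_t):
    # U(theta | theta_t) = h(theta_t) + B * (theta - theta_t); solve U = 0 for theta.
    return theta_t - h(theta_t) / B

def us_solve(theta0, tol=1e-12, max_iter=1000):
    """Iterate the S-step and return the whole sequence of iterates."""
    path = [theta0]
    for _ in range(max_iter):
        path.append(us_step(path[-1]))
        if abs(path[-1] - path[-2]) < tol:
            break
    return path

path_from_left = us_solve(0.0)    # starts below the root
path_from_right = us_solve(2.0)   # starts above the root
root = path_from_left[-1]
```

In this toy case the iterates approach the root from one side only, increasing when started at $0$ and decreasing when started at $2$ , which is the monotone behaviour described in property (a).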

Although Li and Tian ^[17] proposed the US algorithm, its practical applications have so far covered only one-parameter and two-parameter distributions. In the two-parameter case, the score equations were solved only when one parameter can be expressed explicitly in terms of the other. Whether the US algorithm can be applied effectively in the more general and complex situation, where neither parameter can be expressed explicitly in terms of the other, has not been discussed, and this is the issue addressed in this article. New FLB functions are constructed for the parameters of interest; then, starting from initial values, the iterates are updated using the corresponding S-step until the convergence criteria are met.

3. The US algorithm for the MLE of the Type Ⅰ discrete Weibull distribution

The score function of the discrete Weibull distribution has a complex double exponential form, so its root has no closed-form expression and neither of its two parameters can be expressed in terms of the other. To investigate the applicability of the US algorithm to complicated models, we combine it with the FLB method in this section to compute the maximum likelihood estimates.

3.1. The Type Ⅰ discrete Weibull distribution

Assuming a random variable following the Weibull distribution $W(\lambda, \beta)$ , where $\lambda > 0$ and $\beta > 0$ , the cumulative distribution function (CDF) of the Weibull distribution is defined as $H\left(t, \lambda, \beta\right) = 1-e^{-\lambda t^\beta}$ , where $t > 0$ . Define $\alpha = e^{-\lambda}$ , and then $0 < \alpha < 1$ . If the probability mass function (PMF) for a random variable $X$ can be represented as

$\begin{equation} P(X = x;\alpha,\beta) = \alpha^{([x]-1)^\beta}-\alpha^{[x]^\beta},(x\geq1), \end{equation}$

then we say that $X$ follows the Type Ⅰ discrete Weibull distribution, denoted as $X\sim DW(\mathit{\boldsymbol{\theta}})$ with $\mathit{\boldsymbol{\theta}} = (\alpha, \beta)^T$ . Here, $[x]$ represents the largest integer less than or equal to $x$ . When $\beta = 1$ , the discrete Weibull distribution reduces to the geometric distribution $Geo(q)$ with $q = 1-\alpha$ .

Naturally, the cumulative distribution function of $X$ takes the following form:

$\begin{equation} F(x;\alpha,\beta) = 1-\alpha^{[x]^\beta}. \end{equation}$
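The PMF and CDF above translate directly into code. The sketch below uses the hypothetical function names `dw_pmf` and `dw_cdf` (not from the paper) and checks two facts stated in this subsection: the PMF sums to the CDF, and the case $\beta = 1$ is geometric with $q = 1-\alpha$ .

```python
import math

def dw_pmf(x, alpha, beta):
    """P(X = x) for the Type I discrete Weibull distribution, x = 1, 2, ..."""
    return alpha ** ((x - 1) ** beta) - alpha ** (x ** beta)

def dw_cdf(x, alpha, beta):
    """F(x) = 1 - alpha^([x]^beta), where [x] is the integer part of x."""
    k = math.floor(x)
    return 1.0 - alpha ** (k ** beta) if k >= 1 else 0.0

alpha, beta = 0.6, 1.5
# Summing the PMF over 1..5 telescopes to F(5).
partial_sum = sum(dw_pmf(x, alpha, beta) for x in range(1, 6))

# With beta = 1 the PMF is alpha^(x-1) * (1 - alpha), i.e., geometric.
geometric_case = [dw_pmf(x, alpha, 1.0) for x in range(1, 5)]
```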

3.2. The US algorithm for the MLE of $\alpha$ and $\beta$

This section will go into detail on the application of the US algorithm for maximum likelihood estimation of the Type Ⅰ discrete Weibull distribution. Assume $X$ is a random variable with the Type Ⅰ discrete Weibull distribution $DW(\mathit{\boldsymbol{\theta}})$ , where the parameter vector $\mathit{\boldsymbol{\theta}} = (\alpha, \beta)^T$ is in the parameter space $\Theta \subset \mathbb{R}^2$ . Let $\textbf{x} = (x_1, ..., x_n)$ denote the observed values of the random sample $(X_1, ..., X_n)$ . Then, the log-likelihood function of the parameter vector $\mathit{\boldsymbol{\theta}}$ is given by

$\ell(\mathit{\boldsymbol{\theta}}|\textbf{x}) = \sum\limits_{i = 1}^{n}\log(\alpha^{(x_i-1)^\beta}-\alpha^{(x_i)^\beta}).$

First, the first-order partial derivative of $\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})$ with respect to $\alpha$ can be calculated as

$\begin{equation} \frac{\partial\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})}{\partial\alpha} = \sum\limits_{i = 1}^{n}\alpha^{-1}\left[\frac{\alpha^{(x_i-1)^\beta}(x_i-1)^\beta-\alpha^{x_i^\beta}x_i^\beta}{\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta}}\right]\overset{\Delta}{ = }h_1(\alpha). \end{equation}$

(3.1)
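As a sanity check on (3.1), the log-likelihood and the score $h_1(\alpha)$ can be coded and compared with a central finite difference. The function names, the small data set, and the parameter values below are assumptions made for this illustration.

```python
import math

def log_likelihood(alpha, beta, xs):
    """Log-likelihood of DW(alpha, beta) for observations xs (integers >= 1)."""
    return sum(math.log(alpha ** ((x - 1) ** beta) - alpha ** (x ** beta)) for x in xs)

def score_alpha(alpha, beta, xs):
    """h_1(alpha): partial derivative of the log-likelihood w.r.t. alpha, Eq (3.1)."""
    total = 0.0
    for x in xs:
        a, b = (x - 1) ** beta, x ** beta   # the two exponents of alpha
        total += (a * alpha ** a - b * alpha ** b) / (alpha ** a - alpha ** b)
    return total / alpha

# Hypothetical data and parameter values, used only to compare the two derivatives.
xs = [1, 2, 2, 3, 5, 1, 4]
alpha, beta, eps = 0.6, 1.2, 1e-6
numeric = (log_likelihood(alpha + eps, beta, xs)
           - log_likelihood(alpha - eps, beta, xs)) / (2 * eps)
analytic = score_alpha(alpha, beta, xs)
```

The two values should agree up to finite-difference error, which supports the algebra behind (3.1) for this data set.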

Next, we construct the FLB function regarding the parameter $\alpha$ :

$\begin{equation} \begin{aligned} h_1^\prime(\alpha)& = \sum\limits_{i = 1}^{n}\frac{\left[[(x_i-1)^\beta-1](x_i-1)^\beta\alpha^{(x_i-1)^\beta-2}-(x_i^\beta-1)x_i^\beta\alpha^{x_i^\beta-2}\right](\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&-\sum\limits_{i = 1}^{n}\frac{\left[\alpha^{(x_i-1)^\beta-1}(x_i-1)^\beta-\alpha^{x_i^\beta-1}x_i^\beta\right]((x_i-1)^\beta\alpha^{(x_i-1)^\beta-1}-x_i^\beta\alpha^{x_i^\beta-1})}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&\geq-\sum\limits_{i = 1}^{n}\frac{\left[\alpha^{2(x_i-1)^\beta-2}[(x_i-1)^\beta+(x_i-1)^{2\beta}]+\alpha^{x_i^\beta+(x_i-1)^\beta-2}[(x_i-1)^{2\beta}+x_i^{2\beta}]+\alpha^{2x_i^\beta-2}[x_i^{2\beta}+x_i^\beta]\right]}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&\geq-\sum\limits_{i = 1}^{n}\alpha^{2(x_i-1)^\beta-2}\frac{[2(x_i-1)^{2\beta}+2x_i^{2\beta}+(x_i-1)^\beta+x_i^\beta]}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&\geq-\sum\limits_{i = 1}^{n}\frac{4x_i^{2\beta}+2x_i^\beta}{(\alpha-\alpha^2)^2} \geq -\sum\limits_{i = 1}^{n}3[4x_i^{2\beta}+2x_i^\beta]\left[\frac{1}{\alpha^2}+\frac{1}{(1-\alpha)^2}\right]\overset{\Delta}{ = }b_1(\alpha). \end{aligned} \end{equation}$

(3.2)

Then, the US iteration of $\alpha$ can be obtained as follows:

$\begin{equation} \begin{aligned} \alpha^{(t+1)}& = sol\left[U_\alpha(\alpha,\beta|\alpha^{(t)},\beta^{(t)}) = h_1(\alpha^{(t)})+\int^\alpha_{\alpha^{(t)}}b_1(z)\,\mathrm{d}z = 0\right] \\[0.5cm]& = sol\left[C_1+3C_2\left[\frac{1}{\alpha}-\frac{1}{1-\alpha}\right]-3C_2\left[\frac{1}{\alpha^{(t)}}-\frac{1}{1-\alpha^{(t)}}\right] = 0\right] \\[0.5cm]& = sol\left[(C_3-C_1)\alpha^2-(6C_2+C_3-C_1)\alpha+3C_2 = 0\right], \end{aligned} \end{equation}$

(3.3)

where $C_1 = h_1(\alpha^{(t)})$ , $C_2 = \sum_{i = 1}^{n}[4x_i^{2\beta^{(t)}}+2x_i^{\beta^{(t)}}]$ , and $C_3 = 3C_2\left[\frac{1}{\alpha^{(t)}}-\frac{1}{1-\alpha^{(t)}}\right]$ . Similarly, we can obtain the first-order partial derivative of $\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})$ with respect to $\beta$ ,

$\begin{equation} \frac{\partial\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})}{\partial\beta} = \sum\limits_{i = 1}^{n}\log\alpha\left[\frac{\alpha^{(x_i-1)^\beta}(x_i-1)^\beta \log(x_i-1)-\alpha^{x_i^\beta}x_i^\beta \log(x_i)}{\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta}}\right]\overset{\Delta}{ = }h_2(\beta). \end{equation}$

(3.4)
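The $\alpha$ -update in Eq (3.3) reduces to picking one root of a quadratic. Writing $q(\alpha) = (C_3-C_1)\alpha^2-(6C_2+C_3-C_1)\alpha+3C_2$ , we have $q(0) = 3C_2 > 0$ and $q(1) = -3C_2 < 0$ , so exactly one root lies in $(0, 1)$ , and the discriminant simplifies to $36C_2^2+(C_3-C_1)^2$ . The sketch below uses that root in a cancellation-free form; this root selection and the function name `us_alpha_step` are our reading of (3.3), not something stated explicitly in the text, and the score value $h_1(\alpha^{(t)})$ is passed in as a number.

```python
import math

def us_alpha_step(alpha_t, beta_t, xs, h1_value):
    """One S-step for alpha following Eq (3.3).

    h1_value is the score h_1(alpha_t) evaluated at the current iterate.
    """
    c1 = h1_value
    c2 = sum(4 * x ** (2 * beta_t) + 2 * x ** beta_t for x in xs)
    c3 = 3 * c2 * (1 / alpha_t - 1 / (1 - alpha_t))
    d = c3 - c1
    # Root of d*a^2 - (6*c2 + d)*a + 3*c2 = 0 that lies in (0, 1),
    # written as 2c / (-b + sqrt(disc)) to avoid subtractive cancellation.
    return 6 * c2 / (6 * c2 + d + math.sqrt(36 * c2 ** 2 + d ** 2))

# Hypothetical data and score values, chosen only to show the direction of the update.
xs = [1, 2, 2, 3, 5, 1, 4]
step_up = us_alpha_step(0.4, 1.0, xs, 5.0)     # positive score
step_down = us_alpha_step(0.4, 1.0, xs, -5.0)  # negative score
step_fixed = us_alpha_step(0.4, 1.0, xs, 0.0)  # zero score: alpha_t is already a root
```

A positive score moves $\alpha$ upward, a negative score moves it downward, and a zero score leaves it unchanged, so the update always stays inside $(0, 1)$ .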

In order to construct the FLB function and derive the US algorithm without explicit solutions for the two parameters, we need the following two lemmas.

Lemma 1. ^[18] Given that $\theta > 0$ , we have

$-e^\theta\geq-\frac{4e^{2\max(\theta^{(t)}-1,0)}}{(2\theta^{(t)}-\theta)^2},\quad\forall\, 0\leq\theta\leq2\theta^{(t)} \quad \text{and}\quad \theta^{(t)} > 0.$

Lemma 2. ^[18] For any $\theta\geq0$ , we have

$-e^{-\theta}\geq-\frac{2}{3}\theta^{-2}.$

We will then build the FLB function with respect to $\beta$ . First, we calculate $h_2(\beta)$ 's first derivative.

$\begin{equation} \begin{aligned} h_2^\prime(\beta)& = \sum\limits_{i = 1}^{n}\log(\alpha)\frac{\left[\log^2(x_i-1)(x_i-1)^{2\beta}\alpha^{2(x_i-1)^\beta}(1-\log\alpha)+\log^2(x_i)x_i^{2\beta}\alpha^{2(x_i-1)^\beta}(1-\log(\alpha))\right]}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&+\sum\limits_{i = 1}^{n}\log(\alpha)\frac{\left[2\log(\alpha)\log(x_i)\log(x_i-1)x_i^\beta(x_i-1)^\beta-\log^2(x_i-1)(x_i-1)^\beta-\log^2(x_i)x_i^\beta\right]\alpha^{2x_i^\beta}}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&\geq\sum\limits_{i = 1}^{n}\frac{\left[2\log^2(x_i)x_i^{2\beta}\alpha^{2(x_i-1)^\beta}[1-\log(\alpha)]-2\log^2(x_i-1)(x_i-1)^\beta\alpha^{2x_i^\beta}[1-\log(\alpha)]\right]}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})} \\[0.5cm]&\geq\sum\limits_{i = 1}^{n}\frac{2\log(\alpha)(1-\log(\alpha))\log^2(x_i)x_i^{2\beta}\alpha^{2(x_i-1)^\beta}}{(\alpha^{(x_i-1)^\beta}-\alpha^{x_i^\beta})^2} \\[0.5cm]&\geq-\frac{2\log(\alpha)(1-\log(\alpha))}{[1-\alpha]^2}\left[-\sum\limits_{i = 1}^{n}\log^2(x_i)x_i^{2\beta}\right]. \end{aligned} \end{equation}$

It can be deduced from Lemma 1 and Lemma 2 that

$\begin{equation} \begin{aligned} &-\sum\limits_{i = 1}^{n}\log^2(x_i)x_i^{2\beta} = -\sum\limits_{i = 1}^{n}e^{2\beta\log(x_i)}\log^2(x_i) \\[0.5cm]\geq&-\sum\limits_{i = 1}^{n}\left[\frac{4e^{2\max(2\beta^{(t)}\log(x_i)-1,0)}I(\log(x_i) > 0)}{\left[4\beta^{(t)}\log(x_i)-2\beta\log(x_i)\right]^2}+\frac{2I(\log(x_i)\leq0)}{3[-2\beta\log(x_i)]^2}\right]\log^2(x_i) \\[0.5cm] = &-\sum\limits_{i = 1}^{n}\left[\frac{4e^{2\max(2\beta^{(t)}\log(x_i)-1,0)}I(x_i > 1)}{(4\beta^{(t)}-2\beta)^2}+\frac{I(x_i = 1)}{6\beta^2}\right], \end{aligned} \end{equation}$

where $I(\cdot)$ is the indicator function. Then, the corresponding FLB function is obtained as follows:

$\begin{equation} \begin{aligned} \frac{2\log(\alpha)(1-\log(\alpha))}{[1-\alpha]^2}\sum\limits_{i = 1}^{n}\left[\frac{e^{2\max(2\beta^{(t)}\log(x_i)-1,0)}I(x_i > 1)}{(2\beta^{(t)}-\beta)^2}+\frac{I(x_i = 1)}{6\beta^2}\right]\overset{\Delta}{ = }b_2(\beta). \end{aligned} \end{equation}$

(3.5)

Therefore, the US iterative process for parameter $\beta$ can be given by

$\begin{equation} \begin{aligned} \beta^{(t+1)}& = sol\left[U_\beta(\alpha,\beta|\alpha^{(t+1)},\beta^{(t)}) = h_2(\beta^{(t)})+\int^\beta_{\beta^{(t)}}b_2(z)\,\mathrm{d}z = 0\right] \\[0.5cm]& = sol\left[h_2(\beta^{(t)})+a_1\left[\frac{a_2}{2\beta^{(t)}-\beta}-\frac{I(x_i = 1)}{6\beta}\right]-a_1\left[\frac{a_2}{\beta^{(t)}}-\frac{I(x_i = 1)}{6\beta^{(t)}}\right] = 0\right] \\[0.5cm]& = sol\left[6a_3\beta^2-a_4\beta+a_5 = 0\right], \end{aligned} \end{equation}$

(3.6)

where

$\begin{equation} \begin{aligned} &a_1 = \frac{2\log(\alpha^{(t+1)})(1-\log(\alpha^{(t+1)}))}{(1-\alpha^{(t+1)})^2}, \\[0.3cm]&a_2 = e^{2\max(2\beta^{(t)}\log(x_i)-1,0)}I(x_i > 1), \\[0.3cm]&a_3 = h_2(\beta^{(t)})-a_1\left[\frac{a_2}{\beta^{(t)}}-\frac{I(x_i = 1)}{6\beta^{(t)}}\right], \\[0.3cm]&a_4 = 12a_3\beta^{(t)}+6a_1a_2+a_1I(x_i = 1), \\[0.3cm]&a_5 = 2a_1I(x_i = 1)\beta^{(t)}. \end{aligned} \end{equation}$

The procedure for estimating the two parameters can be described as follows. In the first stage, we determine the FLB functions for the parameters $\alpha$ and $\beta$ using Eqs (3.2) and (3.5), respectively. Subsequently, we set two initial values $\alpha^{(0)}$ and $\beta^{(0)}$ . At iteration $t$ , we evaluate Eq (3.3) to get $\alpha^{(t+1)}$ , and then compute $\beta^{(t+1)}$ via Eq (3.6) using $\alpha^{(t+1)}$ and $\beta^{(t)}$ . If both parameter estimates satisfy the convergence criteria, their values are returned. Otherwise, we continue updating the iterates and repeat the preceding steps until the two estimated parameters converge.

Algorithm: Calculating the MLEs of $\alpha$ and $\beta$ via the US algorithm.
Input: The initial values $\alpha^{(0)}$ and $\beta^{(0)}$ ; the observed data $X_{obs} = \left\{x_i \right\}^{n}_{i = 1}$ ;
Output: $\hat{\alpha}, \hat{\beta}.$
1 Select the FLB functions for the parameters $\alpha$ and $\beta$ , respectively;
2 Set the initial values $\alpha^{(t)}, \beta^{(t)}$ with $t = 0$ ;
3 repeat
4 Using $\alpha^{(t)}$ and $\beta^{(t)}$ , calculate $\alpha^{(t+1)}$ based on (3.3);
5 Using $\alpha^{(t+1)}$ and $\beta^{(t)}$ , calculate $\beta^{(t+1)}$ based on (3.6), and update $t = t + 1$ ;
6 until convergence.
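The control flow of the algorithm above can be sketched as a generic alternating loop. The two update rules are passed in as functions, so the sketch does not commit to a particular implementation of (3.3) or (3.6). The stopping rule (absolute change below `tol` in both parameters) is an assumption, since the text only says "until convergence", and the toy update maps in the demonstration are hypothetical stand-ins with a known fixed point.

```python
def us_alternate(update_alpha, update_beta, alpha0, beta0, tol=1e-8, max_iter=10000):
    """Alternate the two S-steps until both parameters stop changing.

    update_alpha(alpha_t, beta_t)  -> alpha_{t+1}   (role of Eq (3.3))
    update_beta(alpha_{t+1}, beta_t) -> beta_{t+1}  (role of Eq (3.6))
    """
    alpha, beta = alpha0, beta0
    for t in range(max_iter):
        alpha_new = update_alpha(alpha, beta)
        beta_new = update_beta(alpha_new, beta)   # uses the freshly updated alpha
        converged = abs(alpha_new - alpha) < tol and abs(beta_new - beta) < tol
        alpha, beta = alpha_new, beta_new
        if converged:
            return alpha, beta, t + 1
    raise RuntimeError("no convergence within max_iter iterations")

# Toy contraction maps (assumptions) with fixed point (alpha, beta) = (0.3, 0.6).
alpha_hat, beta_hat, n_iter = us_alternate(
    lambda a, b: 0.5 * (a + 0.3),
    lambda a, b: 0.5 * (b + 2.0 * a),
    alpha0=0.9, beta0=2.0,
)
```

Raising an error when `max_iter` is reached, rather than returning the last iterate, makes non-convergent runs easy to count when a convergence percentage is reported.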

4. Simulation study

In this section, we conduct simulation studies to confirm the applicability of the US algorithm to complex nonlinear equations and compare its performance to that of the classic Newton algorithm. First, we provide the calculation steps for parameter estimation using the Newton algorithm as follows:

$\begin{equation} \begin{aligned} &\textbf{Step 1 : }\alpha^{(t+1)} = \alpha^{(t)}-\frac{\partial\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})}{\partial\alpha}(\alpha^{(t)},\beta^{(t)})/\frac{\partial^2\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})}{\partial\alpha^2}(\alpha^{(t)},\beta^{(t)}), \\[0.3cm]&\textbf{Step 2 : }\beta^{(t+1)} = \beta^{(t)}-\frac{\partial\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})}{\partial\beta}(\alpha^{(t+1)},\beta^{(t)})/\frac{\partial^2\ell(\mathit{\boldsymbol{\theta}}|\textbf{x})}{\partial\beta^2}(\alpha^{(t+1)},\beta^{(t)}). \end{aligned} \end{equation}$

The sample sizes are set as $n = 50, 100, 200$ , and the parameters are set as $\alpha = (0.2, 0.4, 0.6, 0.8)$ and $\beta = (0.5, 1.0, 1.5, 2.0)$ , respectively. We independently generated $X_1^{(k)}, \dots, X_n^{(k)} \overset{iid}{\sim} DW(\alpha, \beta)$ , where $k = 1, \dots, K$ ( $K$ = 1000). The MLEs of the parameters under the US algorithm were computed via the iterations in Eqs (3.3) and (3.6). For every combination of parameters, we ran 1000 replications of the experiment and evaluated the two methods' performance using the convergence percentage and the mean squared error (MSE) of the parameter estimates.
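The simulation design can be sketched as follows. Samples are drawn by inverting the CDF $F(x) = 1-\alpha^{x^\beta}$ , and the MSE is averaged over replications. To keep the example self-contained, the estimator plugged in here is a stand-in and not the US or Newton estimator of the paper: it assumes $\beta = 1$ is known, in which case the distribution is geometric and the MLE of $\alpha$ has the closed form $1-n/\sum_i x_i$ . All function names and the seed are likewise assumptions.

```python
import math
import random

def dw_sample(n, alpha, beta, rng):
    """Draw n values from DW(alpha, beta) by inverting the CDF."""
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()   # uniform on (0, 1], avoids log(0)
        # Smallest integer x >= 1 with alpha^(x^beta) <= u.
        x = math.ceil((math.log(u) / math.log(alpha)) ** (1.0 / beta))
        out.append(max(x, 1))
    return out

def mse(estimates, truth):
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)

def simulate_mse(estimator, n, alpha, beta, replications, seed=0):
    """MSE of an estimator of alpha over independent simulated samples."""
    rng = random.Random(seed)
    estimates = [estimator(dw_sample(n, alpha, beta, rng)) for _ in range(replications)]
    return mse(estimates, alpha)

def geometric_alpha_mle(xs):
    # Stand-in estimator (assumes beta = 1 is known): MLE of alpha is 1 - n / sum(x).
    return 1.0 - len(xs) / sum(xs)

mse_50 = simulate_mse(geometric_alpha_mle, 50, 0.6, 1.0, 1000)
mse_200 = simulate_mse(geometric_alpha_mle, 200, 0.6, 1.0, 1000)
```

Replacing `geometric_alpha_mle` with a two-parameter estimator, and counting the runs that fail to converge, would give the two quantities reported in the tables.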

The tables below display the outcomes of the two algorithms' simulations for each scenario. According to the results in the tables, the MSE of the parameters under both algorithms decreases as the sample size increases, suggesting that both techniques are asymptotically unbiased. For a fixed value of $\beta$ , the MSE of $\beta$ steadily decreases as the value of $\alpha$ increases. Overall, the MSE of both parameters under the US algorithm is smaller than that under the Newton algorithm, suggesting that the US algorithm yields more accurate estimates. Furthermore, the convergence percentages in the tables show that the US algorithm is more stable.

Table 1. The MSE and convergence percentage from simulated data for $\beta = 0.5$ .

Sample size	50
Parameters	$(\alpha, \beta)=(0.2, 0.5)$		$(\alpha, \beta)=(0.4, 0.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00312	0.00234	0.00429	0.00345
MSE ( $\hat{\beta}$ )	0.02222	0.02031	0.00931	0.00631
Parameters	$(\alpha, \beta)=(0.6, 0.5)$		$(\alpha, \beta)=(0.8, 0.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00422	0.00217	0.00233	0.00117
MSE ( $\hat{\beta}$ )	0.00535	0.00115	0.00547	0.00062
Sample size	100
Parameters	$(\alpha, \beta)=(0.2, 0.5)$		$(\alpha, \beta)=(0.4, 0.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00149	0.00122	0.00253	0.00208
MSE ( $\hat{\beta}$ )	0.01294	0.01072	0.00451	0.00307
Parameters	$(\alpha, \beta)=(0.6, 0.5)$		$(\alpha, \beta)=(0.8, 0.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00193	0.00178	0.00106	0.00081
MSE ( $\hat{\beta}$ )	0.00215	0.00079	0.00207	0.00047
Sample size	200
Parameters	$(\alpha, \beta)=(0.2, 0.5)$		$(\alpha, \beta)=(0.4, 0.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00082	0.00069	0.00128	0.00117
MSE ( $\hat{\beta}$ )	0.00647	0.00544	0.00244	0.00183
Parameters	$(\alpha, \beta)=(0.6, 0.5)$		$(\alpha, \beta)=(0.8, 0.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00106	0.00073	0.00064	0.00048
MSE ( $\hat{\beta}$ )	0.00112	0.00057	0.00102	0.00045

Table 2. The MSE and convergence percentage from simulated data for $\beta = 1.0$.

Sample size	50
Parameters	$(\alpha, \beta)=(0.2, 1.0)$		$(\alpha, \beta)=(0.4, 1.0)$
Algorithms	Newton	US	Newton	US
Percentage	87%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00273	0.00214	0.00542	0.00259
MSE ( $\hat{\beta}$ )	0.06586	0.05798	0.05064	0.02234
Parameters	$(\alpha, \beta)=(0.6, 1.0)$		$(\alpha, \beta)=(0.8, 1.0)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00492	0.00297	0.00269	0.00090
MSE ( $\hat{\beta}$ )	0.02579	0.01286	0.01075	0.00106
Sample size	100
Parameters	$(\alpha, \beta)=(0.2, 1.0)$		$(\alpha, \beta)=(0.4, 1.0)$
Algorithms	Newton	US	Newton	US
Percentage	97%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00160	0.00114	0.00260	0.00150
MSE ( $\hat{\beta}$ )	0.04316	0.02192	0.02013	0.01119
Parameters	$(\alpha, \beta)=(0.6, 1.0)$		$(\alpha, \beta)=(0.8, 1.0)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00196	0.00123	0.00132	0.00065
MSE ( $\hat{\beta}$ )	0.01236	0.00674	0.00922	0.00091
Sample size	200
Parameters	$(\alpha, \beta)=(0.2, 1.0)$		$(\alpha, \beta)=(0.4, 1.0)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00078	0.00048	0.00114	0.00074
MSE ( $\hat{\beta}$ )	0.01835	0.00674	0.00866	0.00607
Parameters	$(\alpha, \beta)=(0.6, 1.0)$		$(\alpha, \beta)=(0.8, 1.0)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00109	0.00065	0.00051	0.00030
MSE ( $\hat{\beta}$ )	0.00541	0.00320	0.00319	0.00088

Table 3. The MSE and convergence percentage from simulated data for $\beta = 1.5$.

Sample size	50
Parameters	$(\alpha, \beta)=(0.2, 1.5)$		$(\alpha, \beta)=(0.4, 1.5)$
Algorithms	Newton	US	Newton	US
Percentage	39%	100%	98%	100%
MSE ( $\hat{\alpha}$ )	0.00529	0.00229	0.00520	0.00276
MSE ( $\hat{\beta}$ )	0.10368	0.10098	0.08433	0.05747
Parameters	$(\alpha, \beta)=(0.6, 1.5)$		$(\alpha, \beta)=(0.8, 1.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00375	0.00309	0.00334	0.00187
MSE ( $\hat{\beta}$ )	0.05022	0.03443	0.04337	0.00426
Sample size	100
Parameters	$(\alpha, \beta)=(0.2, 1.5)$		$(\alpha, \beta)=(0.4, 1.5)$
Algorithms	Newton	US	Newton	US
Percentage	59%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00141	0.00108	0.00183	0.00152
MSE ( $\hat{\beta}$ )	0.04498	0.04198	0.05027	0.02460
Parameters	$(\alpha, \beta)=(0.6, 1.5)$		$(\alpha, \beta)=(0.8, 1.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00193	0.00115	0.00123	0.00071
MSE ( $\hat{\beta}$ )	0.02331	0.00817	0.01853	0.00367
Sample size	200
Parameters	$(\alpha, \beta)=(0.2, 1.5)$		$(\alpha, \beta)=(0.4, 1.5)$
Algorithms	Newton	US	Newton	US
Percentage	85%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00074	0.00070	0.00121	0.00099
MSE ( $\hat{\beta}$ )	0.03839	0.02496	0.01860	0.00719
Parameters	$(\alpha, \beta)=(0.6, 1.5)$		$(\alpha, \beta)=(0.8, 1.5)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00067	0.00049	0.00065	0.00048
MSE ( $\hat{\beta}$ )	0.00921	0.00393	0.00960	0.00324

Table 4. The MSE and convergence percentage from simulated data for $\beta = 2.0$.

Sample size	50
Parameters	$(\alpha, \beta)=(0.2, 2.0)$		$(\alpha, \beta)=(0.4, 2.0)$
Algorithms	Newton	US	Newton	US
Percentage	5%	100%	73%	100%
MSE ( $\hat{\alpha}$ )	0.00589	0.00235	0.00445	0.00286
MSE ( $\hat{\beta}$ )	0.50894	0.15827	0.06508	0.05573
Parameters	$(\alpha, \beta)=(0.6, 2.0)$		$(\alpha, \beta)=(0.8, 2.0)$
Algorithms	Newton	US	Newton	US
Percentage	99%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00478	0.00198	0.00205	0.00153
MSE ( $\hat{\beta}$ )	0.11310	0.01770	0.06167	0.01236
Sample size	100
Parameters	$(\alpha, \beta)=(0.2, 2.0)$		$(\alpha, \beta)=(0.4, 2.0)$
Algorithms	Newton	US	Newton	US
Percentage	7%	100%	89%	100%
MSE ( $\hat{\alpha}$ )	0.00258	0.00155	0.00213	0.00153
MSE ( $\hat{\beta}$ )	0.02659	0.07759	0.05018	0.03908
Parameters	$(\alpha, \beta)=(0.6, 2.0)$		$(\alpha, \beta)=(0.8, 2.0)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00201	0.00092	0.00121	0.00084
MSE ( $\hat{\beta}$ )	0.04086	0.00654	0.03288	0.00463
Sample size	200
Parameters	$(\alpha, \beta)=(0.2, 2.0)$		$(\alpha, \beta)=(0.4, 2.0)$
Algorithms	Newton	US	Newton	US
Percentage	27%	100%	99%	100%
MSE ( $\hat{\alpha}$ )	0.00136	0.00077	0.00129	0.00102
MSE ( $\hat{\beta}$ )	0.10796	0.05642	0.04357	0.02541
Parameters	$(\alpha, \beta)=(0.6, 2.0)$		$(\alpha, \beta)=(0.8, 2.0)$
Algorithms	Newton	US	Newton	US
Percentage	100%	100%	100%	100%
MSE ( $\hat{\alpha}$ )	0.00116	0.00054	0.00066	0.00041
MSE ( $\hat{\beta}$ )	0.02345	0.00357	0.01886	0.00200

Figure 1 shows how the estimated values of $\alpha$ and $\beta$ under the US algorithm and the Newton method evolve as the number of iterations grows. In terms of stability, the Newton method is markedly unstable and often oscillates several times before settling on the correct trend, while the US algorithm approaches the true parameter values monotonically. In terms of convergence speed, the FLB method converges linearly, while the Newton algorithm converges quadratically. Consequently, the FLB method has a comparatively slower convergence rate.
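
For reference, the two convergence orders mentioned above have the standard definitions below, stated for an error sequence $e_t = |\theta^{(t)} - \hat{\theta}|$ (the symbols $e_t$, $c$, and $C$ are introduced only for this note and do not appear elsewhere in the paper):

$\begin{equation} \begin{aligned} &\text{linear: } e_{t+1} \le c\, e_t, \quad 0 < c < 1, \\[0.3cm]&\text{quadratic: } e_{t+1} \le C\, e_t^{2}, \quad C > 0. \end{aligned} \end{equation}$

Quadratic convergence is a local property: it is guaranteed only once the iterate is close enough to the solution. This is compatible with the Newton method being faster near the optimum yet unstable from poor starting points.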

Figure 1. Simulation results of both algorithms.

5. Applications

This section analyzes two real data sets to illustrate the applicability of the US algorithm. The first data set in Table 5 contains the remission times, in weeks, of 20 leukemia patients under treatment, studied by Hassan et al. ^[19]. The second data set is from the National Highway Traffic Safety Administration (www-fars.nhtsa.dot.gov) of the United States and reports the number of fatalities due to motor vehicle accidents among children under the age of 5 in 32 states during the year 2022.

Table 5. The leukaemia patients data and the vehicle fatalities data.

Data 1	1 3 3 6 7 7 10 12 14 15 18 19 22 26 28 29 34 40 48 49
Data 2	15 1 6 3 23 6 1 25 14 4 13 15 7 14 2 2 8 12 3 6 12 13 2 8 2 10 5 15 47 7 3 4

We model the two data sets with the Type Ⅰ discrete Weibull (DW) distribution and the geometric (Geo) distribution. The fitting results for the leukemia patients data are provided in Table 6. The p-values of the Cramér-von Mises (CVM), Anderson-Darling (AD), and Kolmogorov-Smirnov (KS) tests all exceed the 0.05 significance level, so neither distribution is rejected for this data set. The DW distribution fitted by the US algorithm attains the largest p-values in all three tests, indicating the best fit. It also has the smallest Akaike information criterion (AIC) ^[20]. Its Bayesian information criterion (BIC) ^[21] is smaller than that of the DW distribution fitted by the Newton algorithm, although the one-parameter geometric distribution attains the smallest BIC.

Table 6. Fitting for the leukaemia patients data by the DW distribution and the geometric distribution.

Methods	Estimates	AIC	BIC	p-value(KS)	p-value(AD)	p-value(CVM)
DW distribution (New)	$\alpha$ =0.9282 $\beta$ =0.9477	163.1752	165.1667	0.4207	0.1952	0.1533
DW distribution (US)	$\alpha$ =0.9513 $\beta$ =1.0000	161.9296	163.9211	0.9722	0.8817	0.8463
Geo distribution	q=0.0512	161.9783	162.9741	0.7035	0.4456	0.3946
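
The AIC and BIC columns follow the usual definitions, $\text{AIC} = 2k - 2\ell$ and $\text{BIC} = k \ln n - 2\ell$, with $k$ parameters, $n$ observations, and maximized log-likelihood $\ell$. As a consistency check, the sketch below backs out $\ell$ from each reported AIC and recomputes the BIC for $n = 20$. The function names are hypothetical, and the log-likelihood values are inferred from the table rather than taken from the paper.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2*loglik."""
    return 2 * k - 2 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k*ln(n) - 2*loglik."""
    return k * math.log(n) - 2 * loglik


def loglik_from_aic(aic_value: float, k: int) -> float:
    """Invert the AIC definition to recover the maximized log-likelihood."""
    return (2 * k - aic_value) / 2


n = 20  # number of remission times in Data 1
# (label, number of parameters, reported AIC, reported BIC) from Table 6
rows = [
    ("DW (Newton)", 2, 163.1752, 165.1667),
    ("DW (US)", 2, 161.9296, 163.9211),
    ("Geo", 1, 161.9783, 162.9741),
]
recomputed = {}
for label, k, aic_rep, bic_rep in rows:
    ll = loglik_from_aic(aic_rep, k)
    recomputed[label] = bic(ll, k, n)
    print(f"{label}: loglik = {ll:.4f}, BIC = {recomputed[label]:.4f} (reported {bic_rep})")
```

The recomputed values agree with the reported BIC to within rounding, which is consistent with the DW fits being penalized for two parameters and the geometric fit for one.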

Table 7 shows the results of fitting the vehicle fatalities data with the Type Ⅰ discrete Weibull distribution and the geometric distribution. At the significance level 0.05, none of the three fits is rejected, but the p-values for the Geo distribution (all below 0.1) and for the DW distribution estimated by the Newton algorithm are markedly smaller than those obtained with the US algorithm, indicating a weaker fit. The DW distribution estimated by the US algorithm also attains the smallest AIC and BIC. Histograms of the two data sets with the fitted DW and Geo distributions are shown in Figure 2, and Figures 3 and 4 present the QQ plots for the two data sets. These plots also indicate that the US algorithm performs better.

Table 7. Fitting for the vehicle fatalities data by the DW distribution and the geometric distribution.

Methods	Estimates	AIC	BIC	p-value(KS)	p-value(AD)	p-value(CVM)
DW distribution (New)	$\alpha$ =0.9122 $\beta$ =1.1756	212.8066	215.2212	0.1822	0.1324	0.1755
DW distribution (US)	$\alpha$ =0.8852 $\beta$ =0.9672	209.7482	212.6796	0.4707	0.3948	0.4147
Geo distribution	q=0.1039	214.4938	215.9596	0.0957	0.0857	0.0941

Figure 2. Histograms of the leukaemia patients data (left panel) and the vehicle fatalities data (right panel), with the corresponding density curves fitted by the DW and Geo distributions.

Figure 3. QQ plots of the first data set for the DW (US) (left panel), DW (Newton) (middle panel), and Geo (right panel) distributions.

Figure 4. QQ plots of the second data set for the DW (US) (left panel), DW (Newton) (middle panel), and Geo (right panel) distributions.

6. Conclusions

The US algorithm is a novel iterative method with high stability and reliable convergence. Existing research has applied it only to simple models such as univariate nonlinear equations or univariate functions. This paper extends the US algorithm to the more complex case of two-parameter discrete distributions, in which the estimate of one parameter cannot be expressed explicitly in terms of the other. To estimate the parameters, the paper combines the US algorithm with the FLB method to carry out the optimization. The simulation results for the Type Ⅰ discrete Weibull distribution demonstrate that the US algorithm has good accuracy and stability. To demonstrate the applicability of the algorithm in practice, the paper also analyzed two real data sets modeled by the Type Ⅰ discrete Weibull distribution, namely the remission times of leukemia patients and the number of child fatalities in motor vehicle accidents. A comparison with the conventional Newton algorithm shows that the proposed method fits both data sets well.

Author contributions

Yuanhang Ouyang: Formal analysis, Writing-original draft, Software, Investigation, Methodology, Data curation; Ruyun Yan: Validation, Software, Formal analysis; Jianhua Shi: Validation, Resources, Writing-review and editing, Methodology. All authors have read and approved the final version of the manuscript for publication.

Use of AI tools declaration

The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

Acknowledgments

This research was supported by the National Social Science Fund of China (20XTJ003).

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1]	E. Ertay, H. Huang, Z. Sarsenbayeva, T. Dingler, Challenges of emotion detection using facial expressions and emotion visualisation in remote communication, in Proceedings of the 2021 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Academic Press, (2021), 230–236. https://doi.org/10.1145/3460418.3479341
[2]	B. Sun, S. Cao, D. Li, J. He, Dynamic micro-expression recognition using knowledge distillation, IEEE Trans. Affect. Comput., (2020), In press. https://doi.org/10.1109/TAFFC.2020.2986962
[3]	J. Gou, B. Yu, S. J. Maybank, D. Tao, Knowledge distillation: A survey, Int. J. Comput. Vis., 129 (2021), 1789–1819. https://doi.org/10.1007/s11263-021-01453-z
[4]	I. Ofodile, K. Kulkarni, C. A. Corneanu, S. Escalera, X. Baro, S. Hyniewska, et al., Automatic recognition of deceptive facial expressions of emotion, Comput. Sci., 2017. https://arXiv.org/abs/1707.04061.
[5]	S. Shojaeilangari, W. Y. Yau, E. K. Teoh, Pose-invariant descriptor for facial emotion recognition, Mach. Vis. Appl., 27 (2016), 1063–1070. https://doi.org/10.1007/s00138-016-0794-2
[6]	J. Wan, S. Escalera, G. Anbarjafari, H. J. Escalante, X. Baró, I. Guyon, et al., Results and analysis of chalearn lap multi-modal isolated and continuous gesture recognition, and real versus fake expressed emotions challenges, in IEEE International Conference on Computer Vision Workshop, (2017), 3189–3197. https://doi.org/10.1109/ICCVW.2017.377
[7]	E. Avots, T. Sapiński, M. Bachmann, D. Kamińska, Audiovisual emotion recognition in wild, Mach. Vis. Appl., 30 (2019), 975–985. https://doi.org/10.1007/s00138-018-0960-9
[8]	A. Kleinsmith, N. Bianchi-Berthouze, Affective body expression perception and recognition: A survey, IEEE Trans. Affect. Comput., 4 (2012), 15–33. https://doi.org/10.1109/T-AFFC.2012.16
[9]	C. T. Lu, C. W. Su, H. L. Jiang, Y. Y. Lu, An interactive greeting system using convolutional neural networks for emotion recognition, Entertain. Comput., 40 (2022), 100452. https://doi.org/10.1016/j.entcom.2021.100452
[10]	F. Noroozi, D. Kaminska, C. Corneanu, T. Sapinski, S. Escalera, G. Anbarjafari, Survey on emotional body gesture recognition, IEEE Trans. Affect. Comput., 12 (2018), 505–523. https://doi.org/10.1109/TAFFC.2018.2874986
[11]	P. Pławiak, T. Sośnicki, M. Niedźwiecki, Z. Tabor, K. Rzecki, Hand body language gesture recognition based on signals from specialized glove and machine learning algorithms, IEEE Trans. Industr. Inform., 12 (2016), 1104–1113. https://doi.org/10.1109/TII.2016.2550528
[12]	T. Sapiński, D. Kamińska, A. Pelikant, C. Ozcinar, E. Avots, G. Anbarjafari, Multimodal database of emotional speech, video and gestures, in Pattern Recognition and Information Forensics, ICPR 2018 Lecture Notes in Computer Science, 11188 (2019). https://doi.org/10.1007/978-3-030-05792-3_15
[13]	R. Jenke, A. Peer, M. Buss, Feature extraction and selection for emotion recognition from eeg, IEEE Trans. Affect. Comput., 5 (2014), 327–339. https://doi.org/10.1109/TAFFC.2014.2339834
[14]	S. Kwon, Mlt-dnet: Speech emotion recognition using 1d dilated cnn based on multi-learning trick approach, Expert Syst. Appl., 167 (2021), 114177. https://doi.org/10.1016/j.eswa.2020.114177
[15]	D. Issa, M. F. Demirci, A. Yazici, Speech emotion recognition with deep convolutional neural networks, Biomed. Signal Process. Control, 59 (2020), 101894. https://doi.org/10.1016/j.bspc.2020.101894
[16]	S. R. Livingstone, F. A. Russo, The ryerson audio-visual database of emotional speech and song (ravdess): A dynamic, multimodal set of facial and vocal expressions in north american english, Plos One, 13 (2018), 1–35. https://doi.org/10.1371/journal.pone.0196391
[17]	R. Gonzalez-Diaz, E. Paluzo-Hidalgo, J. F. Quesada, Towards emotion recognition: A persistent entropy application, in Proceedings of the International Conference on Computational Topology in Image Context, Academic Press, (2019), 96–109. https://doi.org/10.1007/978-3-030-10828-1_8
[18]	B. Zhang, G. Essl, E. M. Provost, Recognizing emotion from singing and speaking using shared models, in Proceedings of the IEEE International Conference on affective computing and intelligent interaction, Academic Press, (2015), 139–145. https://doi.org/10.1109/ACII.2015.7344563
[19]	H. Elhamdadi, S. Canavan, P. Rosen, Affective TDA: Using topological data analysis to improve analysis and explainability in affective computing, IEEE Trans. Vis. Comput. Graph., 28 (2021), 769–779. https://doi.org/10.1109/TVCG.2021.3114784
[20]	H. Edelsbrunner, J. Harer, Computational topology: an introduction, Am. Math. Soc., Academic Press, (2010). https://doi.org/10.1090/mbk/069
[21]	X. Guo, L. F. Polanía, K. E. Barner, Audio-video emotion recognition in the wild using deep hybrid networks, 2020. https://arXiv.org/abs/2002.09023.
[22]	J. Kossaifi, G. Tzimiropoulos, S. Todorovic, M. Pantic, Afew-va database for valence and arousal estimation in-the-wild, Image Vis. Comput., 65 (2017), 23–36. https://doi.org/10.1016/j.imavis.2017.02.001
[23]	H. Chintakunta, T. Gentimis, R. Gonzalez-Diaz, M. J. Jimenez, H. Krim, An entropy-based persistence barcode, Pattern Recognit., 48 (2015), 391–401. https://doi.org/10.1016/j.patcog.2014.06.023
[24]	N. Atienza, R. Gonzalez-Diaz, M. Soriano-Trigueros, On the stability of persistent entropy and new summary functions for topological data analysis, Pattern Recognit., 107 (2020), 107509. https://doi.org/10.1016/j.patcog.2020.107509
[25]	M. Rucco, R. Gonzalez-Diaz, M. J. Jimenez, N. Atienza, C. Cristalli, E. Concettoni, et al., A new topological entropy-based approach for measuring similarities among piecewise linear functions, Signal Process., 134 (2017), 130–138. https://doi.org/10.1016/j.sigpro.2016.12.006
[26]	A. Myers, E. Munch, F. A. Khasawneh, Persistent homology of complex networks for dynamic state detection, Phys. Rev. E, 100 (2019), 022314. https://doi.org/10.1103/PhysRevE.100.022314
[27]	X. Wang, F. Sohel, M. Bennamoun, Y. Guo, H. Lei, Scale space clustering evolution for salient region detection on 3d deformable shapes, Pattern Recognit., 71 (2017), 414–427. https://doi.org/10.1016/j.patcog.2017.05.018
[28]	Y. M. Chung, C. S. Hu, Y. L. Lo, H. T. Wu, A persistent homology approach to heart rate variability analysis with an application to sleep-wake classification, Front. Phys., 12 (2021), 202. https://doi.org/10.3389/fphys.2021.637684
[29]	M. Rucco, G. Viticchi, L. Falsetti, Towards personalized diagnosis of glioblastoma in fluid-attenuated inversion recovery (flair) by topological interpretable machine learning, Electr. Eng. Syst. Sci., 8 (2020), 770. https://doi.org/10.3390/math8050770
[30]	J. Lamar-Leon, R. Alonso-Baryolo, E. Garcia-Reyes, R. Gonzalez-Diaz, Persistent homology-based gait recognition robust to upper body variations, in Proceedings of the 23rd International Conference on Pattern Recognition, Academic Press, (2016), 1083–1088. https://doi.org/10.1109/ICPR.2016.7899780
[31]	J. Lamar-Leon, R. Alonso-Baryolo, E. Garcia-Reyes, R. Gonzalez-Diaz, Topological features for monitoring human activities at distance, in Proceedings of the 2nd International Workshop on Activity Monitoring by Multiple Distributed Sensing, 8703 (2014), 40–51. https://doi.org/10.1007/978-3-319-13323-2
[32]	J. Lamar-Leon, A. Cerri, E. Garcia-Reyes, R. Gonzalez-Diaz, Gait-based gender classification using persistent homology, in Proceedings of the 18th Iberoamerican Congress on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Apps, 8259 (2013), 366–373. https://doi.org/10.1007/978-3-642-41827-3_46
[33]	C. D. Toth, J. O'Rourke, J. E. Goodman, Handbook of discrete and computational geometry, CRC press, Academic Press, (2017). https://doi.org/10.1201/9781315119601
[34]	A. Zomorodian, G. Carlsson, Computing persistent homology, Discrete Comput. Geom., 33 (2005), 249–274. https://doi.org/10.1007/s00454-004-1146-y
[35]	S. S. Haykin, Neural networks and learning machines, Pearson Education, Upper Saddle River, NJ, Academic Press, 2009.
[36]	D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, (2017). arXiv https://arXiv.org/abs/1412.6980
[37]	N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, J. Mach. Learn. Res., 15 (2014), 1929–1958. http://jmlr.org/papers/v15/srivastava14a.html
[38]	R. Gonzalez-Diaz, P. Real, On the cohomology of 3d digital images, Discret. Appl. Math., 147 (2005), 245–263. https://doi.org/10.1016/j.dam.2004.09.014
[39]	E. Diener, R. J. Larsen, S. Levine, R. A. Emmons, Intensity and frequency: dimensions underlying positive and negative affect, J. Pers. Soc. Psychol., 48 (1985), 1253. https://doi.org/10.1037//0022-3514.48.5.1253
[40]	H. Schlosberg, Three dimensions of emotion, Psychol. Rev., 61 (1954), 81. https://doi.org/10.1037/h0054570
[41]	D. Kamińska, T. Sapiński, A. Pelikant, Recognition of emotion intensity basing on neutral speech model, in Man-Machine Interactions 3, Springer, 242 (2014), 451–458. https://doi.org/10.1007/978-3-319-02309-0_49
[42]	S. W. Byun, S. P. Lee, Human emotion recognition based on the weighted integration method using image sequences and acoustic features, Multimed. Tools. Appl., 80 (2020), 35871–35885. https://doi.org/10.1007/s11042-020-09842-1
[43]	M. F. H. Siddiqui, A. Y. Javaid, A multimodal facial emotion recognition framework through the fusion of speech with visible and infrared images, Multimodal Technol. Int., 4 (2020), 46. https://doi.org/10.3390/mti4030046
[44]	C. Luna-Jimenez, D. Griol, Z. Callejas, R. Kleinlein, J. Montero, F. Fernandez-Martinez, Multimodal Emotion Recognition on RAVDESS Dataset Using Transfer Learning, Sensors, 21 (2021), 7665. https://doi.org/10.3390/s21227665
[45]	E. Ghaleb, J. Niehues, S. Asteriadis, Multimodal attention-mechanism for temporal emotion recognition, in Proceedings of the IEEE International Conference on Image Processing, Academic Press, (2020), 251–255. https://doi.org/10.1109/ICIP40778.2020.9191019

© 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)