
Efficient thyroid disorder identification with weighted voting ensemble of super learners by using adaptive synthetic sampling technique

  • There are millions of people suffering from thyroid disease all over the world. For thyroid cancer to be effectively treated and managed, a correct diagnosis is necessary. In this article, we propose an innovative approach for diagnosing thyroid disease that combines an adaptive synthetic sampling method with a weighted average voting (WAV) ensemble of two distinct super learners (SLs). Resampling techniques are used to correct the class imbalance in the datasets, and a group of two SLs composed of various base estimators and meta-estimators is used to increase the accuracy of thyroid cancer identification. To assess the effectiveness of the proposed methodology, we used two publicly accessible datasets: the KEEL thyroid illness dataset (Dataset 1) and the hypothyroid dataset (Dataset 2) from the UCI repository. Applying the adaptive synthetic (ADASYN) sampling technique to both datasets yielded considerable gains in accuracy, precision, recall and F1-score. The WAV ensemble of the two distinct SLs outperformed prior studies on identical datasets and produced higher prediction accuracy than any individual model alone. The proposed methodology has the potential to increase the accuracy of thyroid cancer categorization and could assist with patient diagnosis and treatment. The computational complexity of the WAV ensemble strategy and the optimal choice of base estimators in the SLs remain limitations of this study that call for further investigation.

    Citation: Noor Afshan, Zohaib Mushtaq, Faten S. Alamri, Muhammad Farrukh Qureshi, Nabeel Ahmed Khan, Imran Siddique. Efficient thyroid disorder identification with weighted voting ensemble of super learners by using adaptive synthetic sampling technique[J]. AIMS Mathematics, 2023, 8(10): 24274-24309. doi: 10.3934/math.20231238




    Thyroid cancer is one of the most common endocrine malignancies, accounting for approximately 3.4% of all new cancer cases globally [1,2]. It is estimated that there were 567,233 new cases of thyroid cancer and 41,071 deaths from the disease in 2020 alone [3]. Thyroid cancer is particularly prevalent in women, with a female-to-male incidence ratio of 3:1 [4]. Figure 1 shows the prevalence of hypothyroidism in various countries as a percentage of the population. Risk factors for developing thyroid cancer include exposure to ionizing radiation, a family history of thyroid cancer and certain genetic mutations [5]. The growing incidence has been attributed to numerous factors, such as increased exposure to ionizing radiation, environmental pollutants and improved diagnostic techniques such as high-resolution ultrasound and fine-needle aspiration biopsy [6,7]. The thyroid gland produces two hormones, thyroxine (T4) and triiodothyronine (T3), and dysregulation of thyroid hormones can result in several pathological conditions, including hypothyroidism, hyperthyroidism and thyroid cancer [8]. Noninvasive techniques such as ultrasound and computed tomography (CT) scans, and an invasive technique, fine-needle aspiration biopsy (FNAB), are used for the detection of thyroid cancer [9,10,11]. Ultrasound can distinguish between solid and cystic nodules and can also identify features suggestive of malignancy, such as irregular borders, microcalcifications and increased vascularity. If an ultrasound reveals a suspicious nodule, an FNAB may be performed to obtain a tissue sample for microscopic examination. Early diagnosis is very important for improving the prognosis and reducing the mortality rates associated with this malignancy. Recent advances in machine learning (ML) techniques have the potential to significantly improve the accuracy and efficiency of thyroid cancer classification, aiding clinicians in making better-informed treatment decisions.

    Figure 1.  Prevalence of hypothyroidism in various countries as a percentage of the population.

    Machine learning techniques have demonstrated their utility in various aspects of cancer research and clinical practice, such as disease diagnosis, prognosis and treatment selection [12]. In the context of thyroid cancer, ML algorithms have been employed to analyze a variety of data types, including imaging data, genomic data and clinical data, to provide valuable insights into the classification and prediction of the disease [13,14,15,16,17]. For instance, ML techniques have shown promising results in the classification of thyroid nodules using ultrasound images [18,19], prediction of aggressive tumor features based on clinical and histopathological data [20,21] and molecular classification of thyroid cancer subtypes using genomic data [22,23]. The integration of ML techniques into thyroid cancer diagnostics and treatment decision-making has the potential to enhance patient care by improving the accuracy of diagnoses, reducing unnecessary interventions and facilitating personalized treatment planning. However, despite these promising advancements, challenges remain in terms of data quality, model interpretability and clinical implementation, warranting further research and development in this area.

    After the pre-processing stage, handling the class imbalance effectively in both datasets is a top concern. In machine learning, imbalanced datasets occur frequently when there are significantly fewer examples in one class than in another. This can have a negative impact on the accuracy of machine learning models, which is especially problematic for the under-represented classes. Two datasets were used in this research work. The first dataset comprises three classes in the target variable, representing normal, hypothyroidism and hyperthyroidism. The total instances for each target class are 166 for normal, 6,666 for hypothyroidism and only 368 for hyperthyroidism. Similarly, the second dataset includes two classes, with 3,481 samples labelled as P and 291 as N. Clearly, both datasets contain imbalanced classes that can directly affect the performance and accuracy of the proposed model, owing to the limited number of training samples for specific target classes. Therefore, in this study, we implemented adaptive synthetic (ADASYN) sampling to resample the minority class target variables. Table 2 details the total number of samples in the original and ADASYN-resampled datasets.

    Table 1.  An overview of the related research on thyroid disease.
    Refs Year Sample Size Dataset Model Classes Results
    [40] 2020 - ToxCast LR, RF, SVM, XGB, ANN 2 83.00%
    [41] 2020 7547 UCI SVM 3 97.49%
    [42] 2021 299 UCI DT, RF, KNN, SVM, ANN 2 98.50%
    [43] 2021 3771 UCI DT, KNN, RF, ANN 4 96.10% - 98.30%
    [44] 2021 7200 UCI MLP 3 99.00%
    [45] 2021 519 DDB SVM, DT, RF, LR and NB 4 99.35%
    [28] 2021 - UCI SVM and RF 3 96.80% - 97.30%
    [34] 2021 1250 - SVM, DT, NB, LR, KNN, MLP 3 83.20% - 96.40%
    [30] 2022 7200 KEEL & UCI RF, BME, XGB, AB 3 92.44% - 99.27%
    [30] 2021 3010 Kaggle Ensemble 2 99.60%
    [29] 2021 690 KEEL & DHTH KNN 3 98%
    [46] 2022 3152 UCI DNN 2 99.95%
    [47] 2022 3163 UCI DT, RF, KNN and ANN 2 94.80%
    [48] 2022 215 UCI KNN, XGB, LR, DT 3 81.25% - 87.50%

    Table 2.  Number of training samples in both datasets before and after ADASYN and outlier removal.
    Dataset   Samples in original dataset   Outliers detected   Samples after outlier removal   Training samples after ADASYN   Outliers detected after ADASYN   Samples after final outlier removal
    Dataset 1   5,760   576   5,184   14,410   1,441   12,969
    Dataset 2   3,017   302   2,715   5,592   560   5,032


    The last part of this research study focused on ensembling the two implemented super learners. Super learning is itself an ensembling technique in which the predictions of multiple base estimators are combined by a meta-estimator to improve overall performance. The super learner method uses cross-validation to estimate the performance of its constituent machine learning models. By lowering bias and variance and eliminating parametric assumptions, it can increase the accuracy of machine learning models; it can also help to avoid overfitting and improve model generalization. In this study, the two implemented super learners were themselves combined through a weighted voting ensemble to further improve the accuracy and performance of the proposed approach on both datasets.

    The main contributions of this work are below:

    ● Novel ensemble modeling approach utilizing two super learners, each containing three distinct classifiers, to boost classification performance and reduce model variance.

    ● Various preprocessing and feature selection techniques employed, including feature importance techniques, dimensionality reduction methods, class imbalance handling, outlier detection and feature standardization, to streamline the datasets and identify the most relevant features for thyroid disease classification.

    ● Class imbalance issues were addressed using the adaptive synthetic (ADASYN) sampling technique, oversampling the minority class to ensure equal representation of all classes.

    The paper is structured into several sections, starting with a review of the relevant literature and prior research on thyroid disease classification in Section 2. Section 3 describes the methodology employed in this study, including data acquisition, preprocessing, feature importance, outlier detection, class imbalance handling and ensemble modeling. Section 4 presents the findings of the study, including a comparison with existing works. Section 5 provides an interpretation and analysis of the results. Finally, Section 6 summarizes the study's main contributions, limitations and potential for future research.

    Several studies have used ML techniques such as support vector machines (SVM), artificial neural networks (ANN) and deep learning algorithms such as convolutional neural networks (CNNs) for the detection and classification of thyroid cancer. One area where ML has shown promise is the diagnosis of thyroid nodules, which is crucial for accurate and timely treatment planning. The study [24] proposed a deep learning technique based on a deep convolutional neural network (CNN) to distinguish between benign and malignant thyroid nodules using ultrasound images. The dataset consisted of 1,000 ultrasound images of thyroid nodules, which were divided into training, validation and testing sets. The CNN model achieved an accuracy of 87.6% on the testing set and demonstrated high sensitivity in detecting malignant nodules. Another study [25] employed a machine learning approach to predict the presence of the BRAF mutation in cancerous thyroid nodules. The researchers used 96 ultrasonic images of thyroid nodules and extracted 86 radiomic features. They utilized three different models, namely linear regression (LR), support vector machine (SVM) and random forest (RF), to predict the likelihood of the BRAF mutation being present. Another study [26] proposed a thyroid nodule classification system based on feature fusion and deep learning techniques. The dataset consisted of 5,310 ultrasound images of thyroid nodules and the proposed system achieved high accuracy (95.2%), sensitivity (93.1%) and specificity (96.8%) using a combination of CNN and LSTM networks. In the research conducted in [27], Chen et al. utilized the LASSO technique along with an LR model to pick out the ultrasonic characteristics associated with malignant thyroid nodules. Subsequently, they employed RF to categorize the malignant thyroid nodules. By using LASSO-LR in conjunction with RF, they achieved their highest accuracy of 82%.

    ML has also been applied to predict the risk of malignancy in thyroid nodules using radiomics features extracted from CT images, and to prognostic modeling in thyroid cancer, which is essential for personalized treatment planning and improved patient outcomes. The study described in [28] used two machine learning techniques, namely SVM and RF, to detect thyroid disorders using the provided thyroid dataset. The SVM model achieved 91% accuracy, while the RF model achieved 89%. The research in [29] aimed to forecast thyroid disease, categorizing it into two types: hypothyroid and euthyroid. The assessment criteria adopted in the research encompassed accuracy, precision, recall, F1-score, ROC-AUC, the confusion matrix and the classification report. The random forest classifier stood out as the most effective approach, achieving a success rate of 99.5%. The study emphasized the capacity of machine learning algorithms to detect and diagnose thyroid disease in its initial stages. In another work [30], the model employed was an ensemble of homogeneous classifiers that combined multiple attribute selection approaches. The findings demonstrated that the proposed method achieved an impressive accuracy of 99.6%, which surpassed other state-of-the-art approaches. Another study [31] proposed an artificial neural network (ANN) model to differentiate between benign and malignant nodules and improve the accuracy of objective diagnosis based on ultrasound (US) images. The ANN accurately predicted 82.3% of thyroid cancer cases with an AUC value of 0.818 and an accuracy rate of 84.5%.

    In another study [32], it was observed that SVM was more effective than RF in identifying thyroid conditions. The study employed ML classifiers to predict the presence of thyroid disorders. To enable algorithms to identify the likelihood of patients developing a particular disease, data preparation techniques were implemented to simplify the data. Disease prediction using machine learning is a common practice and several methods are employed by researchers, such as SVM, DT, LR, ANN and KNN, to predict the likelihood of a patient acquiring thyroid disease. In the study [33], clinical datasets were employed to evaluate and compare the performance of three classifiers: SVM, NB and DT. SVM is widely utilized in machine learning. The study [34] categorized thyroid disease into three groups: normal, overactive thyroid (hyperthyroidism) and hypothyroidism. The study implemented several classification methods, including SVM, DT, RF, NB, LR, KNN, LDA and MLP. The most accurate classifier was RF, achieving 89% accuracy.

    In another study [35], researchers employed three ML techniques, ANN, RF and SVM, to identify thyroid texture. The researchers created 30 attributes based on spectral energy, using autoregressive modeling of 2D thyroid ultrasound image variation, to train the classifiers. The characteristics of thyroid tissues were illustrated using image-based features instead of text-based descriptors. When the three techniques were combined, the accuracy rate was around 90%. In [36], the authors used data mining techniques in Python to create algorithms for identifying thyroid illness types, enabling cost-effective thyroid diagnostic reports to be made available to patients. Two well-known systematic attribute selection techniques, namely sequential forward and sequential backward selection, were utilized. An evolutionary method was used as a popular strategy for picking features in nonlinear optimization problems. The SVM was employed to detect hypothyroidism.

    In a cross-sectional study [37], a classification algorithm was developed by integrating SVM, MLP, CHAID and iterative dichotomiser-3. To address dataset imbalance issues, classification methods, bootstrap aggregating (Bagging) and boosting procedures were utilized, which improved the classification outcomes. The study revealed that SVM bagging produced 100% precision and specificity, 73.33% recall and 84.62% F-measure. In a different study [38], the attribute partitioning criteria for detecting thyroid disease were determined using DT. The authors aimed for an accuracy rate of 99.89% and compared the diagnostic results using DT, SVM and NB methodologies. In another study [39], DT, KNN and SVM were used to evaluate the risk of thyroid illness based on a patient's medical history using various ML methods for disease-prevention diagnostics.

    Table 1 compares existing studies on thyroid disease detection using various datasets for evaluation. For our study, we chose a well-known UCI dataset. While previous studies achieved high accuracy in detecting and classifying thyroid disease, there has been limited research on feature selection for this classification problem. Prior studies on thyroid problems categorize them into three classes: normal, hypothyroidism, or hyperthyroidism. However, for proactive prediction and treatment, categorizing patients based on their treatment and general health condition would be more effective. Furthermore, there has been limited discussion on evaluating and comparing the performance of machine learning and deep learning-based techniques for thyroid disease classification. To address these limitations, we propose a multiclass solution for thyroid disease classification that utilizes feature selection and provides a comprehensive performance comparison of machine learning and deep learning-based approaches.

    The methodology employed in this study is depicted in Figure 2 and comprises the following steps: data acquisition, preprocessing, feature importance, class imbalance handling, outlier detection, feature standardization, ensemble modeling and performance evaluation. Two datasets were used for analysis: the KEEL thyroid disease dataset and the hypothyroid dataset from the UCI repository. During the preprocessing phase, various data exploration techniques were applied to gain insights into the datasets. Feature importance techniques were employed to identify the most relevant features for thyroid disease classification. To explore the selected features and their relationships, dimensionality reduction techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) were employed.

    Figure 2.  General block diagram of the proposed methodology.

    The class imbalance issue was addressed using the adaptive synthetic (ADASYN) sampling technique, which oversampled the minority class to ensure equal representation of all classes. Subsequently, outlier detection techniques were applied to identify and remove anomalous observations from the selected features. The features were then standardized to ensure consistent scaling across all variables. The methodology employed in this study involved the development of an ensemble model composed of two super learners, with each super learner containing three distinct classifiers. This ensemble model aimed to boost the classification performance by leveraging the strengths of multiple classifiers and reducing the overall model variance. The ensemble model was evaluated using a range of performance metrics, such as accuracy, specificity, sensitivity and F1-score, to thoroughly assess its effectiveness in classifying thyroid diseases.

    Two datasets were employed in this work to enhance the analysis and thyroid disease classification. The first dataset, the KEEL thyroid disease dataset, offers a comprehensive collection of attributes related to thyroid function tests, patient demographics and clinical data. The second dataset, the Hypothyroid dataset from the UCI repository, complements the KEEL dataset by providing additional instances and features pertinent to hypothyroidism, a common type of thyroid disorder. By utilizing both datasets, the analysis benefits from a diverse and extensive set of instances that cover a broader spectrum of thyroid disease cases. This comprehensive dataset allows for a more accurate evaluation of the classification models and ensures a robust analysis of the factors influencing thyroid disease classification. The details of each dataset are given as follows:

    The KEEL thyroid disease dataset provides a comprehensive collection of data related to thyroid conditions, enabling us to develop and evaluate machine learning models for diagnosing and predicting thyroid disorders. The dataset combines demographic information (age, sex), medical history (on_thyroxine, on_antithyroid_medication, thyroid_surgery, I131_treatment) and various thyroid-related conditions and treatments (query_on_thyroxine, query_hypothyroid, query_hyperthyroid, lithium, goitre, tumor, hypopituitary, psych) as attributes. It includes essential thyroid hormone levels (TSH, T3, TT4, T4U, FTI), providing valuable insights into the patients' thyroid function. The three classes in the dataset represent distinct thyroid disease categories, enabling researchers to develop multi-class classification models for disease detection and prognosis.

    The second dataset under consideration consists of 30 attributes for 3,772 patients, with 29 variables being categorical and one being an integer value. This dataset has a significant amount of missing data. Among the 30 attributes, eight crucial features contain missing values: TT4, FTI, T4U, age, sex, TSH, T3 and TBG, with 231, 385, 387, 1, 150, 369 and 769 missing samples out of the total 3,772 instances for the first seven features, respectively, while the TBG feature is entirely comprised of missing values. The target class distribution, represented as a binary class, includes 3,481 samples labeled as P and 291 as N.

    In the initial stage of preprocessing, the datasets are thoroughly examined for potential errors and inconsistencies, such as incorrect formatting, duplicate entries and invalid values. These issues are rectified through a meticulous data cleaning process, ensuring the integrity of the data. Subsequently, the datasets are scrutinized for missing values, which are imputed using a variety of techniques, encompassing mean, median, mode and k-nearest neighbors imputation methods.

    In the succeeding step, features that have no significant contribution to the model are identified and eliminated from the datasets. Such features may encompass irrelevant or redundant data, or data that exhibits high correlation with other features. This step aids in streamlining the datasets and mitigating noise, which ultimately enhances the model's accuracy and reliability.

    Following this, redundant values are identified and removed from the datasets. This process entails detecting and eliminating duplicate data present across the datasets, as well as any additional redundant information that may exist. By eradicating redundant values, the datasets are further simplified, which bolsters the efficiency and accuracy of the machine learning model applied in the study.

    In conjunction with the pre-processing steps detailed earlier, this study also utilized a least absolute shrinkage and selection operator (LASSO) model-based attribute importance technique to identify significant features from the preprocessed data. The LASSO model, a well-established linear regression model, is frequently employed in machine learning for the purpose of feature selection. The model is advantageous as it not only minimizes the residual sum of squares but also constrains the sum of the absolute values of the coefficients. This constraint leads to the shrinkage of some coefficient estimates to zero, effectively excluding them from the model and resulting in a more parsimonious and interpretable model.

    The LASSO model can be represented mathematically as follows:

    $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \epsilon$ (3.1)

    where $y$ is the dependent variable, $x_1, x_2, \ldots, x_p$ are the independent variables, $\beta_0, \beta_1, \beta_2, \ldots, \beta_p$ are the regression coefficients and $\epsilon$ is the error term.

    The LASSO objective is to minimize the following:

    $\frac{1}{2n} \|y - X\beta\|^2 + \lambda \|\beta\|_1$ (3.2)

    where $\|y - X\beta\|^2$ is the residual sum of squares, $\lambda$ is the penalty parameter and $\|\beta\|_1$ is the L1 norm of the coefficients.

    The coefficient estimates can be obtained by solving the following equation:

    $\hat{\beta} = \arg\min_{\beta} \; \frac{1}{2n} \|y - X\beta\|^2 + \lambda \|\beta\|_1$ (3.3)

    where $\hat{\beta}$ is the vector of coefficient estimates and $X$ is the preprocessed data.

    Once the LASSO model was trained on the preprocessed data, we extracted the features with non-zero coefficients as the most important features for predicting the target variable. These important features were then used as input for the final machine learning model, which was trained and evaluated using standard techniques such as cross-validation and hyperparameter tuning. Figures 3 and 4 illustrate the important features of the first and second datasets according to the LASSO model, respectively; a minimal sketch of this selection step follows the figures.

    Figure 3.  LASSO model feature importance from the first dataset.
    Figure 4.  LASSO model feature importance from second dataset.
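    The selection step can be sketched as follows, assuming scikit-learn; the regularization strength `alpha`, the synthetic data and the non-zero-coefficient threshold are illustrative, not the study's tuned settings.

```python
# Hedged sketch of LASSO-based feature selection (scikit-learn);
# alpha and the synthetic data are illustrative placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_scaled = StandardScaler().fit_transform(X)   # LASSO is scale-sensitive

lasso = Lasso(alpha=0.01).fit(X_scaled, y)     # L1 penalty shrinks coefficients
selected = np.flatnonzero(lasso.coef_)         # indices of non-zero coefficients
print("selected features:", selected)
X_reduced = X_scaled[:, selected]              # input for the downstream model
```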

    Upon determining the most significant features from the preprocessed data using the Lasso model-based attribute importance technique, the subsequent step involves visualizing the data in a manner that emphasizes its inherent structure. In this research, we employed both principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) to investigate the chosen features and discern any patterns or clusters present within the data.

    Principal component analysis (PCA) is a widely employed dimensionality reduction technique utilized in various fields like image processing, finance and genetics [49]. The method is a mathematical algorithm that endeavors to decrease the number of features within a dataset while preserving the most essential information. PCA does this by transforming the dataset into a new coordinate system that is aligned with the principal components of the original data, where each principal component constitutes a linear combination of the original features. The objective of PCA is to maximize the variance of the data along each principal component, thereby ensuring that the most significant information in the data is retained. PCA is particularly useful for visualizing data in two or three dimensions, but it can also be applied to higher dimensional data.

    Given a dataset X, which contains n observations and p features, the first step of PCA is to calculate the covariance matrix C. The covariance matrix describes the relationship between the different features of the dataset. Specifically, it measures how much two features vary together. The diagonal elements of the covariance matrix represent the variance of each feature, while the off-diagonal elements represent the covariance between the features.

    Next, PCA computes the eigenvectors $v_1, v_2, \ldots, v_p$ and the corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_p$ of the covariance matrix $C$ of $X$. The eigenvectors $v_1, v_2, \ldots, v_p$ form an orthonormal basis for the $p$-dimensional space and can be used to project the data onto a new coordinate system that captures the maximum amount of variance in the data.

    The eigenvectors are the directions in which the data varies the most and the corresponding eigenvalues indicate the amount of variance in the data along these directions. The eigenvectors and eigenvalues are sorted in descending order of the eigenvalues and only the top $k$ eigenvectors are retained. These eigenvectors form an orthonormal basis for a $k$-dimensional subspace of the $p$-dimensional space. PCA then projects the data onto a new coordinate system that captures the maximum amount of variance in the data. The projection of $X$ onto the $k$-dimensional subspace spanned by the first $k$ eigenvectors is given by the matrix multiplication:

    $Z = X \times V_k$ (3.4)

    where $V_k$ is the matrix consisting of the first $k$ eigenvectors of $C$. The projected data $Z$ has dimensions $n \times k$, where $k$ is the number of retained eigenvectors. The resulting projected data $Z$ can be used for further analysis or visualization. PCA is particularly useful when dealing with high-dimensional datasets, as it can significantly reduce the number of features while retaining the most important information. PCA is also used for feature extraction, anomaly detection and clustering. Figure 5(a) presents the PCA projection of the first dataset, while Figure 5(b) presents that of the second dataset before resampling; a short code sketch follows the figure.

    Figure 5.  Visualization of attributes in original datasets using PCA (before resampling): (a) first dataset; (b) second dataset.
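    For illustration, a two-component PCA projection of this kind might be computed as follows with scikit-learn (synthetic data stands in for the thyroid features):

```python
# Minimal PCA projection to two components for visualization.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_scaled = StandardScaler().fit_transform(X)   # center and scale before PCA

pca = PCA(n_components=2)                      # retain the top-2 eigenvectors (k = 2)
Z = pca.fit_transform(X_scaled)                # Z = X V_k, as in Eq (3.4)
print(pca.explained_variance_ratio_)           # variance captured per component
```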

    t-Distributed stochastic neighbor embedding (t-SNE) is a commonly utilized technique for nonlinear dimensionality reduction which enables high-dimensional data to be represented visually in a lower-dimensional space. Given a dataset $X$, which contains $n$ observations and $p$ features, t-SNE constructs a lower-dimensional map $Y$ in which the distances between points reflect the similarities in their probabilities.

    The first step of t-SNE is to model the high-dimensional data as a set of probabilities. Specifically, it constructs a probability distribution P over pairs of high-dimensional data points such that similar points have a higher probability of being chosen than dissimilar points. It then constructs a probability distribution Q over pairs of low-dimensional data points that aims to preserve the similarity structure of the high-dimensional data. The algorithm works by minimizing the Kullback-Leibler divergence between the joint probabilities P and the conditional probabilities Q.

    The cost function to be minimized is the Kullback-Leibler (KL) divergence between the probability distributions $P$ and $Q$:

    $C = \mathrm{KL}(P \| Q) = \sum_i \sum_j P_{ij} \log \frac{P_{ij}}{Q_{ij}}$ (3.5)

    The probability $P_{ij}$ that point $i$ would choose point $j$ as its neighbor in the high-dimensional space is computed using a Gaussian kernel:

    $P_{ij} = \dfrac{\exp(-\|x_i - x_j\|^2 / 2\sigma_i^2)}{\sum_{k \neq l} \exp(-\|x_k - x_l\|^2 / 2\sigma_k^2)}$ (3.6)

    where $x_i$ and $x_j$ are the feature vectors of points $i$ and $j$ in the high-dimensional space and $\|x_i - x_j\|$ is the Euclidean distance between them. The parameter $\sigma_i$ is the standard deviation of the Gaussian kernel for point $i$ and is computed from the distance to its $k$th nearest neighbor. This parameter is chosen to reflect the density of the data around each point, which helps to balance the probabilities for points in dense and sparse regions of the data.

    To compute the probability $Q_{ij}$ that point $i$ would choose point $j$ as its neighbor in the low-dimensional space, t-SNE uses the Student's t-distribution. Specifically, it defines $Q_{ij}$ as:

    $Q_{ij} = \dfrac{(1 + \|y_i - y_j\|^2)^{-1}}{\sum_{k \neq i} (1 + \|y_i - y_k\|^2)^{-1}}$ (3.7)

    where $y_i$ and $y_j$ are the coordinates of points $i$ and $j$ in the low-dimensional space and $\|y_i - y_j\|$ is the Euclidean distance between them. The summation in the denominator over all other points in the low-dimensional space normalizes the probabilities.

    By employing gradient descent, t-SNE minimizes the cost function with respect to the coordinates $y_i$ of the points in the low-dimensional space. This is accomplished by continuously updating the coordinates of the points until the cost function reaches a minimum. The algorithm is recognized for its capacity to preserve the local structure of data, which renders it highly advantageous for visualizing intricate datasets such as images and text. Figure 6 shows the t-SNE projections of the first and second datasets, respectively; a short code sketch follows the figure.

    Figure 6.  Visualization of attributes in original datasets using t-SNE (before resampling): (a) first dataset, (b) second dataset.
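    A comparable t-SNE embedding can be sketched as follows; the perplexity value is an illustrative default rather than the study's setting.

```python
# Hedged t-SNE sketch on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.manifold import TSNE

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
emb = TSNE(n_components=2, perplexity=30.0, random_state=0).fit_transform(X)
print(emb.shape)                               # (500, 2): the low-dimensional map Y
```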

    The issue of class imbalance in machine learning has long been recognized as a significant challenge, as it can introduce bias in favor of one class while under-representing another. To address this challenge, a range of sampling techniques have been developed over time, including synthetic over-sampling, which involves generating artificial data points to represent the minority class.

    Among the various synthetic sampling methods, ADAptive SYNthetic (ADASYN) sampling has emerged as particularly effective, owing to its non-linear interpolation scheme. This approach introduces non-linearity into the sampled dataset by generating synthetic examples that lie between existing minority examples and their k-nearest neighbors, with more synthetic points created for minority examples whose neighborhoods are dominated by the majority class. The synthetic samples are thus generated in accordance with the density distribution of the minority class in the feature space, which captures the underlying non-linear relationship between the minority and majority classes.

    As a result, ADASYN generates minority samples that are uniquely representative of the minority group, introducing new patterns and variations within the dataset. This, in turn, can enhance the ability of machine learning models to capture the non-linear relationships between the feature and target variables.

    Let $X$ be a dataset with $N$ samples and $M$ features. Let $C_1$ be the majority class and $C_2$ the minority class, where $|C_2| < |C_1|$. To reduce the class imbalance, we use ADASYN sampling to obtain $|C_2| = |C_1| \times \beta$, where $\beta \in [0, 1]$ is the desired sampling level.

    To apply ADASYN sampling, we first calculate the density distribution of minority class samples. For each minority sample xi in class C2, the density distribution D(xi) is calculated as:

    $D(x_i) = \sum_{j=1}^{N} w_j(x_i) \times \dfrac{1}{\mathrm{dist}(x_i, x_j)^p}$ (3.8)

    where $D(x_i)$ is the density distribution of the $i$th minority sample, $\mathrm{dist}(x_i, x_j)$ is the Euclidean distance between the $i$th and $j$th nearest-neighbor samples from both classes $C_1$ and $C_2$ and $p$ is the decay parameter that controls the rate of decay of the contribution of distant samples to the density distribution. The weight $w_j(x_i)$ is a function that assigns a weight to each sample based on its similarity to $x_i$:

    $w_j(x_i) = \exp\left(-\dfrac{d_j(x_i)^2}{2\sigma^2}\right)$ (3.9)

    where $d_j(x_i)$ is the Euclidean distance between $x_i$ and $x_j$ and $\sigma$ is a bandwidth parameter.

    The class imbalance ratio $I_r$ is determined by calculating the ratio between the number of majority class samples $S_M$ and the number of minority class samples $S_m$:

    $I_r = \dfrac{S_M}{S_m}$ (3.10)

    To determine the number of synthetic samples to generate for each minority sample $x_i$, we use the following equation:

    $G(x_i) = \mathrm{round}\left(D(x_i) \times I_r \times (1 - \alpha)\right)$ (3.11)

    where $G(x_i)$ is the number of synthetic samples generated for the $i$th minority sample and $\alpha$ is a hyperparameter that controls the degree of randomness in the sampling process.

    Finally, the synthetic samples are generated in a series of iterations. For each minority sample $x_i$, we select $k_i$ nearest neighbors from both classes $C_1$ and $C_2$, where $k_i$ is a hyperparameter. We then use the following equation to generate the $k_i$ synthetic samples:

    $SS_k = x_i + \alpha_k \times (x_r - x_i) + \beta_k \times (x_j - x_i)$ (3.12)

    where $SS_k$ is the $k$th synthetic sample generated for the $i$th minority sample, $\alpha_k$ and $\beta_k$ are random numbers between 0 and 1, $x_r$ is a randomly chosen minority sample and $x_j$ is a randomly chosen sample from the $k_i$ nearest neighbors.

    The values of $\alpha_k$ and $\beta_k$ are determined using the following relations:

    $\alpha_k = (1 - \delta) r_1 + \dfrac{\delta}{2}$ (3.13)
    $\beta_k = (1 - \delta) r_2 + \dfrac{\delta}{2}$ (3.14)

    where $\delta$ is a hyperparameter that controls the degree of randomness in the sampling process and $r_1$ and $r_2$ are random numbers between 0 and 1. Figure 7 shows the target variable distribution of the first and second datasets after resampling using the ADASYN technique with PCA, while Figure 8 shows the corresponding distributions with t-SNE.

    Figure 7.  Visualization of both datasets using PCA after ADASYN resampling: (a) first dataset after resampling; (b) second dataset after resampling.
    Figure 8.  Visualization of both datasets using t-SNE after ADASYN resampling: (a) first dataset after resampling; (b) second dataset after resampling.

    Figure 9 compares the target variable distributions of the original and resampled datasets: Figure 9(a) shows the first dataset before and after ADASYN resampling and Figure 9(b) shows the second dataset before and after ADASYN resampling.

    Figure 9.  Comparison of target variable distribution of original and resampled datasets: (a) first dataset before and after ADASYN resampling; (b) second dataset before and after ADASYN resampling.

    It can be observed that the distribution of both classes has become more balanced in the resampled datasets. This indicates that the ADASYN resampling technique has successfully addressed the class imbalance problem in both datasets, which can potentially lead to more accurate and robust classification models.
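    As a hedged sketch (not the authors' code), ADASYN resampling of this kind is available in the imbalanced-learn library; the imbalance ratio and neighbor count below are illustrative.

```python
# ADASYN resampling sketch with imbalanced-learn on synthetic imbalanced data.
from collections import Counter
from imblearn.over_sampling import ADASYN
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
print("before:", Counter(y))                   # heavily skewed class counts
X_res, y_res = ADASYN(n_neighbors=5, random_state=0).fit_resample(X, y)
print("after: ", Counter(y_res))               # classes approximately balanced
```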

    The Local Outlier Factor (LOF) algorithm is a popular technique for detecting anomalies in a dataset. In this research, we employed the LOF algorithm as part of our methodology for identifying outliers in our dataset. To implement this algorithm, we first defined a distance metric between data points using the commonly used Euclidean distance. We then used the algorithm to calculate the local density of each data point by measuring the average distance between the point and its k-nearest neighbors. To determine the value of k, we performed a sensitivity analysis and chose the value that resulted in the best performance.

    With the local density of each data point calculated, we proceeded to compute the LOF score for each point. This score reflects the degree to which a data point's local density deviates from that of its neighbors. Specifically, a data point whose LOF score is substantially greater than 1, meaning its local density is much lower than that of its neighbors, is considered an outlier.

    Let $X = \{x_1, x_2, \ldots, x_n\}$ be a dataset consisting of $n$ data points, where each data point $x_i$ belongs to a $d$-dimensional feature space. We define the distance metric between two data points $x_i$ and $x_j$ as the Euclidean distance, given by $\mathrm{dist}(x_i, x_j) = \sqrt{(x_i - x_j)^T (x_i - x_j)}$, where $T$ denotes the transpose operator.

    Given a data point $x_i$, we define its $k$-distance as the distance between $x_i$ and its $k$th-nearest neighbor, given by the equation:

    $k\text{-distance}(x_i) = \mathrm{dist}(x_i, x_{k(i)})$ (3.15)

    where $x_{k(i)}$ is the $k$th-nearest neighbor of $x_i$. Using the $k$-distance of each data point, we define the local reachability density (LRD) of a data point $x_i$ as the inverse of the average distance between $x_i$ and its $k$-nearest neighbors. This is given by the equation:

    $\mathrm{LRD}(x_i) = \left( \dfrac{\sum_{x_j \in N_k(x_i)} \mathrm{dist}(x_i, x_j)}{k} \right)^{-1}$ (3.16)

    where the sum is taken over $x_i$'s $k$-nearest neighbors $N_k(x_i)$.

    Finally, we define the LOF score of a data point $x_i$ as the average ratio of the LRD of its $k$-nearest neighbors to the LRD of $x_i$ itself, given by the equation:

    $\mathrm{LOF}(x_i) = \dfrac{1}{k} \sum_{x_j \in N_k(x_i)} \dfrac{\mathrm{LRD}(x_j)}{\mathrm{LRD}(x_i)}$ (3.17)

    where the sum is taken over $x_i$'s $k$-nearest neighbors, excluding $x_i$ itself.
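    A minimal outlier-removal sketch using scikit-learn's LocalOutlierFactor is given below; the neighbor count stands in for the k chosen by the sensitivity analysis described above, and the data is synthetic.

```python
# Hedged LOF outlier-removal sketch (scikit-learn).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import LocalOutlierFactor

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
lof = LocalOutlierFactor(n_neighbors=20)       # Euclidean distance by default
labels = lof.fit_predict(X)                    # -1 flags outliers, 1 inliers
X_clean, y_clean = X[labels == 1], y[labels == 1]
print("removed:", int(np.sum(labels == -1)), "outliers")
```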

    Table 2 shows the improvement in the number of training samples for both datasets before and after applying ADASYN and outlier removal. Dataset 1 had 5,760 samples in the original dataset, which increased to 14,410 after applying ADASYN, and reduced to 12,969 after outlier removal. Similarly, Dataset 2 had 3,017 samples in the original dataset, which increased to 5,592 after applying ADASYN and then reduced to 5,032 after outlier removal. The number of samples increased significantly after applying ADASYN, which helps to balance the class distribution in both datasets.

    In this research study, we utilized two iterations of the super learner (SL) ensemble technique as our primary methodology to predict outcomes in our dataset. The SL ensemble technique is an effective method for combining multiple machine learning models to achieve higher prediction accuracy.

    In our study, we employed two SL ensembles, each consisting of three base estimators. The first SL ensemble comprised logistic regression, decision trees and support vector classification. We selected these estimators based on their individual strengths and potential synergies when combined. To combine the predictions of the base estimators, we utilized a random forest as the meta-estimator, known for its ability to reduce overfitting and improve prediction accuracy. Similarly, for the second SL ensemble, we selected random forest, AdaBoost and bagging classifiers as base estimators, based on their respective strengths in handling large datasets, improving weak learners' performance and reducing overfitting. A decision tree was employed as the meta-estimator to integrate the predictions of the individual models. This was achieved by recursively partitioning the dataset into smaller subsets based on input features until a stopping criterion was reached. The decision tree algorithm then predicted the output variable based on the most prevalent class within each subset and these predictions were used to generate the final prediction. Figure 10 shows the overall structure of the model implemented in this study, and a code sketch of the two SLs follows the figure.

    Figure 10.  The overview of implemented classification models in this study.
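    One plausible realization of the two SLs, assuming scikit-learn, is a pair of StackingClassifier objects, which fit each meta-estimator on cross-validated base predictions; all hyperparameters below are illustrative, not the study's tuned values.

```python
# Hedged sketch of the two super learners as cross-validated stacking ensembles.
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

sl1 = StackingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(random_state=0)),
                ("svc", SVC(probability=True, random_state=0))],
    final_estimator=RandomForestClassifier(random_state=0),  # meta-estimator of SL1
    cv=5)                                       # out-of-fold base predictions

sl2 = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("ada", AdaBoostClassifier(random_state=0)),
                ("bag", BaggingClassifier(random_state=0))],
    final_estimator=DecisionTreeClassifier(random_state=0),  # meta-estimator of SL2
    cv=5)
```

    Each SL is then fitted on the training split, and its predict_proba outputs feed the weighted voting step described below.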

    By utilizing two iterations of the SL ensemble technique with different combinations of base estimators and meta estimators, we were able to achieve higher prediction accuracy than any individual model could achieve alone. After applying SL 1 and SL 2 ensemble techniques to our dataset, we wanted to further improve the accuracy of our predictions. Therefore, we decided to use a weighted average voting technique as our final step.

    After calculating the performance metrics for each model, we assigned weights to each model based on their performance. We assigned higher weights to the models with better performance and lower weights to those with weaker performance. The weights were assigned in such a way that the total sum of weights was equal to one. After assigning the appropriate weights to each model, we combined their predictions by calculating a weighted average of their outputs. To obtain the final prediction, we took a weighted average of the predicted probabilities for each potential outcome. This approach allowed us to derive a more accurate prediction by taking into account the strengths and weaknesses of each individual model.
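    A minimal sketch of this weighted-average voting step is given below; the weights shown are placeholders for the performance-based weights described above.

```python
# Weighted-average (soft) voting over the two super learners' probabilities;
# w1 and w2 are assumed performance-based and sum to one.
import numpy as np

def weighted_average_vote(proba_sl1, proba_sl2, w1=0.55, w2=0.45):
    """proba_*: (n_samples, n_classes) predicted-probability arrays."""
    combined = w1 * proba_sl1 + w2 * proba_sl2   # weighted average of outputs
    return combined.argmax(axis=1)               # class with the highest probability
```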

    To gauge the effectiveness of the weighted average voting technique, we compared its performance to that of the individual models and super learner ensembles. We utilized several performance metrics, including accuracy, precision, recall and F1-score, to evaluate the effectiveness of the weighted average voting technique. Further details on the classifiers utilized in this study are provided in the subsequent subsections.

    Logistic regression is a popular and powerful algorithm for supervised machine learning that can be used for binary classification tasks [40]. The goal of logistic regression is to estimate the probability of a binary outcome based on one or more input features [34]. The input features are combined with a set of weights and an intercept term to produce a linear combination of the inputs. This linear combination is then passed through a sigmoid function to obtain the predicted probability.

    The logistic regression algorithm determines the weight and intercept terms that minimize the disparity between the predicted probabilities and the actual binary outcomes within the training data. This is accomplished by reducing a cost function using an optimization algorithm such as gradient descent. The logistic regression model can be expressed as follows:

    $p(y = 1 \mid x) = \dfrac{1}{1 + \exp(-(w^T x + b))}$ (3.18)

    where $p(y = 1 \mid x)$ is the predicted probability of $y = 1$ given input features $x$, $w$ is the weight vector and $b$ is the intercept term.

    To train the logistic regression model, we use a training set of input features and binary target variables. The weights and intercept term are initialized randomly and the cost function is iteratively minimized using an optimization algorithm such as gradient descent. The resulting model can then be used to predict the binary outcome for new input features, as shown in Figure 10.
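    As a didactic sketch (not the study's implementation), a gradient-descent fit of Eq (3.18) can be written as follows; the learning rate and epoch count are illustrative.

```python
# Didactic batch gradient descent for logistic regression (Eq 3.18).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Minimize the logistic loss over weights w and intercept b; y in {0, 1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)                 # predicted P(y = 1 | x)
        grad = p - y                           # gradient of the log-loss w.r.t. logits
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b
```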

    The decision trees algorithm is a highly adaptable supervised machine learning model that can accommodate both categorical and numerical data and perform both classification and regression tasks [29]. The algorithm does this by recursively dividing the data into smaller subsets based on the most important attributes until a stopping criterion is reached [34]. The resulting structure is a visual representation of a decision-making process, with nodes representing decisions based on particular attributes and branches representing the outcomes of those decisions [48]. The root node signifies the initial decision with maximum entropy, while the leaf/terminal nodes indicate the final decisions with zero entropy [50]. This approach has proven highly effective in a wide range of problem domains and can offer valuable insights into complex decision-making processes.

    Assume we have a dataset $D_1 = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ with input features $X = \{x_1, x_2, x_3, \ldots, x_n\}$ and a target variable $Y$, where:

    $D_1$ is the dataset with $x_i$ input features and $y_i$ target variables,

    $x_i = \{x_{i1}, x_{i2}, x_{i3}, \ldots, x_{in}\}$ is the feature vector for the $i$th observation,

    $y_i = \{y_{i1}, y_{i2}, y_{i3}, \ldots, y_{in}\}$ is the $i$th target variable.

    To select the root feature, we use an entropy-based impurity measure. Entropy is calculated using the following:

    $H(S) = -\sum_{x} p(x) \log p(x)$ (3.19)

    where $H(S)$ is the entropy and $p(x)$ is the proportion of class $x$ in the attribute node.

    The information gain after the split is calculated using the following:

    $G(S, x_{in}) = H(S) - \sum_{v \in \mathrm{Values}(x_{in})} \dfrac{|S_v|}{|S|} H(S_v)$ (3.20)

    where,

    $G(S, x_{in})$ is the gain of the $n$th feature of the $i$th observation,

    $x_{in}$ is the $n$th feature within the root node,

    $S$ is the class subset of the $x_{in}$ feature,

    $H(S)$ is the entropy of the root,

    $H(S_v)$ is the entropy of the child nodes,

    $S_v$ are the samples in the respective subset,

    $S$ are the samples in the root node.

    The gain for each feature $x_{in}$ is computed and the feature that maximizes the impurity reduction is selected, i.e., $G(S, x_{i1}) > G(S, x_{i2})$. This process is iterated until a stopping criterion is met, such as reaching the maximum depth, falling below a minimum impurity-reduction threshold or having a minimum number of observations per node. Equations (3.19) and (3.20) translate directly into code, as sketched below.
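    A direct translation of Eqs (3.19) and (3.20) into code, with illustrative names, might look as follows:

```python
# Entropy and information gain for one categorical feature (Eqs 3.19-3.20).
import numpy as np

def entropy(labels):
    """H(S) = -sum_x p(x) log2 p(x), for a numpy array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def information_gain(labels, feature):
    """G(S, F) = H(S) - sum_v |S_v|/|S| * H(S_v); feature is one column."""
    gain = entropy(labels)
    for v in np.unique(feature):
        subset = labels[feature == v]          # S_v: samples with feature value v
        gain -= len(subset) / len(labels) * entropy(subset)
    return gain
```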

    Support vector classification (SVC) or support vector machines (SVM) is a popular and effective supervised machine learning algorithm that can be used for both classification and regression tasks [34]. The goal of SVC is to find a hyperplane that best separates the input data into different classes or to find a hyperplane that best fits the input data for regression tasks [45,51]. The hyperplane is chosen to maximize the margin, or the distance between the hyperplane and the closest data points from each class [40,41].

    To train an SVC model, we first select a kernel function, denoted as $K(x, x')$, that maps the input features to a higher-dimensional space where the input data is more separable [50]. The kernel function takes two input vectors $x$ and $x'$ and outputs a scalar value that measures the similarity between them. The most common kernel functions are linear, polynomial and radial basis function (RBF) [51]. The choice of kernel function depends on the characteristics of the input data and the specific problem domain [52].

    The input data consists of $n$ feature vectors $x_i$, where $i \in [1, n]$, and the corresponding labels $y_i$, where $y_i \in \{-1, 1\}$ for classification tasks and $y_i \in \mathbb{R}$ for regression tasks. For classification tasks, we aim to find a hyperplane in the feature space that separates the two classes with the largest possible margin. For regression tasks, we aim to find a hyperplane that best fits the input data with minimum error.

    The optimization problem for SVM is defined as follows:

    $\min_{w, b, \xi} \left( \dfrac{1}{2} \|w\|^2 + C \sum_i \xi_i \right)$ (3.21)

    subject to $y_i (w^T x_i + b) \geq 1 - \xi_i$ and $\xi_i \geq 0$, where $w$ is the weight vector, $b$ is the bias term and $\xi_i$ is the slack variable that allows for misclassifications in the margin. The parameter $C$ controls the trade-off between maximizing the margin and minimizing the classification or regression error [52].

    The solution to the optimization problem is obtained by solving its dual form, which is given by:

    $\max_{\alpha} \sum_i \alpha_i - \dfrac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j K(x_i, x_j)$ (3.22)

    subject to $0 \leq \alpha_i \leq C$ and $\sum_i \alpha_i y_i = 0$, where $\alpha_i$ is the Lagrange multiplier associated with the $i$th data point and the support vectors are the data points with non-zero Lagrange multipliers. The weight vector $w$ and the bias term $b$ can be computed from the support vectors and their corresponding Lagrange multipliers.

    Once the hyperplane is determined, new input data can be classified or predicted by computing its distance from the hyperplane. For classification tasks, the predicted label is determined by the sign of the distance.
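    For illustration, a soft-margin SVC of this kind can be fitted as follows with scikit-learn; the RBF kernel and the value of C are illustrative choices, and the data is synthetic.

```python
# Hedged SVC sketch; C trades margin width against slack (Eq 3.21).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)  # solves the dual problem (3.22)
print("test accuracy:", clf.score(X_te, y_te))
```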

    The random forest algorithm is an influential machine learning technique that finds extensive application in both classification and regression tasks [53]. It falls within the category of ensemble learning algorithms, which combine multiple decision trees to produce more accurate predictions [47]. One of the principal advantages of random forest is its utilization of bootstrap aggregation (also referred to as bagging) to enhance the performance of the model by decreasing variance [50]. This is achieved by training each decision tree on a randomly selected subset of the original data, which helps to mitigate overfitting and enhance the generalization capability of the model.

    For a given dataset $X$ consisting of $n$ examples, where each example has $m$ features and $Y$ is the respective set of class labels, the random forest algorithm aims to learn a function $f: X \rightarrow Y$ that can predict the class for a new input vector $x$.

    To create a random forest classifier, we first generate a set of bootstrap samples $D_i$ of size $n'$ that are uniformly and randomly selected with replacement from the original dataset $X$. Since observations are selected with replacement, there may exist some duplicates within each $D_i$. A fraction $f = (1 - 1/e) \approx 63.2\%$ of the examples in $D_i$ are expected to be unique when $n' = n$, while the remaining entries are duplicates.

    We can represent $X$ as a collection of bootstrap samples $D_i$, where:

    $X = \{D_1, D_2, D_3, \ldots, D_n\}$ (3.23)
    $D = \bigcup_{i=1}^{n} D_i$ (3.24)
    $D_i: \begin{cases} f = 63.2\%, & n' = n \\ f < 63.2\%, & n' \neq n \end{cases}$ (3.25)

    For each feature $F$ and target variable $T$ in each sample set $D_i$, we calculate entropy and gain to create one decision tree per bootstrap sample $D_i$. The entropy of a selectable feature $F$ and the target variable $T$ is calculated using:

    $E(T, F) = \sum_{c \in F} P(c) E(c)$ (3.26)

    where $E(c)$ is the entropy of the respective class and $P(c)$ is the proportion of samples belonging to the respective class.

    We can then calculate the information gain after the split using:

    $\mathrm{Gain}(T, F) = E(T) - E(T, F)$ (3.27)

    where $E(T)$ is the entropy of the target variable and $E(T, F)$ is the entropy of the target and the feature.

    The predicted outcome $t$ from each of the bootstrap samples in $D_i$ is then compared to form an aggregate score:

    $\hat{y} = \arg\max_{t} \left( \sum_{j=1}^{n} \delta(\hat{y}_j = t) \right), \quad t \in \{0, 1\}$ (3.28)

    where $\hat{y}$ is the predicted outcome and $\delta$ is the Kronecker delta function.

    Finally, the predictions from a random forest algorithm can be given by:

    $f(x) = \dfrac{1}{N} \sum_{i=1}^{N} T_i(x)$ (3.29)

    where $f(x)$ is the predicted class label for an input vector $x$, $N$ is the number of decision trees in the forest and $T_i(x)$ is the prediction of the $i$th decision tree. The scaling factor $\frac{1}{N}$ ensures that the output is a probability distribution over the possible class labels.

    AdaBoost, short for adaptive boosting, is a popular ensemble learning algorithm used for binary classification and regression tasks [30]. AdaBoost combines multiple weak classifiers into a strong classifier by assigning weights to each weak classifier based on its accuracy [53]. Let $C(x)$ be the binary classifier that predicts the label of input data $x$. The final prediction of the AdaBoost model is given by:

    $H(x) = \mathrm{sign}\left( \sum_t \alpha_t C_t(x) \right)$ (3.30)

    where $H(x)$ is the final prediction, $\alpha_t$ is the weight assigned to weak classifier $C_t$ and $\mathrm{sign}$ is the sign function that returns $+1$ or $-1$ depending on the sign of its argument.

    To update the weights of the input data samples after each iteration, we use the following formula:

    $w_i \leftarrow w_i \exp(-\alpha_t y_i C_t(x_i))$ (3.31)

    where $w_i$ is the weight of data point $i$, $y_i$ is the true label of data point $i$ and $x_i$ is the input data. If data point $i$ is correctly classified by weak classifier $C_t$, $y_i C_t(x_i)$ is positive and the weight $w_i$ is decreased. If data point $i$ is misclassified, $y_i C_t(x_i)$ is negative and the weight $w_i$ is increased.

    The weight assigned to each weak classifier is determined by its accuracy on the training data. Let $\epsilon_t$ be the weighted classification error of weak classifier $C_t$, defined as:

    $\epsilon_t = \frac{\sum_i w_i \, \mathbf{1}(y_i \neq C_t(x_i))}{\sum_i w_i}$ (3.32)

    The weight αt is then computed as:

    $\alpha_t = \frac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t}$ (3.33)

    The weight $\alpha_t$ is positive if the classification error of $C_t$ is less than 0.5 and negative otherwise. A higher weight is assigned to weak classifiers with lower classification error.
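    One boosting round can be traced by hand. The sketch below follows Eqs. (3.31)-(3.33) with hypothetical labels and weak-classifier outputs in $\{-1, +1\}$ (not data from the study):

```python
import numpy as np

# One AdaBoost round on five samples; sample at index 2 is misclassified.
y    = np.array([+1, -1, +1, +1, -1])
pred = np.array([+1, -1, -1, +1, -1])          # weak classifier outputs C_t(x_i)
w    = np.full(len(y), 1 / len(y))             # uniform initial sample weights

eps   = np.sum(w * (pred != y)) / np.sum(w)    # weighted error, Eq. (3.32) -> 0.2
alpha = 0.5 * np.log((1 - eps) / eps)          # classifier weight, Eq. (3.33)
w     = w * np.exp(-alpha * y * pred)          # sample re-weighting, Eq. (3.31)
w    /= w.sum()                                # renormalize to sum to 1

print(eps, alpha)   # 0.2, ~0.693
print(w)            # the misclassified point's weight grows to 0.5
```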

    The bagging classifier is a powerful ensemble learning method that utilizes multiple independently trained classifiers to enhance prediction accuracy and mitigate overfitting [47]. Bagging, a contraction of bootstrap aggregating, generates several bootstrap samples of the input data and trains an individual classifier on each sample [50].

    The bagging classifier training process commences by randomly selecting data points from the input dataset with replacement, creating several bootstrap samples. Subsequently, a classifier is trained on each bootstrap sample, utilizing the same learning algorithm. Upon completing the training phase, the trained classifiers are employed to make predictions on unseen data. The final prediction is derived by consolidating the predictions of each classifier via majority voting. The bagging classifier's final prediction for a given input sample can be represented as:

    $\hat{y} = \mathrm{mode}(\hat{y}_1, \hat{y}_2, \hat{y}_3, \ldots, \hat{y}_T)$ (3.34)

    where $\hat{y}$ is the final prediction of the bagging classifier, $\hat{y}_1, \hat{y}_2, \hat{y}_3, \ldots, \hat{y}_T$ are the predictions of the individual classifiers and mode is the statistical mode, i.e., the value that appears most frequently in the set of predictions.
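    The majority vote of Eq. (3.34) reduces to taking the column-wise mode of the individual predictions. A small sketch with hypothetical predictions from $T = 3$ classifiers on three test samples:

```python
import numpy as np

# Each column holds the T classifiers' votes for one test sample.
preds = np.array([
    [0, 1, 1],    # classifier 1
    [0, 1, 0],    # classifier 2
    [1, 1, 1],    # classifier 3
])
final = np.array([np.bincount(col).argmax() for col in preds.T])  # column modes
print(final)      # -> [0 1 1]
```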

    Super learners (SL) are ensemble methods that combine multiple machine learning models to improve prediction accuracy and reduce overfitting. The SL algorithm has two stages: base learning and meta learning. During the base learning stage, multiple models are trained on the training data; in the meta learning stage, a meta model combines the predictions of these base models to generate the final prediction. Figure 11 illustrates the implementation of the super learner in this study. Let $X$ be the input data, $y$ the target variable and $M$ the set of base models. For each base model $m \in M$, let $f_m(X)$ be the predicted outcome of $m$ on $X$. Then, the super learner output is given by:

    $SL(X) = g(f_1(X), f_2(X), \ldots, f_m(X))$ (3.35)

    where $g$ is the meta model that combines the predictions of the base models.

    Figure 11.  The implementation of the super learner in this study.
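    A compact sketch of this two-stage scheme is given below. The base/meta pairing mirrors the first SL of this study (LR, DT and SVC under a random forest meta model, described next), but the training details, in particular the use of 5-fold out-of-fold predictions as meta features, are our assumptions rather than the authors' stated procedure:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_informative=8, random_state=0)

base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(random_state=0),
               SVC()]

# Stage 1 (base learning): out-of-fold predictions f_m(X), so the meta model
# never sees predictions a base model made on its own training folds.
meta_features = np.column_stack(
    [cross_val_predict(m, X, y, cv=5) for m in base_models])

# Stage 2 (meta learning): g(.) combines the base predictions, Eq. (3.35).
meta_model = RandomForestClassifier(random_state=0).fit(meta_features, y)

# Refit the base models on the full training data for later prediction.
for m in base_models:
    m.fit(X, y)

# Prediction: stack the base outputs, then apply the meta model.
stacked = np.column_stack([m.predict(X[:5]) for m in base_models])
print(meta_model.predict(stacked))
```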

    For our first SL ensemble, we selected three base estimators: logistic regression (LR), decision trees (DT) and support vector classification (SVC). We chose these estimators based on their individual strengths and potential synergies that could be achieved by combining them. Once we had trained our base estimators, we used a random forest as the meta estimator to combine their predictions. The random forest algorithm is known for its ability to reduce overfitting and improve prediction accuracy by using multiple decision trees.

    For our second SL ensemble, we selected three different base estimators: random forest (RF), AdaBoost and the bagging classifier. The random forest algorithm was selected for its capacity to handle large, high-dimensional datasets; AdaBoost was chosen for its effectiveness in enhancing the performance of weak learners; and the bagging classifier was chosen for its ability to alleviate overfitting and improve generalization. We trained and evaluated each of these base estimators using various metrics to identify their strengths and weaknesses. To combine the predictions of the base estimators in our second SL ensemble, we used a DT as the meta estimator. The decision tree algorithm is a simple yet powerful algorithm that recursively splits the dataset into smaller subsets based on the input features to predict the outcome.

    In our research, we also employed weighted average voting (WAV) as another ensemble method to combine the predictions of our base estimators. The weight assigned to each base estimator depends on its performance on the training data: the better a base estimator performs on the training data, the higher its weight in the ensemble. The weighted average of the predicted values of the base estimators can be represented as:

    $\hat{y}_{WAV} = \sum_{i=1}^{n} w_i \hat{y}_i$ (3.36)

    where $\hat{y}_{WAV}$ is the final prediction of the ensemble, $n$ is the number of base estimators, $\hat{y}_i$ is the predicted value of the $i$-th base estimator and $w_i$ is the weight assigned to the $i$-th base estimator. We found that WAV can be a simple yet effective ensemble method for combining the predictions of multiple base estimators, especially when the base estimators have comparable performance on the training data.
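    A minimal sketch of Eq. (3.36), assuming the weights are obtained by normalizing each estimator's training accuracy (the accuracies and probabilities below are hypothetical):

```python
import numpy as np

train_acc = np.array([0.96, 0.99, 0.94])       # assumed training accuracies
w = train_acc / train_acc.sum()                # weights w_i summing to 1

# Predicted positive-class probabilities of 3 estimators on 4 samples.
y_hat = np.array([
    [0.20, 0.90, 0.60, 0.10],   # estimator 1
    [0.10, 0.80, 0.70, 0.05],   # estimator 2
    [0.30, 0.70, 0.40, 0.20],   # estimator 3
])
y_wav = w @ y_hat                              # sum_i w_i * y_hat_i, Eq. (3.36)
print((y_wav >= 0.5).astype(int))              # -> [0 1 1 0]
```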

    In this section, we provide a brief overview of the performance evaluation metrics used to assess our models.

    Accuracy is one of the most widely used performance evaluation metrics; it measures the proportion of correct predictions made by a model. Mathematically, accuracy is defined as follows:

    $\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$ (3.37)

    Precision is a performance evaluation metric that measures the proportion of true positives (i.e., correct positive predictions) out of all positive predictions made by a model. Mathematically, precision is defined as follows:

    $\text{Precision} = \frac{\text{Number of true positives}}{\text{Number of true positives} + \text{Number of false positives}}$ (3.38)

    Recall is a performance evaluation metric that measures the proportion of true positives (i.e., correct positive predictions) out of all actual positive instances in the dataset. Mathematically, recall is defined as follows:

    $\text{Recall} = \frac{\text{Number of true positives}}{\text{Number of true positives} + \text{Number of false negatives}}$ (3.39)

    The F1-score is a performance evaluation metric that combines precision and recall into a single value that balances the two. It is defined as the harmonic mean of precision and recall and is calculated as follows:

    $F_1\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$ (3.40)
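    As a quick illustration of Eqs. (3.37)-(3.40), computed with scikit-learn on a toy label vector (not data from this study):

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]       # one false negative at index 2
print(f"accuracy:  {accuracy_score(y_true, y_pred):.3f}")   # 5/6
print(f"precision: {precision_score(y_true, y_pred):.3f}")  # 3/3
print(f"recall:    {recall_score(y_true, y_pred):.3f}")     # 3/4
print(f"f1-score:  {f1_score(y_true, y_pred):.3f}")         # harmonic mean
```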

    In this research study, we utilized two variants of the super learner (SL) ensemble technique as our primary methodology for predicting outcomes in our datasets. The SL ensemble technique is an effective method for combining multiple machine learning models to achieve higher prediction accuracy. We used an 80-20 data splitting approach to train and test our models. Two SL ensembles with three base estimators each were employed: the first included logistic regression, decision trees and support vector classification, while the second comprised random forest, AdaBoost and the bagging classifier. To improve prediction accuracy further, a weighted average voting technique was applied, with weights based on the performance of the individual models. We assessed the effectiveness of the weighted average voting technique by comparing it against the individual models and the super learner ensembles using multiple metrics, namely accuracy, precision, recall and F1-score. The ensemble model consistently outperformed the individual models, highlighting the effectiveness of our methodology in predicting thyroid disease outcomes with high accuracy and the benefits of combining SL ensembles with the weighted average voting technique.
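    A sketch of this evaluation protocol is shown below: an 80-20 split followed by ADASYN resampling. Resampling only the training portion is our assumption (the paper does not spell this out); oversampling before the split would leak synthetic minority points into the test set. The dataset here is synthetic and imbalanced:

```python
from imblearn.over_sampling import ADASYN
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)   # 80-20 split

X_bal, y_bal = ADASYN(random_state=0).fit_resample(X_tr, y_tr)

# ... the two super learners are then fit on (X_bal, y_bal), combined with
# weighted average voting and scored on the untouched (X_te, y_te).
print(f"minority share before: {y_tr.mean():.2f}, after: {y_bal.mean():.2f}")
```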

    Table 3 shows the performance of the proposed methodology for the first dataset. The original dataset without ADASYN resampling had an accuracy of 99.58%, precision of 99.48%, recall of 95.87% and F1-score of 97.56%. After ADASYN resampling, the accuracy improved to 99.90%, precision to 99.89%, recall to 99.90% and F1-score to 99.90%. This improvement in performance indicates that the proposed methodology is effective in dealing with imbalanced datasets.

    Table 3.  Performance of proposed methodology for first dataset.
    Accuracy Precision Recall F1-score
    Original dataset (without ADASYN) 99.58% 99.48% 95.87% 97.56%
    Resampled dataset (with ADASYN resampling) 99.90% 99.89% 99.90% 99.90%


    Table 4 shows the performance of the proposed methodology for the second dataset. The original dataset without ADASYN resampling had an accuracy of 99.602%, precision of 99.785%, recall of 97.413% and F1-score of 98.565%. After ADASYN resampling, the accuracy improved to 99.714%, precision to 99.711%, recall to 99.717% and F1-score to 99.713%. This improvement in performance again demonstrates the effectiveness of the proposed methodology in dealing with imbalanced datasets.

    Table 4.  Performance of proposed methodology for second dataset.
    Accuracy Precision Recall F1-score
    Original dataset (without ADASYN) 99.602% 99.785% 97.413% 98.565%
    Resampled dataset (with ADASYN resampling) 99.714% 99.711% 99.717% 99.713%


    The confusion matrices for the first dataset before ADASYN resampling are illustrated in Figure 12. Similarly, Figure 13 illustrates the confusion matrices for the second dataset before ADASYN resampling.

    Figure 12.  Confusion matrices of first dataset without resampling: (a) classification of number of test samples; (b) classification of test samples in percentage.
    Figure 13.  Confusion matrices of second dataset without resampling: (a) classification of number of test samples; (b) classification of test samples in percentage.

    Figure 14 and Figure 15 show the confusion matrices for both datasets with ADASYN resampling and weighted average voting. Figure 14(a) shows the number of test samples classified by the model and Figure 14(b) shows the results in percentage for the first dataset. Similarly, Figure 15(a) shows the number of test samples classified by the model and Figure 15(b) shows the results in percentage for the second dataset.

    Figure 14.  Confusion matrices of first dataset with ADASYN resampling: (a) classification of number of test samples; (b) classification of test samples in percentage.
    Figure 15.  Confusion matrices of second dataset with ADASYN resampling: (a) classification of number of test samples; (b) classification of test samples in percentage.

    Comparison with existing works

    We have compared our proposed methodology with existing works done on the same datasets. Tables 5 and 6 present a comparison of our proposed methodology, which uses an ensemble, with existing studies on the first and second datasets, respectively. The performance metrics, such as accuracy, precision, recall and F1-score, are used to evaluate and compare the effectiveness of each technique employed in the respective studies.

    Table 5.  Comparison of proposed methodology with existing studies on first dataset.
    Ref Year Dataset Technique Accuracy Precision Recall F1-score
    This work 2023 KEEL SL ensemble 99.602% 99.785% 97.413% 98.565%
    [54] 2016 KEEL KNN 98.600% - - -
    [55] 2018 KEEL KNN 96.900% - - -
    [56] 2021 KEEL LSM autoencoder 98.900% 99.600% 75.100% -


    For the first dataset (KEEL), our proposed SL ensemble method achieved an accuracy of 99.602%, precision of 99.785%, recall of 97.413% and F1-score of 98.565%. When compared to other studies on the same dataset, our methodology demonstrates superior performance in all metrics. For instance, study [54] using KNN achieved an accuracy of 98.600%, while study [55] employing KNN reached an accuracy of 96.900% and study [56] utilizing a liquid state machine (LSM) Autoencoder had an accuracy of 98.900%. Additionally, our method's precision and recall values outperform those reported in [56].

    For the second dataset (hypothyroid), our proposed SL ensemble method achieved an accuracy of 99.714%, precision of 99.711%, recall of 99.717% and F1-score of 99.713%. When compared to other studies on the same dataset, our methodology again demonstrates superior performance in all metrics. Study [57] using KNN achieved an accuracy of 98.000%, while study [58] employing a random forest (RF) with sequential minimal optimization (SMO) reached an accuracy of 99.440% and study [59] utilizing a decision tree (DT) had an accuracy of 99.580%. It is worth noting that our method's recall value surpasses that reported in study [59]. As illustrated in Table 6, the proposed methodology, which employs an ensemble, demonstrates superior performance in terms of accuracy, precision, recall and F1-score when compared to other studies on both datasets.

    Table 6.  Comparison of proposed methodology with existing studies on second dataset.
    Ref Year Dataset Technique Accuracy Precision Recall F1-score
    This work 2023 Hypothyroid SL ensemble 99.714% 99.711% 99.717% 99.713%
    [57] 2018 Hypothyroid KNN 98.000% - - -
    [58] 2021 Hypothyroid RF with SMO 99.440% - - -
    [59] 2022 Hypothyroid DT 99.580% - 99.600% -


    Finally, the proposed approach was also compared with several distinct methodologies that adopt a similar approach for separate problem statements; three studies were selected for this purpose. In [60], kernel slow feature analysis (KSFA) is implemented for fault detection in an air handling unit. KSFA is a feature extraction approach that can capture the temporal dynamics of time series data: it extracts time-invariant slow features that can enhance the performance of machine learning models built on such data, and it can extract features in batches, which is beneficial when working with very large datasets. ADASYN, by contrast, effectively produces additional training samples to build a more balanced dataset and thereby obtain an efficient and robust prediction model. In [61], a hybrid resampling technique (HRT) is used with an extreme learning machine ensemble. Both ADASYN and HRT are effective resampling approaches for improving the performance of machine learning models on imbalanced datasets; the choice between them depends on the specific use case and the nature of the data. ADASYN is a computationally efficient and simple oversampling approach that adaptively generates synthetic data for minority class instances, whereas HRT combines oversampling and undersampling to balance the dataset and reduce model bias and variance. HRT can be computationally costly in comparison to ADASYN and may not be appropriate for all types of imbalanced datasets. In [62], a feature sparse representation technique is implemented. While feature sparse representation addresses the problem of scarce features in machine learning, ADASYN addresses class imbalance. Understanding why feature sparsity occurs is essential when constructing models, since it can lead to issues such as overfitting and suboptimal learning outcomes. Again, the appropriate approach is determined by the specific use case and the kind of data.

    The research study presented in this paper provides insights into the effectiveness of the proposed methodology for dealing with imbalanced datasets and into the performance of an ensemble model in predicting thyroid disease outcomes. The accuracy, precision, recall and F1-score all showed significant improvements after applying ADASYN resampling in both datasets. This indicates that addressing class imbalance through resampling techniques is an essential preprocessing step for imbalanced datasets, as it markedly improves the performance of machine learning models. The results corroborate the importance of considering and addressing class imbalance in the data during the preprocessing stage.

    The results demonstrate that the ensemble model, which combines multiple machine learning models, achieved higher prediction accuracy than any individual model alone. This finding supports the idea that combining the strengths of different models through ensemble techniques can lead to improved performance. The use of ensembles with distinct combinations of base and meta estimators further reinforces this notion, as it allows the advantages of each model to be leveraged while mitigating their individual weaknesses. The weighted average voting technique, which assigns different weights to the models based on their performance, further improved the prediction accuracy of the ensemble model, showing that it better captures the strengths of each model in the ensemble and leads to a more accurate and reliable prediction. The results obtained for both datasets indicate that the proposed methodology is not only effective in dealing with imbalanced datasets but also robust in predicting thyroid disease outcomes. The consistency in the improvement of performance metrics for both datasets demonstrates the potential of the methodology to be generalized and applied to other datasets with similar challenges.

    For the first dataset (KEEL), our proposed SL ensemble method achieved an accuracy of 99.602%, precision of 99.785%, recall of 97.413% and F1-score of 98.565%. Compared to other studies on the same dataset, our methodology demonstrates superior performance in all metrics. The work [54] employed k-nearest neighbors (KNN), a simple yet effective algorithm for classification tasks, but its accuracy of 98.600% falls short of our SL ensemble. The work [55] also used KNN and only reached an accuracy of 96.900%. The work [56] utilized an LSM autoencoder, a bio-inspired neural network model, and achieved an accuracy of 98.900%. The precision and recall values reported in [56] are lower than those of our method, which indicates that our method has better discriminatory power between the classes.

    For the second dataset (hypothyroid), our proposed SL ensemble method achieved an accuracy of 99.714%, precision of 99.711%, recall of 99.717% and F1-score of 99.713%. Compared to other studies on the same dataset, our methodology again demonstrates superior performance in all metrics. The work [57] used KNN and achieved an accuracy of 98.000%, which is lower than our SL ensemble. The paper [58] employed a random forest (RF) with sequential minimal optimization (SMO), a combination of a powerful ensemble method and a technique for solving large-scale optimization problems; however, their accuracy of 99.440% is still lower than ours. Study [59] utilized a decision tree (DT), a popular and interpretable machine learning model, and achieved an accuracy of 99.580%. Although their recall value is comparable to that of our method, our method still outperforms study [59] in accuracy, precision and F1-score.

    However, it is important to consider some limitations of the research study and potential areas for improvement. While the ensemble model showed improved performance, it may be computationally expensive due to the use of multiple base estimators and iterations of the super learner technique. Future research could explore methods to optimize the computational efficiency of the ensemble model without compromising its performance. Additionally, the selection of base and meta estimators in the super learner ensembles was based on their individual strengths and potential synergies when combined. However, the optimal combination of models may vary depending on the dataset and the problem at hand.

    The proposed methodology for thyroid cancer classification using a super learner ensemble model with resampling techniques and weighted average voting showed significant improvements in performance on imbalanced datasets. The results demonstrate the importance of addressing class imbalance in the data during the preprocessing stage and the benefits of combining multiple machine learning models for improving prediction accuracy. The super learner ensemble method achieved higher prediction accuracy than any individual model alone and the use of distinct combinations of base and meta estimators further improved performance. The proposed methodology showed superior performance compared to other studies on the same datasets, demonstrating its potential to be applied to other datasets with similar challenges. However, the computational complexity of the ensemble model and the optimal selection of base and meta estimators remain as limitations that require further research. Overall, the proposed methodology shows promise in improving the accuracy of thyroid cancer classification and can potentially aid in the diagnosis and treatment of thyroid cancer patients.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This research was funded by Princess Nourah bint Abdulrahman University and Researchers Supporting Project number (PNURSP2023R346), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

    The authors declare no conflict of interest.



    [1] S. Grodski, T. Brown, S. Sidhu, A. Gill, B. Robinson, D. Learoyd, et al., Increasing incidence of thyroid cancer is due to increased pathologic detection, Surgery, 144 (2008), 1038–1043.
    [2] J. Kim, J. E. Gosnell, S. A. Roman, Geographic influences in the global rise of thyroid cancer, Nat. Rev. Endocrinol., 16 (2020), 17–29.
    [3] H. Sung, J. Ferlay, R. L. Siegel, M. Laversanne, I. Soerjomataram, A. Jemal, Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries, CA: Cancer J. Clin., 71 (2021), 209–249. https://doi.org/10.3322/caac.21660 doi: 10.3322/caac.21660
    [4] L. Enewold, K. Zhu, E. Ron, A. J. Marrogi, A. Stojadinovic, G. E. Peoples, et al., Rising thyroid cancer incidence in the United States by demographic and tumor characteristics, 1980–2005, Cancer Epidem. Biomar., 18 (2009), 784–791. https://doi.org/10.1109/JMEMS.2009.2023841 doi: 10.1109/JMEMS.2009.2023841
    [5] L. Davies, H. G. Welch, Current thyroid cancer trends in the United States, JAMA Otolaryngology-Head Neck Surgery, 140 (2014), 317. https://doi.org/10.1016/j.neucom.2014.03.007 doi: 10.1016/j.neucom.2014.03.007
    [6] P. B. Manoj, A. Innisai, D. S. Hameed, A. Khader, M. Gopanraj, N. H. Ihare, Correlation of high-resolution ultrasonography findings of thyroid nodules with ultrasound-guided fine-needle aspiration cytology in detecting malignant nodules: A retrospective study in Malabar region of Kerala, South India, J. Fam. Med. Prim. Care, 8 (2019), 1613.
    [7] H. Tan, Z. Li, N. Li, J. Qian, F. Fan, H. Zhong, et al., Thyroid imaging reporting and data system combined with Bethesda classification in qualitative thyroid nodule diagnosis, Medicine, 98 (2019), 2019.
    [8] A. N. Rajalakshmi, F. Begam, Thyroid Hormones in the Human Body: A review, J. Drug Delivery Ther., 11 (2021), 178–182. https://doi.org/10.22270/jddt.v11i5.5039 doi: 10.22270/jddt.v11i5.5039
    [9] A. K. Lee, P. M. A. Tacanay, P. Siy, D. T. Argamosa, Ectopic papillary thyroid carcinoma presenting as right lateral neck mass, JAFES, 37 (2022), 2022.
    [10] M. I. Larg, D. Apostu, C. Peștean, K. Gabora, I. C. Bădulescu, E. Olariu, et al., Evaluation of malignancy risk in 18F-FDG PET/CT thyroid incidentalomas, Diagnostics, 9 (2019), 92. https://doi.org/10.3390/diagnostics9030092 doi: 10.3390/diagnostics9030092
    [11] M. Hanan, E. Fatma, A. Aly, A. Medhat, Evaluation of Incidental Thyroid Findings Detected by Positron Emission Tomography/Computed Tomography, Medical J. Cairo University, 87 (2019), 819–826. https://doi.org/10.21608/mjcu.2019.52541 doi: 10.21608/mjcu.2019.52541
    [12] S. Quazi, Artificial intelligence and machine learning in precision and genomic medicine, Med. Oncol., 39 (2022), 120.
    [13] K. Preuss, N. Thach, X. Liang, M. Baine, J. Chen, C. Zhang, et al., Using quantitative imaging for personalized medicine in pancreatic cancer: a review of radiomics and deep learning applications, Cancers, 14 (2022), 1654. https://doi.org/10.3390/cancers14071654 doi: 10.3390/cancers14071654
    [14] N. Shusharina, D. Yukhnenko, S. Botman, V. Sapunov, V. Savinov, G. Kamyshov, et al., Modern methods of diagnostics and treatment of neurodegenerative diseases and depression, Diagnostics, 13 (2023), 573. https://doi.org/10.3390/diagnostics13030573 doi: 10.3390/diagnostics13030573
    [15] S. Khalil, U. Nawaz, Zubariah, Z. Mushtaq, S. Arif, M. Z. ur Rehman, et al., Enhancing ductal carcinoma Classification using transfer learning with 3D U-Net models in breast cancer imaging, Appl. Sci., 13 (2023), 4255.
    [16] A. M. Antoniadi, Y. Du, Y. Guendouz, L. Wei, C. Mazo, B. A. Becker, et al., Current challenges and future opportunities for XAI in machine learning-based clinical decision support systems: A systematic review, Appl. Sci., 11 (2021), 5088.
    [17] Z. Mushtaq, M. F. Qureshi, M. J. Abbass, S. M. Q. AlFakih, Effective kernel principal component analysis based approach for Wisconsin breast cancer diagnosis, Electron. Lett., 59 (2023).
    [18] X. M. Keutgen, H. Li, K. Memeh, J. Conn Busch, J. Williams, L. Lan, D. Sarne, et al., A machine-learning algorithm for distinguishing malignant from benign indeterminate thyroid nodules using ultrasound radiomic features, J. Med. Imaging, 9 (2022), 034501–034501.
    [19] V. V. Vadhiraj, A. Simpkin, J. O'Connell, N. Singh Ospina, S. Maraka, D. T. O'Keeffe, Ultrasound image classification of thyroid nodules using machine learning techniques, Medicina, 57 (2021), 527. https://doi.org/10.3390/medicina57060527 doi: 10.3390/medicina57060527
    [20] M. Bereby-Kahane, R. Dautry, E. Matzner-Lober, F. Cornelis, D. Sebbag-Sfez, V. Place, et al., Prediction of tumor grade and lymphovascular space invasion in endometrial adenocarcinoma with MR imaging-based radiomic analysis, Diagn. Interv. Imag., 101 (2020), 401–411.
    [21] K. E. Fasmer, E. Hodneland, J. A. Dybvik, K. Wagner-Larsen, J. Trovik, A. Salvesen, et al., Whole-volume tumor MRI radiomics for prognostic modeling in endometrial cancer, J. Magn. Reson. Imaging, 53 (2021), 928–937.
    [22] A. Prete, P. Borges de Souza, S. Censi, M. Muzza, N. Nucci, M. Sponziello, Update on fundamental mechanisms of thyroid cancer, Front. Endocrinol., 11 (2020), 102.
    [23] N. Pozdeyev, M. M. Rose, D. W. Bowles, R. E. Schweppe, Molecular therapeutics for anaplastic thyroid cancer, In: Seminars in Cancer Biology, 61 (2020), 23–29. https://doi.org/10.1016/j.semcancer.2020.01.005
    [24] Y. C. Zhu, P. F. Jin, J. Bao, Q. Jiang, X. Wang, Thyroid ultrasound image classification using a convolutional neural network, Ann. Transl. Med., 9 (2021).
    [25] M. R. Kwon, J. H. Shin, H. Park, H. Cho, S. Y. Hahn, K. W. Park, Radiomics study of thyroid ultrasound for predicting BRAF mutation in papillary thyroid carcinoma: Preliminary results, Am. J. Neuroradiol., 41 (2020), 700–705. https://doi.org/10.3174/ajnr.A6505 doi: 10.3174/ajnr.A6505
    [26] Y. Wang, W. Yue, X. Li, S. Liu, L. Guo, H. Xu, et al., Comparison study of radiomics and deep learning-based methods for thyroid nodules classification using ultrasound images, Ieee Access, 8 (2020), 52010–52017.
    [27] D. Chen, J. Hu, M. Zhu, N. Tang, Y. Yang, Y. Feng, Diagnosis of thyroid nodules for ultrasonographic characteristics indicative of malignancy using random forest, BioData Min., 13 (2020), 1–21.
    [28] H. K. Shivastuti, J. Manhas, V. Sharma, Performance evaluation of SVM and random forest for the diagnosis of thyroid disorder, Int. J. Res. Appl. Sci. Eng. Technol., 9 (2021), 945–947.
    [29] H. Abbad Ur Rehman, C. Y. Lin, Z. Mushtaq, Effective K-nearest neighbor algorithms performance analysis of thyroid disease, J. Chin. Inst. Eng., 44 (2021), 77–87. https://doi.org/10.14358/PERS.87.2.77 doi: 10.14358/PERS.87.2.77
    [30] T. Akhtar, S. O. Gilani, Z. Mushtaq, S. Arif, M. Jamil, Y. Ayaz, et al., Effective voting ensemble of homogenous ensembling with multiple attribute-selection approaches for improved identification of thyroid disorder, Electronics, 10 (2021), 3026.
    [31] L. C. Zhu, Y. L. Ye, W. H. Luo, M. Su, H. P. Wei, X. B. Zhang, et al., A model to discriminate malignant from benign thyroid nodules using artificial neural network, PLoS One, 8 (2013), e82211. https://doi.org/10.1371/journal.pone.0082211 doi: 10.1371/journal.pone.0082211
    [32] B. Zhang, J. Tian, S. Pei, Y. Chen, X. He, Y. Dong, et al., Machine learning–assisted system for thyroid nodule diagnosis, Thyroid, 29 (2019), 858–867. https://doi.org/10.1089/thy.2018.0380 doi: 10.1089/thy.2018.0380
    [33] A. K. Singh, A comparative study on disease classification using machine learning algorithms, In Proceedings of 2nd International Conference on Advanced Computing and Software Engineering (ICACSE), 2019.
    [34] E. Sonuç, Thyroid disease classification using machine learning algorithms, In: Journal of Physics: Conference Series, vol. 1963, p. 012140, IOP Publishing, 2021. https://doi.org/10.1088/1742-6596/1963/1/012140
    [35] P. Poudel, A. Illanes, E. J. Ataide, N. Esmaeili, S. Balakrishnan, M. Friebe, Thyroid ultrasound texture classification using autoregressive features in conjunction with machine learning approaches, IEEE Access, 7 (2019), 79354–79365. https://doi.org/10.1109/ACCESS.2019.2923547 doi: 10.1109/ACCESS.2019.2923547
    [36] D. C. Yadav, S. Pal, Thyroid prediction using ensemble data mining techniques, Int. J. Inf. Technol., 14 (2022), 1273–1283.
    [37] S. S. Z. Mousavi, M. M. Zanjireh, M. Oghbaie, Applying computational classification methods to diagnose Congenital Hypothyroidism: A comparative study, Inf. Medicine Unlocked, 18 (2020), 100281.
    [38] D. T. Nguyen, J. K. Kang, T. D. Pham, G. Batchuluun, K. R. Park, Ultrasound image-based diagnosis of malignant thyroid nodule using artificial intelligence, Sensors, 20 (2020), 1822. https://doi.org/10.3390/s20071822 doi: 10.3390/s20071822
    [39] G. Chaubey, D. Bisen, S. Arjaria, V. Yadav, Thyroid disease prediction using machine learning approaches, Natl. Acad. Sci. Lett., 44 (2021), 233–238.
    [40] M. Garcia de Lomana, A. G. Weber, B. Birk, R. Landsiedel, J. Achenbach, K. J. Schleifer, et al., In silico models to predict the perturbation of molecular initiating events related to thyroid hormone homeostasis, Chem. Res. Toxicol., 34 (2020), 396–411.
    [41] K. Shankar, S. K. Lakshmanaprabu, D. Gupta, A. Maseleno, V. H. C. De Albuquerque, Optimal feature-based multi-kernel SVM approach for thyroid disease classification, J. Supercomput., 76 (2020), 1128–1143.
    [42] H. Abbad Ur Rehman, C. Y. Lin, Z. Mushtaq, S. F. Su, Performance analysis of machine learning algorithms for thyroid disease, Arab. J. Sci. Eng., 1–13, 2021.
    [43] R. Das, S. Saraswat, D. Chandel, S. Karan, J. S. Kirar, An AI Driven Approach for Multiclass Hypothyroidism Classification, In: Advanced Network Technologies and Intelligent Computing: First International Conference, ANTIC 2021, Varanasi, India, December 17–18, 2021, Proceedings, pp. 319–327, Springer, 2022.
    [44] M. Hosseinzadeh, O. H. Ahmed, M. Y. Ghafour, F. Safara, H. K. Hama, S. Ali, et al., A multiple multilayer perceptron neural network with an adaptive learning algorithm for thyroid disease diagnosis in the internet of medical things, J. Supercomput., 77 (2021), 3616–3637.
    [45] M. Riajuliislam, K. Z. Rahim, A. Mahmud, Prediction of Thyroid Disease (Hypothyroid) in Early Stage Using Feature Selection and Classification Techniques, In: 2021 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), pp. 60–64, IEEE, 2021.
    [46] R. Jha, V. Bhattacharjee, A. Mustafi, Increasing the prediction accuracy for thyroid disease: A step towards better health for society, Wireless Pers. Commun., 122 (2022), 1921–1938. https://doi.org/10.1155/2022/9809932 doi: 10.1155/2022/9809932
    [47] T. Alyas, M. Hamid, K. Alissa, T. Faiz, N. Tabassum, A. Ahmad, Empirical method for thyroid disease classification using a machine learning approach, BioMed Res. Int., 22 (2022).
    [48] S. Sankar, A. Potti, G. N. Chandrika, S. Ramasubbareddy, Thyroid disease prediction using XGBoost algorithms, J. Mob. Multimed, 18 (2022), 1–18.
    [49] I. Ali, Z. Mushtaq, S. Arif, A. Algarni, N. Soliman, W. El-Shafai, Hyperspectral images-based crop classification scheme for agricultural remote sensing, Comput. Syst. Sci. Eng., 46 (2023), 303–319.
    [50] S. Arif, S. Munawar, H. Ali, Driving drowsiness detection using spectral signatures of EEG-based neurophysiology, Front. Physiol., 14 (2023), 1153268.
    [51] S. Arif, M. Arif, S. Munawar, Y. Ayaz, M. J. Khan, N. Naseer, EEG spectral comparison between occipital and prefrontal cortices for early detection of driver drowsiness, In: 2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS), pp. 1–6, IEEE, 2021.
    [52] S. Arif, M. J. Khan, N. Naseer, K. S. Hong, H. Sajid, Y. Ayaz, Vector phase analysis approach for sleep stage classification: A functional near-infrared spectroscopy-based passive brain–computer interface, Front. Hum. Neurosci., 15 (2021), 658444.
    [53] T. Akhtar, S. Arif, Z. Mushtaq, S. O. Gilani, M. Jamil, Y. Ayaz, et al., Ensemble-based effective diagnosis of thyroid disorder with various feature selection techniques, In: 2022 2nd International Conference of Smart Systems and Emerging Technologies (SMARTTECH), pp. 14–19, IEEE, 2022.
    [54] K. Chandel, V. Kunwar, S. Sabitha, T. Choudhury, S. Mukherjee, A comparative study on thyroid disease detection using K-nearest neighbor and Naive Bayes classification techniques, CSI Transactions ICT, 4 (2016), 313–319. https://doi.org/10.1111/twec.13285 doi: 10.1111/twec.13285
    [55] R. Pal, T. Anand, S. K. Dubey, Evaluation and performance analysis of classification techniques for thyroid detection, Int. J. Bus. Inf. Syst., 28 (2018), 163–177.
    [56] M. Saktheeswari, T. Balasubramanian, Multi-layer tree liquid state machine recurrent auto encoder for thyroid detection, Multimed. Tools Appl., 80 (2021), 17773–17783. https://doi.org/10.1007/s11042-020-10243-7 doi: 10.1007/s11042-020-10243-7
    [57] A. Tyagi, R. Mehra, A. Saxena, Interactive Thyroid Disease Prediction System Using Machine Learning Technique, In: 2018 Fifth International Conference on Parallel, Distributed and Grid Computing (PDGC), (Solan Himachal Pradesh, India), pp. 689–693, IEEE, Dec. 2018.
    [58] S. Mishra, Y. Tadesse, A. Dash, L. Jena, P. Ranjan, Thyroid Disorder Analysis Using Random Forest Classifier, In: Intelligent and Cloud Computing (D. Mishra, R. Buyya, P. Mohapatra, and S. Patnaik, eds.), Smart Innovation, Systems and Technologies, (Singapore), pp. 385–390, Springer, 2021.
    [59] K. Guleria, S. Sharma, S. Kumar, S. Tiwari, Early prediction of hypothyroidism and multiclass classification using predictive machine learning and deep learning, Measurement: Sensors, 24 (2022), 100482. https://doi.org/10.1016/j.measen.2022.100482 doi: 10.1016/j.measen.2022.100482
    [60] H. Zhang, C. Li, D. Li, Y. Zhang, W. Peng, Fault detection and diagnosis of the air handling unit via an enhanced kernel slow feature analysis approach considering the time-wise and batch-wise dynamics, Energ. Buildings, 253 (2021), 111467. https://doi.org/10.1016/j.enbuild.2021.111467 doi: 10.1016/j.enbuild.2021.111467
    [61] H. Zhang, W. Yang, W. Yi, J. B. Lim, Z. An, C. Li, Imbalanced data based fault diagnosis of the chiller via integrating a new resampling technique with an improved ensemble extreme learning machine, J. Build. Eng., 70 (2023), 106338. https://doi.org/10.1016/j.jobe.2023.106338 doi: 10.1016/j.jobe.2023.106338
    [62] H. Zhang, C. Li, Q. Wei, Y. Zhang, Fault detection and diagnosis of the air handling unit via combining the feature sparse representation based dynamic SFA and the LSTM network, Energ. Buildings, 269 (2022), 112241.
© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)