1. Introduction
A generalized partial linear regression model (GPLRM) is a semiparametric extension of a generalized linear regression model. In the study of GPLRMs, topics such as parameter estimation and the efficiency of estimators have received much attention from researchers [1,2,3,4,5]; in this context, the well-known profile likelihood method (PLM) was developed. As a special case of the GPLRM, the logistic partial linear regression model (LPLRM) is widely used to study the relationship between a binomial response variable and explanatory variables with a linear part and a nonparametric part in medical, social and biological studies, and its parameter estimation as well as the asymptotic properties of the estimators can be obtained within the framework of the GPLRM. The LPLRM assumes that the explanatory variables in the linear part are mutually independent. In applications, however, this independence assumption seldom holds, and its violation causes the multicollinearity problem. It is well known that multicollinearity may cause large variances of the estimators, which in turn lead to wide confidence intervals and erroneous sign estimates.
In the study of the multicollinearity problem, to the best of our knowledge, multicollinearity in linear regression models, generalized linear regression models (logistic, Poisson and gamma), and partial linear regression models has been well investigated. (a) Most of the powerful novel methods for dealing with multicollinearity have appeared in the study of linear regression models. Hoerl and Kennard [6] first proposed a ridge estimation program for the unknown parameters to combat multicollinearity in a linear regression model. Liu [7] presented a class of biased Liu estimators of the unknown parameters that are superior to ridge estimators and whose biasing parameters are easy to select. Liu [8] introduced a two-parameter Liu-type estimator of the unknown parameters to deal with serious multicollinearity that the ridge estimator fails to combat. Kurnaz and Akay [9] proposed a general Liu-type estimator of the unknown parameters based on the Liu-type estimator. Moreover, Zeinal [10] contributed an extension of the two-parameter estimator presented by Özkale and Kaçiranlar [11]. (b) For generalized linear regression models, several modified estimators have been established to deal with multicollinearity. Estimators to combat multicollinearity in the logistic linear regression model (LLRM) are listed below. Kibria, Månsson and Shukur [12] generalized and compared logistic ridge estimators with different ridge parameters. Inan and Erdogan [13] introduced a Liu-type estimator that has a smaller total mean squared error (MSE) than the ridge estimator under certain conditions. Asar and Genç [14] constructed a two-parameter ridge estimator for the LLRM. Varathan and Wijekoon [15] proposed an estimator called the modified almost unbiased logistic Liu estimator (MAULLE). Ertan and Akay [16] modified the general Liu-type estimator by reconstructing the form of the biasing parameter function. Jadhav [17] proposed a new estimator designated the linearized ridge logistic estimator. There are also a new Poisson ridge estimator by Rashad et al. [18] and a modified gamma ridge-type estimator by Lukman et al. [19]. (c) To eliminate multicollinearity in the partial linear regression model (PLRM), several biased estimators of the unknown parameters have been constructed by considering additive errors, correlated errors, linear constraints, etc. For example, Roozbeh and Arashi [20] proposed difference-based ridge-type estimators combining the restricted least squares method in seemingly unrelated semiparametric models. Wu [21] proposed a difference-based almost unbiased Liu estimator in the PLRM. Emami and Aghamohammadi [22] constructed difference-based ridge and Liu-type estimators in the PLRM when the covariates are measured with additive errors. Akdeniz and Roozbeh [23] introduced a generalized difference-based almost unbiased ridge estimator in the PLRM when the errors are correlated. Wu and Kibria [24] considered a generalized difference-based mixed two-parameter estimator in the PLRM.
Theoretically, any statistical regression model (except a univariate regression model) that includes a linear part may face a multicollinearity problem. Consider now the case of the LPLRM: to the best of our knowledge, no literature has reported on combating the multicollinearity problem in the study of an LPLRM. Apparently, the existing methods mentioned above for combating multicollinearity in the LLRM and the PLRM can be recommended for constructing a suitable biased estimator for our purpose. However, it is difficult to properly coordinate the existing methods for the LLRM and the PLRM; that is, the combination of the methods should generate an alternative estimator which optimally reduces the impact of multicollinearity on the LPLRM, while the linear part and the nonparametric part of the LPLRM are estimated in accordance with standard statistical criteria. We attempt to address such issues in section 3, where the PLM is employed to construct a more generalized Liu-type estimator, and the optimal choices of the biasing parameters and of the biasing parameter function, as well as the superiority conditions of the proposed estimator over other estimators, are given.
This paper is organized as follows. Section 2 states the LPLRM, its estimation and evaluation criteria, as well as the PLM. Several biased estimators and the proposed generalized Liu-type estimator (GLTE) are given in section 3. In section 4, theoretical conditions are derived to study the superiority of the GLTE over the other estimators under the mean squared error matrix (MSEM) criterion. In section 5, the optimal choices of the biasing parameters and of the biasing parameter function are determined. In section 6, Monte Carlo simulations are given to evaluate the performance of the proposed GLTE. Section 7 presents a real data application. Finally, a brief summary and conclusions are given in section 8.
2. Statistical methodology
2.1. Model and estimation
The dependent variable yi∈{0,1} is the binary response variable. xTi is the ith row of the n×p explanatory variable matrix X, whose entries may be continuous, discrete, or a mixture of discrete and continuous. ti∈Rq is the ith sample of a q-variate random vector of continuous explanatory variables. We consider the LPLRM as follows:
log(πi/(1−πi))=xTiβ+m(ti), i=1,⋯,n, (2.1)
where πi=prob(yi=1|xi,ti), β=(β1,β2,...,βp)T is a p×1 parameter vector, and m(⋅) is a nonparametric function.
The PLM is often used to obtain estimators of model (2.1). In the PLM, the smoothed (local) log-likelihood for the nonparametric function mβ(t) at a point t is given by
with πj(β,mβ(t))=exp(xTjβ+mβ(t))/(1+exp(xTjβ+mβ(t))). Here, κH(t−tj) denotes local kernel weights with a (multidimensional) kernel function κ and a bandwidth matrix H, and mβ(t) is a differentiable function of β for each t. The logarithmic profile likelihood for β can be written as
with πi(β,mβ(ti))=exp(xTiβ+mβ(ti))/(1+exp(xTiβ+mβ(ti))). Abbreviating mi=mβ(ti), the likelihood equations obtained from Eqs (2.2) and (2.3) are as follows
and
where m′i is the partial derivative vector of mβ(ti) with respect to β. Taking the partial derivative of Eq (2.4) with respect to β, we have
The second partial derivative of Eq (2.3) with respect to β is given by
By the iterative Newton-Raphson algorithm [25] (see Eqs (2.5) and (2.6)), the iterative formula of β is given by
where ˜X=X−SX, S is the smoothing matrix with the following elements
˜X is the matrix with rows ˜xTi, ˜xi=xi+m′i, W=diag(πi(β,mi)(1−πi(β,mi)))i=1,⋯,n is a diagonal matrix, y=(y1,y2,…,yn)T, π=(π1,π2,…,πn)T. This yields the iteratively reweighted least squares algorithm [25] for model (2.1). Define the adjusted dependent variable Z=Xβ+m+W−1(y−π) to obtain ˜Z=Z−SZ=˜Xβ+W−1(y−π), where m=(m1,m2,⋯,mn)T.
Here, the resulting estimator of the LPLRM (2.1) is called the LPLE. In summary, the estimation algorithm for model (2.1) is given as follows (a code sketch follows the steps):
Step 1: Give suitable initial values β(0), m(0) for β and m.
Step 2: Repeat steps (a), (b) and (c) until convergence.
(a) Calculate ˜X and ˜Z;
(b) Updating step for β: βnew=(˜XTW˜X)−1˜XTW˜Z;
(c) Updating step for m: mnew=S(Z−Xβnew).
Step 3: Obtain the final estimates ˆβ, ˆmˆβ of β and m.
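To make the above steps concrete, the following minimal numpy sketch implements the backfitting algorithm under stated assumptions: a univariate t, a Gaussian kernel, and a weighted Nadaraya–Watson form for the smoother S (the paper specifies S only through kernel weights); all function names are illustrative.

```python
import numpy as np

def gaussian_kernel(u, h):
    # Gaussian kernel with bandwidth h (t is assumed univariate here).
    return np.exp(-0.5 * (u / h) ** 2)

def lple(X, t, y, h=0.5, max_iter=100, tol=1e-6):
    """Backfitting profile-likelihood sketch for the LPLRM
    logit(pi_i) = x_i^T beta + m(t_i)."""
    n, p = X.shape
    beta, m = np.zeros(p), np.zeros(n)
    for _ in range(max_iter):
        eta = X @ beta + m
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = pi * (1.0 - pi)                      # diagonal of W
        # Assumed smoother: weighted Nadaraya-Watson rows built from
        # kernel weights kappa_H(t_i - t_j) and the current weights w.
        Kmat = gaussian_kernel(t[:, None] - t[None, :], h)
        S = (Kmat * w) / (Kmat * w).sum(axis=1, keepdims=True)
        X_tilde = X - S @ X
        Z = X @ beta + m + (y - pi) / w          # adjusted dependent variable
        Z_tilde = Z - S @ Z
        XtW = X_tilde.T * w                      # X~^T W
        beta_new = np.linalg.solve(XtW @ X_tilde, XtW @ Z_tilde)   # step (b)
        m_new = S @ (Z - X @ beta_new)                             # step (c)
        done = np.max(np.abs(beta_new - beta)) < tol
        beta, m = beta_new, m_new
        if done:
            break
    return beta, m, w
```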
2.2. Evaluation criteria for related estimators of unknown parameters
There are two common evaluation criteria for estimators, namely the asymptotic scalar mean squared error (SMSE) and the asymptotic mean squared error matrix (MSEM). The SMSE is the trace of the MSEM of an estimator ˜β, and they are defined as
where Cov(⋅) is the covariance matrix, Bias(⋅) is the bias vector, and tr(⋅) denotes the trace operator. We denote α=QTβ, Λ=diag(λ1,⋯,λp)=QT(˜XTˆW˜X)Q, where λ1≥λ2≥⋯≥λp≥0 are the ordered eigenvalues of ˜XTˆW˜X, ˆW is the final estimate of W in the above algorithm, Q is the orthogonal matrix whose columns are the eigenvectors of ˜XTˆW˜X, and αj denotes the jth element of QTβ, j=1,2,⋯,p. Since the LPLE is asymptotically unbiased [1], the MSEM and the SMSE of ˆβ are given by
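The spectral quantities Q, Λ and α used throughout the comparisons can be computed directly from the fitted objects; a small sketch (reusing X_tilde and the final weights w from the algorithm above, with illustrative names) follows.

```python
import numpy as np

def spectral_quantities(X_tilde, w, beta_hat):
    # C = X~^T W^ X~ and its eigendecomposition C = Q Lambda Q^T.
    C = (X_tilde.T * w) @ X_tilde
    lam, Q = np.linalg.eigh(C)                 # ascending eigenvalues
    order = np.argsort(lam)[::-1]              # paper orders lambda_1 >= ... >= lambda_p
    lam, Q = lam[order], Q[:, order]
    alpha_hat = Q.T @ beta_hat                 # alpha = Q^T beta
    smse_lple = np.sum(1.0 / lam)              # trace of C^{-1}: SMSE of the LPLE
    return Q, lam, alpha_hat, smse_lple
```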
3. Construction of the proposed estimator
In this section, using the PLM, we introduce a GLTE which includes the LPLE, the ridge estimator, the Liu estimator and the Liu-type estimator as special cases. We retain the notation of Section 2, such as Q,Λ,ˆW,Z,˜Z,˜X,X, without change of meaning. We begin by considering how to combat multicollinearity in the LPLRM: in the presence of multicollinearity, the matrix ˜XTˆW˜X becomes ill-conditioned, which leads to a large variance and instability of the LPLE even though the LPLE is asymptotically unbiased [1]. Inspired by the well-known ridge estimation method (a popular penalized estimation method for dealing with multicollinearity), we attempt to construct an appropriate biased estimator that eliminates the multicollinearity in the LPLRM while the LPLRM is well estimated in accordance with standard statistical criteria. Using the PLM, in the first step we simply extend the existing biased estimators, namely the ridge estimator, the Liu estimator and the Liu-type estimator, to the case of the LPLRM. In the second step, we take a more elaborate approach in which we modify the biasing parameters and the biasing parameter function, and introduce an objective function of β to construct the GLTE.
3.1. Logistic partial linear ridge estimator
Now, based on the LPLE, we adopt the ridge estimator [6] to construct the logistic partial linear ridge estimator (LPLRE) as follows
where k≥0 is the biasing parameter and I is the p×p identity matrix. The estimates of the nonparametric function m are given by
The Bias, Cov, MSEM and SMSE of ˆβR(k) are given by
3.2. Logistic partial linear Liu estimator
The Liu estimator was first proposed by Liu [7] for dealing with the multicollinearity problem in a linear regression model. Here, based on the LPLE, we adopt the Liu estimator to construct the logistic partial linear Liu estimator (LPLLE) as follows
where 0<d<1 is a biasing parameter. In addition, the estimates of nonparametric functions m are given by
The Bias, Cov, MSEM and SMSE of ˆβL(d) are given by
3.3. Logistic partial linear Liu-type estimator
Liu [8] and Inan and Erdogan [13] introduced a Liu-type estimator to modify the Liu estimator when dealing with the multicollinearity problem in the LLRM. Based on the LPLE, we adopt the Liu-type estimator to construct the logistic partial linear Liu-type estimator (LPLLTE) as follows
where k>0 and d∈R are biasing parameters. The estimates of the nonparametric functions m are given by
The MSEM and SMSE of ˆβLT(d,k) are given as follows
3.4. Proposed generalized Liu-type estimator
Kurnaz and Akay [9] and Ertan and Akay [16] introduced new Liu-type estimators for dealing with multicollinearity in the linear regression model and the LLRM, respectively. The new Liu-type estimators are defined as
and
where ˆβ∗ is any estimator of the regression coefficient vector β, k is a biasing parameter and f(k) is a continuous function of biasing parameter k.
Based on the LPLE, we modify the constructions of the new Liu-type estimators ˆβE&A, ˆβK&A to propose a new estimator, called the generalized Liu-type estimator (GLTE), by setting the following objective function
where K=diag(k1,k2,⋯,kp) is the matrix of biasing parameters, F=diag(f(k1),f(k2),⋯,f(kp)) is the matrix of biasing parameter functions, f(kj)=akj+b, kj>0, j=1,⋯,p, and a and b are constants (refer to section 5). Minimizing the function (3.6) with respect to β, we obtain the GLTE as follows
ˆβG(K)=Q(Λ+K)−1(Λ+F)QTˆβ. (3.7)
In addition, the estimates of the nonparametric functions m are given by
Remark 1. In our proposed construction (3.7) of the GLTE, a group of biasing parameters is considered, which is the key difference from the constructions of the new Liu-type estimators by Kurnaz and Akay [9] and Ertan and Akay [16]. We introduce the matrix of biasing parameters and the matrix of biasing parameter functions so that the estimate of each component of the vector β can be adjusted differently.
Remark 2. The GLTE is a more general estimator which includes the other estimators as special cases (a numerical sketch follows the list):
(i) ˆβG(K)=ˆβ, for f(kj)=kj, where a=1 and b=0;
(ii) ˆβG(K)=ˆβR(k), for K=kI, f(kj)=0, where a=0 and b=0;
(iii) ˆβG(K)=ˆβL(d), for K=I, f(kj)=d, where a=0 and b=d;
(iv) ˆβG(K)=ˆβLT(d,k), for K=kI, f(kj)=−d, where a=0 and b=−d.
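A numerical sketch of the GLTE in its spectral form, together with the parameter settings that recover cases (i)–(iv); K_diag and F_diag hold the diagonals of K and F, and the names are illustrative.

```python
import numpy as np

def glte(Q, lam, beta_hat, K_diag, F_diag):
    # beta_G(K) = Q (Lambda + K)^{-1} (Lambda + F) Q^T beta_hat,
    # i.e., componentwise shrinkage of alpha = Q^T beta_hat in the Q basis.
    shrink = (lam + F_diag) / (lam + K_diag)
    return Q @ (shrink * (Q.T @ beta_hat))

# Special cases of Remark 2 (k, d and k_vec chosen by the user):
# (i)   LPLE:   glte(Q, lam, beta_hat, k_vec, k_vec)                        # F = K
# (ii)  LPLRE:  glte(Q, lam, beta_hat, k * np.ones_like(lam), 0 * lam)      # F = 0
# (iii) LPLLE:  glte(Q, lam, beta_hat, np.ones_like(lam), d * np.ones_like(lam))
# (iv)  LPLLTE: glte(Q, lam, beta_hat, k * np.ones_like(lam), -d * np.ones_like(lam))
```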
The Bias, Cov, MSEM and SMSE of ˆβG(K) are given by
4. The superiority of the proposed GLTE
In this section, we compare the performance of the proposed GLTE with that of the LPLE, the ridge estimator, the Liu estimator and the Liu-type estimator under MSEM criterion.
Let ˆβ1 and ˆβ2 be any two estimators of the vector β. From [26], we know that ˆβ2 is superior to ˆβ1 with respect to the MSEM criterion if and only if (iff) MSEM(ˆβ1)−MSEM(ˆβ2) is a positive definite (p.d.) matrix. If MSEM(ˆβ1)−MSEM(ˆβ2) is a nonnegative definite matrix, then SMSE(ˆβ1)−SMSE(ˆβ2)≥0, but the converse is not true. To establish the superiority of the GLTE ˆβG(K), we will use the following lemma.
Lemma 1. (Farebrother [27]) Let A be a p.d. matrix, namely A>0, and let c be a nonzero vector. Then A−ccT is a p.d. matrix iff cTA−1c≤1.
For clarity, we use the following abbreviations: Λ1=Λ+I, Λd=Λ+dI, Λ−d=Λ−dI, Λk=Λ+kI, ΛK=Λ+K, ΛF=Λ+F. The following theorems show the superiority of ˆβG(K) over ˆβ, ˆβR(k), ˆβL(d) and ˆβLT(d,k).
Theorem 1. Let kj>0 and −2λj−kj<f(kj)<kj, j=1,⋯,p. Then MSEM(ˆβ)−MSEM(ˆβG(K))>0 iff
where Bias(ˆβG(K))=QΛ−1K(F−K)α.
Proof. From Eqs (2.7) and (3.8), we can immediately obtain the difference between the MSEM of ˆβ and ˆβG(K) as follows
The matrix Λ−1−Λ−1KΛFΛ−1ΛFΛ−1K is p.d. if (2λj+kj+f(kj))(kj−f(kj))>0, j=1,⋯,p. Since kj>0, j=1,⋯,p, this condition becomes −2λj−kj<f(kj)<kj. By Lemma 1, Theorem 1 is proved.
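The iff condition of Theorem 1 can be checked numerically via Lemma 1; a sketch (using Q, lam and alpha from section 2.2, with the range condition on f(kj) assumed to hold so that the variance difference is p.d.) follows.

```python
import numpy as np

def theorem1_holds(Q, lam, alpha, K_diag, F_diag):
    # Variance difference Lambda^{-1} - Lambda_K^{-1} Lambda_F Lambda^{-1}
    # Lambda_F Lambda_K^{-1}, diagonal in the Q basis.
    var_diff = 1.0 / lam - (lam + F_diag) ** 2 / (lam * (lam + K_diag) ** 2)
    D = Q @ np.diag(var_diff) @ Q.T
    # Bias(beta_G(K)) = Q Lambda_K^{-1} (F - K) alpha.
    bias = Q @ ((F_diag - K_diag) / (lam + K_diag) * alpha)
    # Lemma 1: D - bias bias^T is p.d. iff bias^T D^{-1} bias <= 1.
    return bias @ np.linalg.solve(D, bias) <= 1.0
```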
Theorem 2. Let k>0, kj>0 and −λj−λj(λj+kj)/(λj+k)<f(kj)<−λj+λj(λj+kj)/(λj+k), j=1,⋯,p. Then MSEM(ˆβR(k))−MSEM(ˆβG(K))>0 iff
where Bias(ˆβG(K))=QΛ−1K(F−K)α.
Proof. From Eqs (3.1) and (3.8), we can immediately obtain the difference between the MSEM of ˆβR(k) and ˆβG(K) as follows
Since (Bias(ˆβR(k)))(Bias(ˆβR(k)))T is nonnegative definite, we just need to prove that Q(Λ−1kΛΛ−1k−Λ−1KΛFΛ−1ΛFΛ−1K)QT−(Bias(ˆβG(K)))(Bias(ˆβG(K)))T is a p.d. matrix. The matrix Λ−1kΛΛ−1k−Λ−1KΛFΛ−1ΛFΛ−1K is p.d. if (λj/(λj+k)+(λj+f(kj))/(λj+kj))(λj/(λj+k)−(λj+f(kj))/(λj+kj))>0. Since k>0, kj>0, j=1,⋯,p, this condition becomes −λj−λj(λj+kj)/(λj+k)<f(kj)<−λj+λj(λj+kj)/(λj+k). By Lemma 1, Theorem 2 is proved.
Theorem 3. Let 0<d<1, kj>0 and −λj−(λj+d)(λj+kj)/(λj+1)<f(kj)<−λj+(λj+d)(λj+kj)/(λj+1), j=1,⋯,p. Then MSEM(ˆβL(d))−MSEM(ˆβG(K))>0 iff
where Bias(ˆβG(K))=QΛ−1K(F−K)α.
Proof. From Eqs (3.2) and (3.8), we can immediately obtain the difference between the MSEM of ˆβL(d) and ˆβG(K) as follows
Since (Bias(ˆβL(d)))(Bias(ˆβL(d)))T is nonnegative definite, we just need to prove that Q(Λ−11ΛdΛ−1ΛdΛ−11−Λ−1KΛFΛ−1ΛFΛ−1K)QT−(Bias(ˆβG(K)))(Bias(ˆβG(K)))T is a p.d. matrix. The matrix Λ−11ΛdΛ−1ΛdΛ−11−Λ−1KΛFΛ−1ΛFΛ−1K is p.d. if ((λj+d)/(λj+1)+(λj+f(kj))/(λj+kj))((λj+d)/(λj+1)−(λj+f(kj))/(λj+kj))>0. Since 0<d<1, kj>0, j=1,⋯,p, this condition becomes −λj−(λj+d)(λj+kj)/(λj+1)<f(kj)<−λj+(λj+d)(λj+kj)/(λj+1), j=1,⋯,p. By Lemma 1, Theorem 3 is proved.
Theorem 4. Let k>0, d∈R, kj>0 and −λj−(λj−d)(λj+kj)/(λj+k)<f(kj)<−λj+(λj−d)(λj+kj)/(λj+k), or −λj+(λj−d)(λj+kj)/(λj+k)<f(kj)<−λj−(λj−d)(λj+kj)/(λj+k), j=1,⋯,p. Then MSEM(ˆβLT(d,k))−MSEM(ˆβG(K))>0 iff
where Bias(ˆβG(K))=QΛ−1K(F−K)α.
Proof. From Eqs (3.4) and (3.8), we can immediately obtain the difference between the MSEM of ˆβLT(d,k) and ˆβG(K) as follows
Since (Bias(ˆβLT(d,k)))(Bias(ˆβLT(d,k)))T is nonnegative definite, we just need to prove that Q(Λ−1kΛ−dΛ−1Λ−dΛ−1k−Λ−1KΛFΛ−1ΛFΛ−1K)QT−(Bias(ˆβG(K)))(Bias(ˆβG(K)))T is a p.d. matrix. The matrix Λ−1kΛ−dΛ−1Λ−dΛ−1k−Λ−1KΛFΛ−1ΛFΛ−1K is p.d. if ((λj−d)/(λj+k)+(λj+f(kj))/(λj+kj))((λj−d)/(λj+k)−(λj+f(kj))/(λj+kj))>0. Since k>0, d∈R, kj>0, j=1,⋯,p, this condition becomes −λj−(λj−d)(λj+kj)/(λj+k)<f(kj)<−λj+(λj−d)(λj+kj)/(λj+k), or −λj+(λj−d)(λj+kj)/(λj+k)<f(kj)<−λj−(λj−d)(λj+kj)/(λj+k), j=1,⋯,p. By Lemma 1, Theorem 4 is proved.
5. Determination of the biasing parameters and the biasing parameter function
5.1. Estimators of the biasing parameter function f(kj) and the biasing parameters kj in the GLTE (j=1,⋯,p)
To make the GLTE more effective, it is important to make appropriate choices of the biasing parameter function f(kj) and the biasing parameters kj. Inspired by Kurnaz and Akay [9] and Ertan and Akay [16], we use the minimum SMSE criterion to select the best function f(kj) and the optimal values of kj, j=1,⋯,p.
Note that the SMSE of ˆβG(K) is a nonlinear function of kj, j=1,⋯,p, denoted by
We obtain the partial derivative of g(k1,k2,⋯,kp) with respect to kj, j=1,⋯,p, as follows
After equating Eq (5.1) to 0, we obtain
or
where f′(kj) is the first derivative of f(kj) with respect to kj, j=1,⋯,p.
Remark 3. Our construction Eq (3.7) is able to yield the general form of f(kj) by minimizing the SMSE of ˆβG(K), instead of the two special cases Fact 1 and Fact 2 in Kurnaz and Akay [9] and Ertan and Akay [16]. This is because there is no summation left after taking the partial derivative; see Eq (5.1).
From Eq (5.2), we know that the first derivative f′(kj)=λjα2j/(1+λjα2j), and since both λjα2j/(1+λjα2j) and −λj/(1+λjα2j) are constants, f(kj) is a linear function. In Eq (5.3), when f′(kj)=λjα2j/(1+λjα2j), f(kj) is also a linear function. Thus, Eqs (5.2) and (5.3) can be unified as a linear function f(kj)=akj+b, j=1,⋯,p. Substituting f(kj)=akj+b into Eq (5.1) and setting it equal to 0, we can obtain the optimal value of kj as follows,
where a and b are constants that need to be specified, and ˆαj denotes the jth element of QTˆβ. From Eq (5.2), we know that 0<a<1 and b<0. From Eq (5.3), we know that b=λj(a−1). In practice, we may take a=τ and b=λmin(τ−1), τ∈(0,1). However, in Eq (5.4), ˆkj may be negative. Thus, it is better to use the following estimator
and the estimator of f(kj) is ˆf(kj)=τˆkGj+λmin(τ−1). A more detailed discussion of the constant τ is given in section 6.
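The empirical choices above can be coded directly. In the sketch below, the closed form for ˆkj is derived by us from setting the derivative of the SMSE to zero with f(k)=ak+b (it reproduces the τ→0 and τ→1 limits quoted in section 6), and negative values are truncated at zero in the spirit of Eq (5.5); treat both as assumptions rather than quotations of Eqs (5.4) and (5.5).

```python
import numpy as np

def glte_biasing_parameters(lam, alpha_hat, tau):
    """Empirical choices of section 5.1 with a = tau, b = lambda_min*(tau - 1)."""
    a, b = tau, lam.min() * (tau - 1.0)
    # Stationary point of SMSE_j in k_j for linear f (our derivation, cf. Eq (5.4)).
    k = -(lam + b * (1.0 + lam * alpha_hat ** 2)) / (a + (a - 1.0) * lam * alpha_hat ** 2)
    kG = np.maximum(k, 0.0)             # guard against negative k_j (cf. Eq (5.5))
    f = a * kG + b                      # f(k_j) = tau * k_j + lambda_min * (tau - 1)
    return kG, f
```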
5.2. Estimator of the biasing parameter k in LPLRE
In the LPLRE method, we use three biasing parameter estimators (namely ˆkR1, ˆkR2, ˆkR3) recommended by Kibria et al. [12], which have been verified by simulation studies to perform better than other biasing parameter estimators. These parameter estimators are given as follows
where ˆσ2=(1/(n−p))∑ni=1(yi−ˆπi)2, mj=√(ˆσ2/ˆα2j), and qj=λmax/((n−p)ˆσ2+λmaxˆα2j).
5.3. Estimator of the biasing parameter d in LPLLE
Following Liu [7], the optimal estimator of d is obtained by minimizing Eq (3.3) as follows
It is obvious that ˆdopt<1, but ˆdopt may be negative. Referring to [17], we use the following estimator
5.4. Estimators of biasing parameters d and k in LPLLTE
Based on the method in [13], for the biasing parameters k and d in ˆβLT, we fix k and find the optimal value of d by minimizing Eq (3.5), which gives
with ˆkLT=p/(ˆα′ˆα).
Remark 4. The above-mentioned biasing parameters kj and biasing parameter function f(kj) in the GLTE, as well as the biasing parameter k in the ridge estimator, the biasing parameter d in the Liu estimator, and the biasing parameters k, d in the Liu-type estimator, can also be determined by the generalized cross validation (GCV) criterion (see [28,29]). Based on the GCV, we are able to simultaneously select the optimal biasing parameters of the biased estimators and the bandwidth of the kernel smoother, and thereby obtain biased estimators with good performance.
6. Monte Carlo simulations
In this section, we present simulations to compare the performance of the proposed GLTE with that of the other estimators in the LPLRM. We examine the performance of the different estimators of the linear-part parameters and of the nonparametric function with respect to the sample size (n), the degree of collinearity (r) and the number of explanatory variables in the linear part (p). Following Zeinal [10], Varathan and Wijekoon [15] and Ertan and Akay [16], we generate the explanatory variables as
where r2 denotes the correlation between any two design variables, specified by r=0.70, 0.80, 0.85 and 0.90, and the ξij are independent standard normal pseudo-random numbers. The binary response variable yi is generated from the Bernoulli(πi) distribution with
where β=(0.1,0.6,0.4,0.7)T or β=(0.2,0.3,0.5,0.4,0.5,0.2,0.4)T so that βTβ=1. The nonparametric functions are generated by using the design given in [23] as
The sample sizes are taken as n=200, 300, 400 and 500. The bandwidth vector is computed by Scott's rule of thumb [30] for the Gaussian kernel. The simulation is repeated M=2,000 times with the above setup, and the simulated average scalar mean squared error (ASMSE) of an estimator ˆβ∗ of the linear parametric part is given by
where the subscript l indicates the replication number and ˆβ∗ represents the estimator under each of the various methods. The simulated mean squared error (mse) of the estimator vector ˆm(t) of the nonparametric function m(t) is obtained by using the following equation,
where ||v||22=∑ni=1v2i for v=(v1,⋯,vn)′.
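The data-generating mechanism and the ASMSE can be coded as follows; the generator xij=(1−r2)^(1/2)ξij+rξi,p+1 is the standard construction used in the cited studies (an assumption, since the display is not reproduced here), and the distribution of ti and the function m are placeholders.

```python
import numpy as np

rng = np.random.default_rng(2023)

def make_design(n, p, r):
    # x_ij = sqrt(1 - r^2) * xi_ij + r * xi_{i,p+1}, so that r^2 is
    # (approximately) the correlation between any two columns of X.
    xi = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - r ** 2) * xi[:, :p] + r * xi[:, [p]]

def simulate_once(n, p, r, beta, m_fun):
    X = make_design(n, p, r)
    t = rng.uniform(0.0, 1.0, size=n)        # placeholder design for t
    pi = 1.0 / (1.0 + np.exp(-(X @ beta + m_fun(t))))
    y = rng.binomial(1, pi)
    return X, t, y

def asmse(beta_hats, beta):
    # Average of (beta_hat_l - beta)^T (beta_hat_l - beta) over M replications.
    d = np.asarray(beta_hats) - beta
    return np.mean(np.sum(d * d, axis=1))
```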
6.1. Performance of the LPLE, LPLRE, LPLLE, LPLLTE and GLTE
First, we evaluate the performance of the LPLE, LPLRE, LPLLE, LPLLTE and GLTE. For different combinations of p, r and n, the ASMSE of the estimator ˆβ∗ for each method is shown in Table 1, and the mse of the estimator ˆm(t) for each method is shown in Table 2.
Table 1 clearly reveals that the sample size (n), the degree of collinearity (r) and the number of explanatory variables in the linear part (p) influence the ASMSE of the estimator ˆβ∗ for each method. With the other two of p, r, n fixed, the ASMSE of each estimator decreases as the sample size n increases, and increases as the degree of collinearity (r) or the number of explanatory variables in the linear part (p) increases. It is observed that the ASMSE of the estimator ˆβG is smaller than that of the other estimators for all combinations of p, r, n, which implies that ˆβG performs better than the other estimators.
Table 2 indicates that the sample size (n) and the number of explanatory variables in the linear part (p) influence the mse of the estimator ˆm(t) of the nonparametric function for each method; the pattern of this influence is similar to that in Table 1, though its magnitude is much smaller. Nevertheless, the mse of the nonparametric function estimator under the proposed method is still smaller than that of the other methods, only less markedly so than for the parametric part. The degree of collinearity (r) has a slight effect on the mse of ˆm(t) for each method, but the mse under the proposed method is generally at the lowest level. These results show that the proposed method also has superior performance in estimating the nonparametric function, although this superiority is less pronounced than for the parametric part.
Note that in Tables 1 and 2 the τ values are close to 0, which is not arbitrary; this choice is explained below by simulation results and a simple theoretical derivation.
To study the influence of the value of τ on the ASMSE of ˆβG, the curves of τ (τ∈(0,1)) versus the ASMSE of ˆβG at p=4, n=200, r=0.70 or 0.90; p=7, n=200, r=0.70 or 0.90; and p=7, n=300, r=0.70 or 0.90 are plotted in Figure 1. Figure 1 clearly indicates that the ASMSE of ˆβG for the various combinations is small when τ tends to 0, and reaches a maximum when τ tends to 1, where it is close to the ASMSE of ˆβ (refer to Table 1). The behavior at the two endpoints can also be derived theoretically. From ˆf(kj)=τkj+λmin(τ−1) and Eqs (5.4) and (5.5), it follows that ˆf(kj)→−λmin and ˆkj→(−λj+λmin(1+λjˆα2j))/(−λjˆα2j) as τ→0, while ˆf(kj)→0 and ˆkj→0 as τ→1; thus the ASMSE of ˆβG approaches its minimum value as τ→0, and ˆβG→ˆβ as τ→1. In addition, Figure 1 also implies that the ASMSE of ˆβG varies greatly for large p and large r. Since large p and large r are more likely to produce multicollinearity, this indirectly reflects that the proposed GLTE ˆβG has a significant ability to eliminate multicollinearity.
6.2. Performance of GLTE and unmodified methods
In this subsection, we directly apply the estimator of [16] to the LPLRM in the simulations and obtain the estimators of the unmodified method (UM) as follows
We choose k=ˆkR1 (case II) and the seven f(k) functions (I, II, III, IV, V, VI, VII) in [16], and simulate with the settings given at the beginning of section 6. The simulation results are presented in Tables 3 and 4, together with those of the GLTE and LPLE in subsection 6.1.
From the results shown in Tables 3 and 4 for the estimation of the parameters and the nonparametric function, we can see that the GLTE method performs better than the UM method under all seven function types. Compared with the LPLE, both the GLTE and UM yield a large improvement in the estimation of the parameters and a somewhat smaller improvement in the estimation of the nonparametric function. Also, as the sample size n increases with r and p fixed, the ASMSE and mse values of the LPLE, GLTE and UM methods under the various function types generally decrease. Similarly, when n and p are fixed and r increases, the ASMSE and mse values generally increase for all methods.
7. An application to the Indian Liver Patient dataset
To motivate the multicollinearity problem in the LPLRM, we consider the Indian Liver Patient dataset from the UCI Repository of Machine Learning Databases [31]. The dataset contains 416 records of liver patients and 167 records of non-liver patients from the northeast of Andhra Pradesh, India. We consider the following eight variables: the binomial response variable y, taking the value 1 if the patient has liver disease and 0 otherwise; and the explanatory variables age of the patient (Age), total bilirubin (TB), direct bilirubin (DB), serum glutamate-pyruvate transaminase (SGPT), serum glutamate-oxaloacetate transaminase (SGOT), total proteins (TP) and albumin (ALB). Based on the study of Hartatik et al. [32], there are strong correlations between TB and DB and between SGPT and SGOT, but only a weak correlation between Age and the other variables. Since we are more interested in the curve of the effect of Age on the log-odds of liver disease, we set Age as the nonparametric variable and include the other six explanatory variables in the linear part to establish the following LPLRM:
where pi=P(yi=1) is the probability that the ith individual has liver disease. After the final iteration of the procedure in section 2, the eigenvalues of the matrix ˜XTˆW˜X are λ1=189342.42, λ2=32497.93, λ3=276.57, λ4=153.35, λ5=16.40, λ6=11.18. Thus the condition number is κ=√(λmax/λmin)=130.1194, showing that there is a multicollinearity problem. For this real dataset, the parameter estimates and the SMSE values of the parametric estimators of model (7.1) under the various methods are given in Table 5. The SMSE is used instead of the ASMSE because the actual parameter values (β) are unknown in real data modeling. The results in Table 5 reveal that the presence of multicollinearity affects the parameter estimates and that the SMSE of the proposed estimator ˆβG is smaller than that of the other estimators. Also, all the theoretical conditions in section 4 can be verified on this dataset when τ takes a very small value (τ=0.01) (see Table 6). The estimates of the nonparametric function under the various methods are plotted in Figure 2. Figure 2 shows that the difference between the nonparametric curve ˆmβG(Age) of the proposed method and the nonparametric curve ˆmβ(Age) of the PLM is the largest, although their shapes are basically similar, while the nonparametric curves of the other methods are very close to the PLM curve. Combined with the results in Table 2, this implies that the nonparametric function estimator of the proposed GLTE method outperforms those of the other methods.
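For reference, the collinearity diagnostic used here is simply the condition number of ˜XTˆW˜X after the final iteration; a short sketch (names illustrative):

```python
import numpy as np

def condition_number(X_tilde, w):
    # kappa = sqrt(lambda_max / lambda_min); values of order 10^2 or more,
    # such as the 130.12 found here, indicate serious multicollinearity.
    lam = np.linalg.eigvalsh((X_tilde.T * w) @ X_tilde)
    return np.sqrt(lam.max() / lam.min())
```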
8. Conclusions
In this paper, we propose a GLTE to combat multicollinearity in the linear part of an LPLRM. The GLTE is a more general estimator which includes the other estimators as special cases. Under certain theoretical conditions, the performance of the GLTE is superior to that of the LPLE, the ridge estimator, the Liu estimator and the Liu-type estimator. The optimal choices of the biasing parameters and of the biasing parameter function are also derived, and empirical choices are suggested. Monte Carlo simulations show that the finite-sample performance of the proposed GLTE is better than that of the other estimators; the superior performance of the GLTE is obtained when the empirical parameter τ takes a very small value (τ→0 with τ≠0). Finally, the estimators are applied to the Indian Liver Patient dataset, and the results obtained are consistent with the simulation study.
Acknowledgments
The research in this article was supported by the National Natural Science Foundation of China under grant No. 61973096. This financial support is greatly appreciated.
Conflict of interest
The authors declare there is no conflict of interest.