Regression analysis frequently encounters two issues: multicollinearity among the explanatory variables and the presence of outliers in the data set. In the semiparametric regression model, multicollinearity inflates the variance of the ordinary least-squares estimator; it can also produce wide confidence intervals for the individual parameters and even yield estimates with the wrong signs. On the other hand, as is well known, the ordinary least-squares estimator is extremely sensitive to outliers and may be completely corrupted by even a single outlier in the data. To address these drawbacks of the least-squares method, a robust Liu estimator based on the least trimmed squares (LTS) method is introduced for the regression parameters under linear restrictions on the whole parameter space of the linear part of a semiparametric model. Since the covariance matrix of the error terms is usually unknown in practice, feasible forms of the proposed estimators are also developed, and their asymptotic distributional properties are derived. Moreover, necessary and sufficient conditions for the superiority of the Liu-type estimators over their counterparts, in terms of choosing the Liu biasing parameter d, are obtained. The performance of the feasible robust Liu estimators is compared with that of the classical ones in constrained semiparametric regression models through extensive Monte Carlo simulation experiments and a real data example.
Citation: W. B. Altukhaes, M. Roozbeh, N. A. Mohamed. Feasible robust Liu estimator to combat outliers and multicollinearity effects in restricted semiparametric regression model[J]. AIMS Mathematics, 2024, 9(11): 31581-31606. doi: 10.3934/math.20241519
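The abstract couples least trimmed squares with Liu shrinkage to handle outliers and multicollinearity at the same time. As a rough illustration of that idea only, the Python sketch below approximates an LTS fit by a crude random-subset search and then applies the Liu formula β_d = (X'X + I)⁻¹(X'X + dI)β̂_LTS to the robust coefficients in a plain linear model; it ignores the nonparametric component, the linear restrictions, and the feasible covariance estimation treated in the paper, and the function name, the search scheme, and the default d are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def lts_liu(X, y, d=0.5, h_frac=0.75, n_starts=200, seed=0):
    """Sketch of a Liu-type estimator built on a least trimmed squares (LTS) fit.

    Step 1: approximate the LTS solution (coefficients minimizing the sum of the
            h smallest squared residuals) by a crude random-subset search with
            one concentration step; this is NOT the FAST-LTS algorithm.
    Step 2: shrink the robust fit with the Liu formula
            beta_d = (X'X + I)^{-1} (X'X + d I) beta_LTS,   0 < d < 1.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = int(h_frac * n)                                     # observations kept by LTS

    best_beta, best_obj = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)          # elemental subset
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        keep = np.argsort((y - X @ beta) ** 2)[:h]          # concentration step
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()        # trimmed objective
        if obj < best_obj:
            best_obj, best_beta = obj, beta

    # Liu-type shrinkage of the robust coefficients
    XtX = X.T @ X
    beta_liu = np.linalg.solve(XtX + np.eye(p), (XtX + d * np.eye(p)) @ best_beta)
    return best_beta, beta_liu


if __name__ == "__main__":
    # toy data with two nearly collinear columns and a few gross outliers
    rng = np.random.default_rng(1)
    n = 100
    x1 = rng.normal(size=n)
    X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n), rng.normal(size=n)])
    y = X @ np.array([2.0, 1.0, -1.0]) + 0.5 * rng.normal(size=n)
    y[:5] += 20                                             # contaminate with outliers
    beta_lts, beta_liu = lts_liu(X, y, d=0.7)
    print("LTS fit:", np.round(beta_lts, 2), "Liu-shrunk:", np.round(beta_liu, 2))
```

In the paper itself, the shrinkage is applied after the nonparametric part has been profiled out and the linear restrictions imposed, and d is chosen via the derived superiority conditions; the sketch only conveys the two-stage robust-then-shrink structure.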