Within the framework of generalized linear models (GLMs), this paper explores the design and applicability of partial residual (PRES), augmented partial residual (APRES), and conditional expectation and residuals (CERES) plots for visualizing outlier diagnostics as a function of selected variables. Geometric regression is described in detail as a GLM, and PRES, APRES, and CERES plots are constructed for it. Because the response variable and its link function interact differently with different covariates, the effectiveness of these plots in giving a clear visual impression may vary. The methods are applied to cervical cancer data to identify trends for effective modelling. A comparison of the power of the tests across the plots shows that PRES, CERES (L), and CERES (K) perform best for outlier diagnostics. Based on the power of the residual plots, their use is recommended for outlier diagnostics alongside conventional tests.
Citation: Zawar Hussain, Atif Akbar, Mohammed M. A. Almazah, A. Y. Al-Rezami, Fuad S. Al-Duais. Diagnostic power of some graphical methods in geometric regression model addressing cervical cancer data[J]. AIMS Mathematics, 2024, 9(2): 4057-4075. doi: 10.3934/math.2024198