Increase statistical reliability without losing predictive power by merging classes and adding variables

  • Published: 01 October 2016
  • It is usually true that adding explanatory variables to a probability model increases the degree of association but risks losing statistical reliability. In this article, we propose an approach that merges classes within the categorical explanatory variables before each addition, so as to maintain statistical reliability while increasing predictive power step by step.

    Citation: Wenxue Huang, Xiaofeng Li, Yuanyi Pan. Increase statistical reliability without losing predictive power by merging classes and adding variables[J]. Big Data and Information Analytics, 2016, 1(4): 341-348. doi: 10.3934/bdia.2016014
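
    The trade-off described in the abstract can be illustrated with a small sketch. It assumes the association degree is measured with the Goodman-Kruskal τ (the abstract itself does not name the measure), it uses made-up counts, and `gk_tau` and `merge_rows` are hypothetical helper names rather than the authors' code. Merging two classes of the explanatory variable raises the smallest cell count, used here as a rough proxy for reliability, and cannot raise τ.

```python
def gk_tau(table):
    """Goodman-Kruskal tau for predicting the column variable Y from the row variable X.

    `table` is a list of rows of non-negative counts (rows = classes of X).
    Assumes Y has at least two non-empty classes, so the denominator is non-zero.
    """
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Sum over y of p(y)^2: expected accuracy of proportional prediction without X.
    base = sum((c / n) ** 2 for c in col_totals)
    # Sum over x, y of p(x, y)^2 / p(x): expected accuracy when X is known.
    cond = sum(
        (cell / n) ** 2 / (sum(row) / n)
        for row in table if sum(row) > 0
        for cell in row
    )
    return (cond - base) / (1 - base)


def merge_rows(table, i, j):
    """Return a new table in which classes i and j of X are merged into one."""
    merged = [a + b for a, b in zip(table[i], table[j])]
    rest = [row for k, row in enumerate(table) if k not in (i, j)]
    return rest + [merged]


# Made-up counts: three classes of X (rows) against two classes of Y (columns).
counts = [[40, 10], [3, 2], [12, 33]]
merged = merge_rows(counts, 0, 1)

tau_before = gk_tau(counts)
tau_after = gk_tau(merged)
min_cell_before = min(min(row) for row in counts)
min_cell_after = min(min(row) for row in merged)

print(f"tau before merge: {tau_before:.4f}, smallest cell: {min_cell_before}")
print(f"tau after merge:  {tau_after:.4f}, smallest cell: {min_cell_after}")
```

    In this toy table the merge lowers τ only slightly while removing the sparsest cells, which is the kind of exchange the proposed approach makes before a further variable is added.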

  • © 2016 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)