HETEROSCEDASTICITY IN SURVEY DATA AND MODEL SELECTION BASED ON WEIGHTED HANNAN-QUINN INFORMATION CRITERION
Keywords:
Hannan-Quinn Information Criterion, Weighted Hannan-Quinn Information Criterion, Differential Entropy, Log-Quantiles, Variance To Mean Ratio.
Abstract
This paper attempts a weighted version of the Hannan-Quinn information criterion (WHQIC) for selecting the best model from several competing models when heteroscedasticity is present in survey data. The authors found that the information loss between the true model and the fitted models is weighted equally rather than unequally. The weights are computed purely from the differential entropy of each sample observation, and the traditional Hannan-Quinn information criterion is penalized by a weight function comprising the inverse variance-to-mean ratio (VMR) of the fitted log-quantiles. The WHQIC is explained in two versions, homogeneous and heterogeneous, according to the nature of the estimated error variances of the model. The WHQIC makes visible a transition in model selection and leads to a logical statistical treatment for selecting the best model. Finally, the procedure is illustrated numerically by fitting 12 different types of stepwise regression models based on 44 independent variables in a bank service quality (BSQ) study.
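For orientation, the classical (unweighted) Hannan-Quinn criterion that the abstract takes as its starting point is usually written HQIC = -2 ln L + 2 k ln(ln n), where L is the maximised likelihood, k the number of estimated parameters and n the sample size. The sketch below computes only this classical form for Gaussian linear models and picks the candidate with the smallest value. It does not implement the entropy-based weights or the VMR penalty of the WHQIC, whose formulas are not given in the abstract. The function names, the synthetic data and the convention of counting the error variance in k are assumptions made for illustration.

```python
import numpy as np


def gaussian_loglik(y, X):
    """Maximised Gaussian log-likelihood of a least-squares fit of y on X."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    return -0.5 * n * (np.log(2.0 * np.pi) + np.log(sigma2) + 1.0)


def hqic(y, X):
    """Classical HQIC = -2 ln L + 2 k ln(ln n); no weighting is applied."""
    n, p = X.shape
    k = p + 1  # assumed convention: coefficients plus the error variance
    return -2.0 * gaussian_loglik(y, X) + 2.0 * k * np.log(np.log(n))


# Synthetic data (illustrative only): y depends on x1 but not on x2.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)

# Hypothetical candidate models, each given as a design matrix.
candidates = {
    "intercept only": np.ones((n, 1)),
    "x1": np.column_stack([np.ones(n), x1]),
    "x1 + x2": np.column_stack([np.ones(n), x1, x2]),
}

scores = {name: hqic(y, X) for name, X in candidates.items()}
best = min(scores, key=scores.get)  # smallest HQIC is preferred
print(best, scores)
```

A weighted variant along the lines described in the abstract would replace the equal treatment of observations in this computation with observation-specific weights; the exact form would have to be taken from the full paper.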