EXACT DISTRIBUTION OF HAT VALUES AND IDENTIFICATION OF LEVERAGE POINTS

Authors

  • G.S. David Sam Jayakumar Jamal Institute of Management, Tiruchirappalli – 620 020, India
  • A. Sulthan Jamal Institute of Management, Tiruchirappalli – 620 020, India

Keywords:

Centered Hat Values, Hat Matrix, Beta-Distribution, Moments, Leverage Points, Outliers, X-Space.

Abstract

This paper proposes the exact distribution of the centered hat values of the hat matrix of predictors in multiple linear regression analysis. Using the relationship between the centered hat values and the F-ratio given by Belsley et al. (1980), the authors show that the derived density function of the centered hat values follows a Beta distribution B(p-1, n-p), with values in the range 1/n <= h <= 1. The first two moments of the distribution are derived, and the upper and lower limits of the centered hat values are established. The shape of the density function is visualized, and percentage points of the centered hat values are computed at the 5% and 1% significance levels for different sample sizes and numbers of predictors. Finally, the authors propose two approaches. The first identifies leverage points in the X-space on the basis of a test of significance; the second scrutinizes leverage points as well as outliers. Both approaches are illustrated numerically, and the results are compared with those of the traditional approach.
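The link between hat values and the F-ratio can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the commonly quoted form of the Belsley et al. relationship, F = (n-p)(h - 1/n) / ((p-1)(1-h)) with p counting the intercept, and inverts it to turn an F critical value into a cutoff for h. The function names, the simulated data, and the tabulated critical value are all assumptions made for illustration.

```python
import numpy as np

def hat_values(X):
    """Diagonal of the hat matrix for predictors X (n x k), with an intercept added."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])   # design matrix including the intercept
    Q, _ = np.linalg.qr(Z)                 # thin QR; H = Q Q^T if Z has full column rank
    return np.sum(Q**2, axis=1)            # h_i = squared norm of row i of Q

def f_ratio(h, n, p):
    """Assumed F-ratio form: (n-p)(h - 1/n) / ((p-1)(1-h))."""
    h = np.asarray(h, dtype=float)
    return (n - p) * (h - 1.0 / n) / ((p - 1) * (1.0 - h))

def hat_cutoff(f_crit, n, p):
    """Invert the F-ratio: the hat value whose F-ratio equals f_crit."""
    k = (p - 1) / (n - p)
    return (1.0 / n + k * f_crit) / (1.0 + k * f_crit)

# Hypothetical data: 30 observations, 3 Gaussian predictors, one planted extreme row.
rng = np.random.default_rng(0)
n, k = 30, 3
X = rng.normal(size=(n, k))
X[0] = [6.0, -6.0, 6.0]
p = k + 1                                  # number of parameters, intercept included

h = hat_values(X)
f_crit = 2.98   # assumption: approximate upper 5% point of F(3, 26) read from a standard table
cutoff = hat_cutoff(f_crit, n, p)
flagged = np.where(h > cutoff)[0]          # indices of candidate leverage points
print(flagged, round(float(cutoff), 4))
```

The hat values computed this way sum to p and each lies between 1/n and 1, which matches the limits stated in the abstract. The cutoff depends only on n, p, and the chosen critical value, so it can be tabulated in advance for different sample sizes and numbers of predictors.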

References

Belsley, D. A., Kuh, E. and Welsch, R. E. (2005). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, Wiley-Interscience, Vol. 571.

Chatterjee, S. and Hadi, A. S. (2009). Sensitivity Analysis in Linear Regression, Wiley, Vol. 327.

Chave, A. D. and Thomson, D. J. (2003). A bounded influence regression estimator based on the statistics of the hat matrix, J. Roy. Statist. Soc. C, (3), p. 307-322.

Diaz-Garcia, J. A. and Gonzalez-Farias, G. (2004). A note on the Cook's distance, J. Statist. Plan. Inference, 120, p. 119-136.

Dodge, Y. and Hadi, A. S. (1999). Simple graphs and bounds for the elements of the hat matrix, J. Appl. Statist., 26(7), p. 817-823.

Handschin, E., Schweppe, F. C., Kohlas, J. and Fiechter, A. (1975). Bad data analysis for power system state estimation, Power Apparatus and Systems, IEEE Transact., 94(2), p. 329-337.

Hoaglin, D. C. and Welsch, R. E. (1978). The hat matrix in regression and ANOVA, The Amer. Statist., 32(1), p. 17-22.

Huang, Y., Kuo, M. and Wang, T. (2007). Pair-perturbation influence functions and local influence in PCA, Comp. Statist. and Data Analysis, 51, p. -5899.

Krasker, W. S. and Welsch, R. E. (1982). Efficient bounded-influence regression estimation, J. Amer. Statist. Assoc., 77, p. 595-604.

Mallows, C. L. (1975). On some topics in robustness, Unpublished memorandum, Bell Telephone Laboratories, Murray Hill, NJ.

Prendergast, L. A. (2005). Influence functions for sliced inverse regression, Scandinavian J. Statist., 32, p. 385-404.

Prendergast, L. A. (2006). Detecting influential observations in Sliced Inverse Regression analysis, Austral. and New Zealand J. Statist., 48, p. 285-304.

Pynnonen, S. (2010). Joint distribution of a linear transformation of OLS regression residuals with general spherical error distribution, Working Paper, Department of Mathematics and Statistics, University of Vaasa.

Ullah, M. A. and Pasha, G. R. (2009). The origin and development of influence measures in regression, Pak. J. Statist., 25(3), p. 295-307.

Published

2014-06-02

How to Cite

Jayakumar, G. D. S., & Sulthan, A. (2014). Exact Distribution of Hat Values and Identification of Leverage Points. Journal of Reliability and Statistical Studies, 7(01), 61–78. Retrieved from https://journals.riverpublishers.com/index.php/JRSS/article/view/21319

Section

Articles