The Unit Folded Normal Distribution: A New Unit Probability Distribution with the Estimation Procedures, Quantile Regression Modeling and Educational Attainment Applications
DOI:
https://doi.org/10.13052/jrss0974-8024.15111
Keywords:
Better life index, educational attainment, hyperbolic tangent function, normal distribution, point estimates, OECD data sets, quantile regression, unit distribution
Abstract
In this paper, we develop a continuous distribution on the unit interval, defined as the distribution of the absolute value of the hyperbolic tangent of a normally distributed random variable. The study is motivated by the lack of research on whether hyperbolic transformations can yield flexible distributions on the unit interval. First, we study the proposed distribution theoretically and discuss the properties that matter for modeling. In particular, we show that it accommodates various levels of skewness and kurtosis. We then investigate several estimation methods for its parameters and evaluate their performance through two simulation studies. Subsequently, we develop the quantile regression model derived from the proposed distribution. Two real-world data applications are provided. The first is the univariate modeling of the percentage of educational attainment of some countries, which is one indicator of the education topic of the Better Life Index (BLI) of the Organisation for Economic Co-operation and Development (OECD) countries. The second uses median quantile regression to explain the relationship between the percentage of educational attainment of some countries and one indicator from each of the work-life balance, safety, and health topics of the BLI. For the data sets considered, the proposed distribution and quantile regression model fit better than competing models under some comparison criteria. The results also indicate that the covariates are statistically significant for the median response at all conventional significance levels.
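The construction described in the abstract can be illustrated with a minimal sketch. It assumes the underlying normal variable is parameterized by a mean `mu` and a standard deviation `sigma`; these names, the function names, and the parameterization are illustrative assumptions and may differ from the paper's own notation. Under that assumption, if X = |tanh(Y)| with Y ~ N(mu, sigma^2), then P(X <= x) = P(-atanh(x) <= Y <= atanh(x)), and differentiating with respect to x gives the density used below.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for Phi and phi


def ufn_cdf(x, mu=0.0, sigma=1.0):
    """CDF of X = |tanh(Y)| with Y ~ N(mu, sigma^2), for 0 < x < 1."""
    a = math.atanh(x)  # |tanh(Y)| <= x  <=>  -a <= Y <= a
    return _STD.cdf((a - mu) / sigma) - _STD.cdf((-a - mu) / sigma)


def ufn_pdf(x, mu=0.0, sigma=1.0):
    """Density obtained by differentiating ufn_cdf; d/dx atanh(x) = 1 / (1 - x^2)."""
    a = math.atanh(x)
    kernel = _STD.pdf((a - mu) / sigma) + _STD.pdf((a + mu) / sigma)
    return kernel / (sigma * (1.0 - x * x))


def ufn_sample(n, mu=0.0, sigma=1.0, seed=None):
    """Draw n values by transforming normal draws (hypothetical helper)."""
    rng = random.Random(seed)
    return [abs(math.tanh(rng.gauss(mu, sigma))) for _ in range(n)]
```

A quick sanity check is to compare the empirical proportion of `ufn_sample` draws below some threshold with `ufn_cdf` at that threshold; the two should agree up to sampling error.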