Estimation of Finite Population Variance Under Stratified Sampling Technique
Uzma Yasmeen1,* and Muhammad Noor-ul-Amin2
1Department of Statistics and Actuarial Sciences, University of Waterloo, Canada
1Institute of Molecular Biology and Biotechnology/Centre for Research in Molecular Medicine, The University of Lahore, Pakistan
2Department of Statistics, COMSATS University Islamabad, Lahore Campus, Pakistan
E-mail: uzmayasmeen15@gmail.com
*Corresponding Author
Received 18 September 2021; Accepted 04 October 2021; Publication 03 November 2021
The efficiency of an estimator of the study variable can be improved by incorporating information from known auxiliary variables. Two techniques, ratio and regression estimation, are commonly used with auxiliary information to increase the precision of estimators. For a very heterogeneous population, it may be impossible to obtain a sufficiently accurate and precise estimate by drawing a simple random sample from the complete population. Sampling problems may also differ significantly across different parts of the population. For example, the population under study may consist of people living in apartments, their own homes, hospitals and prisons, or of people living in plain and hill regions. In such situations, stratified sampling is one of the most commonly used approaches in survey sampling for obtaining a representative sample from different cross-sections of the population. The present study proposes generalized variance estimators for the finite population variance under a stratified sampling scheme, using information on single and two transformed auxiliary variables. Expressions for the bias and mean square error (MSE) of the proposed exponential-type estimators are obtained, together with the conditions under which the proposed estimators are better than the usual estimator. An empirical and simulation study is conducted to demonstrate the superiority of the recommended estimators.
Keywords: Exponential estimator, stratified sampling, auxiliary variables, relative efficiency.
Auxiliary (ancillary) information, whether univariate or multivariate, plays a vital role at the planning, design, selection and estimation stages of survey sampling, and must be used efficiently to obtain accurate estimates. Appropriate auxiliary information can substantially reduce the mean squared error (MSE) of ratio and regression estimators. Variation is an inherent phenomenon of nature, and a sound understanding of it leads to better results in many fields. Fields such as biology, medicine and genetics face the problem of estimating the finite population variance. In agriculture, for example, a grower needs adequate information on climatic variation to devise a suitable strategy for cultivating a crop. Several statisticians have used auxiliary variates to estimate the finite population variance, beginning with Das and Tripathi (1978) and followed by Srivastava and Jhajj (1980), Isaki (1983), R. K. Singh (1983), S. Singh and Joarder (1998), R. Singh, Chauhan, Sawan, and Smarandache (2011), and Tailor and Sharma (2012), among others. Later, Gupta and Shabbir (2007, 2008), Yadav et al. (2015), Yasmeen et al. (2018), Yasmeen et al. (2018) and Noor-ul-Amin et al. (2018) considered improved estimation of the population variance under different sampling procedures.
In survey sampling, many studies rely on simple random sampling (SRS), which presumes a homogeneous population. In practice, the population may consist of heterogeneous units that can be divided into homogeneous subgroups called strata. For example, in a socio-economic survey the respondents may live in urban zones, rural localities, hostels, ordinary households, jails or hospitals. In such a situation one should study the population carefully according to the characteristics of its regions and then apply the sampling scheme independently within each stratum. Stratified random sampling is then a more appropriate probability sampling method and is used to increase the precision of estimation.
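The idea of sampling independently within each stratum can be sketched in a few lines of Python. This is a minimal illustration only: the function name `stratified_srswor`, the stratum labels and the data values are hypothetical and do not come from the paper.

```python
import random

def stratified_srswor(strata, sample_sizes, seed=None):
    """Draw an independent SRSWOR sample from each stratum.

    strata: dict mapping a stratum label to a list of unit values.
    sample_sizes: dict mapping the same labels to the sample size n_h.
    """
    rng = random.Random(seed)
    samples = {}
    for label, units in strata.items():
        n_h = sample_sizes[label]
        if n_h > len(units):
            raise ValueError(f"n_h={n_h} exceeds stratum size {len(units)}")
        # random.sample draws without replacement, i.e. SRSWOR within the stratum
        samples[label] = rng.sample(units, n_h)
    return samples

# Hypothetical population: two strata with clearly different levels of y
population = {
    "urban": [52, 61, 58, 70, 66, 49, 73, 64],
    "rural": [21, 18, 25, 30, 27, 19],
}
sample = stratified_srswor(population, {"urban": 4, "rural": 3}, seed=1)
```

Because each stratum is sampled separately, every stratum is guaranteed to be represented, which a single simple random sample from the pooled population cannot ensure.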
At the estimation stage, several authors have used auxiliary information to estimate finite population parameters such as the mean, variance, standard deviation and proportion. Various statisticians have proposed estimators that use known population parameters of an auxiliary variable under stratified sampling, for instance Shabbir and Gupta (2005) and Khoshnevisan et al. (2007). Singh et al. (2008) proposed a class of estimators using a power transformation, based on the estimators developed by Kadilar and Cingi (2003). Diana (1993) proposed a class of estimators of the finite population mean using a single auxiliary variable in stratified sampling. Moreover, Singh and Vishwakarma (2008) suggested a family of estimators using a transformation in stratified random sampling. Yasmeen et al. (2015, 2016) suggested exponential estimators of the finite population mean using transformed auxiliary variables.
The main objective of this paper is to propose three variance estimators under stratified sampling for estimating the finite population variance: a generalized ratio estimator using a single auxiliary variate, a ratio-cum-exponential estimator using two auxiliary variates, and a generalized dual to ratio-cum-exponential estimator using two transformed auxiliary variates. The rest of the paper is organized as follows. The suggested estimators, which use auxiliary information such as the midrange, kurtosis, median, tri-mean, coefficient of correlation, coefficient of variation, coefficient of skewness and quartile deviation under stratified random sampling, are presented in Section 2. The conditions under which the recommended estimators perform better than the existing estimators are derived in Section 3. The performance of the recommended and existing estimators is evaluated on population data in Section 4, and the key findings and conclusions are presented in Section 5.
Consider a finite population of size $N$ divided into $L$ non-overlapping strata of sizes $N_h$, $h = 1, 2, \ldots, L$, such that $\sum_{h=1}^{L} N_h = N$. Let $y_{hi}$ be the value of the response variate ($y$) and $x_{hi}$ the value of the auxiliary variate ($x$) for the $i$th unit of the population in the $h$th stratum. Suppose a sample of size $n_h$ is drawn from the $h$th stratum by simple random sampling without replacement (SRSWOR). Let the stratified population and sample means be $\bar{Y} = \sum_{h=1}^{L} W_h \bar{Y}_h$ and $\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h$, respectively, where $W_h = N_h/N$. All notation is defined in Appendix (A).
The existing usual variance estimator is
(1) |
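As an illustration only, the sketch below computes a standard textbook form of the usual estimator of the variance of the stratified sample mean, $\sum_h W_h^2 s_{yh}^2 / n_h$ with $W_h = N_h/N$, ignoring the finite population correction as Appendix (A) does. It is an assumption that this matches the paper's Equation (1), whose display is not reproduced here; the function name and the example data are hypothetical.

```python
from statistics import variance

def usual_variance_estimator(samples, stratum_sizes):
    """Assumed form: sum over strata of W_h^2 * s_yh^2 / n_h (fpc ignored).

    samples: dict mapping stratum label -> list of sampled y values (n_h >= 2).
    stratum_sizes: dict mapping stratum label -> stratum size N_h.
    """
    N = sum(stratum_sizes.values())
    total = 0.0
    for label, y in samples.items():
        w_h = stratum_sizes[label] / N   # stratum weight W_h = N_h / N
        n_h = len(y)
        s2_yh = variance(y)              # sample variance with divisor n_h - 1
        total += w_h ** 2 * s2_yh / n_h
    return total

# Hypothetical samples from two strata of sizes 30 and 70
estimate = usual_variance_estimator(
    {"a": [1, 2, 3], "b": [2, 4, 6, 8]},
    {"a": 30, "b": 70},
)
```

The ratio-type and exponential-type estimators discussed in the paper modify this baseline with auxiliary information; they are not implemented here.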
The existing ratio type estimator suggested by Shabbir and Gupta (2010) is
(2) |
The notation for the existing estimator in Equation (2) is defined in Appendix (B).
Motivated by Kadilar and Cingi (2006), Subramani and Kumarapandiyan (2012a, 2012b, 2012c) and Jeelani et al. (2013), we develop three ratio-cum-exponential type estimators of the finite population variance using the midrange, kurtosis, median, tri-mean, coefficient of correlation, coefficient of variation, coefficient of skewness and quartile deviation.
Three variance estimators are developed under stratified sampling to estimate the finite population variance: a generalized ratio estimator using a single auxiliary variate, a ratio-cum-exponential estimator using two auxiliary variates, and a generalized ratio-cum-exponential estimator using transformed auxiliary variates. The proposed ratio estimator using the information of a single auxiliary variate under stratified sampling is given by
(3) |
where and are the parameters of the auxiliary variate, such as the midrange, kurtosis, median, tri-mean, coefficient of correlation, coefficient of variation, coefficient of skewness and quartile deviation. The ratio-cum-exponential estimator using the information of two auxiliary variates under stratified sampling is given by
(4) |
The generalized dual to ratio-cum-exponential estimator utilizing the two transformed auxiliary variables is given by
(5) |
The notation for the proposed estimator (3) is defined in Appendix (A). The proposed estimator (3) is given by
(6) |
After simplification, we have
or
(7) |
where
The different choices of and are given in Appendix (C).
Expanding and neglecting the higher order terms, we obtain
(8) |
Taking expectations, the approximate expression of the bias is
To develop the approximate expression of the mean square error (MSE) of , we again use (8) and, ignoring terms of power two and greater, obtain
Taking expectations on both sides, the approximate expression of the mean square error (MSE) is given by
(11) |
In a similar way, we may derive the approximate expressions of the biases of the proposed estimators in Equations (3) and (4), given as Equations (12) and (13), respectively.
After simplification, the approximate bias of the proposed estimator is
(12) |
After simplification, the approximate expression of the bias of the proposed estimator is given by
(13) |
Following similar steps, we may derive the approximate expressions of the mean square error (MSE) of the proposed estimators in Equations (3) and (4), given as Equations (14) and (15), respectively.
The approximate mean square error of is given by
(14) |
The approximate expression of the minimum mean square error (MSE) of , after substituting the values of and , is
(15) |
The values of and are mentioned in Appendix (B).
If and , then the form of recommended estimator of Equation (5) is
(16) |
If and , then the form of recommended estimator of Equation (5) is
(17) |
If and then the form of recommended estimator of Equation (5) is
(18) |
If and then the form of recommended estimator of Equation (5) is
(19) |
If and then the form of recommended estimator of Equation (5) is
(20) |
If and then the form of recommended estimator of Equation (5) is
(21) |
If and then the form of recommended estimator of Equation (5) is
(22) |
A simulation study is conducted to assess the efficiency of the recommended estimators, which use a single auxiliary variate, two auxiliary variates and two transformed auxiliary variates under stratified sampling. The performance of the recommended estimators is assessed on a real population. The percentage relative efficiency (PRE) is used to compare the recommended estimators with the existing estimators. We use a population of 80 values taken from Murthy (1967, p. 228). The simulation is carried out in R. In Table 1, the PREs are calculated using the following formula
(23) | ||
where are the existing estimators and , are the developed estimators used in the PRE formula (23). The same population is used throughout the simulation, whose purpose is to measure the efficiency of the developed estimators relative to the existing estimators. The simulation is carried out in R in the following steps:
Table 1 Simulation results for the PREs of the recommended estimators with respect to the existing usual estimator for different sample sizes
Estimator | n = 8 | n = 10 | n = 14 | n = 18 | n = 22 | n = 28
Usual estimator (1) | 100 | 100 | 100 | 100 | 100 | 100
Existing ratio-type estimator (2) | 23.89 | 40.54 | 56.35 | 59.39 | 60.00 | 57.39
Proposed estimator (3) | 109.00 | 133.00 | 212.76 | 248.29 | 267.15 | 287.96
Proposed estimator (4) | 217.69 | 262.61 | 304.78 | 322.40 | 339.79 | 337.82
Proposed estimator (5) | 253.87 | 242.58 | 232.15 | 225.19 | 214.342 | 207.52
Step 1: From the defined population, samples of sizes n = 8, 10, 14, 18, 22 and 28 are considered. The population is divided into two strata and the sampling procedure is repeated 100,000 times to compute the values of the recommended and existing estimators under stratified sampling.
Step 2: Using the samples obtained in Step 1, 100,000 values of each estimator are obtained using expressions (1)–(5) for the existing and recommended estimators.
Step 3: Using the values found in Step 2, the PREs are computed by (23) and reported in Table 1.
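The three steps can be sketched in Python as follows. This is a hedged illustration under several assumptions, not the authors' R code: the population is synthetic (not the Murthy (1967) data), the allocation of four units per stratum is assumed, PRE is taken as 100 times the MSE of the usual estimator divided by the MSE of the competing estimator, and the competitor is a simple separate ratio-type adjustment used as a stand-in rather than any of the estimators (2)–(5). All function names are hypothetical.

```python
import random
from statistics import variance

def make_population(rng):
    """Hypothetical two-stratum population of 80 (x, y) pairs."""
    strata = []
    for size, x_mean in ((30, 20.0), (50, 60.0)):
        units = []
        for _ in range(size):
            x = rng.gauss(x_mean, 0.25 * x_mean)
            y = 2.0 * x + rng.gauss(0.0, 0.1 * x_mean)
            units.append((x, y))
        strata.append(units)
    return strata

def simulate_pre(strata, n_per_stratum, reps, rng):
    N = sum(len(s) for s in strata)
    weights = [len(s) / N for s in strata]
    # Population stratum variances of y and x (divisor N_h - 1)
    S2y = [variance([y for _, y in s]) for s in strata]
    S2x = [variance([x for x, _ in s]) for s in strata]
    # Target: variance of the stratified mean, fpc ignored
    target = sum(w ** 2 * v / n for w, v, n in zip(weights, S2y, n_per_stratum))

    sq_err = {"usual": 0.0, "ratio": 0.0}
    for _ in range(reps):                      # Step 1: repeated stratified SRSWOR
        t_usual = t_ratio = 0.0
        for s, w, n, big_s2x in zip(strata, weights, n_per_stratum, S2x):
            draw = rng.sample(s, n)
            s2y = variance([y for _, y in draw])
            s2x = variance([x for x, _ in draw])
            term = w ** 2 * s2y / n            # Step 2: estimator values
            t_usual += term
            t_ratio += term * big_s2x / s2x    # stand-in ratio-type adjustment
        sq_err["usual"] += (t_usual - target) ** 2
        sq_err["ratio"] += (t_ratio - target) ** 2
    mse = {k: v / reps for k, v in sq_err.items()}
    # Step 3: PRE relative to the usual estimator
    return {k: 100.0 * mse["usual"] / mse[k] for k in mse}

rng = random.Random(2021)
population = make_population(rng)
pre = simulate_pre(population, n_per_stratum=[4, 4], reps=2000, rng=rng)
```

The number of replications is kept at 2,000 here for speed; the paper uses 100,000. By construction the PRE of the usual estimator against itself is 100, so values above 100 for a competitor indicate a smaller simulated MSE.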
This simulation study evaluates the PREs of the proposed and existing estimators for each sample size. The results indicate that the recommended generalized variance estimators are more efficient than the available estimators.
Table 2 Simulation results for the relative bias (RB) of the recommended estimators and the existing estimators for different sample sizes
Estimator | n = 8 | n = 10 | n = 14 | n = 18 | n = 22 | n = 28
Usual estimator (1) | 0.15109 | 0.14578 | 0.0864 | 0.07700 | 0.00104 | 0.00138
Existing ratio-type estimator (2) | 0.0047 | 0.02788 | 0.0786 | 0.001417 | 0.000271 | 0.000327
Proposed estimator (3) | 0.004272 | 0.00499 | 0.002609 | 0.00192 | 0.00178 | –0.00052
Proposed estimator (4) | 0.05976 | 0.02719 | 0.001964 | 0.00171 | 0.00196 | 0.00066
Proposed estimator (5) | –0.8038 | –0.8544 | –0.9178 | –0.9945 | –1.1280 | –1.2687
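The paper does not state the relative bias formula explicitly. A common definition, assumed here, is the average of the simulated estimates minus the true value, divided by the true value. The sketch below uses that definition with hypothetical numbers.

```python
from statistics import mean

def relative_bias(estimates, true_value):
    """Assumed definition: (mean of simulated estimates - true value) / true value."""
    if true_value == 0:
        raise ValueError("relative bias is undefined when the true value is zero")
    return (mean(estimates) - true_value) / true_value

# Hypothetical simulated estimates of a quantity whose true value is 10
rb = relative_bias([9.0, 11.0, 10.5, 9.9], 10.0)
```

Under this definition a value near zero indicates little bias, and a negative value indicates that the estimator underestimates the target on average.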
The simulation results for the existing estimators and all three suggested estimators under stratified sampling are presented in Tables 1 and 2. The results in Table 1 are obtained using the percent relative efficiency formula and those in Table 2 using the relative bias formula. Tables 1 and 2 also compare the recommended ratio, ratio-cum-exponential and generalized ratio-cum-exponential stratified variance estimators, which use the coefficient of skewness, quartile deviation, tri-mean, coefficient of kurtosis, median and midrange, with the existing estimators. The major findings from the results presented in Section 3 are:
1. Tables 1 and 2 show that the suggested ratio, ratio-cum-exponential and generalized ratio-cum-exponential variance estimators under stratified sampling are more efficient than the existing variance estimators for a heterogeneous population across the sample sizes considered. The three proposed estimators for the variable of interest use a single auxiliary variable, two auxiliary variables and transformed auxiliary variables, respectively, with different sample sizes and two strata.
2. From Table 1, the PREs of the proposed estimators are greater than those of the existing estimators. The table reports the percent relative efficiencies of the proposed estimators that use the coefficient of skewness, quartile deviation, tri-mean, coefficient of kurtosis and median, with two strata and different sample sizes. Several further PREs were obtained in the simulation study, but only the results for the more efficient estimators are presented in Table 1. In Table 1, the first row gives the PRE of the usual estimator for the different sample sizes, the second row gives the PRE of the existing Gupta and Shabbir (2010) estimator, and the third, fourth and fifth rows give the PREs of the proposed estimators and , respectively. For example, for sample size 22, the PRE of the usual estimator is 100, the PRE of the Gupta and Shabbir (2010) estimator is 60, and the PREs of our suggested estimators and are 267.15, 339.798 and 214.34. Hence the PREs of the suggested estimators are higher than those of the existing estimators for this sample size and stratification.
3. Further, in Table 2, the first row gives the relative biases of the usual estimator for the different sample sizes, the second row gives the relative biases of the existing Gupta and Shabbir (2010) estimator, and the third, fourth and fifth rows give the relative biases of the proposed estimators and , respectively. For example, for sample size 28, the relative bias of the usual estimator is 0.0013, the relative bias of the Gupta and Shabbir (2010) estimator is 0.0003, and the relative biases of our suggested estimators and are –0.0005, 0.0006 and –1.26. So we observed that the relative biases of the suggested estimators are less than the relative biases of the existing estimators for this sample size and stratification.
Table 1 shows that the proposed estimators are more efficient than the previously available estimators for all sample sizes considered. We have suggested generalized ratio-cum-exponential estimators for estimating the variance of a finite population under stratified random sampling. From the simulation results in Tables 1 and 2, we infer that the suggested estimators are more efficient than the usual stratified sample variance and the Shabbir and Gupta (2010) stratified variance estimator for the finite population. Hence the suggested estimators are recommended for practical use in estimating the variance of a finite population when a single auxiliary variate, two auxiliary variates without transformation, or two transformed auxiliary variates are available.
Constructing the proposed estimators requires complete information on the auxiliary variables and on the relevant population parameters. The following recommendations are made for future research.
1. In this paper the suggested estimators of the population variance are presented under stratified sampling using a single auxiliary variable, two auxiliary variables and transformed auxiliary variables. This work can be extended to multi-auxiliary variables under single-phase and two-phase sampling, and to non-response under different sampling techniques.
2. In single-phase and two-phase sampling, the work can also be extended using other transformations available in the literature, with single, two and multi-auxiliary variables, under different sampling techniques in both full-response and non-response cases.
Hence we strongly recommend the three new generalized estimators for practical estimation of the population variance; they can be preferred over the usual variance estimator and the existing estimator. These ratio-cum-exponential and dual to ratio-cum-exponential estimators may prove useful in formulating business, economic, demographic, banking and market policies, owing to reduced heterogeneity in the population and reduced errors in the resulting estimates.
The usual variance of is given by
where is the population variance for stratum h when the finite population correction factor is ignored.
Suppose
so that
where
are the coefficients of kurtosis of y, x and z respectively in the stratum and
where
where are the second-order moments for the stratum and are the moment ratios for the stratum.
where and are constants. and are known population parameters of x. So, for different values of and , we have
After simplification, we get
where
[1] Dalabehara, M., and Sahoo, L. N. (1997). A class of estimators in stratified sampling with two auxiliary variables. Journal of the Indian Society of Agricultural Statistics, 50(2), pp. 144–149.
[2] Dalabehara, M., Sahoo, L. N. (1999). A new estimator with two auxiliary variables for stratified sampling, Statistica, 59(1), pp. 101–107.
[3] Shabbir, J., Gupta, S. (2005). Improved ratio estimators in stratified sampling, American Journal of Mathematical & Management Sciences, 25, pp. 293–311.
[4] Diana, G. (1993). A class of estimators of the population mean in stratified random sampling, Statistica, 53(1), pp. 59–66.
[5] Singh, M. P. (1965). On the estimation of ratio and product of the population parameters, Sankhyā The Indian Journal of Statistics, Series B, pp. 321–328.
[6] Singh, H. P., Tailor, R., Singh, S., Kim, J. M. (2008). A modified estimator of population mean using power transformation, Statistica, 49, pp. 37–58.
[7] Khoshnevisan M., Singh R., Chauhan P., Sawan, N., Smarandache, F. (2007). A general family of estimators for estimating population mean using known value of some population parameter(s), Far East Journal of Theoretical Statistics, 22, pp. 181–191.
[8] Kadilar, C., Cingi, H. (2003). Ratio estimators in stratified random sampling, Biometrical Journal, 45: 218–225.
[9] Kadilar, C., and Cingi, H. (2003). Ratio estimators in stratified random sampling, Biometrical Journal: Journal of Mathematical Methods in Biosciences, 45(2), pp. 218–225.
[10] Kadilar, C. and Cingi, H. (2006). An improvement in estimating the population mean by using the correlation co-efficient, Hacettepe Journal of Mathematics and Statistics, 35 (1), pp. 103–109.
[11] Chandra, P., Singh H. P. (2005) A family of estimators for population variance using knowledge of kurtosis of an auxiliary variable in sample survey, Statistics in Transition, 37 pp. 7–27.
[12] Gupta, S., Shabbir, J. (2007). On the use of transformed auxiliary variables in estimating population mean, Journal of Statistical Planning and Inference, 137(5), pp. 1606–1611.
[13] Gupta, S., Shabbir, J. (2008). On improvement in estimating the population mean in simple random sampling. Journal of Applied Statistics, 35(5), pp. 559–566.
[14] Isaki, C. T. (1983). Variance estimation using auxiliary information, Journal of the American Statistical Association, 78, pp. 117–123.
[15] Prasad, B., Singh, H. P. (1990). Some improved ratio-type estimators of finite population variance using auxiliary information in sample surveys, Communications in Statistics – Theory and Methods, 19(3), pp. 1127–1139.
[16] Prasad, B., Singh, H. P. (1992). Unbiased estimators of finite population variance using auxiliary information in sample survey, Communications in Statistics – Theory and Methods, 21(5), pp. 1367–1376.
[17] Singh, H. P., Vishvakarama, G. K. (2010). A general procedure for estimating the population mean in stratified random sampling using auxiliary information, Metron, 68(1), pp. 47–65.
[18] Subramani, J. and Kumarapandiyan, G. (2012a). A class of almost unbiased modified ratio estimators for population mean with known population parameters, Elixir Statistics, 44, pp. 7411–7415.
[19] Subramani, J. and Kumarapandiyan, G. (2012b). Estimation of population mean using known median and co-efficient of skewness, American Journal of Mathematics and Statistics, 2(5), pp. 101–107.
[20] Subramani, J. and Kumarapandiyan, G. (2012c). Estimation of population mean using co-efficient of variation and median of an auxiliary variable, International Journal of Probability and Statistics, 1(4), pp. 111–118.
[21] Yadav, S. K., Kadilar, C., Shabbir, J., Gupta, S. (2015). Improved family of estimators of population variance in Simple Random Sampling, Journal of Statistical Theory and Practice, 9(2), pp. 219–226.
[22] Yasmeen, U., Noor ul Amin, M., Hanif, M. (2015). Generalized exponential estimators of finite population mean using transformed auxiliary variables. International Journal of Applied Computational Mathematics, 1(2), pp. 1–10.
[23] Yasmeen, U., Noor ul Amin, M., Hanif, M. (2016). Exponential ratio and product type estimators of finite population mean. Journal of Statistics and Management Systems, 19, pp. 55–71.
[24] Yasmeen, U., Noor ul Amin, M., Hanif M. (2018). Exponential Estimators of Finite Population Variance Using Transformed Auxiliary Variables. Proceedings of the National Academy of Sciences, India Section A: Physical Sciences.
[25] Noor ul Amin, M., Yasmeen, U., Hanif M. (2018). Generalized Variance Estimators In Adaptive Cluster Sampling Using Single Auxiliary Variable. Journal of Statistics and Management Sciences.
[26] Yasmeen, U., Noor ul Amin, M., Hanif M. (2018) Estimation of finite population variance under systematic sampling using auxiliary information. Statistics and Applications.
[27] Gupta, S., and Shabbir, J. (2010). Estimating variance of stratified random sample mean in two phase sampling using two auxiliary variables. American Journal of Mathematical and Management Sciences, 30(3–4), 347–364.
[28] Shabbir, J., and Gupta, S. (2010). Some estimators of finite population variance of stratified sample mean. Communications in Statistics—Theory and Methods, 39(16), 3001–3008.
Uzma Yasmeen received her Ph.D. from the National College of Business Administration & Economics, Lahore, Pakistan. She has worked at the University of Waterloo, Canada and COMSATS University Islamabad. Currently, she is working as an Assistant Professor at the University of Lahore, Lahore Campus. Her research interests include sampling techniques and biostatistics. She is an HEC approved supervisor.
Muhammad Noor-ul-Amin received his Ph.D. degree from NCBA&E, Lahore, Pakistan. He has teaching and research experience at several universities, including the Virtual University of Pakistan, the University of Sargodha, Pakistan, and the University of Burgundy, France. He is currently working as an Assistant Professor at COMSATS University Islamabad, Lahore Campus. His research interests include sampling techniques and control charting techniques. He is an HEC approved supervisor.
Journal of Reliability and Statistical Studies, Vol. 14, Issue 2 (2021), 565–584.
doi: 10.13052/jrss0974-8024.14210
© 2021 River Publishers