Heuristical Approach for Optimizing Population Mean Using Ratio Estimator in Stratified Random Sampling

G. R. V. Triveni and Faizan Danish*

Department of Mathematics, School of Advanced Sciences, VIT-AP University, Inavolu, Beside AP Secretariat, Amaravati AP-522237, India
E-mail: danishstat@gmail.com
*Corresponding Author

Received 05 April 2023; Accepted 20 October 2023; Publication 20 November 2023

Abstract

In this study, we develop a ratio-type estimator in stratified sampling to estimate the population mean of the study variable using information on a concomitant variable. Using a Taylor series expansion, we derive expressions for the bias and mean square error (MSE) up to the first degree of approximation. In a numerical illustration employing a real data set, we demonstrate that the proposed estimator has the highest percentage relative efficiency among the existing estimators considered. Furthermore, we demonstrate that the separate ratio-type estimators have higher relative efficiency than the combined ratio-type estimators.

Keywords: Relative efficiency, bias, mean square error, stratified sampling, Taylor series, ratio estimators.

1 Introduction

In contemporary survey methodologies, the practice of stratification serves as a strategic tool aimed at strengthening the accuracy of estimations. In this approach, a diverse population is partitioned into distinct strata with the objective of achieving homogeneity within each stratum. Subsequently, a sample is drawn from each stratum utilizing various sampling designs. This method proves particularly effective when the selected units are not substantial in number, and the population demonstrates significant variability. The act of segmenting the population into strata offers the potential to yield more precise estimates of population parameters while concurrently reducing variance within each stratum.

Optimal stratification is a technique that yields the minimum possible variance in this process. Ratio estimation exploits the ratio between the study variable and an auxiliary variable observed in the sample, together with the known population total or mean of the auxiliary variable, to estimate the population parameter of interest. It is a robust method for estimating population parameters, and its utility becomes especially apparent when the population exhibits high variability or when the number of selected units is limited. One of the key advantages of ratio estimation within stratified random sampling is its capacity to reduce the variance of the estimator, thereby enhancing the precision of the estimates.

Numerous researchers have contributed to the study of estimators. Cochran (1977) summarized the stratified sampling process for constructing estimators under various sampling designs, whereas Sarjinder Singh (2003) discussed the evolution of estimators in general at length. Sisodia and Dwivedi (1981) presented a class of product-cum-ratio type estimators. Prasad (1989) worked on a class of ratio estimators, after which Bahl and Tuteja (1991) showed that ratio and product type exponential estimators have higher precision. Upadhyaya and Singh (1999) provided a new approach that transforms the auxiliary variable, assuming that information on the auxiliary variable is known, and concluded that their proposed estimator has a lower MSE and performs better than the estimators in the literature. Kadilar and Cingi (2003) assessed a variety of ratio-type estimators before going on to create a new estimator. They extended their study in 2005 by introducing a novel estimation method, a significant contribution to the field of survey sampling. Shabbir and Gupta (2005) worked on improving several forms of ratio estimators. Later, Koyuncu and Kadilar (2009) provided efficient estimators for the population mean and demonstrated the effectiveness of their proposal by numerical illustration. Yadav et al. (2011) proposed an improved separate ratio exponential estimator and demonstrated its superiority through theoretical and numerical comparisons.

Additionally, Tailor and Chouhan (2014) worked on ratio-cum-product type exponential estimators, and Tailor and Lone (2014) studied the properties of separate ratio-type estimators. Yadav et al. (2014) developed ratio and product exponential estimators and evaluated their efficiency in an empirical study. Later, Shabbir and Gupta (2017) used information on two auxiliary variables in a difference-cum-exponential ratio type estimator under both simple and stratified random sampling. Verma et al. (2017) proposed a separate class of estimators and supported its efficiency with theoretical properties and numerical findings. Estimators under various sampling designs were developed by Koyuncu et al. (2018), Zaman (2019), Luengo et al. (2019), Kumar and Vishwakarma (2020), Yadav and Tailor (2020), Dahiru et al. (2021), and Khare et al. (2022). Rather and Kadilar (2022) demonstrated the gain in efficiency of their proposed estimator by comparing it with an unbiased estimator. Additionally, Kumar et al. (2023) created a class of estimators under stratified random sampling, compared them with estimators already in use, and concluded that the new class outperformed the older ones. In this paper, we develop an estimator in Section 4 and examine the efficiency of the proposed and considered estimators in Section 6.

2 Terminology

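• $N$ = number of units in the population

• $n$ = number of units in the sample

• $P_h=\frac{N_h}{N}$ is the stratum weight

• $k$ = number of strata

• $N_h$ = number of population units in the $h$th stratum

• $n_h$ = number of sample units in the $h$th stratum

• $\bar{Z}_1=\sum_{h=1}^{k}P_h\bar{z}_{1h}$ is the average of the study variable

• $\bar{Z}_2=\sum_{h=1}^{k}P_h\bar{z}_{2h}$ is the average of the auxiliary variable

• $\bar{z}_{1h}=\frac{1}{n_h}\sum_{j=1}^{n_h}z_{1hj}$ is the study variable's sample average in the $h$th stratum

• $\bar{z}_{2h}=\frac{1}{n_h}\sum_{j=1}^{n_h}z_{2hj}$ is the auxiliary variable's sample average in the $h$th stratum

• $S_{hz_1}^2=\frac{1}{N_h-1}\sum_{j=1}^{n_h}(z_{1hj}-\bar{Z}_{1h})^2$ is the study variable's variance in the $h$th stratum

• $S_{hz_2}^2=\frac{1}{N_h-1}\sum_{j=1}^{n_h}(z_{2hj}-\bar{Z}_{2h})^2$ is the auxiliary variable's variance in the $h$th stratum

• $S_{hz_1z_2}=\frac{1}{N_h-1}\sum_{j=1}^{n_h}(z_{1hj}-\bar{Z}_{1h})(z_{2hj}-\bar{Z}_{2h})$ is the covariance between the variables in the $h$th stratum

• $\rho_{hz_1z_2}=\frac{S_{hz_1z_2}}{S_{hz_1}S_{hz_2}}$ is the correlation coefficient between the variables in the $h$th stratum

• $C_{hz_1}=\frac{S_{hz_1}}{\bar{Z}_{1h}}$ is the coefficient of variation of the study variable in the $h$th stratum

• $C_{hz_2}=\frac{S_{hz_2}}{\bar{Z}_{2h}}$ is the coefficient of variation of the auxiliary variable in the $h$th stratum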

Consider a population $U$ of size $N$ made up of units $U_1, U_2, U_3, \ldots, U_N$. Let $z_1$ and $z_2$ denote the study and concomitant variables, with population means $\bar{Z}_1$ and $\bar{Z}_2$. The population $U$ is separated into $k$ homogeneous strata of sizes $N_h$ $(h=1,2,3,\ldots,k)$, and $n_h$ represents the size of the sample drawn from the $h$th stratum. The separate ratio estimator of the population mean $\bar{Z}_1$ is expressed as

$$\bar{Z}_{1SR}=\sum_{h=1}^{k}P_h\bar{z}_{1h}\left[\frac{\bar{Z}_{2h}}{\bar{z}_{2h}}\right]$$
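The separate ratio estimator above is a weighted sum of stratum-level ratio estimates, which can be sketched in a few lines of Python. The function name, argument names, and the two-stratum input are hypothetical and serve only to illustrate the formula; they are not data from this study.

```python
def separate_ratio_estimate(weights, z1_sample_means, z2_sample_means, z2_pop_means):
    """Sum over strata of P_h * zbar_1h * (Zbar_2h / zbar_2h)."""
    return sum(
        p * z1 * (Z2 / z2)
        for p, z1, z2, Z2 in zip(weights, z1_sample_means, z2_sample_means, z2_pop_means)
    )


# Hypothetical two-stratum illustration (made-up numbers).
estimate = separate_ratio_estimate(
    weights=[0.5, 0.5],            # P_h
    z1_sample_means=[10.0, 20.0],  # zbar_1h
    z2_sample_means=[5.0, 10.0],   # zbar_2h
    z2_pop_means=[10.0, 10.0],     # Zbar_2h, assumed known
)
print(estimate)  # 20.0
```

In the first stratum the sample mean of the auxiliary variable is half its known population mean, so the stratum estimate of the study variable is scaled up by a factor of two; in the second stratum no adjustment is made.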

3 Estimators in Literature

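(1) The traditional separate ratio estimator is

$$\bar{Z}_{1SR}=\sum_{h=1}^{k}P_h\bar{Z}_{1hR}=\sum_{h=1}^{k}P_h\left[\frac{\bar{z}_{1h}}{\bar{z}_{2h}}\bar{Z}_{2h}\right]$$

where $\bar{Z}_{1hR}$ is the ratio estimator of the $h$th stratum population mean $\bar{Z}_{1h}$ and $\bar{z}_{1h}/\bar{z}_{2h}$ is the ratio of the sample means of the study variable $z_1$ and the auxiliary variable $z_2$ in the $h$th stratum. The bias and MSE are as follows:

$$\mathrm{Bias}(\bar{Z}_{1SR})=\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_2}^2-\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]$$

$$\mathrm{MSE}(\bar{Z}_{1SR})=\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right] \tag{1}$$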

Tailor and Lone (2014) modified the following estimators into separate ratio estimators:

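(2) The separate ratio-type estimator utilizing the coefficient of variation $C_{z_2h}$ of the auxiliary variate in the $h$th stratum is

$$\bar{Z}_{1RSSD}=\sum_{h=1}^{k}P_h\bar{z}_{1h}\left[\frac{\bar{Z}_{2h}+C_{z_2h}}{\bar{z}_{2h}+C_{z_2h}}\right]$$

where $C_{z_2h}$ is the auxiliary variate's coefficient of variation in the $h$th stratum.

$$\mathrm{Bias}(\bar{Z}_{1RSSD})=\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[\lambda_{1h}^2C_{hz_2}^2-\rho_{hz_1z_2}\lambda_{1h}C_{hz_1}C_{hz_2}\right]$$

$$\mathrm{MSE}(\bar{Z}_{1RSSD})=\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+\lambda_{1h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{1h}C_{hz_1}C_{hz_2}\right] \tag{2}$$

where $\lambda_{1h}=\frac{\bar{Z}_{2h}}{\bar{Z}_{2h}+C_{z_2h}}$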

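(3) The separate ratio-type estimator utilizing the kurtosis $\beta_{2h}(z_2)$ of $z_2$ in the $h$th stratum is

$$\bar{Z}_{1RSSE}=\sum_{h=1}^{k}P_h\bar{z}_{1h}\left[\frac{\bar{Z}_{2h}+\beta_{2h}(z_2)}{\bar{z}_{2h}+\beta_{2h}(z_2)}\right]$$

where $\beta_{2h}(z_2)$ is the auxiliary variate's coefficient of kurtosis in the $h$th stratum.

$$\mathrm{Bias}(\bar{Z}_{1RSSE})=\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[\lambda_{2h}^2C_{hz_2}^2-\rho_{hz_1z_2}\lambda_{2h}C_{hz_1}C_{hz_2}\right]$$

$$\mathrm{MSE}(\bar{Z}_{1RSSE})=\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+\lambda_{2h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{2h}C_{hz_1}C_{hz_2}\right] \tag{3}$$

where $\lambda_{2h}=\frac{\bar{Z}_{2h}}{\bar{Z}_{2h}+\beta_{2h}(z_2)}$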

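(4) Utilizing the coefficients of kurtosis and variation in the $h$th stratum, a separate ratio-type estimator is given as

$$\bar{Z}_{1RSUS1}=\sum_{h=1}^{k}P_h\bar{z}_{1h}\left[\frac{\bar{Z}_{2h}\beta_{2h}(z_2)+C_{z_2h}}{\bar{z}_{2h}\beta_{2h}(z_2)+C_{z_2h}}\right]$$

where $\beta_{2h}(z_2)$ is the coefficient of kurtosis and $C_{z_2h}$ is the coefficient of variation of the auxiliary variate in the $h$th stratum.

$$\mathrm{Bias}(\bar{Z}_{1RSUS1})=\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[\lambda_{3h}^2C_{hz_2}^2-\rho_{hz_1z_2}\lambda_{3h}C_{hz_1}C_{hz_2}\right]$$

$$\mathrm{MSE}(\bar{Z}_{1RSUS1})=\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+\lambda_{3h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{3h}C_{hz_1}C_{hz_2}\right] \tag{4}$$

where $\lambda_{3h}=\frac{\bar{Z}_{2h}\beta_{2h}(z_2)}{\bar{Z}_{2h}\beta_{2h}(z_2)+C_{z_2h}}$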

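(5) Using the coefficients of variation and kurtosis in the $h$th stratum, a separate ratio-type estimator is given as

$$\bar{Z}_{1RSUS2}=\sum_{h=1}^{k}P_h\bar{z}_{1h}\left[\frac{\bar{Z}_{2h}C_{z_2h}+\beta_{2h}(z_2)}{\bar{z}_{2h}C_{z_2h}+\beta_{2h}(z_2)}\right]$$

$$\mathrm{Bias}(\bar{Z}_{1RSUS2})=\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[\lambda_{4h}^2C_{hz_2}^2-\rho_{hz_1z_2}\lambda_{4h}C_{hz_1}C_{hz_2}\right]$$

$$\mathrm{MSE}(\bar{Z}_{1RSUS2})=\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+\lambda_{4h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{4h}C_{hz_1}C_{hz_2}\right] \tag{5}$$

where $\lambda_{4h}=\frac{\bar{Z}_{2h}C_{z_2h}}{\bar{Z}_{2h}C_{z_2h}+\beta_{2h}(z_2)}$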

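(6) The traditional combined ratio estimator is

$$\bar{Z}_{1RC}=\frac{\bar{z}_{1st}}{\bar{z}_{2st}}\bar{Z}_2$$

Its MSE is given by

$$\mathrm{MSE}(\bar{Z}_{1RC})=\sum_{h=1}^{k}P_h^2\left(\frac{1-f_h}{n_h}\right)\left[S_{hz_1}^2+R^2S_{hz_2}^2-2RS_{hz_1z_2}\right] \tag{6}$$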

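(7) Kadilar and Cingi (2005) proposed a new ratio estimator as

$$\bar{Z}_{1KC}=m^*\bar{Z}_{1RC}$$

$$\mathrm{MSE}(\bar{Z}_{1KC})=m^*\sum_{h=1}^{k}P_h^2\left(\frac{1-f_h}{n_h}\right)\left[S_{hz_1}^2+R^2S_{hz_2}^2-2RS_{hz_1z_2}\right]+(m^*-1)^2\bar{Z}_1^2 \tag{7}$$

where $m^*=\frac{\bar{Z}_1^2}{\bar{Z}_1^2+\sum_{h=1}^{k}P_h^2\left(\frac{1-f_h}{n_h}\right)\left[S_{hz_1}^2+R^2S_{hz_2}^2-2RS_{hz_1z_2}\right]}$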
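The MSE expressions (1) to (5) share one form and differ only in the stratum constant multiplying the auxiliary variable's coefficient of variation, which equals 1 in Equation (1) and $\lambda_{1h},\ldots,\lambda_{4h}$ in Equations (2) to (5). The following Python sketch evaluates that shared form with the stratum statistics reported in Table 1. It assumes that $f_h=n_h/N_h$ is the sampling fraction, which the text does not define explicitly, and the function and variable names are illustrative only.

```python
# Stratum statistics as reported in Table 1 (strata h = 1..6).
N_h = [106, 106, 94, 171, 204, 173]
n_h = [9, 17, 38, 67, 7, 2]
Z1_h = [1536, 2212, 9384, 5588, 967, 404]          # stratum means, study variable
Z2_h = [24375, 27421, 72409, 74365, 26441, 9844]   # stratum means, auxiliary variable
C1_h = [4.18, 5.22, 3.19, 5.13, 2.47, 2.34]        # C_hz1
C2_h = [2.02, 2.10, 2.22, 3.84, 1.72, 1.91]        # C_hz2
rho_h = [0.82, 0.86, 0.90, 0.99, 0.71, 0.89]       # rho_hz1z2


def separate_ratio_mse(lam=None):
    """Shared MSE form of Equations (1)-(5); lam=None means lambda_h = 1 (Equation (1))."""
    N = sum(N_h)
    if lam is None:
        lam = [1.0] * len(N_h)
    total = 0.0
    for h in range(len(N_h)):
        P = N_h[h] / N                  # stratum weight P_h
        f = n_h[h] / N_h[h]             # ASSUMPTION: sampling fraction f_h = n_h / N_h
        gamma = (1.0 - f) / n_h[h]
        bracket = (
            C1_h[h] ** 2
            + lam[h] ** 2 * C2_h[h] ** 2
            - 2.0 * rho_h[h] * lam[h] * C1_h[h] * C2_h[h]
        )
        total += P ** 2 * Z1_h[h] ** 2 * gamma * bracket
    return total


mse_sr = separate_ratio_mse()
print(round(mse_sr, 1))

# lambda_1h of Equation (2), built from the same table entries.
lambda_1 = [Z2 / (Z2 + C2) for Z2, C2 in zip(Z2_h, C2_h)]
print([round(value, 5) for value in lambda_1])
```

With the rounded inputs of Table 1, the $\lambda_h=1$ case comes out at about 165108, in line with the value reported for the traditional separate ratio estimator in Table 2. Any of the $\lambda$ constants can be passed through the `lam` argument; because the auxiliary stratum means are large relative to $C_{z_2h}$, the $\lambda_{1h}$ values are all very close to 1.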

4 Proposed Estimator

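In stratified random sampling, we propose the estimator

$$\bar{Z}_{1st}=\frac{t^2}{2}\sum_{h=1}^{k}P_h\left[\frac{\bar{z}_{1h}}{\bar{z}_{2h}}\bar{Z}_{2h}\right] \tag{8}$$

The bias of the proposed estimator can be obtained as

$$E(\bar{Z}_{1st}-\bar{Z}_1)=E\left(\frac{t^2}{2}\sum_{h=1}^{k}P_h\left[\frac{\bar{z}_{1h}}{\bar{z}_{2h}}\bar{Z}_{2h}\right]-\bar{Z}_1\right)=\bar{Z}_2E\left[\frac{\frac{t^2}{2}\sum_{h=1}^{k}P_h\bar{z}_{1h}-\left(\frac{\bar{Z}_1\bar{z}_{2h}}{\bar{Z}_2}\right)}{\bar{z}_{2h}}\right] \tag{9}$$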

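We have

$$\frac{1}{\bar{z}_{2h}}=\frac{1}{\bar{Z}_2}\left(1+\frac{\bar{z}_{2h}-\bar{Z}_2}{\bar{Z}_2}\right)^{-1}$$

Using the Taylor series expansion and retaining terms up to the first degree, we get

$$\frac{1}{\bar{z}_{2h}}\approx\frac{1}{\bar{Z}_2}\left(1-\frac{\bar{z}_{2h}-\bar{Z}_2}{\bar{Z}_2}\right) \tag{10}$$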
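To give a feel for the size of the error introduced by the first-order expansion in Equation (10), the short Python check below compares the exact reciprocal with its approximation. The population mean 37600 is the value of the auxiliary variable's mean reported in Table 1; the sample mean 38000 is a hypothetical value chosen only for illustration.

```python
Z2_bar = 37600.0   # population mean of the auxiliary variable (Table 1)
z2_bar = 38000.0   # HYPOTHETICAL sample mean, for illustration only

e = (z2_bar - Z2_bar) / Z2_bar            # relative deviation of the sample mean
exact = 1.0 / z2_bar
first_order = (1.0 / Z2_bar) * (1.0 - e)  # right-hand side of Equation (10)

relative_error = (exact - first_order) / exact
print(e, relative_error)
```

Algebraically, the ratio of the approximation to the exact value is $(1-e)(1+e)=1-e^2$, so the relative error equals $e^2$. A deviation of about 1% in the sample mean therefore costs roughly 0.01% in the reciprocal, which is why second-order terms are dropped at this degree of approximation.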

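Substituting Equation (10) in Equation (9), we get

$$E(\bar{Z}_{1st}-\bar{Z}_1)=\bar{Z}_2E\left\{\left(\frac{t^2}{2}\sum_{h=1}^{k}P_h\bar{z}_{1h}-\frac{\bar{Z}_1\bar{z}_{2h}}{\bar{Z}_2}\right)\frac{1}{\bar{Z}_2}\left(1-\frac{\bar{z}_{2h}-\bar{Z}_2}{\bar{Z}_2}\right)\right\}$$

$$=\left[\frac{t^2}{2}-1\right]\bar{Z}_1+\frac{1}{\bar{Z}_2}\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_2}^2-\frac{t^2}{2}\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]$$

Therefore,

$$\mathrm{Bias}(\bar{Z}_{1st})=\left[\frac{t^2}{2}-1\right]\bar{Z}_1+\frac{1}{\bar{Z}_2}\sum_{h=1}^{k}P_h\bar{Z}_{1h}\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_2}^2-\frac{t^2}{2}\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]$$

$$\mathrm{MSE}(\bar{Z}_{1st})=\frac{t^4}{4}\mathrm{var}(\bar{Z}_{1SR})+\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2 \tag{11}$$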

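Thus, the MSE of the proposed estimator takes the following form:

$$\mathrm{MSE}(\bar{Z}_{1st})=\frac{t^4}{4}\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]+\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2 \tag{12}$$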

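To minimize the MSE, we differentiate Equation (12) with respect to $t$ and equate the derivative to zero:

$$\frac{d\,\mathrm{MSE}(\bar{Z}_{1st})}{dt}=0$$

$$t^2\left\{\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]+\bar{Z}_1^2\right\}=2\bar{Z}_1^2$$

We obtain $t^2$ from the above equation as follows:

$$t^2=\frac{2\bar{Z}_1^2}{\bar{Z}_1^2+\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]}$$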

where $t^2/2$ lies between 0 and 1, because the denominator exceeds $\bar{Z}_1^2$.
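The optimal value of $t^2$ and the resulting minimum of Equation (12) can be evaluated numerically. The Python sketch below takes the MSE of the traditional separate ratio estimator from Table 2 (165108.3) and the population mean of the study variable from Table 1 (2930) as inputs; the function names are illustrative and not part of the original derivation.

```python
def optimal_t_squared(var_sr, Z1_bar):
    """t^2 that minimizes Equation (12), given var(Z_1SR) and the population mean."""
    return 2.0 * Z1_bar ** 2 / (Z1_bar ** 2 + var_sr)


def proposed_mse(t2, var_sr, Z1_bar):
    """Equation (12) written in terms of t2 = t^2."""
    return (t2 ** 2 / 4.0) * var_sr + Z1_bar ** 2 * (t2 / 2.0 - 1.0) ** 2


var_sr = 165108.3   # MSE of the traditional separate ratio estimator (Table 2)
Z1_bar = 2930.0     # population mean of the study variable (Table 1)

t2 = optimal_t_squared(var_sr, Z1_bar)
mse_min = proposed_mse(t2, var_sr, Z1_bar)
print(round(t2, 4), round(mse_min, 1))
```

With these inputs, $t^2$ is roughly 1.962, so $t^2/2$ is about 0.981, and the minimized MSE is about 161992.8, the value reported for the proposed estimator in Table 2. At the optimum, Equation (12) simplifies to $\frac{t^2}{2}\mathrm{var}(\bar{Z}_{1SR})$, which is why the minimized MSE is always below that of the separate ratio estimator.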

5 Empirical Studies

In this section, we compare the proposed estimator with the traditional separate ratio estimator, the combined ratio estimator, and the other estimators listed in Section 3, and we state the condition under which the proposed estimator is more efficient than each of them:

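From Equations (1) and (12), we get

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1SR})$$

$$\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]\left(1-\frac{t^4}{4}\right) \tag{13}$$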
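Condition (13) can be checked numerically once $t^2$ is set to its optimal value. The sketch below reuses the two summary figures from the tables, 165108.3 for the MSE of the traditional separate ratio estimator and 2930 for the population mean of the study variable; the variable names are illustrative.

```python
var_sr = 165108.3   # MSE of the traditional separate ratio estimator (Table 2)
Z1_bar = 2930.0     # population mean of the study variable (Table 1)

# Optimal t^2 / 2 from Section 4.
half_t2 = Z1_bar ** 2 / (Z1_bar ** 2 + var_sr)

lhs = Z1_bar ** 2 * (half_t2 - 1.0) ** 2   # left-hand side of (13)
rhs = var_sr * (1.0 - half_t2 ** 2)        # right-hand side of (13); t^4/4 = (t^2/2)^2

print(round(lhs, 1), round(rhs, 1), lhs < rhs)
```

The gap between the two sides equals the reduction in MSE relative to the separate ratio estimator, which is about 3115.5 for these inputs.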

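From Equations (2) and (12), we have the condition

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1RSSD})$$

$$\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left\{\left[C_{hz_1}^2+\lambda_{1h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{1h}C_{hz_1}C_{hz_2}\right]-\frac{t^4}{4}\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]\right\} \tag{14}$$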

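From Equations (3) and (12),

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1RSSE})$$

$$\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left\{\left[C_{hz_1}^2+\lambda_{2h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{2h}C_{hz_1}C_{hz_2}\right]-\frac{t^4}{4}\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]\right\} \tag{15}$$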

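From Equations (4) and (12),

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1RSUS1})$$

$$\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left\{\left[C_{hz_1}^2+\lambda_{3h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{3h}C_{hz_1}C_{hz_2}\right]-\frac{t^4}{4}\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]\right\} \tag{16}$$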

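From Equations (5) and (12), we have the condition

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1RSUS2})$$

$$\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left\{\left[C_{hz_1}^2+\lambda_{4h}^2C_{hz_2}^2-2\rho_{hz_1z_2}\lambda_{4h}C_{hz_1}C_{hz_2}\right]-\frac{t^4}{4}\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]\right\} \tag{17}$$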

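From Equations (6) and (12), we have the condition

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1RC})$$

$$\frac{t^4}{4}\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]+\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<\sum_{h=1}^{k}P_h^2\left(\frac{1-f_h}{n_h}\right)\left[S_{hz_1}^2+R^2S_{hz_2}^2-2RS_{hz_1z_2}\right] \tag{18}$$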

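From Equations (7) and (12), we have

$$\mathrm{MSE}(\bar{Z}_{1st})<\mathrm{MSE}(\bar{Z}_{1KC})$$

$$\frac{t^4}{4}\sum_{h=1}^{k}P_h^2\bar{Z}_{1h}^2\left(\frac{1-f_h}{n_h}\right)\left[C_{hz_1}^2+C_{hz_2}^2-2\rho_{hz_1z_2}C_{hz_1}C_{hz_2}\right]+\bar{Z}_1^2\left[\frac{t^2}{2}-1\right]^2<m^*\sum_{h=1}^{k}P_h^2\left(\frac{1-f_h}{n_h}\right)\left[S_{hz_1}^2+R^2S_{hz_2}^2-2RS_{hz_1z_2}\right]+(m^*-1)^2\bar{Z}_1^2 \tag{19}$$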

6 Application of Real-Life Data Set

We applied the ratio estimators to data on the number of apple trees and the amount of apple production in 854 villages throughout Turkey in 1999 (Source: Institute of Statistics, Republic of Turkey). First, we divided the data into six strata based on the regions of Turkey, and then we randomly selected samples (villages) from each stratum (region). Neyman allocation (Cochran, 1977) is used,

$$n_h=n\,\frac{N_hS_h}{\sum_{h=1}^{k}N_hS_h},$$

and a total of 140 units were selected as the sample from the population of 854. The data are summarized in Table 1.
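The Neyman allocation can be reproduced from the stratum sizes and standard deviations in Table 1. The Python sketch below assumes that $S_h$ in the allocation formula is the study variable's stratum standard deviation $S_{hz_1}$ and that fractional allocations are rounded to the nearest integer; neither choice is stated explicitly in the text, and the function name is illustrative.

```python
N_h = [106, 106, 94, 171, 204, 173]           # stratum sizes (Table 1)
S_h = [6425, 11552, 29907, 28643, 2390, 946]  # ASSUMPTION: S_h taken as S_hz1 (Table 1)
n = 140                                       # total sample size


def neyman_allocation(n, N_h, S_h):
    """Unrounded Neyman allocation n * N_h * S_h / sum(N_h * S_h) for each stratum."""
    total = sum(N * S for N, S in zip(N_h, S_h))
    return [n * N * S / total for N, S in zip(N_h, S_h)]


raw = neyman_allocation(n, N_h, S_h)
n_h = [round(x) for x in raw]  # ASSUMPTION: round to the nearest integer
print(n_h)  # [9, 17, 38, 67, 7, 2]
```

Under these two assumptions the rounded allocation matches the stratum sample sizes listed in Table 1 and sums to 140. Strata 3 and 4, which have the largest values of $N_hS_h$, receive most of the sample.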

Table 1 Data statistics

| Quantity | $h=1$ | $h=2$ | $h=3$ | $h=4$ | $h=5$ | $h=6$ |
|---|---|---|---|---|---|---|
| $N_h$ | 106 | 106 | 94 | 171 | 204 | 173 |
| $n_h$ | 9 | 17 | 38 | 67 | 7 | 2 |
| $\bar{Z}_{1h}$ | 1536 | 2212 | 9384 | 5588 | 967 | 404 |
| $\bar{Z}_{2h}$ | 24375 | 27421 | 72409 | 74365 | 26441 | 9844 |
| $\beta_{2z_1h}$ | 25.7 | 34.6 | 26.1 | 97.6 | 27.5 | 28.1 |
| $C_{hz_1}$ | 4.18 | 5.22 | 3.19 | 5.13 | 2.47 | 2.34 |
| $C_{hz_2}$ | 2.02 | 2.1 | 2.22 | 3.84 | 1.72 | 1.91 |
| $S_{hz_1}$ | 6425 | 11552 | 29907 | 28643 | 2390 | 946 |
| $S_{hz_2}$ | 49189 | 57461 | 160757 | 285603 | 45403 | 18794 |
| $\rho_{hz_1z_2}$ | 0.82 | 0.86 | 0.9 | 0.99 | 0.71 | 0.89 |

$N=854$, $n=140$, $\bar{Z}_1=2930$, $\bar{Z}_2=37600$, $S_{z_1}=17106$, $S_{z_2}=144794$
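As a consistency check on Table 1, the stratum weights $P_h=N_h/N$ and the weighted overall means can be recomputed from the stratum rows. The Python sketch below is only an illustration and its variable names are not part of the original notation.

```python
N_h = [106, 106, 94, 171, 204, 173]                # stratum sizes
Z1_h = [1536, 2212, 9384, 5588, 967, 404]          # stratum means, study variable
Z2_h = [24375, 27421, 72409, 74365, 26441, 9844]   # stratum means, auxiliary variable

N = sum(N_h)
P_h = [Nh / N for Nh in N_h]                       # stratum weights

Z1_bar = sum(P * Z for P, Z in zip(P_h, Z1_h))     # weighted mean of the study variable
Z2_bar = sum(P * Z for P, Z in zip(P_h, Z2_h))     # weighted mean of the auxiliary variable

print(N, round(Z1_bar), round(Z2_bar))  # 854 2930 37600
```

The stratum sizes add up to the population size of 854, and the weighted stratum means round to the overall means of 2930 and 37600 given in the last line of the table.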

The suggested estimator outperforms the separate ratio, combined ratio, and other existing estimators considered in terms of percentage relative efficiency (PRE). Table 2 shows that the proposed estimator's MSE (161992.8) is lower than that of every other estimator, the closest being 165100.4. Further, estimators 6 and 7 in Table 2 are combined ratio type estimators, and they have the lowest PRE values (73.31 and 75.23), well below those of the separate ratio type estimators.

Table 2 MSE values for the considered and proposed estimators

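| S. No. | Estimator | MSE | PRE |
|---|---|---|---|
| 1. | $\bar{Z}_{1SR}$ | 165108.3 | 100 |
| 2. | $\bar{Z}_{1RSSD}$ | 165112.2 | 99.99 |
| 3. | $\bar{Z}_{1RSSE}$ | 165335.7 | 99.86 |
| 4. | $\bar{Z}_{1RSUS1}$ | 165100.4 | 100.00 |
| 5. | $\bar{Z}_{1RSUS2}$ | 165182.1 | 99.95 |
| 6. | $\bar{Z}_{1RC}$ | 225224.1 | 73.31 |
| 7. | $\bar{Z}_{1KC}$ | 219466.4 | 75.23 |
| 8. | $\bar{Z}_{1st}$ | 161992.8 | 101.92 |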
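The PRE column of Table 2 can be recomputed from the MSE column. The Python sketch below assumes the usual definition, PRE = 100 × MSE of the traditional separate ratio estimator divided by the MSE of the estimator in question, which is consistent with the baseline row having a PRE of 100 but is not stated explicitly in the text. The dictionary keys are shorthand labels for the estimators.

```python
# MSE values as reported in Table 2.
mse = {
    "SR": 165108.3,
    "RSSD": 165112.2,
    "RSSE": 165335.7,
    "RSUS1": 165100.4,
    "RSUS2": 165182.1,
    "RC": 225224.1,
    "KC": 219466.4,
    "st": 161992.8,
}


def pre(name, baseline="SR"):
    """ASSUMED definition: PRE = 100 * MSE(baseline) / MSE(estimator)."""
    return 100.0 * mse[baseline] / mse[name]


for name in mse:
    print(f"{name:6s} {pre(name):8.3f}")
```

The recomputed values agree with the PRE column to within about 0.01, with the small differences attributable to rounding. The proposed estimator is the only one whose PRE exceeds 101, while the two combined ratio type estimators fall below 76.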

7 Conclusion

In this research, we have introduced a novel separate-type ratio estimator designed for application within stratified random sampling. We carried out a comparative analysis, contrasting the newly developed estimator with the traditional separate ratio estimator, the combined ratio estimators, and certain existing estimators that were previously modified by Tailor and Lone (2014). Using a real-world dataset, we empirically examined the efficiency of the proposed estimator.

The findings presented in Table 2 demonstrate the superiority of the proposed estimator. Specifically, our estimator yielded the lowest Mean Squared Error (MSE) among the estimators compared, indicating higher precision in estimation. Moreover, the results show that the separate ratio estimators are more efficient than their combined ratio counterparts for this data set, reinforcing the value of the separate approach for improving accuracy in stratified random sampling.

References

Bahl, S., and Tuteja, R. (1991). Ratio and product type exponential estimators. Journal of Information and Optimization Sciences, 12(1), 159–164.

Cochran, W. G. (1977). Sampling techniques. John Wiley & Sons.

Haruna, N. F., Kumar Singh, R. V., and Dahiru, S. (2021). A modified ratio type estimator of finite population mean under Stratified random sampling scheme. Indian Journal of Applied Research, 11, 58–60.

Kadilar, C., and Cingi, H. (2003). Ratio estimators in stratified random sampling. Biometrical Journal, 45(2), 218–225.

Kadilar, C., and Cingi, H. (2005). A new ratio estimator in stratified random sampling. Communications in Statistics - Theory and Methods, 34(3), 597–602.

Kumar, M., and Vishwakarma, G. K. (2020). Generalized classes of regression-cum-ratio estimators of population mean in stratified random sampling. Proceedings of the National Academy of Sciences, India Section A: Physical Sciences, 90, 933–939.

Lone, H. A., Tailor, R., and Verma, M. R. (2017). Efficient separate class of estimators of population mean in stratified random sampling. Communications in Statistics – Theory and Methods, 46(2), 554–573.

Mishra, M., Singh, S., and Khare, B. B. (2022). Separate Ratio Estimator Using Calibration Approach for the Population mean in Stratified Random Sampling. Asian Journal of Probability and Statistics, 20(3), 64–73.

Prasad, B. (1989). Some improved ratio type estimators of population mean and ratio in finite population sample surveys. Communications in Statistics-Theory and Methods, 18(1), 379–392.

Rather, K. U. I., and Kadilar, C. (2022). Dual to Ratio cum Product Type of Exponential Estimator for Population Mean in Stratified Random Sampling. Journal of Statistics Applications & Probability Letters, 9(1), 43–48.

Sisodia, B. V. S., and Dwivedi, V. K. (1981). Modified ratio estimator using coefficient of variation of auxiliary variable. Journal of the Indian Society of Agricultural Statistics, 33(2), 13–18.

Singh, H. P., and Kakran, M. S. (1993). A modified ratio estimator using known coefficient of kurtosis of an auxiliary character. Journal of the Indian Society of Agricultural Statistics, 45(2), 141–148.

Shabbir, J., and Gupta, S. (2005). Improved ratio estimators in stratified sampling. American Journal of Mathematical and Management Sciences, 25(3–4), 293–311.

Shabbir, J., and Gupta, S. (2017). Estimation of finite population mean in simple and stratified random sampling using two auxiliary variables. Communications in Statistics-Theory and Methods, 46(20), 10135–10148.

Shahzad, U., Hanif, M., and Koyuncu, N. (2018). A new estimator for mean under stratified random sampling. Mathematical Sciences, 12, 163–169.

Shahzad, U., Hanif, M., Koyuncu, N., and Luengo, A. G. (2019). A family of ratio estimators in stratified random sampling utilizing auxiliary attribute alongside the nonresponse issue. Journal of Statistical Theory and Applications, 18(1), 12–25.

Tailor, R., and Chouhan, S. (2014). Ratio-cum-product type exponential estimator of finite population mean in stratified random sampling. Communications in Statistics-Theory and Methods, 43(2), 343–354.

Tailor, R., and Lone, H. A. (2014). Separate ratio-type estimators of population mean in stratified random sampling. Journal of Modern Applied Statistical Methods, 13(1), 14.

Tiwari, K. K., Bhougal, S., and Kumar, S. (2023). A general class of estimators in stratified random sampling. Communications in Statistics-Simulation and Computation, 52(2), 442–452.

Upadhyaya, L. N., and Singh, H. P. (1999). Use of transformed auxiliary variable in estimating the finite population mean. Biometrical Journal: Journal of Mathematical Methods in Biosciences, 41(5), 627–636.

United States. National Agricultural Statistics Service. (1999). 1997 Census of Agriculture. US Department of Agriculture, National Agricultural Statistics Service.

Yadav, R., Upadhyaya, L. N., Singh, H. P., and Chatterjee, S. (2011). Improved separate ratio exponential estimator for population mean using auxiliary information. Statistics in Transition new series, 12(2), 401–412.

Yadav, R., and Tailor, R. (2020). Estimation of finite population mean using two auxiliary variables under stratified random sampling. Statistics in Transition new series, 21(1), 1–12.

Yadav, R., Upadhyaya, L. N., Singh, H. P., and Chatterjee, S. (2014). Improved ratio and product exponential type estimators for finite population mean in stratified random sampling. Communications in Statistics-Theory and Methods, 43(15), 3269–3285.

Zaman, T. (2019). Efficient estimators of population mean using auxiliary attribute in stratified random sampling. Advances and Applications in Statistics, 56(2), 153–171.

Biographies


G. R. V. Triveni is currently a Ph.D. research scholar in the Department of Mathematics, School of Advanced Sciences, VIT-AP University, Amaravati, Andhra Pradesh, India. Her research area is sampling theory. She completed her M.Sc. at Andhra University, Andhra Pradesh. She has a limited knowledge of R software.


Faizan Danish is currently working as an Assistant Professor (Statistics) at VIT-AP University, Vijayawada, Andhra Pradesh, India, and received his Ph.D. from Sher-e-Kashmir University of Agricultural Sciences and Technology of Jammu, J&K, India. He has worked as a Postdoctoral Fellow at New York University, New York, USA. Dr. Danish has around 3 years of teaching experience and has published around 25 research papers in reputed journals. Further, he has worked as a Biostatistician with Research Consultation Services, Doha, Qatar. He has research expertise in sampling theory, mathematical and applied statistics, and biostatistics. Dr. Danish is well versed in statistical software: R, STATA, SPSS, Fortran 77, O.P STAT, WINDOSTAT, and Mathematica.
