Inverted Topp-Leone Distribution: Contribution to a Family of J-Shaped Frequency Functions in Presence of Random Censoring

Hiba Zeyada Muhammed1,* and Essam Abd Elsalam Muhammed2

1Department of Mathematical Statistics, Faculty of Graduate Studies for Statistical Research, Cairo University, Egypt

2Department of Management and Financial, High Institute of Computer and Information Technology, Elshorouk Academy, Egypt

E-mail: hiba_stat@cu.edu.eg; essamabdelsalam16@gmail.com

*Corresponding Author

Received 31 July 2021; Accepted 11 November 2021; Publication 06 December 2021

Abstract

In this paper, Bayesian and non-Bayesian estimation of the shape parameter of the inverted Topp-Leone distribution is studied for both complete and randomly censored samples. The maximum likelihood estimator (MLE) and the Bayes estimator of the unknown parameter are proposed. The Bayes estimates (BEs) are computed under the squared error loss (SEL) function using Markov Chain Monte Carlo (MCMC) techniques. The asymptotic, bootstrap (p, t), and highest posterior density intervals are computed. The Metropolis-Hastings algorithm is used to obtain the Bayes estimates. A Monte Carlo simulation is performed to compare the performance of the proposed methods, and one real data set is analyzed for illustrative purposes.

Keywords: Inverted Topp-Leone distribution, moments, order statistics, maximum likelihood estimation, Bayesian estimation, MCMC, highest posterior density interval, asymptotic confidence interval, bootstrap confidence interval, random censoring.

1 Introduction

The Topp-Leone (TL) distribution was originally proposed by [1] as an alternative to the beta distribution and was applied to several data sets. In recent years, the TL distribution has received considerable attention in the literature; for example, [2] showed that the TL distribution exhibits a bathtub failure rate function with widespread applications in reliability. Moreover, [3] showed that the TL distribution possesses some attractive reliability properties, such as a bathtub-shaped hazard rate, a decreasing reversed hazard rate, an upside-down mean residual life, and an increasing expected inactivity time. [4] derived admissible minimax estimates for the shape parameter of the TL distribution under squared and linear-exponential loss functions. Recently, [5] introduced the inverted Topp-Leone (IVT) distribution as a J-shaped distribution, which is useful for modeling lifetime phenomena, and studied several of its properties.

In this paper, we study classical and Bayesian estimation of the shape parameter of the IVT distribution when the sample is complete and when it is randomly censored.

Random censoring can be described as follows: when units under test are lost or removed from the test before failure, the resulting data are called randomly censored. For example, in clinical trials or medical tests, some patients withdraw or leave the study before it is finished.

The work in this paper is organized as follows. In Section 2 we introduce the IVT distribution. The maximum likelihood estimator (MLE) of the unknown parameter, the Bayes estimator, and the confidence intervals based on complete data are introduced in Section 3. In Section 4 we introduce the MLEs of the unknown parameters, the Bayes estimators, and the confidence intervals based on randomly censored data. Finally, the paper is concluded in Section 5.

2 Inverted Topp-Leone Distribution

The Topp-Leone distribution is defined by the following cdf and pdf, respectively:

F(t; β) = t^β (2 − t)^β (1)

and

f(t; β) = β(2 − 2t)(2t − t²)^(β−1) (2)

for 0 < t < 1 and β > 0.

Assume X = 1/T; then the pdf and cdf of X are given, respectively, as

f(x; β) = 2β(x − 1) x^(−2β−1) (2x − 1)^(β−1), (3)

and

F(x; β) = 1 − x^(−2β) (2x − 1)^β, (4)

for 1 < x < ∞ and β > 0.

In this case, the distribution of X is called the inverted Topp-Leone (IVT) distribution, denoted by IVT(β). It can be shown that the pdf (3) satisfies the following generalized Pearson system of differential equations

f′(x)/f(x) = (a0 + a1 x + a2 x²)/(b0 + b1 x + b2 x² + b3 x³)

where a0 = −(2β + 1), a1 = 4β + 4, a2 = −(2β + 2), b0 = 0, b1 = 1, b2 = −3 and b3 = 2.

The IVT distribution may be considered J-shaped because f(x) > 0, df(x)/dx < 0, and d²f(x)/dx² > 0 for some values of x, as can be noted from Figure 1. Also, Figure 2 shows the cdf of the IVT distribution for different values of the parameter β.

The mode of the IVT(β) distribution is given by 1 + 1/√(2β + 2).

The quantile function of the IVT(β) distribution is given by

x_q = (1 − q)^(−1/β) (1 + √(1 − (1 − q)^(1/β))), 0 < q < 1.

images

Figure 1 The pdf for IVT distribution.

images

Figure 2 The cdf for IVT distribution.

The median is a special case of the quantile function, obtained at q = 1/2:

x_0.5 = (0.5)^(−1/β) (1 + √(1 − (0.5)^(1/β)))

And the inter-quartile range (IQR) is given as

IQR = (1/4)^(−1/β)(1 + √(1 − (1/4)^(1/β))) − (3/4)^(−1/β)(1 + √(1 − (3/4)^(1/β))).
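The paper's computations were done in R; as an illustrative sketch only (not the authors' code), the closed-form quantile, median, and IQR can be evaluated in Python as follows. The helper name `ivt_quantile` is an assumption of this sketch.

```python
import math

def ivt_quantile(q, beta):
    """Quantile of IVT(beta): the x with F(x) = q, obtained by inverting
    the survival function R(x) = x^(-2b) (2x-1)^b at the value 1 - q."""
    u = (1.0 - q) ** (1.0 / beta)          # (1 - q)^(1/beta)
    return (1.0 + math.sqrt(1.0 - u)) / u  # root of u*x^2 - 2x + 1 = 0 in (1, inf)

beta = 2.0
median = ivt_quantile(0.5, beta)
iqr = ivt_quantile(0.75, beta) - ivt_quantile(0.25, beta)  # Q3 - Q1 > 0
```

Plugging the returned x back into F(x) = 1 − x^(−2β)(2x − 1)^β recovers q, which is a quick way to sanity-check the closed form.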

The kth moment about the origin, μ′_k, is given by the following theorem.

Theorem 1: The kth moment about zero of X is given by

μ′_k = β Σ_{j=0}^{∞} c(k, j) (−1)^j / (j − k + β) (5)

for k = 1, 2, …, n and β ≥ 1, where

c(k, 0) = 2^k, c(k, 1) = k 2^(k−1), and
c(k, j) = (k 2^(k−2j) / j!) ∏_{i=1}^{j−1} (k − j − i), j ≥ 2.

The survival function for a failure time X following the IVT distribution is defined as

R(x) = x^(−2β) (2x − 1)^β (6)

which, for fixed x, gives the probability of survival up to time x. Figure 3 shows the survival function for the IVT distribution with different parameter values.

Moreover, for the IVT distribution, the hazard function is easily obtained as

h(x) = 2β(x − 1)/(x(2x − 1)). (7)

and it takes different shapes according to the value of the parameter β, as shown in Figure 4.

The reversed hazard function for the IVT distribution is given as follows

r(x) = 2β(x − 1) x^(−2β−1) (2x − 1)^(β−1) / [1 − x^(−2β)(2x − 1)^β]. (8)

and it has different shapes, as shown in Figure 5, according to the variability in the parameter β.

images

Figure 3 The reliability function for IVT distribution.

images

Figure 4 The hazard function for IVT distribution.

images

Figure 5 The reversed hazard function for IVT distribution.

2.1 Distributions and Moments of Order Statistics from IVT Distribution

Let X₁, X₂, …, X_n be independent and identically distributed random variables drawn from the IVT distribution, and let X_(r), r = 1, 2, …, n, be the rth order statistic. The pdf of X_(r) is defined as

f_{r:n}(x) = C_{n,r} [F(x)]^(r−1) [1 − F(x)]^(n−r) f(x)

where x = x_(r) and C_{n,r} = n!/((r − 1)!(n − r)!). Substituting (3) and (4) gives

f_{r:n}(x) = 2β C_{n,r} (x − 1) x^(−2β(n−r+1)−1) (2x − 1)^(β(n−r+1)−1) [1 − x^(−2β)(2x − 1)^β]^(r−1)

Special cases for X_(1) and X_(n) are, respectively,

f_{1:n}(x) = 2nβ(x − 1) x^(−2nβ−1) (2x − 1)^(nβ−1), x = x_(1)

and

f_{n:n}(x) = 2nβ(x − 1) x^(−2β−1) (2x − 1)^(β−1) [1 − x^(−2β)(2x − 1)^β]^(n−1), x = x_(n)

The joint pdf of x_(r) and x_(s), 1 ≤ r < s ≤ n, for a sample of size n is

f_{r,s:n}(x, y) = C_{n,r,s} [F(x)]^(r−1) [F(y) − F(x)]^(s−r−1) [1 − F(y)]^(n−s) f(x) f(y)

where x = x_(r), y = x_(s) and C_{n,r,s} = n!/((r − 1)!(s − r − 1)!(n − s)!). Substituting (3) and (4) gives

f_{r,s:n}(x, y) = 4β² C_{n,r,s} [1 − x^(−2β)(2x − 1)^β]^(r−1) [x^(−2β)(2x − 1)^β − y^(−2β)(2y − 1)^β]^(s−r−1) [y^(−2β)(2y − 1)^β]^(n−s) (x − 1) x^(−2β−1)(2x − 1)^(β−1) (y − 1) y^(−2β−1)(2y − 1)^(β−1), 1 < x < y < ∞

In the following theorems, the moments and product moments about the origin are introduced.

Theorem 2: The kth moment about zero, μ^(k)_{r:n}, of the rth order statistic X_(r) is given by

μ^(k)_{r:n} = C_{n,r} Σ_{j=0}^{∞} c(k, j) (−1)^j B((j − k)/β + 1, n − r + 1) (9)

for k = 1, 2, …, n and r = 1, 2, …, n, where

c(k, 0) = 2^k, c(k, 1) = k 2^(k−1), and
c(k, j) = (k 2^(k−2j) / j!) ∏_{i=1}^{j−1} (k − j − i), j ≥ 2,

and B(·, ·) is the beta function.

The kth moment of X_(1) is given as

μ^(k)_{1:n} = n! Σ_{j=0}^{∞} c(k, j) (−1)^j Γ((j − k)/β + 1) / Γ((j − k)/β + n + 1).

The kth moment of X_(n) is given as

μ^(k)_{n:n} = n Σ_{j=0}^{∞} c(k, j) (−1)^j Γ((j − k)/β + n) / Γ((j − k)/β + n + 1).

Theorem 3: The kth and Lth product moments about zero, μ^(k,L)_{r,s:n}, of X_(r) and X_(s) are given by

μ^(k,L)_{r,s:n} = C_{n,r,s} Σ_{j3=0}^{s−r−1} Σ_{j2=0}^{∞} Σ_{j1=0}^{∞} c(k, j1) c(L, j2) C(s−r−1, j3) [(−1)^(j1+j2+j3) / ((j1 − k)/β + j3 + r)] B((j1 + j2 − k − L)/β + s, n − s + 1), (10)

for k = 1, 2, …, n, L = 1, 2, …, n and 1 ≤ r < s ≤ n, where c(k, j1) and c(L, j2) are as given in (9), and C(s−r−1, j3) is the binomial coefficient.

3 Estimation Based on Complete Samples for IVT Distribution Shape Parameter

3.1 Maximum Likelihood Estimation

Suppose that X1,X2,,Xn is a simple random sample of size n drawn from IVT(β). In this section, the shape parameter of the IVT distribution will be estimated using the MLE as follows.

The likelihood function is given by

l(x; β) = ∏_{i=1}^{n} 2β(x_i − 1) x_i^(−(2β+1)) (2x_i − 1)^(β−1) (11)

The natural logarithm of the likelihood function, up to an additive constant, is

L(x; β) ∝ n log(β) − (2β + 1) Σ_{i=1}^{n} log(x_i) + (β − 1) Σ_{i=1}^{n} log(2x_i − 1)

After differentiating L(x; β) with respect to β and equating to zero, the MLE of β can be expressed in closed form as

β̂ = n / Σ_{i=1}^{n} log(x_i² / (2x_i − 1)).
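Although the paper's numerical work uses R, the closed-form MLE above is a one-liner in any language; a minimal Python sketch follows (the helper name `ivt_mle` is an assumption of this sketch, not from the paper).

```python
import math

def ivt_mle(xs):
    """Closed-form MLE of the IVT shape parameter:
    beta_hat = n / sum(log(x_i^2 / (2*x_i - 1)))."""
    n = len(xs)
    s = sum(math.log(x * x / (2.0 * x - 1.0)) for x in xs)
    return n / s
```

Because log(x²/(2x − 1)) = −(1/β) log R(x) and R(X) is uniform, the denominator behaves like a Gamma(n, β) sum, so β̂ is consistent for β.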

3.2 Bayesian Estimation

In this section, we discuss the Bayesian estimation procedure for the parameter of the IVT distribution and obtain the Bayes estimate of the unknown parameter under the squared error loss (SEL) function. We assume that the unknown parameter of the IVT distribution has a gamma prior distribution, written up to proportionality as follows:

π(β | a1, b1) ∝ β^(a1−1) e^(−β b1), β > 0, a1, b1 > 0 (12)

Hyper-parameters determination: The hyper-parameters involved in priors (12) can be easily evaluated if we consider that prior mean and prior variance are known. The prior mean and prior variance will be obtained from the maximum likelihood estimate of (β) by equating the mean and variance of (β^j) with the mean and variance of the considered priors (gamma prior), where j=1,2,,k and k is the number of random samples generated from the model in Section 3.1. Thus, on equating mean and variance of (β^j) with the mean and variance of gamma priors, we get ([6])

(1/k) Σ_{j=1}^{k} β̂_j = a1/b1  and  (1/(k−1)) Σ_{j=1}^{k} (β̂_j − (1/k) Σ_{j=1}^{k} β̂_j)² = a1/b1²

Now, on solving the above two equations, the estimated hyper-parameters can be written as

a1 = [(1/k) Σ_{j=1}^{k} β̂_j]² / [(1/(k−1)) Σ_{j=1}^{k} (β̂_j − (1/k) Σ_{j=1}^{k} β̂_j)²]  and

b1 = [(1/k) Σ_{j=1}^{k} β̂_j] / [(1/(k−1)) Σ_{j=1}^{k} (β̂_j − (1/k) Σ_{j=1}^{k} β̂_j)²]
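This moment matching can be sketched as follows (a Python illustration assuming the pilot MLEs β̂_1, …, β̂_k are already available; `gamma_hyperparams` is a hypothetical name):

```python
def gamma_hyperparams(beta_hats):
    """Match the gamma prior mean a1/b1 and variance a1/b1^2 to the
    sample mean and (unbiased) sample variance of the pilot MLEs."""
    k = len(beta_hats)
    mean = sum(beta_hats) / k
    var = sum((b - mean) ** 2 for b in beta_hats) / (k - 1)
    return mean ** 2 / var, mean / var   # (a1, b1)
```

For example, pilot estimates with mean 2 and variance 1 give a1 = 4 and b1 = 2, i.e. a gamma prior centered at 2.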

Based on the likelihood function (11) and the gamma prior (12), the posterior density function of β given the data can be written as

π(β | x̄) = π(β) L(β | x̄) / ∫_0^∞ π(β) L(β | x̄) dβ.

Then, the posterior density function can be written as

π(β | x̄) = (1/k) β^(n+a1−1) e^(−β b1) ∏_{i=1}^{n} 2(x_i − 1) x_i^(−(2β+1)) (2x_i − 1)^(β−1) (13)

where

k = ∫_0^∞ β^(n+a1−1) e^(−β b1) ∏_{i=1}^{n} 2(x_i − 1) x_i^(−(2β+1)) (2x_i − 1)^(β−1) dβ.

Thus, the Bayes estimate of β based on the SEL function is given by

β̃ = E(β | x̄) = ∫_0^∞ β π(β) L(β | x̄) dβ / ∫_0^∞ π(β) L(β | x̄) dβ (14)

It should be noted that the ratio of integrals in (14) cannot be obtained in closed form. So, we use the MCMC approximation method to generate samples from (13), to calculate the BE of β, and to construct the associated HPD intervals.

Markov Chain Monte Carlo (MCMC) is a computer-driven sampling technique. It permits one to characterize a distribution, without knowing all of its mathematical properties, by randomly sampling values from it ([7]). We use the Metropolis-Hastings (M-H) method with a normal proposal distribution to generate random numbers from (13). Thus, we perform the following steps of the M-H algorithm to draw samples from the posterior density (13) and, in turn, compute the Bayes estimate of β and construct the corresponding HPD intervals ([8]):

I. Set an initial value θ^(0) and choose the burn-in period M.

II. For i = 1, …, N, repeat the following steps.

∙ Set θ = θ^(i−1).

∙ Generate a new candidate value ω from N(log(θ), S_θ) and set θ′ = exp(ω).

∙ Calculate A = min(1, π(θ′ | x̄)/π(θ | x̄)).

∙ Update θ^(i) = θ′ with probability A; otherwise set θ^(i) = θ.

The approximate Bayes estimate of θ = β based on the draws θ^(i), i = 1, 2, …, N, with respect to the SEL function is given by

θ̃_BS = (1/(N − M)) Σ_{i=M+1}^{N} θ^(i),

where θ̃_BS is the Bayes estimate under the SEL function and M is the burn-in period (that is, the number of iterations discarded before the stationary distribution is reached).
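A minimal Python sketch of the sampler follows (the paper's runs were done in R; the function names and the step size `step` are assumptions of this sketch). One practical detail: because the candidate is proposed on the log scale, the acceptance ratio below includes the Jacobian factor θ′/θ of the log transform.

```python
import math
import random

def log_posterior(beta, xs, a1, b1):
    """Log of the unnormalised posterior (13): gamma(a1, b1) prior times IVT likelihood."""
    if beta <= 0:
        return float("-inf")
    s = sum(math.log(x * x / (2.0 * x - 1.0)) for x in xs)  # sufficient statistic
    return (len(xs) + a1 - 1) * math.log(beta) - beta * (b1 + s)

def mh_sampler(xs, a1, b1, n_iter=10000, burn_in=2000, step=0.2, seed=1):
    """Metropolis-Hastings with a normal proposal on log(theta), as in Section 3.2."""
    random.seed(seed)
    theta, chain = 1.0, []
    for i in range(n_iter):
        omega = random.gauss(math.log(theta), step)   # candidate on the log scale
        cand = math.exp(omega)
        log_a = (log_posterior(cand, xs, a1, b1)
                 - log_posterior(theta, xs, a1, b1)
                 + math.log(cand) - math.log(theta))  # Jacobian of the log transform
        if random.random() < math.exp(min(0.0, log_a)):
            theta = cand
        if i >= burn_in:
            chain.append(theta)
    return sum(chain) / len(chain)   # Bayes estimate under the SEL function
```

The retained draws can also be reused directly for the HPD interval of Section 3.3.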

3.3 Interval Estimation for IVT Distribution Shape Parameter

In this section, we propose different confidence intervals. One is based on the asymptotic distribution of β, two different bootstrap confidence intervals, and finally, HPD intervals.

The Asymptotic Confidence Interval

The second derivative of L(x; β) is trivially obtained as

d²L/dβ² = −n/β².

The observed Fisher information is given by

I(β̂) = −d²L/dβ² |_(β=β̂) = n/β̂²

The asymptotic variance of β^ is

V(β̂) = 1/I(β̂) = β̂²/n.

The sampling distribution of (β̂ − β)/√V(β̂) can be approximated by a standard normal distribution.

The large sample (1-α)100% confidence interval for β is given by

(β̂_L, β̂_U) = β̂ ∓ Z_(α/2) √V(β̂),

where Z_(α/2) is the upper α/2 point of the standard normal distribution and (1 − α) is the confidence coefficient.
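A sketch of the interval computation in Python (the paper's tables were produced in R; `asymptotic_ci` is an illustrative name):

```python
import math
from statistics import NormalDist

def asymptotic_ci(beta_hat, n, alpha=0.05):
    """Large-sample CI from V(beta_hat) = beta_hat^2 / n."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 normal point
    half = z * math.sqrt(beta_hat ** 2 / n)
    return beta_hat - half, beta_hat + half
```

For β̂ = 2 and n = 100 this gives roughly 2 ∓ 1.96 × 0.2.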

Bootstrap Confidence Intervals

Bootstrap confidence intervals are approximate, but in general they are better approximations than the standard intervals. A parametric bootstrap interval also provides much more information about the population value of the quantity of interest than a point estimate does. The parametric bootstrap methods are of two types:

(i) The percentile bootstrap method (Boot-p) was proposed by [9].

(ii) The Bootstrap-t method (Boot-t) was proposed by [10].

Percentile Bootstrap (Boot-P) Confidence Interval

The boot-p method is rather simple; it constructs confidence intervals directly from the percentiles of the bootstrap distribution of the estimated parameter. It is given by the following steps:

I. A complete sample is generated from the original data T=(t1,t2tn) and the MLE θ^=(β^) of the parameter θ=(β) is computed.

II. Again, an independent complete bootstrap sample T*=(t1*,t2*tn*) is generated by using θ^.

III. Now, compute the bootstrap MLE θ^* of parameter θ based on T*, as in step-1.

IV. Repeat steps II–III B times to obtain B bootstrap MLEs θ̂*_i based on B different bootstrap samples, i = 1, 2, …, B.

V. Arrange all θ^*’s in an ascending order to obtain the bootstrap sample i.e. θ^*(1)θ^*(2)θ^*(B). An approximate 100(1-ω)% boot-p confidence interval for θ is obtained by (θ^*[(ω2)×B],θ^*[(1-ω2)×B]).

where ω/2 is the quantity that determines the bootstrap percentile points.
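Steps I–V above can be sketched in Python as follows (illustrative helper names; the generator inverts the survival function, as in Section 3.4):

```python
import math
import random

def ivt_sample(beta, n, rng):
    """Draw n IVT(beta) variates by inverting the survival function."""
    out = []
    for _ in range(n):
        v = rng.random() ** (1.0 / beta)
        out.append((1.0 + math.sqrt(1.0 - v)) / v)
    return out

def ivt_mle(xs):
    """Closed-form MLE of beta from Section 3.1."""
    return len(xs) / sum(math.log(x * x / (2 * x - 1)) for x in xs)

def boot_p_ci(xs, B=1000, omega=0.05, seed=7):
    """Percentile bootstrap CI for beta (steps I-V of the boot-p algorithm)."""
    rng = random.Random(seed)
    beta_hat = ivt_mle(xs)                                        # step I
    reps = sorted(ivt_mle(ivt_sample(beta_hat, len(xs), rng))     # steps II-IV
                  for _ in range(B))
    lo = reps[int((omega / 2) * B)]                               # step V
    hi = reps[int((1 - omega / 2) * B) - 1]
    return lo, hi
```

Resampling is parametric: each bootstrap sample is generated from IVT(β̂), not by resampling the original observations.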

Bootstrap-t (Boot-t) Confidence Intervals

The bootstrap-t confidence interval is given by the following steps:

I. Steps 1 and 2 of boot-p and boot-t methods are the same.

II. Compute the bootstrap-t statistic T*_b = (θ̂*_b − θ̂)/√v(θ̂*_b), for b = 1, 2, …, B.

III. Repeat step II for all B bootstrap samples to obtain the set of bootstrap statistics T*_b, b = 1, 2, …, B.

IV. Let T*(1)T*(2)T*(B) be the ordered values of T*i; i=1,2,,B.

V. Now, the approximate 100(1 − ω)% boot-t confidence interval for the parameter θ is obtained by

(θ̂ − T*_[(1−ω/2)×B] √v(θ̂), θ̂ − T*_[(ω/2)×B] √v(θ̂))
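The same scheme, studentised: a Python sketch (illustrative names) that uses v(θ̂*) = θ̂*²/n from the observed information of Section 3.3.

```python
import math
import random

def boot_t_ci(xs, B=1000, omega=0.05, seed=7):
    """Bootstrap-t CI for beta, studentising with v(beta*) = beta*^2 / n."""
    n = len(xs)
    rng = random.Random(seed)
    beta_hat = n / sum(math.log(x * x / (2 * x - 1)) for x in xs)
    v_hat = beta_hat ** 2 / n
    ts = []
    for _ in range(B):
        ys = []
        for _ in range(n):                       # parametric resample from IVT(beta_hat)
            v = rng.random() ** (1.0 / beta_hat)
            ys.append((1.0 + math.sqrt(1.0 - v)) / v)
        b_star = n / sum(math.log(y * y / (2 * y - 1)) for y in ys)
        ts.append((b_star - beta_hat) / math.sqrt(b_star ** 2 / n))
    ts.sort()
    lo = beta_hat - ts[int((1 - omega / 2) * B) - 1] * math.sqrt(v_hat)
    hi = beta_hat - ts[int((omega / 2) * B)] * math.sqrt(v_hat)
    return lo, hi
```

Note the sign flip: the upper tail of the studentised statistics determines the lower interval endpoint, and vice versa.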

Highest Posterior Density (HPD) Intervals

The HPD interval for the unknown parameter can be constructed by using the following algorithm. Let θ_(1) ≤ θ_(2) ≤ … ≤ θ_(N) be the ordered MCMC sample. For a credible level (1 − φ), 0 < φ < 1, consider the intervals D_j(N) = (θ_(j), θ_(j+[(1−φ)N])) for j = 1, 2, …, N − [(1−φ)N]. The HPD interval is the D_{j*}(N) with the smallest width among all the D_j(N)'s, where j* is chosen so that

θ_(j*+[(1−φ)N]) − θ_(j*) = min_{1 ≤ j ≤ N−[(1−φ)N]} (θ_(j+[(1−φ)N]) − θ_(j))

Taking θ^(i) = β^(i), the HPD interval for the unknown parameter can then be constructed.
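The smallest-width search over the ordered draws can be sketched as (illustrative name `hpd_interval`):

```python
def hpd_interval(samples, phi=0.05):
    """Smallest-width interval covering a (1 - phi) fraction of the MCMC draws."""
    s = sorted(samples)
    n = len(s)
    m = int((1 - phi) * n)                        # number of steps spanned by each candidate
    widths = [s[j + m] - s[j] for j in range(n - m)]
    j_star = min(range(n - m), key=widths.__getitem__)
    return s[j_star], s[j_star + m]
```

For a roughly symmetric posterior this agrees closely with the equal-tail interval; for a skewed posterior it is shorter.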

3.4 Simulation Study

A simulation study was carried out to assess the accuracy of the point and interval estimates in several cases, estimating the parameter β of the IVT distribution with m = 1000 replications, for different sample sizes n = 25, 50, 80, 100 and different parameter values. All the computations are performed using the statistical software R.

The simulation results for the MLE are summarized in Tables 2, 3, 4, and 5 and are obtained by the following steps:

i. Specify initial values for parameter (β) as (0.5), (0.8), (1.2) and (1.9).

ii. Specify the sample size n. as n=25,50,80,100.

iii. Generate n standard uniform variates i.e. UUniform(0,1).

iv. Generate complete samples of size n from the IVT(β) distribution by using the formula x = U^(−1/β)(1 + √(1 − U^(1/β))).

v. Obtain the maximum likelihood estimates (MLEs).

vi. Obtain the mean, bias, mean squared error (MSE), asymptotic and bootstrap confidence intervals (CI’s) for the unknown parameters, average interval lengths (AILs), and coverage probability (CP) for the different sample size.

vii. Repeat steps iii–vi 1000 times.
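The steps above can be sketched as follows (a scaled-down Python illustration with m = 200 rather than 1000; the paper's actual runs used R, and the function name is an assumption):

```python
import math
import random

def simulate_mle(beta, n, m=200, seed=11):
    """Steps iii-vii: m replications of size-n IVT(beta) samples;
    returns the mean, bias, and MSE of the closed-form MLE."""
    rng = random.Random(seed)
    ests = []
    for _ in range(m):
        s = 0.0
        for _ in range(n):
            v = rng.random() ** (1.0 / beta)          # U^(1/beta)
            x = (1.0 + math.sqrt(1.0 - v)) / v        # inverse-survival sampling
            s += math.log(x * x / (2.0 * x - 1.0))
        ests.append(n / s)                            # MLE for this replication
    mean = sum(ests) / m
    bias = mean - beta
    mse = sum((e - beta) ** 2 for e in ests) / m
    return mean, bias, mse
```

As the tables report, both bias and MSE shrink as n grows.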

And the simulation results for the Bayesian estimate are summarized in Tables 2, 3, 4, and 5 which are obtained by the following steps:

i. Steps i, ii, iii, iv, and v of the MLE simulation are the same.

ii. Use the M-H algorithm shown in Section 3.2, under the informative prior and the non-informative prior, and run the chain for N = 10000 iterations to obtain MCMC samples.

∙ For informative prior, we compute the hyperparameters for all simulation cases as in Table 1.

∙ For non-informative prior (P-II) we assume that hyper-parameter values are a1=b1=0.

iii. Compute the approximate Bayes estimator of β under the SEL function, given by

β̃_SEL = (1/(N − M)) Σ_{i=M+1}^{N} β^(i),

where M (= 2000) is the burn-in period (that is, the number of iterations discarded before the stationary distribution is reached).

iv. Repeat steps i–iii 1000 times to obtain the mean, bias, mean squared error (MSE), HPD intervals for the unknown parameter, average interval lengths (AILs), and coverage probabilities (CPs) for the different sample sizes.

Table 1 The hyper-parameter values under complete data

Initial Values

n     Hyper-Parameters   β0=0.5   β0=0.8   β0=1.2   β0=1.9
30    a1                 23.97    23.95    23.94    23.88
      b1                 45.85    28.75    19.21    12.00
50    a1                 48.92    48.93    48.99    48.97
      b1                 95.59    59.47    40.02    25.21
80    a1                 78.91    78.86    79.05    79.04
      b1                 155.47   97.51    64.86    41.17
100   a1                 98.90    98.97    99.06    98.90
      b1                 195.76   123.19   82.52    51.58

Table 2 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under complete data

images

Table 3 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under complete data

images

Table 4 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under complete data

images

From tabulated values in Tables 2, 3, 4, and 5 it can be noticed that:

i. As expected, with respect to MSEs, higher values of n lead to better estimates.

ii. It is also noticed that the maximum likelihood estimates compete well with non-informative Bayes estimates, and the performance of the Bayes estimates obtained under informative prior is better than the non-informative Bayes estimates.

iii. It can also be noticed that under the informative prior, the AILs and associated CPs of the HPD intervals are better than those of the non-informative prior, the bootstrap (p, t), and the asymptotic confidence intervals, respectively.

Table 5 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under complete data

images

3.5 Application to Real Data Set

In this section, the IVT distribution is fitted to a real data set to show how it can be applied in practice. Moreover, the IVT distribution is also compared with other inverted distributions fitted to these data, namely the inverse exponential (IE), inverse Rayleigh (IR), and inverse Lindley (IL) distributions, which are introduced below.

The cdf, pdf of the inverse exponential (IE) distribution are respectively as

F_IE(x) = e^(−λ/x), x > 0, λ > 0, and f_IE(x) = (λ/x²) e^(−λ/x).

The cdf, pdf of the inverse Rayleigh (IR) distribution are respectively as:

F_IR(x) = e^(−(σ/x)²), x > 0, σ > 0, and f_IR(x) = (2σ²/x³) e^(−(σ/x)²).

The cdf, pdf of the inverse Lindley (IL) distribution are respectively as

F_IL(x) = [1 + (θ/(1 + θ))(1/x)] e^(−θ/x), x > 0, θ > 0, and f_IL(x) = (θ²/(1 + θ)) ((1 + x)/x³) e^(−θ/x).

The data set consists of 100 observations of the breaking stress of carbon fibers (in GPa), listed as follows:
1.061, 1.117, 1.162, 1.183, 1.187, 1.192, 1.196, 1.213, 1.215, 1.219, 1.220, 1.224, 1.225, 1.228, 1.237, 1.240, 1.244, 1.259, 1.261, 1.263, 1.276, 1.310, 1.321, 1.329, 1.331, 1.337, 1.351, 1.359, 1.388, 1.408, 1.449, 1.449, 1.450, 1.459, 1.471, 1.475, 1.477, 1.480, 1.489, 1.501, 1.507, 1.515, 1.530,1.530, 1.533, 1.544, 1.544, 1.552, 1.556, 1.562, 1.566, 1.585, 1.586, 1.599, 1.602, 1.614, 1.616, 1.617, 1.628, 1.684, 1.711, 1.718, 1.733, 1.738, 1.743, 1.759, 1.777, 1.794, 1.799, 1.806, 1.814, 1.816, 1.828, 1.830, 1.884, 1.892, 1.944, 1.972, 1.984, 1.987, 2.020, 2.030, 2.029, 2.035, 2.037, 2.043, 2.046, 2.059, 2.111, 2.165, 2.686, 2.778, 2.972, 3.504, 3.863, 5.306

Figure 6 shows the empirical distribution compared with the fitted inverted distributions, namely IVT, IE, IR, and IL.

images

Figure 6 Empirical distribution for lifetimes for carbon fibers data.

Table 6 MLEs, AIC, BIC, AICC and HQIC values, and Kolmogorov-Smirnov statistics for carbon fibers data

Measures

Model MLE p-value K-S -2log L AIC BIC AICc HQIC
IVT 5.6313 0.1197 0.1211 90.3201 92.3201 94.8844 90.3617 93.3566
IE 1.5680 0.0000 0.4528 290.4326 292.4326 294.9969 290.4742 293.4691
IR 1.5293 0.0000 0.3407 173.0542 175.0542 177.6185 173.0958 176.0907
IL 2.0773 0.0000 0.4350 280.3121 282.3121 284.8765 280.3538 283.3487
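As an illustration of how the IVT entries of Table 6 can be computed (a Python sketch with illustrative names; the K-S distance is coded directly rather than taken from a library, and the paper's own computations were done in R):

```python
import math

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, abs((i + 1) / n - f), abs(f - i / n))
    return d

def fit_ivt(data):
    """Closed-form MLE of beta, the K-S distance, and AIC = 2k - 2 log L (k = 1)."""
    n = len(data)
    beta = n / sum(math.log(x * x / (2 * x - 1)) for x in data)
    loglik = sum(math.log(2 * beta * (x - 1)) - (2 * beta + 1) * math.log(x)
                 + (beta - 1) * math.log(2 * x - 1) for x in data)
    cdf = lambda x: 1 - x ** (-2 * beta) * (2 * x - 1) ** beta
    return beta, ks_statistic(data, cdf), 2 - 2 * loglik
```

BIC, AICc, and HQIC follow from the same log-likelihood with their respective penalty terms.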

4 Estimation Based on Random Censored Samples for IVT Distribution Shape Parameter

4.1 Model Assumption and Description

Random censoring can be described as follows. Suppose n units are under test, and let their lifetimes T₁, T₂, …, T_n be independent and identically distributed (iid) random variables with pdf f_T(t) and cdf F_T(t), t > 0. Their random censoring times C₁, C₂, …, C_n are iid with pdf g_C(c) and cdf F_C(c), c > 0, and the T_i's and C_i's are assumed mutually independent. Note that, for each unit, only one of T_i and C_i is observed. Further, let the actual observed time be X_i = min(T_i, C_i), i = 1, …, n, and define the indicator variable δ_i as

δ_i = { 1, if T_i ≤ C_i; 0, if T_i > C_i } (15)

The data (X_i, δ_i) are known as a randomly censored sample. The likelihood function under random censoring is given by [11]

L = ∏_{i=1}^{n} [f_T(x_i) S_C(x_i)]^(δ_i) [g_C(x_i) R_T(x_i)]^(1−δ_i) (16)

where R_T(x_i) = 1 − F_T(x_i) and S_C(x_i) = 1 − F_C(x_i).

4.2 Maximum Likelihood Estimation

In this section, we obtain the MLEs for the unknown parameters of the IVT distribution. Let the lifetime T and censoring time C follow IVT (β1) and IVT (β2) respectively. Then the likelihood function for the unknown parameters under random censoring becomes:

L = ∏_{i=1}^{n} [2β1 (x_i − 1) x_i^(−2β1−2β2−1) (2x_i − 1)^(β1+β2−1)]^(δ_i) [2β2 (x_i − 1) x_i^(−2β1−2β2−1) (2x_i − 1)^(β1+β2−1)]^(1−δ_i) (17)

where i=1nδi=r is the observed number of uncensored lifetimes or failures.

Then, the corresponding log-likelihood function can be written as

l = r log(2β1) + Σ_{i=1}^{n} δ_i log(x_i − 1) − (2β1 + 2β2 + 1) Σ_{i=1}^{n} δ_i log(x_i) + (β1 + β2 − 1) Σ_{i=1}^{n} δ_i log(2x_i − 1) + (n − r) log(2β2) + Σ_{i=1}^{n} (1 − δ_i) log(x_i − 1) − (2β1 + 2β2 + 1) Σ_{i=1}^{n} (1 − δ_i) log(x_i) + (β1 + β2 − 1) Σ_{i=1}^{n} (1 − δ_i) log(2x_i − 1) (18)

Differentiating (18) with respect to β1 and β2 gives:

∂l/∂β1 = r/β1 − 2 Σ_{i=1}^{n} δ_i log(x_i) + Σ_{i=1}^{n} δ_i log(2x_i − 1) − 2 Σ_{i=1}^{n} (1 − δ_i) log(x_i) + Σ_{i=1}^{n} (1 − δ_i) log(2x_i − 1) (19)

∂l/∂β2 = (n − r)/β2 − 2 Σ_{i=1}^{n} δ_i log(x_i) + Σ_{i=1}^{n} δ_i log(2x_i − 1) − 2 Σ_{i=1}^{n} (1 − δ_i) log(x_i) + Σ_{i=1}^{n} (1 − δ_i) log(2x_i − 1) (20)

Equating the first derivatives in (19) and (20) to zero and solving for β1 and β2 gives the MLEs β̂1 and β̂2 of β1 and β2, respectively, in closed form as follows

β̂1 = r / [2 Σ_{i=1}^{n} δ_i log(x_i) − Σ_{i=1}^{n} δ_i log(2x_i − 1) + 2 Σ_{i=1}^{n} (1 − δ_i) log(x_i) − Σ_{i=1}^{n} (1 − δ_i) log(2x_i − 1)]

and

β̂2 = (n − r) / [2 Σ_{i=1}^{n} δ_i log(x_i) − Σ_{i=1}^{n} δ_i log(2x_i − 1) + 2 Σ_{i=1}^{n} (1 − δ_i) log(x_i) − Σ_{i=1}^{n} (1 − δ_i) log(2x_i − 1)]
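Since the δ-weighted and (1 − δ)-weighted sums add up over all observations, the common denominator is simply Σ log(x_i²/(2x_i − 1)); a Python sketch (the name `censored_mles` is an assumption, and `deltas` holds the indicators δ_i):

```python
import math

def censored_mles(xs, deltas):
    """Closed-form MLEs under random censoring:
    beta1_hat = r / S and beta2_hat = (n - r) / S, where r is the number of
    observed failures and S = sum log(x_i^2 / (2*x_i - 1)) over all times."""
    n = len(xs)
    r = sum(deltas)                                    # observed failures
    s = sum(math.log(x * x / (2 * x - 1)) for x in xs)
    return r / s, (n - r) / s
```

Because min(T, C) follows IVT(β1 + β2) and P(T ≤ C) = β1/(β1 + β2), both estimators are consistent.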

4.3 Bayes Estimation for IVT Shape Parameter

In this section, we discuss the Bayesian estimation procedure for the parameters of the IVT distribution based on randomly censored samples, and we obtain the BEs of the unknown parameters under the squared error loss (SEL) function. We assume that the unknown parameters of the IVT distribution have independent gamma priors, written up to proportionality as follows:

π(β1 | a1, b1) ∝ β1^(a1−1) e^(−β1 b1), β1 > 0, a1, b1 > 0 (21)
π(β2 | a2, b2) ∝ β2^(a2−1) e^(−β2 b2), β2 > 0, a2, b2 > 0 (22)

Therefore, the joint prior density of β1 and β2 can be written, up to proportionality, as

π(β1, β2) ∝ β1^(a1−1) β2^(a2−1) e^(−(β1 b1 + β2 b2)), β1, β2 > 0, a1, a2, b1, b2 > 0 (23)

Hyper-parameters determination: As in Section 3.2, the hyper-parameters involved in priors (21) and (22) can be easily evaluated, if we consider that prior mean and prior variance are known. The prior mean and prior variance will be obtained from the maximum likelihood estimates of (β1,β2) by equating the mean and variance of (β^1j,β^2j) with the mean and variance of the considered priors (gamma prior), where j=1,2,,k and k is the number of random samples generated from the model in Section 4.2. Thus, on equating mean and variance of (β^1j,β^2j) with the mean and variance of gamma priors, we get

(1/k) Σ_{j=1}^{k} β̂_1j = a1/b1  and  (1/(k−1)) Σ_{j=1}^{k} (β̂_1j − (1/k) Σ_{j=1}^{k} β̂_1j)² = a1/b1²

Now, on solving the above two equations, the estimated hyper-parameters can be written as

a1 = [(1/k) Σ_{j=1}^{k} β̂_1j]² / [(1/(k−1)) Σ_{j=1}^{k} (β̂_1j − (1/k) Σ_{j=1}^{k} β̂_1j)²]  and

b1 = [(1/k) Σ_{j=1}^{k} β̂_1j] / [(1/(k−1)) Σ_{j=1}^{k} (β̂_1j − (1/k) Σ_{j=1}^{k} β̂_1j)²]

A similar procedure for determining the hyperparameters (a2,b2) can be used for β2.

Based on the likelihood function (17) and the joint prior density (23), the joint posterior density of β1 and β2 given the data can be written as

π(β1, β2 | x̄) = π(β1, β2) L(β1, β2 | x̄) / ∫_0^∞ ∫_0^∞ π(β1, β2) L(β1, β2 | x̄) dβ1 dβ2.

Then, the joint posterior density function can be written as

π(β1, β2 | x̄) = (1/k) β1^(r+a1−1) β2^(n−r+a2−1) e^(−(β1 b1 + β2 b2)) ∏_{i=1}^{n} (x_i − 1) x_i^(−2β1−2β2−1) (2x_i − 1)^(β1+β2−1) (24)

where the normalizing constant is

k = ∫_0^∞ ∫_0^∞ β1^(r+a1−1) β2^(n−r+a2−1) e^(−(β1 b1 + β2 b2)) ∏_{i=1}^{n} (x_i − 1) x_i^(−2β1−2β2−1) (2x_i − 1)^(β1+β2−1) dβ1 dβ2.

Thus, the Bayes estimate of g(β1, β2) based on the SEL function is given by

g̃_BS(β1, β2) = E(g(β1, β2) | x̄) = ∫_0^∞ ∫_0^∞ g(β1, β2) π(β1, β2) L(β1, β2 | x̄) dβ1 dβ2 / ∫_0^∞ ∫_0^∞ π(β1, β2) L(β1, β2 | x̄) dβ1 dβ2 (25)

It should be noted that the ratio of integrals in (25) cannot be obtained in closed form. So, we use the MCMC approximation method to generate samples from (24), to calculate the BEs of β1 and β2, and to construct the associated HPD intervals, where we use the M-H method with a normal proposal distribution.

4.4 Interval Estimation Based on Random Censoring Samples

In this section, we propose different confidence intervals. One is based on the asymptotic distribution of β1 and β2, two different bootstrap confidence intervals and finally, HPD intervals.

Asymptotic Confidence Intervals

The asymptotic variance-covariance matrix of the MLEs of β1 and β2 can be obtained by inverting the observed information matrix, and is given as follows:

[ −E(∂²l/∂β1²)  −E(∂²l/∂β1∂β2) ; −E(∂²l/∂β2∂β1)  −E(∂²l/∂β2²) ]^(−1) |_(θ=θ̂) = [ V11  V12 ; V21  V22 ]

Where θ^=(β^1,β^2) and θ=(β1,β2). The elements of the observed information matrix for β1 and β2 are given as follows:

∂²l/∂β1² |_(θ=θ̂) = −r/β̂1²

Then the observed Fisher information is

I(β̂1) = −∂²l/∂β1² |_(θ=θ̂) = r/β̂1²

The asymptotic variance of β^1 is

V(β̂1) = 1/I(β̂1) = β̂1²/r

and

∂²l/∂β2² |_(θ=θ̂) = −(n − r)/β̂2²
I(β̂2) = −∂²l/∂β2² |_(θ=θ̂) = (n − r)/β̂2²

The asymptotic variance of β^2 is

V(β̂2) = 1/I(β̂2) = β̂2²/(n − r)

The sampling distribution of (β̂_i − β_i)/√v(β̂_i), i = 1, 2, can be approximated by a standard normal distribution. The large-sample (1 − α)100% confidence intervals for β1 and β2 are given by

(β̂_iL, β̂_iU) = β̂_i ± Z_(α/2) √v(β̂_i), where i = 1, 2.

Bootstrap Confidence Intervals

As in Section 3.3, two types of parametric bootstrap methods are considered

(i) The percentile bootstrap method (Boot-p)

(ii) The bootstrap-t method (Boot-t)

Percentile Bootstrap (Boot-P) Confidence Interval

It is given by the following steps:

i. A randomly censored sample is generated from the original data T=(t1,t2tn) and the MLE θ^=(β^1, β^2) of the parameter θ=(β1,β2) is computed.

ii. Again, an independent randomly censored bootstrap sample T*=(t1*,t2*tn*) is generated by using θ^.

iii. Now, compute the bootstrap MLE θ^* of parameter θ based on T*, as in step-1.

iv. Repeat steps ii–iii B times to obtain B bootstrap MLEs θ̂*_i based on B different bootstrap samples, i = 1, 2, …, B.

v. Arrange all θ^*’s in an ascending order to obtain the bootstrap sample i.e. θ^*(1)θ^*(2)θ^*(B). An approximate 100(1-ω)% boot-p confidence interval for θ is obtained by (θ^*[(ω2)×B],θ^*[(1-ω2)×B]).

Where, ω2 is the quantity that helps to determine the bootstrap point.

Bootstrap-t (Boot-t) Confidence Intervals

The bootstrap-t confidence interval is given by the following steps:

i. Steps 1 and 2 of boot-p and boot-t methods are the same.

ii. Compute the bootstrap-t statistic T*_b = (θ̂*_b − θ̂)/√v(θ̂*_b), for b = 1, 2, …, B.

iii. To obtain a set of bootstrap statistics T*i;i=1,2,,B repeat steps 2–3, B times.

iv. Let T*(1)T*(2)T*(B) be the ordered values of T*i;i=1,2,,B.

v. Now, the approximate 100(1 − ω)% boot-t confidence interval for the parameter θ is obtained by

(θ̂ − T*_[(1−ω/2)×B] √v(θ̂), θ̂ − T*_[(ω/2)×B] √v(θ̂))

Highest Posterior Density (HPD) Intervals

As in Section 3.3, the HPD intervals for the unknown parameters can be constructed as follows. For a given parameter, let θ_(1) ≤ θ_(2) ≤ … ≤ θ_(N) be the ordered MCMC sample. For a credible level (1 − φ), 0 < φ < 1, consider the intervals D_j(N) = (θ_(j), θ_(j+[(1−φ)N])) for j = 1, 2, …, N − [(1−φ)N]. The HPD interval is the D_{j*}(N) with the smallest width among all the D_j(N)'s, where j* is chosen so that

θ_(j*+[(1−φ)N]) − θ_(j*) = min_{1 ≤ j ≤ N−[(1−φ)N]} (θ_(j+[(1−φ)N]) − θ_(j))

Applying this with θ^(i) = β1^(i) and with θ^(i) = β2^(i) gives the HPD intervals for the two unknown parameters.

4.5 Simulation Study

A simulation study was carried out to assess the accuracy of the point and interval estimates in several cases, estimating the two parameters of the IVT distribution (β1 and β2) with m = 1000 replications, for different sample sizes n = 25, 50, 80, 100 and different parameter values. All the computations are performed using the statistical software R.

The simulations results for MLEs are summarized in Tables 8, 9, 10, and 11 and obtained by the following steps:

i. Specify initial values for parameters (β1 and β2) as (0.5, 0.3), (0.8, 0.9), (1.2, 1) and (1.9, 1.5).

ii. Specify the sample size n. as n=25,50,80,100.

iii. Generate n standard uniform variates i.e. UUniform(0,1).

iv. Generate samples of size n from the IVT(β1) distribution (lifetimes) and the IVT(β2) distribution (censoring times) by using the formulas

t = U1^(−1/β1)(1 + √(1 − U1^(1/β1))) and c = U2^(−1/β2)(1 + √(1 − U2^(1/β2))),

respectively, where U1 and U2 are independent standard uniform variates.

v. Calculate the observed times x_i = min(t_i, c_i) and the censoring indicators δ_i, which equal 1 if t_i ≤ c_i and 0 otherwise.

vi. Obtain the maximum likelihood estimates (MLEs).

vii. Obtain the mean, bias, mean squared error (MSE), asymptotic and bootstrap confidence intervals (CI’s) for the unknown parameters, average interval lengths (AILs), and coverage probability (CP) for the different sample size.

viii. Repeat steps iii–vii 1000 times.
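Steps iii–v of the data-generation scheme can be sketched as follows (a Python illustration with independent uniforms for the lifetimes and the censoring times; the names are assumptions of this sketch):

```python
import math
import random

def ivt_inv_survival(u, beta):
    """x such that R(x) = u, i.e. x = u^(-1/beta) * (1 + sqrt(1 - u^(1/beta)))."""
    v = u ** (1.0 / beta)
    return (1.0 + math.sqrt(1.0 - v)) / v

def random_censored_sample(beta1, beta2, n, seed=13):
    """Draw lifetimes T ~ IVT(beta1) and censoring times C ~ IVT(beta2) from
    independent uniforms, then record x_i = min(t_i, c_i) and delta_i."""
    rng = random.Random(seed)
    xs, deltas = [], []
    for _ in range(n):
        t = ivt_inv_survival(rng.random(), beta1)
        c = ivt_inv_survival(rng.random(), beta2)
        xs.append(min(t, c))
        deltas.append(1 if t <= c else 0)
    return xs, deltas
```

By independence of T and C, the expected proportion of uncensored observations is β1/(β1 + β2), which is a useful check on the generator.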

And the simulation results for Bayesian estimates are summarized in Tables 8, 9, 10, and 11 which are obtained by the following steps:

i. Steps i, ii, iii, iv, and v of the MLE simulation are the same.

ii. Use the M-H algorithm shown in Section 3.2, under the informative prior and the non-informative prior, and run the chain for N = 10000 iterations to obtain MCMC samples.

∙ For informative prior, we compute the hyperparameters for all simulation cases as in Table 7.

∙ For the non-informative prior (P-II) we assume that the hyper-parameter values are a1 = b1 = a2 = b2 = 0.

iii. Compute the approximate Bayes estimator of g(β1, β2) under the SEL function, given by

g̃_SEL(β1, β2) = (1/(N − M)) Σ_{i=M+1}^{N} g(β1^(i), β2^(i)),

where M (= 2000) is the burn-in period (that is, the number of iterations discarded before the stationary distribution is reached).

iv. Repeat steps i–iii 1000 times to obtain the mean, bias, mean squared error (MSE), HPD intervals for the unknown parameters, average interval lengths (AILs), and coverage probabilities (CPs) for the different sample sizes.

Table 7 The hyper-parameter values under random censoring data

Initial Values

n     Hyper-Parameters   β01=0.5, β02=0.3   β01=0.8, β02=0.9   β01=1.2, β02=1   β01=1.9, β02=1.5
30    a1                 18.0               13.61              15.74            16.23
      a2                 10.9               15.28              13.28            12.73
      b1                 34.9               16.49              8.23             8.23
      b2                 34.7               16.34              8.18             8.18
50    a1                 30.7               22.81              22.81            27.64
      a2                 18.3               26.12              26.12            21.41
      b1                 60.2               27.87              14.15            14.15
      b2                 60.2               27.86              14.17            14.17
80    a1                 49.6               36.91              42.88            43.81
      a2                 29.5               42.19              36.21            35.25
      b1                 97.8               45.93              22.98            22.98
      b2                 97.8               45.90              22.97            22.97
100   a1                 61.3               46.57              54.36            55.37
      a2                 37.7               52.37              44.60            43.60
      b1                 121.7              57.66              28.86            28.86
      b2                 121.7              57.61              28.84            28.84

Table 8 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under random censoring data

images

Table 9 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under random censoring data

images

Table 10 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under random censoring data

images

Table 11 Average estimated values, MSEs, bias, asymptotic and bootstrap (t-p) CI intervals of MLEs and BEs of IVT distribution parameters under random censoring data

images

From the results in Tables 8–11, the following conclusions can be made:

i. Similar to the complete-sample case, based on the MSEs, larger values of n lead to better estimates.

ii. The CPs of the MLEs are better than the CPs of the Bayes estimates obtained under the informative and non-informative priors, respectively.

iii. The MSEs of the MLEs are smaller than those of the BEs under the SEL function.

iv. Under the informative prior, the AILs of the HPD intervals are shorter than those of the non-informative prior, Bootstrap (t, p), and MLE intervals.

v. The estimates obtained by both the MLEs and the BEs are almost unbiased.

4.6 Application to Real Data

In this section, the IVT distribution is fitted to a real data set to show how it can be applied in practice. The data are taken from a lung cancer study described by [12] and give the remission times (in days) of a group of 15 patients: (8, 10, 11, 25*, 42, 72, 82, 100*, 110, 118, 126, 144, 228, 314, 411). Observations marked with (*) are censored times. For this data set, the unknown parameter (β) of the IVT distribution is estimated by the maximum-likelihood method, and from this estimate (MLE) the values of the Kolmogorov-Smirnov (KS) statistic (the distance between the empirical CDF and the fitted CDF), Akaike information criterion (AIC), Bayesian information criterion (BIC), and Hannan-Quinn information criterion (HQIC) are calculated. These results are summarized in Table 12.

Table 12 The values of the goodness-of-fit tests for the lung cancer data set fitted to the IVT distribution

                                                        K-S
Distribution   β̂       −2log L   AIC     BIC     HQIC    D-statistic   p-value
IVT            0.277   299.65    301.7   302.2   301.5   0.3408        0.0751
IVT*           0.309    42.71     44.71   43.4    41.98  0.5452        0.4137

Note: (*) indicates the censoring times' distribution.

From Table 12, the null hypothesis is not rejected, so these lung cancer data may be modeled by the IVT distribution.
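The lifetime MLE for these data can be sketched in closed form. Assuming the ITL survival function S(x) = ((1 + 2x)/(1 + x)²)^β implied by [5], the randomly censored log-likelihood yields β̂ = d / Σᵢ[2 ln(1 + xᵢ) − ln(1 + 2xᵢ)] for the lifetime shape (with d the number of uncensored observations) and the analogous (n − d)-based estimate for the censoring-time shape; this is a sketch under that assumed form, not the authors' code:

```python
import math

# Lung cancer remission times (days); entries flagged 0 were censored (*)
times  = [8, 10, 11, 25, 42, 72, 82, 100, 110, 118, 126, 144, 228, 314, 411]
events = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]

# -log u(x) with u(x) = (1 + 2x) / (1 + x)**2, the assumed ITL survival kernel
neg_log_u = [2 * math.log(1 + x) - math.log(1 + 2 * x) for x in times]

d = sum(events)                       # number of observed (uncensored) lifetimes
total = sum(neg_log_u)                # summed over all observations
beta1_hat = d / total                 # shape of the lifetime distribution
beta2_hat = (len(times) - d) / total  # shape of the censoring-time distribution
```

The ratio β̂1/β̂2 = d/(n − d) = 13/2 = 6.5 is consistent with the reported 0.2437/0.0375 ≈ 6.50, and the absolute estimates land near 0.24 and 0.037.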

Moreover, the MLE and Bayesian estimation methods are applied for estimating the model's unknown parameters. For the calculation of the BEs, the hyper-parameters a1, b1, a2, and b2 are chosen such that the prior mean Mβ1 of β1 is 0.2437 with variance Vβ1 = 0.0046, giving a1 = 13 and b1 = 53.34, and the prior mean Mβ2 of β2 is 0.0375 with variance Vβ2 = 0.0007, giving a2 = 2 and b2 = 53.33. These results are listed in Table 13.
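This hyper-parameter choice follows from moment-matching a gamma prior: for a Gamma(a, b) prior in the rate parameterization (an assumption consistent with the reported numbers), the mean M = a/b and variance V = a/b² give a = M²/V and b = M/V. A sketch:

```python
def gamma_hyperparams(mean, var):
    """Moment-match a Gamma(a, b) prior (rate parameterization):
    mean = a/b, variance = a/b**2  =>  a = mean**2/var, b = mean/var."""
    return mean**2 / var, mean / var

a1, b1 = gamma_hyperparams(0.2437, 0.0046)  # prior for beta1
a2, b2 = gamma_hyperparams(0.0375, 0.0007)  # prior for beta2
```

This reproduces the reported a-values (a1 ≈ 13, a2 ≈ 2); the b-values come out near 53 rather than exactly 53.34 and 53.33, presumably because the reported prior variances are rounded.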

The empirical distributions of the lifetimes and of the censoring times for the lung cancer data are shown in Figures 7 and 8, respectively.
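For randomly censored lifetimes, the standard nonparametric estimate of the survival curve is the Kaplan-Meier product-limit estimator; the following sketch (not necessarily the estimator used to draw the figures) computes it for the lung cancer data:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimator.
    times: observed times; events: 1 = observed lifetime, 0 = censored.
    Returns a list of (event time, survival estimate) pairs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, steps = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = sum(e for (tt, e) in data if tt == t)   # deaths at time t
        c = sum(1 for (tt, _) in data if tt == t)   # all exits at time t
        if d > 0:
            surv *= 1 - d / n_at_risk
            steps.append((t, surv))
        n_at_risk -= c
        i += c
    return steps

times  = [8, 10, 11, 25, 42, 72, 82, 100, 110, 118, 126, 144, 228, 314, 411]
events = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
km = kaplan_meier(times, events)
```

For example, the first step drops the survival estimate to 14/15 at t = 8, and the censored time 25 only shrinks the risk set (11 patients remain at risk at t = 42) without producing a step.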

Table 13 The MLEs and BEs of the parameters from the lung cancer data set

                      BEs under SEL Function                                AILs (HPD Interval)
Parameter   MLEs      P-I       P-II     AILs (Asy CI)             P-I                        P-II
β̂1         0.2437    0.2355    0.1761   0.2672 (0.1341, 0.4012)   0.1284 (0.1881, 0.3165)    0.0533 (0.1494, 0.2027)
β̂2         0.0375    0.0344    0.0321   0.1083 (0.0074, 0.1158)   0.0244 (0.0232, 0.0476)    0.0108 (0.0270, 0.0378)

Note: AILs = average interval lengths.

images

Figure 7 Empirical distribution for lifetimes for lung cancer data.

Furthermore, the inverted distributions defined in Section 3.5 can also be fitted to these data; the numerical results are listed in Table 14 and Figure 9.

images

Figure 8 Empirical distribution for censoring times for lung cancer data.

Table 14 MLEs, AIC, BIC, AICc, and HQIC values, and Kolmogorov-Smirnov statistics for the lung cancer data lifetimes

Model   MLE        p-value   K-S      −2log L     AIC        BIC        AICc       HQIC
IL      32.725     0.1052    0.3013   -195.5445   181.3550   182.0630   179.6216   181.3474
IE      36.90926   0.1051    0.3014   -179.3469   181.3469   182.0550   179.6136   181.3394
IR      0.824718   0.0000    0.5911   -179.3550   211.8932   212.6012   210.1599   211.8857
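The information criteria in Table 14 follow the standard definitions; a minimal helper is sketched below and checked against the IE row, taking −2 log L = 179.3469, k = 1 parameter, and n = 15 observations (under which the standard formulas reproduce the IE row's AIC, BIC, and HQIC):

```python
import math

def info_criteria(m2ll, k, n):
    """Standard information criteria from -2 log-likelihood (m2ll),
    number of parameters k, and sample size n."""
    aic = m2ll + 2 * k                          # Akaike
    bic = m2ll + k * math.log(n)                # Bayesian (Schwarz)
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample corrected AIC
    hqic = m2ll + 2 * k * math.log(math.log(n)) # Hannan-Quinn
    return aic, bic, aicc, hqic
```

Note that the table's AICc column does not match this standard small-sample correction, so that column may use a different convention.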

images

Figure 9 Empirical distribution for different lifetimes for lung cancer data.

5 Conclusion

In this paper, we have obtained the maximum likelihood estimates and Bayes estimates for the unknown parameter of the IVT distribution based on complete and randomly censored data; the asymptotic confidence intervals, HPD intervals, and bootstrap (p, t) intervals are also obtained. Simulations are performed to assess the performance of the MLEs and BEs under complete and random censoring. One real data set has been re-analyzed based on random censoring.

References

[1] Topp, C.W. and Leone, F.C. (1955). A family of J-shaped frequency functions, Journal of the American Statistical Association, 50, 209–219.

[2] Nadarajah, S. and Kotz, S. (2003). Moments of some J-shaped distributions, Journal of Applied Statistics, 30, 311–317.

[3] Ghitany, M.E., Kotz, S. and Xie, M. (2005). On some reliability measures and their stochastic ordering for the Topp–Leone distribution, Journal of Applied Statistics, 32, 715–722.

[4] Bayoud, H. (2016). Admissible minimax estimators for the shape parameter of Topp–Leone distribution, Communications in Statistics-Theory and Methods, doi: 10.1080/03610926.2013.818700.

[5] Muhammed, H.Z. (2019). On the Inverted Topp-Leone Distribution, International Journal of Reliability and Applications, 20, 17–28.

[6] Dey, S., Singh, S., Tripathi, Y.M. and Asgharzadeh, A. (2016). Estimation and prediction for a progressively censored generalized inverted exponential distribution. Statistical Methodology, 132, 185–202.

[7] Ravenzwaaij, D.V., Cassey, P. and Brown, S.D. (2018). A simple introduction to Markov Chain Monte-Carlo sampling. Psychonomic Bulletin Review, 25, 143–154.

[8] Dey, S. and Pradhan, B. (2014). Generalized inverted exponential distribution under hybrid censoring. Statistical Methodology, 18, 101–114.

[9] Efron, B., and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. New York: Chapman and Hall.

[10] Hall, P. (1988). Theoretical comparison of bootstrap confidence intervals. The Annals of Statistics, 16, 927–953.

[11] Lawless, J. F. (2011). Statistical Models and Methods for Lifetime Data, Second Edition. John Wiley & Sons, Inc., Canada.

[12] Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. New York: Wiley.

Biographies

images

Hiba Zeyada Muhammed received a bachelor's degree in Statistics from the Faculty of Science at Cairo University in 2006, a master's degree in Statistics from Cairo University in 2009, and a Ph.D. in Statistics from Cairo University in 2013. She is currently working as an Associate Professor at the Department of Mathematical Statistics, Faculty of Graduate Studies for Statistical Research, Cairo University. Her research areas include reliability, life testing, bivariate and multivariate analysis, copula modeling, and ranked set sampling. She has been serving as a reviewer for many highly respected journals.

images

Essam Abd Elsalam Muhammed received a bachelor's degree in Applied Statistics from the Faculty of Commerce at Kafr El-Sheikh University in 2015 and a master's degree in Statistics from Cairo University in 2020, and is currently in the preparatory year of the doctorate in Statistics at Cairo University. He is working as a teaching assistant at the High Institute of Computer and Information Technology, Elshorouk Academy. His research areas include reliability and life testing.
