The Poisson Nadarajah-Haghighi Distribution: Different Methods of Estimation

Sajid Ali1,*, Sanku Dey2, M. H. Tahir3, and Muhammad Mansoor4

1Department of Statistics, Quaid-i-Azam University, Islamabad 45320, Pakistan

2Department of Statistics, St. Anthony’s College, Shillong 793001, India

3Department of Statistics, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan

4Department of Statistics, Government Sadiq Egerton College, Bahawalpur, Pakistan

E-mail: sajidali.qau@hotmail.com

*Corresponding Author

Received 15 March 2021; Accepted 05 July 2021; Publication 23 August 2021

Abstract

Estimation of parameters of Poisson Nadarajah-Haghighi (PNH) distribution from the frequentist and Bayesian point of view is discussed in this article. To this end, we briefly described ten different frequentist approaches, namely, the maximum likelihood estimators, percentile based estimators, least squares estimators, weighted least squares estimators, maximum product of spacings estimators, minimum spacing absolute distance estimators, minimum spacing absolute-log distance estimators, Cramér-von Mises estimators, Anderson-Darling estimators and right-tail Anderson-Darling estimators. To assess the performance of different estimators, Monte Carlo simulations are done for small and large samples. The performance of the estimators is compared in terms of their bias, root mean squares error, average absolute difference between the true and estimated distribution functions, and the maximum absolute difference between the true and estimated distribution functions of the estimates using simulated data. For the Bayesian inference of the unknown parameters, we use Metropolis–Hastings (MH) algorithm to calculate the Bayes estimates and the corresponding credible intervals. Results from the simulation study suggests that among the considered classical methods of estimation, weighted least squares and the maximum product spacing estimators uniformly produces the least biases of the estimates with least root mean square errors. However, Bayes estimates perform better than all other estimates. Finally, we discuss a practical data set to show the application of the distribution.

Keywords: Exponential distribution, hazard rate, lifetime data, maximum likelihood method, Bayesian estimation, Nadarajah-Haghighi distribution, Poisson distribution.

1 Introduction

Although there are many continuous and discrete distributions in statistics literature, the exponential distribution enjoys a special place due to its memory-less and constant hazard rate properties. Thus, it is used as a benchmark model in the reliability analysis. To overcome constant hazard rate, many extensions of the exponential distribution have been introduced in the literature, for example, exponentiated-exponential (EE) (Gupta and Kundu, 1999) and beta-exponential (BE) (Nadarajaha and Kotz, 2006), among many others. Nadarajah and Haghighi (2011) introduced a new extension of the exponential and to define it, let Z have the Nadarajah-Haghighi (NH for short) distribution, say ZNH(α,λ). The cumulative distribution function (cdf) of NH distribution is given by

G(x)=1-e1-(1+λx)α, (1)

where λ>0 is the scale parameter and α>0 is the shape parameter. The NH distribution reduces to exponential distribution assuming α=1. The probability density function (pdf) corresponding to (1) is given by

g(x)=αλ(1+λx)α-1e1-(1+λx)α,x>0. (2)

Nadarajah and Haghighi (2011) pointed out that the density function (2) always has zero mode. Additionally, the hazard rate function (hrf) of the NH distribution can be increasing, decreasing, and constant. It is noted by Nadarajah and Haghighi (2011) that the NH density function can be monotonically decreasing and yet increasing hrf. Also, if Y is a Weibull random variable with the shape parameter α and scale parameter λ, then the density (2) has the same as that of the random variable Z=Y-λ-1 truncated at zero, i.e., the NH distribution can be interpreted as the truncated Weibull distribution.

Recently, Mansoor et al. (2020a) proposed the Poisson Nadarajah-Haghighi (PNH) model to model reliability systems. To this end, consider a company formed by N systems functioning independently at a given time, where N is a zero-truncated Poisson (ZTP) random variable (rv) with the probability mass function (pmf)

(N=n)=θnn!(eθ-1),n=1,2,.

Next, suppose that each system consists of β parallel units. The system will fail if all units fail and assume that the failure times of the units for the ith system, say Zi,1, …, Zi,β are independent and identically NH random variables with scale parameter λ and shape parameter α. Let Yi denote the failure time of the ith system and X represents the time to failure of the first of the N functioning systems. Then, one can write X=min(Y1,,YN) and the conditional cdf of X given N is

F(XN) =1-[(Y1>t)]N=1-[1-(Z1,1t,,Z1,βX)]N
=1-[1-{1-e1-(1+λx)α}β]N.

Hence, the unconditional cdf of X is given by

F(x)=1-exp{-θ[1-e1-(1+λx)α]β}1-e-θ.

For simplicity, let θ=1. Then,

F(x)=1-exp(-{1-e1-(1+λx)α}β)1-e-1. (3)

and the pdf corresponding to (3) is given by

f(x) =βαλ(1+λx)α-1e1-(1+λx)α{1-e1-(1+λx)α}β-11-e-1 (4)
×exp(-{1-e1-(1+λx)α}β). (5)

Hereafter, a random variable X with cdf (3) is called the Poisson Nadarajah-Haghighi (PNH) distribution and denoted by XPNH(β,α,λ). Clearly, if α=1, the PNH distribution reduces to the Poisson-exponential (PE) distribution. This distribution is introduced by Mansoor et al. (2020a), however, many properties, especially a comparison of different estimation methods has not been considered in the literature. The survival function (sf) and hrf of X are given by

S(x)=exp(-{1-e1-(1+λx)α}β)-e-11-e-1 (6)

and

h(x)=βαλ(1+λx)α-1e1-(1+λx)α{1-e1-(1+λx)α}β-11-exp(-[1-{1-e1-(1+λx)α}β]) (7)

respectively. Figures 1(a) and 1(b) display some plots of the density and hrf of X for different values of α, β and λ. Figure 1(a) reveals that the PNH density has decreasing and unimodal (right-skewed) shape, whereas Figure 1(b) indicates that the PNH hazard rate is decreasing, increasing, bathtub (BT) and up-side bathtub (UBT).

images

Figure 1 Plots of the PNH (a) density and (b) hazard rate for some selected parameter values.

Parameter estimation plays a vital role in statistics and the maximum likelihood estimation is generally a starting point to estimate parameters. The popularity of this method is due to its simple and intuitive formulation. For example, estimators obtained by this method are asymptotically consistent and normally distributed (Lehmann and Casella, 2003). However, there are other estimation methods in the literature, which are commonly used. For example, Kundu and Raqab (2005) for generalized Rayleigh distributions, Teimouri et al. (2013) for Weibull distribution, Ali et al. (2020b) for two-parameter logistic-exponential distribution, Dey et al. (2014, 2015, 2016, 2017b, 2017a, 2017c, 2017d) for the two-parameter Rayleigh, weighted exponential, two-parameter Maxwell, exponentiated-Chen, Dagum, transmuted-Rayleigh, two parameter exponentiated-Gumbel, new extension of generalized exponential and NH distributions. Recent literature in this direction may be seen in Alizadeh et al. (2020), Eliwa et al. (2020), Tahir et al. (2018), Ali et al. (2020c), Mansoor et al. (2020b), Ali et al. (2020a), Shafqat et al. (2020) and references cited therein. These methods are the method of moment estimation, method of L-moment estimation, method of probability weighted moment estimation, method of least-squares estimation, method of weighted least-square estimation, method of maximum product spacing estimation and method of minimum distance estimation and so on.

The aim of this study is to provide a comprehensive comparison of different frequentist methods of estimation for the PNH distribution. To this end, we assume different sample sizes and different combination of parameter values. We focus on the maximum likelihood estimators, percentile based estimators, maximum product of spacings estimators, least-squares estimators, weighted least-squares estimators, Cramér-von-Mises estimators, Anderson-Darling estimators and right-tail Anderson-Darling estimators. As it is difficult to compare the performances of different methods theoretically, extensive simulations are performed to compare the performances of the different estimators based on their relative bias, root mean squares error, the average absolute difference between the true and estimated distribution functions, and the maximum absolute difference between the true and estimated distribution functions of the estimates. The originality of this study comes from the fact that there has been no previous work comparing all of these estimation methods for the PNH distribution. Further, we also consider the Bayesian estimation of the unknown parameters under the assumptions of independent gamma priors on the scale and shape parameters, respectively. We present a Metropolis-Hastings (MH) algorithm to compute the Bayes estimates and the associated credible intervals. A real life data set is also analyzed for illustrative purposes. Thus, the study will be a guideline for choosing the best estimation method for the PNH distribution, which we think would be interesting for applied statisticians.

The rest of the paper is organized as follows. Section 2 presents the quantile function, moments and shapes of the pdf and hrf for the proposed model. Section 3 describes ten different frequentist methods of estimation. In Section 4, a simulation study is carried out to compare the performance of different methods of estimation for the proposed model. In Section 5, Bayesian analysis is conducted using the Metropolis-Hastings (MH) algorithm. In Section 6, the usefulness of the PNH distribution is illustrated using a real dataset. Finally, Section 7 offers some concluding remarks.

2 Basic Statistical Properties of PNH Distribution

This section discusses some basic statistical properties of the PNH distribution.

2.1 Quantile function

To generate random variables from the PNH distribution, we invert Equation (3) as X=F-1(u), where uUniform(0,1). The explicit form of the PNH quantile is

X =F-1(u)
=1λ[{1-ln(1-(-ln{1-u(1-exp(-1))})1/β)}1/α-1]. (8)

Further, the quantile function can be used to investigate the skewness and kurtosis measures. For example, the Bowley skewness (Kenney and Keeping, 1962) based on quantiles is given by

B=F-1(3/4)+F-1(1/4)-2F-1(2/4)F-1(3/4)-F-1(1/4).

Similarly, the Moors’ kurtosis (Moors, 1988) is

M=F-1(3/8)-F-1(1/8)+F-1(7/8)-F-1(5/8)F-1(6/8)-F-1(2/8).

2.2 Moments

Many properties of a distribution can be studied using moments, e.g., tendency, dispersion, skewness, and kurtosis. The nth moment expression of PNH is given by

μn =βλn(1-exp(-1))i,j=0k=0n(-1)i+n-kexp(i+1)(i+1)k/2+1
(β(i+1)-1j)(nk)Γ(k/2+1,i+1),

where Γ(a,x) denotes the incomplete gamma function defined as Γ(a,x)=xta-1exp(-t)dt.

The graphical depiction of the mean, variance, skewness, and kurtosis is given in Figure 2. It is noticed that the mean and variance decreased by increasing λ while increased by increasing β. Similarly, the skewness and kurtosis decreased by decreasing β, and λ is not significant as observed in the cases of mean and variance.

images

Figure 2 Plots of the PNH (a) Mean (b) Variance (c) Skewness, and (d) Kurtosis for some selected parameter values.

2.3 Shapes of the Density and Hazard Rate Functions of PNH Model

To study the shapes of the density and the hrf, we determine critical points of the PNH density by lnf(x)/x=0, which are the roots of the following equation.

(α-1)λ(1-lnw)1/α -αλ(1-lnw)1-1/α+(β-1)λα(lnw)1-1/αw1-w
-αβλ(1-lnw)1-1/αw(1-w)β-1=0. (9)

The critical points of the PNH hrf can be obtained from the following equation

(α-1)λ(1-lnw)1/α -αλ(1-lnw)1-1/α+(β-1)λα(lnw)1-1/αw1-w
-αβλ(1-lnw)1-1/αw(1-w)β-1exp(-1-(1-w)β)1-exp(-(1-(1-w)β))
=0, (10)

where w=exp(1-(1+λx)α). One can examine numerically the local maximum, minimum and inflexion points of Equations (9) and (2.3).

Another property to characterize the distribution is the log-concave, i.e., the density is log-concave if d2/dx2logf<0, otherwise convex. The hazard would be decreasing if the density is log-concave. For the PNH, it is observed that the density is log-concave for α>1 with a fixed λ. Moreover, for β>α the density is also observed log-concave. Similarly, hazard rate average HRA(x)=H(x)x=1x0xh(u)du can be used to characterize the distribution whether it is increasing (decreasing) hazard rate IDHR (DIHR) if ddxH(x)x0(0) for x>0. The PNH is DIHR for β>1,α,λ.

A density is said to be new-better-than-used (NBU) if Δ(x,y)=S(x)S(y)S(x+y)1, for x,y0, otherwise new-worse-than-used (NWU). From Figure 3, it is clear that the PNH is NBU for α>1.

images

Figure 3 Plot to identify NBU and NWU of the PNH for some selected parameter values.

3 Methods of Estimation

This section describes ten classical methods for estimating the parameters, α, β and λ assuming x=(x1,,xn) a random sample of size n from the distribution (4) with unknown parameters α, β and λ.

3.1 Method of Maximum Likelihood

It is well-known that the method of maximum likelihood is the most popular method in statistical inference, since it has several attractive properties (Lehmann and Casella, 2003). θ=(β,α,λ). The log-likelihood for θ=(β,α,λ) based on a given sample is given by

(θ) =nlog(βαλ)-nlog(1-e-1)+(α-1)i=1nlog(1+λxi)
+i=1n(1-(1+λxi)α)
+(β-1)i=1nlog[1-e1-(1+λxi)α]-i=1n[1-e1-(1+λxi)α]β. (11)

The maximum likelihood estimators (MLEs) of the model parameters can be obtained by maximizing the log-likelihood function (θ) with respect to θ. There are several routines available for numerical maximization of (3.1) given in the R program (optim function), SAS (PROC NLMIXED), Ox (sub-routine MaxBFGS). Alternatively, one can differentiate (3.1) and solve the resulting nonlinear likelihood equations.

The partial derivatives of (θ) with respect to the parameters are given by

(θ)β =nβ+i=1nlog[1-e1-(1+λxi)α]
-i=1n[1-e1-(1+λxi)α]βlog[1-e1-(1+λxi)α],
(θ)α =nα+i=1nlog[1-(1+λxi)]
+(β-1)i=1ne1-(1+λxi)α1-e1-(1+λxi)α[(1+λxi)αlog(1+λxi)]
-βi=1n(1+λxi)αlog(1+λxi)e1-(1+λxi)α
[1-e1-(1+λxi)α]β-1-i=1n(1+λxi)αlog(1+λxi),
(θ)λ =nλ+(α-1)i=1nxi1+λxi+α(β-1)i=1n
xi(1+λxi)α-1e1-(1+λxi)α1-e1-(1+λxi)α
-αβi=1nxi(1+λxi)α-1e1-(1+λxi)α[1-e1-(1+λxi)α]
-αi=1n(1+λxi)α-1.

The MLE θ^=(β^,α^,λ^) of θ=(β,α,λ) can be obtained by solving simultaneously the following normal equations

(θ)β=0,(θ)α=0,(θ)λ=0.

There is no closed-form expressions for β^,α^ and λ^ and therefore numerical computations using nonlinear optimization algorithm, such as the Newton-Raphson iterative method, should be used.

3.2 Method of Maximum and Minimum Spacing Distance Estimators

Cheng and Amin (1979) introduced the maximum product of spacings (MPS) method as an alternative to MLE for the estimation of parameters of continuous univariate distributions. Ranneby (1984) independently developed the same method as an approximation to the Kullback-Leibler measure of information. Let F(x(j)) denote the distribution function of the ordered random variables x(1)<x(2)<<x(n), where {x1,x2,,xn} is a random sample of size n from the cdf F().

Let define the uniform spacings of a random sample from the PNH distribution distribution as

Di(α,β,λ)=F(xi:nα,β,λ)-F(xi-1:nα,β,λ),

where F(x0:nα,β,λ)=0 and F(xn+1:nα,β,λ)=1. Clearly, i=1n+1Di(α,β,λ)=1.

Following Cheng and Amin (1983), the maximum product of spacings estimators α^MPS, β^MPS and λ^MPS of the parameters α, β and λ are obtained by maximizing the geometric mean of the spacings with respect to α, β and λ

G(α,β,λ)=[i=1n+1Di(α,β,λ)]1n+1, (12)

or, equivalently, by maximizing the function

H(α,β,λ)=1n+1i=1n+1logDi(α,β,λ). (13)

The estimators α^MPS, β^MPS and λ^MPS of the parameters α, β and λ can also be obtained by solving the nonlinear equations

αH(α,β,λ) =1n+1i=1n+11Di(α,β,λ)
[Δ1(xi:n|α,β,λ)-Δ1(xi-1:n)|α,β,λ)]=0, (14)
βH(α,β,λ) =1n+1i=1n+11Di(α,β,λ)
[Δ2(xi:n|α,β,λ)-Δ2(xi-1:n)|α,β,λ]=0, (15)
λH(α,β,λ) =1n+1i=1n+11Di(α,β,λ)
[Δ3(xi:n)|α,β,λ)-Δ3(xi-1:n)|α,β,λ)]=0, (16)

where

Δ1(xi:nα,β,λ) =11-e-1[βe-[1-e1-(1+λxi:n)α]β(1+λxi:n)α
log(1+λxi:n)e1-(1+λxi:n)α(1-e1-(1+λxi:n)α)β-1], (17)
Δ2(xi:nα,β,λ) =11-e-1[e-[1-e1-(1+λxi:n)α]β(1-e1-(1+λxi:n)α)β
log(1-e1-(1+λxi:n)α)] (18)

and

Δ3(xi:nα,β,λ) =11-e-1[αβxi:n(1+λxi:n)α-1e-[1-e1-(1+λxi:n)α]β
(1-e1-(1+λxi:n)α)β-1e1-(1+λxi:n)α)]. (19)

Cheng and Amin (1983) showed that maximizing H as a method of parameter estimation is as efficient as the MLE estimation. Further, the MPS estimators are also consistent under more general conditions than the MLE estimators.

Similarly, the minimum spacing distance estimators of α^MSADE, β^MSADE and λ^MSADE of α, β and λ are obtained by minimizing

T(α,β,λ)=i=1n+1h(Di(α,β,λ),1n+1), (20)

where h(x,y) is an appropriate distance. Some choices of h(x,y) are |x-y| and |logx-logy|, which are called absolute and absolute-log distance, respectively. These estimators are called the minimum spacing absolute distance estimator (MSADE) and the minimum spacing absolute-log distance estimator (MSALDE). This method was originally proposed by Torabi (2008). The MSADE and MSALDE of parameters α, β and λ can be obtained by minimizing

T(α,β,λ)=i=1n+1|(Di(α,β,λ)-1n+1| (21)

and

T(α,β,λ)=i=1n+1|logDi(α,β,λ)-log1n+1|, (22)

with respect to α, β and λ respectively.

The estimators α^MSADE, β^MSADE and λ^MSADE of α, β and λ can be obtained by solving the following nonlinear equations

αT(α,β,λ) =i=1n+1Di(α,β,λ)-1n+1|Di(α,β,λ)-1n+1|
[Δ1(xi:nα,β,λ)-Δ1(xi-1:nα,β,λ)]=0
βT(α,β,λ) =i=1n+1Di(α,β,λ)-1n+1|Di(α,β,λ)-1n+1|
[Δ2(xi:nα,β,λ)-Δ2(xi-1:nα,β,λ)]=0,
λT(α,β,λ) =i=1n+1Di(α,β,λ)-1n+1|Di(α,β,λ)-1n+1|
[Δ3(xi:nα,β,λ)-Δ3(xi-1:nα,β,λ)]=0,

where Di(α,β,λ)1n+1.

The estimators α^MSALDE, β^MSALDE and λ^MSALDE of α, β and λ can be obtained by solving the nonlinear equations

αT(α,β,λ) =i=1n+1logDi(α,β,λ)-log1n+1|logDi(α,β,λ)-log1n+1|1Di(α,β,λ)
[Δ1(xi:nα,β,λ)-Δ1(xi-1:nα,β,λ)]=0
βT(α,β,λ) =i=1n+1logDi(α,β,λ)-log1n+1|logDi(α,β,λ)-log1n+1|1Di(α,β,λ)
[Δ2(xi:nα,β,λ)-Δ2(xi-1:nα,β,λ)]=0,
λT(α,β,λ) =i=1n+1logDi(α,β,λ)-log1n+1|logDi(α,β,λ)-log1n+1|1Di(α,β,λ)
[Δ3(xi:nα,β,λ)-Δ3(xi-1:nα,β,λ)]=0,

where logDi(α,β,λ)log1n+1.

3.3 Methods of Ordinary and Weighted Least Squares

The least squares and weighted least squares estimators were proposed by Swain et al. (1988) to estimate the parameters of beta distribution (Swain et al., 1988). It is well known that

E[F(xi:nα,β,λ)]=in+1and
V[F(xi:nα,β,λ)]=i(n-i+1)(n+1)2(n+2). (23)

Using the same notations as subsection 3.2, the ordinary least squares estimators α^OLSE, β^OLSE and λ^OLSE of the parameters α, β and λ are obtained by minimizing the function:

S(α,β,λ)=i=1n[F(xi:nα,β,λ)-in+1]2. (24)

These estimators can also be obtained by solving the following non-linear equations:

i=1n[F(xi:nα,β,λ)-in+1]Δ1(xi:nα,β,λ) =0,
i=1n[F(xi:nα,β,λ)-in+1]Δ2(xi:nα,β,λ) =0,
i=1n[F(xi:nα,β,λ)-in+1]Δ3(xi:nα,β,λ) =0,

where Δ1(.|α,β,λ), Δ2(.|α,β,λ) and Δ3(.|α,β,λ) are given by Equations (17), (18) and (19), respectively.

The weighted least-squares estimators α^WLSE, β^WLSE and λ^WLSE of the parameters α, β and λ are obtained by minimizing the function:

W(α,β,λ)=i=1n(n+1)2(n+2)i(n-i+1)[F(xi:nα,β,λ)-in+1]2. (25)

The WLSE can be obtained by solving the following non-linear equations:

i=1n(n+1)2(n+2)i(n-i+1)[F(xi:nα,β,λ)-in+1]Δ1(xi:nα,β,λ) =0, (26)
i=1n(n+1)2(n+2)i(n-i+1)[F(xi:nα,β,λ)-in+1]Δ2(xi:nα,β,λ) =0, (27)
i=1n(n+1)2(n+2)i(n-i+1)[F(xi:nα,β,λ)-in+1]Δ3(xi:nα,β,λ) =0, (28)

3.4 Method of Percentiles

Since the PNH distribution has an explicit distribution function, the unknown parameters α, β and λ can be estimated by equating the sample percentile points with the population percentile points. This method is known as the percentile method (Kao, 1958, 1959). If pi denote the estimate of F(xi:nα,β,λ), then the percentile estimators α^PCEβ^PCE and λ^PCE of the parameters α, β and λ can be obtained by minimizing the function P(α,β,λ) with respect to α, β and λ:

P(α,β,λ)=i=1n[xi:n-F-1(pi|α,β,λ)]2,

where pi=in+1 is the unbiased estimator of F(xi:nα,β,λ) and F-1(pi|α,β,λ) is defined in (2.1).

3.5 Methods of the Minimum Distances

This section considers three estimation methods by minimization of the goodness-of-fit statistics, i.e., minimizing the distance between the theoretical and empirical cumulative distribution functions, with respect to α, β and λ.

3.5.1 Method of Cramér-von-Mises

To motivate our choice of Cramér-von Mises type minimum distance estimators, MacDonald (1971) provided empirical evidence that the bias of the estimator is smaller than the other minimum distance estimators. Thus, the Cramér-von Mises estimators α^CME, β^CME and λ^CME of the parameters α, β and λ are obtained by minimizing C(α,β,λ) with respect to α, β and λ:

C(α,β,λ)=112n+i=1n(F(xi:nα,β,λ)-2i-12n)2. (29)

The estimators can be obtained by solving the following non-linear equations:

i=1n(F(xi:nα,β,λ)-2i-12n)Δ1(xi:nα,β,λ) =0,
i=1n(F(xi:nα,β,λ)-2i-12n)Δ2(xi:nα,β,λ) =0
i=1n(F(xi:nα,β,λ)-2i-12n)Δ3(xi:nα,β,λ) =0,

where Δ1(.α,β,λ), Δ2(.α,β,λ) and Δ3(.α,β,λ) are given by (17), (18) and (19), respectively.

3.5.2 Methods of Anderson-Darling and Right-tail Anderson-Darling

Anderson and Darling (1952) introduced a test as an alternative to statistical tests for detecting sample distributions departure from the normal distribution. This method is used here to obtain the Anderson-Darling estimators, α^ADE, β^ADE and λ^ADE of the parameters α, β and λ, by minimizing the function A(α,β,λ) with respect to α, β and λ respectively.

A(α,β,λ) =-n-1ni=1n(2i-1)
{logF(xi:nα,β,λ)+logF¯(xn+1-i:nα,β,λ)}. (30)

The estimators can be obtained by solving the following non-linear equations:

i=1n(2i-1)[Δ1(xi:nα,β,λ)F(xi:nα,β,σ)-Δ1(xn+1-i:nα,β,λ)F¯(xn+1-i:nα,β,λ)] =0,
i=1n(2i-1)[Δ2(xi:nc,β,θ)F(xi:nα,β,λ)-Δ2(xn+1-i:nα,β,λ)F¯(xn+1-i:nα,β,λ)] =0,
i=1n(2i-1)[Δ3(xi:nα,β,λ)F(xi:nc,β,θ)-Δ3(xn+1-i:nα,β,λ)F¯(xn+1-i:nα,β,λ)] =0.

The right-tail Anderson-Darling estimators α^RTADE, β^RTADE and λ^RTADE of the parameters α, β and λ are obtained by minimizing R(α,β,λ) with respect to α, β and λ. The right-tail Anderson-Darling is defined as

R(α,β,λ) =n2-2i=1nF(xi:nα,β,λ)
-1ni=1n(2i-1)logF¯(xn+1-i:nα,β,λ). (31)

The estimators can also be obtained by solving the following non-linear equations.

-2i=1nΔ1(xi:nα,β,λ)F(xi:nα,β,λ)+1ni=1n(2i-1)Δ1(xn+1-i:nα,β,λ)F¯(xn+1-i:nα,β,λ) =0,
-2i=1nΔ2(xi:nα,β,λ)F(xi:nα,β,λ)+1ni=1n(2i-1)Δ2(xn+1-i:nα,β,λ)F¯(xn+1-i:nα,β,λ) =0,
-2i=1nΔ3(xi:nα,β,λ)F(xi:nα,β,λ)+1ni=1n(2i-1)Δ3(xn+1-i:nα,β,λ)F¯(xn+1-i:nα,β,λ) =0,

where Δ1(α,β,λ), Δ2(α,β,λ) and Δ3(α,β,λ) are given by Equations (17), (18) and (19), respectively.

4 Simulation Study

This section discusses Monte Carlo simulation studies to assess the performance of the frequentist estimators mentioned in the previous section. In particular, we used bias, root mean squared error, the average absolute difference between the theoretical and the empirical estimate of the distribution functions, and the maximum absolute difference between the theoretical and the empirical distribution functions as the performance assessment criteria. For comparison, we considered the following sample sizes: n=20, 40, 60, 80, 100. Ten thousand independent samples of the aforementioned sample sizes were generated from PNH distribution with parameters (α,β,λ)={(0.5,0.5,0.5),(3.5,3.5,3.5)}. It is observed that 10,000 repetitions are sufficiently large to have stable results. For all the methods considered in this study, first we have estimated the parameters using the method of maximum likelihood and used them as the initial values for the rest of the methods. Also, the same randomly generated samples are used to compute the simulation summaries of different estimation methods. The results of the simulation studies are tabulated in Tables 12.

For each estimate we calculate the bias, root mean-squares error (RMSE), the average absolute difference between the theoretical and the empirical estimate of the distribution functions (Dabs), and the maximum absolute difference between the theoretical and the empirical distribution functions (Dmax). The statistics are obtained using the following formulae:

Bias(α^)=1Ri=1R(α^i-α),Bias(λ^)=1Ri=1R(λ^i-λ) (32)
RMSE(α^)=1Ri=1R(α^i-α)2,RMSE(λ^)=1Ri=1R(λ^i-λ)2 (33)
Dabs(α^)=1(nR)i=1Rj=1n|F(xij|α,λ)-F(xij|α^,λ^)| (34)
Dmax(α^)=1Ri=1Rmaxj|F(xij|α,λ)-F(xij|α^,λ^)| (35)

Simulated bias, RMSE, Dabs, Dmax for the estimates are listed in Tables 12. The row with label Ranks shows the partial sum of the ranks and superscript indicates the rank of each of the estimators among all the estimators for that metric. For example, Table 1 shows the bias of MLE(α^) as 0.4144 for n=20. This indicates, the bias of α^ obtained using the method of maximum likelihood ranks 4th among all other estimators.

The following observations can be drawn from the Tables 12.

1. All the estimators show the property of consistency, i.e., the RMSE decreased as sample size increased.

2. The bias of α^ decreased by increasing n for all the method of estimations.

3. The bias of β^ decreased by increasing n for all the method of estimations.

4. The bias of λ^ decreased by increasing n for all the method of estimations but for small sample size, the estimate of λ is highly biased.

5. The bias of MSALDE increased by increasing n as compared to the other methods.

6. Dabs is smaller than Dmax for all the estimation techniques. Again, the statistics get smaller with the increase of sample size.

7. In terms of performance of the methods of estimation, it is observed that the WLS and MPS estimators uniformly produce the least biases of the estimates with least RMSE for most of the configurations considered in our studies.

Table 1 Simulation results for α=β=λ=0.5

n Est. MLE LSE WLS PCE MPS MSADE MSALDE CVM AD RAD
20 Bias(α^) 0.4144 0.2282 1.3146 43.30010 0.1741 1.6418 5.5609 0.3383 1.3817 1.1865
RMSE(α^) 0.8643 0.7512 3.4166 99.94610 0.5781 3.6107 8.4579 0.9014 3.8998 3.0985
Bias(β^) 0.4087 0.2235 0.1472 30.43710 0.0471 2.7628 6.5319 0.2124 0.1723 0.3206
RMSE(β^) 2.5207 0.8725 0.5482 56.54810 0.3141 5.0028 9.7369 0.5543 0.5964 0.9686
Bias(λ^) 135.20410 15.5989 6.3436 0.0101 1.5802 1.5863 4.1375 3.0094 7.3007 8.7628
RMSE(λ^) 1884.97810 132.3859 36.8526 0.0101 4.9803 4.1662 7.3975 7.0984 40.6177 43.1158
Dabs 0.0591.5 0.0625 0.0613 0.1827 0.0591.5 0.42110 0.3999 0.0625 0.3108 0.0625
Dmax 0.1022 0.1064 0.1043 0.2997 0.0981 0.81910 0.7889 0.1095.5 0.5478 0.1095.5
Ranks 44.55 414 343 568.5 11.51 568.5 6410 32.52 527 48.56
40 Bias(α^) 0.2173 0.1742 0.6127 15.42910 0.0881 2.1088 5.9889 0.2214 0.4216 0.4105
RMSE(α^) 0.4942 0.5233 2.0657 43.91810 0.3601 4.4758 9.0539 0.5844 1.7606 1.5745
Bias(β^) 0.0885 0.0794 0.0512 18.92510 0.0081 3.0618 6.7589 0.0936 0.0583 0.1167
RMSE(β^) 0.5187 0.3425 0.2473 51.89110 0.1611 5.6618 10.0569 0.2734 0.2372 0.4116
Bias(λ^) 9.38910 4.5799 1.6625 0.0101 0.5362 1.8847 4.0988 1.3223 1.4784 1.8556
RMSE(λ^) 196.24610 47.0519 17.0318 0.0101 2.4042 4.3774 7.2855 4.1943 13.0457 10.2456
Dabs 0.0421.5 0.0445 0.0433 0.1297 0.0421.5 0.43710 0.4049 0.0445 0.2468 0.0445
Dmax 0.0732 0.0774 0.0753 0.2137 0.0711 0.85110 0.7949 0.0796 0.4088 0.0785
Ranks 40.54 415 383 568 10.51 639 6710 352 446 457
60 Bias(α^) 0.1483 0.1462 0.3227 14.90410 0.0551 2.4978 6.0459 0.1744 0.1915 0.1996
RMSE(α^) 0.3662 0.4383 1.2607 54.29910 0.2751 5.1518 9.1869 0.4734 0.9116 0.8975
Bias(β^) 0.0384 0.0425 0.0262 18.41910 -0.0011 3.2488 6.8329 0.0556 0.0323 0.0617
RMSE(β^) 0.2337 0.2165 0.1563 60.18010 0.1131 6.0118 10.1209 0.1874 0.1432 0.2296
Bias(λ^) 2.4119 1.5597 0.5774 0.0101 0.2722 2.0848 4.15210 0.7156 0.4913 0.7075
RMSE(λ^) 49.61110 20.5859 7.1947 0.0101 1.3862 4.6066 7.4018 2.7363 4.2084 4.2385
Dabs 0.0341 0.0365 0.0352.5 0.1177 0.0352.5 0.44310 0.4069 0.0365 0.2128 0.0365
Dmax 0.0592 0.0644.5 0.0613 0.1927 0.0581 0.86110 0.7979 0.0656 0.3408 0.0644.5
Ranks 383.5 40.56 35.52 568 11.51 669 7210 383.5 395 43.57
80 Bias(α^) 0.1092 0.1245 0.1987 9.32710 0.0361 2.6638 6.0929 0.1446 0.1103 0.1214
RMSE(α^) 0.2922 0.3853 0.8937 46.99910 0.2251 5.4458 9.2419 0.4094 0.5675 0.5916
Bias(β^) 0.0213 0.0275 0.0172 15.82710 -0.0031 3.2928 6.9819 0.0386 0.0224 0.0417
RMSE(β^) 0.1434.5 0.1506 0.1123 61.08710 0.0901 6.0798 10.3219 0.1434.5 0.1082 0.1577
Bias(λ^) 0.6807 0.6968 0.2633.5 0.0101 0.1712 2.2919 4.28310 0.4446 0.2633.5 0.3915
RMSE(λ^) 18.60510 10.2329 1.7303 0.0101 0.8562 4.8757 7.5668 1.8634 1.9705 2.0546
Dabs 0.0302 0.0325.5 0.0302 0.1017 0.0302 0.44910 0.4059 0.0325.5 0.1918 0.0314
Dmax 0.0511.5 0.0554.5 0.0533 0.1637 0.0511.5 0.86910 0.7919 0.0566 0.2988 0.0554.5
Ranks 323 467 30.52 568 11.51 689 7210 425 38.54 43.56
100 Bias(α^) 0.0854 0.1035 0.1327 4.5089 0.0251 2.7728 5.79010 0.1176 0.0702 0.0773
RMSE(α^) 0.2432 0.3343 0.6707 23.17910 0.1901 5.6318 8.9939 0.3505 0.3414 0.3836
Bias(β^) 0.0143 0.0185 0.0122 13.32610 -0.0031 3.3268 7.0219 0.0286 0.0164 0.0307
RMSE(β^) 0.1034 0.1105 0.0892.5 60.28510 0.0781 6.1258 10.3339 0.1156 0.0892.5 0.1217
Bias(λ^) 0.2626 0.2877 0.1613 0.0101 0.1242 2.3969 4.09510 0.2898 0.1704 0.2475
RMSE(λ^) 8.25510 2.6177 0.6563 0.0101 0.5992 5.0218 7.3379 1.1996 0.6744 1.0115
Dabs 0.0272 0.0285 0.0272 0.0887 0.0272 0.45210 0.4069 0.0285 0.1738 0.0285
Dmax 0.0462 0.0494.5 0.0473 0.1417 0.0451 0.87410 0.7889 0.0506 0.2668 0.0494.5
Ranks 333 41.55 29.52 558 111 699 7410 487 36.54 42.56

Table 2 Simulation results for α=β=λ=3.5

n Est. MLE LSE WLS PCE MPS MSADE MSALDE CVM AD RAD
20 Bias(α^) 21.50010 3.7547 1.6204 3.4156 5.7919 1.0552 1.0531 4.0668 2.1315 1.5053
RMSE(α^) 36.50010 7.9528 4.5333 4.8805 10.5099 4.8144 4.1461 7.8987 5.1666 4.3602
Bias(β^) 3.50010 2.2618 1.0482 -3.4989 0.4511 2.0176 1.9135 1.5874 1.4063 2.1657
RMSE(β^) 18.50010 5.7338 3.0452 4.1315 2.1351 5.6807 4.2106 3.3533 3.4674 6.9839
Bias(λ^) 14.5009 16.19310 6.2806 0.0101 4.2874 0.5172 0.7593 4.6205 6.8397 7.7288
RMSE(λ^) 116.50010 39.9339 15.7536 0.0101 9.3114 4.0073 3.7592 9.5485 19.0587 20.6878
Dabs 0.0491 0.0626 0.0593 0.51210 0.0582 0.4679 0.4558 0.0614.5 0.4107 0.0614.5
Dmax 0.857 0.1013 0.0972 0.96110 0.0931 0.9259 0.9088 0.1035 0.7506 0.1024
Ranks 6710 599 281 478 312 425 343 41.54 456 45.57
40 Bias(α^) 18.10510 3.0976 1.4134 4.9129 4.7358 1.1413 0.6031 3.3337 1.6145 1.1202
RMSE(α^) 32.64510 6.7867 4.0273 5.8866 9.4559 4.7785 3.6822 6.8498 4.2604 3.6361
Bias(β^) 2.5319 1.3806 0.6882 -3.37010 0.3271 1.3917 1.8848 0.9984 0.7813 1.1335
RMSE(β^) 16.12210 3.3516 1.8212 3.87410 1.5521 5.1937 4.2818 2.2124 1.9533 2.4275
Bias(λ^) 13.57810 9.3259 4.0987 0.0101 3.3174 0.8313 0.3862 4.0206 3.9705 4.4928
RMSE(λ^) 108.52910 23.1079 10.3827 0.0101 7.9004 4.2963 3.4382 8.8125 10.1376 10.6248
Dabs 0.0411.5 0.0446 0.0423 0.51210 0.0411.5 0.4759 0.4658 0.0434.5 0.3967 0.0434.5
Dmax 0.0703 0.0735 0.0692 0.98310 0.0671 0.9569 0.9298 0.0735 0.7087 0.0735
Ranks 63.510 548.5 302 548.5 29.51 487 394 43.56 405 38.53
60 Bias(α^) 14.24710 2.7126 1.2853 5.1359 4.1618 1.5345 0.7041 2.8967 1.4004 0.9742
RMSE(α^) 27.22210 6.0816 3.6983 6.1968 8.7859 4.9615 3.6272 6.1657 3.8454 3.2771
Bias(β^) 0.9566 1.0097 0.5232 -3.38010 0.2461 1.0468 1.7109 0.7794 0.5633 0.8095
RMSE(β^) 5.85010 2.5926 1.4642 3.8647 1.2791 4.9549 4.1148 1.8105 1.5063 1.8074
Bias(λ^) 4.5859 6.57410 3.0506 0.0101 2.6294 1.1853 0.4822 3.5228 2.9235 3.3107
RMSE(λ^) 34.63710 16.4819 7.9756 0.0101 6.8764 4.4263 3.4062 8.1668 7.8425 8.1557
Dabs 0.0342 0.0354.5 0.0342 0.50510 0.0342 0.4779 0.4688 0.0366 0.3847 0.0354.5
Dmax 0.0562 0.0605 0.0573 0.98610 0.0551 0.9659 0.9328 0.0605 0.6747 0.0605
Ranks 5910 53.58 271 569 302 517 405 506 384 35.53
80 Bias(α^) 11.55710 2.4276 1.1583 5.9769 3.6988 1.7625 0.8081 2.5737 1.2214 0.8792
RMSE(α^) 23.19610 5.6056 3.4522 7.1988 8.2189 5.1115 3.6374 5.6847 3.5113 3.0841
Bias(β^) 0.5384 0.8147 0.4342 -3.44910 0.1991 0.8678 1.6489 0.6736 0.4613 0.6575
RMSE(β^) 2.7687 2.1556 1.2522 3.9178 1.1291 4.93410 4.0859 1.6135 1.2953 1.5274
Bias(λ^) 2.3755 5.24510 2.5027 0.0101 2.1624 1.4793 0.6322 3.2509 2.4036 2.7258
RMSE(λ^) 14.97310 13.3199 6.6616 0.0101 6.0744 4.6483 3.4982 7.7578 6.5825 6.7827
Dabs 0.0291.5 0.0315 0.0303 0.50310 0.0291.5 0.4799 0.4708 0.0315 0.3737 0.0315
Dmax 0.0492.5 0.0525 0.0492.5 0.99110 0.0481 0.9709 0.9328 0.0525 0.6467 0.0525
Ranks 506 549 27.51 5710 29.52 527.5 435 527.5 384 373
100 Bias(α^) 9.76510 2.2136 1.0743 7.3959 3.3588 1.9235 0.9092 2.3327 1.1114 0.8181
RMSE(α^) 20.45710 5.2055 3.2222 8.3379 7.7798 5.2967 3.6344 5.2686 3.2553 2.9031
Bias(β^) 0.3612 0.6667 0.3633 -3.37410 0.1661 0.7728 1.6889 0.5856 0.3774 0.5485
RMSE(β^) 1.5056 1.8427 1.1122 3.7418 1.0181 4.80510 4.0909 1.4715 1.1173 1.3244
Bias(λ^) 1.5093 4.23410 2.0837 0.0101 1.8395 1.6394 0.7642 2.9159 1.9886 2.3078
RMSE(λ^) 7.5799 11.12710 5.8236 0.0101 5.4884 4.7933 3.5622 7.2858 5.7405 5.9317
Dabs 0.0262 0.0285.5 0.0262 0.50210 0.0262 0.4819 0.4728 0.0285.5 0.3627 0.0274
Dmax 0.0431.5 0.0475.5 0.0443 0.99210 0.0431.5 0.9739 0.9348 0.0475.5 0.6207 0.0464
Ranks 43.55 569 281 5810 30.52 558 446 527 394 343

5 Bayesian Estimation

This section presents the Bayesian inference of the unknown parameters of the PHN distribution. It is needless to mention that, if all the parameters of the model are unknown, a joint conjugate prior for the parameters does not exist. For this, we assume piecewise independent priors and the proposed priors for the parameters α, β and λ may be taken as αGamma(a,b),βGamma(c,d) and λGamma(e,f). The joint prior distribution of α,β and λ can be written as p(α,β,λ)αa-1βc-1λe-1exp(-bα-dβ-fλ). We assume θ=(α,β,λ),x=(x1,x2,,xn), P(θ) denote the joint posterior and L(θ;x) is the likelihood function. For the PNH(x|α,β,λ), the likelihood function can be written as

L(θ;x) (αβλ)ni=1n(1+λxi)α-1exp(-i=1n(1+λxi)α)
  i=1n{1-exp(1-(1+λxi)α)}β-1
×exp(-i=1n{1-exp(1-(1+λxi)α)}β) (36)

Therefore, we write the joint posterior as

P(α,β,λ|x)  αn+a-1βn+c-1λn+e-1
exp(-α{b-i=1nln(1+λxi)})
exp(-i=1nln(1+λxi))
exp(-β{d-i=1nln[1-exp(1-(1+λxi)α)]})
exp(-i=1nln{1-exp(1-(1+λxi)α)})exp(-λf)
exp(-i=1n{1-exp(1-(1+λxi)α)}β) (37)
P(α,β,λ|x) Pα(n+a,b-i=1nln(1+λxi)|x,λ)P(λ|xα,β)
Pβ(n+c,d-i=1nln[1-exp(1-(1+λxi)α)]|x,α,λ) (38)

where Pα and Pβ are the gamma densities, and P(λ|xα,β)=λn+e-1exp(-i=1nln{1-exp(1-(1+λxi)α)})exp(-λf)exp(-i=1n {1-exp(1-(1+λxi)α)}β)exp(-i=1nln(1+λxi)). It is not difficult to show that P(λ|x,α,β) is log-concave for β>2 and α<1 and thus, the idea of Devroye (1986) can be used. Here, we will implement the Metropolis Hastings (MH) (Metropolis et al., 1953) algorithm to compute the estimators. The MH algorithm is a powerful Markov Chain Monte Carlo algorithm. To this end, we assume gamma density as transition kernel q(λ(i)|λ(*)) for sampling value of λ. The choice of gamma distribution has been considered purely for illustration purpose, and other suitable distributions can be used. After generating the marginal densities, the next step is to calculate the posterior summaries, 𝔼(θ|x)=θθ(θ|x). The steps to calculate the Bayes estimates are as follow:

Step 1: Take some initial guess values of α, β and λ, say α0, β0 and λ0, respectively;

(a) To generate λ, evaluate the acceptance probability by k(λ(i),λ(*))=min(1,P(λ(*)|x)q(λ(i)|λ(*))P(λ(i)|x)q(λ(*)|λ(i))), where P(λ|x) has been defined above.

(b) Generate a random numbers u from Uniform(0,1)

(c) If k(λ(i),λ(*))u, λ(i+1)=λ(*), otherwise λ(i+1)=λ(i).

Step 2: Suppose at the ith step, α, β and λ take the values αi,βi and λi. Now we can generate (λi+1|αi,βi,x), (αi+1|λi,x) and (βi+1|αi,λi,x);

Step 3: Repeat the above step N times;

Step 4: Calculate the Bayes estimator of h(α,λ) by 1N-Mi=M+1Nh(αi,λi), where M denote the number of burn-in sample.

For the Bayesian analysis, we generated 12,000 samples for α,β and λ, and the Bayes estimates with other posterior summaries, like MCMC error, median, 95% Bayesian intervals have been tabulated in Table 3 for the above mentioned parameter combinations and sample sizes. To compute the posterior summaries, we selected the hyperparameters in such a way that mean of the priors equal to the nominal parameter values with large variances. Moreover, we have used M=2,000 as the burn-in period for our calculations. From Table 3, it is noticed that as the sample size increases, the Bayes estimates approaches to the nominal values and the Bayesian intervals become tighter for large sample sizes. Furthermore, the MCMC error decreases with the increase of sample sizes.

Table 3 Monte Carlo Markov Chain results for Bayesian analysis

Parameter n Estimate SD MC error 95% CI Median
α=0.5 20 0.5044 0.5101 0.0060 (0.0128,1.88) 0.3504
40 0.5051 0.5085 0.0036 (0.0127,1.855) 0.3464
60 0.4999 0.5005 0.0014 (0.0128,1.833) 0.3468
80 0.501 0.5006 0.0012 (0.0129,1.824) 0.3464
100 0.5001 0.5002 0.0004 (0.0127,1.804) 0.3468
β=0.5 20 0.4974 0.5037 0.0048 (0.0123,1.919) 0.3402
40 0.4997 0.5035 0.0033 (0.0119,1.893) 0.3447
60 0.5008 0.5019 0.0014 (0.0128,1.856) 0.3467
80 0.5011 0.5014 0.0010 (0.0132,1.849) 0.3475
100 0.5001 0.5004 0.0004 (0.0127,1.846) 0.3469
λ=0.5 20 0.4986 0.4993 0.0045 (0.0124,1.845) 0.3458
40 0.503 0.5119 0.0031 (0.0116,1.897) 0.3489
60 0.4996 0.4989 0.0014 (0.0129,1.836) 0.3466
80 0.4974 0.4942 0.0011 (0.0131,1.811) 0.346
100 0.4991 0.4982 0.0004 (0.0128,1.804) 0.3467
α=3.5 20 3.506 1.329 0.0057 (1.426,6.574) 3.332
40 3.477 1.302 0.0054 (1.429,6.445) 3.31
60 3.502 1.33 0.0034 (1.402,6.541) 3.326
80 3.503 1.314 0.0032 (1.42,6.477) 3.316
100 3.499 1.302 0.0029 (1.405,6.541) 3.328
β=3.5 20 3.498 1.301 0.0059 (1.432,6.529) 3.347
40 3.482 1.329 0.0059 (1.396,6.477) 3.317
60 3.506 1.331 0.0036 (1.404,6.595) 3.347
80 3.502 1.328 0.0035 (1.407,6.536) 3.333
100 3.503 1.328 0.0032 (1.411,6.578) 3.342
λ=3.5 20 3.498 1.326 0.0063 (1.403,6.532) 3.323
40 3.511 1.328 0.0059 (1.404,6.582) 3.348
60 3.501 1.322 0.0038 (1.411,6.514) 3.337
80 3.502 1.324 0.0032 (1.408,6.555) 3.334
100 3.499 1.318 0.0029 (1.41,6.512) 3.337

Table 4 Bladder Cancer Data Set

0.08 2.09 3.48 4.87 6.94 8.66 13.11 23.63 0.20 2.23 3.52 4.98 6.97 9.02 13.29
0.40 2.26 3.57 5.06 7.09 9.22 13.80 25.74 0.50 2.46 3.64 5.09 7.26 9.47 14.24
25.82 0.51 2.54 3.70 5.17 7.28 9.74 14.76 26.31 0.81 2.62 3.82 5.32 7.32 10.06
14.77 32.15 2.64 3.88 5.32 7.39 10.34 14.83 34.26 0.90 2.69 4.18 5.34 7.59 10.66
15.96 36.66 1.05 2.69 4.23 5.41 7.62 10.75 16.62 43.01 1.19 2.75 4.26 5.41 7.63
17.12 46.12 1.26 2.83 4.33 5.49 7.66 11.25 17.14 79.05 1.35 2.87 5.62 7.87 11.64
17.36 1.40 3.02 4.34 5.71 7.93 11.79 18.10 1.46 4.40 5.85 8.26 11.98 19.13 1.76
3.25 4.50 6.25 8.37 12.02 2.02 3.31 4.51 6.54 8.53 12.03 20.28 2.02 3.36 6.76
12.07 21.73 2.07 3.36 6.93 8.65 12.63 22.69

6 Real Data Application

This section presents a real data set analysis using the PNH distribution and further compares it with competing models, like the exponentiated-NH (ENH) (Lemonte, 2013), exponentiated Weibull (EW) (Mudholkar and Srivastava, 1993), Marshall-Olkin Weibull (MOW) (Ghitany et al., 2005), BE, NH, EE and Weibull models. We estimate the model parameters by using the maximum likelihood method and compared the goodness-of-fit of the models using the Cramér–von Mises (W) and Anderson-Darling (A) statistics, which are described in detail by Chen and Balakrishnan (1995). In addition, we consider the Kolmogrov-Smirnov (K-S) statistic. In general, the smaller the values of these statistics, the better the fit to the data. The cdfs of the ENH, EW, MOW, BE and EE models are given by

ENH:FENH(x;β,α,λ)=(1-e1-(1+λx)α)β,x,β,α,λ>0,EW:FEW(x;c,α,λ)=(1-e-(x/λ)c)α,x,c,α,λ>0,MOW:FMOW(x;p,β,λ)=(1-e-(x/λ)β)[1-(1-p)e-(x/λ)β]-1,x,p,β,λ>0,BE:FBE(x;a,b,λ)=I1-e-λx(a,b),x,a,b,λ>0,EE:FEE(x;α,λ)=(1-e-λx)α,x,α,λ>0,

where Iz(a,b) is the incomplete beta function ratio.

The data set has been taken from Lee and Wang (2013) and reproduced in Table 4, which represents the remission times (in months) of a random sample of 128 bladder cancer patients.

The box-plot of these observations is displayed in Figure 4(a), which indicates that the distribution is right-skewed. The TTT plot (Aarset, 1987) of these data is shown in Figure 4(b) and it is clear that it is first concave and then convex, which suggests an upside-down bathtub shaped failure rate. Accordingly, the PNH distribution could, in principle, be appropriate for modeling the current data set. The MLEs (with SEs in parentheses), A, W and K-S statistics are included in Table 6. All three goodness-of-fit statistics indicate that the PNH model provides the best fit. Further, the empirical and estimated survival curves and PP plot are shown in Figures 5(a) and 5(b) and also support this conclusion.

images

Figure 4 (a) Boxplot (b) TTT plot for the bladder cancer data.

Table 5 Monte Carlo Markov Chain results for the Bayesian analysis of the data set

Parameter Estimate SD MC error 95% CI Median
α 0.7721 0.7783 0.0036 (0.0189,2.863) 0.5339
β 1.666 1.672 0.0078 (0.0420,6.182) 1.152
λ 0.2021 0.2017 0.0009 (0.0052,0.7446) 0.1409

Table 6 MLEs, their standard errors (in parentheses) and goodness-of-fit statistics for the bladder cancer data

Distribution Estimates A W K-S
PNH β=1.6362(0.2998) α=0.7240(0.1560) λ=0.2011(0.1012) 0.2422 0.0358 0.0405
ENH β=1.6884(0.3646) α=0.6371(0.1172) λ=0.3444(0.1752) 0.2779 0.0421 0.0442
EW c=0.6545(0.1346) α=2.7942(1.2626) λ=3.3483(1.8911) 0.2885 0.0436 0.0450
MOW p=13.5390(1.7811) β=0.5445(0.1491) λ=0.9449(0.8414) 0.8311 0.1417 0.0791
BE a=1.1879(0.1352) b=4.0609(13.2406) λ=0.0306(0.0982) 0.7154 0.1192 0.0738
NH α=0.9226(0.1514) λ=0.1216(0.0343) 0.6741 0.11008 0.0919
EE α=1.2179(0.1488) λ=0.1211(0.0135) 0.6033 0.1122 0.0725
W c=1.0477(0.0675) α=9.5585(0.8526) 0.7863 0.1313 0.0699

images

Figure 5 Bladder cancer data (a) empirical survival and estimated PNH survival function; (b) P-P plot.

7 Conclusion

In this article, we studied some basic statistical properties of the Poisson Nadarajah-Haghighi (PNH) distribution and estimated its parameters by eleven different methods of estimation, namely the maximum likelihood estimators, least squares and weighted least squares estimators, the maximum product of spacings estimators, the minimum spacing absolute distance estimators, the minimum spacing absolute-log distance estimators, Cramér-von-Mises estimators, Anderson-Darling and right-tail Anderson-Darling estimators and the Bayes estimators. Results of the simulation study showed that among frequentist estimators, WLS and MPS perform better than the other methods. However, the Bayesian is the best method. An application to a real data set is also presented as an illustration of the potentiality of the new model as compared to other existing models. It is expected the utility of the model in different fields, especially in survival analysis when hazard rate is decreasing, increasing, bathtub and upside-down bathtub shape. Further, it is also noticed that the performance of the MLEs is quite satisfactory. The use of the MLEs or Bayes estimators is recommended for practical purposes. In the future, record values can be analyzed assuming the PNH distribution.

Acknowledgments

The authors express their sincere thanks to the three reviewers and the editors for making some useful suggestions on an earlier version of this manuscript which resulted in this improved version.

References

Aarset (1987) Aarset, M. V. (1987). How to identify a bathtub hazard rate. IEEE Transactions on Reliability, R-36(1):106–108.

Ali et al. (2020a) Ali, S., Dey, S., Tahir, M. H., and Mansoor, M. (2020a). A comparison of different methods of estimation for the flexible Weibull distribution. Communications Faculty of Sciences University of Ankara Series A1 Mathematics and Statistics, 69:794–814.

Ali et al. (2020b) Ali, S., Dey, S., Tahir, M. H., and Mansoor, M. (2020b). Two-parameter logistic-exponential distribution: Some new properties and estimation methods. American Journal of Mathematical and Management Sciences, 39(3):270–298.

Ali et al. (2020c) Ali, S., Dey, S., Tahir, M. H., and Mansoor, M. (2020c). Two-Parameter Logistic-Exponential Distribution: Some New Properties and Estimation Methods. American Journal of Mathematical and Management Sciences, 39(3):270–298.

Alizadeh et al. (2020) Alizadeh, M., Afify, A. Z., Eliwa, M. S., and Ali, S. (2020). The odd log-logistic Lindley-G family of distributions: properties, Bayesian and non-Bayesian estimation with applications. Computational Statistics, 35:281–308.

Anderson and Darling (1952) Anderson, T. W. and Darling, D. A. (1952), Asymptotic Theory of Certain “Goodness of Fit” Criteria Based on Stochastic Processes, The Annals of Mathematical Statistics, 23(2):193–212. https://doi.org/10.1214/aoms/1177729437

Chen and Balakrishnan (1995) Chen, G. and Balakrishnan, N. (1995). A general purpose approximate goodness-of-fit test. Journal of Quality Technology, 27(2):154–161.

Cheng and Amin (1979) Cheng, R. C. H. and Amin, N. A. K. (1979). Maximum product of spacings estimation with application to the lognormal distribution, math.

Cheng and Amin (1983) Cheng, R. C. H. and Amin, N. A. K. (1983). Estimating parameters in continuous univariate distributions with a shifted origin. Journal of the Royal Statistical Society. Series B (Methodological), 45(3):394–403.

Devroye (1986) Devroye, L. (1986). Non-Uniform Random Variate Generation. Springer-Verlag.

Dey et al. (2017a) Dey, S., Al-Zahrani, B., and Basloom, S. (2017a). Dagum distribution: Properties and different methods of estimation. International Journal of Statistics and Probability, 6(2):74–92.

Dey et al. (2015) Dey, S., Ali, S., and Park, C. (2015). Weighted exponential distribution: properties and different methods of estimation. Journal of Statistical Computation and Simulation, 85(18):3641–3661.

Dey et al. (2016) Dey, S., Dey, T., Ali, S., and Mulekar, M. S. (2016). Two-parameter Maxwell distribution: Properties and different methods of estimation. Journal of Statistical Theory and Practice, 10(2):291–310.

Dey et al. (2014) Dey, S., Dey, T., and Kundu, D. (2014). Two-parameter Rayleigh distribution: Different methods of estimation. American Journal of Mathematical and Management Sciences, 33(1):55–74.

Dey et al. (2017b) Dey, S., Kumar, D., Ramos, P. L., and Louzada, F. (2017b). Exponentiated Chen distribution: Properties and estimation. Communications in Statistics – Simulation and Computation, 46(10):8118–8139.

Dey et al. (2017c) Dey, S., Raheem, E., and Mukherjee, S. (2017c). Statistical Properties and Different Methods of Estimation of Transmuted Rayleigh Distribution. Revista Colombiana de EstadÃstica, 40:165–203.

Dey et al. (2017d) Dey, S., Raheem, E., Mukherjee, S., and Ng, H. K. T. (2017d). Two parameter exponentiated Gumbel distribution: properties and estimation with flood data example. Journal of Statistics and Management Systems, 20(2):197–233.

Eliwa et al. (2020) Eliwa, M. S., El-Morshedy, M., and Ali, S. (2020). Exponentiated odd Chen-G family of distributions: statistical properties, Bayesian and non-Bayesian estimation with applications. Journal of Applied Statistics, 0(0):1–27.

Ghitany et al. (2005) Ghitany, M. E., Al-Hussaini, E. K., and Al-Jarallah, R. A. (2005). Marshall–Olkin extended Weibull distribution and its application to censored data. Journal of Applied Statistics, 32(10):1025–1034.

Gupta and Kundu (1999) Gupta, R. D. and Kundu, D. (1999). Theory & methods: Generalized exponential distributions. Australian & New Zealand Journal of Statistics, 41(2):173–188.

Kao (1958) Kao, J. H. K. (1958). Computer methods for estimating Weibull parameters in reliability studies. IRE Transactions on Reliability and Quality Control, PGRQC-13:15–22.

Kao (1959) Kao, J. H. K. (1959). A graphical estimation of mixed Weibull parameters in life-testing of electron tubes. Technometrics, 1(4):389–407.

Kenney and Keeping (1962) Kenney, J. and Keeping, E. (1962). Mathematics of statistics. Number v. 2 in Mathematics of Statistics. Princeton: Van Nostrand.

Kundu and Raqab (2005) Kundu, D. and Raqab, M. Z. (2005). Generalized Rayleigh distribution: different methods of estimations. Computational Statistics & Data Analysis, 49(1):187–200.

Lee and Wang (2013) Lee, E. T. and Wang, J. W. (2013). Statistical Methods for Survival Data Analysis. Wiley Publishing, 4th edition.

Lehmann and Casella (2003) Lehmann, E. and Casella, G. (2003). Theory of Point Estimation. Springer New York.

Lemonte (2013) Lemonte, A. J. (2013). A new exponential-type distribution with constant, decreasing, increasing, upside-down bathtub and bathtub-shaped failure rate function. Computational Statistics & Data Analysis, 62:149–170.

MacDonald (1971) MacDonald, P. D. M. (1971). Comment on “an estimation procedure for mixtures of distributions” by Choi and Bulgren. Journal of the Royal Statistical Society. Series B (Methodological), 33(2):326–329.

Mansoor et al. (2020a) Mansoor, M., Tahir, M. H., Alzaatreh, A., and Cordeiro, G. M. (2020a). The Poisson Nadarajah–Haghighi distribution: Properties and applications to lifetime data. International Journal of Reliability, Quality and Safety Engineering, 27(01):2050005.

Mansoor et al. (2020b) Mansoor, M., Tahir, M. H., Cordeiro, G. M., Ali, S., and Alzaatreh, A. (2020b). The Lindley negative-binomial distribution: Properties, estimation and applications to lifetime data. Mathematica Slovaca, 70(4):917–934.

Metropolis et al. (1953) Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092.

Moors (1988) Moors, J. J. A. (1988). A quantile alternative for kurtosis. Journal of the Royal Statistical Society. Series D (The Statistician), 37(1):25–32.

Mudholkar and Srivastava (1993) Mudholkar, G. S. and Srivastava, D. K. (1993). Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Transactions on Reliability, 42(2):299–302.

Nadarajah and Haghighi (2011) Nadarajah, S. and Haghighi, F. (2011). An extension of the exponential distribution. Statistics, 45(6):543–558.

Nadarajaha and Kotz (2006) Nadarajaha, S. and Kotz, S. (2006). The beta exponential distribution. Reliability Engineering & System Safety, 91(6):689–697.

Ranneby (1984) Ranneby, B. (1984). The maximum spacing method. an estimation method related to the maximum likelihood method. Scandinavian Journal of Statistics, 11(2):93–112.

Shafqat et al. (2020) Shafqat, M., Ali, S., Shah, I., and Dey, S. (2020). Univariate discrete Nadarajah and Haghighi distribution: Properties and different methods of estimation. Statistica, 80(3):301–330.

Swain et al. (1988) Swain, J. J., Venkatraman, S., and Wilson, J. R. (1988). Least-squares estimation of distribution functions in Johnson’s translation system. Journal of Statistical Computation and Simulation, 29(4):271–297.

Tahir et al. (2018) Tahir, M. H., Cordeiro, G. M., Ali, S., Dey, S., and Manzoor, A. (2018). The inverted Nadarajah–Haghighi distribution: estimation methods and applications. Journal of Statistical Computation and Simulation, 88(14):2775–2798.

Teimouri et al. (2013) Teimouri, M., Hoseini, S. M., and Nadarajah, S. (2013). Comparison of estimation methods for the Weibull distribution. Statistics, 47(1):93–109.

Torabi (2008) Torabi, H. (2008). A general method for estimating and hypotheses testing using spacings. Journal of Statistical Theory and Practice, 8(2):163–168.

Biographies

images

Sajid Ali is currently Assistant Professor at the Department of Statistics, Quaid-i-Azam University (QAU), Islamabad, Pakistan. He graduated (PhD Statistics) from Bocconi University, Milan, Italy. His research interest is focused on Bayesian inference, construction of new flexible probability distributions, time series analysis, and process monitoring.

images

Sanku Dey, M.Sc., Ph.D.: An Associate Professor in the Department of Statistics, St. Anthony’s College, Shillong, Meghalaya, India. He has to his credit more than 220 research papers in journals of repute. He is a reviewer and associate editors of reputed international journals. He has a good number of contributions in almost all fields of Statistics viz., distribution theory, discretization of continuous distribution, reliability theory, multi-component stress-strength reliability, survival analysis, Bayesian inference, record statistics, statistical quality control, order statistics, lifetime performance index based on classical and Bayesian approach as well as different types of censoring schemes etc.

images

M. H. Tahir is currently Professor of Statistics, and Chair Department of Statistics at The University of Bahawalpur (IUB), Bahawalpur, Pakistan. He received BSc, MSc and PhD degree from IUB in 1988, 1990 and 2010, respectively. Dr Tahir has over 27 years of teaching experience to post-graduate classes, and has supervised 65 MPhil and 8 PhD students successfully. He has published more than 90 research papers in national and international journals, including Journal of Statistical Planning and Inference, Communications in Statistics-Theory and Methods, Communications in Statistics-Simulation and Computation, Journal of Statistical Computation and Simulation, Journal of Statistical Distributions and Applications, Journal of Statistical Theory and Applications. He is reviewer of more than 55 national and international statistical journals. Dr. Tahir’s research interests include distribution theory, generalized classes of distributions, survival and lifetime data analysis, methods of estimation, and construction of experimental designs.

images

Muhammad Mansoor is Assistant Professor of Statistics at the Department of Statistics, Government Sadiq Egerton Graduate College, Bahawalpur, Pakistan. His current research focuses on generalizing statistical distributions arising from the hazard function. Other research areas include statistical inference of probability models, computational statistics, and regression analysis.

Abstract

1 Introduction

images

2 Basic Statistical Properties of PNH Distribution

2.1 Quantile function

2.2 Moments

images

2.3 Shapes of the Density and Hazard Rate Functions of PNH Model

images

3 Methods of Estimation

3.1 Method of Maximum Likelihood

3.2 Method of Maximum and Minimum Spacing Distance Estimators

3.3 Methods of Ordinary and Weighted Least Squares

3.4 Method of Percentiles

3.5 Methods of the Minimum Distances

3.5.1 Method of Cramér-von-Mises

3.5.2 Methods of Anderson-Darling and Right-tail Anderson-Darling

4 Simulation Study

5 Bayesian Estimation

6 Real Data Application

images

images

7 Conclusion

Acknowledgments

References

Biographies