Improved Small Sample Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions

Without the ability to use research tools and procedures that yield consistent measurements, researchers would be unable to draw conclusions, formulate theories, or make claims about the generalizability of their results. In statistics, the coefficient of variation is commonly used as an index of the reliability of measurements. Thus, comparing coefficients of variation is of special interest. Moreover, the lognormal distribution has been frequently used for modeling data from many fields such as health and medical research. In this paper, we propose a simulated Bartlett corrected likelihood ratio approach to obtain inference concerning the ratio of two coefficients of variation for lognormal distributions. Simulation studies show that the proposed method is extremely accurate even when the sample size is small.


Introduction
In health and medical research, it is common that the variable of interest, $X$, such as the survival time, takes only positive values and the underlying distribution of this variable is highly skewed to the right. In this case, the frequently assumed normal distribution for $X$ is not suitable. A standard approach is to first transform $X$ such that the transformed variable $Y = g(X)$ is normally distributed. Then the existing statistical theories developed for the normal distribution can be applied. For $X > 0$ with a distribution that is highly skewed to the right, the most common transformation is the logarithmic transformation. In other words, $Y = \log(X)$ is normally distributed. Hence, $X$ is lognormally distributed. A detailed review of the theory of the lognormal distribution can be found in Aitchison and Brown [1], and Crow and Shimizu [2]. In practice, Fears et al. [3] investigated the variability and reproducibility of hormone assays used by laboratories with the capability of performing large numbers of tests. They assumed that the hormone samples used in the laboratories are independent and lognormally distributed.
In this case, it is of special interest to know if each sample yields consistent measurements.
The coefficient of variation ($\tau$) is defined as the ratio of the standard deviation to the mean, where the mean is assumed to be nonzero. It is an important index for assessing the reliability of a measuring procedure. Hence, the problem considered in Fears et al. [3] can be viewed as testing whether the coefficients of variation of the measurements in each laboratory are the same or not.
Mathematically, if a random variable $X$ is distributed as lognormal($\mu$, $\sigma$), then $Y = \log(X)$ is distributed as normal with mean $\mu$ and variance $\sigma^2$. It is well-known that
$$E(X) = \exp\{\mu + \sigma^2/2\} \quad \text{and} \quad \mathrm{var}(X) = \exp\{2\mu + \sigma^2\}\left(\exp\{\sigma^2\} - 1\right). \tag{1}$$
Hence, the coefficient of variation, $\tau$, is
$$\tau = \sqrt{\exp\{\sigma^2\} - 1}. \tag{2}$$
Nam and Kwon [4] compared various approximate interval estimations of the ratio of two coefficients of variation for independent lognormal distributions. Their simulation results showed that the empirical coverage rates of these methods are satisfactorily close to the nominal coverage rate for medium sample sizes. The aim of this paper is to develop a more accurate method to obtain inference for the ratio of two coefficients of variation for independent lognormal distributions. Moreover, the proposed method can be generalized to test whether the coefficients of variation from $k$ independent lognormal distributions are heterogeneous.
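The closed form of the coefficient of variation can be checked numerically against the lognormal mean and variance. The following is a minimal sketch using only the Python standard library; the function names and the parameter values are illustrative assumptions and are not part of the original paper.

```python
import math

def lognormal_mean_var(mu, sigma2):
    """Mean and variance of X when log(X) is normal with mean mu and variance sigma2."""
    mean = math.exp(mu + sigma2 / 2.0)
    var = math.exp(2.0 * mu + sigma2) * (math.exp(sigma2) - 1.0)
    return mean, var

def lognormal_cv(sigma2):
    """Closed-form coefficient of variation: sqrt(exp(sigma^2) - 1)."""
    return math.sqrt(math.exp(sigma2) - 1.0)

# Illustrative (hypothetical) parameter values.
mean_a, var_a = lognormal_mean_var(mu=1.3, sigma2=0.25)
mean_b, var_b = lognormal_mean_var(mu=-2.0, sigma2=0.25)

cv_from_moments_a = math.sqrt(var_a) / mean_a
cv_from_moments_b = math.sqrt(var_b) / mean_b
cv_closed_form = lognormal_cv(0.25)
```

The two moment-based values agree with the closed form regardless of the location parameter, which illustrates why inference on the coefficient of variation of a lognormal distribution involves only the variance of the log-transformed data.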
The rest of the paper is organized as follows. Section 2 reviews the existing methods for obtaining inference concerning the ratio of two coefficients of variation from independent lognormal distributions. The simulated Bartlett corrected likelihood ratio method is proposed in Section 3. A real data example is presented in Section 4 to illustrate the application of the methods discussed in this paper. Simulation studies are performed in Section 5 to compare the accuracy of these methods. An extension to testing for homogeneity of coefficients of variation from $k$ independent lognormal distributions is discussed in Section 6. Some concluding remarks are recorded in Section 7.

Existing Methods for Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions
Let $(x_{i1}, \ldots, x_{in_i})$ be the $i$th sample from the lognormal($\mu_i$, $\sigma_i$) distribution, where $i = 1, \ldots, k$. Then $(y_{i1}, \ldots, y_{in_i}) = (\log x_{i1}, \ldots, \log x_{in_i})$ is the $i$th sample from the normal distribution with mean $\mu_i$ and variance $\sigma_i^2$. From (2), the $i$th coefficient of variation is $\tau_i = \sqrt{\exp\{\sigma_i^2\} - 1}$. Nam and Kwon [4] compared four methods for obtaining confidence intervals for $\psi = \tau_1/\tau_2$. The following is a summary of these methods; the explicit expressions of the test statistics and their variance estimates are given in Nam and Kwon [4].
(1) Wald type method. Let $z_W(\psi)$ be the observed Wald type test statistic. Then $z_W(\psi)$ is asymptotically distributed as the standard normal distribution. The significance function of $\psi$ is $p(\psi) = \Phi(z_W(\psi))$, where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.
(2) Fieller type method. Let $z_F(\psi)$ be the observed Fieller type test statistic. Then $z_F(\psi)$ is also asymptotically distributed as the standard normal distribution. The significance function of $\psi$ is $p(\psi) = \Phi(z_F(\psi))$.
(3) Log method. Let $z_L(\psi)$ be the observed test statistic based on $\log \hat\psi$. Then $z_L(\psi)$ is also asymptotically distributed as the standard normal distribution. The significance function of $\psi$ is $p(\psi) = \Phi(z_L(\psi))$.
(4) MOVER method. The method of variance estimates recovery (MOVER) combines separate confidence limits for $\log \tau_1$ and $\log \tau_2$ into an approximate $(1 - \alpha)100\%$ confidence interval $(L, U)$ for $\log \psi$. Thus, an approximate $(1 - \alpha)100\%$ confidence interval for $\psi$ is $(\exp\{L\}, \exp\{U\})$. If $\widetilde{\mathrm{var}}(\log \hat\tau_i)$, for $i = 1, 2$, is taken to be the same as that obtained in the Log method, the MOVER method is identical to the Log method. Note that Hasan and Krishnamoorthy [5] proposed an improved version of the MOVER method.
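To make the Log method concrete, the following sketch computes an approximate confidence interval for $\psi$ on the log scale. It is an illustration under stated assumptions, not the exact procedure of Nam and Kwon [4]: the variance of $\log \hat\tau_i$ is obtained here by the delta method with the asymptotic approximation $\mathrm{var}(\hat\sigma_i^2) \approx 2\sigma_i^4/n_i$, which may differ from the variance estimate used in [4]. The function names and the data are hypothetical.

```python
import math
from statistics import NormalDist

def mle_var(y):
    """MLE of the normal variance (divisor n) for log-scale data y."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y) / len(y)

def log_cv_and_var(y):
    """Return log(tau_hat) and a delta-method variance estimate for it."""
    n, s2 = len(y), mle_var(y)
    log_tau = 0.5 * math.log(math.expm1(s2))
    # d(log tau)/d(sigma^2) = exp(sigma^2) / (2 * (exp(sigma^2) - 1))
    grad = math.exp(s2) / (2.0 * math.expm1(s2))
    # Assumption: var(sigma2_hat) is approximated by 2 * sigma^4 / n.
    return log_tau, grad ** 2 * (2.0 * s2 ** 2 / n)

def log_method_ci(y1, y2, level=0.95):
    """Approximate confidence interval for psi = tau_1 / tau_2."""
    l1, v1 = log_cv_and_var(y1)
    l2, v2 = log_cv_and_var(y2)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(v1 + v2)
    centre = l1 - l2  # log(psi_hat)
    return math.exp(centre - half_width), math.exp(centre + half_width)

# Hypothetical log-scale measurements for two samples.
y1 = [0.1, 0.9, -0.4, 1.3, 0.6, -0.2]
y2 = [0.3, 0.5, 0.1, 0.7, 0.4, 0.2]
lower, upper = log_method_ci(y1, y2)
```

Because the interval is built for $\log \psi$ and then exponentiated, it is symmetric about $\hat\psi$ on the log scale and always has positive limits.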

Proposed Method
In this section, we first review the likelihood based methods and the Bartlett corrected likelihood ratio method. Since the required Bartlett adjustment for the Bartlett corrected likelihood ratio method is very difficult to obtain, a numerical algorithm is proposed to approximate it. The methods are then applied to obtain inference for the ratio of two coefficients of variation of two independent lognormal distributions.

Likelihood Based Methods and Bartlett Corrected Likelihood Ratio Method
Let $(x_1, \ldots, x_n)$ be a sample from a known distribution with probability density function $f(\cdot\,; \theta)$, where $\theta$ is a $p$-dimensional vector of parameters. Let $\psi = \psi(\theta)$, which has dimension $d < p$, be the parameter of interest. The log-likelihood function is
$$\ell(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta).$$
Under the regularity conditions stated in Barndorff-Nielsen and Cox [6], the standardized maximum likelihood estimate (MLE) statistic $(\hat\theta - \theta)' [\mathrm{var}(\hat\theta)]^{-1} (\hat\theta - \theta)$ and the likelihood ratio statistic $2[\ell(\hat\theta) - \ell(\theta)]$ are asymptotically chi-square distributed with $p$ degrees of freedom, $\chi^2_p$, where $\hat\theta$ is the overall MLE, which is the value of $\theta$ that maximizes $\ell(\theta)$, and $\mathrm{var}(\hat\theta)$ is approximately the inverse of the Fisher expected information.
When the parameter of interest is $\psi = \psi(\theta)$, Barndorff-Nielsen and Cox [6] showed that similar statistics can be obtained. The standardized MLE statistic becomes
$$Q(\psi) = (\hat\psi - \psi)' [\mathrm{var}(\hat\psi)]^{-1} (\hat\psi - \psi),$$
where $\hat\psi = \psi(\hat\theta)$, and $\mathrm{var}(\hat\psi)$ can be approximated by the delta method, which takes the form
$$\mathrm{var}(\hat\psi) \approx \psi_\theta(\hat\theta)\, \mathrm{var}(\hat\theta)\, \psi_\theta(\hat\theta)', \qquad \psi_\theta(\theta) = \frac{\partial \psi(\theta)}{\partial \theta'}.$$
The likelihood ratio statistic is
$$W(\psi) = 2[\ell(\hat\theta) - \ell(\tilde\theta)], \tag{13}$$
where $\tilde\theta$ is the constrained MLE, which is obtained by maximizing $\ell(\theta)$ for the given $\psi$ value. Both $Q(\psi)$ and $W(\psi)$ are asymptotically $\chi^2_d$. The significance function for $\psi$, as defined in Fraser [7], is $p(\psi) = P(\chi^2_d \le q(\psi))$ or $p(\psi) = P(\chi^2_d \le w(\psi))$, where $q(\psi)$ and $w(\psi)$ are the observed values of $Q(\psi)$ and $W(\psi)$, respectively, and it can be used to obtain inference concerning $\psi$. In particular, the $(1 - \alpha)100\%$ confidence region of $\psi$ is
$$\{\psi : q(\psi) \le \chi^2_{d, 1-\alpha}\} \quad \text{or} \quad \{\psi : w(\psi) \le \chi^2_{d, 1-\alpha}\},$$
respectively, where $\chi^2_{d, 1-\alpha}$ is the $(1 - \alpha)100$th percentile of $\chi^2_d$.
It is well-known that these two asymptotic methods have rate of convergence $O(n^{-1/2})$, and they are referred to as first-order methods. In the statistics literature, there exist various adjustments to improve the accuracy of the above methods. In particular, Barndorff-Nielsen [8,9] introduced the modified signed log-likelihood ratio statistic, a third-order method. However, this method is restricted to a scalar parameter of interest. On the other hand, Bartlett [10] proposed a transformation of the likelihood ratio statistic such that the mean of the transformed statistic matches the mean of the asymptotic distribution. More specifically,
$$W^*(\psi) = \frac{W(\psi)}{B},$$
where $B$ is the Bartlett adjustment such that $E[W^*(\psi)] = d$, and $W^*(\cdot)$ is known as the Bartlett corrected likelihood ratio statistic. An obvious choice of $B$ is
$$B = \frac{E[W(\psi)]}{d}.$$
Bartlett [10] showed that the Bartlett corrected likelihood ratio statistic is also asymptotically $\chi^2_d$ distributed and that it has rate of convergence $O(n^{-2})$. Therefore, it is an extremely accurate method. Nevertheless, except in a few well-defined problems, $E[W(\psi)]$ is very difficult to obtain, which hinders the use of this method in applied statistics. A review of the Bartlett corrected likelihood ratio method can be found in Barndorff-Nielsen and Cox [6].
Although, mathematically, the explicit closed form of $B$, or even an asymptotic expansion of $B$, is difficult to obtain, we propose the following algorithm to obtain $E[W(\psi)]$ numerically and, hence, an estimate of $B$.
Have: The overall maximum likelihood estimate $\hat\theta$, the constrained maximum likelihood estimate $\tilde\theta$, and the observed likelihood ratio statistic $w(\psi)$.
Step 1: Generate $N$ sets of simulated data from the assumed distribution with $\theta = \tilde\theta$.
Step 2: For each set of simulated data, obtain the simulated observed likelihood ratio statistic. As a result, we have $w_1(\psi), \ldots, w_N(\psi)$.
Step 3: Calculate
$$\bar w(\psi) = \frac{1}{N} \sum_{j=1}^{N} w_j(\psi),$$
which is an estimate of the mean of the likelihood ratio statistic. Hence, we have $\hat B = \bar w(\psi)/d$.
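Step 4: The observed simulated Bartlett corrected likelihood ratio statistic is
$$w^*(\psi) = \frac{w(\psi)}{\hat B},$$
which is asymptotically distributed as $\chi^2_d$ with fourth order rate of convergence. Thus, the significance function is $p(\psi) = P(\chi^2_d \le w^*(\psi))$, and the $(1 - \alpha)100\%$ confidence region of $\psi$ is $\{\psi : w^*(\psi) \le \chi^2_{d, 1-\alpha}\}$.
As a final note on the proposed algorithm, theoretically, $N$ should be as large as possible. However, the larger $N$ is, the more calculations are required to obtain $\bar w(\psi)$. Moreover, the more nuisance parameters exist in the model, the larger $N$ has to be. We recommend choosing $N$ by trial and error until $\bar w(\psi)$ has stabilized.

Applying Likelihood Based Method to Obtain Inference on the Ratio of Two Coefficients of Variation of Two Independent Lognormal Distributions
Let $(y_{i1}, \ldots, y_{in_i})$, $i = 1, 2$, be the log-transformed samples defined in Section 2, and let $\theta = (\mu_1, \mu_2, \sigma_1^2, \sigma_2^2)$. Then the log-likelihood function for $\theta$ can be written, up to an additive constant, as
$$\ell(\theta) = \sum_{i=1}^{2} \left[ -\frac{n_i}{2} \log \sigma_i^2 - \frac{1}{2\sigma_i^2} \sum_{j=1}^{n_i} (y_{ij} - \mu_i)^2 \right]. \tag{20}$$
It is easy to show that the overall MLE is $\hat\theta = (\hat\mu_1, \hat\mu_2, \hat\sigma_1^2, \hat\sigma_2^2)$, where
$$\hat\mu_i = \bar y_i = \frac{1}{n_i} \sum_{j=1}^{n_i} y_{ij} \quad \text{and} \quad \hat\sigma_i^2 = \frac{1}{n_i} \sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2.$$
Since our parameter of interest is $\psi = \psi(\theta) = \tau_1/\tau_2$, where $\tau_i = \sqrt{\exp\{\sigma_i^2\} - 1}$, we have
$$\hat\psi = \frac{\sqrt{\exp\{\hat\sigma_1^2\} - 1}}{\sqrt{\exp\{\hat\sigma_2^2\} - 1}}.$$
For a given $\psi$ value, the constraint $\tau_1 = \psi \tau_2$ gives $\sigma_1^2 = \log(\psi^2 \exp\{\sigma_2^2\} - \psi^2 + 1)$, and $\mu_i$ is estimated by $\bar y_i$. Hence, the log-likelihood function in (20) can be expressed as a function of $\sigma_2^2$ only, and is
$$\ell(\sigma_2^2) = -\frac{n_1}{2} \left[ \log \sigma_1^2 + \frac{\hat\sigma_1^2}{\sigma_1^2} \right] - \frac{n_2}{2} \left[ \log \sigma_2^2 + \frac{\hat\sigma_2^2}{\sigma_2^2} \right], \qquad \sigma_1^2 = \log(\psi^2 \exp\{\sigma_2^2\} - \psi^2 + 1). \tag{23}$$
To solve for the constrained MLE $\tilde\theta = (\tilde\sigma_1^2, \tilde\sigma_2^2)$, we have to find the $\tilde\sigma_2^2$ that maximizes (23), and then $\tilde\sigma_1^2 = \log(\psi^2 \exp\{\tilde\sigma_2^2\} - \psi^2 + 1)$. Once we have both the overall and constrained MLEs, we can obtain the observed likelihood ratio statistic $w(\psi)$ as given in (13). Therefore, the significance function is $p(\psi) = P(\chi^2_1 \le w(\psi))$. Moreover, by applying the algorithm given in the previous section, we can also obtain the observed simulated Bartlett corrected likelihood ratio statistic $w^*(\psi)$, and the corresponding significance function is $p(\psi) = P(\chi^2_1 \le w^*(\psi))$.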
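The steps above can be sketched in Python for the two-sample lognormal problem. This is a minimal illustration using only the standard library, not the authors' implementation: the one-dimensional maximizer (a grid search followed by golden-section refinement), the search range for $\log \sigma_2^2$, the function names, and the generated data are all assumptions made for the example.

```python
import math
import random

def mle_var(y):
    """MLE of the normal variance (divisor n) for log-scale data y."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y) / len(y)

def pl(n, s2, sigma2):
    """Normal log-likelihood in sigma2 with the mean set to the sample mean."""
    return -0.5 * n * (math.log(sigma2) + s2 / sigma2)

def sigma1_sq(psi, sigma2_sq):
    """sigma_1^2 implied by the constraint tau_1 = psi * tau_2."""
    return math.log1p(psi ** 2 * math.expm1(sigma2_sq))

def argmax_1d(f, lo, hi, grid=100, iters=50):
    """Coarse grid search followed by golden-section refinement."""
    xs = [lo + (hi - lo) * i / grid for i in range(grid + 1)]
    best = max(range(grid + 1), key=lambda i: f(xs[i]))
    a, b = xs[max(best - 1, 0)], xs[min(best + 1, grid)]
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if f(c) > f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

def lr_stat(psi, y1, y2):
    """Observed likelihood ratio statistic w(psi) and the constrained MLEs."""
    n1, n2, s1, s2 = len(y1), len(y2), mle_var(y1), mle_var(y2)

    def profile(u):  # u = log(sigma_2^2); the range below is an arbitrary choice
        t = math.exp(u)
        return pl(n1, s1, sigma1_sq(psi, t)) + pl(n2, s2, t)

    c2 = math.exp(argmax_1d(profile, -12.0, 3.0))
    c1 = sigma1_sq(psi, c2)
    w = 2.0 * (pl(n1, s1, s1) + pl(n2, s2, s2) - pl(n1, s1, c1) - pl(n2, s2, c2))
    return max(w, 0.0), c1, c2

def simulated_bartlett(psi, y1, y2, n_sim=500, rng=None):
    """Return w(psi), the simulated Bartlett corrected w*(psi), and B_hat."""
    rng = rng or random.Random()
    w, c1, c2 = lr_stat(psi, y1, y2)
    sims = []
    for _ in range(n_sim):
        # Simulate on the log scale under the constrained MLE; the means do
        # not affect the statistic, so they are set to zero.
        z1 = [rng.gauss(0.0, math.sqrt(c1)) for _ in y1]
        z2 = [rng.gauss(0.0, math.sqrt(c2)) for _ in y2]
        sims.append(lr_stat(psi, z1, z2)[0])
    b_hat = sum(sims) / n_sim  # d = 1 for a scalar parameter of interest
    return w, w / b_hat, b_hat

def chi2_1_cdf(w):
    """P(chi-square with 1 degree of freedom <= w)."""
    return math.erf(math.sqrt(w / 2.0))

# Hypothetical data: two small lognormal samples, log-transformed.
rng = random.Random(2024)
y1 = [math.log(rng.lognormvariate(0.0, 1.0)) for _ in range(5)]
y2 = [math.log(rng.lognormvariate(0.0, 0.7)) for _ in range(5)]
w, w_star, b_hat = simulated_bartlett(1.0, y1, y2, n_sim=500, rng=rng)
p_lr, p_bartlett = chi2_1_cdf(w), chi2_1_cdf(w_star)
```

A confidence region for $\psi$ can then be traced out by evaluating `simulated_bartlett` over a grid of $\psi$ values and keeping those with $w^*(\psi) \le \chi^2_{1, 1-\alpha}$; a numerical optimizer from a scientific library could replace the hand-written maximizer.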

Real Data Example
To illustrate the application of the methods discussed in this paper, we revisit the example discussed in Nam and Kwon [4]. Faupel-Badger et al. [11] compared concentrations of estrogen metabolites measured by RIA with the concentrations obtained using a novel high-performance liquid chromatography-tandem mass spectrometry (LC-MS/MS) assay. The 10% blinded quality control samples were used for assessment of quality control of the laboratory assay. A partial summary of the data was presented in Nam and Kwon [4], where the first sample is taken from RIA and the second sample is taken from LC-MS/MS. Table 1 records the 95% confidence intervals for the ratio of the two coefficients of variation obtained by the methods discussed in this paper, assuming that the data are obtained from independent lognormal distributions. Note that the MOVER method is identical to the Log method, and Hasan and Krishnamoorthy [5] showed that results from the improved version of the MOVER method are still similar to those obtained by the Log method. Hence, neither the MOVER method nor its improved version is included in the calculations. Except for the Wald type, the intervals obtained in Nam and Kwon [4] seem to be close to each other. Notice that the results from the Fieller type are different from those reported in Nam and Kwon [4]. Moreover, we observe that the likelihood ratio method and the proposed Bartlett correction method differ from the other methods by having a larger upper confidence limit. With the above observation, it is of interest to compare the accuracy of the methods discussed in this paper, especially when the sample size is small.

Simulation Studies
To compare the accuracy of the methods discussed in this paper, simulation studies are performed. The parameter settings are given in Table 2. Other settings have also been examined but are not reported because the results are very similar to those presented; they are available upon request. Since we are interested in developing a method that is accurate even for small sample sizes, the chosen sample sizes in the simulation studies are relatively small. For each study, we obtain $M = 10{,}000$ simulated samples. Theoretically, $N$ should be as large as possible because we want to use $\bar w(\psi)$ as the estimate of $E[W(\psi)]$. However, numerically, we have $M$ simulated samples, and for each simulated sample, we have to do $N$ simulations to obtain $\bar w(\psi)$. For these simulation studies, we use $N = 500$. For each simulated sample, we compute the 95% confidence interval obtained by each of the methods discussed in this paper. Table 3 reports the percentage of samples where the true $\psi$ is less than the lower 95% confidence limit (le), is within the 95% confidence interval (cc), and is greater than the upper 95% confidence limit (ue). The nominal values are 2.5%, 95%, and 2.5%, respectively.
From Table 3, the three methods discussed in Nam and Kwon [4] do not give satisfactory coverage, especially when the sample sizes are small. The coverage of the likelihood ratio method improves as the sample sizes increase, and, in general, it has asymmetric errors. Nevertheless, the proposed simulated Bartlett corrected likelihood ratio method is extremely accurate even when the sample sizes are as small as 5.

Testing Homogeneity of Coefficients of Variation from 𝑘 Independent Lognormal Distributions
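For $k$ samples from independent lognormal($\mu_i$, $\sigma_i$) distributions, the required log-likelihood function can be written, up to an additive constant, as
$$\ell(\theta) = \sum_{i=1}^{k} \left[ -\frac{n_i}{2} \log \sigma_i^2 - \frac{1}{2\sigma_i^2} \sum_{j=1}^{n_i} (y_{ij} - \mu_i)^2 \right].$$
The aim is to test
$$H_0: \tau_1 = \cdots = \tau_k \quad \text{versus} \quad H_a: \text{not all coefficients of variation are the same},$$
which, in this case, is the same as testing
$$H_0: \sigma_1^2 = \cdots = \sigma_k^2 = \sigma^2 \quad \text{versus} \quad H_a: \text{not all variances are the same}. \tag{28}$$
Therefore, when $H_0$ is true, the log-likelihood function can be re-written in terms of $\sigma^2$ and is
$$\ell(\mu_1, \ldots, \mu_k, \sigma^2) = -\frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \mu_i)^2, \qquad n = \sum_{i=1}^{k} n_i,$$
and the constrained MLE is $\tilde\mu_i = \bar y_i$ and
$$\tilde\sigma^2 = \frac{1}{n} \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar y_i)^2 = \frac{1}{n} \sum_{i=1}^{k} n_i \hat\sigma_i^2,$$
which is the pooled variance estimate with divisor $n$. The observed likelihood ratio statistic is
$$w = n \log \tilde\sigma^2 - \sum_{i=1}^{k} n_i \log \hat\sigma_i^2,$$
which is asymptotically distributed as $\chi^2_{k-1}$. Hence, the observed simulated Bartlett corrected likelihood ratio statistic is
$$w^* = \frac{w}{\hat B},$$
where $\hat B$ is obtained by the algorithm given in Section 3.
Simulation studies are performed to compare the accuracy of the likelihood ratio method and the simulated Bartlett corrected likelihood ratio method. In particular, three samples of data from the lognormal($\mu_i$, $\sigma$) distribution are generated. Then $w$ is calculated, and $w^*$ is also calculated with $N = 1000$. We repeat this process $M = 10{,}000$ times. The proportion of samples that have $p$-values less than 5% is reported in Table 4 for various sample sizes. The choice of $\mu_i$ is not important because it is not involved in any of the calculations and, hence, we take it to be 0. Different choices of $\sigma$ give similar results, which are not reported but are available upon request. Table 4 reports the case $\mu_i = 0$ and $\sigma = 1$. When the sample sizes are small, the likelihood ratio method does not give satisfactory results, but it improves as the sample sizes increase. The simulated Bartlett corrected likelihood ratio method consistently gives extremely accurate results even when the sample sizes are small.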
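The homogeneity test can be sketched in the same way. The following is a minimal standard-library illustration under stated assumptions rather than the authors' code: the function names and data are hypothetical, and the closed-form $p$-value helper is valid only for $k = 3$, where the reference distribution is $\chi^2_2$.

```python
import math
import random

def mle_var(y):
    """MLE of the normal variance (divisor n) for log-scale data y."""
    m = sum(y) / len(y)
    return sum((v - m) ** 2 for v in y) / len(y)

def pooled_var(samples):
    """Constrained MLE of the common variance (pooled, divisor n)."""
    n = sum(len(y) for y in samples)
    return sum(len(y) * mle_var(y) for y in samples) / n

def lr_homogeneity(samples):
    """w = n * log(pooled variance) - sum_i n_i * log(sigma_i_hat^2)."""
    n = sum(len(y) for y in samples)
    return n * math.log(pooled_var(samples)) - sum(
        len(y) * math.log(mle_var(y)) for y in samples
    )

def bartlett_homogeneity(samples, n_sim=1000, rng=None):
    """Return w, the simulated Bartlett corrected w*, and B_hat."""
    rng = rng or random.Random()
    k = len(samples)
    w = lr_homogeneity(samples)
    sd = math.sqrt(pooled_var(samples))
    sims = []
    for _ in range(n_sim):
        # Simulate under H0 with the constrained MLE; the means are set to
        # zero because they do not enter the statistic.
        sim = [[rng.gauss(0.0, sd) for _ in y] for y in samples]
        sims.append(lr_homogeneity(sim))
    b_hat = (sum(sims) / n_sim) / (k - 1)
    return w, w / b_hat, b_hat

def chi2_2_sf(w):
    """P(chi-square with 2 degrees of freedom > w); applies to k = 3 only."""
    return math.exp(-w / 2.0)

# Hypothetical data: three log-scale samples of size 5 with common variance.
rng = random.Random(7)
samples = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(3)]
w, w_star, b_hat = bartlett_homogeneity(samples, n_sim=1000, rng=rng)
p_lr, p_bartlett = chi2_2_sf(w), chi2_2_sf(w_star)
```

For other values of $k$, the $p$-value would be computed from the $\chi^2_{k-1}$ distribution using a statistical library. Since $\hat B$ typically exceeds 1 for small samples, the corrected statistic is smaller than $w$, which counteracts the tendency of the uncorrected likelihood ratio test to reject too often.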

Conclusion
The lognormal distribution has been frequently used for modeling positive valued, right skewed data, which commonly arise in health and medical research. In this paper, we proposed a simulated Bartlett corrected likelihood ratio approach to obtain inference concerning the ratio of two coefficients of variation for lognormal distributions. Simulation studies show that the proposed Bartlett correction method is extremely accurate even when the sample size is small. Moreover, the proposed Bartlett correction method is extended to test homogeneity of $k$ coefficients of variation from independent lognormal distributions.

Table 2: Parameter settings for simulation studies.

Table 3(a): Empirical coverage rates for simulation studies 1 to 8.