Impact of Logarithmic Transformation on the Restoration of Normality in Bioequivalence Data

The logarithmic transformation is widely used to address skewness and to restore normality in bioequivalence (BE) data, but it may not achieve this in all cases unless the underlying distributional assumption is examined and verified for the data generated in the BE study. Instead of restoring normality, the log-transformation may introduce new problems, such as inducing skewness and increasing variability, which are even more difficult to deal with than the original problem of non-normally distributed data. Pharmacokinetic parameters derived from the real biodata of a bioequivalence study of Glimepiride 4 mg tablets were statistically analyzed through ANOVA, with and without log-transformation, and the two analyses were compared with respect to the normality assumption using standard normality checks, namely the Shapiro-Wilk test and Q-Q plots. The comparison of the results from the two approaches, linear and log-transformed, did not show any significant difference. Further investigation is required to strengthen this notion and to identify the circumstances and the deciding parameters that should guide the selection of a suitable model for data analysis and conclusion. Alternative analytic methods that eliminate the need to transform non-normal data prior to analysis are recommended, such as the Wilcoxon-Mann-Whitney two one-sided test recommended by Hauschke et al., the Hodges-Lehmann estimator, or newer distribution-free methods that do not depend on the distribution of the data, such as generalized estimating equations (GEE).


INTRODUCTION
Data generated in any biomedical setting, such as a scientific experiment, a survey or a clinical study, are mostly assumed to be normally distributed. The normal probability distribution (NPD) is the most common of the continuous distributions and enjoys a prominent place in biomedical research. Yet not all attributes in biomedical research are normally distributed, and in many cases the data show a skewed distribution instead of the bell-shaped normal curve. Such departures from normality can often be corrected by applying a standard data transformation, such as the logarithmic or square-root transformation, but not all important variables can be transformed to normality in this way.
The logarithmic transformation (LT) is commonly applied to biodata to deal with skewness and bring the distribution closer to symmetry or normality prior to t-testing. The LT is used in evaluating the majority of random variables and drawing statistical inferences from them, but its usefulness rests on the fundamental assumption that the generated biodata are distributed normally. Yet it is not guaranteed that the LT will restore normality; it may instead worsen the situation, since it can both induce skewness and increase the variability of the data. In addition, the results of a standard statistical test performed on log-transformed data are sometimes not relevant to the original, non-transformed data.
*Address correspondence to this author at the Department of Pharmaceutics, Faculty of Pharmacy, Barrett Hodgson University-Karachi, Karachi, Pakistan; Tel: 92 300 8262065; E-mail: Ghazala_ishrat@yahoo.com
Bioequivalence (BE) studies are conducted with the objective of evaluating the equivalence of Test and Reference drug products, mainly to support the switchability and interchangeability of generic copies with the innovator's brands. The selected pharmacokinetic (PK) parameters are computed from the biodata generated during the study and are required to be log-transformed prior to statistical analysis. It is generally acknowledged that the validity of the resulting inferences cannot be ascertained if the inter-subject or intra-subject variabilities of such data are not normally distributed. Such variabilities represent a case in point for BE data.
It has conventionally been assumed that, since the decay of plasma drug levels in plasma concentration-time data is exponential and therefore linear on the logarithmic scale, the probability distribution of any PK metric extracted from such data will be normal on that scale. Accordingly, the distribution of the variability of the main PK metrics, such as the area under the plasma concentration-time curve (AUC) and the maximum plasma drug concentration (Cmax), is expected to be normal on the log scale. Almost all regulatory guidances therefore require LT of the biodata prior to statistical analysis, with the aim of achieving normality of the probability distribution [1][2][3][4].
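The concern raised above, that a log-transformation can induce skewness rather than remove it, can be illustrated with a minimal Python sketch. The two data sets are hypothetical and deterministic (evenly spaced normal quantiles and their exponentials); they are not taken from any BE study, and the helper names `skewness` and `normal_scores` are introduced only for this illustration.

```python
import math
from statistics import NormalDist, fmean


def skewness(values):
    """Moment-based sample skewness (third standardized moment)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def normal_scores(n, mu, sigma):
    """Deterministic, evenly spaced quantiles of N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Hypothetical data sets (assumptions for illustration only).
symmetric = normal_scores(1000, mu=100.0, sigma=20.0)  # symmetric on the linear scale
right_skewed = [math.exp(z) for z in normal_scores(1000, mu=4.6, sigma=0.4)]

for label, data in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    raw = skewness(data)
    logged = skewness([math.log(v) for v in data])
    print(f"{label:>12}: skewness raw = {raw:+.3f}, after log = {logged:+.3f}")
```

In this sketch the log-transformation removes the skewness of the right-skewed set but gives the initially symmetric set a negative skewness, which is why the distribution on the original scale should be checked before the transformation is applied.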

OBJECTIVE
The aim of the present work is to assess the impact of log-transformation on the restoration of normality in data generated during a bioequivalence (BE) study. The main focus is to examine whether the log-transformation removes the skewness of the data and succeeds in restoring normality of the distribution. A further objective is to demonstrate the intrusive nature of the log-transformation and its impact on the dispersion of real data.

METHODOLOGY
The real data of a 2 x 2 BE study of Glimepiride (active pharmaceutical ingredient) 4 mg tablets were used in this research, with the generous permission of JPM [5]. Both products, Glitra (JPM) as Test and Amaryl (Aventis) as Reference, satisfied the official requirements with regard to their pharmaceutical characteristics. The data, whose probability distribution was unknown, were tested for normality in accordance with officially approved statistical procedures using Biostat®, a software package similar to SAS. The results for the original and the log-transformed data of the BE study were compared. The impact of LT was examined on several known normality indicators, namely skewness and kurtosis, as well as on other estimates and test outcomes, such as the width of the shortest 90% confidence interval (CI) and the outcome of the two one-sided tests (TOST) procedure.

Study Design Features, Statistical Evaluation and Bioequivalence Conclusion
The data used in this work were taken from the Glitra (Glimepiride) BE study, which was conducted on thirty-six healthy male volunteers as per the protocol approved by the Ethics Committee. After drug administration, blood samples were collected at pre-determined intervals and the plasma Glimepiride levels were determined by a fully validated analytical procedure.
For the BE evaluation, two procedures recommended by regulatory authorities worldwide were employed: the classical 90% confidence interval (CI) and the two one-sided tests (TOST) of hypothesis. These two procedures are generally operationally equivalent, since they are expected to arrive at more or less the same decision with regard to concluding bioequivalence or bio-inequivalence. Both procedures have to be conducted on the log scale, even though all components of the ANOVA were found to be statistically non-significant on either scale.
For the purpose of concluding equivalence, the statistical evaluation of the plasma drug concentration-time data was conducted on the log-transformed data and included the analysis of variance (ANOVA) for the area under the plasma concentration-time profile from time zero to the last measurable plasma level (AUC0→t), the area from time zero to infinity (AUC0→∞) and the maximum plasma concentration (Cmax). The intra-subject residual component of the ANOVA was used to construct the shortest classical 90% CI, and Schuirmann's two one-sided tests (TOST) procedure was also adopted. Nevertheless, BE could be concluded only by means of the three statistical procedures used for BE testing, as demonstrated in the Tables and Figures that follow.
Notwithstanding, since the prime objective of the present work was to assess the impact of log-transformation on the results obtained from the biodata, the statistical evaluation was also performed on the linear scale so that the two approaches could be compared.
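The operational equivalence of the 90% CI and the TOST procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation performed by the software used in the study: it assumes a 2 x 2 crossover with n1 and n2 subjects per sequence, the conventional 80% to 125% acceptance limits (not stated in the text), and a critical t value supplied by the user from tables. The function name `be_interval_and_tost` and all input values are hypothetical.

```python
import math


def be_interval_and_tost(diff_log, mse, n1, n2, t_crit, lower=0.80, upper=1.25):
    """90% CI for the Test/Reference ratio and the two one-sided t statistics.

    diff_log : Test minus Reference mean on the log scale
    mse      : intra-subject residual mean square from the ANOVA (log scale)
    n1, n2   : subjects per sequence (assumed 2 x 2 crossover)
    t_crit   : upper 5% point of Student's t with n1 + n2 - 2 degrees of freedom
    """
    se = math.sqrt(mse / 2.0 * (1.0 / n1 + 1.0 / n2))
    ci = (math.exp(diff_log - t_crit * se), math.exp(diff_log + t_crit * se))
    t_lower = (diff_log - math.log(lower)) / se  # H0: ratio <= lower limit
    t_upper = (math.log(upper) - diff_log) / se  # H0: ratio >= upper limit
    by_ci = lower < ci[0] and ci[1] < upper
    by_tost = t_lower > t_crit and t_upper > t_crit
    return ci, (t_lower, t_upper), by_ci, by_tost


# Hypothetical inputs: 18 subjects per sequence, t(0.95, 34 df) taken from tables.
T_CRIT_34 = 1.6909
ci, t_stats, by_ci, by_tost = be_interval_and_tost(
    diff_log=0.02, mse=0.04, n1=18, n2=18, t_crit=T_CRIT_34
)
print(f"90% CI for the ratio: {ci[0]:.3f} to {ci[1]:.3f}")
print(f"TOST statistics: {t_stats[0]:.2f}, {t_stats[1]:.2f}")
print(f"BE by CI: {by_ci}, BE by TOST: {by_tost}")
```

Because the lower confidence limit exceeds the lower acceptance limit exactly when the first one-sided statistic exceeds the critical value, and likewise for the upper limit, the two decisions coincide for any input.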

RESULTS AND DISCUSSION
The results presented in this work provide preliminary evidence of the weaknesses and shortcomings of the statistical procedures presently used to assess BE data. Some suggestions are also proposed to address this problem.
Details of the BE study, including its features, the data for individual volunteers, the analytical results, the main pharmacokinetic metrics and the graphical presentation of the plasma concentration-time profiles for the test and reference products, are presented in Tables 1 & 2 and Figures 1 & 2. The ANOVA on the linear-linear and linear-log scales is detailed in Tables 3 & 4, and the 90% CI is presented graphically in Figure 3. The assessment of the restoration of normality by the Shapiro-Wilk test and by Q-Q plots of the studentized intra- and inter-subject residuals is given in Tables 5 & 6, and the statistical inference of the results according to the TOST or Anderson & Hauck's test methods is presented in Table 7.

Log-Transformation and Normality Assumptions
The impact of log-transformation on the normality indicators for both the Test and Reference products of the Glitra data is detailed in Table 5. The contents demonstrate that the probability distribution (PD) approximated normality for both products according to the Shapiro-Wilk (SW) test, although the PD of the Cmax data did not attain normality according to the same test. However, even though the SW test is based on the intra-subject variability, the Q-Q plots of this variability (studentized intra-subject residuals) contradicted the outcome of the SW test for most of the examined metrics of the Glitra study.
A tentative conclusion that may be drawn from these observations is that LT is likely to produce inconsistent outcomes with regard to the restoration of normality. Whilst the impact of LT on the PD is construed as the restoration of normality, an increase in the values of the TOST statistics is indicative of rejection of the bio-inequivalence hypothesis, which favors the conclusion that both products are bioequivalent.
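The Q-Q assessment referred to above can be sketched in a few lines of Python. This is a simplified illustration and not the procedure of the software used in the study: the residuals are merely standardized by their mean and standard deviation (true studentization also accounts for leverage), Blom plotting positions are assumed as one common convention, and the residual values are hypothetical. The function names `qq_points` and `qq_correlation` are introduced for this sketch only.

```python
import math
from statistics import NormalDist, fmean, pstdev


def qq_points(residuals):
    """Pair each ordered, standardized residual with its expected normal quantile."""
    n = len(residuals)
    mean, sd = fmean(residuals), pstdev(residuals)
    observed = sorted((r - mean) / sd for r in residuals)
    std_normal = NormalDist()
    # Blom plotting positions (an assumed convention; others exist).
    theoretical = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


def qq_correlation(points):
    """Pearson correlation of the Q-Q points; values near 1 indicate a straight line."""
    xs, ys = zip(*points)
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical residuals for 36 subjects: one normal-like set, one right-skewed set.
normal_like = [NormalDist().inv_cdf((i - 0.375) / 36.25) for i in range(1, 37)]
skewed = [math.exp(r) for r in normal_like]

r_normal = qq_correlation(qq_points(normal_like))
r_skewed = qq_correlation(qq_points(skewed))
print(f"Q-Q correlation, normal-like residuals: {r_normal:.4f}")
print(f"Q-Q correlation, skewed residuals:      {r_skewed:.4f}")
```

Plotting the returned pairs gives the Q-Q plot itself; the correlation is only a numerical summary of how closely the points follow a straight line and is not a substitute for the SW test.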

Shapiro-Wilk Statistics and Normality Evaluation
The SW test is considered by many scholars to be the most robust procedure for assessing the departure of the probability distribution of any set of random variables from normality, or its proximity to it. Hence, it was adopted in this work together with other test procedures for this purpose. As per this test, 50% of the data in question should have a value of the SW statistic that is higher than the null hypothesis cut-off value. This situation is exemplified in Table 5.

CONCLUSION
Although log-transformation is assumed to restore or improve normality in biodata, it may not achieve the intended purpose: it is not guaranteed that log-transformation will restore normality, and it may instead induce skewness and variability that worsen the situation. Using log-transformation may therefore be problematic and, if it is used at all, its limitations should be carefully considered, particularly when interpreting the relevance of the analysis of transformed data for the hypothesis of interest about the original data.
If the basic assumptions underlying the restoration of normality are not met, the log-transformation in many circumstances neither restores normality nor reduces variability, but instead introduces skewness into the data. Moreover, inferences drawn from log-transformed data may not characterize the original data, since the transformed data share little with the original.
It is also concluded that, if the data can be reasonably modeled by a parametric distribution such as the normal distribution, it is preferable to use the classic statistical methods because they usually provide more efficient inference; if log-transformation is unavoidable, it must be applied very cautiously. On the contrary, in the case of skewed data, instead of searching for an appropriate statistical distribution or transformation to model the observed data, it may be more appropriate to switch to distribution-free methods such as the Wilcoxon-Mann-Whitney two one-sided test [6] recommended by Hauschke et al. [7], the Hodges-Lehmann estimator [8], or newer distribution-free methods that do not depend on the data distribution, such as generalized estimating equations (GEE) [9,10]. The GEE approach makes no distributional assumption and provides valid inference irrespective of the probability distribution of the data; nevertheless, it is applicable only in the case of skewed data.
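As a small illustration of one of the distribution-free tools named above, the two-sample Hodges-Lehmann estimator is the median of all pairwise differences between two samples. The sketch below is generic: how the two samples are formed in a crossover BE design is outside its scope, the function name `hodges_lehmann_shift` is introduced for this example, and the numbers are hypothetical.

```python
from statistics import median


def hodges_lehmann_shift(x, y):
    """Two-sample Hodges-Lehmann estimate: median of all differences x_i - y_j."""
    return median(xi - yj for xi in x for yj in y)


# Hypothetical samples; the second call replaces one value with an extreme outlier.
clean = hodges_lehmann_shift([1, 2, 3], [0, 1, 2])
with_outlier = hodges_lehmann_shift([1, 2, 300], [0, 1, 2])
print(clean, with_outlier)
```

The estimate does not require a normality assumption or a prior transformation, and in this example it is unaffected by the single extreme value, unlike a difference of arithmetic means.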

Figure 3: Point estimators and the upper and lower 90% Confidence Intervals on Linear and Log scale (Glitra BE study).

Table 5: Assessment of the restoration of Normality by Shapiro-Wilk Test (Reference & Test Products)

Table 6: Q-Q Plots for AUC (Reference and Test Products)

Table 7: Hypothesis Testing According to Schuirmann and Anderson-Hauck Procedures