logcondens : Computations Related to Univariate Log-Concave Density Estimation

Maximum likelihood estimation of a log-concave density has attracted considerable attention over the last few years. Several algorithms have been proposed to estimate such a density. Two of those algorithms, an iterative convex minorant and an active set algo-rithm, are implemented in the R package logcondens . While these algorithms are discussed elsewhere, we describe in this paper the use of the logcondens package and discuss functions and datasets related to log-concave density estimation contained in the package. In particular, we provide functions to (1) compute the maximum likelihood estimate (MLE) as well as a smoothed log-concave density estimator derived from the MLE, (2) evaluate the estimated density, distribution and quantile functions at arbitrary points, (3) compute the characterizing functions of the MLE, (4) sample from the estimated distribution, and ﬁnally (5) perform a two-sample permutation test using a modiﬁed Kolmogorov-Smirnov test statistic. In addition, logcondens makes two datasets available that have been used to illustrate log-concave density estimation.


Introduction 1.About this document
Although first uploaded to CRAN in 2006, a detailed description (beyond the package manual) of the functionality of the R package logcondens (Rufibach and Dümbgen 2010) had been lacking so far.This document is an introduction to logcondens, based on Dümbgen and Rufibach (2010), that not only discusses and illustrates the use of the implemented iterative convex minorant (ICMA) and the active set algorithm (ASA), but more functions that are useful in connection with univariate log-concave density estimation.The package also provides functions to evaluate quantities whose explicit computations are not immediate, such as distribution and quantile functions, the smoothed log-concave density estimator and the corresponding distribution function, sampling from the different estimators, and a twosample permutation test.In addition, the data sets analyzed in Dümbgen and Rufibach (2009) and Koenker and Mizera (2010) are provided as part of logcondens.Using the former of these datasets we illustrate how logcondens can be used to explore data.The package is available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=logcondens.data.Especially for small sample sizes, kernel estimates are prone to artifacts, but assuming additional structure on the underlying distribution reduces the influence of single observations on the estimated density.To illustrate this point we added the estimate resulting from invoking density() to Figure 1 (black line).
Every log-concave density is automatically unimodal.
The nonparametric MLE of a continuous unimodal density does not exist (see e.g.Birgé 1997) but the nonparametric MLE of a log-concave density does.Thus the class of logconcave densities may be a useful and valuable surrogate for the larger class of unimodal densities.
Using a standard EM algorithm, mixtures with log-concave component densities can be computed, see Chang and Walther (2007) and Cule, Samworth, and Stewart (2010).
Figure 1 provides an introductory example based on simulated data.Namely, we generated a random sample of size n = 40 from a standard normal distribution (the corresponding density function is in fact log-concave) and estimated the density of these observations under a log-concavity restriction.In the left plot of Figure 1 we provide plots of the true density the observations are sampled from, the estimated density, and the kernel density estimate as provided by density().Given the small sample size of n = 40, both the log-concave and kernel estimate capture the shape of the truth quite accurately.The features of the logconcave MLE are visible: The support of f is [X (1) , X (n) ] and the knots, marked by vertical lines at the bottom of the plot, are indeed at some of the observations.On the other hand, the notorious bumpy behavior of the kernel estimate appears in the tails, especially on the left hand side.Piecewise linearity of the estimated log-density is visualized in the right plot of Figure 1.
Explicit formulae for the distribution, integrated distribution, quantile function, the smoothed estimator as well as for the computation of the difference between two log-concave CDFs are postponed to the appendix.

Computing the log-concave estimator
The general log-likelihood.As discussed in the introduction, our goal is to maximize the functional (ϕ) over all log-concave functions ϕ : R → [−∞, ∞) such that exp(ϕ) is a probability density.Let us first modify somewhat: For given support points x 1 < x 2 < • • • < x m and probability weights w 1 , w 2 , . . ., w m > 0 we consider The standard setting.By default, x 1 < x 2 < . . .< x m are the different elements of {X 1 , X 2 , . . ., X n }, and w j := n −1 #{i ≤ n : X i = x j }.In the idealized setting of i.i.d.observations X i from a density f , the numbers m and n coincide, while x j = X (j) and w j = n −1 .However, there could be tied observations, e.g.due to rounding errors, resulting in m < n support points.
An alternative setting.In case of very large sample sizes one may wish to reduce computation time and storage space by approximating the raw data X i .Let Then we define probability weights w j , 1 ≤ j ≤ m, as follows: If a raw observation X i falls into [x j , x j+1 ], we add n −1 (x j+1 − X i )/(x j+1 − x j ) to w j and n −1 (X i − x j )/(x j+1 − x j ) to w j+1 .That way we approximate the empirical distribution of the raw data X i by a discrete distribution m j=1 w j δ x j having the same mean and possibly larger variance.However one can show that the increase in variance is no larger than max j<m (x j+1 − x j ) 2 /4.Finally, pairs (x j , w j ) with w j = 0 are removed to reduce the dimension as much as possible.
We implemented automatic computation of weights and the binning algorithm described above in the function preProcess in logcondens.This function is automatically invoked by those functions that take the raw data as argument, specifically activeSetLogCon, icmaLogCon and the functions that depend on them, most importantly logConDens.For a description of how to specify a user-defined grid we refer to the respective help files in logcondens.
The modified log-likelihood.To relax the constraint of f being a density and to get a criterion function to maximize over all concave functions, we employ the standard trick (cf.Silverman 1982, Theorem 3.1) of adding a Lagrange term to , leading to the modified functional (1) With the same arguments as Dümbgen and Rufibach (2009) one can show that there exists a unique concave function ϕ maximizing this functional L(•).It satisfies the equation exp ϕ(t) dt = 1 and has the following additional properties: and ϕ is continuous and piecewise linear on [x 1 , x m ] with knots only in {x 1 , x 2 , . . ., x m }.Any such function ϕ is fully specified by the vector ϕ = (ϕ(x j )) m j=1 .We therefore restrict attention to the set P(x 1 , x 2 , . . ., x m ) of vectors ϕ ∈ R m such that This allows to rewrite the integral (1) as with the auxiliary function The constrained optimization problem we are now aiming to solve reads where L is a strictly concave functional on R m , see Dümbgen, Hüsler, and Rufibach (2010, Section 2).For more details on the computations leading to the final form of L we refer to Dümbgen and Rufibach (2009).It is important to note that (iterative) maximization of the functional L yields the vector ( ϕ(x j )) m j=1 and not the density estimate directly.However, due to the piecewise linearity of the function ϕ the maximizing vector ( ϕ(x j )) m j=1 can be identified with for x ∈ R, where s j+1 = ∆ ϕ j+1 /∆x j+1 and ∆v j+1 := v j+1 − v j , 1 ≤ j < m, for any vector v ∈ R m .Finally, the density estimate at x is then simply f = exp ϕ, i.e. f = 0 outside [x 1 , x m ].These two functions are implemented in evaluateLogConDens.Computation of additional functions at a given point x is discussed in Section 5.2.
Approximation.From the definitions of L(ϕ) and J(r, s) above it is clear that numerical inaccuracies may occur whenever two consecutive components ϕ j , ϕ j+1 of the vector ϕ under consideration are getting very close.This means that the argument ϕ has "flat" stretches.
To avoid these numerical inaccuracies in computations, we approximate J and its derivatives in the implementation in logcondens by Taylor polynomials of degree four if |r − s| is small.Exact bounds and formulae for these polynomials are worked out in Dümbgen et al. (2010, Section 6).Similar approximations are also used to compute F and F * , see the appendix.
An iterative convex minorant algorithm.First attempts to compute ϕ are described in Rufibach (2007): Four different algorithms were proposed that all reliably found the maximum of L. However, not all are equally efficient.As a clear winner in the contest arranged in that paper in terms of speed came off the ICMA.For this reason, this algorithm was chosen to be implemented in logcondens, as the function icmaLogCon.First proposed by Groeneboom and Wellner (1992) and further detailed by Jongbloed (1998), the ICMA is especially tailored to maximize a smooth objective function over particular convex cones by maximizing quadratic approximations to the objective function via the pool-adjacent-violaters algorithm (PAVA).
An active set algorithm.For a description of active set algorithms see e.g.Fletcher (1987, Section 10.3) or Nocedal andWright (1999, Section 16.4).The application to logconcave density estimation is discussed in considerable generality by Dümbgen et al. (2010, Section 3) and therefore omitted here.A key feature of an ASA is that it solves a finite number of unconstrained optimization problems.The function activeSetLogCon estimates a log-concave density via an ASA.
Related work.Note also the implementations of the above two algorithms for isotonic estimation in the R package isotone and the description in de Leeuw, Hornik, and Mair (2009).In a similar context, an ASA was used to compute an estimate in the ordered factor regression problem, see Rufibach (2010) and the R package OrdFacReg (Rufibach 2009).
Range of applicability.The minimal sample size that allows estimation of a log-concave density is n = 2. Estimating a log-concave density for n in the millions causes no problems on a laptop computer and depending on the resources may take a few minutes.Alternatively, to speed up computation time for large n, one can approximate the empirical distribution of the raw data by a discrete distribution with m << n support points as indicated before.

Characterization and properties of the estimator
In what follows let F be the distribution function corresponding to the density f = exp ϕ, and let F be the distribution function of the discrete measure m j=1 w j δ x j .Using suitable directional derivatives of the log-likelihood function, Dümbgen and Rufibach (2009) derive various useful facts about ϕ.Here are the two most relevant ones in the present context: This characterization entails that F is rather close to F in the sense that F(t −) ≤ F (t) ≤ F(t) for any knot t of ϕ.It is also relevant for our algorithms, because with ∆ t (x) := min(x − t, 0).Suitable functions to compute H = H(•, ϕ) are implemented in logcondens, see Section 5.
Further properties.Another remarkable property of F and F is the inequality In particular, setting h(x) = ±x and h and Var( F ) ≤ Var(F). (4)

Smoothing the log-concave density estimator
From Dümbgen and Rufibach (2009, Theorem 2.1) we know that ϕ is equal to 0 outside [x 1 , x m ], i.e. ϕ has potentially sharp discontinuities at both ends of its support.In addition, the estimator ϕ may have quite sharp kinks; compare the mode of f in Figure 1.To overcome these minor inconveniences and for additional reasons elaborated in the analysis of the reliability data in Section 8, Dümbgen and Rufibach (2009) introduce a smoothed log-concave density estimator f * , defined for some bandwidth γ > 0 as i.e. the convolution of f with a Gaussian kernel φ γ with mean 0 and standard deviation γ.
By virtue of a celebrated result of Prékopa (1971), it is known that the class of log-concave densities is closed under convolution, whence f * is log-concave, too.Due to the simple structure of ϕ and our restriction to Gaussian kernels, one can deduce explicit formulae for f * (x) and its distribution function F * (x) at any x, see Section 5.2 and Appendix C.
Choice of bandwidth.Let F * be the distribution function of the smoothed density f * .Note that F * is the distribution function of X + Z with independent random variables X ∼ F and Z ∼ N (0, γ 2 ).Thus it follows from (3) that all distribution functions F * , F , F have the same mean X, whereas according to (4).Now consider the estimator In the standard setting, this is an unbiased estimator of Var(X i ).Hence, as proposed by Dümbgen and Rufibach (2009, Section 3), the default value for γ is the square root of This yields an estimated distribution function F * with standard deviation σ.Using γ = γ also makes f * a fully automatic estimator.
The variance of F can be computed explicitly as follows: Here J 10 equals the partial derivative of J(r, s) with respect to r, while J 11 is the partial derivative of J with respect to r and s, see Dümbgen et al. (2010, Section 2) for a derivation and computational details.

Function to compute the estimate
Using the simulated data from the introduction we now describe the function to estimate a log-concave density and the corresponding summary and plot methods.
The primary function of the package is logConDens which returns an object of class dlc.A dlc object is a list consisting of xn: the vector (X i ) n i=1 of original observations, x: the vector (x j ) m j=1 of support points, w: the vector (w j ) m j=1 of weights, phi: the estimated vector ϕ = ( ϕ(x j )) m j=1 , IsKnot: the vector (1{x j is a knot of ϕ}) m j=1 , L: the value L( ϕ), xs: Either the vector of 500 grid points or the initial vector xs of points at which f.smoothed is to be computed.
Finally, the entry smoothed returns the value of the initial argument.Note that the knots collected in IsKnot are not derived from ϕ but appear as a direct by-product of the active set algorithm.The quantities H j can be used to verify the characterization of the estimator in terms of distribution function as elaborated in Dümbgen and Rufibach (2009).See Section 3 for an illustration.
A summary and plot method are available for the class dlc.The summary method provides some basic quantities of the estimates whereas the plot method is intended to provide a quick way of standard plotting ϕ, f , and F and f * , F * (optional).For more tailor-made plots we recommend to extract the necessary quantities from the dlc object and plot them with the desired modifications, as we do when generating Figure 3.
Note that to plot f and F we do not simply linearly interpolate the points exp ϕ(t) for t ∈ S m ( ϕ) but we rather compute f and F on a sufficiently fine grid of points using the function evaluateLogConDens, to display the shape of the estimated density and distribution function between knot points correctly.

Evaluation of the fitted estimators
Thanks to the simple structure of ϕ, closed formulae can be derived for ϕ, f , F , F −1 as well as f * , F * on their respective domain, see Section 2 and the appendix for details.These formulae are implemented in logcondens via the two functions evaluateLogConDens and quantilesLogConDens.
To compute the process H = H(•, ϕ) in (2) one needs to be able to compute the integral of the distribution functions F and F, at an arbitrary point t.Again, exploiting the structure of ϕ a closed formula can be derived for t x 1 F (r) dr, see Appendix B. This formula is implemented in the function intF in the package logcondens, and we use this function together with its companion intECDF (ECDF: empirical cumulative distribution function) for the empirical distribution function to compute the process D.

Sampling from the different estimators
Cule et al. (2010) describe Monte Carlo estimation of functionals of f (which may be oneor multidimensional) by sampling from a random vector (variable) X that has density f and plugging in these samples into the functionals of interest.In order to sample from f , they use a rejection sampling procedure, see Cule et al. (2010, Section 7.1), implemented as the function rlcd in Cule et al. (2009).In logcondens we implemented sampling not only from f , but also from f * as the function rlogcon, using the quantile function F −1 implemented in quantilesLogConDens.Note that by making use of the quantile function F −1 in rlogcon we generate a sample of a given size from f directly, without the need of rejecting (some of) the generated numbers.There is no need to implement the quantile function F * for the sake of simulating samples from F * , thanks to the fact that f * is the density of the convolution of f with a given normal kernel.
Sampling from the estimate may serve at least two purposes: First, as described above, generated samples can be used to approximate functionals of f or f * , as described in Cule et al. (2010).Second, in certain applications it may be the explicit goal to get simulated samples from an estimated density, see Section 8 below and Dümbgen and Rufibach (2009, Section 3).

Illustration of main functions on simulated example
To demonstrate the code in logcondens we take up the sample generated in Section 1.2 (denoted as the object x.sim).Note that the plot method for a dlc object is already illustrated in that introductory example.Here, we detail application of the functions discussed in Section 5. To get a summary of the estimated densities invoke To illustrate the characterization of the estimators provided in Section 3, we use the functions intF and intECDF to compute the process H(t), see Figure 2. The upper plot in Figure 2 displays F and F whereas in the lower plot H is provided.The graphs illustrate that (1) F closely approximates F, a fact that can be made precise, see Dümbgen and Rufibach (2009, Corollary 2.5 and Theorem 4.4), (2) the kinks of ϕ occur where the process H is equal to zero as postulated in Section 3 (compare Figures 1 and 2).

Exemplary analysis of reliability dataset
To further demonstrate the functions in logcondens we will apply them to analyze a reallife dataset that had previously been used to illustrate log-concave density estimation.The reliability data was collected as part of a consulting project at the Institute for Mathematical Statistics and Actuarial Science at the University of Bern, see also Dümbgen and Rufibach (2009, Section 3).Basic descriptive statistics of the dataset are provided in Table 1.Note that in this dataset we have n = 786 original observations.However, pooling tied observations and introducing weights as described in Section 2 we end up with only m = 535 unique observations.A company asked for Monte Carlo experiments to predict the reliability of a certain device they produce, where the reliability depended in a certain deterministic way on five different and independent random input parameters.The dataset reliability in logcondens contains only data on the first of these input parameters.The goal was to fit a suitable distribution to this sample that can be used to simulate from.As elaborated in Dümbgen and Rufibach (2009, Section 3) we first considered two standard approaches to estimate the unknown density f : namely (i) fitting a Gaussian density f par with mean µ(F) and variance σ 2 = n(n − 1) −1 Var(F), (ii) the kernel density estimator where φ σ denotes the Normal density with mean 0 and variance σ 2 .
The very small bandwidth σ/ √ n was chosen to obtain a density with variance σ 2 and to avoid putting too much weight into the tails, which was crucial in the engineers' application.We think that in general such undersmoothed density estimators are a simple way of depicting the empirical distribution of raw data.
Looking at the data, approach (i) is clearly inappropriate because the reliability sample of size n = 786 reveals a skewed and pronouncedly non-gaussian distribution.This can be seen in Figure 3, where the multimodal curve corresponds to f ker , while the dashed line depicts f par .Approach (ii) yielded Monte Carlo results agreeing well with measured reliabilities, but the engineers were not satisfied with the multimodality of f ker .Choosing a kernel estimator with larger bandwidth would overestimate the variance and put too much weight into the tails.Thus we agreed on a third approach and estimated f by the smoothed log-concave estimator f * introduced in Section 4. This density estimator is the skewed unimodal curve in Figure 3. Apart from the advantages described above -unimodality, variance equal to σ 2 , and not too heavy tails -it yielded convincing results in the Monte Carlo simulations, too.In addition and as expected, the kinks and the discontinuities at x 1 and x m of f are smoothed out by f * .Note that both estimators f and f * are fully automatic.Now, the primary goal of this data analysis was to provide simulated samples from the estimated densities f and f * .The function rlogcon that implements sampling from the distribution that corresponds to f in the package logcondens is based on the quantile function F −1 , derived in Appendix A and implemented as quantilesLogConDens.Thanks to the fact that f * is the density of the convolution of f with a Gaussian kernel with standard deviation γ, a random number X * from f * can easily be obtained by computing X * = X + Y with independent random variables X ∼ F and Y ∼ N (0, γ 2 ).

R> set.seed(1977)
R> rel_samples <-rlogcon(n = 20, x0 = x.rel)The sorted sample of size n = 20 from the log-concave density estimator can be extracted using R> sort(rel_samples$X) [1] 1511.954 1594.168 1621.873 1645.044 1663.611 1673.327 1674.606 [8] 1675.477 1689.173 1689.405 1696.573 1697.265 1710.816 1712.501 [15] 1716.976 1724.642 1730.252 1783.473 1792.702 1807.486 and one from the smoothed log-concave density estimator via R> sort(rel_samples$X_star) [1] 1514.480 1599.645 1615.953 1648.505 1662.295 1663.369 1677.193 [8] 1677.583 1685.057 1686.471 1689.579 1705.722 1707.009 1712.771 [15] 1718.095 1727.095 1727.765 1780.637 1785.171 1813.476 9. Smooth two-sample permutation test Dümbgen and Rufibach (2009, Theorem 4.4) have shown that the distribution function estimators F and F are, under mild assumptions, asymptotically equivalent and F acts in this sense as a smoother of F. However, by imposing log-concavity efficiency gains in estimation seem plausible for small to moderate sample sizes.This leads us to the proposal of a smooth test for distribution functions which we now describe.Suppose the researcher is given two samples (X i ) n 1 i=1 and (Y i ) n 2 i=1 of independent random variables X i ∼ F X and Y j ∼ F Y with unknown distribution functions F X , F Y .
To test whether H 0 : F X = F Y versus H 1 : F X = F Y , a commonly used two-sample test statistic is the Kolmogorov-Smirnov statistic, comparing the empirical distribution functions F X and F Y of the two samples: where f ∞ := sup x∈R |f (x)| for any function f : R → R. The limiting distribution of K n and the corresponding asymptotic test can be found in Durbin (1973).
If one imposes that F X and F Y both have log-concave density functions, we propose the following modified test statistic: where F X and F Y are the log-concave distribution function estimators of F X and F Y .Deriving the limiting distribution of this statistic is a difficult task, but if one assumes that under H 0 the pooled sample (Z 1 , Z 2 , . . ., Z n 1 +n 2 ) := (X 1 , . . ., X n 1 , Y 1 , . . ., Y n 2 ) has the same distribution as where Π is a random permutation of {1, . . ., n 1 + n 2 } not depending on the data one can perform a Monte Carlo permutation test of H 0 as follows: Generate M independent copies of Π and calculate the corresponding values of the test statistic K (1) , K (2) , . . ., K (M ) .Then a nonparametric p-value for H 0 is given by The null hypothesis H 0 is rejected if p is not larger than the pre-specified significance level α.
Details on the computation of K( F X , F Y ) as well as for K * n ( F X , F Y ), the statistic based on the smoothed log-concave distribution function estimates, are provided in Appendix E. Computation of differences between distribution functions is implemented in the function maxDiffCDF in the logcondens package.To generate two samples from a Gamma distribution and invoke the smooth two sample test that is based on the function maxDiffCDF use the code: R> set.seed(1)R> n1 <-20 R> n2 <-25 R> x <-sort(rgamma(n1, 2, 1)) R> y <-sort(rgamma(n2, 2, 1) + 0.5) R> twosample <-logconTwoSample(x, y, M = 5, display = FALSE) R> twosample$p.valueSo in this example we get p-values of 0.33 and 0.17, respectively, quantifying the evidence against the null hypothesis of equal distribution functions.The first of these p-values results from a permutation test based on the log-concave MLE whereas the latter is computed via the smoothed log-concave density estimate.Note that we have chosen the (too) small number of permutations M = 5 only in this illustrative example.Clearly, in applications we recommend to set M to at least 1000, as we do in our simulations below.
To assess the performance of this nonparametric permutation test in terms of power compared to the Kolmogorov-Smirnov test we conducted a simulation study considering the following settings: Setting 1: n 1 = 20 n 2 = 25 X ∼ N(0, 1) vs. Y ∼ N(µ, 1) Setting 2: n 1 = 20 n 2 = 25 X ∼ Gam(2, 1) vs. Y ∼ Gam(2, 1) + µ Setting 3: n 1 = 20 n 2 = 25 X ∼ Gam(2, 1) vs. Y ∼ Gam(τ, 1) for µ ∈ {0, 0.5, 1, 1.5, 2} and τ ∈ {1, 1.5, 2, 2.5, 3}.We simulated each setting 1000 times and M = 999 was used as argument in logconTwoSample.The number of rejected null hypothesis when adopting a significance level of α = 0.05 are displayed in Figure 4. We find, in accordance with Cule et al. (2010, Section 9, Remark (ii)), that the test based on the logconcave CDF uniformly outperforms, for the rather moderate sample sizes we consider, the Kolmogorov-Smirnov test.For very large sample sizes the difference in power between the two tests decreases (not shown).We therefore recommend to use the function logconTwoSample to assess the null hypothesis of two identical distribution functions, especially when only few or moderately many observations are available.
Note that this modified test is valid even if the assumption of log-concavity of the density functions is violated.Such a violation would only affect the power.where ∆v j+1 := v j+1 − v j for any v ∈ R m .Suppose further that f := exp(ϕ) is a probability density on R and let F be the corresponding distribution function.Then F (x 1 ) = F 1 := 0, and with F m = 1.For a proof we refer to Dümbgen et al. (2010, Theorem 2.1).To compute the quantile function of F , note that for any x ∈ [x j , x j+1 ] by again exploiting the special structure of ϕ, Some tedious computations lead to the corresponding quantile function This function is the basis of quantilesLogConDens in logcondens.
B. The integral of F at an arbitrary x 0 Recall the definition s j = ∆ϕ j /∆x j from Section 2. In addition, we define f j = f (x j ) for j = 1, . . ., m.To be able to compute the process H = H(•, ϕ) introduced in (2) the aim is to derive an explicit formula for for any j = 1, . . ., m − 1 and x ∈ [x j , x j+1 ].Using (5) we can write .

logcondens: Computations Related to Univariate Log-Concave Density Estimation
Putting the pieces together we receive Specifically, for x = x j+1 , We finally get where i 0 = min{m − 1 , max{i : x i ≤ t}}.
Approximation for s j+1 → 0. Since we divide by s j+1 in the compuation of I j (x) this expression becomes numerically unstable or even undefined once the slope s j+1 of ϕ approaches or even equals 0. To avoid these problems, we compute the limit for a := s j+1 → 0 in the integral in (6): This approximation is used in intF once s j+1 ≤ 10 −6 .

C. The smoothed log-concave density estimator
For γ > 0 and x ∈ R recall the density function φ γ of a centered normal density as Its distribution function is denoted by Φ γ .Elementary calculations reveal that for arbitrary numbers a, x and γ > 0, The smoothed log-concave density estimator then amounts to The function evaluateLogConDens implements f * in logcondens.However, note that it is most convenient to compute f * via specifying smoothed = TRUE in logConDens.
In extreme situations, e.g.data sets containing extreme spacings, numerical problems may occur in (7).For it may happen that the exponent is rather large while the difference of Gaussian CDFs is very small.To moderate these problems, we are using the following bounds in the function Q00 implementing q γ in logcondens:

D. The smoothed log-concave CDF estimator
Using partial integration, we get for arbitrary numbers x, a = 0 and real boundaries u < v: with q γ (x, a, u, v) as in ( 7).As a → 0, this converges to

logcondens: Computations Related to Univariate Log-Concave Density Estimation
This leads to the formula This representation is implemented in evaluateLogConDens and logConDens, where for numerical reasons, Q γ (x, s j , x j−1 , x j ) is replaced with Q γ (x, 0, x j−1 , x j ) in case of | s j | ≤ 10 −6 .

E. Computation of the two-sample test statistic
To calculate K, the built-in R function ks.test was used.This function computes an exact pvalue in a two-sample problem if the product of the respective sample size is smaller than 10 4 , as fulfilled in our simulation examples.As to the computation of K n , define D := F X − F Y , and let z 1 < . . .< z N denote the sorted pooled observations X 1 , . . ., X n 1 , Y 1 , . . ., Y n 2 , i.e.N = n 1 + n 2 .The goal is to find A necessary condition for a x ∈ (L, R) to be a maximum of |D(•)| is D (x) = 0, which is equivalent to ϕ X (x) = ϕ Y (x).As for the computation of x, proceed as follows: 1. Identify all k ∈ {1, . . ., N − 1} such that there exists a y k ∈ [z k , z k+1 ] := I k with ϕ X (y k ) = ϕ Y (y k ).Let K denote the set of these k's.
2. Since both ϕ X and ϕ Y are linear on every interval I k , we can find for each k the point y k where these two functions intersect via the equality yielding .
Note that if the denominator, the difference of slopes of ϕ X and ϕ Y on [z k , z k+1 ], is zero then the two functions must match on that interval (otherwise they would not intersect).This in turn implies that the difference D is constant on that interval so that we can consider y k = z k a possible point where the maximum of D occurs.Consequently, |D(•)| is maximal at one (or more) points in the set R = {L, R} ∪ {y k : k ∈ K} and 1/2 max{D(r) : r ∈ R}.The computation of K * n ( F X , F Y ) appears to be less straightforward than that of K n ( F X , F Y ).In logconTwoSample we first approximate the function (D * ) x = (∂/∂x)(D * )(u, x) for u = z 1 − 0.1 • (z N − z 1 ) on a sufficiently dense equidistant grid (the number of elements in that grid can be provided to logconTwoSample using the argument n.grid).Then, similar to the computation of K n ( F X , F Y ), we identify those grid intervals where (D * ) x changes sign.To finally find the maximum of the latter function we then invoke uniroot to find the precise location of the zeros on these intervals, compute the value of (D * ) x at each of these zeros and identify the largest of these values.In general, the computation of K * n ( F X , F Y ) is more time-consuming than that of K n ( F X , F Y ).However, since on the level of CDFs the MLE and its smoothed version are very similar, differences with respect to the smooth two-sample test in terms of power are very small.For these reasons we omitted the smooth version of the test in our small simulation study reported on in Section 9.

Figure 1 :
Figure 1: Introductory example based on simulated data.

Fhat: a
vector ( F (x j )) m j=1 with values of the c.d.f. of f , H: a vector (H(x j )) m j=1 of directional derivatives (cf.(2) in Section 3), as generated by activeSetLogCon.If smoothed = TRUE in the call of logConDens then dlc additionally contains the entries f.smoothed:A vector that contains the values of f * either at an equidistant grid of 500 values ranging from x − 0.1r to x + 0.1r for x = min j=1,...,m x j , x = max j=1,...,m x j , and r = x − x if xs = NULL or the value of f * at the values of xs if a vector is provided as the latter argument.F.smoothed: The values of F * n on the grid.gam: The computed value of γ.

Figure 2 :
Figure 2: Distribution functions and the process H(t) for the simulated data.Vertical bars at the bottom of the plots indicate observations.

Figure 3 :
Figure 3: Different estimates of the density of the reliability data.

Figure 4 :
Figure 4: Proportion of rejected null hypothesis at α = 0.05 for the null hypothesis of equal distribution functions for two samples of size n 1 = 20 and n 2 = 25.Horizontal offset of points only to increase legibility.
compute K = |D(x * )|.Note that in general D is not unimodal, let alone concave.However, again thanks to the special structure of the log-density estimate computation of x X is possible according to the following scheme.With the order statistics X (i) and Y (j) of the two samples defineL := max{X (1) , Y (1) } and R := min{X (n) , Y (n) }.First, we sort out a simple pathological case: If R ≤ L, then any point x between R and L yields |D(x)| = 1, yielding K = (n 1 n 2 /(n 1 + n 2 ))) 1/2 .Otherwise, due to the monotonicity of distribution functions

Figure 5
Figure 5 gives an example.

Table 1 :
Descriptive statistics of reliability data.