DTDA : An R Package to Analyze Randomly Truncated Data

In this paper, the R package DTDA for analyzing truncated data is described. This package contains tools for performing three different but related algorithms to compute the nonparametric maximum likelihood estimator of the survival function in the presence of random truncation. More precisely, the package implements the algorithms proposed by Efron and Petrosian (1999) and Shen (2008) for analyzing randomly one-sided and two-sided (i.e., doubly) truncated data. These algorithms and some recent extensions are briefly reviewed. Two real data sets are used to show how the DTDA package works in practice.


Introduction
Randomly truncated data appear in a variety of fields, including astronomy, survival analysis, epidemiology, and economics. Under random truncation, only values falling in a random set which varies across individuals are observed. For the recorded values, the truncation set is also observed. However, when the value of interest falls outside the corresponding random set, nothing is observed. This typically introduces a remarkable observational bias, and hence proper corrections in statistical data analysis and inference are needed.
Methods for computing the nonparametric maximum likelihood estimator (NPMLE) of a distribution function (DF) observed under random truncation have been proposed since the seminal paper by Turnbull (1976). Interestingly, the difficulties in the construction of the NPMLE heavily depend on the specific truncation pattern, i.e., on the class of allowed truncation sets. Probably the most investigated pattern of truncation is left-truncation, for which the truncation set is an interval unbounded from above. In epidemiological studies and industrial life-testing, left-truncation arises, e.g., when performing cross-sectional sampling, under which only individuals "in progress" at a given date (also referred to as prevalent cases) are eligible. As a result, large progression times are more probably observed, and this may dramatically damage the observation of the DF of interest. For left-truncated data, the NPMLE has an explicit form and it can be computed from a simple algorithm that goes back to Lynden-Bell (1971). See Woodroofe (1985) and Stute (1993) for the statistical analysis of this estimator. The right-truncated scenario, under which the truncation sets are intervals unbounded from below, can be dealt with similarly by means of a sign change. Inference becomes more complicated, however, when other forms of truncation appear.
In many applications, the truncation sets are bounded intervals; that is, the variable of interest X* is only observed when it falls in a (subject-specific) random interval [U*, V*]. Efron and Petrosian (1999) motivated this double-truncation issue by means of data on quasars, which are only detected when their luminosity lies between two observational limits. In epidemiology, doubly-truncated data are also encountered. For example, acquired immunodeficiency syndrome (AIDS) incubation time (from human immunodeficiency virus (HIV) infection) databases report information restricted to those individuals diagnosed prior to some specific date. This typically introduces a strong observational bias associated with right-truncation, i.e., relatively small incubation times are more probably observed. Besides, since HIV was unknown before 1982, there is some left-truncation effect too. Bilker and Wang (1996) noticed this problem and discussed the relative impact of each type of truncation on the final sample. Moreira and de Uña-Álvarez (2010) motivated the random double-truncation phenomenon by analyzing the age at diagnosis for childhood cancer patients; as in the AIDS example, in this case the double truncation emerges from the fact that the recruited subjects are those with the terminating event falling within a given observational window. Note that left (or right) truncation can be obtained from double-truncation by letting V* (respectively U*) be degenerate at infinity (respectively minus infinity).
A cumbersome issue with doubly-truncated data is that the NPMLE has no explicit form, and it must be computed iteratively. This complicates the analysis of its statistical properties, posing also a challenge in the design of suitable algorithms for its practical computation. See Efron and Petrosian (1999) and Shen (2008) for technical details. To the best of our knowledge, there is no other package oriented to the computation of the NPMLE under double-truncation. The DTDA package described in this work fills this gap. DTDA has been implemented in the R system for statistical computing (R Development Core Team 2010). This package also allows for the analysis of one-sided (left or right) truncated data. The package DTDA contains three different algorithms for the approximation of the NPMLE under double-truncation (in its more general version), as well as some recent extensions, e.g., bootstrap confidence bands (Moreira and de Uña-Álvarez 2010). As will be described below, it provides useful numerical outputs and automatic graphical displays too. Results in this document have been obtained with version 2.1-1, available from http://CRAN.R-project.org/package=DTDA.
The paper is organized as follows. In Section 2, a brief review of the existing algorithms to compute the NPMLE under double-truncation is given. In Section 3, the DTDA package is described and its usage is illustrated through the analysis of two real data sets. Finally, Section 4 is devoted to conclusions and possible future extensions of the package.

Doubly truncated data algorithms
This section gives an introduction to the NPMLE for doubly truncated data, together with a review of the existing algorithms to approximate this estimator in practice. Let X* be the lifetime of ultimate interest, with DF F, and let (U*, V*) be the pair of truncation times, with joint DF K. Under double truncation, only those (U*, X*, V*) with U* ≤ X* ≤ V* are observed; otherwise, no information is available. For any distribution function W, denote the left and right endpoints of its support by a_W = inf{t : W(t) > 0} and b_W = sup{t : W(t) < 1}; identifiability of F requires a_{K_1} ≤ a_F and b_F ≤ b_{K_2}, where K_1 and K_2 stand for the marginal DFs of U* and V* respectively (Woodroofe 1985). Let (U_i, X_i, V_i), i = 1, ..., n, denote the sample, which we assume to be ordered with respect to the X_i's (this is relevant for the algorithm described in Section 2.2). Here, we assume without loss of generality that the NPMLE is a discrete distribution supported by the set of observed data (Turnbull 1976); write f = (f_1, ..., f_n) and k = (k_1, ..., k_n) for the masses attached by F and K to the observed data, and introduce the inclusion indicators

  J_im = I(U_i ≤ X_m ≤ V_i),  i, m = 1, ..., n.  (1)

The quantity F_i = ∑_{m=1}^n f_m J_im represents the amount of mass contributed by the lifetime DF to the truncation interval [U_i, V_i]. Under the assumption of independence between X* and (U*, V*), the full likelihood of the sample is given by

  L(f, k) = ∏_{i=1}^n f_i k_i / ∑_{m=1}^n k_m F_m.

As noted by Shen (2008), the full likelihood L(f, k) can be decomposed as the product of the conditional likelihood of the X_i's given the (U_i, V_i)'s, say L_1(f), and the marginal likelihood of the (U_i, V_i)'s, say L_2(f, k):

  L(f, k) = L_1(f) × L_2(f, k),  with  L_1(f) = ∏_{i=1}^n f_i/F_i  and  L_2(f, k) = ∏_{i=1}^n k_i F_i / ∑_{m=1}^n k_m F_m.  (2)

The first term in the decomposition in equation (2) plays a very important role in the algorithms introduced by Efron and Petrosian (1999).

First Efron-Petrosian algorithm
The conditional NPMLE of F (Efron and Petrosian 1999) is defined as the maximizer of

  L_1(f) = ∏_{j=1}^n f_j / F_j.  (3)

This criterion leads to an estimator f̂ satisfying, for all j = 1, ..., n,

  f̂_j ∝ [∑_{i=1}^n J_ij / F̂_i]^{-1},  (4)

where F̂_i = ∑_{m=1}^n f̂_m J_im and the proportionality constant makes the f̂_j's sum to one. Equation (4) was used by Efron and Petrosian (1999) to introduce the following iterative algorithm to compute f̂ in (3).
Step EP 0 Provide an initial estimator f(0), e.g., the uniform f(0) = (1/n, ..., 1/n), and compute the corresponding F(0).
Step EP 1 Apply equation (4) to get an improved estimator f(1) and compute the F(1) pertaining to f(1) .
Step EP 2 Repeat Step EP 1 until a convergence criterion is reached, remembering to rescale the density estimator obtained after each application of equation (4).
As claimed by Efron and Petrosian (1999), this algorithm often converges quite slowly. The authors suggested a different algorithm based on an adaptation of the Lynden-Bell (1971) method for computing the NPMLE in the case of one-sided truncation. This method is described as the second Efron-Petrosian algorithm in the next section.
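As an illustration, the self-consistency iteration of Steps EP 0-EP 2 can be sketched in a few lines of base R. This is a minimal sketch, not the package's implementation; the helper name ep_npmle, the uniform starting value, and the 1e-06 tolerance are our own choices:

```r
# Minimal sketch of the first Efron-Petrosian iteration (hypothetical
# helper, not the package's internal code). x, u, v hold the lifetimes
# and the left/right truncation limits; J[i, m] = 1 if u[i] <= x[m] <= v[i].
ep_npmle <- function(x, u, v, tol = 1e-6, maxit = 1000) {
  n <- length(x)
  J <- outer(u, x, "<=") & outer(v, x, ">=")  # n x n inclusion matrix
  f <- rep(1 / n, n)                          # Step EP 0: uniform start
  for (it in seq_len(maxit)) {
    F_ <- as.vector(J %*% f)                  # F_i = sum_m f_m J_im
    f_new <- 1 / colSums(J / F_)              # self-consistency update
    f_new <- f_new / sum(f_new)               # rescale to sum to one
    if (max(abs(f_new - f)) < tol) { f <- f_new; break }
    f <- f_new
  }
  list(density = f, iterations = it)
}
```

In the untruncated case (all the U_i below and all the V_i above the lifetimes) the sketch returns the ordinary empirical masses 1/n, which provides a quick sanity check.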

Second Efron-Petrosian algorithm
The survival curve G = (G_1, G_2, ..., G_n) and the hazard function h = (h_1, h_2, ..., h_n) attached to f = (f_1, f_2, ..., f_n) are in general defined, for all m = 1, ..., n, as follows:

  G_m = ∑_{j=m}^n f_j,  h_m = f_m / G_m.

As usual, one can always recover the survival function G and the density f from h, for all m = 1, ..., n, via the relationships

  G_m = ∏_{j=1}^{m-1} (1 - h_j),  f_m = h_m G_m,

with the conventions G_1 = 1 and that an empty product equals one. For doubly-truncated data it happens that the NPMLE, namely f̂, has hazard function

  ĥ_m = [N_m + ∑_{i=1}^n I(V_i < X_m) Q̂_i]^{-1},  (5)

where N_m = #{i : U_i ≤ X_m ≤ X_i}, m = 1, ..., n, denotes the size of the risk set at time X_m if only left-truncation is considered (Woodroofe 1985), and

  Q̂_i = Ĝ(V_i+) / F̂_i  (6)

(Efron and Petrosian 1999), where F̂_i = ∑_{m=1}^n f̂_m J_im and the J_im are the inclusion indicators defined in (1). The numerator of equation (6) is the MLE probability of exceeding V_i, the upper observational limit for X_i. In the case of left truncation, Q̂_i = 0 since V_i = ∞, and (5) takes the form

  ĥ_m = 1 / N_m,

which is just the Lynden-Bell (1971) estimate. In this situation, equation (5) gives the NPMLE directly, without any iteration. When dealing with two-sided truncation, equation (5) was used by Efron and Petrosian (1999) to introduce the following iterative algorithm to compute f̂.
Step L 0 Provide an initial estimator f(0), e.g., the uniform distribution on the data, and compute the corresponding Ĝ(0), F̂(0) and Q̂(0).
Step L 1 Apply equation (5) to get an improved estimator ĥ(1) and compute the F(1) pertaining to the corresponding f(1) .
Step L 2 Repeat Step L 1 until a convergence criterion is reached.
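To fix ideas, one update of this hazard-based scheme can be sketched in base R, under the assumption that equation (5) augments the left-truncation risk set N_m with the quantity Q̂_i for every subject whose upper limit V_i lies below X_m. This is an illustrative sketch (the helper lb_update and its interface are our own, not the lynden() internals), assuming ordered, tie-free lifetimes:

```r
# One update of the hazard-based (Lynden-Bell-type) iteration, assuming
# the lifetimes x are sorted increasingly with no ties. Hypothetical
# helper sketching equations (5)-(6), not the lynden() function itself.
lb_update <- function(x, u, v, f) {
  n <- length(x)
  J <- outer(u, x, "<=") & outer(v, x, ">=")     # inclusion indicators (1)
  F_ <- as.vector(J %*% f)                       # F_i
  Gv <- vapply(v, function(vi) sum(f[x > vi]), numeric(1))
  Q <- Gv / F_                                   # equation (6)
  h <- vapply(seq_len(n), function(m) {
    N_m <- sum(u <= x[m] & x[m] <= x)            # left-truncation risk set
    1 / (N_m + sum(Q[v < x[m]]))                 # equation (5)
  }, numeric(1))
  G <- cumprod(c(1, 1 - h[-n]))                  # survival recovered from h
  f_new <- h * G                                 # density recovered from h
  f_new / sum(f_new)
}
```

Under left truncation only, Q is identically zero and a single call already returns the Lynden-Bell estimate; under double truncation the update is iterated to convergence.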

Shen algorithm
The two different algorithms presented above are suitable if the main aim is to estimate the lifetime DF. However, in some circumstances it may be interesting to display some estimator of the truncation times distribution. This will be the case, for example, when analyzing the truncation pattern, which may be informative about different features of the process under investigation. The problem of estimating the DF of the truncation times was first discussed by Shen (2008), who provided an algorithm to jointly compute the DF of both the lifetime and the truncation random variables.
In order to introduce Shen's (2008) algorithm, interchange the roles of the X_i's and the (U_i, V_i)'s in the decomposition in equation (2). Hence, the full likelihood can also be written as the product

  L(f, k) = L_1(k) × L_2(k, f),  (7)

where

  L_1(k) = ∏_{i=1}^n k_i / K_i  and  L_2(k, f) = ∏_{i=1}^n f_i K_i / ∑_{m=1}^n f_m K_m,  with  K_i = ∑_{m=1}^n k_m J_mi.

Here, L_1(k) denotes the conditional likelihood of the (U_i, V_i)'s and L_2(k, f) refers to the marginal likelihood of the X_i's. Note that K_i stands for the probability of getting a truncation interval around X_i, and hence it provides information about the relative probability of observing each of the recruited lifetimes.
Maximization of L_1(k) leads to a k̂ = argmax_k L_1(k) such that, for all j = 1, ..., n,

  k̂_j ∝ [∑_{i=1}^n J_ji / K̂_i]^{-1},  (8)

where K̂_i = ∑_{m=1}^n k̂_m J_mi. Shen (2008) proved that the solutions to equations (4) and (8) are not only the conditional but also the unconditional NPMLE's of F and K respectively, and that both estimators can be obtained simultaneously by solving the following two equations, for j = 1, ..., n:

  f̂_j = [K̂_j ∑_{m=1}^n K̂_m^{-1}]^{-1},  (9)

  k̂_j = [F̂_j ∑_{m=1}^n F̂_m^{-1}]^{-1}.  (10)

The expressions in (9) and (10) were used by Shen (2008) to introduce the following iterative algorithm to compute f̂ and k̂.
Step S 0 Provide an initial estimator f(0), e.g., the uniform distribution on the data, and compute the corresponding F(0).
Step S 1 Apply the formula in (10) to get the first step estimator of k, namely k(1) , and compute the K(1) pertaining to k(1) .
Step S 2 Apply the formula in (9) to get the first step estimator of f , f(1) , and compute its corresponding F(1) .
Step S 3 Repeat Steps S 1 and S 2 until a convergence criterion is reached.
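The alternating Steps S 0-S 3 can be sketched in base R as follows; shen_npmle is a hypothetical helper written for illustration, not the shen() function of the package:

```r
# Sketch of the alternating scheme built on equations (9)-(10).
# x, u, v hold the lifetimes and the truncation limits.
shen_npmle <- function(x, u, v, tol = 1e-6, maxit = 1000) {
  n <- length(x)
  J <- outer(u, x, "<=") & outer(v, x, ">=")  # J[i, m] = 1{U_i <= X_m <= V_i}
  f <- rep(1 / n, n)                          # Step S 0: uniform start
  k <- rep(1 / n, n)
  for (it in seq_len(maxit)) {
    F_ <- as.vector(J %*% f)                  # F_i = sum_m f_m J_im
    k <- (1 / F_) / sum(1 / F_)               # Step S 1: equation (10)
    K_ <- as.vector(t(J) %*% k)               # K_i = sum_m k_m J_mi
    f_new <- (1 / K_) / sum(1 / K_)           # Step S 2: equation (9)
    if (max(abs(f_new - f)) < tol) { f <- f_new; break }  # Step S 3
    f <- f_new
  }
  list(f = f, k = k, iterations = it)
}
```

Note how each lifetime receives a mass inversely proportional to its estimated observation probability K̂_i, and symmetrically for the truncation intervals.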
This algorithm and the other two discussed by Efron and Petrosian (1999) are implemented in the package DTDA.
As the convergence criterion in all the algorithms above, we require the maximum pointwise error when estimating f in two consecutive steps to fall below a threshold of 1e-06, which is a usual precision level for several packages in R.

Bootstrap approximation of the NPMLE
The asymptotic distribution of the NPMLE for doubly truncated data is not easy to determine, mainly because the estimator has a non-explicit form. The available results (Shen 2008) do not provide answers to important practical issues such as the computation of standard errors and the construction of confidence limits.
Moreira and de Uña-Álvarez (2010) proposed the simple bootstrap as a suitable method to approximate the finite sample distribution of the NPMLE for doubly truncated data, extending the ideas in Gross and Lai (1996) for the one-sided truncated scenario. Gross and Lai (1996) and Moreira and de Uña-Álvarez (2010) also presented a critical comparison with the obvious bootstrap method. Both procedures can be briefly explained as follows.
The simple bootstrap draws (with replacement) independent random vectors (indexed by b) (U_ib, V_ib, X_ib), i = 1, ..., n, from the empirical distribution that puts weight 1/n on each of the observations (U_i, V_i, X_i), i = 1, ..., n. This gives the b-th bootstrap resample, and the procedure is repeated a large number of times B to approximate the distribution of a given statistic.
The obvious bootstrap starts by estimating the distributions of X* and (U*, V*) on the basis of the observable data; this can be done by following the algorithm described in Section 2.3 and proposed by Shen (2008). Then, the resamples for X* and (U*, V*), say X_ib and (U_ib, V_ib), i = 1, ..., n, are independently drawn with probabilities P(X_ib = X_j) = f̂_j and P((U_ib, V_ib) = (U_j, V_j)) = k̂_j, j = 1, ..., n.
The simple bootstrap method is usually preferred to the obvious bootstrap method not only because it is substantially simpler to implement but also because it completely dispenses with the stringent assumptions (continuity of the underlying distributions, independence between the truncation times and the lifetimes; see Shen 2008) that are needed for consistent estimation of F and K in the obvious bootstrap method. The obvious bootstrap may be preferred, however, if one wants to incorporate the independence assumption in the resamples, so that they reproduce more precisely the sampling nature of the independent case. It should also be noticed that if the algorithms in Efron and Petrosian (1999) are to be used, then it is not possible to apply the obvious bootstrap, because these algorithms do not provide an empirical version of the truncation times joint distribution.
After either the simple or the obvious bootstrap resampling method is performed, the 100(1 − α)% confidence limits for a given target can be computed in the usual way. To this end, from the large number B of values of the estimator, the upper and lower 100(α/2)% of values are eliminated to compute the limits. This idea is incorporated in the DTDA package, as explained below.
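The simple bootstrap together with the percentile limits just described can be sketched in base R as follows (illustrative helper names, not the package interface):

```r
# Simple bootstrap with percentile confidence limits for a statistic
# computed from the observed triplets (u_i, x_i, v_i). Hypothetical
# helper written for illustration only.
simple_boot_ci <- function(x, u, v, stat, B = 500, alpha = 0.05) {
  n <- length(x)
  boot_stats <- replicate(B, {
    idx <- sample.int(n, n, replace = TRUE)  # resample triplets jointly
    stat(x[idx], u[idx], v[idx])
  })
  # Percentile limits: drop the upper and lower 100(alpha/2)% of values
  quantile(boot_stats, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}
```

For the obvious bootstrap, the line drawing idx would instead sample X* and (U*, V*) independently from their estimated distributions, as described above.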

Package DTDA in practice
The DTDA package contains different algorithms for analyzing randomly truncated data, including one-sided and two-sided (i.e., doubly) truncated data. This section shows the usage of DTDA by analyzing two real data sets. The first one concerns doubly truncated data, while the second example only includes right truncation.
The new package incorporates the iterative methods introduced by Efron and Petrosian (1999) and Shen (2008), which have been presented and discussed in the previous sections. Estimation of the lifetime DF and of the truncation times joint and marginal DFs is possible, together with the corresponding pointwise confidence limits based on bootstrap methods. Graphical displays can be automatically generated.
The DTDA package is composed of three functions (objects) that enable users to fit the proposed models and methods. In summary, the three functions are:
efron.petrosian() computes the NPMLE of a lifetime DF observed under one-sided (right or left) and two-sided (double) truncation with the first algorithm of Efron and Petrosian (1999). It also provides simple bootstrap pointwise confidence limits.
lynden() computes the NPMLE of a lifetime DF observed under one-sided (right or left) and two-sided (double) truncation with the second algorithm of Efron and Petrosian (1999), based on an extension of Lynden-Bell's method for one-sided truncation.Simple bootstrap pointwise confidence limits are obtained.
shen() computes the NPMLE of a lifetime DF observed under one-sided (right or left) and two-sided (double) truncation with the algorithm proposed by Shen (2008).
The NPMLE of the joint distribution of the truncation times, along with its marginal distributions, is also computed. Simple or obvious bootstrap pointwise confidence limits can be generated.
Table 1 shows a summary of the arguments of the three functions. It should be noted that only X, U and V are required arguments. The structure of the data input is as follows: each individual is represented by a single line of data. The variable X represents the lifetime of ultimate interest and cannot be NA. The variable U represents the left truncation times; if there is no left truncation, setting U = NA prepares the program (and the algorithm) for dealing with this type of data. The same happens with the variable V, which represents the right truncation times: if there is no right truncation (i.e., if the data are only left-truncated), just set V = NA. If the values of the variables U and V are such that they do not really induce truncation from the right and from the left, the estimators obtained from the package should coincide with the ordinary empirical estimator which puts mass 1/n on each data point. This will happen if all the left truncation times are smaller than the minimum of the lifetimes, and all the right truncation times are greater than their maximum.

An example with doubly truncated data
In astronomy, one of the main goals of quasar investigations is to study luminosity evolution (Efron and Petrosian 1999, Shen 2008). The motivating example presented in the paper of Efron and Petrosian (1999) concerns a set of measurements on quasars in which there is double truncation because the quasars are observed only if their luminosity occurs within a certain finite interval, bounded at both ends, determined by limits of detection.
The original data set studied by Efron and Petrosian (1999) comprised independently collected quadruplets (z_i, m_i, a_i, b_i), i = 1, ..., n, where z_i is the redshift of the i-th quasar and m_i is the apparent magnitude. Due to experimental constraints, the distribution of each luminosity in the log-scale (y_i = t(z_i, m_i)) is truncated to a known interval [a_i, b_i], where t represents a transformation which depends on the cosmological model assumed (see Efron and Petrosian (1999) for details). Quasars with apparent magnitude above b_i were too dim to yield dependable redshifts, and hence they were excluded from the study. The lower limit a_i was used to avoid confusion with non quasar stellar objects. The n = 210 quadruplets investigated by Efron and Petrosian (1999) were kindly provided by the authors.

efron.petrosian(), lynden() and shen() arguments
  X           Numeric vector. Lifetimes of ultimate interest (required).
  U           Numeric vector. Left truncation times; U = NA if there is no left truncation (required).
  V           Numeric vector. Right truncation times; V = NA if there is no right truncation (required).
  boot        Logical. Default is TRUE. If FALSE, no bootstrap confidence limits are computed.
  B           Numeric value. Number of bootstrap resamples. The default NA is equivalent to B = 500.
  alpha       Numeric value. (1 − alpha) is the nominal coverage for the pointwise confidence intervals.
  display.F*  Logical. Default is FALSE. If TRUE, the estimated cumulative distribution function associated to X, (F), is plotted.
  display.S*  Logical. Default is FALSE. If TRUE, the estimated survival function associated to X, (S), is plotted.

shen() arguments
  boot.type   A character string giving the bootstrap type to be used. This must be one of "simple" or "obvious", with default "simple".
  display.FS  Logical. Default is FALSE. If TRUE, the estimated cumulative distribution function and the estimated survival function associated to X, (F) and (S) respectively, are plotted.
  display.UV  Logical. Default is FALSE. If TRUE, the marginal distributions of U (fU) and V (fV) are plotted.
  plot.joint  Logical. Default is FALSE. If TRUE, the joint distribution of the truncation times is plotted.
  plot.type   A character string giving the plot type to be used to represent the joint distribution of the truncation times. This must be one of "image" or "persp", with default NULL.

Table 1: Summary of the arguments of the efron.petrosian(), lynden() and shen() functions. The arguments marked with * in the first part of the table are included in shen() with further options.

At the beginning of Section 2 we referred to some identifiability conditions for the estimation of the population DFs. For this data set, the extreme order statistics of the adjusted log luminosities (−2.34 and 2.08) are relatively close to the minimum lower bound (−2.40) and the maximum upper bound (2.58) respectively, suggesting that a_{K_1} ≤ a_F or b_F ≤ b_{K_2} could be violated. Note that, in general, the obtained estimator for F can only be regarded as an estimator of F conditionally on the observable interval [a_{K_1}, b_{K_2}]. In this section the usage of the three functions efron.petrosian(), lynden() and shen() is illustrated by analyzing the quasars data set. The practical application mainly focuses on the function shen() because, unlike the other two functions, it provides not only the estimators for the 'lifetime' DF but also the curves corresponding to the truncation times.
Besides, the computation of confidence limits through the two bootstrap resampling methods discussed previously in Section 2.4 is also provided. Numerical outputs for the function efron.petrosian() will not be given, since they are just a subset of the results displayed here. However, since the algorithm behind the function lynden() is somewhat different, some of the results obtained with this function are also shown.
The data are incorporated in the matrix object Quasars; the second and third columns correspond to the left and right truncation times respectively, while the first column is reserved for the variable of interest (in this example, the log of quasar luminosity). Using shen(), the estimated cumulative distribution can be analyzed, jointly with the estimated survival function and other values of interest provided by the next output (edited to show only the first and last lines of output):
Note that the output provides information about the observed adjusted log luminosities, the number of events (which will be 1 in the case of no ties), the estimated density at each point, and the cumulative curves (cumulative DF and survival function). There is some preliminary information about the confidence level used for the computation of the bootstrap confidence limits, as well as the number of iterations when computing the NPMLE and the maximum pointwise error when estimating f in two consecutive steps. The default stop criterion here is 1e-06.
Automatic graphical displays are obtained when changing the default FALSE to TRUE for the arguments display.FS (cumulative DF and survival function) and display.UV (marginal DFs of the truncation variables). These plots are reported in Figure 1, which includes the 95% confidence bands based on the simple bootstrap. These bands can be skipped by setting boot = FALSE; alternative bands based on the obvious bootstrap can be displayed by setting boot.type = "obvious". Similarly, a graphical plot of the bivariate DF of the truncation variables is obtained by setting display.joint = TRUE. This output is reported in Figure 2.
As already mentioned, adjusted log luminosity databases report information restricted to those quasars with apparent magnitude within a limit-of-detection interval. This introduces a strong observational bias, since relatively small and large luminosities are less probably observed. This feature can be observed in Figure 3 (left panel), which shows that adjusted log luminosities below zero are observed with a particularly small probability. This display was constructed from the output biasf of the shen() function, which contains the estimated quantities P(U* < x < V*) representing the probability that the detection interval contains a lifetime (i.e., adjusted log luminosity) of magnitude x. In the untruncated case, the curve in Figure 3, left, should be flat; under truncation, however, different shapes representing the observational bias will be obtained.
In order to compare the confidence bands obtained when using the two different bootstrap methods, "simple" and "obvious", Figure 3 (right panel) shows the estimated log survival function for the quasar data together with the 95% pointwise confidence bands. The pointwise confidence bands using the "simple" bootstrap are shown in green, whereas the confidence bands with the "obvious" bootstrap are plotted in red. It can be seen that these methods produce in general different results; this is not surprising, since they are not equivalent, as discussed in Section 2.
The second algorithm proposed by Efron and Petrosian (1999), as mentioned at the beginning of this section, may report results somewhat different from those corresponding to the first algorithm. Both algorithms, although oriented to maximize the same likelihood, follow different steps, and hence it is not surprising that the solutions may be slightly different in particular cases. As can be observed in the next output, the number of iterations needed to meet the stop criterion is much smaller here than in the previous output obtained in fit1 using the shen() function (7 against 43). This feature is in agreement with the discussion in Efron and Petrosian (1999) (see pp. 828-829). Note that this numerical output coincides with that of the function shen() (which uses the first algorithm in Efron and Petrosian (1999) for the computation of the lifetime density and DF). This does not need to be the case in general (see our second example in Section 3.2), although the differences between the solutions provided by both functions should not be large. In general, the function lynden() could be recommended to save computational time.
An issue that has been overlooked in many applications is the important bias associated with random truncation. For the quasar data, ignoring the left truncation may be very important, as suggested by our Figure 3. The plot in Figure 4 was depicted by using the lynden() function applied to several situations. The first one considers the double truncation, as performed above (in fit2). The second one (saved as fit3 below) ignores right truncation; this can be easily done by setting V = NA. Finally, fit4 below contains the output of the function lynden() when removing both (right and left) truncation times. For doing this, V = NA must be kept and, at the same time, ignorable lower truncation bounds must be introduced (since U = NA does not work in the presence of V = NA). This latter output just provides the ordinary survival function
which attaches mass 1/n to each of the adjusted log luminosities.Figure 4 reveals the strong impact of left truncation in the estimation of the DF of the quasar luminosities.This is in agreement with the observational bias depicted in Figure 3, left.

An example with right-truncated data
Induction times for the AIDS data from Lagakos, Barraj, and de Gruttola (1988) are used to illustrate a situation in which one-sided (rather than two-sided) truncation appears. This data set is available from the book by Klein and Moeschberger (2003, Table 1.10, p. 20). The data include information on the infection and induction times for 258 adults and 37 children who were infected with HIV and developed AIDS by 1996-06-30. The data consist of the time in years, measured from 1978-04-01, when adults were infected by the virus from a contaminated blood transfusion, and the waiting time to the development of AIDS, measured from the date of infection. In this sampling scheme, only individuals who had developed AIDS before the end of the study period were included, and so the induction times suffer from right truncation.
Let X be the induction time, that is, the time from HIV infection to the diagnosis of AIDS; and denote by T the time from HIV infection to the end of the study, which plays the role of right truncation time.Only those individuals (X, T ) with X ≤ T are observed.In this example the sole information included is the infection and the induction times for the 258 adults.These variables X and T are reported in the second and the third column, respectively, of the matrix AIDSdata called below.
In order to perform the data analysis, the function shen() is used, setting U = NA to indicate the absence of left-truncation. As can be seen in the next numerical output, the algorithm converged after 19 iterations; it can also be noticed that (unlike in the quasar data example) there is a clear presence of ties in this data set. The automatic graphical display of the command line above is given in Figure 5. The confidence bands (based on the simple bootstrap) are wider for large incubation times, in accordance with the under-information at these points, related to the right-truncation phenomenon. Since this data set is one-sided truncated, the best algorithm here is the second one proposed in Efron and Petrosian (1999), which is just the Lynden-Bell (1971) method. As discussed in Section 2, this algorithm converges after one iteration under one-sided truncation (indeed, the estimator has an explicit form in this case). The following output displays the numerical results achieved by the function lynden(). Unlike in the quasar data example, notice that the figures are not exactly the same as those reported by the function shen().

Conclusions
This paper discusses the implementation in R of several algorithms for computing the NPMLE of the cumulative DF in the presence of random truncation. The DTDA package implements in a friendly way the methods proposed by Efron and Petrosian (1999) and Shen (2008). To the best of our knowledge, this is the first contribution of this type dealing with the nonstandard (and sometimes ignored) issue of random truncation. The package DTDA provides not only the numerical outputs of main interest but also automatic graphical displays of several curves, such as the cumulative DF and the survival function of the lifetime, as well as the marginal and joint DFs of the truncation times. Besides, two different bootstrap methods are implemented for the computation of confidence limits.
The function lynden() may give results somewhat different from those provided by the functions efron.petrosian() or shen(). The algorithm behind lynden(), although oriented to maximize the same likelihood as shen() and efron.petrosian(), follows different steps, and hence it is not surprising that the solutions may be slightly different in particular cases. We should also point out the slow speed of convergence of the algorithms EP 0-EP 2 and S 0-S 3 when compared to L 0-L 2 (Efron and Petrosian 1999, p. 828); see also our application to the quasar data above. Although this seems to be typically the case, we have found special situations in which shen() or efron.petrosian() may converge in fewer steps than lynden(). So a definite conclusion about this point cannot be given.
An interesting extension of the package would be the implementation of smooth estimates for, e.g., the density and hazard rate functions. This could be done by computing kernel estimators, which are obtained from the NPMLE by convolution with a kernel function, providing a smooth estimator. See for example Wand and Jones (1995) for access to the related literature. Finally, an adaptation of the implemented methods to the context of regression with truncated responses could be provided, by using the empirical estimators computed by DTDA to weight the residuals, in the spirit of Stute (1993) for the censored case; see also Sánchez-Sellero, González-Manteiga, and Van Keilegom (2005) for left-truncated, right-censored responses. However, this is a field of research which remains unexplored for doubly truncated data, and new methods should be carefully worked out before this extension is possible.

Figure 1: Cumulative DF and Survival function of quasars luminosities (top) and marginal DF of each of the truncation variables (bottom), applying shen(), with "simple" bootstrap for 95% pointwise confidence bands.

Figure 2: Bivariate distribution, in log scale, of the truncation variables for the quasar data, using plot.type = "persp" (left panel) and plot.type = "image" (right panel).


Figure 3: Bias function for the quasar data (left panel). Estimated log survival for the quasar data, using the shen() function, with 95% pointwise confidence bands for the simple (red) and obvious (green) bootstrap methods (right panel).

> fit3 <- lynden(Quasars[, 1], Quasars[, 2], V = NA, boot = FALSE,
+   display.F = FALSE, display.S = FALSE)
> fit4 <- lynden(Quasars[, 1], U = min(Quasars[, 1]) - 1, V = NA,
+   boot = FALSE, display.F = FALSE, display.S = FALSE)

Figure 4: Estimated log survival as a function of the adjusted log luminosity for the quasar data, using the NPMLE of Efron and Petrosian (black line), the Lynden-Bell estimate ignoring upper truncation (red curve), and ignoring both left and right truncation (green curve).
Figure 5: Cumulative DF and survival function of the AIDS induction times (top) and marginal DF of the right-truncation variable (bottom), together with the 95% pointwise confidence bands based on the simple bootstrap method.