Regression models for censored time-to-event data using infinitesimal jack-knife pseudo-observations, with applications to left-truncation

Jack-knife pseudo-observations have in recent decades gained popularity in regression analysis for various aspects of time-to-event data. A limitation of jack-knife pseudo-observations is that they are time-consuming to compute, since the base estimate must be recalculated with each observation left out in turn. We show that jack-knife pseudo-observations can be closely approximated using the idea of infinitesimal jack-knife residuals. Infinitesimal jack-knife pseudo-observations are much faster to compute than jack-knife pseudo-observations. A key assumption underlying the unbiasedness of the jack-knife pseudo-observation approach concerns the influence function of the base estimate. We reiterate why this condition on the influence function is needed for unbiased inference and show that the condition is not satisfied for the Kaplan–Meier base estimate in a left-truncated cohort. We present a modification of the infinitesimal jack-knife pseudo-observations that provides unbiased estimates in a left-truncated cohort. The computational speed and the medium- and large-sample properties of jack-knife pseudo-observations and infinitesimal jack-knife pseudo-observations are compared, and we present an application of the modified infinitesimal jack-knife pseudo-observations in a left-truncated cohort of Danish patients with diabetes.

where ϕ′_f(h) is called the derivative of ϕ at f in direction h. The derivative f ↦ ϕ′_f is a functional from W into L¹(D, E), the space of continuous linear maps from D to E. The space L¹(D, E) is a Banach space when equipped with the operator norm: for a continuous linear map a: D → E there exists a constant K > 0 such that ‖a(f)‖_E ≤ K‖f‖_D for all f ∈ D. The operator norm of a is the smallest such constant,

\[ \| a \| = \sup_{\| f \|_D \leq 1} \| a(f) \|_E . \]

Continuity and differentiability of f ↦ ϕ′_f are defined in L¹(D, E) equipped with the operator norm. A functional ϕ is continuously differentiable, or C¹, if it is differentiable and the derivative ϕ′ is continuous with respect to the operator norm. Higher order differentiability can be defined recursively for k > 1 on the space L^k(D, E) = L¹(D, L^{k−1}(D, E)), with the operator norm

\[ \| a \| = \sup_{\| f_1 \|_D \leq 1, \ldots, \| f_k \|_D \leq 1} \| a(f_1, \ldots, f_k) \|_E . \]

The Schwarz theorem states that the k-th derivative ϕ^{(k)}_f is symmetric in its arguments, cf. Theorem 5.27 of Dudley and Norvaiša (2011).
A functional ϕ: W → E is analytic if it has a Taylor expansion around each point of W. Analyticity of ϕ on W implies that ϕ is C^k on W for every k ≥ 1, cf. Theorem 5.28 of Dudley and Norvaiša (2011). When ϕ is a C^k functional, we have the k-th order Taylor expansion with an integral remainder,

\[ \phi(f + g) = \phi(f) + \sum_{j=1}^{k-1} \frac{1}{j!} \phi^{(j)}_f(g, \ldots, g) + \int_0^1 \frac{(1 - s)^{k-1}}{(k-1)!} \phi^{(k)}_{f + s g}(g, \ldots, g) \, ds , \tag{1} \]

cf. Theorem 5.42 of Dudley and Norvaiša (2011). If ϕ: W → E and ψ: V → F are C^k, then the composition ψ ∘ ϕ is also C^k, cf. Theorem 2.10.0 of Keller (2006). The chain rule specifies that the derivative of ψ ∘ ϕ at f in direction g is

\[ (\psi \circ \phi)'_f(g) = \psi'_{\phi(f)}\big( \phi'_f(g) \big) . \]

A functional ϕ: W → E is said to be Lipschitz continuous if there exists a constant K > 0 such that

\[ \| \phi(f) - \phi(g) \|_E \leq K \| f - g \|_D \]

for all f, g ∈ W. The functional is said to be locally Lipschitz continuous if for every f ∈ W there exist a ball B_f ⊆ W around f and a constant K_f > 0 such that

\[ \| \phi(g) - \phi(h) \|_E \leq K_f \| g - h \|_D \]

for all g, h ∈ B_f. If a functional ϕ: W → E is continuously differentiable (C¹), then it is also locally Lipschitz continuous, cf. Proposition 1.4 in the supplement of Overgaard et al. (2017). We are particularly interested in C² functionals with locally Lipschitz continuous second order derivatives. For two C² functionals ϕ: W → E and ψ: V → F, if both ϕ and ψ have locally Lipschitz continuous second order derivatives, then ψ ∘ ϕ also has a locally Lipschitz continuous second order derivative, cf. Proposition 1.5 in the supplement of Overgaard et al. (2017).
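The proofs below repeatedly differentiate compositions such as ϕ ∘ ρ twice. For reference, the second order chain rule that follows from the first order rule above is recorded here; the derivation is a standard consequence of the definitions, written out by us rather than quoted from the text:

\[
(\psi \circ \phi)''_f(g, h) = \psi''_{\phi(f)}\big( \phi'_f(g), \, \phi'_f(h) \big) + \psi'_{\phi(f)}\big( \phi''_f(g, h) \big) ,
\]

obtained by differentiating f ↦ ψ′_{ϕ(f)}(ϕ′_f(g)) in the direction h, once through the base point ϕ(f) and once through the linear map ϕ′_f. Both terms are symmetric in (g, h) by the Schwarz theorem, which is consistent with the closure property quoted above.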

Functions with finite p-variation
Let (Ω, F, P) be a probability space, (X, A) a measurable space and (D, ‖·‖_D) a Banach space. Consider an i.i.d. sample X_1, …, X_n defined on (Ω, F, P) with values in X and a map δ_(·): X → D. Let

\[ F_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i} \]

be a sample average and let F denote the limit of F_n when it exists. We consider base estimates θ̂_n which are functionals of a sample average, θ̂_n = ϕ(F_n). The functional ϕ can be considered a map from D into a parameter Banach space, E say, where the parameter space is R with the Euclidean norm in our applications.
The Banach space (D, ‖·‖_D) will be the space of functions of bounded p-variation. The p-variation of a function f: J → R on an interval J ⊆ R is

\[ v_p(f) = \sup \sum_{k=1}^{m} | f(x_k) - f(x_{k-1}) |^p , \]

where the supremum is taken over all m ∈ N and points x_0 < x_1 < ⋯ < x_m in the interval J. Equipped with the norm ‖f‖_[p] = v_p(f)^{1/p} + sup_{x ∈ J} |f(x)|, the space of functions of bounded p-variation is a Banach space (Dudley and Norvaiša, 2011).
We are interested in the rate of convergence of F_n = (1/n) Σ_{i=1}^n δ_{X_i}. Theorem 6.2 of Part I of Dudley and Norvaiša (2011) deals with F_n that are the empirical distribution functions of a one-dimensional i.i.d. sample with common distribution function F:

\[ \| F_n - F \|_{[p]} = O_P\big( n^{(1-p)/p} \big) \tag{2} \]

for 1 ≤ p < 2. The result was extended to δ_x that is a vector of at-risk and counting processes with at most one jump, with the same rate as in (2), in the supplement of Overgaard et al. (2017), Lemma 2.1. In particular, the rate of convergence ‖F_n − F‖_[p] = o_P(n^{−1/4}) for p ∈ (4/3, 2) is sufficient for our application. In the present paper we consider base functionals ϕ that are differentiable, and indeed analytic, for 1 ≤ p < 2. This is proven by considering the base estimate functional as a composition of multiplication, integration and product integration.
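As an illustration of the p-variation norm and the rate in (2), the following Python sketch numerically approximates ‖F_n − F‖_[p] for a uniform sample by a dynamic program over the jump points of F_n. The function name, the grid restriction and the choice p = 1.5 are our own illustration, not part of the paper:

```python
import numpy as np

def p_variation_norm(fs, p):
    """Approximate ||f||_[p] = v_p(f)^(1/p) + sup|f| for a function with
    values fs on a finite ordered grid, restricting the supremum over
    partitions to grid points via dynamic programming."""
    m = len(fs)
    # best[j]: max of sum |f(x_k) - f(x_{k-1})|^p over partitions ending at j
    best = np.zeros(m)
    for j in range(1, m):
        best[j] = np.max(best[:j] + np.abs(fs[j] - fs[:j]) ** p)
    return best[-1] ** (1.0 / p) + np.max(np.abs(fs))

rng = np.random.default_rng(1)
p = 1.5  # a p in (4/3, 2)
for n in [100, 400, 1600]:
    x = np.sort(rng.uniform(size=n))
    # Evaluate F_n - F just before and at each jump of F_n, where the
    # extremes of the difference are attained.
    grid = np.repeat(x, 2)
    fn = np.repeat(np.arange(1, n + 1) / n, 2)
    fn[::2] -= 1.0 / n                  # left limit of F_n at each jump
    diff = fn - grid                    # F(x) = x for the uniform distribution
    norm = p_variation_norm(diff, p)
    print(f"n = {n:5d}:  ||F_n - F||_[p] ~ {norm:.4f},"
          f"  n^((1-p)/p) = {n ** ((1 - p) / p):.4f}")
```

The printed norms should decrease roughly in proportion to the reference column n^{(1−p)/p}, in line with (2).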

Influence functions
Consider a base estimate of the form θ̂_n = ϕ(F_n), with F_n = (1/n) Σ_{i=1}^n δ_{X_i}. When ϕ is differentiable at F, a first order Taylor expansion gives

\[ \hat{\theta}_n = \phi(F) + \phi'_F(F_n - F) + o_P\big( \| F_n - F \| \big) . \]

When ϕ has a locally Lipschitz derivative, the above convergence can be strengthened to

\[ \hat{\theta}_n = \phi(F) + \phi'_F(F_n - F) + O_P\big( \| F_n - F \|^2 \big) . \]

If ϕ is two times differentiable and F_n converges to F with the rate in (2), we then have

\[ \hat{\theta}_n = \phi(F) + \frac{1}{n} \sum_{i=1}^{n} \dot{\phi}(X_i) + \frac{1}{2 n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \ddot{\phi}(X_i, X_j) + o_P(n^{-1/2}) , \]

where ϕ̇(x) = ϕ′_F(δ_x − F) is called the (first order) influence function of ϕ and ϕ̈(x, y) = ϕ″_F(δ_x − F, δ_y − F) is called the second order influence function of ϕ. The first and second order influence functions satisfy

\[ E\big( \dot{\phi}(X_i) \big) = 0 \quad \text{and} \quad E\big( \ddot{\phi}(x, X_j) \big) = 0 \text{ for all } x , \tag{4} \]

cf. formula (3.23) and (3.24) in Overgaard et al. (2017).
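As a sanity check on these definitions, consider the mean functional; the worked example below is ours and is not taken from the paper. The functional is linear in F, so the second order influence function vanishes, and the jack-knife and infinitesimal jack-knife pseudo-observations both reduce exactly to the observation itself:

\[
\phi(F) = \int x \, dF(x), \qquad \phi'_F(h) = \int x \, dh(x), \qquad \phi''_F \equiv 0 ,
\]
\[
\dot{\phi}(x) = \int u \, d(\delta_x - F)(u) = x - E(X), \qquad E\big( \dot{\phi}(X_i) \big) = 0 ,
\]
\[
n \bar{X}_n - (n-1) \bar{X}_n^{(-i)} = X_i = \phi(F_n) + \phi'_{F_n}(\delta_{X_i} - F_n) ,
\]

in agreement with (4) and, trivially, with the expansion above.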

B: Infinitesimal jack-knife pseudo-observations
Let (X_1, Z_1), …, (X_n, Z_n) denote an i.i.d. sample of observations with time-to-event data X_i and covariates Z_i. Consider a base estimate θ̂_n = ϕ(F_n) that is a functional of averages of the observations and the infinitesimal jack-knife pseudo-observations

\[ \hat{\theta}^{\mathrm{IJK}}_{n,i} = \phi(F_n) + \phi'_{F_n}(\delta_{X_i} - F_n) , \quad i = 1, \ldots, n . \tag{6} \]

We let ‖·‖ denote the p-variation norm from the previous section.

Theorem 1. Let the base estimate functional ϕ be two times differentiable with a locally Lipschitz continuous second order derivative, e.g. three times continuously differentiable. Then the infinitesimal jack-knife pseudo-observations in (6) satisfy θ̂^{IJK}_{n,i} = θ̂_{n,i} + o_P(n^{−1/2}) uniformly in i = 1, …, n, where θ̂_{n,i} denotes the jack-knife pseudo-observation.
Proof. The infinitesimal jack-knife pseudo-observation for observation x is defined in terms of the function

\[ \psi(f) = \phi(f) + \phi'_f(\delta_x - f) , \]

so that the pseudo-observation in (6) is ψ(F_n) with x = X_i. The function ψ is differentiable with derivative ψ′_f(g) = ϕ″_f(δ_x − f, g), since the two first order terms cancel, and satisfies the first order Taylor expansion with integral remainder, cf. (1),

\[ \psi(F + g) = \psi(F) + \psi'_F(g) + \int_0^1 \big\{ \psi'_{F + sg} - \psi'_F \big\}(g) \, ds . \]

Consider the integrand in the remainder,

\[ \big\{ \psi'_{f + sg} - \psi'_f \big\}(g) = \big\{ \phi''_{f + sg} - \phi''_f \big\}(\delta_x - f - sg, \, g) - \phi''_f(sg, \, g) . \]

In total, the integrand in the remainder is dominated by

\[ \| \phi''_{f + sg} - \phi''_f \| \big( \| \delta_x - f \| + s \| g \| \big) \| g \| + s \| \phi''_f \| \| g \|^2 \]

uniformly in x for fixed f, since ‖δ_x − f‖ is bounded uniformly in x in our applications. Evaluate at f = F and g = F_n − F, where F_n − F is small with large probability when n is large. As the second order derivative is locally Lipschitz continuous, there exists a constant K > 0 so that for large n with high probability

\[ \sup_{x} \Big| \int_0^1 \big\{ \psi'_{F + s(F_n - F)} - \psi'_F \big\}(F_n - F) \, ds \Big| \leq K \| F_n - F \|^2 . \tag{7} \]

The right-hand side of (7) converges to zero as n → ∞. Using the assumption that ‖F_n − F‖ = o_P(n^{−1/4}), we therefore obtain the approximation of the infinitesimal jack-knife pseudo-observations

\[ \hat{\theta}^{\mathrm{IJK}}_{n,i} = \psi(F_n) = \phi(F) + \phi'_F(\delta_{X_i} - F) + \frac{1}{n} \sum_{j=1}^{n} \ddot{\phi}(X_i, X_j) + o_P(n^{-1/2}) = \hat{\theta}_{n,i} + o_P(n^{-1/2}) \]

uniformly in i = 1, …, n. In the last equation, we have used Proposition 3.1 in Overgaard et al. (2017) in the approximation of θ̂_{n,i}.
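To make Theorem 1 concrete, the following Python sketch compares jack-knife and infinitesimal jack-knife pseudo-observations for the Kaplan–Meier estimate of S(t₀). It uses the standard counting-process form of the Kaplan–Meier influence function, ϕ̇(x)(t) = −S(t) ∫₀ᵗ {dN_x(u) − Y_x(u) dΛ(u)}/y(u), evaluated at F_n; the implementation, the variable names and the simulated data are ours, and ties and the exact Gâteaux derivative of the product-limit functional at F_n are ignored for simplicity:

```python
import numpy as np

def kaplan_meier(time, event, t):
    """Kaplan-Meier estimate of S(t) from right-censored data."""
    s = 1.0
    for u in np.unique(time[event & (time <= t)]):
        d = np.sum((time == u) & event)  # events at u
        y = np.sum(time >= u)            # number at risk at u
        s *= 1.0 - d / y
    return s

rng = np.random.default_rng(0)
n, t0 = 200, 1.0
T = rng.exponential(1.5, n)              # latent event times
C = rng.exponential(3.0, n)              # censoring times
time, event = np.minimum(T, C), T <= C

s_hat = kaplan_meier(time, event, t0)

# Jack-knife pseudo-observations: n * S_hat - (n - 1) * S_hat^{(-i)}.
jk = np.empty(n)
mask = np.ones(n, dtype=bool)
for i in range(n):
    mask[i] = False
    jk[i] = n * s_hat - (n - 1) * kaplan_meier(time[mask], event[mask], t0)
    mask[i] = True

# Infinitesimal jack-knife pseudo-observations (6): S_hat plus the
# empirical influence function,
# -S_hat * sum_{u <= t0} (dN_i(u) - Y_i(u) dLambda(u)) / (Y(u) / n).
ij = np.full(n, s_hat)
for u in np.unique(time[event & (time <= t0)]):
    y = np.sum(time >= u)
    dlam = np.sum((time == u) & event) / y
    dN_i = ((time == u) & event).astype(float)
    Y_i = (time >= u).astype(float)
    ij -= s_hat * (dN_i - Y_i * dlam) / (y / n)

print("max |jack-knife - infinitesimal jack-knife|:", np.abs(jk - ij).max())
```

The maximum difference is expected to shrink as n grows, in line with the o_P(n^{−1/2}) rate, while the infinitesimal version requires a single pass over the event times instead of n Kaplan–Meier fits.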

C: Competing risks data with left-truncation
Define the modified at-risk indicators, which count an individual with entry time L and follow-up time X as at risk only after entry,

\[ Y^L(t) = 1(L < t \leq X) . \]

The combined set of counting and modified at-risk functions for observation X_i is denoted δ^L_{X_i}, and we let F^L denote the limit of (1/n) Σ_{i=1}^n δ^L_{X_i}. Using the independence of (T, Δ), C and L, the expectations of these functions under the sampling measure, that is, conditional on the truncation event, can be computed explicitly. For the condition on the influence function, we consider for simplicity the scenario without competing risks, so that one minus the Aalen–Johansen estimate reduces to the Kaplan–Meier estimate. Let χ denote the Kaplan–Meier functional of (1/n) Σ_{i=1}^n δ^L_{X_i}. With a similar argument as in Overgaard et al. (2017), the Kaplan–Meier influence function in the sampled cohort is χ̇(x) = χ′_{F^L}(δ^L_x − F^L), with an expectation, evaluated by means of the Duhamel equation (Johansen and Gill, 1990), that differs from the one required for unbiased inference.
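For comparison, the counting-process form of the Kaplan–Meier influence function in a cohort without truncation is standard (see, e.g., Overgaard et al. 2017); the display below is our restatement, with N_x(u) the counting process of observation x, Y_x(u) its at-risk indicator and y(u) = E(Y_X(u)):

\[
\dot{\chi}(x)(t) = -S(t) \int_0^t \frac{dN_x(u) - Y_x(u) \, d\Lambda(u)}{y(u)} .
\]

Without truncation, E(dN_X(u)) = y(u) dΛ(u), so every increment, and hence the influence function, has mean zero. In the truncated cohort, Y_x is replaced by Y^L_x and all expectations are taken conditional on L < X, and, as argued above, the analogous computation no longer produces the expectation required by the pseudo-observation approach; this is what motivates the modification in Appendix D.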

D: Modified infinitesimal jack-knife pseudo-observations
Let X̃_1 = (X_1, Z_1), …, X̃_n = (X_n, Z_n) denote an i.i.d. sample of observations with time-to-event data X_i and covariates Z_i. Consider modified infinitesimal jack-knife pseudo-observations of the form

\[ \hat{\theta}^{*}_{n,i} = \phi(\rho(F_n)) + \phi'_{\rho(F_n)}\big( \delta^{*}_{X_i} - \rho(F_n) \big) , \tag{9} \]

where F_n = (1/n) Σ_{i=1}^n δ_{X_i}, δ*_{X_i} is a map of X_i and ρ is a functional of the sample average. We let F denote the limit of F_n when it exists. In the application to a cohort with left-truncation, F corresponds to the measure conditional on truncation. Formula (9) can be seen as an estimate of the infinitesimal jack-knife pseudo-observations in (6). The following result extends Theorem 1 to modified infinitesimal jack-knife pseudo-observations.
Proposition 1. Let the base estimate functional ϕ be two times differentiable with a locally Lipschitz continuous second order derivative, e.g. three times continuously differentiable. Let ρ be differentiable with a locally Lipschitz continuous derivative, e.g. two times continuously differentiable. Then

\[ \hat{\theta}^{*}_{n,i} = \phi(\rho(F)) + \phi'_{\rho(F)}\big( \delta^{*}_{X_i} - \rho(F) \big) + \phi''_{\rho(F)}\big( \delta^{*}_{X_i} - \rho(F), \, \rho'_F(F_n - F) \big) + o_P(n^{-1/2}) \]

uniformly in i = 1, …, n.

Proof. The modified infinitesimal jack-knife pseudo-observation for observation x is defined by the function

\[ \psi(f) = \phi(\rho(f)) + \phi'_{\rho(f)}\big( \delta^{*}_x - \rho(f) \big) , \]

with derivative ψ′_f(g) = ϕ″_{ρ(f)}(δ*_x − ρ(f), ρ′_f(g)). The function ψ satisfies the first order Taylor expansion with integral remainder, cf. (1). Consider the integrand in the remainder,

\[ \big\{ \psi'_{f + sg} - \psi'_f \big\}(g) = \phi''_{\rho(f + sg)}\big( \delta^{*}_x - \rho(f + sg), \, \rho'_{f + sg}(g) \big) - \phi''_{\rho(f)}\big( \delta^{*}_x - \rho(f), \, \rho'_f(g) \big) . \]

Since ϕ″_{ρ(f)} is locally Lipschitz, it is possible to find a constant K_f > 0 so that

\[ \| \phi''_{\rho(f + sg)} - \phi''_{\rho(f)} \| \leq K_f \| \rho(f + sg) - \rho(f) \| \]

when ‖g‖ is small. With the same constant, we can also bound

\[ \| \phi''_{\rho(f + sg)} \| \leq \| \phi''_{\rho(f)} \| + K_f \| \rho(f + sg) - \rho(f) \| . \]

Since ρ is differentiable with a locally Lipschitz derivative, we can find a constant K̃_f > 0 so that ‖ρ(f + sg) − ρ(f)‖ ≤ s K̃_f ‖g‖ when ‖g‖ is small. Finally, since ρ′_f is locally Lipschitz, it is possible to find a constant K̄_f > 0 so that

\[ \| \rho'_{f + sg} - \rho'_f \| \leq s \bar{K}_f \| g \| . \]

In total, the integrand in the remainder is dominated by a constant multiple of ‖g‖², uniformly in x, when ‖g‖ is small. Evaluated at f = F and g = F_n − F with ‖F_n − F‖ = o_P(n^{−1/4}), the remainder is o_P(n^{−1/2}) uniformly in i = 1, …, n, and the proposition follows as in the proof of Theorem 1.

Let ω_{F_n}(·) be an estimated weight function and A(β; Z_i) a column vector. Consider estimates β̂_n that are the solution of

\[ U_n(\beta) = \sum_{i=1}^{n} \omega_{F_n}(X_i) \, A(\beta; Z_i) \big( \hat{\theta}^{*}_{n,i} - \mu(\beta; Z_i) \big) = 0 , \]

where μ(β; Z_i) denotes the regression model for the pseudo-observations. The functions ρ, δ*_{X_i} and ω are chosen so that the condition (10) in Theorem 2 is satisfied. In the application to a cohort with left-truncation, the probability measure in the expectations in (10) and (11) corresponds to the measure conditional on truncation.
Theorem 2. Consider the setup of Proposition 1. Assume A(β; Z_i) has finite second moment and that the weight functional f ↦ ω_f(·) is differentiable with a locally Lipschitz continuous derivative, e.g. two times continuously differentiable. Assume further that ω satisfies

\[ E\Big( \omega_F(X_i) \, A(\beta_0; Z_i) \big\{ \phi(\rho(F)) + \phi'_{\rho(F)}(\delta^{*}_{X_i} - \rho(F)) - \mu(\beta_0; Z_i) \big\} \Big) = 0 . \tag{10} \]

Then

\[ n^{-1/2} \, U_n(\beta_0) \xrightarrow{\;\mathcal{D}\;} N\big( 0, \, V(\beta_0) \big) \]

and

\[ V(\beta_0) = \operatorname{Var}\Big( \omega_F(X_i) A(\beta_0; Z_i) \big\{ \phi(\rho(F)) + \phi'_{\rho(F)}(\delta^{*}_{X_i} - \rho(F)) - \mu(\beta_0; Z_i) \big\} + g(X_i; \beta_0) \Big) , \tag{11} \]

where

\[
g(x; \beta_0) = E\Big( \omega'_F(\delta_x - F)(X_i) \, A(\beta_0; Z_i) \big\{ \phi(\rho(F)) + \phi'_{\rho(F)}(\delta^{*}_{X_i} - \rho(F)) - \mu(\beta_0; Z_i) \big\} \Big) + E\Big( \omega_F(X_i) \, A(\beta_0; Z_i) \, \phi''_{\rho(F)}\big( \delta^{*}_{X_i} - \rho(F), \, \rho'_F(\delta_x - F) \big) \Big)
\]

is the projection of the second order terms.

Proof. Using that the weight function f ↦ ω_f is differentiable with a locally Lipschitz derivative, together with Proposition 1, we may express the estimating function as

\[
U_n(\beta_0) = \sum_{i=1}^{n} \omega_F(X_i) A(\beta_0; Z_i) \big\{ \phi(\rho(F)) + \phi'_{\rho(F)}(\delta^{*}_{X_i} - \rho(F)) - \mu(\beta_0; Z_i) \big\} + \sum_{i=1}^{n} \omega'_F(F_n - F)(X_i) A(\beta_0; Z_i) \big\{ \phi(\rho(F)) + \phi'_{\rho(F)}(\delta^{*}_{X_i} - \rho(F)) - \mu(\beta_0; Z_i) \big\} + \sum_{i=1}^{n} \omega_F(X_i) A(\beta_0; Z_i) \, \phi''_{\rho(F)}\big( \delta^{*}_{X_i} - \rho(F), \, \rho'_F(F_n - F) \big) + o_P(n^{1/2}) . \tag{12}
\]

Writing F_n − F = (1/n) Σ_{j=1}^n (δ_{X_j} − F), the three sums in (12) can be collected into

\[ U_n(\beta_0) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n} h\big( \tilde{X}_i, \tilde{X}_j; \beta_0 \big) + o_P(n^{1/2}) , \tag{13} \]

with a kernel h symmetrized over its two arguments. The first term of (13) is a symmetric mean zero U-statistic of order 2. The mean of the first term of (12) is zero by assumption (10). The mean of the second term of (12) is

\[ E\Big( \omega'_F(\delta_{X_j} - F)(X_i) \, A(\beta_0; Z_i) \big\{ \phi(\rho(F)) + \phi'_{\rho(F)}(\delta^{*}_{X_i} - \rho(F)) - \mu(\beta_0; Z_i) \big\} \Big) = 0 \]

due to the property of the influence function in (4). The mean of the third term of (12) is

\[
E\Big( \omega_F(X_i) A(\beta_0; Z_i) \, \phi''_{\rho(F)}\big( \delta^{*}_{X_i} - \rho(F), \, \rho'_F(\delta_{X_j} - F) \big) \Big) = E_{\tilde{X}_i}\Big( \omega_F(X_i) A(\beta_0; Z_i) \, E\big( \phi''_{\rho(F)}\big( \delta^{*}_x - \rho(F), \, \rho'_F(\delta_{X_j} - F) \big) \big) \Big|_{x = X_i} \Big) = 0 ,
\]

due again to the property of the influence function.
It follows from Theorem 12.3 of van der Vaart (1998) that n^{−1/2} U_n(β_0) converges in distribution to a mean zero normal distribution with variance (11).
The asymptotic distribution of β̂_n now follows from standard asymptotic arguments (Parner et al., 2020). We state the result, but leave the proof to the reader.
Theorem 3. Assume the following regularity conditions:

1. μ(·; z) and A(·; z) are continuously differentiable for all z ∈ Z.

Then an estimator β̂_n exists so that U_n(β̂_n) = 0 with a probability tending to 1 for n → ∞. Moreover,

\[ \sqrt{n} \, \big( \hat{\beta}_n - \beta_0 \big) \xrightarrow{\;\mathcal{D}\;} N\Big( 0, \; M(\beta_0)^{-1} \, V(\beta_0) \, M(\beta_0)^{-\top} \Big) , \qquad M(\beta_0) = E\Big( \omega_F(X_i) \, A(\beta_0; Z_i) \, \tfrac{\partial}{\partial \beta} \mu(\beta; Z_i)^{\top} \big|_{\beta = \beta_0} \Big) , \]

as n → ∞.
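A minimal Python sketch of the resulting estimation procedure, assuming an identity link μ(β; z) = z⊤β, constant weight ω ≡ 1 and A(β; z) = z, so that the estimating equation is linear in β and solved by least squares on the pseudo-observations. The data-generating step stands in for pseudo-observations computed as above, and the variance estimate uses only the first term of (11), omitting the projection term g for brevity:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Illustrative design: intercept plus one binary covariate, and a vector
# of (already computed) pseudo-observations following a linear model.
Z = np.column_stack([np.ones(n), rng.integers(0, 2, n)])
beta_true = np.array([0.7, -0.15])
theta = Z @ beta_true + rng.normal(0.0, 0.3, n)  # stand-in pseudo-observations

# Identity link, omega = 1, A(beta; z) = z:
# U_n(beta) = sum_i Z_i (theta_i - Z_i' beta) = 0  =>  least squares.
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ theta)

# Sandwich variance M^{-1} V M^{-T} / n with M = (1/n) sum Z_i Z_i' and
# V estimated by (1/n) sum Z_i Z_i' (theta_i - Z_i' beta_hat)^2.
resid = theta - Z @ beta_hat
M = Z.T @ Z / n
V = (Z * resid[:, None]).T @ (Z * resid[:, None]) / n
cov = np.linalg.inv(M) @ V @ np.linalg.inv(M).T / n

for j, (b, se) in enumerate(zip(beta_hat, np.sqrt(np.diag(cov)))):
    print(f"beta[{j}] = {b:+.3f}  (SE {se:.3f})")
```

In the left-truncated application, theta would be replaced by the modified infinitesimal jack-knife pseudo-observations of (9), and the omitted projection term of (11) would be added to the variance estimate.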