Canonical Response Parameterization: Quantifying the structure of responses to single-pulse intracranial electrical brain stimulation

Single-pulse electrical stimulation in the nervous system, often called cortico-cortical evoked potential (CCEP) measurement, is an important technique to understand how brain regions interact with one another. Voltages are measured from implanted electrodes in one brain area while stimulating another with brief current impulses separated by several seconds. Historically, researchers have tried to understand the significance of evoked voltage polyphasic deflections by visual inspection, but no general-purpose tool has emerged to understand their shapes or describe them mathematically. We describe and illustrate a new technique to parameterize brain stimulation data, where voltage response traces are projected into one another using a semi-normalized dot product. The length of timepoints from stimulation included in the dot product is varied to obtain a temporal profile of structural significance, and the peak of the profile uniquely identifies the duration of the response. Using linear kernel PCA, a canonical response shape is obtained over this duration, and then single-trial traces are parameterized as a projection of this canonical shape with a residual term. Such parameterization allows for dissimilar trace shapes from different brain areas to be directly compared by quantifying cross-projection magnitudes, response duration, canonical shape projection amplitudes, signal-to-noise ratios, explained variance, and statistical significance. Artifactual trials are automatically identified by outliers in sub-distributions of cross-projection magnitude, and rejected. This technique, which we call “Canonical Response Parameterization” (CRP) dramatically simplifies the study of CCEP shapes, and may also be applied in a wide range of other settings involving event-triggered data.


Introduction
Electrical stimulation of the brain can be used for a variety of diagnostic, therapeutic, and scientific purposes. Interactions between brain regions may be studied by applying or inducing pulses of electrical stimulation to a particular site, while measuring the electrophysiological response at the same place or elsewhere [1][2][3]. In particular, the averaging of measured voltages from implanted electrodes following brief (several millisecond) pulses of current produces widespread but sparse deflections from baseline (Fig 1). These voltage traces are typically called "single-pulse electrical stimulation" responses or "cortico-cortical evoked potentials" (CCEPs) [4][5][6]. We make measurements of these types with recordings of the convexity brain surface electrocorticography (ECoG) or in deeper structures from stereoelectroencephalography (stereoEEG; sEEG) and deep brain stimulation (DBS) electrodes with our neurosurgical patients [7]. Despite the "CCEP" name, these stimulation-evoked potential changes are seen with stimulation and recording of non-cortical structures such as white matter, basal ganglia, thalamus, and others [8,9]. Contemporary analysis of these CCEP responses has suffered from reliance on pre-defined assumptions about the shape that the response should have and quantification of effect only by the voltage at a particular time. This manuscript describes an algorithmic approach to formalize and simplify CCEP analysis so that responses of different shape, duration, and magnitude may be quantitatively compared with one another.
Because stimulation studies often involve a very large number of stimulated-at and measuredfrom brain sites, the potential set of interactions to study can become very large and make it difficult to examine data to discover simplifying principles. To address this, we recently introduced a conceptual framework formalizing four basic paradigms for interpreting CCEP data [10]: • The hypothesis-preselected paradigm-Two brain sites are chosen based upon a pre-defined anatomical or functional hypothesis, and a 1-way or 2-way interaction between them is characterized.
• The divergent paradigm-Stimulation is performed at one brain site and measured responses at all sites are examined and compared. For N brain sites, this characterizes N interactions.
• The convergent paradigm-One brain site is measured from, and the effects of stimulations at all brain sites are compared versus one another based upon the response shapes at the measured-from site. For N brain sites, this characterizes N interactions.
• The all-to-all paradigm-All brain sites are stimulated at, and responses are measured at all sites. For N brain sites, this characterizes N 2 interactions.
We previously addressed the convergent paradigm [10], which allows one to uncover "Basis Profile Curves" (BPCs) whose shapes characterize different types of responses at a measuredfrom brain site that can be intuitively mapped back anatomically to the stimulated brain sites. However, for many studies, what is needed is a simple way to characterize the structure of an evoked response at a single measured-from brain site produced by stimulating at one brain site (the hypothesis-preselected paradigm). This manuscript addresses this need with a new technique for identifying structure in an evoked timeseries and parameterizing single trials in terms of it.
Previous quantifications of voltage deflections in single-pulse responses (CCEPs) have typically assumed a single canonical shape consisting of characteristic negative deflections between *10-100 ms from stimulation called the "N1" response and a later second negative deflection (called the "N2") [4,11,12]. However, there are a wide variety of evoked potential Single-pulse biphasic electrical stimulation is delivered through adjacent sEEG electrode contacts (200μs, 6mA), separated by 3-7s between pulses. C. Cartoon voltage traces that might be elicited at two different sites in response to stimulation at a third site (i.e. with a stimulation artifact followed by a characteristic evoked potential deflection). D. An example set of actual evoked potentials showing the stimulation-locked evoked potential matrix V, with columns V k (t)) shown as individual traces. E. Average stimulation-evoked potential from (D). F. Examples of some of the different measured average response shapes seen in these studies (as in E). These selected responses were produced from 5 different stimulation sites across two patients (over the interval 15ms-1s post-stimulation, where the gray line indicates 0 μV). The variety of different shapes seen in just this small subset shows that there is no one typical form of stimulation evoked potential shape. shapes in CCEP responses, and the N1/N2 description is insufficient to describe most of them, as seen in Fig 1. There has not been an alternate, generic, way of approaching these data in the time domain (though some have proposed generic frequency-domain approaches [13]). A formulation is needed that studies a set of repeated trials of stimulation and extracts a canonical structure in the response (if one exists), without a pre-set assumption of the response shape. Our proposed method, which we call "canonical response parameterization" (CRP) provides a recipe for examining structural similarity between trials to a) identify whether there is a significant reproducible response shape (and over what time interval), b) characterize what this shape is, and c) parameterize single trials by the weight of the discovered shape and the residual (after the discovered shape has been regressed out). Equipped with our novel CRP parameterization, researchers can quantify the magnitude, duration, and significance of response to stimulation between pairs of brain sites in a generic framework.

Ethics statement
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board of the Mayo Clinic IRB# 15-006530, which also authorizes sharing of the data. Each patient / representative voluntarily provided independent written informed consent to participate in this study as specifically described in the IRB review (with the consent form independently approved by the IRB).

Measurement of cortico-cortical evoked potentials
Two patients with epilepsy (19 and 63 years old, both male) participated in the study while undergoing monitoring to localize their seizure onset zone. Patient 1 was implanted with 15 bilateral sEEG leads and patient 2 was implanted with 13 left sided sEEG leads plus several scalp EEG electrodes (in canonical 10/20 locations). Each sEEG leads consisted of 10-18 contacts, composed of cylindrical platinum-iridium electrodes of 2 mm length, with 1.5 mm between (3.5mm center-to-center separation). The diameter of the lead is 0.8 mm, giving each contact has an exposed surface area of 5.0 mm 2 (Fig 1). Electrode recordings were excluded if they were: 1) in seizure onset zone, 2) not stimulated, 3) artifactual, or 4) not in the brain (i.e. not extended past the bolt, etc). Voltage data were recorded at 2048Hz with a Natus Quantum amplifier. Electrode pairs were stimulated 10 times with a single biphasic pulse of 200 microseconds duration and 6 mA amplitude every 3-7 seconds using a Nicolet Cortical Stimulator. Data first were notch filtered to remove 60Hz line noise and then re-referenced to a modified common average on a trial-by-trial manner to exclude stimulated channels and channels with large variance, as described in prior work [14]. Electrodes were localized on post-operative CT scans and coregistered to preoperative MRI using the sEEG View package [15], available on github [16]. All code to implement this technique along with the sample data to reproduce the illustrations are publicly available for use without restriction (other than attribution) at: https://osf.io/tx3yq and https://github.com/kaijmiller/crp_ scripts.
Our illustration of this technique is limited to data from two patients. However, the reader interested in applying this technique with a wider set of example data than that included here can access further illustrative recordings released with our other emerging work by van Blooijs et. al. [17], Ojeda Valencia et. al. [18] and Huang et. al. [14].

Data structure
The quantification of interaction between a stimulated brain site and a recorded brain site begins with a matrix of single-trial voltage responses. Matrix V k (t) is drawn from the voltage data from the measured brain site, selecting epochs of time t over the naïvely chosen interval t 1 to t 2 following the time τ k of the k th stimulation from the stimulated electrode pair, where t denotes the time from the k th electrical stimulation at brain site m, τ k : (τ k + t 1 ) � t � (τ k + t 2 ). The dimensions of V are T × K, with T total timepoints (over the interval t 1 � t � t 2 ) by K total stimulation events ( Fig 1D).This fragment is an error. In this manuscript, t 1 of 15ms was chosen to reduce the likelihood of contamination by stimulation artifact, and t 2 of 1s was chosen because (anecdotally) the vast majority of sEEG CCEPs we have observed return to baseline well before then. We anticipate that researchers will adjust t 1 and t 2 based upon their own circumstances. Before data are analyzed and parameterized, one should evaluate the preprocessed data for baseline fidelity. Issues with inappropriate referencing or lack of a baseline around zero can be avoided by visual inspection or calculation of the mean and/or median values at far from stimulation times.

Single-trial cross-projections
In order to understand shared structure between stimulation trials, we first obtain a matrix of unit-normalized single trials:Ṽ k ðtÞ ¼ V k ðtÞ=jV k ðtÞj. EachṼ k ðtÞ is then projected into all other trials, P ¼Ṽ T V: Note that P(k, l) 6 ¼ P(l, k). The full matrix P is subsequently sorted into a combined set S, with self-projections (k = l) omitted, and a total of K 2 − K elements (Fig 2). Each element (initially with units mV � ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi # samples p ) is then scaled by 1= ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi samplerate p so that it carries the units mV � ffi ffi s p . The average over the set of cross-projections, S, summarizes the interaction from stimulation to response. Quantifying single-trial cross-projections. A. Stimulation and recording sites for this example, shown in an axial MRI section. B. 15 single trials of stimulation response (gray) produced the averaged evoked potential shape (black). C. Trial #1 (light blue) was unit-normalized and projected into the other 14 trials, omitting self-projection. D. As in (C), but for normalized trial #10 (orange). E. All 210 projections are shown sorted, note the obvious sub-sets corresponding to the projections of unit-normalized single trials. The projections of each trial into the others can reflect how representative each trial is of the canonical evoked potential response shape. F. The projections from (E), aggregated into a single column (i.e. imposing the assumption that the order of trials doesn't matter, which will be false under some circumstances). https://doi.org/10.1371/journal.pcbi.1011105.g002

Response duration
In order to quantify how long there is a significant effect after stimulation, the set S can be constructed over different time periods to determine the duration of most statistically-meaningful response. We do this by determining projection weights S and S as a function of time, S ! S t , and quantify a temporal profile, S t , as illustrated in Fig 3. Because S may be thought of as a reflection of mutual information between responses (i.e. their correlated deviation from 0V), the peak of S t represents the time past which further information is not reliably contained in the response-when the distribution of voltages across responses drifts to be indistinguishable from 0μV. We define this peak time as the "response duration", or τ R . It is important to note that this CRP approach is very sensitive to baselining, as the sign of the voltage in cross-projections around zero determines when the profile of S t begins to decrease. The uncertainty of τ R could be estimated in many ways, but, for the illustrations in this manuscript, we place error bars where Sðt 2 Þ exceeds 98% of Sðt R Þ. The truncated voltage matrix representing just the times in the response to stimulation up to this duration τ R is henceforth denoted V, with dimensions τ R − t 1 ("T R ") timepoints and K trials. The initial voltage matrix (over the naïvelychosen time interval) will be specifically designated as such when discussed further in the text.

Extraction significance: Quantifying reliability of response structure and identifying anomalous single trials
The set of projection magnitudes S t R can be tested against zero for significance, which we call the "extraction significance". Note that one cannot use the full distribution of S t R -this creates Using time-resolved projection weight to quantify response duration, τ R . A. Stimulation and recording sites for this example, shown in a sagittal MRI section. B. 10 single trials of stimulation response (gray) produced the averaged evoked potential shape (black). C. Abbreviated timeseries are calculated from t 1 to a range of t 2 s to obtain time-resolved projection weights (individual dots). The traces above indicate a subset of the projections (for the normalized trace of 9 th trial) at times t 2 = 20ms, t 2 = 80ms = τ R , t 2 = .5s, and t 2 = 1s, with distributions of S t 2 at each of these timepoints highlighted in the light red background. In this example, the blue dots are projection of the normalized trace of 9 th trial (illustrated in traces above). The thick black line is Sðt 2 Þ. Calculated response duration, τ R , is indicated by a red circle. Small vertical red lines indicate thresholds where Sðt 2 Þ exceeds 98% of Sðt R Þ (providing an estimate of the error in calculating τ R ). Note that blue dots in bottom portion are the projections illustrated for the 9 th trial from the traces on the top. D. The projection weight temporal profile, from the black line in the lower portion of (C), is shown with a gray line. The averaged voltage response, from the black line in (B), is shown with a black line, and the significant portion of the response is highlighted (i.e. up to τ R ). E. As in (D), but for the example response from Fig 2. artificial significance since a pair-wise interaction between two trials is counted twice. Once when the once when the first trial is normalized & the second trial is raw, and once in the inverse case. Therefore, only half of projections are considered for p-value and t-value analysis. These are selected such that 1) each trial is the normalized trial half of the time and raw half of the time, and 2) pair-wise interactions are counted only once. Sub-distributions of S t R corresponding to the projection magnitudes involving single trials can be used to identify the "most anomalous and most normal" single trials-(with comparison against the same selection of half of the projection magnitudes). This can be used as a simple technique for artifact rejection.

Identification of canonical CCEP shape using Linear Kernel PCA
We would like to identify a characteristic shape of the canonical CCEP, C(t), determined from V, that characterizes a stimulation-induced interaction between brain regions. The most common way to do this is to take the simple average trace (i.e. CðtÞ ! VðtÞ). However, we prefer a quantity that represents the "principal direction" (1 st principal component) of V, which captures the variance of the data and is more robust than the average against outlier trials. A standard principal component decomposition (PCA, [19]) is generally not possible in these data because of the practical fact that the number of timepoints, T R , generally far exceeds the number of trials, K, in these data (i.e. T R � K), which would require K > T 2 R to characterize the T Rby-T R matrix of interdependencies between timepoints in PCA. As in prior work [10], we address this issue by inverting the decomposition using the Linear Kernel PCA technique [20][21][22]. This method allows for the interchange of an eigenvalue decomposition of the matrix VV T (T 2 R elements) with V T V (K 2 elements). Following this approach, we obtain a matrix F, whose columns are the eigenvectors of V T V, with associated eigenvalues contained in the diagonal matrix ξ 2 , satisfying ðV T VÞF ¼ Fξ 2 . We can then solve for the eigenvectors of VV T , contained in the columns of X: Xξ ¼ VF T . We keep the first column of X as our canonical CCEP shape C(t) (Fig 4).

Parameterizing single trials in terms of the canonical response shapes
We utilize the formalism from functional data analysis to parameterize our data [23,24]. Each individual trial is represented as a projection of a canonical CCEP form C(t), scaled by a scalar α k , with residual ε(t) (note that ε(t) reflects combined measurement noise and uncorrelated brain activity): We assert that the expectation values related to ε(t) are E(ε) = 0 and Eðε 2 k Þ � Eðε 2 l Þ, for all k and l. This allows us to estimate the projection of C(t) into each individual trial as follows. First, we expand our single-trial formalism above by application of ∑ t C(t) to both sides, i.e.: X t CðtÞV k ðtÞ ¼ This allows us to calculate α k for each trial: Knowing α k , we can quantify the residual signal after regressing out the shape of C(t): , several useful quantities for each trial V k can be described: a "projection weight" α k ; a scaled version of projection weight, a 0 k that is normalized by the square root of the number of samples in C(t) (i.e. in t 1 to τ R interval) and carries intuitive units of μV (analogous to root-mean-squared response); a scalar "noise" summary term ffi ffi ffi ffi ffi ffi ffi ffi ffi ε T k ε k p (magnitude of the residual); a "signal-to-noise" ratio a k = ffi ffi ffi ffi ffi ffi ffi ffi ffi ε T k ε k p ; the "explained variance" by the canonical stimulation response (CCEP shape) is 1 À Table 1 summarizes these discovered parameters, with some examples illustrated in Fig 5. Note that canonical response extraction and parameter discovery can be highly sensitive to baselining appropriately (Fig 6). The distributions of single trial parameters can be used to quantify the significance of the canonical shape to explain variation in the data, and we call these measures "parameterization significance" to distinguish them from the extraction significance that is described above. The voltage response V k (t) from trial k (black) is parameterized by how strongly the canonical response shape (C(t), red trace, time interval t 1 to τ R ) is represented (scaling factor α k ) plus the "residual" ε k (t) (green): , and ε k (t) for example trial #6. D. As in (C), for all 10 trials. https://doi.org/10.1371/journal.pcbi.1011105.g004

Results and discussion
The figures in this manuscript show a wide variety of shapes in evoked voltage responses to brain stimulation. No single or several pre-defined form(s) would adequately capture the shape of these responses, with or without sign-flips, temporal scaling, or other manipulation. Therefore, we have constructed an approach to extract structure from these data that begins by calculating semi-normalized dot-product projections between single trials over increasing time intervals. From this, we can uncover a duration of significant response τ R , extract a characteristic shape C(t), and then parameterize the single trials in an intuitive formalism: Fig 4). We call this recipe "Canonical Response Parameterization" (CRP).

Projection magnitudes and response duration
Description of the voltage time series response to a stimulus typically begins by visualization of the average of many repeated stimuli (Fig 1). In practice, one then tries to infer how robust this shape is by visually observing a suppression of "roughness" in the small deviations in shape as more trials are added to averaging. Alternately, one can plot single trials in the background of the average shape to quantify trial-to-trial variability from the mean (as in Figs 2B or 3B, for example). However, it would be preferable to have a direct quantification of similarities between different trials, and our technique addresses this by performing pairwise cross-projections between trials (with one normalized) to identify structure. The distribution of these cross-projection magnitudes can be compared versus zero to determine significance, which we call extraction significance (illustrated in Fig 2). By omitting self-projections, there is no self-consistency in significance determination and no appeal to the mean across all trials. This projection technique is then further elaborated upon by applying it to limited time epochs for comparison, as illustrated in Fig 3. A temporal profile for projection magnitude results from this and illustrates the accumulation of information as more structure is considered in the comparison. When adding further time includes data where structure is lost as individual traces trend across zero, negative contributions to the dot product produce a decrease in the overall cross-projection magnitude. The timepoint of the maximum of the temporal profile of cross-projection magnitude therefore reveals the end of the time epoch that is meaningful across trials (we call this the "response duration" τ R , Figs 3 and 4).
Residual data after regressing out the shape of C(t). ffi ffi ffi ffi ffi ffi ffi ffi ffi ε T k ε k p mV � ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi ffi # samples p A scalar single trial summary term quantifying the magnitude of the residual.
"Explained variance" by the stimulation response C(t). show that significant vs. insignificant trials can be readily identified by applying simple statistics to the cross-projection magnitudes, S. Furthermore, the response durations τ R obtained from the peak of the cross-projection magnitude temporal profiles S t clearly capture the timing of meaningful structure that is visually apparent in the CCEP traces. For present use, we plot error bars around τ R representing the limits where S t exceeds 98% of Sðt R Þ.
Synthetic response traces, shown in Fig 7 (and S1 Fig) give the reader a set of simple illustrations to develop intuition for S t . Notably, responses that are mirrored in voltage or mirrored in time produce the same peak response magnitude, at the same duration (Fig 7G-7L). Splitting a response in two parts and separating them in time does not change the peak cross-projection response magnitude (Fig 7A-7C). Addition of noise to a response will not change the response duration, and only decreases the cross-projection magnitude at very high levels of noise (Fig 7D-7F). Note that cross-projection magnitude profile S t for a sustained fixed voltage offset increases by ffi ffi ffi ffi ffi ffi ffi ffi ffi time p , as seen in Fig 7. This is a consequence of the fact that the measure is semi-normalized-one of the single-trial vectors in the dot product is normalized

PLOS COMPUTATIONAL BIOLOGY
How V(t) is normalized prior to cross-projection has a marked effect on how significance is determined, as illustrated in Fig 8. The un-normalized approach (V k (t)V l (t)) is sub-optimal because trials with large amplitude are relatively over-emphasized, even when their shape does not reflect the most characteristic structure. Conversely, fully-normalized projections V k ðtÞṼ l ðtÞ are sub-optimal because they measure higher significance for shorter lengths of As discussed in the manuscript, different trials V k (t) and V l (t) may be compared with each other directly, or after normalization withṼ k ðtÞ ¼ V k ðtÞ=jV k ðtÞj. A. Un-normalized projections V k (t)V l (t) are sub-optimal because trials with large amplitude are over-emphasized in comparison with trials of lower amplitude but more characteristic structure. B. The timeresolved structure of fully-normalized projectionsṼ k ðtÞṼ l ðtÞ are sub-optimal because they dramatically favor early transients and cannot resolve temporally-sustained structure. C. Semi-normalized projections are optimal in that they balance emphasis of amplitude and sustained structure between trials. Panels D-F show the same sample data as A-C, and illustrate the effect of extracting the canonical response from different epochs of time. In the "standard" extraction approach we have illustrated so far, C(t) is discovered using linear kernel PCA from V(t) over the isolated time interval from t 1 to τ R (black line with yellow highlight). We can also unit normalize the average voltage VðtÞ over the t 1 to τ R interval, though the explained variance and signal-to-noise are slightly worse. D. If a C(t) is extracted using linear kernel PCA from t 1 to t 2 = 3 s (blue+red compound trace), the explained variance and signal-to-noise is very poor due to the introduction into the algorithmic process of a large amount of unnecessary noise from the time following τ R , even if the extracted form is truncated at τ R for parameterization (red trace). E and F. As in (D), but for t 2 = 2 s (E) and t 2 = 1 s (F). Note how the shapes converge as t 2 decreases.
Response duration τ R captures the point in time where the signal produced from stimulation becomes indistinguishable from zero (Fig 3). Automated quantification of response duration, rather than visual identification, is important because there is wide variation in duration across pairs of stimulated-at and measured-from brain sites (i.e. Fig 1E). It is also very important because it enables further discovery and robust parameterization of structure in the data: by taking only the segment of data up to the response duration when performing parameterization, unnecessary noise that follows this time does not confuse or diminish the algorithmic process (as illustrated in Fig 8D-8F).
In principle, a response onset/beginning time, τ B , could be calculated moving backward from the discovered response duration, e.g. search through a profile of τ B − τ R once τ R has been discovered. For the present application, that is felt to be unproductive since conduction times between stimulated electrode pairs and measured responses is of the same order as the initial τ B (*15ms). However, calculation of onset/beginning time would be useful in other, future, contexts, where there is a clear delay between the stimulation and response onset. For example, application of CRP and calculation of τ B may be useful in the study of visual or auditory evoked responses, where we know that there is a lag between visual presentation and physiological response that can change in the context of disease (e.g. visual evoked potentials increasing in latency in the context of optic neuritis [25]).

Parameterization of single trials by canonical CRP shapes, C(t), magnitude of the voltage response, a 0 k , and the residual, ε k
The discovery of response duration defines the information-rich epoch of data following stimulation, and allows for isolation of the characteristic induced response shape C(t), by using linear kernel PCA on V(t) over the isolated time interval from t 1 to τ R . Our data-driven CRP approach is an important tool to move analysis of these brain stimulation data past the level of characterization by eye, discovering C(t) empirically (rather than assuming a pre-defined shape). With CRP, researchers can identify and compare different response shapes across stimulation and recording brain sites in different patients using a unified quantification. The formalism is adopted from the field of functional data analysis [23,24] and allows us to express single trials of the voltage response as V k (t) = α k C(t) + ε k (t) (Fig 4). This representation allows single trials to be summarily characterized by normalized projection weight (α 0 , in units voltage), signal-to-noise ratio, and explained variance (Fig 5). Importantly, the CRP technique is quite robust and performs well with diminishing signal in the presence of constant noise (which we have found explicitly, using synthetically-generated responses-illustrated in S1 Fig). Quantifying effect size and statistical significance in this way helps to compare many different response shapes (whether short or long) within one framework, and opens up the possibility to explore data in the hypothesis-preselected and divergent paradigms [10]. While our illustrations consist of a low number of trials (10-15), the technique works easily and is associated with higher statistical significance when a much higher number of trials are obtained (S2 Fig). Seemingly dissimilar responses may be statistically compared with one another without difficulty, as illustrated in Fig 5. Of note, the N1/N2 shape, when present, is clearly and effectively captured (e.g. second row of Fig 5).
While the numerical values of α are not intuitive, α 0 is normalized by the square root of the number of samples in C(t) (i.e. in t 1 to τ R interval) and roughly captures the average voltage deflection from zero during the significant response interval. α 0 is comparable to the root-mean-squared metric that has been shown to be useful in this context [26], but is weighted only by the empirically-discovered significant interval of the response, rather than a preselected epoch defined by the researcher.
The different metrics to quantify and compare response size and significance will be most useful depending on the context. Normalized projection weight α 0 is useful to compare whether one stimulation-response pair has a larger average voltage. However, one brain site might have less baseline activity (as quantified by voltage) than some other site, due to a different cellular milieu or organization; in this case the magnitude of the voltage deflection compared to residual or the explained variance in the signal by the stimulation i:e: a k = ffi ffi ffi ffi ffi ffi ffi ffi ffi Canonical shapes C(t) can be compared across brain sites and patients by taking the dot products to compare similarity. As such comparisons mature, one might account for variations in anatomy and physiology that preserve overall response shape but not conduction speed by performing scaling in time (i.e. by "stretching" C(t) with established techniques [27,28]).
Note that the ability of C(t) to capture the signal in, and explain the variance of, the voltage responses is diminished if one applies the PCA extraction on longer segments of data, or uses a unit-normalized version of the averaged trace (CCEP), as seen in Fig 8D-8F.
Although this manuscript has concentrated on exploring interactions between stimulatedat and measured-from brain sites, one need not measure at a different site than was stimulated to apply our methodology. One may stimulate and measure the evoked response at the same site, applying this parameterization to the measurement, but ensuring that the beginning time of analysis, t 1 is chosen to be well after the stimulation artifact and volume conduction effects have passed [29].
The residual term, ε k (t), is a signal that reflects all local brain activity not directly linked to stimulation timing, combined with measurement noise (i.e. from amplifiers and the environment). For example, if a researcher wishes to examine non-phase-locked oscillatory (rhythmic) activity resulting from the stimulation, they should calculate this from ε k (t) rather than V k (t), since the shape of the deflections of the evoked potential C(t) will have corresponding power in the Fourier domain. For example, a positive deflection in a component of C(t) lasting 100ms will have power at 1/(2 � 0.1s) = 5Hz, but not be an oscillation. Extracts of broadband spectral activity (spread across all frequencies according to a power-law form, but often captured at high frequency by researchers) that capture local brain activity [30] might be best extracted from ε k (t) rather than V k (t).
The ε k (t) term can be used as a tool to understand changes in the shape of the response after external conditions have been applied to perturb brain state. After performing a set of stimulations and parameterizing the responses, one might administer a pharmacologic agent, perform a behavioral analysis, apply therapeutic stimulation, or observe a global state change (e.g. transition from waking to sleep, etc). Stimulations may then be re-performed with the brain in the perturbed state, but the responses are parameterized according to the original C(t), obtaining a new set of residuals ε n k ðtÞ. Then, the extraction and parameterization described in this manuscript is applied to ε n k ðtÞ rather than V(t): If any significance is identified, then the resulting new C n (t) that emerges reveals the structure introduced by the perturbation to brain function (pharmacologic, stimulation therapy, awake/asleep, etc).

Characterizing significance, anomalous trials, and artifact
When considering whether a set of N trials have a significant response to stimulation, the extraction significance defined above reveals how robust the shape is, providing a distribution of N 2 − N cross-projection magnitudes that may be tested versus zero for significance (e.g. 90 datapoints for a 10 trial set). It should be noted that there is some relationship between conjugate cross-projection magnitudes P tṼ k ðtÞV l ðtÞ and P tṼ l ðtÞV k ðtÞ, but they are not the same (as seen clearly by the difference between red and green datapoints in Fig 9D. This relationship is addressed by a balanced downsampling so that trial-pairs of projections are only included once (as described above). The distribution of p-values obtained from many iterations of null data are evenly distributed on the 0-to-1 interval, indicating that the extracted significances are statistically appropriate (S3 Fig). As an alternative to the extraction significance, one may calculate the parameterization significance based upon the single trial parameters noted in Table 1, of which there are N datapoints for each parameter (e.g. 10 datapoints for a 10 trial set). For test of significant response (versus no response) α k is the most useful, because it would be expected to be distributed around zero (insignificant) for spurious discovered structure C(t).
When comparing different stimulation-response sets that have very different shapes, it can be quite useful to compare the distributions of parameters between the two. One might say that a response is "significantly larger" than another by comparing one distribution of α 0 to another or a response is "more robust" by comparing 1 À We may use the tools of this extraction and parameterization to characterize single trials within a set of stimulation-response measurements-anomalous single trials can be identified by individual comparison of the distribution of cross-projection magnitudes involving one trial to the all of the cross-projection magnitudes involving other trials. Trials with larger- Schematic, showing sEEG stimulation and EEG recording. B. Ten single-pulse EEG trials (gray) and average trace across trials (black). Note the clearly artifactual trial. C. Time-resolved projection magnitudes for trials from (B). D. Projection magnitudes at τ R = 0.23s, suggesting that trial #6 is artifactual (p = 1.8�10 −6 , unpaired t-test comparing red+green vs black). Green dots indicate projections of normalized trial #6 into other trials, and red dots indicate normalized projections of other trials into trial #6. E and F. As in (B and C), with trial #6 removed. Note the change in τ R from 0.23 to 0.28s and S t R from 8.5 to 14.5μV� ffi ffi s p .
https://doi.org/10.1371/journal.pcbi.1011105.g009 than-average cross-projections may be interpreted as the "most representative" of the shape of the response. Conversely, as illustrated in Fig 9, trials with cross-projections that are far below the others can reveal artifactual trials for automatic rejection from the dataset (for automated artifact rejection, one can use projections at τ R or at full duration T sent for analysis). Note that there is a large and rich literature for identifying anomalous data [31], and it will be an interesting future direction to explore more comprehensively with data and methodology of this type. Fig 1 shows that there are a wide variety of stimulation response shapes in the sEEG stimulation-evoked-potential responses, and the durations of these responses may be less than 100ms or last up to 600-700ms. This would be expected from brain surface ECoG measurements, where very different response shapes and durations could be evoked at the same measurement site depending on what site was stimulated [10]. One might hypothesize that the systematic study of waveform shapes C(t) may provide understanding about the biology underlying these responses: Could each C(t) morphology type reflect a set of projections to different aspects of laminar architecture (e.g. different cell classes, or unique synaptic subtypes on pyramidal neurons) [32]? For example, in primary visual neocortex, differences in laminar pattern separate feedforward and feedback connections across the 6 layers, allowing for the characterization of a visual hierarchy [33]. Feedforward connections preferentially terminate in the middle layer (layer 4), feedback connections preferentially avoid layer 4, while lateral connections terminate in roughly equal density across all layers. A different primary brain area, the motor neocortex, completely lacks a distinct layer 4. The hippocampus, which is archicortex rather than neocortex, only has 3 layers. One would therefore expect that a homologous input type, even if originating from the same source, would produce very different C(t) in visual, motor, and hippocampal cortex. Therefore, consistent differences in C(t) across these regions could inform new models of how intralaminar dynamics generate characteristic voltage responses.

Biological interpretation of sEEG CCEP
Will future work find that specific shapes of C(t) imply specific biology, such as pro-dromic versus anti-dromic propagation, long-track versus u-fiber white matter transmission, intracortical excitation (via axons that project laterally and remain within the gray matter), or thalamocortical relays [34]? We know for example that, for the divergent paradigm, evoked potentials may arrive with smoothly varying latency, duration, or polarity along adjacent sites in an sEEG lead traversing a natural axis in a brain structure (e.g. the body of the hippocampus in response to stimulation [18]).
The amplitude, width, and overall shape of voltage deflections are influenced by factors relating to the synchronous electrical activity produced in these neuronal populations. At the microelectrode scale, local field potentials have been shown to predominantly reflect coordinated synaptic inputs [35,36]. For example, negative deflections in LFP recorded at the cortical surface can often represent current sinks generated by synchronized excitatory postsynaptic potentials (EPSPs) at apical dendrites of superficial pyramidal cells. In contrast, EPSPs at deeper cortical layers result in positive deflections in the same surface LFP. The width of an LFP deflection may therefore reflect the coherence of synaptic inputs, or may reflect the timescale of charge influx, which is specific to the neurotransmitter type, signal transduction cascade, and channel dynamics that characterize each synapse [37].

Applications in other scientific and medical contexts
Although we have illustrated this parameterization to the case of single-pulse electrical stimulation through sEEG electrodes, the approach might be applied in many other settings where one wants to characterize a reliable response structure of unknown duration. A few such possibilities are: • Evoked electrical and magnetic changes in the brain in response to visual or auditory stimuli are typically called event-related potentials (ERPs) (cf. [38,39]). ERPs have been studied exhaustively in EEG, ECoG, and MEG, to study sensation, perception, cognition, memory, and other aspects of brain function [38,[40][41][42][43][44]. Event-related potentials have been studied to understand injuries and diseases of the brain and spinal cord [45][46][47], and are also used intraoperatively to dynamically to study the function of the spinal cord and brainstem (e.g. somatosensory evoked potentials-SSEPs [48] and brainstem evoked auditory responses-BAERs [49]). Much as in the case of the N1/N2 formulation described above, these ERP data typically focus on identification of a feature by the voltage at hand-picked latencies after the stimulus. The CRP approach detailed here would automate and simplify identification of structure and relative significance in the ERP. For the example case of ERPs in EEG, it is often said anecdotally that single trial signal is very low compared to the residual "noise"application of CRP would allow one to quantify this explicitly.
• Early work with a similar formulation has been also useful for colleagues in neuroscience examining the EEG response to deep brain stimulation [50]. Our specific extraction and parameterization may fit nicely into their work, expanding on it by allowing for identification of the salient duration τ R and single-trial parameterizations noted in Table 1.
• The hemodynamic response functions (HRFs) measured with fMRI have different shapes across different regions and laminae (cf. [51,52]), and CRP might simplify the comparison of these in different voxels.
• A parameterization could be performed by replacing stimulation times with "discovered events" in ongoing brain data may be useful in examining electrophysiology studies such as action potential characterization and sorting in high-impedance microelectrode recordings.
• Brain state under anesthesia can affect CCEP shape [11]. One might apply CRP to a set of stimulations performed under one state of anesthesia, and apply the initial parameterization to a new set of stimulations performed during a different subsequent anesthesia state. Changes in α 0 or repeated CRP applied to the residual ε(t) of this subsequent parameterization would reveal change in response structure that accompanies change in anesthesia.
• Somatosensory evoked potentials (SSEPs) are measured from the brain or spinal cord in response to electrical stimulation of the peripheral nervous system for medical diagnostics in the operating room and the clinic. In the operating room, these are a realtime diagnostic electrophysiology that can dynamically reveal impending injury so the surgeon can stop an action before causing permanent injury. In the clinic and hospital setting, these can be used to diagnose brain function in coma (diminished level of consciousness) after anoxia or traumatic brain injury [46,53,54]. Parameterization would dramatically simplify the nuance required by the electrophysiological technician who assists in these surgical procedures.
• Single-pulse electrical stimulation of the white and gray matter has been used for intraoperative connectivity mapping during surgery for tumor and epileptic focus resection [55,56]. The utility of these diagnostics is still being explored [57], and CRP could help to simplify and standardize the interpretation of the CCEPs (the shape of which, as shown here, will vary dramatically), helping to identify the optimal approach for assistance during resection.
• Our ongoing work-as well as those of many colleagues [58]-is focused on the exploration of epileptic networks. With the advent of stimulation devices that can record and perform open-or closed-loop stimulation [59][60][61], and can stimulate through 4 leads [62], brain stimulation for epilepsy is rapidly evolving. As we learn to stimulate networks in tandem at different cortical and thalamic sites, the ability to quantify connectivity during sEEG implants will help to drive better DBS and RNS system implantations [63].

Limitations, alternative considerations, and future technical strategies
There are important limitations to consider when implementing CRP. By construction, this method cannot parameterize the timing of particular features. An example of this is the case where one wishes to quantify the propagation time between two areas (i.e. the latency). This is typically done by finding the first extrema (peak or trough) in the averaged response as an important time [6,17,64]. While this could be performed on C(t), it is not explicitly built into the parameterization process. A change in output can have forms that are not easily tracked by the CRP technique, such as perturbations in the overall brain state affecting the amplitude of a specific deflection within the CCEP (while sparing other deflections) in subsequent stimulation trials; alternatively, timescales can change and the duration may get longer. CRP may not be useful in those settings (though the change would be quantified in correlated structure across the residual ε(t)). By taking the peak magnitude of S t as the duration, a late resurgence of structure ("blip") following a period of relative insignificance will only contribute if it can overcome any intervening loss in cross-correlation to create a higher peak in the projection profile. Notably, we have not yet observed this in our studies, but it remains an important possibility to be aware of.
The reader should be aware of two frequent artifactual conditions, illustrated in Fig 6. In the first of these (Fig 6C and 6D), a seemingly insignificant response has a very significant brief structure at the beginning of the examined period. This situation may arise when a latent effect of stimulation artifact "carries forward" into the window being considered. Determining what is stimulation artifact and what is brief evoked neural activity is a nuanced topic that we defer to future study. The second artifactual condition to consider (Fig 6E and 6F) is the case where a set of responses appear to be complete noise, but the time-resolved projection magnitude S t grows steadily in time. Inappropriate baselining of the data produces this-we also defer comprehensive exploration of this to future treatment.
As opposed to quantifying time-domain (i.e. raw voltage) changes, one might instead study responses to single pulse electrical stimulation in the frequency domain. Broadband changes in the frequency domain are of particular appeal since their shapes may be interpreted generically as increases in firing rate [30,[65][66][67], without the need to interpret polyphasic shapes as we do in this manuscript. Crowther, et. al. and Kundu et. al. [13,68] showed that broadband changes can effectively identify interactions between brain areas, and it is very likely that broadband changes and raw voltage changes have complementary information, which has been shown for ECoG responses to visual stimuli [41]. Frequency-domain changes that are peaked at a particular frequency (rather than broadband) can reveal stimulation-evoked brain rhythms (oscillations), and are a topic of future study. As noted above, one must be careful when inferring the presence of a rhythm purely from examining responses in the frequency domain, since a simple voltage deflection (like many of those seen in Fig 1), will have power at a frequency that is the inverse of the width of the voltage deflection.
Future exploration might expand this parameterization approach in a number of different directions: • C(t) could be chosen in different manners than we have, such as: using the simple average trace (i.e. CðtÞ ! VðtÞ-we have anecdotally found that using the simple mean as the canonical form, rather than the PCA-based extraction, increases α 0 , while reducing the SNR, as seen in S4 Fig); using the "most representative" single trial, identified by the trial that has the largest average cross-projection when compared with other trials (e.g. the first trial, indicated by blue dots, in Fig 6E and 6F), truncated to τ R ; a globally-defined average shape, such as one chosen from the average in a corresponding brain site over many patients [14]; or canonically-defined shape, like the "N1/N2" shape.
• As noted above, a beginning time τ B could be calculated by moving backward from the optimal duration and recalculating the S t profile, but over τ B − τ R once τ R is known.
• It is possible to calculate the "significance of the leftover", by performing projections on the leftover matrix V k (τ R − to − T), and defining the new τ R as the value for which "leftover" cross-projection magnitude drops below a particular significance level. There are problems with this, because choice of T is somewhat arbitrary without further constraint, so the threshold will also be arbitrary.
• In our present application, error bars (uncertainty) around τ R represent the limits where S t exceeds 98% of Sðt R Þ (since S t reflects a distribution). Future approaches might employ a more nuanced approach to quantify this.
• There is very little jitter in the response onsets of these CCEPs. In other contexts, such as ERP research, there is known variation in response onset, and expanded approaches will be needed in order to align trials temporally prior to parameterization.
• It is possible to calculate cross-projections using an alternative approach: Instead of normalizing one trial by its norm, and not normalizing the other, one can normalize both trials by the square root of their respective norms. This has the effect of tightening the distribution of projections-generating higher extraction significance, but also de-emphasizing anomalous trials (i.e. those that are most representative, or those that are most likely to be artifactual).
• It may be useful in future studies to keep additional columns of X in the linear kernel PCA to study variation across trials (i.e. the second-order moment in the data), rather than the first column alone which is our canonical CCEP shape C(t) (a robust approximation of the mean).
• Future treatments might examine the effect of temporal dilation in a response-where the shape of the physiologic response is prolonged or contracted due to disease, medication, etc. The field of functional data analysis has developed "time warping" approaches [27,28] for just this purpose, and they can be applied directly to the CRP parameterization.
• It will be interesting to explore an optimal CRP parametrization for multimodal brain data (e.g. [69][70][71]). Here the parametrization may further reflect cross-modal spatial and temporal dependencies. B. The first 10 trials from (A) were parameterized in an identical fashion. No trials were identified as artifactual, and the associated t-value for extraction was 23. Note that the response duration (τ R ), mean projection at response duration (S), and scaling coefficient (α 0 ) were all nearly identical. However, the statistics of the parametrization were much more robust for 69 trials. The averaged explained variance and SNR were slightly higher for 10 trials (as might be expected). We would also like to thank a reviewer from the manuscript "Basis profile curve identification to understand electrical stimulation effects in human brain networks" who suggested inquiry along these lines, the exploration of which lead to this present work.