SPOCK, an R-based package for high-throughput analysis of growth rate, survival, and chronological lifespan in yeast

Plate reader-based methods for high-throughput measurement of growth rate, cellular survival, and chronological lifespan are a compelling addition to the already powerful toolbox of budding yeast (Saccharomyces cerevisiae) genetics. These methods have overcome many of the limits of traditional yeast biology techniques, but they also present a new bottleneck at the point of data analysis. Herein, we describe SPOCK (Survival Percentage and Outgrowth Collection Kit), an R-based package for the analysis of data created by high-throughput plate reader-based methods. This package allows for the determination of chronological lifespan, cellular growth rate, and survival in an efficient, robust, and reproducible fashion.


INTRODUCTION
The awesome power of yeast genetics has accelerated the study of many biological phenomena, including cellular aging. Many phenotypes first studied in yeast have been found to be conserved in higher eukaryotes, including humans, underscoring the importance of this model organism for the study of aging and other biological processes. Using yeast as a model organism, the impacts of many genes on aging and metabolism, and the mechanisms underlying their response to cellular and genomic damage, have been elucidated by measuring growth rate and cell survival.
In an exponentially dividing yeast culture, commonly referred to as a logarithmic phase (log phase) culture, growth rate can be expressed as a doubling time [1]. Doubling time is a key metric for understanding the effects of select genes, drugs, nutrients, and damaging agents on aging, replication, cell cycle, and damage repair pathways. Distinct from survival, growth rate queries the acute effect of a treatment on a cell's ability to replicate. Yeast doubling time is typically calculated from the optical density at 600 nm (OD 600nm), which correlates with cell number in liquid suspension [2], measured at multiple time points during log phase growth at 30°C with vigorous shaking. Initially, these measurements were performed manually, but automated plate readers with shaking incubator capabilities have allowed collection of OD 600nm measurements at preset intervals in multi-well plate formats [3]. Calculation of the doubling time is then performed manually for each strain and experimental condition.
Cell survival is determined by the capacity of a cell to proliferate (the ability to re-enter or continue the cell cycle), which should not be mistaken for viability, as a viable cell is not necessarily capable of proliferation.
In a traditional measure of survival, yeast cells from a diluted culture of interest are plated onto nutrient rich agar plates and grown into visible colonies. Subsequently, colonies, which represent the total number of cells or colony forming units (CFUs) that were plated and capable of proliferation, are counted [4]. The time-consuming nature of this assay has limited high-throughput studies of cell survival.
A key metric of aging research, chronological lifespan, is the capacity of chronologically aged cells to re-enter the cell cycle. In budding yeast, S. cerevisiae, chronological lifespan is related to how long stationary phase cells remain able to proliferate after glucose has been exhausted from the growth media and proliferation has halted [5]. Thus, chronological lifespan measures survival longitudinally. Chronological lifespan assays in yeast have relied on plating an aging culture periodically over a relevant span of time, followed by counting CFUs [6]. The ratio of CFUs after aging to CFUs before aging represents the fraction of surviving cells at each time point.
A less time-intensive alternative to manually measuring OD 600nm is to determine the effects of experimental treatments on doubling time in a high-throughput manner using an incubated, shaking plate reader such as the Bioscreen C (Growth Curves USA) [7], Synergy HTX (Biotek) [8], VICTOR Nivo (PerkinElmer), or Tecan M200 Infinite Pro (TECAN) [9], among others. These plate readers allow the user to measure OD at a defined wavelength, over time, without manual intervention, and without disturbing the growth of the experimental culture. By extension, it is also possible to define the survival and chronological lifespan of a culture by measuring the growth curves of aged and young cultures and comparing the time required for them to proliferate to an equivalent OD.
The analysis associated with determining the doubling time, survival, and chronological lifespan presents a substantial bottleneck to using these otherwise high-throughput methods. Several software packages to automate such an analysis have been constructed [10][11][12]. However, none of these packages is both available as an R package and capable of chronological lifespan analysis. R is a powerful open-source language which has become increasingly utilized in the field of biology, and it contains numerous packaged tools for downstream statistical analysis [13]. Herein we describe SPOCK (Survival Percentage and Outgrowth Collection Kit), an R package which automates the calculation of chronological lifespan, doubling times, and survival percentages required for the phenotypic analysis of mutants, while implementing numerous sanity checks to protect against the encroachment of experimental error (Fig. 1). This package was developed using the Bioscreen C machine, but can be applied to other high-throughput plate readers as well.
Brewer's yeast strains were grown at 30°C in sterile YPD medium.
S. pombe was grown at 30°C in sterile YES medium [14].
Flame-sterile techniques were used throughout.

Generation of OD calibration ladder
To generate calibrated OD values, a "ladder" of serially diluted culture of a known concentration was read with the Bioscreen C. Briefly, the OD 600nm of a mid-log culture (~0.7-0.9 OD/mL) was determined for 4 technical replicates on a VICTOR Nivo (PerkinElmer) plate reader using a 4:1 dilution. An appropriate volume of the culture was spun at 3000 rpm for 3 min in a conical tube, the supernatant was aspirated, and the pelleted cells were resuspended in 4°C YP (1% yeast extract, 2% peptone) to obtain a concentrated culture of 6 OD/mL. This culture was then serially diluted ten times with YP at a 2:1 ratio, resulting in a final culture density of ~0.01 OD. The OD of the serially diluted "ladder" was then read in 3 technical replicates with the Bioscreen C. The OD of the YP without any added inoculum was read to determine the background of the OD readings. The background was subtracted, the values were averaged, and calibration coefficients were determined using the ladder.create() function from SPOCK.

Generation of growth curves
To generate growth curves for doubling time analysis, mid-log phase S. cerevisiae cultures from two biological replicates were diluted 1/100 in YPD medium, and 200μl of the cell suspension was plated in four separate wells, for a total of four technical replicates per biological replicate. S. pombe growth curves were generated in a similar fashion, with YES medium used in lieu of YPD.
To generate growth curves for survival analysis of S. cerevisiae strains, four biological replicates of either young (2-day post-diauxic culture) or progressively aged (7 or 21 days) cultures were diluted 1/100 in 20°C YPD medium. Subsequently, 200μl of the diluted culture was transferred to each of four independent wells, for a total of four technical replicates per biological replicate. For each survival measurement, strains and replicates were diluted in the same order to minimize any effects dilution time might have on the time shift, Δt_n.
Machine settings for the Bioscreen C for both doubling-time and outgrowth analyses were as follows: shaking was set to high/continuous, and measurements were taken every 15 minutes for 36 hours at OD 420-580 nm.

Creation of data-sets for sanity checks
To simulate the introduction of bacteria into a culture, 50μl OP50 E. coli culture was inoculated into 10 ml of an S. cerevisiae culture and growth curves were obtained (using the Bioscreen C).
To simulate spontaneous mutations, or the introduction of wild yeast into a culture, strains with known short and long doubling times were switched in silico between days 7 and 21 during survival analysis.

Linear Range OD Correction
The linear range of an optical plate reader limits the maximum and minimum ODs that can be determined without dilution or concentration of a cell culture. It has also been shown that OD measurements outside of the linear range vary in a proportional relationship to the actual culture density. To correct for these limitations, a "ladder" of serially diluted culture of known OD is read on a plate reader to define this proportional relationship for that specific plate reader. While this proportional relationship is similar for various wild type laboratory S. cerevisiae strains, as shown in Fig. S1a, differences in strain size and color reinforce the necessity of creating a calibration ladder for each parental strain. Actual ODs versus ODs read by the plate reader are interpolated by fitting the data to a cubic polynomial [15,16], as shown in Fig. S1b. With this method, it is possible to correct ODs measured on the plate reader to their actual ODs without dilution (Fig. S1c).
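The cubic calibration described above can be illustrated with a short Python sketch. This is not the code behind ladder.create(); the function names (fit_polynomial, correct_od) and the synthetic ladder values are hypothetical, and the fit is a plain least-squares solve of the normal equations using only the standard library.

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - known) / a[r][r]
    return coeffs


def correct_od(measured_od, coeffs):
    """Map a plate-reader OD to an estimated actual OD using the fitted cubic."""
    return sum(c * measured_od ** k for k, c in enumerate(coeffs))


# Synthetic ladder (illustrative numbers only, not measured data):
# the "actual" ODs are generated from a known cubic of the measured ODs.
TRUE_COEFFS = [0.0, 1.0, 0.3, 0.8]
measured = [0.02, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
actual = [sum(c * m ** k for k, c in enumerate(TRUE_COEFFS)) for m in measured]

coeffs = fit_polynomial(measured, actual)
print("fitted coefficients:", [round(c, 4) for c in coeffs])
print("corrected OD for a reading of 1.5:", round(correct_od(1.5, coeffs), 3))
```

In practice the ladder would be background-subtracted and averaged across technical replicates before fitting, and a separate set of coefficients would be kept for each parental strain.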

Defining the Limits of Logarithmic Growth for Calculation of Doubling Time
To calculate the doubling time of a cell culture of interest, it is critical that only OD measurements from the logarithmic phase of growth be included in the analysis. When calculating the doubling time manually, a user trims the lag phase (pre-log phase) and post-log phase growth (for example, the period in which the doubling time of S. cerevisiae slows prior to the exhaustion of glucose at the diauxic shift) from the OD readings. The user then log transforms the data, fits the data by, e.g., linear least squares estimation [17], and verifies linearity by an R² coefficient before calculating the doubling time (δ) by the following equation, where m is the slope of the line fitted to the natural log transformed OD values versus time:
$$\delta = \frac{\ln 2}{m}$$
Automating this process depends on a reliable and repeatable method to identify a range of growth that contains only log phase (exponential) growth. One of the largest sources of complexity when picking these limits is the noise within the data created by the automated plate readers used in this high-throughput analysis. The environment within these plate readers drives evaporation; subsequently, condensation on the plate lid results in the random scattering of transmitted light. Coupled with the effects of sporadic flocculation and deflocculation of yeast in small shaken volumes, this creates a substantial degree of noise, particularly during lag phase. Furthermore, voltage variation and light source decay drive noise between growth measurements. This inherent noise complicates the selection of limits for the logarithmic growth phase, as the noise can occlude the underlying growth curve signal.
This is a natural use case for a signal filter. One such filter is a Butterworth or maximally flat magnitude filter [18], which is commonly used to remove noise in the context of signal processing while preserving the underlying frequency response. Butterworth filters offer many advantages over other filter types, including locally estimated scatterplot smoothing (LOESS) and the related Savitzky-Golay filter [19]. Among these advantages are decreased computational demand, increased user tunability, and a discrete and reproducible outcome from the filter, which many filters do not produce. Butterworth filters allow the user to define the cutoff frequency and the order, or stringency, of the filter. While the machine noise created by these plate readers is mostly high frequency, the underlying growth curve is a low frequency signal. The ability to easily define the parameters of a Butterworth filter allows the user of SPOCK to tune the filter to their specific use case, which may include sources of noise outside of those predicted by the authors. As the user increases the order of the filter, the frequency response approaches a maximally flat response; that is to say, the higher the order of the filter from 0 to the nth order, the sharper the frequency response becomes, and the more stringent the filter becomes.
Following the application of a Butterworth filter to the raw outgrowth data, the limits of log phase can be defined more precisely because noise has been removed. To define the limit of the lag phase, a sliding window of n points, over which the slope is calculated, is applied across the outgrowth curve in an iterative fashion until the slope of the curve reaches a user defined limit κ.
The value of κ can be defined heuristically, taking into account the expected slope of a S. cerevisiae strain doubling every ~120 minutes, and the limits of detection of the plate reader.
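The filtering and lag-limit steps described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not SPOCK's implementation: the filter is a single-pass 2nd-order Butterworth low-pass (a standard bilinear-transform biquad) with the cutoff given in cycles per sample, the window slope is taken end to end on natural log transformed OD values, and the function names, cutoff, window size, and κ value are all hypothetical choices.

```python
import math


def butterworth_lowpass_2nd(samples, cutoff):
    """Single-pass 2nd-order Butterworth low-pass filter.

    `cutoff` is in cycles per sample and must lie between 0 and 0.5.
    """
    k = math.tan(math.pi * cutoff)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b1, b2 = 2.0 * b0, b0
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    # Seed the filter state with the first sample so a flat input stays flat.
    x1 = x2 = y1 = y2 = samples[0]
    filtered = []
    for x0 in samples:
        y0 = b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        filtered.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return filtered


def lag_limit(curve, window, kappa, dt=1.0):
    """Index of the first window whose end-to-end slope reaches kappa."""
    for start in range(len(curve) - window + 1):
        slope = (curve[start + window - 1] - curve[start]) / ((window - 1) * dt)
        if slope >= kappa:
            return start
    return None  # the slope never reached kappa


# Noise demo: a flat signal of 1.0 with ripple at the highest possible frequency.
noisy = [1.0 + 0.5 * (-1) ** i for i in range(200)]
smoothed = butterworth_lowpass_2nd(noisy, cutoff=0.1)
print("last smoothed value:", round(smoothed[-1], 6))

# Lag demo: ln(OD) is flat for 20 reads taken 15 minutes apart, then rises
# as expected for a strain doubling every 120 minutes.
rate = math.log(2) / 120.0  # slope of ln(OD) per minute during log phase
ln_od = [math.log(0.01) + max(0, i - 19) * 15.0 * rate for i in range(60)]
start_index = lag_limit(ln_od, window=5, kappa=0.6 * rate, dt=15.0)
print("first window reaching kappa starts at read:", start_index)
```

A single-pass filter of this kind delays the signal slightly; a forward-backward pass is a common way to remove that delay, at the cost of effectively doubling the filter order.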
The inflection point of the growth curve, or the end of the logarithmic growth phase, can be defined as the point, ξ, where the first derivative of the growth curve peaks. This requires the assumption that the slow replication following the diauxic shift in S. cerevisiae does not surpass the doubling rate of the logarithmically dividing population. This can be calculated by the following equation, where y(t) is the filtered growth curve:
$$\xi = \underset{t}{\arg\max}\; \frac{dy(t)}{dt}$$
Fig. 3 shows the first derivative of a representative growth curve overlaid onto the growth curve itself, demonstrating that the upper limit coincides with the maximum of the first derivative. Fig. 3 also highlights the linear nature of the region between the limits picked by SPOCK when the data are natural log transformed.
Following signal processing, the doubling time is calculated by the exponential growth formula for the log-transformed growth curve, between the defined limits, by application of a linear least-squares regression. The R² coefficient is automatically calculated for this natural log transformed data; as R² approaches 1, the goodness of exponential fit of the growth curve increases.
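A minimal Python sketch of this calculation is given below, assuming the standard relation that the doubling time equals ln(2) divided by the slope of ln(OD) versus time. The function name and the example data are hypothetical and only illustrate the arithmetic.

```python
import math


def doubling_time(times, ods):
    """Fit ln(OD) = intercept + slope * t by least squares.

    Returns (doubling_time, r_squared); the doubling time is in the units
    of `times`.
    """
    logs = [math.log(od) for od in ods]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, logs))
    ss_tot = sum((y - mean_y) ** 2 for y in logs)
    r_squared = 1.0 - ss_res / ss_tot
    return math.log(2) / slope, r_squared


# Noise-free example: OD doubles every 90 minutes, read every 15 minutes.
times = [15.0 * i for i in range(13)]
ods = [0.05 * 2.0 ** (t / 90.0) for t in times]
dt_minutes, r2 = doubling_time(times, ods)
print("doubling time (min):", round(dt_minutes, 3), "R^2:", round(r2, 6))
```

Only reads between the lower and upper limits of log phase should be passed to such a function; including lag or post-log reads lowers R² and lengthens the apparent doubling time.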

Calculation of Survival Percentage and Survival Integral
The survival of a culture can be measured by determining the shift in the time required for the same number of cells from an aged and a young culture of S. cerevisiae, released into nutrient rich medium, to produce an equivalent number of cells. This shift is determined by the fraction of cells capable of proliferation present in the aged culture versus the young culture, because a culture with half as many proliferating cells requires one additional doubling time to reach the same density. Assuming that the doubling time, an inherent characteristic of a strain, remains constant as the culture ages, the survival percentage (S_n) can be calculated using only the shift in time (Δt_n) seen in Fig. 4a and the doubling time (δ_n), using the following formula:
$$S_n = 100 \times 2^{-\Delta t_n / \delta_n}$$
To calculate the survival integral (SI) [20], or the area under the survival curve, which is equal to the mean survival time, the following equation is applied using the calculated survival percentages, where a_n is the age of the culture at time point n and N is the number of time points:
$$SI = \sum_{n=2}^{N} \frac{S_{n-1} + S_n}{2}\,(a_n - a_{n-1})$$
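The two calculations can be sketched in Python as follows. The sketch assumes that survival halves for every doubling time of outgrowth delay, so that S = 100 × 2^(−Δt/δ), and that the area under the survival curve is approximated with the trapezoidal rule; the function names and example values are hypothetical.

```python
def survival_percentage(delta_t, doubling_time):
    """Percent of cells able to proliferate, from the outgrowth time shift.

    `delta_t` and `doubling_time` must be in the same time units.
    """
    return 100.0 * 2.0 ** (-delta_t / doubling_time)


def survival_integral(ages, survival):
    """Trapezoidal area under the survival curve (percent x age units)."""
    return sum(
        (survival[i - 1] + survival[i]) / 2.0 * (ages[i] - ages[i - 1])
        for i in range(1, len(ages))
    )


# Illustrative values only: a 90-minute doubling time, with outgrowth delayed
# by 0, 90, and 270 minutes at days 2, 7, and 21.
ages_days = [2, 7, 21]
shifts_min = [0.0, 90.0, 270.0]
survival = [survival_percentage(shift, 90.0) for shift in shifts_min]
area = survival_integral(ages_days, survival)
print("survival (%):", survival)
print("survival integral (percent-days):", area)
```

Dividing the area by 100 expresses it in days, which is the form in which it can be read as a mean survival time over the measured interval.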

SANITY CHECKS
Any wells that have doubling times shorter than a user defined limit (default of 45 minutes) are identified and flagged by SPOCK, as it is likely that these wells contain bacterial contamination. Importantly, these wells are not removed from downstream analysis, but are flagged in the final analysis.
The issue of suppressor mutations in slow growing mutants, or of wild yeast contamination, presents a second indication for a flag when analyzing survival percentage. As doubling time is assumed to be an inherent phenotype of a strain, if the doubling time between two aging time points varies by more than a user defined amount (default 25%), this may indicate that something has changed during the course of the measurements, and it is possible that the culture/replicate is not suitable for survival analysis.
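Both flags amount to simple threshold comparisons, as in the Python sketch below. The function names, the well labels, and the choice to express the change relative to the earlier time point are assumptions made for illustration; only the default limits (45 minutes and 25%) come from the text.

```python
def flag_fast_wells(doubling_times, min_doubling=45.0):
    """Wells whose doubling time (minutes) is shorter than the limit."""
    return sorted(
        well for well, minutes in doubling_times.items() if minutes < min_doubling
    )


def doubling_time_changed(earlier, later, max_change=0.25):
    """True if the doubling time differs by more than `max_change`.

    The change is measured relative to the earlier time point (an assumption).
    """
    return abs(later - earlier) / earlier > max_change


# Illustrative values only.
wells = {"A1": 92.0, "A2": 30.0, "A3": 88.0}
print("possible bacterial contamination:", flag_fast_wells(wells))
print("day 7 vs day 21, 90 -> 120 min:", doubling_time_changed(90.0, 120.0))
print("day 7 vs day 21, 90 -> 100 min:", doubling_time_changed(90.0, 100.0))
```

Flagged wells are reported rather than dropped, which mirrors the behavior described above and leaves the final decision to the user.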

SPOCK automates the determination of yeast doubling-times
SPOCK automates the calculation of doubling times, survival percentages, and mean lifespan for phenotypic analysis of mutants and varied growth conditions, while permitting ease of downstream statistical analysis within R. Directions for the implementation of SPOCK are documented further in the supplemental file README.txt.
SPOCK was used to calculate the doubling time for a number of WT S. cerevisiae strains commonly used in the lab, using the Bioscreen C. Values for OD calibrated doubling times and their replicate statistics were calculated (Table 1). Doubling times were seen to match doubling times previously determined in the lab for these strains. Furthermore, SPOCK was shown to reproducibly determine doubling times across many replicates.
To further the validation of SPOCK, doubling times were also determined for the fission yeast Schizosaccharomyces pombe, and two brewer's yeast strains, Sigmund's Voss Kveik [21], and Metschnikowia reukaufii [22]. SPOCK accurately determined the doubling times of these yeast strains when compared to traditional methods.

SPOCK automates the calculation of S. cerevisiae aging metrics
S. cerevisiae strains with previously identified long or short chronological lifespan were analyzed using the Bioscreen C and SPOCK at days 2, 7, and 21 after culture inoculation. The doubling times were calculated and then used to calculate the survival percentage and the survival integral of these strains (Fig. 5). It was expected that Strain 1 would have a short chronological lifespan as prior data have shown that it poorly forms quiescent yeast cells. In contrast, the longer chronological lifespan seen for Strain 2 is consistent with its ability to more robustly form quiescent cells, which has been shown to be a defining factor in chronological lifespan [23].

SPOCK applies sanity checks to diminish error
A subset of wells were spiked with E. coli strain OP50 [24] to mimic bacterial contamination (Table S1), and the flagging component of SPOCK correctly identified these wells as likely to harbor bacterial contamination.
Finally, SPOCK was shown to accurately flag strains that have markedly increased or decreased doubling times between aging points. We suggest that this would be analogous to a spontaneous mutation or wild yeast contamination, but recognize that this flag may not apply to all culture conditions.

CONCLUSION
Herein, we have described SPOCK, which automates the calculation of doubling times, survival percentages, and mean lifespan for phenotypic analysis of mutants and varied growth conditions in yeast, while permitting ease of downstream statistical analysis within R. We showed that SPOCK reproducibly calculates these metrics across a variety of yeast strains, while accurately implementing a number of sanity checks which provide a safeguard against experimental and biological error. By utilizing a user-tunable Butterworth filter for sources of noise in the underlying measurements, and automated user-adjustable thresholds for exponential growth followed by a multi-point fit for doubling time, SPOCK can robustly and reproducibly estimate yeast growth rates, survival, and chronological lifespan.

PACKAGE DISTRIBUTION
The software package, SPOCK, is available through GitHub at https://github.com/labmccormick/SPOCK, and can be installed in R using the R package utility install_github, with the command install_github("https://github.com/labmccormick/SPOCK"). For a detailed example of how to install SPOCK from GitHub, including necessary dependencies, please see <SPOCK-Intro.R>. SPOCK-Intro.R, a README file, and example input data for configuration testing are available at https://github.com/labmccormick/SPOCKHelp. SPOCK is licensed under the GNU General Public License version 3 (GPL-3.0-or-later). The full details of this license are available at https://www.gnu.org/licenses/gpl-3.0.txt.

Butterworth filtering of growth curve data is an effective way to robustly remove typical sources of noise. A. Fourier transformations of representative unfiltered and Butterworth filtered growth curves, using a 3rd-order Butterworth filter with 5[cHz] frequency cutoff. The Fourier transformation of the impulse response underlying the Butterworth filter is also shown. B. Unfiltered, Butterworth filtered, and Savitzky-Golay filtered representative growth curve, suggesting that Butterworth is a better choice than Savitzky-Golay for these data.
A representative Butterworth filtered growth curve, overlaid with the first order derivative (multiplied by 10 for visualization), showing that the maximum first order derivative occurs at the inflection point of exponential growth. By this method OGA() picks the limits of exponential growth. The data between these limits are reproducibly highly linear when log transformed.
A. Growth curves of S. cerevisiae strains known to have decreased or increased survival relative to a wild type strain at days 2, 7, and 21 after culture inoculation. B. Percent survival is calculated as described in Section 3.2, in which each strain is plotted at each timepoint during aging.