A time-and-motion approach to micro-costing of high-throughput genomic assays.

BACKGROUND
Genomic technologies are increasingly used to guide clinical decision-making in cancer control. Economic evidence about the cost-effectiveness of genomic technologies is limited, in part because of a lack of published comprehensive cost estimates. In the present micro-costing study, we used a time-and-motion approach to derive cost estimates for 3 genomic assays and processes, namely digital gene expression profiling (GEP), fluorescence in situ hybridization (FISH), and targeted capture sequencing (including bioinformatics analysis), in the context of lymphoma patient management.


METHODS
The setting for the study was the Department of Lymphoid Cancer Research laboratory at the BC Cancer Agency in Vancouver, British Columbia. Mean per-case hands-on time and resource measurements were determined from a series of direct observations of each assay. Per-case cost estimates were calculated using a bottom-up costing approach, with labour, capital and equipment, supplies and reagents, and overhead costs included.


RESULTS
The most labour-intensive assay was found to be FISH at 258.2 minutes per case, followed by targeted capture sequencing (124.1 minutes per case) and digital GEP (14.9 minutes per case). Based on a historical case throughput of 180 cases annually, the mean per-case cost (2014 Canadian dollars) was estimated to be $1,029.16 for targeted capture sequencing and bioinformatics analysis, $596.60 for FISH, and $898.35 for digital GEP with an 807-gene code set.


CONCLUSIONS
With the growing emphasis on personalized approaches to cancer management, the need for economic evaluations of high-throughput genomic assays is increasing. Through economic modelling and budget-impact analyses, the cost estimates presented here can be used to inform priority-setting decisions about the implementation of such assays in clinical practice.


INTRODUCTION
The use of molecular information is transforming the field of oncogenomics. Ongoing genomic discoveries in the area of cancer research have contributed to the potential for personalized approaches to cancer management 1. Molecular characterization of tumours has already contributed to the discovery of targeted therapies such as trastuzumab for HER2-positive breast cancer 2 and rituximab for first-line chemotherapy in diffuse large B-cell lymphoma (DLBCL) 3, both of which have transformed the standard of care for those disease sites. More recently, immunohistochemistry and cytogenetic analysis for the non-Hodgkin (NHL) and Hodgkin lymphomas have been used to identify prognostic biomarkers in the anaplastic large-cell lymphomas, to understand the prognostic impact of molecularly distinct cell-of-origin subgroups in cases of DLBCL, and to identify various factors that can be used to risk-stratify Hodgkin lymphoma patients with the aim of offering targeted treatment regimens 4-6. In a fiscally conscious health care environment, there is great appeal in being able to identify patients for whom newer and potentially more costly treatments will be cost-effective. Despite those advances in cancer research, economic evidence supporting the application of high-throughput genomic technologies in everyday practice is lacking 7,8. The accelerated pace at which high-throughput genomic assays and next-generation sequencing technologies are being adopted in cancer diagnostics threatens to outpace assessment of the economic implications of changes in practice 9-11. A recent rapid review by the Canadian Agency for Drugs and Technologies in Health of the cost-effectiveness of next-generation sequencing technologies, including whole-genome, whole-exome, and targeted capture sequencing, found that the economic evidence is currently insufficient to support definitive claims about the cost-effectiveness of those technologies 12.
Similarly, a review by Buchanan et al. 7 of the literature on economic evaluations of genomic interventions identified a number of methodological challenges faced by health economists when undertaking evaluations of personalized medicine technologies, one of them being the lack of comprehensive cost estimates available to researchers. Some research institutions and manufacturers publish price lists for personalized medicine technologies, but because those services are often provided for profit, using the listed prices to inform decision-making within a publicly funded health care system cannot be recommended. In addition, such estimates often neglect the multidisciplinary nature of the work (encompassing, for example, clinical researchers, bioinformatics analysis, and laboratory oversight), and the assumptions factored into the related calculations are not always made apparent, limiting their applicability 12.
The foregoing challenges highlight a critical gap in the costing literature that can be addressed by the collection of comprehensive cost estimates for high-throughput genomic technologies. Use of micro-costing techniques 13-16, in particular those reflecting a bottom-up costing approach 17,18, provides a rigorous method for directly measuring the activities related to the relevant assays and assigning prices at a per-unit level to the resources used.
Various methods are available for measuring the quantities of resources used in costing studies, one of which is a direct observational (that is, time-and-motion) approach 15,19,20 . In a time-and-motion study, an external observer records, for a defined set of activities, the resources used and their quantities (that is, laboratory equipment, supplies) and the time taken in conducting the activities. Studies of this kind have been used in the health care setting to understand the workflow in surgical operating rooms [21][22][23] , the per-patient cost of disease screening 24,25 , and the implementation of health informatics and information technologies 26,27 .
The aim of the present study was to determine the per-case (that is, per-patient) cost of 3 commonly used high-throughput genomic assays, namely digital gene expression profiling (GEP), fluorescence in situ hybridization (FISH), and targeted capture sequencing, in the context of lymphoma cancer research in the province of British Columbia.

METHODS
The study took the perspective of the BC Cancer Agency (BCCA) and was conducted as part of a multi-year (2013-2017) research project in genomics and personalized health, one of its aims being to establish a province-wide system to guide the personalization of treatment for lymphoid cancers in British Columbia 28. Enrolled patients were at least 16 years of age, had been diagnosed and treated in British Columbia, were HIV-negative, and had 1 of 3 types of NHL (DLBCL, follicular lymphoma, chronic lymphocytic leukemia) or Hodgkin lymphoma. A separate set of procedures was being used for patients diagnosed with chronic lymphocytic leukemia, and those cases were therefore excluded.
Formalin-fixed paraffin-embedded tissues from biopsies are delivered to the BCCA for standard and molecular diagnostic analyses in the BCCA histopathology laboratory (Figure 1). After analysis, leftover tissues from eligible DLBCL, follicular lymphoma, and Hodgkin lymphoma cases are transferred to the BCCA Department of Lymphoid Cancer Research (LCR) laboratory for DNA or RNA extraction, quantification, quality assessment, and storage. Depending on the type of lymphoid cancer, specimens undergo targeted analysis to further characterize their unique genomic (FISH, targeted capture sequencing) and transcriptomic (digital GEP) features. The clinical assessment and genomic or transcriptomic profiles are then returned to the treating oncologist for evaluation of available treatment options.
For the purposes of the broader research project, a set of 16 standard operating procedures (SOPs; Table I) was developed to process individual cases through specimen collection, preparation, and quality assessment (SOP-01 to SOP-09), digital GEP (SOP-10 to SOP-12), FISH (SOP-13 and SOP-14), and targeted capture sequencing and bioinformatics analysis (SOP-15 and SOP-16). Digital GEP is performed using the nCounter Digital Analyzer platform (NanoString Technologies, Seattle, WA, U.S.A.) with an 807-gene code set, which includes 20 genes to determine cell-of-origin for DLBCL cases 29 and 26 genes to stratify Hodgkin lymphomas as either high- or low-risk 30. Dual-colour FISH is conducted using the Metafer imaging software (MetaSystems, Altlussheim, Germany) and a Carl Zeiss Axio Imager Z2 microscope (Carl Zeiss Microscopy, Oberkochen, Germany), with break-apart probes used to detect MYC, BCL2, and BCL6 rearrangements in DLBCL. Targeted capture sequencing is performed using Agilent SureSelect capture protocols (Agilent Technologies, Santa Clara, CA, U.S.A.) and the Illumina MiSeq platform (Illumina, San Diego, CA, U.S.A.) to detect somatic single nucleotide variants and insertions or deletions in 32 genes that are reported to be recurrently mutated in DLBCL, follicular lymphoma, and chronic lymphocytic leukemia, and to harbour clinically actionable potential. Each assay is conducted in the LCR laboratory. For each SOP, resource use related to labour (staff), capital and equipment (for example, the MiSeq desktop sequencer), supplies and reagents (for example, disposable gloves, commercial probes), and overhead was collected using a direct observational technique. A micro-costing approach was then applied to calculate the costs attributable to each resource.
The research project obtained ethics approval from the University of British Columbia-BCCA Research Ethics Board (H05-60103).

Time-and-Motion Study
Various approaches to conducting time-and-motion studies can be used, each of which has its own strengths and limitations 16 . For the present research study, a direct observer approach was used to avoid some of the input errors and inconsistencies that can affect other data collection techniques (work sampling, for instance) 13,20 .
The setting for the time-and-motion study was the LCR laboratory. An external observer (SC) conducted the observations during a 5-month observation period in 2014. The only exception to this observer-led approach was the bioinformatics analysis, for which the mean per-case duration to analyze the targeted capture sequencing data (SOP-16) was calculated from minimum and maximum time estimates provided by the bioinformatician. Depending on the SOP, each observation lasted between 30 minutes and 5 hours. At the start of the observation period, approximately 150 cases had been enrolled in the genomics and personalized health research project. In the time-and-motion and micro-costing study, 99 individual cases were observed, with some cases being observed more than once across all SOPs (Table II). The breakdown of unique cases by diagnostic type was 48 DLBCL cases (48.5%), 19 follicular lymphoma cases (19.2%), and 32 Hodgkin lymphoma cases (32.3%). The patients were predominantly men (65%) and less than 60 years of age at diagnosis (67%); all patients were diagnosed in 2014.
The amount of staff "hands-on" time, capital and equipment, and supplies and reagents used to process the batch of samples for every observation during the SOP was recorded; "fixed" times (the extended periods during which samples are on the equipment with minimal-to-no staff oversight, such as overnight incubation periods) were not included. Per-case estimates of hands-on time reflect the average time required for a batch of cases to be processed through the SOP across multiple observations of the procedure or procedures (Table II). In that way, the mean per-case estimate reflects an average across all cases observed, and not the total time required to process a single case in isolation. The hands-on time for each SOP was recorded using a stopwatch, and the number of resources used was recorded in a data collection template; the resulting data were later transcribed. When a set of cases was batched for processing together (for example, digital GEP), total resource use was divided by the number of cases in the batch.

TABLE I Standard operating procedures (SOPs)

Specimen collection, preparation, and quality assessment
1. Specimen collection and new case entry: Describes how and when to collect new cases from the clinical lab (histopathology), how to record information, how to process cases, and where to store them.
2. Purification of genomic DNA and total RNA from formalin-fixed paraffin-embedded tissue sections: Describes how to extract DNA and RNA from tissue sections that have been formalin-fixed and paraffin-embedded.
3. Quantifying, labelling, and storing RNA: Describes how to quantify RNA, how to correctly design and print labels, where to store the original extraction case, and how to enter RNA extraction information into the tracking database.
4. Quantifying, labelling, and storing DNA: Describes how to quantify DNA, how to correctly design and print labels, where to store the original extraction case, and how to enter DNA extraction information into the tracking database.
5. Preparing tumour DNA dilutions for targeted capture sequencing: Describes how to prepare, label, and store 50 ng/μL dilutions for targeted capture sequencing.
6. DNA quality assessment: Describes how to assess and estimate the degree of degradation in DNA extracted from formalin-fixed, paraffin-embedded tissue.
7. RNA quality assessment: Describes how to assess the quality of RNA.
8. Identifying and tracking peripheral blood DNA cases: Describes how to identify, locate, and track peripheral blood DNA cases in the tracking database.
9. Preparing peripheral blood DNA dilutions for targeted capture sequencing: Describes how to prepare, label, and store dilutions for targeted capture sequencing.

Digital gene expression profiling
10. NanoString Technologies nCounter Gene Expression Assay hybridization procedure: Describes how to perform the NanoString Technologies nCounter Gene Expression Assay hybridization using total RNA extracted from formalin-fixed, paraffin-embedded tissue sections.
11. NanoString Technologies nCounter prep station operation: Describes how to operate the nCounter prep station for purification and immobilization of post-hybridized nCounter assay cases to an nCounter cartridge, allowing for subsequent data collection on the nCounter Digital Analyzer.
12. NanoString Technologies nCounter Digital Analyzer operation: Describes how to operate the nCounter Digital Analyzer for data collection.

Fluorescence in situ hybridization (FISH)
13. FISH protocol for paraffin-embedded tissue: Describes how to perform FISH on paraffin-embedded tissues.
14. FISH analysis: Describes how to automatically score a tissue section using the Metafer semi-automatic scoring software.

Targeted capture sequencing and bioinformatics analysis
15. Illumina Multiplex Sequencing: Describes an optimized protocol for Illumina paired-end multiplex library preparation using the SureSelect XT2 Library Prep and Capture System.
16. Sequence analysis for mutation calling: Describes the processing and analysis of sequence data to identify high-confidence somatic variants ("mutation calls").

Micro-Costing Analysis
A cost model was developed to calculate the monetary value of the 4 resource categories: labour, capital and equipment, supplies and reagents, and overhead. The mean hands-on time for each SOP and the annual case throughput (that is, the number of cases that can feasibly be processed through the SOP, given current resources) were used to estimate the proportion of the unit cost for each resource that could be attributed to the 16 SOPs. Case throughput was based on historical referral patterns to the BCCA for newly diagnosed NHL (DLBCL and follicular lymphoma) and Hodgkin lymphoma cases 31,32, together with the project-specific average case accrual rate of 3-4 specimens per week. The result was an estimated case throughput of 180 cases annually. Although that estimate is conservative, it was considered reasonable, given the resource capacity of the study setting and the assumption in the cost model that resources (for example, number of staff) remained constant over time.

Labour
The mean per-case hands-on time derived from the time-and-motion study was used to calculate the proportion of total working hours devoted to conducting the SOP. A per-minute salary rate was calculated based on a typical 7.5-hour workday, taking into consideration statutory holidays and vacation days 25. The resulting per-minute rate was multiplied by the mean hands-on time to derive a per-case labour cost estimate. Minimum and maximum salary ranges were used in the calculation of upper and lower labour cost estimates.
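The labour calculation above can be sketched as follows. All numeric inputs here (annual salary, holiday and vacation counts, hands-on minutes) are hypothetical placeholders, not the study's actual figures:

```python
def per_minute_rate(annual_salary, workday_hours=7.5,
                    weekdays_per_year=260, stat_holidays=11, vacation_days=15):
    """Convert an annual salary to a per-minute rate over actual working
    minutes, net of statutory holidays and vacation days (assumed counts)."""
    working_days = weekdays_per_year - stat_holidays - vacation_days
    working_minutes = working_days * workday_hours * 60
    return annual_salary / working_minutes

def labour_cost_per_case(annual_salary, hands_on_minutes, **kwargs):
    """Per-case labour cost: per-minute salary rate x mean hands-on minutes."""
    return per_minute_rate(annual_salary, **kwargs) * hands_on_minutes

# Illustration only: a hypothetical $55,000 salary and the 14.9 hands-on
# minutes per case reported for digital GEP give roughly $7.78 per case.
cost = labour_cost_per_case(55_000, 14.9)
```

Running the same function with the minimum and maximum of a salary range yields the lower and upper labour cost bounds described above.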

Capital and Equipment
The unit cost for each type of capital resource and equipment used in an SOP reflects the purchase price of the item. Because most of the capital resources and equipment used for the assays are shared between projects and laboratory staff, the proportion specific to the genomics research study was calculated based on mean per-case hands-on and fixed times for all SOPs. The annual capital and equipment costs were then divided by the estimated case throughput to establish a per-case capital and equipment cost. Straight-line amortization was used to calculate the annual cost of each unit during its useful lifetime (in years) 14, which was informed by laboratory staff. When that information was not known, the duration of the service contract was used as a conservative estimate of the unit's useful lifetime. Upper and lower per-case cost estimates were calculated based on manufacturer prices, where available (for example, from manufacturer Web sites).
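As a minimal sketch of the straight-line amortization and per-case allocation described above (all figures hypothetical, not the study's inputs):

```python
def annual_amortized_cost(purchase_price, useful_life_years):
    """Straight-line amortization: equal annual cost across the useful lifetime."""
    return purchase_price / useful_life_years

def per_case_equipment_cost(purchase_price, useful_life_years,
                            project_share, annual_throughput):
    """Annual amortized cost, scaled by the project's share of instrument use
    (derived from hands-on and fixed times), spread over annual case throughput."""
    annual = annual_amortized_cost(purchase_price, useful_life_years)
    return annual * project_share / annual_throughput

# Illustration only: a $100,000 instrument amortized over a 5-year service
# contract, a 30% project share, and 180 cases annually.
cost = per_case_equipment_cost(100_000, 5, 0.30, 180)  # about $33.33 per case
```

A shorter assumed useful lifetime raises the annual amortized cost, which is why the service-contract assumption noted above is conservative.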

Supplies and Reagents
The unit prices for supplies and reagents reflect the local purchase prices of the items. Each purchase price was divided by the number of units contained within the purchased item to establish a per-unit cost, which was then multiplied by the number of units required to process a single case through the SOP. Maximum and minimum values for major drivers of supply and reagent costs were based on manufacturer prices, where available, and were used to calculate upper and lower per-case cost estimates.
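The supply costing described above amounts to a unit-price conversion followed by a per-case multiplication; a sketch with invented prices and pack sizes:

```python
def per_case_supply_cost(items):
    """items: iterable of (purchase_price, units_per_purchase, units_per_case).
    Each purchase price is divided by the pack size to get a per-unit cost,
    then multiplied by the units consumed per case, and the results summed."""
    return sum(price / pack_units * units_used
               for price, pack_units, units_used in items)

# Illustration only: a $50 box of 100 gloves (2 used per case) and a $400
# reagent kit covering 20 cases (1 unit per case).
cost = per_case_supply_cost([(50, 100, 2), (400, 20, 1)])  # 1.00 + 20.00 = $21.00
```

Substituting manufacturer minimum and maximum prices for the major cost drivers gives the lower and upper per-case bounds.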

Overhead
The 2014 overhead costs were provided by the BCCA's operations management division and included costs related to information technology, security, building operation, and maintenance. To adjust for cost-sharing within the laboratory, the annual overhead cost (per square foot) was distributed between the laboratory directors so as to reflect the proportion attributed to a research project 14. The cost per square foot was multiplied by the size of the lab bay in which each assay took place and was then divided by the estimated annual case throughput to establish a per-case overhead cost. To avoid double counting, overhead was applied only once in the calculation of assay costs.
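That allocation reduces to a few multiplications; a sketch with hypothetical figures (the actual per-square-foot rate, bay size, and project share are not reported here):

```python
def per_case_overhead(cost_per_sqft, bay_sqft, project_share, annual_throughput):
    """Annual overhead for the lab bay in which the assay is performed,
    scaled to the research project's share and spread over case throughput."""
    return cost_per_sqft * bay_sqft * project_share / annual_throughput

# Illustration only: $30/sq ft, a 120 sq ft bay, a 50% project share,
# and 180 cases annually.
cost = per_case_overhead(30, 120, 0.50, 180)  # $10.00 per case
```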

Overall Cost
Once costs were derived for each resource category, a total per-case cost estimate for each assay was calculated by summing the per-case costs for labour, capital and equipment, supplies and reagents, and overhead across all relevant sops. Maximum and minimum cost estimates for each sop and assay were also calculated by summing the upper and lower cost estimates.
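The summation step can be sketched as follows; the SOP names and dollar amounts are invented for illustration:

```python
def total_per_case_cost(sop_costs):
    """Total per-case assay cost: the sum, across all relevant SOPs, of the
    per-case costs in the four resource categories."""
    return sum(sum(categories.values()) for categories in sop_costs.values())

# Hypothetical two-SOP assay; overhead is applied only once (here, to SOP-10)
# to avoid double counting, per the method described above.
assay = {
    "SOP-10": {"labour": 5.0, "capital": 30.0, "supplies": 200.0, "overhead": 10.0},
    "SOP-11": {"labour": 8.0, "capital": 12.0, "supplies": 40.0, "overhead": 0.0},
}
total = total_per_case_cost(assay)  # 245.0 + 60.0 = 305.0
```

Summing the per-SOP upper and lower bounds in the same way yields the maximum and minimum assay cost estimates.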

RESULTS
The present study provides a baseline understanding of the time and costs associated with performing some commonly used genomic assays in a research setting. Time estimates and per-case cost estimates are reported in the subsections that follow.

Estimates of SOP Hands-On Time
Table II presents mean per-case hands-on time (with standard deviation) for each SOP. For some SOPs (specifically, SOP-08 and SOP-15), only 1 full observation of the procedure was conducted; hence, a standard deviation is not available. The fewest cases were observed for the FISH and targeted capture sequencing assays, in part because of staff and observer availability, but also because of the larger batch size needed to process the assay (such as for targeted capture sequencing). The hands-on time of the SOPs related to specimen collection, preparation, and quality assessment was estimated at 86.5 ± 13.6 minutes per case. The assay that was found to be the most labour-intensive was FISH, at 258.2 ± 55.2 minutes per case, followed by targeted capture sequencing with bioinformatics analysis at 124.1 minutes (one observation only). The assay with the lowest hands-on time was digital GEP with an 807-gene code set: 14.9 ± 8.9 minutes per case. Bioinformatics analysis for GEP was not observed.


Cost Estimates
An 807-gene code set, which is typically used in research settings for broad panel discoveries, including cell-of-origin assignment, was used for the GEP cases observed during the present study. In clinical application of GEP, a subset of that comprehensive gene panel is more likely to be used if the main application is cell-of-origin assignment. For that reason, the cost for digital GEP with a 20-gene code set was also calculated: $419.49 per case (minimum: $320.89; maximum: $457.39).
The total cost to analyze a typical DLBCL, follicular lymphoma, or Hodgkin lymphoma case was calculated using the per-case assay cost estimates (Table III). As depicted in Figure 1, all eligible cases undergo specimen collection, preparation, and quality assessment at a mean cost of $146.30 per case. In addition, a typical DLBCL case undergoing digital GEP with an 807-gene code set, FISH, and targeted capture sequencing is estimated to cost $2,670.41 per case. If GEP with a 20-gene code set were used instead, the cost declines to $2,191.55 per case. A typical follicular lymphoma case undergoing only targeted capture sequencing and bioinformatics analysis is estimated to cost $1,175.46 per case, and a typical Hodgkin lymphoma case undergoing digital GEP with an 807-gene code set is estimated to cost $1,044.65 per case ($565.79 per case using a 20-gene code set).

DISCUSSION
Faced with priority-setting decisions between various treatment and therapeutic options, decision-makers in cancer control look to clinical, cost-effectiveness, and budget-impact analyses to inform the evidence base from which such decisions and policies can be made 34-37. To the best of our knowledge, the present study is the first to report per-case cost estimates for 3 high-throughput genomic assays frequently used in cancer diagnostics: FISH, targeted capture sequencing, and digital GEP. Used in economic decision models and budget-impact analyses, those cost estimates can help to determine the cost-effectiveness of the associated technologies and can inform decisions related to their implementation in clinical practice. The costing model developed in this study provides a platform from which future cost analyses of other genomic technologies, or of other settings, can be built.
The variability in per-case cost estimates for each assay is not surprising given the variability in the proportion of the cost contributed by each resource category to the performance of each SOP. Targeted capture sequencing and digital GEP are examples of assays that process cases in batches, with a relatively lower reliance on labour than is required for FISH, whose case processing is much more individualized. Supplies and reagents accounted for most of the per-case cost for FISH, likely because multiple break-apart probes are used to detect MYC, BCL2, and BCL6 rearrangements in every DLBCL case. The capital and equipment category was a major contributor to the costs of GEP and targeted capture sequencing, which could in part be attributable to the conservative estimate of "useful lifetime" used in the depreciation calculations. External validation of the cost estimates derived in this study is challenging, given that estimates for the relevant technologies are not generally available from traditional costing resources such as administrative datasets and reimbursement systems 12,38. That being said, the BC Medical Services Plan laboratory fee schedule includes a reimbursement rate of $466.46 for complex cytogenetic analysis (FISH, fee item 93050), with additional billing codes applied if multiple probes are used on the same sample (that is, fee item P93051 at $192.68, up to a maximum of 3 times) 38. In one cost-effectiveness study 39 that evaluated the use of digital GEP to determine tumour-site origin, a per-case cost of US$4,400 (CA$4,338) was used for tissue-of-origin GEP with a 2000-gene code set. Other institutional price lists suggest that the cost of GEP with a gene code set of 700 or more genes lies somewhere between US$315 per sample (CA$369) 40 and CA$500 per assay for catalogued code sets 41.
Some references to targeted capture sequencing suggest that the cost of a large disease-targeted multigene sequencing test ranges as high as US$2,000-US$10,000 (CA$2,120-CA$10,600) 42, with one study using a base-case estimate of US$2,400 (CA$2,544) for a 34-gene sequencing panel test 43. That variability in cost estimates highlights the importance of transparency in the calculation of estimates so as to ensure their representativeness for the setting or disease group of interest, an aspect of micro-costing that we have aimed to achieve in the present study.
A number of limitations of our study should be noted. The setting for the study was one research laboratory situated within an established cancer centre in British Columbia. The nature of the setting has implications for the applicability of the resulting estimates outside a research-based setting. It is reasonable to expect that per-case costs could change once the assays are moved into clinical practice and become part of a regulated, provincially funded system. Specifically, additional costs would have to be considered for items related to data storage, additional quality control or assurance protocols, expanded informatics support, and laboratory management and oversight 7. Furthermore, compared with the research-grade reagents used in the present study, clinical-grade reagents are likely to come at an increased cost.
One of the strengths of a direct observational approach is that, compared with self-reported estimates, the data collected tend to be more precise. However, a direct observer approach also has a number of limitations, including the cost associated with training staff (acting as observers) to reliably collect information during a continuous observation period. For that reason, other forms of data collection such as activity logs tend to be used more frequently in practice 13 . One of the unintended consequences of using a direct observational approach could be the tendency for the performance of an activity to be altered to seem more favourable to the observer. To the extent possible, mitigation of that effect was attempted through the collection of multiple observations of each sop over an extended period of time, consultation with laboratory managers to ensure validity, and where possible, comparison of the results with available estimates from the broader grey literature. Estimates of time for each assay were based on observations from an individual staff member who was responsible for the procedure and experienced in its execution. The efficiency with which cases were processed through each sop could differ in other laboratory settings. It is also important to note that the time estimates reported here exclude "fixed" times, and hence the total duration of each sop (that is, the sum of the fixed and hands-on times) is longer in practice than is expressed here.
The small sample sizes observed for each SOP (Table II) also somewhat limit the level of analysis that could be conducted for the study, including calculation of the standard deviations for some assays. Future research studies might consider expanding the study criteria or increasing the observation period to increase sample size.
For some of the capital resources and equipment, "useful lifetime" is not well established, because the technologies are relatively new and continuously changing. For those items, the depreciation calculations used a conservative estimate reflecting the length of the service contracts, a choice that could have overstated the "true" annual cost of the items. A further limitation is that the cost estimates are highly sensitive to changes in sample throughput.

CONCLUSIONS
The demand for high-throughput genomic assays, including cytogenetic analyses (for example, FISH), digital GEP, and targeted capture sequencing, for application to cancer diagnostics and personalization of cancer-specific interventions has risen dramatically over time. Pressure to apply such technologies outside the traditional research settings and in standard clinical practice is increasing. The challenge for health economists and funders of the technologies at the health care system level is to understand the economic impact and to allocate resources wisely. Our study provides a set of reliable cost estimates that can be used to inform economic evaluations of personalized approaches to cancer care. Future work can continue to build on this approach and to focus on the generalizability of the results to other cancer sites and populations.