Clinical Utility of Peripheral Blood Gene Expression Profiling of Kidney Transplant Recipients to Assess the Need for Surveillance Biopsies in Subjects with Stable Renal Function

Background: TruGraf is a blood test that measures gene expression signatures in kidney transplant recipients, providing information on adequacy of immunosuppression. Signatures derived from peripheral blood using DNA microarrays have been internally and externally validated in two populations of transplant recipients: (i) patients designated as TX (“Transplant eXcellence”: stable serum creatinine and normal biopsy, indicative of immune quiescence), and (ii) patients designated as not-TX (renal dysfunction and/or histological abnormalities). The test is intended for use in subjects with stable renal function as an alternative to protocol biopsies. Methodology: Simultaneous blood tests and transplant biopsies were performed in 169 patients. The molecular laboratory was blinded to renal function and biopsy results. Results: Biopsy-confirmed clinical phenotype was TX in 105 cases and not-TX in 64. Renal function was stable in 125 subjects (105 TX, 20 not-TX). The positive predictive value of TruGraf for detecting TX was 86%, and 105/125 (84%) had a normal biopsy result. Significance of study: In subjects with stable renal function, a TruGraf blood test result of TX corresponded to biopsy findings in 88% of cases. Results indicate that had the blood test been run in place of surveillance biopsies, 107/125 (86%) of patients with stable renal function may have avoided an invasive biopsy, and 92/105 (88%) of those with biopsy-confirmed TX may have avoided a biopsy for a negative result.


Introduction
In 2015, 18,587 Americans received a kidney transplant, 60% of whom were Medicare patients [1]. The number of Americans living with and depending upon a functional kidney transplant is also rising. In 2015, there were over 200,000 living kidney recipients in the US, an increase of >3% per year since 2012 [2]. Results of kidney transplantation from the Scientific Registry of Transplant Recipients (SRTR) 2015 Annual Report indicate that short-term outcomes of kidney transplant patients have improved considerably due to an improved understanding of the immune system's role in transplant rejection, as well as better management of immunosuppression. However, 10 years after transplantation, only 47% of deceased donor transplants and 63% of living donor transplants are still functioning [1]. As a result, 13.2% of transplants every year are re-transplants, with the unfortunate "side effect" that re-transplantation of some may deny the opportunity of ever receiving a transplant to others on the waitlist [1]. The US kidney transplant wait list currently contains more than 100,000 candidates, many of whom will die having never undergone a transplant [3]. Currently, the median waiting list time for a kidney transplant is 3.6 years [4].
A key reason why long-term graft loss remains a significant problem is that kidney injury leading to irreversible damage and eventual graft loss (i.e., subclinical immune injury leading to chronic rejection) is most often asymptomatic for weeks to months prior to detection. Following kidney transplantation, patients require a lifetime of immunosuppressive drug therapy to prevent their immune system from rejecting the kidney. There are significant challenges to detecting injury early, when the kidney has the greatest chance of regaining normal function. The standard of care for monitoring and detecting kidney injury includes measuring serum creatinine levels and immunosuppressive drug levels, and performing surveillance graft biopsies [5,6]. Serum creatinine is an insensitive and lagging indicator of tissue injury and a poor marker of the underlying severity of graft pathology. Drug levels may indicate potential toxicity, but are poor predictors of subclinical rejection and kidney damage [7]. Biopsies are expensive and invasive, carrying risks of infection, bleeding and even graft loss, so they are unsuited for frequent monitoring; moreover, significant intra-observer variation in interpretation of biopsy results exists [8].
Currently, there is no validated test to measure or monitor the adequacy of immunosuppression, the failure of which may result in over-immunosuppression and opportunistic infections, or under-immunosuppression leading to subclinical, acute and chronic rejection [9]. Subclinical rejection (subAR) is histologically defined as acute rejection characterized by mononuclear cell infiltration identified from a biopsy specimen, but without renal dysfunction (stable serum creatinine level) [5,6,10]. SubAR can therefore, by definition, only be diagnosed on surveillance biopsies taken as per the individual center's protocol at a fixed time after transplantation, rather than being driven by clinical indication [5,6]. In contrast, clinical rejection (cAR) is characterized by acute functional renal impairment [6], and is therefore typically diagnosed with a for-cause biopsy. Detecting subAR with serial and non-invasive molecular biomarkers has become a high-priority objective of transplant medicine to prevent undetected subAR that can lead to chronic rejection and transplant failure [5,6,9]. Recent reviews have highlighted that biomarkers that correlate with and/or predict allograft injury and improve therapeutic decision making are priorities in transplantation, while underscoring the need for robust multicenter validation studies [11-13].

Journal of Transplantation Technologies & Research
The TruGraf blood test (Transplant Genomics Inc, Mansfield, MA) is a Laboratory Developed Test (LDT) performed as a service available exclusively through the Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory at Transplant Genomics Inc. TruGraf relies on a specific gene expression signature in the peripheral blood to enable proactive non-invasive serial monitoring [14,15].
We have discovered and validated signatures derived from the peripheral blood of two populations of patients: 1) Patients following kidney transplantation with: (i) stable renal function, defined as a serum creatinine <2.3 mg/dL and <20% increase in serum creatinine compared to the average of 3 prior creatinine levels; and/or (ii) surveillance biopsies that revealed no evidence of histologic rejection. These patients were designated as TX.
2) Patients following kidney transplantation not meeting the strict criteria for TX. All patients in this group had either stable renal function (subclinical graft injury) or acute renal dysfunction (clinical graft injury) and underwent either surveillance or for-cause biopsies which confirmed the diagnoses. These patients were designated as not-TX.
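The stable renal function criterion stated above (serum creatinine <2.3 mg/dL and <20% increase relative to the average of 3 prior levels) can be expressed as a simple check. The following Python sketch is illustrative only: the function name, constant names and example values are hypothetical and are not part of the TruGraf workflow; only the two numerical cut-offs are taken from the text.

```python
from statistics import mean

# Cut-offs taken from the definition of stable renal function in the text.
CREATININE_CEILING_MG_DL = 2.3   # current creatinine must be below this
MAX_RELATIVE_INCREASE = 0.20     # rise over baseline must be below 20%


def has_stable_renal_function(current, prior_levels):
    """Return True if `current` serum creatinine (mg/dL) meets both criteria.

    `prior_levels` holds the 3 prior creatinine values (mg/dL); their
    average is used as the baseline. Hypothetical helper for illustration.
    """
    if len(prior_levels) != 3:
        raise ValueError("expected exactly 3 prior creatinine levels")
    baseline = mean(prior_levels)
    relative_increase = (current - baseline) / baseline
    return (current < CREATININE_CEILING_MG_DL
            and relative_increase < MAX_RELATIVE_INCREASE)
```

For example, with a baseline of 1.2 mg/dL, a current value of 1.3 mg/dL (about an 8% rise) would satisfy the criterion, whereas 1.5 mg/dL (a 25% rise) would not.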
A TruGraf blood test reported as "TX" in a kidney transplant recipient with stable renal function would allow physicians to identify, with high probability, patients who can be followed routinely, including with serial TruGraf monitoring, without the need for an invasive surveillance biopsy. In a patient with stable renal function, a TruGraf result of TX provides a noninvasive, blood-based assessment indicating a high probability of absence of graft rejection/injury, reflecting immune quiescence and adequate immunosuppression. A TX result reassures the clinician that continued monitoring is sufficient without the need for an invasive biopsy. This is particularly important because surveillance biopsies only yield a 15-20% rate of positivity; the remaining 80-85% return a negative result and are, in retrospect, unnecessary.
As part of our CLIA laboratory test validation efforts, we evaluated the analytical performance of the blood-based TruGraf gene expression assay used to assess the absence of graft rejection/injury and by inference, the adequacy of immunosuppression after kidney transplantation [15]. We have also performed an economic analysis of the cost effectiveness of molecular gene profiling in kidney transplant recipients [16].
This manuscript describes a retrospective data analysis of the clinical study that we performed to assess the potential ability of the TruGraf test to decrease the number of protocol/surveillance biopsies (the standard of care at most high-volume transplant centers) in kidney transplant recipients with stable renal function.

Methods
Patients enrolled in this study were treated at The Northwestern University (NU) Comprehensive Transplant Center (CTC) in Chicago, IL and five participating clinical centers as part of the Genomics for Kidney Transplantation Project (NIH 1U19AI063603-01). All studies were approved by the Institutional Review Boards of the respective institutions and carried out in accordance with the Helsinki Declaration of 1975. The NU CTC houses a large repository of samples from transplant recipients. All kidney transplant recipients at Northwestern University undergo surveillance biopsies at 3, 12 and 24 months post-transplantation or for-cause biopsies in response to renal dysfunction. All patients who undergo biopsies are approached to provide informed consent to enroll in the biorepository.
In addition to blood samples (two 2.5 ml PAXgene tubes), kidney biopsy cores were obtained for standard histology and to be stored in RNAlater (Thermo Fisher, Waltham, MA) for future molecular phenotyping. Biopsies were classified using Banff 2007 criteria [17]. For the current analysis, samples from 169 transplant recipients were randomly selected. All samples for gene expression were derived from recipients who had clinical and laboratory data available, as well as a histologically confirmed biopsy diagnosis. All patients who participated in this study were >18 years of age, and recipients of a primary or subsequent kidney transplant alone. Recipients of multiorgan or prior non-renal transplants and patients with HIV were excluded. Details of the gene expression profiling methodology have been described previously [14,16,18].
The original version of the TruGraf test utilized a classifier developed using the Support Vector Machines (SVM) algorithm to identify genes specific to each phenotype (TruGraf 1.0). In further analyzing the gene-specific data, we found that genes with highly variable expression were contributing to an increased noise level in the assay's performance. The classifier was therefore modified by using the Random Forests algorithm to select component genes, which enabled in-depth interrogation of each gene's weighting and contribution to the assay's performance (TruGraf 1.1). This resulted in performance improvements in accuracy and positive predictive value (PPV) of the TX phenotype. We analyzed data from these samples using a classifier threshold of 0.6 to distinguish between TX and not-TX, and then subsequently reduced the threshold to 0.5 to provide enhanced sensitivity for TX at the expense of specificity. No classifier is perfect, and the error profile can be manipulated based on trade-offs between sensitivity for the positive result (i.e., TX) and specificity. In light of the intended use for this assay, as described below, we concluded that it would be preferable to bias the classifier towards somewhat over-calling TX rather than risk over-calling not-TX. The 120 gene expression markers used by the TruGraf 1.1 classifier are listed in Table 1.
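The effect of lowering the classifier threshold from 0.6 to 0.5 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the classifier emits a score between 0 and 1 and that a sample is called TX when its score is at or above the threshold; the text does not specify the score's form or direction, and the function name and example scores below are hypothetical and are not study data.

```python
def call_phenotype(tx_score, threshold=0.5):
    """Label a sample 'TX' when its score reaches the threshold.

    Assumption (not stated in the text): higher scores favour TX.
    """
    return "TX" if tx_score >= threshold else "not-TX"


# Hypothetical scores chosen only to show the effect of the threshold.
scores = [0.35, 0.52, 0.58, 0.71]

calls_at_06 = [call_phenotype(s, threshold=0.6) for s in scores]
calls_at_05 = [call_phenotype(s, threshold=0.5) for s in scores]

print(calls_at_06)  # ['not-TX', 'not-TX', 'not-TX', 'TX']
print(calls_at_05)  # ['not-TX', 'TX', 'TX', 'TX']
```

Under this assumption, the two borderline samples (0.52 and 0.58) switch from not-TX to TX when the threshold is lowered, which is the sense in which the lower threshold favours over-calling TX: more true TX cases are captured, at the cost of more not-TX cases being labeled TX.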
The intended use of the test is the assessment of adequacy of immunosuppression in subjects with stable renal function, as a possible alternative to protocol biopsies, and possibly to help guide TruGraf-informed for-cause biopsies. The results of the TruGraf test were compared to the histologic diagnosis in the 125 subjects with stable renal function. For this purpose, a TruGraf blood test result of TX was designated as the positive result. Performance metrics of the TruGraf test were based on the comparisons between the blood test result and the biopsy diagnosis in these 125 subjects. The molecular laboratory was blinded to the results of the biopsy histology.

Discussion and Conclusion
In our study, the PPV of the TruGraf test to detect TX was 86% and the NPV was 28%, with an accuracy of 78%. This compares well with other molecular diagnostic tests becoming available for use in transplant recipients. The AlloSure test (CareDx, Inc, Brisbane, CA) measures circulating donor-derived cell-free DNA (dd-cfDNA) in transplant recipients. Studies have been reported in both heart [19] and kidney [20] transplant patients.
In kidney transplant recipients suspected of rejecting, dd-cfDNA had a sensitivity of 59% (95% CI, 44-74%) to discriminate active rejection (a term created to describe patients with >1% dd-cfDNA, albeit with clinical meaning yet to be determined), from no rejection with a PPV of 84% and NPV of 61% [20].
TruGraf is a qualitative, "rule in/rule out" assay. A TruGraf blood test reported as "TX" in a kidney transplant recipient would allow physicians to identify, with high probability, patients who can be followed routinely, including with serial TruGraf monitoring, without the need for an invasive surveillance biopsy. In addition, when reducing immunosuppression in the normal course of following a patient post-transplant, a signature of "TX" may reassure the clinician that the lower level of immunosuppression is adequate without having to perform a protocol biopsy. Conversely, a signature of "not-TX", whether obtained in the process of monitoring a patient with stable renal function or following a reduction in immunosuppression, might prompt the clinician to monitor the patient more closely, perhaps to reverse the reduction in immunosuppression, and if indicated, to perform a biopsy.
In the particular cohort studied here, had the TruGraf blood test been run in the place of surveillance biopsies, 92/105 (88%) of patients with stable renal function and biopsy-confirmed TX would have avoided undergoing a biopsy for a negative result. In addition, 5/20 (25%) of patients with biopsy-confirmed not-TX would have been picked up with a blood test, providing additional decision support for performing a surveillance biopsy in spite of stable renal function.
The implications of incorrect TruGraf results also need to be considered. A false positive TruGraf test result (TX blood profile but not-TX clinical phenotype) would have resulted in missing a case that would presumably have been diagnosed as subAR on biopsy in 15/125 (12%) of patients. Given that the TruGraf test is designed to be run serially, these likely subARs would have ample opportunity to be picked up in the course of serial monitoring. In the case of a false negative (not-TX blood profile but a TX clinical phenotype), the TruGraf test result may have led to a TruGraf-prompted biopsy in 13/125 (10%) of patients.
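The performance figures quoted in this section follow directly from the four counts reported above for the 125 subjects with stable renal function, with TX as the positive result: 92 true positives, 15 false positives, 13 false negatives and 5 true negatives. The short Python sketch below recomputes them using the standard definitions; the function name is hypothetical and the code is offered only as an arithmetic check, not as part of the study's analysis pipeline.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard 2x2 diagnostic metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # TX biopsies called TX by the test
        "specificity": tn / (tn + fp),   # not-TX biopsies called not-TX
        "ppv": tp / (tp + fp),           # TX calls confirmed by biopsy
        "npv": tn / (tn + fn),           # not-TX calls confirmed by biopsy
        "accuracy": (tp + tn) / total,   # all concordant results
    }


# Counts taken from the text (TX is the positive result).
metrics = diagnostic_metrics(tp=92, fp=15, fn=13, tn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these counts the sketch gives a sensitivity of 88% (92/105), specificity of 25% (5/20), PPV of 86% (92/107), NPV of 28% (5/18) and accuracy of 78% (97/125), matching the values reported in the text.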
These data imply substantial clinical utility of the TruGraf test in supporting physician decisions around kidney transplant recipient health management and the ability to offer personalized immunosuppression and reduce surveillance biopsies. Currently, large-scale prospective studies are being initiated to further test the utility of TruGraf. The current study has demonstrated that peripheral blood gene expression profiling may provide an effective, reliable and noninvasive method for assessing adequacy of immunosuppression in kidney transplant recipients with stable renal function. Serial blood profile monitoring is less invasive than kidney biopsy, can be done more frequently, and results in a significant cost saving to the health delivery system. It also has the potential to guide the clinician as to when to do a biopsy in a patient with normal renal function, and may eventually be a tool for replacing surveillance (protocol) biopsies. At transplant sites that currently perform routine surveillance biopsies, the routine use of TruGraf monitoring would reduce the overall number of biopsies and significantly lower the number of negative biopsies. At sites that do not currently use routine surveillance biopsies, TruGraf would greatly increase detection of subAR following TruGraf-prompted biopsies and result in a modest number of negative TruGraf-prompted biopsies, far lower than the rate of negative biopsies prompted only by protocol.
The results of the TruGraf test need to be considered in the context of all of the other clinical information available to help support a decision on whether or not to perform a protocol biopsy in a patient who appears to be clinically well. The universal use of TruGraf monitoring would also allow for a standardized approach at all transplant sites, in contrast to the current variation in how sites confirm immune quiescence or perform biopsies to detect suspected subAR.