Towards a more precise and individualized assessment of breast cancer risk

Many clinically based models are available for breast cancer risk assessment; however, these models are not particularly useful at the individual level, despite being designed with that intent. There is, therefore, a significant need for improved, precise individualized risk assessment. In this Research Perspective, we highlight commonly used clinical risk assessment models and recent scientific advances to individualize risk assessment using precision biomarkers. Genome-wide association studies have identified >100 single nucleotide polymorphisms (SNPs) associated with breast cancer risk, and polygenic risk scores (PRS) have been developed by several groups using this information. The ability of a PRS to improve risk assessment is promising; however, validation in both genetically and ethnically diverse populations is needed. Additionally, novel classes of biomarkers, such as microRNAs, may capture clinically relevant information based on epigenetic regulation of gene expression. Our group has recently identified a circulating-microRNA signature predictive of long-term breast cancer risk in a prospective cohort of high-risk women. While progress has been made, the importance of accurate risk assessment cannot be overstated. Precision risk assessment will identify those women at greatest risk of developing breast cancer, thus avoiding overtreatment of women at average risk and identifying the most appropriate candidates for chemoprevention or surgical prevention.

both the United States Preventive Services Task Force and the American Cancer Society [7,8]. Women at moderate risk can begin annual screening earlier and should consider FDA-approved chemoprevention, such as tamoxifen, raloxifene or aromatase inhibitors [9]. Women at highest risk are candidates for aggressive screening (e.g., with breast MRI) or surgical prevention [10-13].

Limitations of current risk assessment models frequently used in the clinic
A number of models are available for estimating individual breast cancer risk based on clinical factors such as family history, reproductive profile, history of prior breast biopsy, and breast density (Table 1). The most commonly used clinical models are the Gail [14,15], the Claus [16], and the International Breast Cancer Intervention Study (IBIS) models [17]. For an excellent and comprehensive discussion of all available clinical models, including hereditary models, see the 2017 Cintolo-Gonzalez review [18,19]. The Gail model uses reproductive and biopsy information but only a limited family history (mother or sister with breast cancer) to calculate risk. This model is validated and classifies subsequent breast cancer cases modestly well, with estimates of the area under the receiver-operating characteristic curve (AUC) of 0.45-0.74 [15,20-22]. For risk calculations see https://bcrisktool.cancer.gov. The Claus model uses first- and second-degree family history to calculate risk but does not consider additional family history or other risk factors (such as hormonal factors or biopsy history). This model has an estimated AUC of 0.72 [20]. For risk calculations see CancerGene (https://cagene.com/) [23]. The IBIS model uses reproductive history, biopsy history, family history and body mass index (BMI). The IBIS model also includes a more extensive assessment of family history, characterizing breast cancers in both first- and second-degree relatives and the age at which they were diagnosed. The AUC of the IBIS model ranges from 0.54 to 0.76, depending on the population assessed [20,22,24-28]. For risk calculations see http://www.emstrials.org/riskevaluator/. See Table 1 for a more complete review of factors included in each model and the discriminatory accuracy in both general and high-risk populations.
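As a reading aid for the AUC values quoted for these models, the sketch below computes an AUC from its rank interpretation: the probability that a randomly chosen woman who developed breast cancer was assigned a higher risk estimate than a randomly chosen woman who did not, with ties counted as one half. This is a minimal illustration in Python, not the procedure used in the cited validation studies; the function name `auc` and all risk values are hypothetical and invented for illustration only.

```python
from itertools import product


def auc(case_scores, control_scores):
    """Return the probability that a random case outscores a random control.

    Ties contribute 0.5, so a model that assigns everyone the same
    score has an AUC of exactly 0.5 (no better than chance).
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Compare every case with every control (all case-control pairs).
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical 5-year risk estimates (percent); NOT data from any cited study.
cases = [2.1, 3.4, 1.2, 4.0]          # women who later developed breast cancer
controls = [1.0, 1.7, 2.5, 1.2, 0.9]  # women who remained cancer-free

print(f"AUC = {auc(cases, controls):.3f}")
```

An AUC near 0.5 means the risk estimates rank cases above non-cases only about as often as a coin flip would, whereas an AUC of 1.0 means every case was ranked above every non-case.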
Newer clinical models, such as the Breast Cancer Surveillance Consortium (BCSC) model and the updated version of the IBIS model (version 8), have incorporated mammographic density (MD) into the assessment of risk. Mammographic density is a strong, independent risk factor for breast cancer development, with studies showing a 4-6-fold increased risk of breast cancer for women in the highest breast density category compared with women in the lowest breast density category [29-38]. The BCSC model also incorporates reproductive factors, first-degree family history, and, more recently, biopsy history in its set of predictors [39,40]. This model is validated and classifies breast cancer incidence with an AUC of 0.67 [39,41]. The accuracy of the latest version of the IBIS model has not been assessed.
Given that an AUC of 0.5 indicates that the test (or model in this case) performs no better than chance, the fact that none of the above models has an AUC greater than 0.76 leaves room for improvement [22,42,43]. There is, therefore, a significant need for more precise risk assessment. Recent advances in genetics have improved our ability to assess risk at the individual level. Genome-wide association studies have identified >100 single nucleotide polymorphisms (SNPs) associated with breast cancer risk [44-47], and polygenic risk scores (PRS) have been developed by several groups using this information [48,49]. Case-control studies have evaluated the ability of PRS to categorize risk, reporting AUCs ranging from 0.59 to 0.65 [50-52]. However, risk associated with any of the developed polygenic risk scores needs to be interpreted with caution, as their predictive capacity has not been validated outside of the populations in which they were developed. As seen with genetic testing, this may limit generalizability [53]. Several groups have examined whether use of PRS improves the accuracy of currently available clinical models and demonstrated AUCs between 0.62 and 0.70 [41,54-59]. The ability of PRS to improve current clinical models is under prospective evaluation in the WISDOM trial [60,61]. Given that many of the SNPs included in polygenic risk scores are likely associated with hereditary risk, caution should be used when adding genetic factors to family history-based models without accounting for joint influences on model fit. The ability of a PRS to improve risk assessment is promising; however, utility in genetically and ethnically diverse populations must be studied.
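A polygenic risk score is commonly constructed as a weighted sum of risk-allele counts, with each SNP weighted by its per-allele log odds ratio. The sketch below illustrates that additive form only. It is an assumption made here for illustration; the scores cited above may differ in SNP selection, weighting, and calibration. The SNP identifiers, odds ratios, and the function name `polygenic_risk_score` are hypothetical.

```python
import math


def polygenic_risk_score(dosages, log_odds_ratios):
    """Sum per-allele log odds ratios weighted by risk-allele dosage (0, 1 or 2)."""
    missing = set(log_odds_ratios) - set(dosages)
    if missing:
        raise KeyError(f"no genotype for SNPs: {sorted(missing)}")
    score = 0.0
    for snp, beta in log_odds_ratios.items():
        dosage = dosages[snp]
        if dosage not in (0, 1, 2):
            raise ValueError(f"dosage for {snp} must be 0, 1 or 2")
        score += beta * dosage
    return score


# Hypothetical per-allele odds ratios; NOT taken from any published score.
weights = {
    "snp_a": math.log(1.10),
    "snp_b": math.log(1.05),
    "snp_c": math.log(0.95),  # a protective allele has a negative weight
}

# Hypothetical genotype: number of effect alleles carried at each SNP.
woman = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(woman, weights)
# Exponentiating gives odds relative to carrying no effect alleles at these SNPs.
relative_odds = math.exp(score)
print(f"PRS = {score:.4f}, relative odds = {relative_odds:.4f}")
```

Because the score simply adds SNP effects, it treats them as independent of one another and of any other predictor, which is one reason combining it with family history-based models requires the care noted above.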

In breast cancer patients, the presence of miRNA in circulation correlates with expression of that miRNA in primary breast tumors [80-82]. Additionally, significant differences in specific C-miRNA have been found between cancer patients and healthy controls [65,83-86], suggesting potential clinical utility for cancer detection [64,81,82,87-93]. For cancer risk assessment, a biomarker must predict disease status with acceptable specificity and sensitivity [94]. To date, only a handful of studies have evaluated the utility of C-miRNA in cancer risk assessment. For example, several studies have evaluated miRNAs associated with risk for colon cancer and identified miRNAs associated with a preneoplastic colon lesion [95-99]. An independent and larger study identified a panel of 3 C-miRNAs as a promising colon cancer risk biomarker [100]. Other studies have discovered a number of miRNAs dysregulated in women <18 months from a breast cancer diagnosis, consistent with early detection [101-103]. Taken together, these data suggest that it is feasible that C-miRNAs can provide a signature of breast cancer risk with actionable lead-time for prevention.
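To make the sensitivity and specificity requirement concrete, the sketch below computes both quantities for a biomarker score dichotomized at a chosen cutoff. It is a minimal illustration in Python: the function name, the marker values, the cutoff, and the outcomes are all hypothetical and are not drawn from any of the studies cited here.

```python
def sensitivity_specificity(scores, has_disease, threshold):
    """Classify score >= threshold as test-positive and return (sensitivity, specificity)."""
    if len(scores) != len(has_disease):
        raise ValueError("scores and has_disease must have the same length")
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_disease):
        positive = score >= threshold
        if diseased and positive:
            tp += 1  # true positive: diseased and flagged
        elif diseased:
            fn += 1  # false negative: diseased but missed
        elif positive:
            fp += 1  # false positive: healthy but flagged
        else:
            tn += 1  # true negative: healthy and not flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one healthy subject")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical biomarker scores and outcomes; NOT data from any cited study.
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.6, 0.1]
status = [True, True, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(scores, status, threshold=0.5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the cutoff generally trades sensitivity for specificity, and the ROC curve summarized by the AUC traces that trade-off across all possible cutoffs.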
Our group recently identified a C-miRNA-based risk signature predictive of long-term risk in a prospective cohort of women at increased risk for developing breast cancer. This IRB-approved prospective cohort includes over 600 high-risk women (all of whom have signed informed consent) with a median follow-up of 8.9 years. From this cohort we selected 24 invasive breast cancer cases, to whom we matched controls on age, reason for high-risk status (e.g., strong family history of breast cancer or benign breast disease), and follow-up time. The median age at blood draw was 55.4 years (range 33.9-77.5) for affected cases and 55.1 years (range 32.8-78.4) for cancer-free controls (see Table 1: Subject characteristics in Oncotarget [104] for complete cohort clinical characteristics). RNA was isolated from banked serum and profiled for over 2500 mature human miRNAs. The full Affymetrix GeneChip miRNA v4 (miRBase v20) microarray expression dataset is freely available in GEO Datasets (GSE98181, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE98181), and the R scripts used for data analysis accompany our open access 2017 Oncotarget manuscript as a supplement [104]. We identified 25 C-miRNAs that were significantly differentially expressed between cases and controls. From these 25 miRNAs, we discovered a group of 6 C-miRNAs that together discriminated cases from non-cases with high accuracy (AUC=0.896) (Figure 1). For the women who developed cancer in this cohort, blood had been banked a median of 3.2 years (range 0.6-8.7) prior to diagnosis, making this clearly a signature associated with risk and not early detection [104]. Refinement and validation of this risk signature are ongoing, using banked samples from previously performed randomized clinical trials. The validation of a sensitive and specific, non-invasive C-miRNA risk assessment tool will arm clinicians with vastly improved individualized risk estimates for patients, relevant to both young and older women.
These risk estimates can be used to guide selection of the most appropriate screening and prevention options for a given individual. Information from miRNA expression will also provide valuable insight into the underlying biology of breast cancer initiation and may provide targets for chemoprevention.
Personalized and precise risk assessment can identify those women at greatest risk of developing breast cancer, thus avoiding overtreatment of women at lower/average risk and identifying women at high risk who would be candidates for high-risk screening, chemoprevention or surgical prevention. Progress has been made towards personalized risk assessment, and some promising new markers have been identified. However, rigorous validation of the most promising markers, and of the predictive models they contribute to, in relevant populations is necessary before deployment for clinical use.