Development and Validation of Prediction Model for High Ovarian Response in In Vitro Fertilization-Embryo Transfer: A Longitudinal Study

Objective To develop and validate a prediction model for high ovarian response in in vitro fertilization-embryo transfer (IVF-ET) cycles. Methods In total, 480 eligible outpatients with infertility who underwent IVF-ET were selected and randomly divided into a training set for developing the prediction model and a testing set for validating it. Univariate and multivariate logistic regressions were carried out to explore the predictive factors of high ovarian response, and the prediction model was then constructed. A nomogram was plotted to visualize the model. The area under the receiver-operating characteristic (ROC) curve, the Hosmer-Lemeshow test, and the calibration curve were used to evaluate the performance of the prediction model. Results Antral follicle count (AFC), anti-Müllerian hormone (AMH) at menstrual cycle day 3 (MC3), and progesterone (P) level on the human chorionic gonadotropin (HCG) day were identified as independent predictors of high ovarian response. The area under the curve (AUC) of the multivariate model reached 0.958 (95% CI: 0.936-0.981), with a sensitivity of 0.916 (95% CI: 0.863-0.953) and a specificity of 0.911 (95% CI: 0.858-0.949), suggesting good discrimination. The Hosmer-Lemeshow test and the calibration curve both suggested good calibration of the model. Conclusion The developed prediction model showed good discrimination and accuracy on internal validation and could help clinicians efficiently identify patients with high ovarian response, thereby improving pregnancy rates and clinical outcomes in IVF-ET cycles. However, the conclusion needs to be confirmed by further studies.


Introduction
With the rapid development of assisted reproductive technology (ART), in vitro fertilization-embryo transfer (IVF-ET) has become an important treatment for infertility [1]. Controlled ovarian hyperstimulation (COH) is a key step of IVF-ET, in which gonadotropin (Gn) stimulates the development of multiple follicles and produces multiple mature oocytes, thereby improving pregnancy rates [2,3]. However, an excessive ovarian reaction to Gn can increase the risk of an iatrogenic complication, ovarian hyperstimulation syndrome (OHSS) [4], which is characterized by an increase in ovarian volume and can be severe or even life-threatening. Therefore, it remains necessary to identify the risk of OHSS in patients undergoing IVF-ET.
High ovarian response, defined as an excessive ovarian response to stimulation, is reported as an adverse effect of IVF-ET [5]. It is mainly due to changes in the systemic stress state caused by the recruitment and development of multiple follicles and by abnormally high steroid levels [5]. Hormone levels are higher in patients with high ovarian response, which is not conducive to endometrial receptivity and embryo implantation and increases the incidence of OHSS [6]. In view of this, it is of great clinical significance to find predictors of high ovarian response, which may decrease the risk of OHSS, improve pregnancy rates, and optimize pregnancy outcomes.
Previous studies verified that anti-Müllerian hormone (AMH) and antral follicle count (AFC) were effective predictors of high ovarian response [7-12]. In the study of Oehninger et al., female age and follicle-stimulating hormone (FSH) were also found to be predictors of high ovarian response [13]. To our knowledge, most studies have focused on the predictive factors of poor ovarian response, and few have examined predictive factors of high ovarian response. Moreover, a complete predictive system for high ovarian response has not yet been established; existing research covers only individual indicators, which are not accurate enough on their own for predicting high ovarian response. Hence, this study aimed to explore the independent predictors of high ovarian response in patients undergoing IVF-ET, to develop a model for predicting the risk of high ovarian response, and to perform internal validation. We believe that the results will provide better guidance for the clinical choice of a reasonable COH protocol, thereby improving pregnancy rates and clinical outcomes in IVF-ET cycles.

Collection of the Data.
A total of 1,142 consecutive outpatients with infertility who underwent IVF-ET at the Affiliated Renhe Hospital of China Three Gorges University from January 2018 to December 2019 were considered for this longitudinal study. The patients were divided into a high ovarian response group (>15 oocytes retrieved) and a normal ovarian response group (4-15 oocytes retrieved) [9,14]. The inclusion criteria were as follows: (1) patients aged 20-40 years with normal menstrual cycles (21-35 days); (2) patients undergoing IVF-ET; (3) patients with complete clinical data. Patients were excluded if they met any of the following criteria: (1) diagnosis of polycystic ovary syndrome (PCOS) according to the Rotterdam criteria; (2) primary ovarian insufficiency; (3) previous ovarian-related surgery (such as laparoscopic ovarian drilling, ovarian dissection for endometriosis, or unilateral oophorectomy); (4) pretreatment with hormonal contraceptives before the study cycle; (5) use of other investigational drugs or participation in other clinical studies within 1 month before study enrolment; (6) being considered by the physician as inappropriate to participate in the clinical investigation.
The study was approved by the Ethics Committee of Affiliated Renhe Hospital of China Three Gorges University with the number of 2020K05. Signed informed consent was obtained from all patients before enrollment.
Gonadotropin-releasing hormone (GnRH) agonist was subcutaneously injected on the 21st day of menstruation to downregulate the function of the pituitary gland. After 14 days, COH was initiated with the appropriate amount of Gn (150-300 U/d) according to age, body mass index (BMI), and baseline FSH. The development of follicles was monitored by ultrasound. When 1-2 follicles were ≥18 mm in diameter, or 2-3 were ≥17 mm in diameter, 10,000 U of human chorionic gonadotropin (HCG) was injected. Oocytes were retrieved 36 h after HCG injection, and embryo transfer was performed 3-5 days after retrieval, after which 80 mg of progesterone (P) was intramuscularly injected daily as luteal support. A positive serum HCG pregnancy test after 14 days was defined as biochemical pregnancy, and ultrasound confirmation of a gestational sac or heartbeat (fetal pole) 35 days after transfer was diagnosed as clinical pregnancy.
Analytic Methods.
Baseline characteristics of patients were collected in our study. Categorical variables included smoking history, type of infertility, and pregnancy history. Continuous variables included age, BMI, age at menarche, mean menstrual cycle, and duration of infertility; AFC, endometrial thickness, luteinizing hormone (LH) level, estradiol (E2) level, P level, FSH level, and AMH level on menstrual cycle day 3 (MC3); dosing days, initial dose, and total dose of Gn; and endometrial thickness and hormone levels on the day of HCG injection. AFC was assessed by transvaginal sonography at MC3; endometrial thickness was observed on the day of HCG injection; LH was defined as a hormone secreted by basophils in the anterior pituitary gland; E2 was a steroidal estrogen with normal values of 94-433 pmol/L in the follicular phase, 499-1580 pmol/L in the luteal phase, and 704-1580 pmol/L during ovulation; P was defined as the main biologically active progestogen secreted by the ovary; FSH was a hormone secreted by basophils in the anterior pituitary gland that promoted follicle maturation; AMH was defined as a hormone secreted by the preantral and small antral follicles of the ovary. In addition, dosing days, initial dose, and total dose of Gn, endometrial thickness, and hormone levels on the day of HCG injection were collected for analyzing the predictive factors of high ovarian response.
The measurement data were tested for normality by the Kolmogorov-Smirnov test. Normally distributed continuous variables were expressed as mean ± standard deviation (Mean ± SD), and Student's t test was used for comparison between groups; continuous variables with skewed distribution were expressed as median and quartiles [M (Q1, Q3)], and the Mann-Whitney U test was applied for comparison. Categorical variables were expressed as number of cases and constituent ratio [n (%)], and the Chi-square test and Fisher's exact test were used for comparison. Two-sided tests were performed for all statistical analyses, and P < 0.05 was considered statistically significant. All statistical analyses were performed using SAS 9.4 and R 4.0.2 (model validation and drawing) statistical software. Prediction models were constructed using SAS 9.4 (logistic model) and Python 3.7 [Broad Learning System (BLS) model] software.
In the present study, the population was randomly divided into training set for developing the prediction logistic model and BLS model, and testing set for validating the models at the ratio of 7 : 3. Univariate and multivariate logistic regressions were carried out to explore the predictive factors of high ovarian response, and then, the prediction model was constructed. Nomogram was plotted for visualizing the model. Area under the receiver-operating characteristic (ROC) curve, the Hosmer-Lemeshow test, and calibration curve were used to evaluate the performance of the prediction model. The Youden index was used to calculate the cutoff point, which was identified as the cutoff point at a high risk for high ovarian response. Then, the comparison of predictive power was carried out between the logistic model and BLS model for high ovarian response.
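The Youden-index step described above can be illustrated with a minimal sketch. The function name `youden_cutoff` and the toy labels and probabilities are hypothetical and are not the study data; the sketch only shows how a cutoff maximizing J = sensitivity + specificity - 1 could be located from predicted probabilities.

```python
def youden_cutoff(y_true, y_prob):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    A case is called positive when its predicted probability is >= cutoff.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = None
    # Every observed probability is a candidate cutoff.
    for cutoff in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= cutoff)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Toy example (made-up values): 6 normal responders, 5 high responders.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.55, 0.40, 0.60, 0.70, 0.85, 0.95]
cutoff, j, sens, spec = youden_cutoff(labels, probs)
print(f"cutoff={cutoff:.2f}, J={j:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In practice the candidate cutoffs would come from the ROC curve of the fitted model on the training set, and the selected value would then be applied unchanged to the testing set.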

Patient Characteristics.
Among the total 1,142 outpatients, those who were over 40 years old (n = 63), were pretreated with hormonal contraceptives before the study cycle (n = 54), had previous ovarian-related surgery (n = 47), had primary ovarian insufficiency (n = 28), were diagnosed with PCOS according to the Rotterdam criteria (n = 196), had low ovarian response (n = 264), or had missing information on age and BMI (n = 10) were excluded. Eventually, 480 eligible outpatients were enrolled, with 336 patients in the training set and 144 in the testing set. The patients were then divided into the high response group (HR group, n = 239) and the normal response group (NR group, n = 241). The mean age was 31.36 ± 3.79 years, and the mean age at menarche was 13.16 ± 1.22 years. Only 6 (1.25%) patients reported a history of active smoking, 75 (15.63%) had a history of passive smoking, and the remaining 399 (83.13%) had no smoking history. The mean menstrual cycle was 29.23 ± 2.04 days, and the median duration of infertility was 3.00 (1.00, 4.00) years. Table 1 gives an overview of the baseline characteristics of all patients. No significant differences were observed between the training set and the testing set in any variable (all P > 0.05). Table 2 suggested that age, mean menstrual cycle, AFC, P at MC3, FSH at MC3, AMH at MC3, initial dose of Gn, total dose of Gn, LH on HCG day, E2 on HCG day, and P level on HCG day in the high response group were significantly different from those in the normal response group (all P < 0.05); these variables were considered potential predictors and included in the multivariate analysis (Table 3).
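The patient-flow counts reported above can be cross-checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and the split is computed from the stated 7 : 3 ratio rather than from any actual randomization.

```python
screened = 1142

# Exclusion counts as reported in the text.
excluded = {
    "age over 40 years": 63,
    "hormonal contraceptive pretreatment": 54,
    "previous ovarian surgery": 47,
    "primary ovarian insufficiency": 28,
    "PCOS (Rotterdam criteria)": 196,
    "low ovarian response": 264,
    "missing age and BMI": 10,
}

enrolled = screened - sum(excluded.values())

# 7 : 3 split of the enrolled patients (integer arithmetic).
training = enrolled * 7 // 10
testing = enrolled - training

# Response groups as reported.
high_response, normal_response = 239, 241

print(enrolled, training, testing, high_response + normal_response)
```

The totals agree with the reported 480 enrolled patients, the 336/144 training/testing split, and the 239/241 group sizes.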

Development and Visualization of the Prediction Model.
To identify predictors for high ovarian response in IVF-ET, multivariate logistic regression was performed. The results indicated that AFC, AMH at MC3, and P level on HCG day were independently associated with high ovarian response. For each additional antral follicle counted, the odds of high ovarian response increased by 0.671-fold (odds ratio 1.671; 95% CI: 1.453-1.921, P < 0.001). For every 1 ng/mL increase in AMH at MC3, the odds of high response increased by 0.874-fold (95% CI: 1.404-  (Table 4). Then, the predicted risk of high ovarian response was calculated as follows: $\ln\left(P_{HR}/(1 - P_{HR})\right) = -11.094 + 0.513 \times \text{AFC} + 0.628 \times \text{AMH at MC3} + 0.668 \times \text{P on HCG day}$, where $P_{HR}$ represents the probability of high ovarian response. To visualize our model, a nomogram was plotted (Figure 1).

Computational and Mathematical Methods in Medicine

For example, we randomly chose a patient whose P on HCG day was 0.53 nmol/L, AMH at MC3 was 3.6 ng/mL, and AFC was 18. The total point was 134, and the predicted probability of high ovarian response was 0.682 (Figure 2), which was higher than the optimum cutoff point of 0.491 (Table 5) and indicated a high risk of high ovarian response. The calibration curve also confirmed the good calibration of our model (Figure 4).
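The fitted equation can also be evaluated directly, without reading the nomogram. The following minimal sketch uses the intercept, coefficients, and cutoff reported in the text; the function and variable names are illustrative. For the example patient it gives a probability of about 0.68, close to the nomogram reading of 0.682, and exponentiating the coefficients recovers the reported odds ratios to within rounding.

```python
import math

# Coefficients and cutoff as reported in the text.
INTERCEPT = -11.094
COEF = {"afc": 0.513, "amh_mc3": 0.628, "p_hcg_day": 0.668}
CUTOFF = 0.491  # Youden-index cutoff (Table 5)


def predict_high_response(afc, amh_mc3, p_hcg_day):
    """Predicted probability of high ovarian response from the logistic model."""
    logit = (INTERCEPT
             + COEF["afc"] * afc
             + COEF["amh_mc3"] * amh_mc3
             + COEF["p_hcg_day"] * p_hcg_day)
    return 1.0 / (1.0 + math.exp(-logit))


# Odds ratio per one-unit increase in each predictor: exp(coefficient).
odds_ratios = {name: math.exp(beta) for name, beta in COEF.items()}

# Example patient from the text: AFC 18, AMH 3.6 ng/mL, P 0.53 nmol/L.
p = predict_high_response(afc=18, amh_mc3=3.6, p_hcg_day=0.53)
print(f"predicted probability = {p:.3f}, high risk = {p > CUTOFF}")
print({name: round(value, 3) for name, value in odds_ratios.items()})
```

The small difference between the computed probability and the nomogram value is expected, because nomogram points are read from a graphical scale.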

Comparison for the Prediction Models.
As a novel neural network model based on the random vector functional-link neural network (RVFLNN), BLS is suitable for processing relatively simple data and has a faster learning speed [15]. Therefore, we constructed a BLS model with 4 feature nodes per window, 5 feature node windows, 9 enhancement nodes, incremental steps (3), number of reinforcement nodes (50), coefficient of compressibility (0.7), and regularization coefficient ($2^{-30}$) and compared its predictive power with that of the logistic model. The predictors of our logistic model were put into the BLS model to assess its predictive power. The results showed that the AUC and accuracy of the BLS model were inferior to those of the logistic model (Table 7). This indicated that the BLS model may not be suitable for simple data that included only three variables. The ROC curves for the training set and the testing set of the BLS model are displayed in Figure 5.
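To make the BLS configuration above more concrete, the following is a simplified, BLS-style sketch in NumPy. It is not the authors' implementation: it assumes random (not sparse-autoencoder-tuned) feature weights, omits the incremental learning steps, and is fitted on synthetic three-variable data. The function names `fit_bls` and `predict_bls` are hypothetical. It reuses the reported settings of 5 windows, 4 feature nodes per window, 9 enhancement nodes, a shrink coefficient of 0.7, and a regularization coefficient of 2^-30.

```python
import numpy as np


def _nodes(X, model):
    """Return feature nodes Z and unscaled enhancement inputs."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])            # add bias column
    Z = np.hstack([Xb @ W for W in model["W_feat"]])         # feature windows
    Zb = np.hstack([Z, np.ones((Z.shape[0], 1))])
    return Z, Zb @ model["W_enh"]


def _design(X, model):
    """Concatenate feature nodes and tanh enhancement nodes."""
    Z, raw = _nodes(X, model)
    return np.hstack([Z, np.tanh(raw * model["scale"])])


def fit_bls(X, y, n_windows=5, nodes_per_window=4, n_enhance=9,
            shrink=0.7, reg=2.0 ** -30, seed=0):
    rng = np.random.default_rng(seed)
    n_in = X.shape[1] + 1
    model = {
        # Assumption: plain random weights instead of sparse-autoencoder tuning.
        "W_feat": [rng.uniform(-1, 1, (n_in, nodes_per_window))
                   for _ in range(n_windows)],
        "W_enh": rng.uniform(-1, 1, (n_windows * nodes_per_window + 1, n_enhance)),
    }
    _, raw = _nodes(X, model)
    model["scale"] = shrink / np.max(np.abs(raw))             # shrink coefficient
    A = _design(X, model)
    k = A.shape[1]
    # Ridge regression for the output weights, solved as augmented least squares.
    A_aug = np.vstack([A, np.sqrt(reg) * np.eye(k)])
    y_aug = np.concatenate([np.asarray(y, dtype=float), np.zeros(k)])
    model["beta"] = np.linalg.lstsq(A_aug, y_aug, rcond=None)[0]
    return model


def predict_bls(X, model):
    """Classify as 1 when the linear output is at least 0.5."""
    return (_design(X, model) @ model["beta"] >= 0.5).astype(int)


# Synthetic, well-separated data with three predictors (not the study data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 3)), rng.normal(3.0, 1.0, (100, 3))])
y = np.array([0] * 100 + [1] * 100)
model = fit_bls(X, y)
accuracy = float(np.mean(predict_bls(X, model) == y))
print(f"training accuracy on synthetic data: {accuracy:.3f}")
```

With only three inputs, the 20 linear feature nodes span at most a four-dimensional space, so most of the added capacity comes from the 9 enhancement nodes; this is one plausible reason such a network may offer little over logistic regression on low-dimensional data.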

Discussion
High ovarian response can induce the risk of OHSS, leading to increased discomfort in patients and even reducing prospects for pregnancy [16]. Up to 30% of cases with mild or moderate OHSS and 3-8% with severe OHSS were reported in IVF-ET cycles [17]. In this study, we aimed to develop a prediction model to predict the risk of high ovarian response in patients undergoing IVF-ET. Our results suggested that AFC, AMH at MC3, and P level on HCG day were the three effective predictors for high ovarian response in IVF-ET cycles. Moreover, a combined prediction model with good performance was developed and validated: $\ln\left(P_{HR}/(1 - P_{HR})\right) = -11.094 + 0.513 \times \text{AFC} + 0.628 \times \text{AMH at MC3} + 0.668 \times \text{P on HCG day}$, where $P_{HR}$ represents the probability of high ovarian response. We also plotted a nomogram for visualizing our model; the AUC value of the combined prediction model reached 0.958, which suggested good discrimination, and the internal validation confirmed the accuracy and feasibility of the model. Further, the Hosmer-Lemeshow test and the calibration curve showed the good calibration of the model.
In our study, both AFC and basal AMH were independently associated with high ovarian response, which was consistent with the results of previous research [9,18-20]. Aflatoonian et al. reported that AMH and AFC were accurate and reliable predictors of high ovarian response to COH and could identify patients who had an increased risk of OHSS before stimulation [18]. The reason may be that AMH concentration correlates significantly with the number of antral follicles in the ovary before ovulation and with the number of oocytes collected after treatment, and patients with high ovarian response have higher AMH concentrations than patients with normal ovarian response.
Moreover, AMH was regarded as an excellent predictor of high ovarian response and could identify the risk of OHSS better than AFC, which may be because AMH is more stable across the cycle and less susceptible to exogenous steroid hormones; in addition, AFC requires skilled ultrasound operators to carefully identify, measure, and count follicles, which probably results in more interobserver variability in AFC [7,18,21-25]. However, few studies have analyzed the association between P level on HCG day and high ovarian response. In a retrospective study, P level on the first day of stimulation was recorded as a potential predictor, but no statistical significance was found [26]. Studies have shown that P level on the day of HCG administration varies among different ovarian responders [27-29]. Whether the P level can affect pregnancy rates and IVF-ET outcomes remains to be verified in further research. More importantly, our study showed that the model including all predictors had more accurate predictive power for high ovarian response than models containing a single predictor. A possible explanation is that considering only one factor to predict the probability of high ovarian response while ignoring other factors may reduce predictive ability and increase the error introduced by that single factor. However, our conclusion needs to be confirmed by further studies.
The strengths of the study should be noted. We identified predictors for high ovarian response in IVF-ET and developed a prediction model with more accurate predictive power, which could help clinicians efficiently identify patients at risk of high ovarian response and individualize treatment for these patients. However, there were also some limitations in our study. Firstly, only women aged 20-40 years were enrolled, so the sample may not be representative. This age range was chosen mainly because women in this group have better fertility and a more ideal stimulation effect. A wider range of ages could be considered in future research to improve the generalizability of the model. In addition, the small sample size and the lack of external validation may affect the general applicability of our model. A multicenter study with a large sample size and external validation is required to improve the accuracy and reliability of the model.

Conclusion
The developed prediction model had good discrimination and accuracy through the internal validation, which could help clinicians identify patients with high ovarian response, thereby improving pregnancy rates and clinical outcomes in IVF-ET cycles.

Data Availability
All the data utilized to support the theory and models of the present study are available from the corresponding authors upon request.

Conflicts of Interest
The authors declared no potential conflicts of interest.

Authors' Contributions
Xinsha Tan and Jing Yang designed the study and wrote the manuscript. Xinsha Tan, Honglin Xi and Wenfeng Wang collected and analyzed the data. Xinsha Tan contributed to literature search. Jing Yang critically reviewed and improved the drafts of the manuscript. All authors have read and approved the final manuscript.