Diagnosis of Idiopathic GHD in Children Based on Response to rhGH Treatment: The Importance of GH Provocative Tests and IGF-1

Purpose: Serum insulin-like growth factor 1 (IGF-1) and growth hormone (GH) provocative tests are reasonable tools for screening and diagnosis of idiopathic GH deficiency (IGHD). However, the cut-off points usually applied to these tests are supported by a low level of evidence and produce a large number of false results. The aim of this study is to evaluate the sensitivity, specificity, and accuracy of IGF-1 and GH stimulation tests as diagnostic tools for IGHD, using clinical response to recombinant human GH (rhGH) treatment as the diagnostic standard [increase of at least 0.3 in height standard deviation (H-SD) in 1 year]. Methods: We performed a prospective study with 115 children and adolescents presenting with short stature (SS), without secondary SS etiologies such as organic lesions, genetic syndromes, or thyroid disorders. They were separated into Group 1 [patients with familial SS or constitutional delay of growth and puberty (CDGP), not treated with rhGH], Group 2 (patients with suspected IGHD with clinical response to rhGH treatment), and Group 3 (patients with suspected IGHD without growth response to rhGH treatment). The diagnostic performance of IGF-1, the Insulin Tolerance Test (ITT), and the clonidine test (CT), alone and combined, was then assessed at different cut-off points. Results: Based on the ROC curve, the best cut-off points found for IGF-1, ITT, and CT when used alone were −0.492 SDS (sensitivity: 50%; specificity: 53.8%; accuracy: 46.5%), 4.515 μg/L (sensitivity: 75.5%; specificity: 45.5%; accuracy: 52.7%), and 4.095 μg/L (sensitivity: 54.5%; specificity: 52.6%; accuracy: 56.9%), respectively. When we combined IGF-1, with −2SD as cut-off, with ITT or CT, we found 7 μg/L to be the best cut-off point. In this situation, ITT had sensitivity, specificity, and accuracy of 93.9, 81.8, and 90.1%, while CT had 93.2, 68.4, and 85.7%, respectively.
Conclusion: Our data suggest that diagnosis of IGHD should be established based on a combination of clinical expertise, auxologic, radiologic, and laboratorial data, using IGF-1 at the −2SD threshold combined with ITT or CT at the cut-off point of 7 μg/L. Additional studies, similar to ours, are imperative to establish cut-off points based on therapeutic response to rhGH in IGHD, which would be directly related to a better treatment outcome.


INTRODUCTION
Children whose stature is two height standard deviations (H-SD) below the mean for age and sex (1) or who have a height deficit greater than one H-SD relative to the family height should be referred for a complete short stature investigation (2). After considering and excluding other short stature (SS) etiologies such as familial SS (FSS), constitutional delay of growth and puberty (CDGP) and secondary causes (organic lesions, thyroid disorders), the investigation for Growth Hormone Deficiency (GHD) should be conducted (3).
Recently, Hussein et al. (4), evaluating 637 children and adolescents with SS, found FSS in 42% of them, CDGP in 16%, GHD in 12% and idiopathic SS (ISS), other than CDGP and FSS, in 2%. Nevertheless, to distinguish the last two conditions, only provocative GH tests and IGF-1 are available, and these are supported by a low level of evidence and produce a large number of false results (5,6). The diagnosis of idiopathic GHD (IGHD) is established by stimulation tests of GH secretion, such as the Insulin Tolerance Test (ITT) and the Clonidine Test (CT) (7,8). Serum IGF-1 has also been used to evaluate somatotropic axis function.
The current consensus statement and some authors recognize that a satisfactory therapeutic response to recombinant human GH (rhGH) corresponds to an increase in H-SD of more than 0.3-0.5 after 1 year of treatment, thus confirming the hormonal deficiency (9-11). In addition, increases in predicted height and changes in growth rate are useful to analyze clinical response (12).
Therefore, the aim of this study is to evaluate sensitivity, specificity and accuracy values of IGF-1 and GH stimulation tests as diagnostic tools for IGHD, using clinical response to rhGH as diagnostic standard.

Study Design and Patients
This prospective study aimed to evaluate the diagnostic performance of IGF-1 and provocative GH tests (ITT and CT) at different cut-off points (−1SD and −2SD for IGF-1 and 3, 5, 7, and 10 µg/L for GH). Additionally, it was verified whether the measurement of IGF-1 (at the two cut-off points cited) increases the sensitivity, specificity, and accuracy values of the stimulation tests when used together. We also aimed to define the best cut-off points for GH peaks in the provocative tests for the diagnosis of IGHD with a ROC curve.
Data were collected during medical follow-up and treatment of 115 prepubescent children and adolescents with short stature (patients who had <−2 SD height for age and sex and/or <−1 SD for target height) (1). Other SS etiologies such as organic lesions, genetic syndromes and thyroid disorders were excluded.
Children who entered puberty within the first year of clinical follow-up after starting rhGH treatment, those with low adherence, and those lost to follow-up were also excluded from the data analysis.
The decision to start rhGH therapy was based on meeting at least one of the following clinical criteria: height below −3SD; height between −3SD and −2SD combined with growth rate below the 25th percentile for the respective age and sex; or height above −2SD associated with growth rate below −1SD (13). Therefore, the provocative tests were not used to indicate rhGH treatment. Patients with FSS and CDGP were diagnosed based on family history of short stature and auxological criteria along with the presence or absence of bone age delay, indicating the probability of delayed growth and puberty (9,14).
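The three auxological criteria above can be read as a simple decision rule. The following is a minimal sketch of that rule, not the procedure actually used by the study team; the function name, the argument names, and the handling of the exact boundary values (−3SD and −2SD are assigned to the middle criterion) are assumptions made for illustration.

```python
def meets_rhgh_criteria(height_sds, growth_rate_percentile=None, growth_rate_sds=None):
    """Return True if at least one of the three auxological criteria is met.

    Hypothetical helper: argument names and boundary handling are assumptions.
    """
    # Criterion 1: height below -3 SD, regardless of growth rate.
    if height_sds < -3:
        return True
    # Criterion 2: height between -3 SD and -2 SD (boundaries assumed inclusive)
    # combined with growth rate below the 25th percentile for age and sex.
    if -3 <= height_sds <= -2:
        return growth_rate_percentile is not None and growth_rate_percentile < 25
    # Criterion 3: height above -2 SD with growth rate below -1 SD.
    return growth_rate_sds is not None and growth_rate_sds < -1


# Illustrative (made-up) patients.
print(meets_rhgh_criteria(-3.2))
print(meets_rhgh_criteria(-2.5, growth_rate_percentile=10))
print(meets_rhgh_criteria(-1.5, growth_rate_sds=-0.5))
```

Note that the rule takes no GH peak or IGF-1 value as input, which mirrors the statement that the provocative tests were not used to indicate treatment.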
Subjects were divided into three groups: Group 1 (n = 20) [9 patients with FSS and 11 with CDGP, diagnosed according to consensus guidelines (9), who were not treated with rhGH], Group 2 (n = 62) [IGHD patients with clinical response to rhGH (diagnosis of IGHD confirmed by an increase of at least 0.3 SD in height at the end of 1 year of treatment with rhGH)], and Group 3 (n = 33) (patients who were previously diagnosed with IGHD but showed no growth response to treatment with rhGH). After the treatment with rhGH, the last group was considered as having ISS other than CDGP and FSS. In summary, only groups 2 and 3 were treated with rhGH for at least 1 year, between 2010 and 2018. All patients underwent at least one provocative GH test and had normal skull magnetic resonance imaging (MRI). The study was approved by the ethics committee and written informed consent was obtained.
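The allocation of treated patients to Groups 2 and 3 depends only on the height SD gain after 1 year. A minimal sketch of that allocation follows; the function name and the small floating-point tolerance are assumptions, and the example values are invented.

```python
RESPONSE_THRESHOLD_SD = 0.3  # minimum H-SD gain after 1 year, as stated in the text


def classify_treated_patient(h_sds_baseline, h_sds_one_year):
    """Assign a treated patient to Group 2 (responder) or Group 3 (non-responder).

    Hypothetical helper; the tolerance only guards against floating-point
    rounding when the gain is exactly 0.3.
    """
    gain = h_sds_one_year - h_sds_baseline
    if gain >= RESPONSE_THRESHOLD_SD - 1e-9:
        return "Group 2"
    return "Group 3"


# Illustrative (made-up) values: H-SD before treatment and after 1 year.
print(classify_treated_patient(-2.8, -2.3))  # gain of 0.5
print(classify_treated_patient(-2.5, -2.4))  # gain of 0.1
```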

Summary of Study Design
In summary, we recruited 115 patients with short stature (SS) without secondary SS etiologies such as organic lesions, genetic syndromes, or thyroid disorders. We then separated group 1 (n = 20), diagnosed with FSS or CDGP based on clinical and auxological criteria; those patients were not treated with rhGH. The 95 remaining patients were assumed to have IGHD and were treated with rhGH at similar doses for at least 1 year. The decision to treat those patients with rhGH was based on auxological criteria, not on GH stimulation tests. After 1 year, those patients were separated into groups 2 and 3 based on their response to rhGH treatment, and group 2 (responders) was considered to have true IGHD. Group 3 (non-responders) was then diagnosed as possibly having ISS (other than CDGP and FSS) (4), which we assumed to be the reason their response was not as great as that of group 2. Only after that did we examine the results of the IGF-1 and GH tests already performed, to evaluate their utility in identifying those groups before treatment.

Clinical and Laboratorial Data
The dose of rhGH used by the subjects of the study was 0.7-1 IU/kg/week during the first year of follow-up. The provocative tests used in this study were ITT and CT; they were performed after an 8-h overnight fast, starting 30 min after placement of a venous catheter with slow saline infusion. Blood samples were collected every 30 min between 0 and 120 min. Insulin was administered intravenously (0.05-0.1 U/kg) and clonidine was administered orally (0.15 mg/m²). ITT was considered adequate for somatotropic axis assessment if hypoglycemia of 40 mg/dL or less was reached. Children who underwent a second stimulation test did so on a separate day (at least 1 week apart). None of the subjects underwent steroid priming. IGF-I was measured in a random serum sample. Among the 2 groups treated with rhGH, 48 patients underwent both provocative tests, 31 underwent ITT alone, and 16 underwent CT alone.
The following data were also collected from each patient: height, target height, chronological age, bone age, pubertal staging, TSH levels, free T4, FSH, LH, estradiol, total testosterone, IGF-1, and IGFBP-3. Patients' heights, as well as those of their parents, were measured in triplicate using a Harpenden stadiometer. The bone age was based on the analysis of left hand and wrist radiographs, using Greulich and Pyle's standard method (15).
The Tanner method was used for pubertal staging (16,17). Target height was calculated by the Tanner method: (height of the father + height of the mother − 13)/2 for females and (height of the father + height of the mother + 13)/2 for males, expressed in centimeters. Predicted heights were calculated, before and after 1 year of treatment, by the Bayley-Pinneau method (18,19) based on the height and bone age of each patient.
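The target height formula above can be written as a short function. This is a minimal sketch; the function name and the string labels for sex are assumptions, and the parental heights in the example are invented.

```python
def target_height_cm(father_cm, mother_cm, sex):
    """Target height in centimeters, following the formula given in the text.

    Hypothetical helper: `sex` is expected to be "male" or "female".
    """
    if sex == "female":
        adjustment = -13.0
    elif sex == "male":
        adjustment = 13.0
    else:
        raise ValueError("sex must be 'male' or 'female'")
    return (father_cm + mother_cm + adjustment) / 2


# Illustrative parents: father 175 cm, mother 162 cm.
print(target_height_cm(175, 162, "male"))
print(target_height_cm(175, 162, "female"))
```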

Assays
GH response to provocative tests (ITT and CT) and serum IGF-I were measured by chemiluminescent immunometric assay (Immulite 1000; Diagnostic Products Corp., Los Angeles, CA, USA). The calibration range for IGF-I was up to 1.6 µg/L against the WHO NIBSC 1st IRR 87/518 and the sensitivity of the test was 20 µg/L. The calibration range for the GH assay was up to 40 µg/L (WHO 1st IS 80/505 and WHO 2nd IS 98/574) and the sensitivity of the test was 0.01 µg/L. Consistency of assay performance was assessed by regular use of internal controls. The GH intra- and inter-assay coefficients of variation were, respectively, 5.3-6.5% at GH levels of 1.7-31 µg/L and 5.5-6.2% at GH levels of 3-18 µg/L. The intra- and inter-assay coefficients of variation for IGF-I were <4.5% and <8.4% (20,21).

Statistical Analysis
Data concerning clinical and epidemiological characteristics were processed using descriptive statistics, expressed as mean ± standard deviation, 95% confidence interval and/or absolute and relative frequencies, as appropriate, and presented in tables and/or graphics.
The paired Student's t-test for dependent means or the Wilcoxon signed-rank test was used to compare variables in each group before and after follow-up. ANOVA was used to compare normally distributed variables among more than two groups, and the Kruskal-Wallis test was employed when the variables were not normally distributed. A p-value < 0.05 was considered statistically significant.
The best cut-off point was defined based on the Youden Index (J) and, additionally, a ROC (Receiver Operating Characteristic) curve was constructed. The cut-off with maximum sensitivity and specificity in the ROC curve was defined as the one minimizing $\sqrt{(1 - \text{sensitivity})^2 + (1 - \text{specificity})^2}$, and the accuracy was estimated based on the area under the ROC curve. Predictive values and likelihood ratios were also calculated from the values of sensitivity and specificity. H-SDS and PH-SDS were derived from World Health Organization (WHO) charts and tables for growth follow-up (22). Sensitivity, specificity and diagnostic accuracy were expressed as percentages. IGF-1-SDS were derived from Elmlinger et al. (23).
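The two cut-off selection criteria described above (maximum Youden Index and minimum distance to the ideal point of the ROC curve) can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, a test is assumed to be positive when the measured value is below the cut-off, and the GH peaks are invented. Only the candidate cut-offs (3, 5, 7, and 10 µg/L) come from the text.

```python
from math import sqrt


def sensitivity_specificity(values, has_ighd, cutoff):
    """Sensitivity and specificity when 'value < cutoff' is called positive."""
    tp = fn = tn = fp = 0
    for value, diseased in zip(values, has_ighd):
        positive = value < cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoffs(values, has_ighd, candidates):
    """Return (cut-off maximizing Youden J, cut-off minimizing distance to (0, 1))."""
    def youden(cutoff):
        se, sp = sensitivity_specificity(values, has_ighd, cutoff)
        return se + sp - 1

    def distance(cutoff):
        se, sp = sensitivity_specificity(values, has_ighd, cutoff)
        return sqrt((1 - se) ** 2 + (1 - sp) ** 2)

    return max(candidates, key=youden), min(candidates, key=distance)


# Invented GH peaks (µg/L); True marks a responder to rhGH (taken as IGHD).
peaks = [2.0, 3.5, 4.0, 6.5, 8.0, 5.0, 9.0, 11.0, 12.5, 6.0]
responder = [True, True, True, True, True, False, False, False, False, False]

print(best_cutoffs(peaks, responder, [3, 5, 7, 10]))
```

The two criteria often select the same threshold, as in this toy data set, but they can disagree when sensitivity and specificity are strongly unbalanced.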
All tests were performed using SPSS Statistics 22® software (IBM Corp., Armonk, NY, USA). Further, results were considered significant if p < 0.05.

RESULTS
The serum GH levels in response to ITT and to CT were different between all groups. When we compared the initial and final data of each group, we found a significant increase in H-SD, PH, and PH-SDS only in Group 2 after follow-up. Changes in BMI were not significant in any of the three groups. We also found a significant rise in IGF-1, IGF-1 SDS, and IGFBP-3 in groups 2 and 3 (Tables 1, 2).
Comparing only initial data, group 2 differed from groups 1 and 3 for the variables age, bone age and PH-SDS. In addition, group 1 differed from groups 2 and 3 for initial H-SDS and BMI-SDS. When analyzing only final data, the three groups differed for H-SDS and group 2 was different from groups 1 and 3 when we compared PH and PH-SDS. Also group 1 differed from groups 2 and 3 for final BMI-SDS.
The sensitivity, specificity, and accuracy for IGF-1 and GH provocative tests, alone or combined, at different cut-off values are shown in Tables 3, 4. Sensitivity for IGHD diagnosis using IGF-1 alone was 20% for <−2SD and 36% for <−1SD; the specificity was 84.6% and 57.7% for <−2SD and <−1SD, respectively. When we combined IGF-1 at the −2SD cut-off with ITT or CT, we found a threshold of 7 µg/L to be the best one, with sensitivity, specificity, and accuracy of 93.9, 81.8, and 90.1% for ITT and 93.2, 68.4, and 85.7% for CT, respectively. Based on the ROC curve, the best cut-off points for IGF-1, ITT, and CT were −0.492 SDS, 4.515 and 4.095 µg/L, respectively (Figure 1).
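To illustrate how a combined criterion of this kind can be evaluated, the sketch below classifies each patient from IGF-1 SDS and GH peak and then computes sensitivity and specificity against the response to rhGH. The text does not state how the two tests were combined, so the rule (`"either"` test positive versus `"both"` tests positive) is left as a parameter and is an assumption, as are the function names and the invented patient data. Accuracy is not computed here, since the study estimated it from the area under the ROC curve.

```python
def combined_positive(igf1_sds, gh_peak, igf1_cutoff=-2.0, gh_cutoff=7.0, rule="either"):
    """Combined test result; the combination rule is an assumption, not the study's."""
    igf1_low = igf1_sds < igf1_cutoff
    gh_low = gh_peak < gh_cutoff
    if rule == "either":
        return igf1_low or gh_low
    if rule == "both":
        return igf1_low and gh_low
    raise ValueError("rule must be 'either' or 'both'")


def sens_spec(predicted, actual):
    """Sensitivity and specificity from parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


# Invented patients: (IGF-1 SDS, GH peak in µg/L, responder to rhGH).
patients = [
    (-2.5, 8.0, True),
    (-1.0, 4.0, True),
    (-0.5, 9.0, True),
    (-1.2, 6.0, False),
    (0.3, 12.0, False),
    (-0.8, 10.5, False),
]
actual = [responder for _, _, responder in patients]

for rule in ("either", "both"):
    predicted = [combined_positive(igf1, peak, rule=rule) for igf1, peak, _ in patients]
    print(rule, sens_spec(predicted, actual))
```

In this toy data set the "either" rule trades specificity for sensitivity, while the "both" rule does the opposite; which behavior matches the study's tables cannot be determined from the text alone.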

DISCUSSION
Our data suggest that the diagnosis of IGHD should be established based on a combination of clinical expertise, auxologic, radiologic, and laboratorial data, using IGF-1 at the −2SD threshold combined with ITT or CT at the cut-off point of 7 µg/L. As far as we are aware, our study is the first to establish optimal ITT, CT, and IGF-1 cut-off points to identify patients with IGHD using rhGH therapeutic response as the diagnostic standard.
There are many limitations in detecting IGHD in children. Cut-off values of GH peaks described in the literature are controversial, ranging from 3 to 10 µg/L (24-26). The major problem in establishing the optimal cut-off point is the lack of a gold standard for GHD diagnosis and the overlap of results with those of normal children (27). To deal with this, we used H-SD gain after 1 year of treatment with rhGH to confirm the diagnosis (9,28,29).
A review by Paula and Czepielewski recommended that GHD should be confirmed by two GH stimulation tests with response lower than 5 µg/L (30). Guzzetti et al. (31), in a study with 74 patients with organic GHD, found 5.1 and 6.8 µg/L as cut-off values for ITT and CT, respectively [...] threshold. None of the authors considered the therapeutic response to rhGH as the diagnostic standard, using a set of clinical and laboratory variables or radiological modifications. In fact, in our study, the provocative GH tests showed low specificity at all thresholds when used alone, and specificity improved only slightly when we combined CT and ITT. The best specificity was found when we combined IGF-1 with at least one provocative test, at the threshold of 7 µg/L. This finding is also aligned with the current European trend that the ideal cut-off among the traditional ones should be near 7 µg/L for modern methods and references, which contradicts the tendency adopted in the 1990s, when most physicians rather arbitrarily accepted 10 µg/L as the main threshold (9,33,34).
When assessed in isolation, the diagnostic performance of the two provocative tests is equivalent at all traditional cut-off points. However, when combined with IGF-1 at the −2 SD cut-off point, ITT showed higher specificity at the 5, 7, and 10 µg/L thresholds when compared to CT, suggesting that the combination of ITT and IGF-1 would be a better choice for IGHD diagnosis.
Levels of IGF-1 alone presented low accuracy for the diagnosis of GHD, with the −2SD cut-off point showing the best results due to higher specificity (84.6%), since specificity is a more relevant parameter for diagnosing low-prevalence illnesses (35). Our results are in accordance with a meta-analysis performed by Shen et al. (36), with 12 studies and 1,762 subjects, which reported a specificity of 69% for IGF-1 in the diagnosis of GHD when using −2SD as cut-off. Many studies recommend that IGF-1 alone should not be used to confirm GHD, but that it should be applied together with the stimulation tests as a complementary tool (37-39). In addition, some authors suggest that IGF-1 should be used, along with auxologic parameters, as a screening test for IGHD and that provocative tests should only be performed as a next step in the investigation if serum IGF-1 levels are low (40,41). In our study, IGF-1 alone showed very low sensitivity, but we reached reasonable accuracy by performing IGF-1 plus at least one provocative test (ITT or CT) as the first approach to diagnose IGHD, after excluding other SS causes.
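The relationship between sensitivity, specificity, prevalence, and the derived likelihood ratios and predictive values can be made concrete with a short calculation. The sketch below uses the sensitivity (20%) and specificity (84.6%) reported here for IGF-1 at the −2SD cut-off; the prevalence of 12% is borrowed from the proportion of GHD among short children cited in the Introduction (4) and is used purely as an illustrative assumption, not as the prevalence in this cohort. The function names are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# IGF-1 < -2SD: sensitivity and specificity from the text; prevalence is assumed.
lr_pos, lr_neg = likelihood_ratios(0.20, 0.846)
ppv, npv = predictive_values(0.20, 0.846, prevalence=0.12)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under these assumptions both likelihood ratios are close to 1, which is consistent with the observation that IGF-1 alone has limited diagnostic value despite its relatively high specificity.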
In addition, based on the ROC curve approach, our study showed that the best cut-off point for IGF-1 alone would be −0.492 SDS (sensitivity: 50%, specificity: 53.8%, accuracy: 46.5%). Our ROC curve data showed that all tests (ITT, IGF-1, and CT) presented poor results when used in isolation. Therefore, analyzing possible test combinations to boost all diagnostic parameters, we found −2SD for IGF-1 to be the best cut-off point when associated with both ITT and CT to identify IGHD patients. When IGF-1 was combined with ITT or CT, it kept high specificity and increased sensitivity and accuracy dramatically. We point out, however, that the ROC curve provided only complementary data. The main study results are summarized in Tables 2, 4.
TABLE 4 | Sensitivity, specificity, and accuracy of GH peaks to GH stimulation tests isolated and in association with IGF-1 SDS for the diagnosis of IGHD.
IGF-1 and IGFBP-3 levels increased in both groups 2 and 3 during rhGH treatment. There was no difference in either measurement at the beginning of the study, even though GHD patients had shown relatively low IGFBP-3 levels for age and sex (23). In fact, it has been described that treatment with rhGH results in elevation of both IGF-1 and IGFBP-3 levels in GHD and non-GHD patients, and that the most pronounced increases were observed 3 and 12 months after treatment started, but not later (42). In addition, there is no clear relationship between height velocity, GH dose, and circulating IGF-1 and IGFBP-3 levels during GH treatment. In other words, the GH/IGF-1/IGFBP-3 system cannot be assessed exclusively by blood levels (9,11,42). Finally, in our study, the increase in IGF-1 and IGFBP-3 levels could not be used to distinguish good responders (group 2) from poor responders (group 3). In group 3, all subjects were diagnosed with Idiopathic Short Stature (ISS), other than CDGP and FSS, because they did not fulfill the criterion of an increase of at least 0.3 SD in height after 1 year of treatment with rhGH.
ISS has a variety of causes associated with GH secretion disorders in combination with genetic factors that influence growth physiology. Therefore, proper diagnosis requires an H-SD lower than −2SD for age and sex in addition to a subnormal growth rate, delayed bone age, no apparent medical cause for growth failure (brain injury history, systemic, endocrine, nutritional, and chromosomal abnormalities or being born small for gestational age), and a normal growth hormone (GH) response to provocative testing (43). Children with ISS are of normal size at birth but grow slowly during early childhood, so that height is within the range for ISS by school entry (44,45). In 2003, the U.S. Food and Drug Administration approved rhGH as a treatment for ISS, and several studies have reported positive results with that approach (46-48). However, the height gain seems to be dose-dependent, being obtained in those receiving a higher dose, as reported by Albertsson-Wikland et al. (49). In this scenario, it becomes more imperative to discriminate these patients from those with IGHD as early as possible, in order to adjust the rhGH dose and reach a better final height.
The main limitation of the present study was that priming was not performed in prepubertal boys older than 11 years and in prepubertal girls older than 10 years (7).
Although several studies indicate administration of sex steroid priming, as in Marin et al. (50), there is still controversy about its use (11,51). Also, there is no consensus about age of administration, type, dose or precise schedule for sex steroid priming during GH stimulation tests (52) as shown in a survey with members of the European Society for Pediatric Endocrinology that used sex steroid priming in 51% of boys and 41% of girls, demonstrating lack of consensus between specialists (53). For these reasons, the decision to prime with sex steroids is country dependent.
The main strengths of the study are the large number of subjects, the consistent response to the treatment with rhGH, the IGF-1 data and the statistical combination of results, as well as being the first paper, as far as we are aware, to use therapeutic response as the diagnostic standard to confirm the IGHD diagnosis. Additionally, our groups were composed only of SS cases whose diagnosis is harder, since there were no brain radiological findings.
Thus, stimulation tests remain reasonable tools, when associated with clinical evaluation, to diagnose children with GHD (11), despite being far from ideal (54). The data of the present study confirm that the cut-off point for GH peak used in research and clinical practice needs to be standardized. Seeking efficiency and uniformity, our study is distinguished by being the first to use the SD of height gain in the first year of treatment with rhGH as a parameter to confirm the diagnosis of IGHD.

CONCLUSION
Our data suggest that diagnosis of IGHD should be established based on a combination of clinical expertise, auxologic, radiologic, and laboratorial data, using IGF-1 at the −2SD threshold combined with ITT or CT at the cut-off point of 7 µg/L. Additional studies, similar to ours, are imperative to establish cut-off points based on therapeutic response to rhGH in IGHD, which would be directly related to a better treatment outcome.

DATA AVAILABILITY STATEMENT
The datasets analyzed during the current study are available from the corresponding author on reasonable request.

ETHICS STATEMENT
All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008. This study was approved by University Hospital Joao de Barros Barreto ethics committee. This manuscript has not been published and is not under consideration for publication in any other journal. All authors approved the manuscript and consent to this submission. Informed consent was obtained from all patients for being included in the study.

AUTHOR CONTRIBUTIONS
KF and JF took part in conception and design of study. LJ, MM, NQ, AFi, AFa, LM, AA, DS, and AC were responsible for acquisition of data, while LJ, JF, NZ, FS, NS, WS, and JA have done the analysis and interpretation of data. ML, MS, LL, LS, IF, LF, and MO have drafted the manuscript together. All authors have revised the manuscript critically and approved the version to be published. All persons who meet authorship criteria are listed as authors, and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript.