Age-at-onset in Huntington disease

Background: In Huntington disease, the accurate determination of age-at-onset is critical to identify modifiers and therapies that aim to delay it. Methods: We used retrospective data from the European Huntington’s Disease Network’s REGISTRY. Data (age, gender, CAG repeat length, parent affected, and Unified Huntington’s Disease Rating Scale motor score and total functional capacity) from at least three visits in 423 REGISTRY participants were included. Data-based extrapolations of individual age-at-onset, using generalized linear mixed models based on individual slopes of motor score or total functional capacity, and predictions using the Langbehn or Ranen formula were compared with clinicians’ estimates. Results: Concordance was best for the calculated onset using the REGISTRY UHDRS longitudinal motor scores. For total functional capacity, the investigator’s estimate was 4 years before the data-derived age-at-onset. The concordance of predictions of probability of age-at-onset was ±20 years (difference in the 25th percentile). Conclusions: Estimating or predicting age-at-onset in Huntington disease may be inaccurate. It can be useful to 1) add, in the manifest population, the motor score regression-derived age-at-onset as an additional motor onset and 2) add the total functional capacity regression-derived age-at-onset for the onset of functional impact of Huntington disease when patients are in mid- to late-stage.


INTRODUCTION
Age-at-onset (AAO) in Huntington disease (HD) describes the point in time when a carrier of the mutated gene develops unequivocal HD signs. The accurate determination of AAO is critical to find factors that modify AAO and to develop and evaluate therapies that aim to delay it. In manifest HD, an autosomal dominant disease with a highly penetrant CAG repeat expansion mutation in the HTT gene, [1] a clinician estimates AAO retrospectively based on information from manifest patients, relatives, and carers.
For AAO predictions in the prodromal phase, the formula of Langbehn and colleagues uses CAG repeat length and age because of their well-known influences on AAO and calculates the time to a predefined degree of probability of manifesting signs of HD. [2] However, CAG repeat length accounts for only about 50-60% of the variability, so other factors not modelled in this formula likely influence AAO. [3] Another formula published by Ranen and colleagues uses CAG repeat length and parental onset age to estimate AAO. [4], [5] This may account for some other inherited factors, an advantage over the Langbehn formula. However, it was derived from a small sample of affected parent-child pairs and needs to be validated in larger numbers of patients. [5]

REGISTRY participants
REGISTRY collects data in Europe from symptomatic and pre-HD HTT mutation expansion carriers (with known CAG repeat length ≥36). [6], [7] Participants had at least three visits where age, gender, CAG repeat length, parent affected, and clinical data (Unified Huntington's Disease Rating Scale motor score, total functional capacity) were available. [8] All REGISTRY data were from participants with manifest HD. This was defined as carrying the HD gene mutation and having a motor phenotype that with ≥99% certainty was unequivocal for HD (diagnostic confidence of 4 on motor UHDRS). [8] Participants gave informed written consent according to the International Conference on Harmonisation-Good Clinical Practice (ICH-GCP) guidelines (http://www.ich.org/LOB/media/MEDIA482.pdf). Ethical approval was obtained from the local ethics committee for each study site contributing to REGISTRY.

Data analysis and statistics
AAO was calculated using generalized linear mixed models based on individual slopes of motor or TFC scores. This means that for each participant a linear regression was calculated, including intercept and slope of the independent factor across visit dates (Figure 1). Visit was included as a random factor. The dependent variable was the date of examination of the motor score or TFC; the independent variable was "motorscore" or "TFC." The regression estimates were then used to extrapolate the AAO by calculating the age for a motor score of 5, or a TFC of 12, our definitions of manifest disease (Figure 1).
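The extrapolation step described above can be illustrated with a minimal sketch. It is not the mixed model used in the study: it fits an ordinary least-squares line for a single participant, with age as the dependent variable and the score as the independent variable, and reads off the age at the threshold. The function name `extrapolate_onset_age` and the `worsening_sign` parameter are hypothetical and introduced only for illustration.

```python
MOTOR_THRESHOLD = 5.0   # motor score taken as manifest disease in the text
TFC_THRESHOLD = 12.0    # TFC taken as manifest disease in the text


def extrapolate_onset_age(ages, scores, threshold, worsening_sign):
    """Extrapolate the age at which a score reaches `threshold`.

    ages: participant age in years at each visit.
    scores: motor score or TFC recorded at the same visits.
    worsening_sign: +1 if the score rises as disease progresses (motor
        score), -1 if it falls (TFC). Hypothetical helper argument.
    Returns None when the trend runs against disease progression, so that
    no onset can be extrapolated.
    """
    n = len(ages)
    if n < 3 or n != len(scores):
        raise ValueError("need at least three paired visits")
    mean_score = sum(scores) / n
    mean_age = sum(ages) / n
    sxx = sum((s - mean_score) ** 2 for s in scores)
    if sxx == 0:
        return None  # flat scores: no slope to extrapolate from
    sxy = sum((s - mean_score) * (a - mean_age) for s, a in zip(scores, ages))
    slope = sxy / sxx                      # years per score point
    intercept = mean_age - slope * mean_score
    if slope * worsening_sign <= 0:
        return None                        # score improves with age
    return intercept + slope * threshold


# Invented example data: motor score rising by 3 points per year.
motor_onset = extrapolate_onset_age([50, 51, 52], [20, 23, 26], MOTOR_THRESHOLD, +1)
# Invented example data: TFC falling by 0.5 points per year.
tfc_onset = extrapolate_onset_age([50, 52, 54], [10, 9, 8], TFC_THRESHOLD, -1)
```

Returning `None` for trends that run against progression makes explicit that some participants yield no extrapolated onset at all.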
For AAO predictions, the formula of Langbehn with the predicted probability of signs exceeding 0.6, 0.4 and 0.2 (i.e., "Langb 0.6," "Langb 0.4," and "Langb 0.2"), or the formula of Ranen and colleagues ("Ranen"), was used. [5] We emulated the prodromal stage in our manifest participants by going back in time to when each participant was pre-manifest. First, we arbitrarily chose an age of 10 years. Second, we calculated the disease burden from a participant's age and CAG repeat length (disease burden = (CAG n - 35.5) × age) and, using the Langbehn formula, predicted AAO at an age that corresponded to a disease burden of 200 ("LB DB 0.6"). [9] One additional model was based on the CAG repeats, age of affected parent, the motor score (population slope) and gender as independent variables and the rater estimate as dependent variable ("Ranen extended"). A second model was estimated analogous to the Ranen formula ("Ranen analog") based on CAG repeats and age of affected parent. Linear regression was used to assess the importance of these factors for the rater estimate. The regression estimates were used to populate the formula.
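As a worked illustration of the disease burden expression above, the following sketch computes the burden for a given age and CAG repeat length, and inverts the expression to find the age at which a target burden of 200 is reached. It does not implement the Langbehn formula itself; the function names are hypothetical.

```python
BURDEN_OFFSET = 35.5    # constant from the disease burden expression in the text
TARGET_BURDEN = 200.0   # burden at which the prediction was anchored


def disease_burden(cag, age):
    """Disease burden = (CAG - 35.5) x age."""
    return (cag - BURDEN_OFFSET) * age


def age_at_burden(cag, burden=TARGET_BURDEN):
    """Age at which a carrier with `cag` repeats reaches `burden`."""
    if cag <= BURDEN_OFFSET:
        raise ValueError("burden is only defined for CAG above 35.5")
    return burden / (cag - BURDEN_OFFSET)


# For the cohort's mean CAG length of 44, a burden of 200 corresponds to
# an age of 200 / 8.5 years.
anchor_age = age_at_burden(44)
```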
Pairwise Pearson correlation coefficients were estimated to investigate the linear correlation of each pair of formulae on "AAO." We compared the different models for predicting AAO by calculating agreement rates, with Clopper-Pearson 95% confidence intervals, between pairs of estimates. If both methods arrived at the same result within a ±5 year bracket, the agreement was defined as '1'. If the results differed by more than 5 years, the agreement was defined as '0'. For all participants, the agreement rate was expressed as % agreement within the accepted range.
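The agreement rate described above can be sketched as follows, under the assumption that the Clopper-Pearson interval is obtained by inverting the exact binomial tail probabilities. The sketch uses only the standard library and solves for the limits by bisection; the function names and the example estimates are invented for illustration.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); adequate for cohort-sized n."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target):
    """Bisection for p in [0, 1] with f(p) = target, f decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def agreement_rate(est_a, est_b, bracket=5.0):
    """Share of participants whose two AAO estimates differ by <= bracket."""
    if len(est_a) != len(est_b) or not est_a:
        raise ValueError("need two non-empty lists of equal length")
    flags = [1 if abs(a - b) <= bracket else 0 for a, b in zip(est_a, est_b)]
    k, n = sum(flags), len(flags)
    return k / n, clopper_pearson(k, n)


# Invented estimates for four participants: differences of 2, 7, 1, 15 years.
rate, (ci_low, ci_high) = agreement_rate([40, 45, 50, 55], [42, 52, 49, 70])
```

With so few pairs the interval is very wide, which is why the interval is reported alongside the rate.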

Figure 1.
Illustration of age-at-onset extrapolation from longitudinal UHDRS motor, or total functional capacity (TFC), scores. Linear regression analysis calculates the age for a motorscore of 5 (motor score threshold), or a TFC of 12 (TFC threshold). The vertical lines illustrate the age of the participant at threshold (motor estimate, TFC estimate) and how these calculated onsets compare to the rater estimate.

REGISTRY participants
Data from 423 REGISTRY participants were included (205 male). The mean CAG repeat expansion was 44 (SD 4); this was similar in men and women. In 193 participants (46%) the gene was inherited from the mother and in 177 (42%) from the father; in 30 (7%) no signs of HD were reported in either parent, and in 23 (5%) the inheritance was unknown for both parents or for at least one parent, where the other parent was not affected. Eight participants (2%) had a juvenile onset (before the age of 20), and 24 (6%) had a late onset of HD above age 60. A motor onset was present in 303, other onset types in 120. The average motor score at enrolment was 35 (out of a possible 124); 158 (37%) participants were in stage 1, 120 (28%) in stage 2, 111 (26%) in stage 3, 30 (7%) in stage 4 and 4 (1%) in stage 5. Patients were first seen by investigators a median of 6 years after the estimated onset. The median number of visits was 3 (range 3-18).
PLOS Currents Huntington Disease

AAO extrapolation from longitudinal data
Sixty-six participants with a negative slope of the motor score were excluded because no individual AAO could be calculated. In the remaining 357 participants, on average the motor score increased linearly by 2.57 points per year. The mean AAO of data based extrapolated motor signs was 47 (range 13-81, SD 11.67).
Forty-four participants were excluded because the slope of the TFC was positive. This meant no individual TFC onset could be extrapolated based on longitudinal data. In the remaining 379 participants, on average the TFC score decreased linearly by 0.52 points per year. The TFC of the 251 patients in disease stages 1 or 2 decreased by 0.75 points per year compared to 128 patients in disease stages 3, 4, or 5 who on average lost 0.26 points per year. The mean extrapolated AAO was 48 (range 14-81, SD 12, Figure 2).
The investigator estimated AAO was 3 years before the motor score calculated AAO (Table 1, Figure 2) with an agreement rate of 0.57 (Table 2). The calculated TFC onset was 4 years later than the investigator's estimate (Table 1, Figure 2) with an agreement rate of 0.52 (Table 2).

Predicting AAO
The Langbehn formula (LB 06) predicted a mean AAO of 46.82 years (SD 10.5, range 23-89) when applied at 10 years of age; when applied at the age corresponding to a disease burden of 200, the predicted average AAO was 46.65 years (SD 10.14, range 23-74). The agreement rate with the investigator's estimate was 0.46 (Table 2). In 280 participants (140 women) with the necessary data, the Ranen formula predicted an AAO of 41.9 years (SD 7.9, range 8-61). In that group of participants this is similar to the investigator's estimate of 42 years. The agreement rate was 0.46 (Table 2).

Comparing formulae derived AAO
In a total of 206 REGISTRY patients (109 women) with complete data the best agreement rate was between the extrapolated AAOs on the longitudinal TFC and motor score data (0.76, Table 2). Using parent AAO and CAG repeats resulted in the following formula that best predicted the investigator's AAO estimate (Ranen analog):
AAO = 90.3918 + 0.3293 * parent AAO - 1.3996 * CAG repeats
In a next step we entered all available data into a regression model for the prediction of the investigator's estimate of AAO (Ranen extended). Motor score, parent AAO, CAG repeats, and gender significantly contributed to the regression model, resulting in the following formula:
AAO = 88.7546 + 0.0430 * motorscore + 0.3546 * parent AAO - 1.4311 * CAG repeats + 1.0124 * gender (male=1, female=0)
We then entered the dataset of each participant into this formula to arrive at the calculated AAO.
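The two regression formulae above translate directly into code. The sketch below uses the coefficients as reported; the function names and the example inputs (parent AAO of 45 years, 44 CAG repeats, motor score of 35) are chosen only for illustration.

```python
def ranen_analog(parent_aao, cag):
    """AAO from parent AAO and CAG repeats ("Ranen analog" formula)."""
    return 90.3918 + 0.3293 * parent_aao - 1.3996 * cag


def ranen_extended(motor_score, parent_aao, cag, male):
    """AAO from motor score, parent AAO, CAG repeats and gender
    ("Ranen extended" formula); `male` is True for men, False for women."""
    gender = 1 if male else 0
    return (88.7546 + 0.0430 * motor_score + 0.3546 * parent_aao
            - 1.4311 * cag + 1.0124 * gender)


# Illustrative inputs, not taken from any participant record.
aao_analog = ranen_analog(parent_aao=45, cag=44)
aao_extended = ranen_extended(motor_score=35, parent_aao=45, cag=44, male=True)
```

For these inputs the two formulae give about 43.6 and 44.3 years respectively, and the gender term shifts the extended estimate by roughly one year.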
The data-driven formula resembles the Ranen and Ranen analogous formulae, with high agreement rates (0.97, Table 2), while the agreement rates with other AAO calculations were much lower (Table 2).
We then assessed 145 participants with a motor onset. The findings were similar to the analyses including all participants regardless of major symptom at onset (Table 2); the rater estimated an earlier onset than the AAO from the regression analyses.

DISCUSSION
The present study assessed how a data-derived age-at-onset compares to the rater's estimate, the commonly used definition of AAO. Using longitudinal data from the large observational REGISTRY study, our results suggest that it can be useful 1) to add, in the manifest population, the motor score regression-derived AAO as an additional motor onset, and 2) when patients are in mid- to late-stage HD, to add the TFC regression-derived AAO for the onset of functional impact of HD. Our results also indicate that 3) predictions of AAO suggest a later onset than the actual emergence of unequivocal motor signs of HD.

Calculate age-at-onset in manifest HD
The current gold standard of AAO is a clinician's estimate integrating data from the patient's history, the collateral history of family or carers, and the examination of the patient. We compared a data-derived AAO with the estimated AAO. We first used a simple regression analysis of longitudinal UHDRS motor score data and calculated the patient's age when the motor score was 5 or greater, a cut-off used in TRACK-HD, a large longitudinal multi-centre study. [10], [11] The AAO from regressing motor scores was 3 years later than the median AAO estimated by REGISTRY investigators. The agreement rates between onset in REGISTRY data and the calculated motor onset revealed a difference of about 20 years in the 25th percentile. REGISTRY participants were enrolled when they already had manifest HD. This means that at the REGISTRY enrolment visit the investigator may have had to judge AAO after many years of manifest disease. Such an AAO estimate would be less accurate than a directly observed AAO. However, in REGISTRY an earlier estimated onset than that calculated from longitudinal motor scores may also suggest that investigators are not guided by motor signs alone.
Many studies of genetic modifiers relate their effect to a general onset of HD. It is possible that there are domain specific onset modifiers. Such domain specific modifying effects may be overlooked unless domain specific onsets are defined. Our data suggest that it is possible to use longitudinal motor score data to extrapolate a motor domain onset even in patients with a nonmotor onset. This approach has the added advantage that it is based on data collected by certified UHDRS motor scale raters directly examining the patient. This removes some of the variability introduced by judging an onset retrospectively.
The UHDRS TFC reflects a general impact on the ability to work, handle finances, and the activities of daily living. [12] The calculated TFC onset was about 4 years later than the rater estimate, indicating that it may take a number of years before HD manifestations have a substantial impact on daily life. The good agreement rate between the rater estimate and the calculated 'TFC onset minus 4 years' suggests this interval of 4 years is fairly robust and does not depend on the major sign at onset. A 'TFC onset' may add an important endpoint based on function. While in the prodromal phase new scales may have to be devised, in midstage HD patients the calculation of an onset of functional impairment using the TFC may help identify the factors that are most relevant to maintain function. [13]

Predicting age-at-onset in prodromal HD
Assuming our manifest participants were prodromal, we compared AAO estimations in the REGISTRY data using the Langbehn, or Ranen, formula with the rater estimates of AAO. While the medians agree well, in a substantial proportion of cases the difference was as large as ±20 years. Overall, however, the Langbehn formula estimates the AAO later than the raters' observed AAO. This suggests that unequivocal motor signs of HD manifest earlier, sometimes by many years, than predicted using the Langbehn formula.

Table 2.
Agreement rates (±5 years) between different formulae. Agreement rates between pairs of estimates were calculated with Clopper-Pearson 95% confidence intervals. If both methods arrived at the same result within a ±5 year bracket the agreement was defined as '1'. If the results were more than 5 years different the agreement was defined as '0'. For all participants, the agreement rate was expressed as % agreement within accepted range. Abbreviations: Langb db 06: calculated at an age corresponding to disease burden of 200, age when the predicted probability of signs exceeds 0.6. R. analog: Ranen analogous. R. extended: Ranen extended.

DISCUSSION
The present study assessed how a data derived age-at-onset compares to the rater's estimate, the commonly used definition of AAO. Using longitudinal data from the REGISTRY large observational study, our results suggest that it can be useful to 1) add in the manifest population motor score regression derived AAO as additional motor onset; 2) when patients are in mid-to latestage HD add TFC regression derived AAO for the onset of functional impact of HD; 3) predictions of AAO suggest a later onset than the actual emergence of unequivocal motor signs of HD.

Calculating age-at-onset in manifest HD
The current gold standard of AAO is a clinician's estimate that integrates data from the patient's history, the collateral history from family or carers, and the examination of the patient. We compared a data-derived AAO with this estimated AAO. We first used a simple regression analysis of longitudinal UHDRS motor score data and calculated the patient's age when the motor score was 5 or greater, a cut-off used in TRACK-HD, a large longitudinal multi-centre study. [10], [11] The AAO from regressing motor scores was 3 years later than the median AAO estimated by REGISTRY investigators. The agreement rates between the onset recorded in REGISTRY and the calculated motor onset revealed a difference of about 20 years at the 25th percentile. REGISTRY participants were enrolled when they already had manifest HD. This means that at the REGISTRY enrolment visit the investigator may have had to judge AAO after many years of manifest disease. Such a retrospective estimate is likely to be less accurate than direct observation of the onset. However, an estimated onset in REGISTRY that is earlier than the onset calculated from longitudinal motor scores may also suggest that investigators are not guided by motor signs alone.
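The back-extrapolation described above can be sketched for a single participant as follows. This is a simplified illustration, not the study's model: it fits an ordinary least-squares line per individual rather than a mixed model, the function names and visit data are hypothetical, and only the motor score cut-off of 5 and the minimum of three visits are taken from the text.

```python
def fit_line(ages, scores):
    """Ordinary least-squares slope and intercept of score regressed on age."""
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, scores))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def extrapolated_onset(ages, scores, cutoff=5.0):
    """Age at which the fitted line crosses `cutoff`.

    Returns None when the slope is not positive, i.e. the motor score is
    not worsening with age, so extrapolating backwards is not meaningful.
    """
    if len(ages) < 3:
        raise ValueError("need at least three visits")
    slope, intercept = fit_line(ages, scores)
    if slope <= 0:
        return None
    return (cutoff - intercept) / slope

# Hypothetical participant: age at each visit and UHDRS motor score.
ages = [50, 51, 52]
scores = [25, 30, 35]
print(extrapolated_onset(ages, scores))  # 46.0
```

Returning None for a non-positive slope mirrors the need to set aside participants whose scores do not worsen over the observed visits. A TFC-based onset would need the opposite sign convention, because TFC declines as the disease progresses.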
Many studies of genetic modifiers relate their effect to a general onset of HD. It is possible that there are domain-specific onset modifiers. Such domain-specific modifying effects may be overlooked unless domain-specific onsets are defined. Our data suggest that it is possible to use longitudinal motor score data to extrapolate a motor domain onset even in patients with a non-motor onset. This approach has the added advantage that it is based on data collected by certified UHDRS motor scale raters who examine the patient directly. This removes some of the variability introduced by judging an onset retrospectively.
The UHDRS TFC reflects a general impact on the ability to work, handle finances, and perform the activities of daily living. [12] The calculated TFC onset was about 4 years later than the rater estimate, indicating that it may take a number of years before HD manifestations have a substantial impact on daily life. The good agreement rate between the rater estimate and the calculated 'TFC onset minus 4 years' suggests that this interval of 4 years is fairly robust and does not depend on the major sign at onset. A 'TFC onset' may add an important endpoint based on function. While new scales may have to be devised for the prodromal phase, in mid-stage HD patients the calculation of an onset of functional impairment using the TFC may help identify the factors that are most relevant to maintaining function. [13]

Predicting age-at-onset in prodromal HD
Assuming our manifest participants were prodromal, we compared AAO estimations in the REGISTRY data using the Langbehn, or Ranen, formula with the rater estimates of AAO. While the medians agree well, in a substantial proportion of cases the difference was as large as ±20 years. Overall, however, the Langbehn formula estimates the AAO later than the raters' observed AAO. This suggests that unequivocal motor signs of HD manifest earlier, sometimes by many years, than predicted using the Langbehn formula.

PLOS Currents Huntington Disease
We next used the data to model predictions of the rater estimate of AAO. Integrating the parent AAO and the individual CAG repeat length resulted in a formula in very good agreement with the Ranen formula. The agreement rates of the Ranen, or Ranen analogous, formula with the rater estimate or the extrapolated AAO revealed a large window of ±20 years. At the population level, i.e. for the median, the difference is unbiased, but at the individual level large deviations from the rater estimate were found. This suggests that other factors, which average out at the group level, may also influence AAO in individuals.

Conclusions and limitations
A methodological limitation relates to the assumption of linear progression. Rates of decline of TFC in our study agree reasonably well with previous data documenting that the rate of decline is slower in the later stages of the disease than in the earlier stages, probably reflecting a floor effect. [12] Progression of HD may not be linear in all stages. Since our population was too small to examine this, further studies need to investigate rates of progression of HD in large data sets using, for example, non-linear statistical models. We had to exclude a substantial proportion of participants from the data-based extrapolations because the slopes were in the wrong direction. Such individuals present a challenge for the proposed technique of estimating individual onset by extrapolating backwards.
Our results suggest inaccuracies in the concept of a rater-estimated AAO. Observing the onset may result in a more accurate AAO than estimating it from a distance of many years, even if experienced HD clinicians use all available information from patients, relatives, and carers. We do not have an objective measure of AAO, or the truth, so we cannot reliably say which of the AAOs we have evaluated is the best. AAO implies that it is possible to identify a point in time when HD signs manifest, while clinically it would be more appropriate to refer to a transition period, sometimes of several years, from the prodromal to the manifest stage of HD. Based on our results we suggest adding a data-derived AAO in manifest HD cohorts, especially for the motor domain. This may be particularly important when using REGISTRY data, where the rater estimate of AAO seems less reliable. For predictions of AAO in the prodromal phase of HD, our data suggest that the Langbehn formula works better than the formula integrating parental AAO. The challenge for the future is to find objective means to define AAO, both predicted and retrospective. REGISTRY and PREDICT-HD, and also TRACK-HD, offer large collections of longitudinal data that can be used to meet this challenge. [10]
