Eigenvector index for two female fertility traits based in random regression coefficients matrix in Holstein cows

A total of 71,518 days open (DO) and number of services per conception (NSC) records of 28,523 Iranian Holstein cows were analysed by random regression model. Akaike’s information criterion and likelihood ratio test suggested that a model with quadratic Legendre polynomials for additive genetic and permanent environmental was best. Heritability in different parities ranged from 0.103 to 0.045 in first parity and 0.054 to 0.030 in sixth parity for DO and NSC, respectively. Estimated genetic correlations between parities decreased continuously with increasing distance between parities for both DO and NSC. The first eigenfunction explained 89.99% and 97.22 % of the total genetic variance of DO and NSC, while the second eigenfunction accounted 6.24% and 3.18%, respectively. Different selection indices were constructed for DO and NSC. Genetic response to improve overall fertility was greater when the index included the first eigenvector than the response obtained from indices excluding it. Similar genetic gains were obtained from the first eigenvector, which had a nearly flat associated eigenfunction along lactations and from selection by the intercept of the random regression. The first eigenvector indices were responsible of changes in the level of DO and NSC in a similar manner for all parities, without altering the shape of the response curve across parities. The second and third eigenvector indices modified the shape of this curve but the improvement in genetic gains by including them in the selection index were small (DO) or negligible (NSC) due to the small amount of variability associated with these components. Additional keywords: dairy cattle; eigenfunctions; selection indices. Abbreviations used: AI (Artificial Insemination); AIC (Akaike’s Information Criterion); DO (Days Open); logL (Logarithm of the Likelihood); LRT (Likelihood Ratio Test); NSC (Number of Services per Conception); RRM (Random Regression Model). Authors’ contributions: Both authors contributed equally in design, data analysis and drafting the manuscript. Citation: Ghiasi, H.; Carabaño, M. J. (2018). Eigenvector index for two female fertility traits based in random regression coefficients matrix in Holstein cows. Spanish Journal of Agricultural Research, Volume 16, Issue 1, e0401. https://doi.org/10.5424/sjar/201816112396 Received: 07 Oct 2017. Accepted: 05 Mar 2018. Copyright © 2018 INIA. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International (CC-by 4.0) License. Funding: Payame Noor University. Competing interests: The authors have declared that no competing interests exist. Correspondence should be addressed to Maria J. Carabaño: mjc@inia.es


Introduction
Fertility is a trait with a high economic value in most selection schemes in dairy cattle because of the negative economic consequences of low fertility (Bagnato & Oltenacu, 1994) and because of the deterioration in reproductive performance associated with intense selection by production traits (Liu et al., 2007). Nowadays, fertility is included in national selection index of many countries (Miglior et al., 2005) and improvement on daughter pregnancy rates has been found from the use of genomic selection (García-Ruiz et al., 2016). However, selection for improved fertility is a complex task. On the one hand, fertility is a complex trait, which results in a variety of measures being used in the genetic evaluations across the world (see, e.g., http://www.interbull.org/ib/geforms). This lack of a unique measure of fertility is due to the fact that fertility is composed of different traits (reproductive precocity, time to re-cycle after calving and ability to conceive once the animal is inseminated) and to the diverse type of information available to assess the reproductive success (e.g. availability of artificial insemination information (AI) or not) (González-Recio & Alenda, 2005;Jorjani, 2007). If results of AI in each insemination are available, traits such as days open (DO), interval from calving to first insemination, number of inseminations per conception, the interval between first and last insemination or conception rate have been used. When information of result of inseminations is not available, calving interval (or an approximate measure of days open obtained from calving interval and average pregnancy duration in the population) provide a measure of the overall reproductive efficiency in each parity and it is widely used because it can be easily obtained from milk recording schemes. However, this measure is highly influenced by farmer interventions (mainly, voluntary waiting period) and by problems regarding the low quality of fertility measurements such as censored records (Donoghue et al., 2004;VanRaden et al., 2004;González-Recio & Alenda, 2005). Although methods have proposed for handling censored information in fertility recording , most of the available genetic evaluation software cannot handle this type of information.
On the other hand, a wide number of statistical models have been used in the genetic evaluations of fertility traits with no clear consensus as to what approach is best for each trait. One of the features shared by most fertility traits is that several measures are available along the cow's productive life. Treatment of longitudinal traits in animal breeding has followed several approaches from the simplest repeatability model (González-Recio & Alenda, 2005;Ghiasi et al., 2011) to the most complex multiple trait models (Sun et al., 2010;Gutierrez, 2010). An intermediate stage is the use of random regression models (RRM) that allow for the use of a more parsimonious model compared with the multiple trait setting while keeping the realistic assumptions of heterogeneity of variances and incomplete correlation along time. The drawback is the need to define a priori a function that describes changes in the random variables along time. The use of random regressions has an added value in genetic selection which is that selection can not only be practiced to improve the level of the trait but also its persistency along time. The most common use of random regressions in dairy cattle can be found for production traits, which are analysed under the socalled test day model in many countries. Moreover, selection of lactation persistency has also been subject of many studies (Togashi & Lin, 2006;Savegnago et al., 2013), in milk production traits selection for persistency is a within lactation phenomenon but for fertility traits there is only one record in each lactation thus persistency for fertility traits is across lactation phenomenon. Of particular relevance is the use of the eigendecomposition of the covariance matrix associated to the RRM to obtain independent selection criteria for the level and persistency of the trait (Togashi & Lin, 2006). Random regression models (RRM) have been found to outperform repeatability models in the analysis of days open in fertility traits by Gutiérrez (2010). Other studies have found evidence of differences in patterns of response to fertility traits along parities (Nishida et al., 2006;Menéndez-Buxadera et al., 2013) in some fertility traits. However, none of the previous studies have dealt with the combined selection for fertility level and fertility persistency along lactation.
In this study, the use of RRM in the analyses of DO and number of services per conception (NSC), as representative traits for two types of information commonly available for genetic improvement of fertility, has been explored to compare the use of alternative regression orders of the RRM and to develop selection indices for level and persistency of fertility along the productive life of the cows.

Material and methods
A total of 71,518 records of DO and NSC collected from parity 1 to 6 during the period 1981 to 2007 were used in this study. These records came from 28,523 female Holstein cows sired by 925 bulls in 15 large Holstein herds of Iran. These herds of Holstein cows located in 10 different provinces of Iran. If NSC was greater than 10, then NSC was assigned to 10, and DO was required to be between 30 and 330 days. Single trait RRM were used to estimate the genetic parameters of DO and NSC. For analyzing traits with RRM, Legendre polynomials for the regression on parity number were used. Two models with first (linear) and second (quadratic) order of fit were compared. The logarithm of the likelihood (logL) and the Akaike's Information Criterion (AIC) were used for model comparison.
Estimations were done using the Wombat package (Meyer, 2007).
The RRM model in matrix notation was:

y=Xb+Za+Wpe+e,
with y = vector of observations for DO or NSC; b = vector of fixed effect that included age at previous calving, parity, herd-year-season of calving; a and pe = vectors of random regression coefficients for additive genetic and permanent environmental effects, respectively; e = vector of residual effects; X = incidence matrix and Z and W = design matrices containing values of the covariables of Legendre polynomials for the standardized value of the parity corresponding to each observation. Standardization to the [-1,1] interval of a given value of a parity (T=1,…,6), t, was done as: t*= 2 * ((t-min(T))/(max(T)-min(T)) -1, and covariates of the Legendre polynomials up to the second order were [1, t*, 0.5(3t* 2 -1)] The model assumptions were: where K a and K pe = matrices of (co)variance between random regression coefficients for additive genetic and permanent environmental effects, respectively; A = additive relationship matrix; I nr = identity matrix; nr = number of animals with records; = Kronecker product, and R = diagonal matrix with a homogeneous residual variance on the diagonal. The eigendecomposition of K a was obtained from, where E= matrix containing the eigenvectors of K a as columns (E'E = EE'= I) and D = diagonal matrix containing the eigenvalues (λ Corresponding eigenfunctions were obtained as: where ψi(t) = value of the i th eigenfunction at parity t, ei= i th eigenvector of K a and z(t*) = vector of covariates of the Legendre polynomial evaluated at value t*. Selection to improve the overall fertility performance for the first six lactations was considered for both DO and NSC by defining as the aggregate genotype, where 1 is a (6×1) summing vector of ones, Z is a (6×k) matrix of Legendre polynomial coefficients evaluated at parities 1 to 6, and, a is a (k×1) vector of additive genetic random regression coefficients; k is the number of regression coefficients fitted (k=3 for the quadratic polynomial).
Alternative selection indices were: i) The index based on the k eigenvectors of K a , I k .
where b is a (k ×1) vector of index weights. Index weights in b, obtained by maximization of the correlation between I k and H are: The genetic gain in each fertility trait from parity 1 to 6 was then estimated as: where i is the selection intensity that was set to one in this study and G (6x6) = ZK a Z' =covariance matrix among fertility values in different lactations ii) Partial selection indices based on one eigenvector, e i , or in two eigenvectors, e i , and ej, with D ij = diagonal matrix with the i th and j th eigenvalues, where k o = vector of covariances between the intercept and the other random regression coefficients and k oo = variance of the intercept iv) Selection by breeding value for first lactation, I L1 with z 1 = Z (t=1)

Genetic parameters
Descriptive statistics of DO and NSC in parities 1 to 6 are given in Table 1. The phenotypic mean of DO declined from parity 1 to 3 and then increased up to parity 6 (p<0.01). In agreement with our results, a decline in DO for primiparous vs. later lactations (Meikle et al., 2004) and a decline for the first four lactations followed parities while permanent environmental variance increased. Following the pattern of estimated genetic variances, heritabilities decreased with parity from 0.103 and 0.045 in the first parity to 0.054 and 0.030 in the sixth parity for DO and NSC, respectively (Tables 4 and 5). Estimated genetic and phenotypic correlations between different parities are shown in Tables 4 and 5 for DO and NSC, respectively. High genetic correlations (ranging from 0.66 to 0.99) were estimated for each trait in different parities but phenotypic correlations were low (< 0.13). Estimated genetic correlations between two parities decreased continuously with increasing distance between parities in both traits. For both traits, estimated genetic correlations with the 6 th parity decreased abruptly, except for the first parity.

Genetic eigenvalues, eigenfunctions and selection indices
The three eigenfunctions associated with the additive genetic covariance matrices are shown in Figure 2. The first eigenfunction explained 89.99 and 97.20% of the total genetic variability of DO and NSC across parities, while the second eigenfunction accounted for 6.24 and 2.79%, respectively. The first eigenfunction of DO and NSC is positive and nearly constant throughout all parities.
by increases afterwards (Elmetwally et al., 2016) have been observed in other studies. The phenotypic mean of NSC linearly increased at consecutive parities and this change only was statistically significant (p<0.01) but differences between mean of NSC in parity 2, 3 and 4 was not statistically significant (p>0.05). An increasing trend for NSC has also been found in other studies (Nishida et al., 2006;Asimwe & Kifaro, 2007;Khan et al., 2015). The descriptive statistics indicate that cows in earlier parities had more chances for good fertility performance than cows in later parities. The number of records decreased from parity 1 to 6 due to voluntary and involuntary culling of cows.
Model comparison statistics, logL, AIC and likelihood ratio tests (LRT) are presented in Table 2. A quadratic fit is favoured by the two criteria for both DO and NSC. Genetic parameter estimates that are later shown belong to the quadratic fit.
The estimated covariances and correlations between the additive genetic random regression coefficients for DO and NSC are shown in Table 3. The estimated correlation between the intercept and the linear coefficient was negative for both traits. Estimated additive genetic and permanent environmental variances for each parity obtained from the covariance functions are shown in Figure 1 for DO and NSC. Estimated additive genetic variances decreased along   The pattern of the second eigenfunction of DO is similar to the third eigenfunction of NSC. The second eigenfunction of DO almost linearly decreased from parity 1 to 4 and then stabilized. Estimated genetic response to alternative selection indices for DO and NSC are shown in Table 6. Genetic responses shown are positive, indicating an increase in both traits, DO and NSC. By changing the sign of the selection index, gains in the desired direction could be obtained. The overall estimated genetic response to selection (i.e., the sum of the expected response from all lactations) by the indices that included the first eigenvector (I e1 ,I e1e2 ,I e1e3 ,I k ) or selection based on the intercept (I ao ) was much larger than selection from indices based on the second and third eigenvalues, which account for changes in the trait level across lactations (I e2 ,I e3 ,I e2e3 ). The lowest overall genetic gain was observed for I e2 for DO and for I e3 for NSC. The improvement of adding variables related to changes in the trajectory of the trait across lactations was small for DO and negligible for NSC. For the index showing the largest overall response (I e1e2 ), responses for parity 1 to 6 were 18, 17.2, 16.2, 15,13.4 and 11.6 days, and 0.28, 0.27,0.27, 0.25, 0.24 and 0.23 units, for DO and NSC, respectively.
Using I e2 (including only the second eigenvector) resulted in positive genetic gains for DO in parity 1 and 2 and negative genetic gains in parity 3, 4, 5 and 6. Selection by I e2 for NSC caused negative genetic gains in parity 2, 3 and 4 and positive genetic gains in parity 1, 5 and 6. Including third and second eigenfunction together in selection indices (I e2e3 ) resulted in positive genetic gains in parities 1, 2, 5 and 6 and negative genetic gains in other parities for DO, and negative genetic gains in parities 2, 3 and 4 and positive genetic gains in other parities for NSC.

Discussion
Both, the LRT and AIC criteria indicated that the model fitting a quadratic Legendre polynomial for additive genetic and permanent environmental effects was better than a linear model for analyzing DO and NSC in terms of goodness of fit. Menéndez-Buxadera et al. (2013) also found that the quadratic Legendre polynomial provided a better fit than the linear polynomial when analyzing DO and other fertility traits in the Holstein cattle population of The standard errors were less than 0.005.   The negative correlation between the intercept and the linear coefficient indicate that animals with small genetic values for the level of the trait are expected to show an increase in the trait along parities. Thus, cows with good genetic potential for fertility (small values for DO or NSC) will tend to show a decline in fertility (increased DO and NSC) performance along lactations and vice versa.
In our study for DO and NSC, the estimated additive genetic variances decreased along parities while permanent environmental variance increased. Patterns of response in other studies differ. Gutierrez (2010) reported increasing heritability of DO from age 24 to 72 months while Menendez-Buxadera et al. (2013) found a bell-shaped response with maximum values of heritability in the 4 th parity and Nishida et al (2006), in agreement with our results, showed estimated heritability of NSC that decreased from first to sixth parity and then increased up to 10 th parity. For the same data, Ghiasi et al. (2011)  In this study, high genetic correlations (ranging from 0.66 to 0.99) were estimated for each trait (DO or NSC). The high estimates of genetic correlations between different parities indicate that selection to decrease DO or NSC at any parity will decrease DO or NSC throughout all parities. However, because the estimated genetic correlations are not unity, differences in the pattern of response of individual animals across parities are expected. In other words, persistency of DO and NSC genetic merit across parities may differ among animals.
The first eigenfunction of DO and NSC in this study is positive and nearly constant throughout all parities. This means that selecting for the genetic value associated to the first eigenfunction will result in a similar improvement in all parities, deriving from the high and positive correlation found between different parities. The first eigenfunction is therefore responsible for scaling up and down the curve of DO and NSC without changing its shape and explains most of the observed variability in this data. But, selection for the second eigenfunction could be used to decrease DO from first parity to subsequent lactations. Selection based on the third eigenfunction of NSC would reduce the level of NSC from parity 1 to 4 but the amount of genetic response will be low because the third eigenvalue explains a low portion of the overall additive genetic variance of NSC. The pattern of the third eigenfunction of DO and the second eigenfunction of NSC showed a more complex response. The proportion of variance explained by the first eigenfunction in other studies dealing with quadratic RRM for fertility traits varied from 66% in Nishida et al. (2006) for NSC to 92 % in Menendez-Buxadera et al. (2013) for DO.
The overall estimated genetic response to selection (i.e., the sum of the expected response from all lactations) by the indices that included the first eigenvector (I e1 ,I e1e2 ,I e1e3 ,I k ) or selection based on the intercept (I ao ) was much larger than selection from indices based on the second and third eigenvalues, which account for changes in the trait level across lactations (I e2 ,I e3 ,I e2e3 ). The lowest overall genetic gain was observed for I e2 for DO and for I e3 for NSC. Including the second or third eigenvector to the selection index along with the first eigenvector index had a small impact on the direction and amount of genetic gains. This negligible impact of including the second and/or third eigenvariables in the selection index for NSC might be associated with the fact that the amount of variability explained by those variables was very small, 3%, while the same variables explained around 10% for DO. This derives from the larger genetic correlations between more distant lactations for NSC, especially for first parity, which showed a 0.9 correlation with parity 6 for this trait compared with DO, for which that correlation was 0.69. Both the very small of variability explained by the second and third eigenvariables and the large correlation between lactations imply little re-ranking of animals for different lactations and small changes in the trait across lactations. In the same way, the amount of overall genetic gain obtained from selection by first lactation performance, I L1 , was around 12% lower than genetic gains obtained from selection by I e1 ,I e1e2 ,I k and I ao for DO and the same for NSC. Under all indices involving e 1 or a o , the amount of genetic gain was largest for first parity and decreased along parities. Selection by an index combining the three eigenvectors provided more even responses across lactations but also an overall smaller response. These results are in agreement with results obtained by Togashi & Lin (2006) that reported the first eigenvector index for milk production resulting in constant and linear genetic gains in each day of lactation. Based on results obtained in this study, indices including second and third eigenvectors are responsible for changing the shape of the curve of DO and NSC along lactations. The results obtained for the second eigenvector index contrast with the results obtained by Togashi & Lin (2006). In particular, the mentioned study reported genetic responses for milk production in the second eigenvector that increased linearly throughout lactation. However, results obtained for the third eigenvector index in the current ΔG b Trait Index a study are in agreement with the results obtained by Togashi & Lin (2006) who concluded that genetic gains obtained for milk production from the selection on the third eigenvector index were positive in early and late of lactation and negative in mid-lactation. Genetic gains for NSC using I e3 were near zero in all parities therefore this index could not change the shape of NSC across parities. Togashi & Lin (2006) reported that genetic gains for milk production from selection by the fourth and fifth eigenvector indices were around zero. Eigenfunction and eigenvector indices obtained for DO and NSC indicate that it is possible to increase or decrease the amount of these traits along the different parities by making use of either the first eigenvector or the intercept of the random regression. However, the results of this study revealed that if the breeding goal were based on the second and third eigenvectors fertility performance would be improved in one parities and deteriorated in other parities. Thus, no clear benefit from using indices that include both level (first eigenvector) and shape of the trajectory along lactations (second and/or third eigenvectors) over using only variables associated with the level of the traits (first eigenvector or intercept of the random regression) was found.