GENERATING INTEREST-RATE SCENARIOS FOR FIXED-INCOME PORTFOLIO OPTIMISATION

One of the main sources of uncertainty in the analysis of the risk and return properties of a portfolio of fixed-income securities is the stochastic evolution of the shape of the yield curve. The authors have estimated a model that fits the South African yield curve, using a Kalman filter. The model includes four latent factors and three observable macroeconomic variables (capacity utilisation, inflation and the repo rate). The goal is to capture the dynamic interactions between the macroeconomy and the yield curve in such a way that the resulting model can be used to generate interest-rate scenario trees that are suitable for fixed-income portfolio optimisation. An important input into the scenario generator is the investor’s view on the future evolution of the repo rate. In this paper, details of the model are provided and the results of the estimation and scenario generation are reported.


INTRODUCTION
1.1
One of the main sources of uncertainty in the analysis of the risk and return properties of a portfolio of fixed-income securities is the stochastic evolution of the shape of the yield curve. Many yield-curve studies (e.g. Knez et al., 1994; Duffie & Kan, 1996; Dai & Singleton, 2000) consider models in which unobserved factors explain the entire set of yields. (In this paper 'yield' means the yield to maturity on a zero-coupon bond.) These factors are often given the labels 'level', 'slope' and 'curvature'. The factors in these models are, however, not linked to macroeconomic variables. Examples where a latent-factor model is used to characterise the yield curve and that explicitly include macroeconomic factors can be found in Ang & Piazzesi (2003), Hördahl et al. (unpublished) and Wu (unpublished). These examples, however, only consider a unidirectional linkage between the macroeconomy and the yield curve. Kozicki & Tinsley (2001), Dewachter & Lyrio (unpublished) and Rudebusch & Wu (unpublished) allow for implicit feedback. Tilley (1992) provides an actuarial layperson's guide to building stochastic interest-rate generators.
1.2
Bernaschi et al. (2008) show that, apart from the relation with the official interest rate of the European Central Bank, it is extremely difficult to find, using simple linear regression analysis, a convincing relation between the parameters that describe the Italian yield curve and the macroeconomic variables that drive the dynamics of the yield curve itself. Bernaschi et al. (op. cit.) conclude that a possible solution is to resort to more complex interaction models based on non-linear impulse-response functions. Ang & Piazzesi (op. cit.) further argue the importance of describing the joint behaviour of the yield curve and macroeconomic variables for bond pricing, investment decisions and public policy. They state that although many yield-curve models use latent factors to explain yield-curve movements and may afford some interpretation of the meaning of these factors (e.g. level, slope and curvature), the factors do not correspond explicitly to macroeconomic variables. These models describe the effect the latent factors have on the yield curve rather than describing the economic sources of the shocks. Ang & Piazzesi (op. cit.) consider a unidirectional linkage between the macroeconomy and the yield curve.

1.3
De Pooter et al. (unpublished) argue that models that include macroeconomic variables seem more accurate in sub-periods where there is substantial uncertainty about the future path of interest rates. Furthermore, models that do not include information about the macroeconomy perform well in sub-periods where the yield curve has a more stable pattern.

1.4
Inspired by the research of Diebold, Rudebusch & Aruoba (2006), the authors estimate a model that fits the South African yield curve, using a Kalman-filter approach. Diebold, Rudebusch & Aruoba (op. cit.) characterise the yield curve using three latent factors, namely level, slope and curvature. To model the dynamic interactions between the macroeconomy and the yield curve, they also include observable macroeconomic variables, specifically real activity, inflation and a monetary-policy instrument.

1.5
To capture the dynamics of the yield curve, Diebold, Rudebusch & Aruoba (op. cit.) do not use a no-arbitrage factor representation such as the typically used affine no-arbitrage models (see e.g. Duffee, 2002 and Brousseau, unpublished) or canonical affine no-arbitrage models (see e.g. Rudebusch & Wu, op. cit.). Instead of using a no-arbitrage representation, Diebold, Rudebusch & Aruoba (op. cit.) suggest using a three-factor yield-curve model based on the yield-curve model of Nelson & Siegel (1987), as used in Diebold & Li (2006), and interpret these factors as level, slope and curvature. Diebold & Li (op. cit.) propose a two-step procedure to estimate the dynamics of the yield curve. The procedure first estimates the three latent factors and then estimates an autoregressive model for these factors. Diebold & Li (op. cit.) use these models to forecast the yield curve. Diebold, Rudebusch & Aruoba (op. cit.) propose a one-step alternative, an integrated state-space modelling approach, which is preferred to the two-step Diebold-Li approach. This Kalman-filter approach allows for a bidirectional linkage between the macroeconomy and the yield curve and simultaneously fits the yield curve and estimates the underlying dynamics of these factors. The model also incorporates the estimation of the macroeconomic factors and the link between the macroeconomy and the latent factors driving the yield curve.

1.6
In the South African context, Maitland (2002) uses principal-component analysis to interpolate the South African yield curve. The method proposed by Maitland (op. cit.) provides a way in which the yield curve can be interpolated from a restricted number of modelled yields, and at the same time minimises the number of yields from which to estimate the remainder of the curve. Given the first and second principal components, Maitland (op. cit.) shows that the short rate and the long-bond yield could be used to reconstruct the South African yield curve. Stander (unpublished) discusses bond indices in South Africa. Using a survey, Stander (op. cit.) establishes inadequacies in the indices as well as possible changes that should be considered. Stander (op. cit.) further addresses criticism of the Bond Exchange-Actuaries yield curve and presents alternative empirical yield-curve models and equilibrium models. These contributions focus on the characterisation of the yield curve and do not consider forecasting or scenario generation.

1.7
In Section 2, the authors describe Kalman-filter state-space modelling for the basic three-factor 'yields-only model' (YO3F) proposed by Diebold, Rudebusch & Aruoba (op. cit.). Their model uses only three latent factors of the yield curve and does not include macroeconomic factors. The model estimation for the South African yield curve is described and a four-factor model (YO4F) based on the Svensson (1994) yield-curve model is introduced. It is shown that the Nelson & Siegel (op. cit.) model is not flexible enough to give an acceptable fit to the South African yield curve and a four-factor model is therefore introduced.

1.8
In Section 3, macroeconomic variables (capacity utilisation, inflation and the repo rate) are incorporated into the 'yields-macro model' (YM4F). The goal is to capture the dynamic interactions between the macroeconomy and the yield curve in such a way that the resulting model can be used to generate interest-rate scenario trees that are suitable for fixed-income portfolio optimisation. Section 4 describes the approach adopted. An important input into the scenario generator is the investor's view on the future evolution of the repo rate. In practice these views are produced by means of an economic scenario generator (ESG) or expert opinion. The existence of arbitrage in the scenario trees is discussed and a method to eliminate arbitrage opportunities is proposed. Concluding remarks are offered in Section 5.

YIELDS-ONLY MODEL
In this section the factor-model representation of the yield curve is introduced. Following Diebold, Rudebusch & Aruoba (op. cit.), the discussion starts with the YO3F model using the three-factor representation of Nelson & Siegel (op. cit.) and this is used as a benchmark for the four-factor representation of Svensson (op. cit.). By using the more flexible four-factor model, a better cross-sectional fit of the South African yield curve is obtained. Since all the models that are described in this section are fitted using a Kalman filter, this section starts with an overview of the Kalman filter.

2.1 THE KALMAN FILTER
2.1.1
The Kalman filter, introduced by Kalman (1960), is a popular technique used in signal processing, control engineering and other fields. The main idea is to represent a dynamic system in terms of states (the unobserved underlying Markov process). The 'state equation' (or 'transition equation') describes the dynamics of this process while the 'observation equation' (or 'measurement equation') relates the observables to the unobserved states. The advantage of using a state-space representation (defined below) is that it allows the modeller to infer the properties of the unobserved yield-curve drivers from the observed yields.
2.1.2
Following Hamilton (1994: Chapter 13), let $y_t$ denote a vector of variables (yields in our case) observed at date $t$ that can be described in terms of $f_t$, a vector of unobservable states. The 'state-space representation' of the dynamics of $y$ is then given by the following system of equations:
$$f_t = A f_{t-1} + \eta_t, \qquad (1)$$
$$y_t = B x_t + \Lambda f_t + \varepsilon_t, \qquad (2)$$
where the matrices $A$, $B$ and $\Lambda$ have appropriate dimensions and $x_t$ is a vector of exogenous variables. Equation (1) is the transition equation and equation (2) is the measurement equation. The disturbances $\eta_t$ and $\varepsilon_t$ are vector white-noise processes such that:
$$E(\eta_t \eta_s') = \begin{cases} Q & \text{for } t = s \\ 0 & \text{otherwise;} \end{cases} \qquad \text{and} \qquad E(\varepsilon_t \varepsilon_s') = \begin{cases} H & \text{for } t = s \\ 0 & \text{otherwise,} \end{cases}$$
where the matrices $Q$ and $H$ have appropriate dimensions. The disturbances $\eta_t$ and $\varepsilon_t$ are also assumed to be uncorrelated at all lags:
$$E(\eta_t \varepsilon_s') = 0 \quad \text{for all } t \text{ and } s.$$
The Kalman filter is a sequential algorithm that calculates the best predictor of the unobserved states, given all previous observations (see details below).
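The sequential nature of the filter can be illustrated with a minimal Python sketch of a single prediction and update step. It assumes the transition equation $f_t = A f_{t-1} + \eta_t$ and the measurement equation $y_t = \Lambda f_t + \varepsilon_t$, with the exogenous term $B x_t$ omitted and normally distributed disturbances. The function name `kalman_step` and its argument names are illustrative only and are not taken from the paper, whose own algorithm is set out in Appendix A.

```python
import numpy as np

def kalman_step(f_prev, P_prev, y, A, Lam, Q, H):
    """One predict/update cycle for f_t = A f_{t-1} + eta_t, y_t = Lam f_t + eps_t.

    f_prev, P_prev : filtered state mean and covariance at time t-1
    y              : vector of observed yields at time t
    Returns the filtered mean, filtered covariance and the Gaussian
    log-likelihood contribution of the observation y.
    """
    # Prediction step: propagate the state and its covariance one period ahead.
    f_pred = A @ f_prev
    P_pred = A @ P_prev @ A.T + Q

    # Innovation (prediction error) and its covariance.
    v = y - Lam @ f_pred
    F = Lam @ P_pred @ Lam.T + H

    # Kalman gain and update step.
    K = P_pred @ Lam.T @ np.linalg.inv(F)
    f_filt = f_pred + K @ v
    P_filt = P_pred - K @ Lam @ P_pred

    # Log-likelihood contribution under the normality assumption.
    n = y.shape[0]
    _, logdet = np.linalg.slogdet(F)
    loglik = -0.5 * (n * np.log(2.0 * np.pi) + logdet + v @ np.linalg.solve(F, v))
    return f_filt, P_filt, loglik
```

Applying the function repeatedly over $t = 1,…,T$ and summing the returned log-likelihood contributions gives the quantity that a numerical optimiser would maximise over the model parameters.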

2.2 FACTOR REPRESENTATION
2.2.1
The main aim of the factor model is to represent the yield curve (a large set of yields with various maturities) as a function of a smaller set of unobservable factors. The Nelson-Siegel (op. cit.) representation produces reliable and reasonable estimation results and has become one of the popular approaches adopted by central banks for yield-curve estimation (see footnote 1). The Nelson-Siegel model, derived from a parametric functional form for the forward rates, uses only four parameters to define a parsimonious and stable representation of the whole yield curve:
$$y(\tau) = \beta_1 + \beta_2 \left( \frac{1 - e^{-\lambda\tau}}{\lambda\tau} \right) + \beta_3 \left( \frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau} \right),$$
where $y(\tau)$ is the yield with maturity $\tau$ and $\beta_1$, $\beta_2$, $\beta_3$ and $\lambda$ are the model parameters. As demonstrated by Diebold & Li (op. cit.), the parameters $\beta_1$, $\beta_2$ and $\beta_3$ of the Nelson-Siegel representation of the yield curve can be interpreted as level, slope and curvature; the terms that multiply these factors are called the 'factor loadings'. The parameter $\lambda$ determines the shape of the curve and does not have a direct economic interpretation.
To give meaning to the parameters $\beta_1$, $\beta_2$ and $\beta_3$, Diebold & Li (op. cit.) rewrite the representation as
$$y_t(\tau) = L_t + S_t \left( \frac{1 - e^{-\lambda\tau}}{\lambda\tau} \right) + C_t \left( \frac{1 - e^{-\lambda\tau}}{\lambda\tau} - e^{-\lambda\tau} \right),$$
where $L_t$, $S_t$ and $C_t$ are the time-varying counterparts of $\beta_1$, $\beta_2$ and $\beta_3$ and are treated as unobserved factors.
2.2.2
Diebold, Rudebusch & Aruoba (op. cit.) describe the state-space system as follows. The dynamics of the unobservable factors $L_t$, $S_t$ and $C_t$ are modelled as a vector autoregressive process of the first order (a 'VAR(1)' process), which forms a state-space system. The autoregressive moving-average state-vector dynamics may be of any order, but the VAR(1) assumption is maintained for transparency and parsimony. The dynamics of the state vector are governed by the transition equation:
$$\begin{pmatrix} L_t - \mu_L \\ S_t - \mu_S \\ C_t - \mu_C \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} L_{t-1} - \mu_L \\ S_{t-1} - \mu_S \\ C_{t-1} - \mu_C \end{pmatrix} + \begin{pmatrix} \eta_t(L) \\ \eta_t(S) \\ \eta_t(C) \end{pmatrix},$$
where $t = 1,…,T$.
1 Zero-coupon yield curves: technical documentation. Bank for International Settlements, Switzerland, 1999

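2.2.3
For a fixed value of the parameter $\lambda$ (specified below), the measurement equation, which relates a set of $N$ yields, with maturities $\tau_1,…,\tau_N$, to the three unobserved factors, is:
$$\begin{pmatrix} y_t(\tau_1) \\ \vdots \\ y_t(\tau_N) \end{pmatrix} = \begin{pmatrix} 1 & \frac{1 - e^{-\lambda\tau_1}}{\lambda\tau_1} & \frac{1 - e^{-\lambda\tau_1}}{\lambda\tau_1} - e^{-\lambda\tau_1} \\ \vdots & \vdots & \vdots \\ 1 & \frac{1 - e^{-\lambda\tau_N}}{\lambda\tau_N} & \frac{1 - e^{-\lambda\tau_N}}{\lambda\tau_N} - e^{-\lambda\tau_N} \end{pmatrix} \begin{pmatrix} L_t \\ S_t \\ C_t \end{pmatrix} + \begin{pmatrix} \varepsilon_t(\tau_1) \\ \vdots \\ \varepsilon_t(\tau_N) \end{pmatrix},$$
where $t = 1,…,T$. The state-space system can be written in matrix notation as
$$(f_t - \mu) = A (f_{t-1} - \mu) + \eta_t, \qquad y_t = \Lambda f_t + \varepsilon_t.$$
The white-noise disturbances in the transition and measurement equations are required to be orthogonal to each other and to the initial state for the linear least-squares optimality of the Kalman filter:
$$\begin{pmatrix} \eta_t \\ \varepsilon_t \end{pmatrix} \sim WN \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} Q & 0 \\ 0 & H \end{pmatrix} \right], \qquad E(f_0 \eta_t') = 0, \qquad E(f_0 \varepsilon_t') = 0.$$
2.2.4
Diebold, Rudebusch & Aruoba (op. cit.) assume that the $Q$ matrix is non-diagonal to allow the shocks to the three yield-curve factors to be correlated. The $H$ matrix is assumed to be diagonal, which implies that the deviations of the yields of various maturities from the yield curve are uncorrelated. This is quite standard and, as in estimating no-arbitrage yield-curve models, independently and identically distributed (i.i.d.) 'measurement errors' are added to the observed yields. Given the large number of observed yields, this is required for computational tractability as well (Diebold, Rudebusch & Aruoba, op. cit.).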
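As an illustration of the measurement equation in paragraph 2.2.3, the following Python sketch builds the Nelson-Siegel loading matrix Λ for a given set of maturities and a fixed value of λ. The function name `nelson_siegel_loadings` and the example maturities are illustrative assumptions; the paper's own dataset contains 27 maturities.

```python
import numpy as np

def nelson_siegel_loadings(maturities, lam):
    """Return the N x 3 matrix of level, slope and curvature loadings.

    maturities : maturities tau_1, ..., tau_N in months
    lam        : the Nelson-Siegel decay parameter lambda
    """
    tau = np.asarray(maturities, dtype=float)
    x = lam * tau
    slope = (1.0 - np.exp(-x)) / x          # loading on the slope factor
    curvature = slope - np.exp(-x)          # loading on the curvature factor
    level = np.ones_like(tau)               # loading on the level factor
    return np.column_stack([level, slope, curvature])

# Model-implied yields for hypothetical factor values (L_t, S_t, C_t).
Lam = nelson_siegel_loadings([3, 24, 228], 0.0609)
yields = Lam @ np.array([8.5, -1.0, 0.5])
```

Each row of the matrix corresponds to one maturity, so multiplying it by the factor vector gives the fitted yield curve before the measurement errors are added.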

2.3.2
The variation in the level of the yield curve is visually apparent, as is the variation in its slope and curvature. Descriptive statistics for the yields (mean, standard deviation, minimum, maximum and autocorrelations for 1, 12 and 30 months) are provided in Table 1. It is clear that the typical yield curve is hump-shaped with a negative hump at about 20 months and a positive hump at about 120 months. The short rates are less volatile than the long rates, but a comparison of the autocorrelations at a lag of 12 months shows that they are less persistent. This is opposite to the U.S. yield curve (see Diebold & Li, op. cit.). The level is persistent and varies moderately relative to its mean, and the slope and the curvature are the least persistent. The slope is highly variable relative to its mean, as is the curvature. In Figure 2 the median yield curve is displayed together with point-wise interquartile ranges. The hump-shaped pattern, with short rates less volatile than long rates, is apparent.
2.3.3
Following Diebold, Rudebusch & Aruoba (op. cit.), the YO3F model forms a state-space system, with a VAR(1) transition equation summarising the dynamics of the vector of latent variables, and a linear measurement equation relating the observed yields to the state vector as described above. In the entire model there are 46 parameters that need to be estimated by the numerical optimisation of the relevant likelihood function. Let ψ be the vector of all parameters that need to be estimated. These parameters are the nine parameters contained in the transition matrix $A$, the three parameters contained in the mean state vector μ, and the one parameter λ contained in the measurement matrix Λ. Furthermore, the transition-disturbance covariance matrix $Q$ contains six parameters, and the measurement-disturbance covariance matrix $H$ contains 27 parameters (one variance for each of the 27 yields). Given that the transition and measurement equations are linear in the state vector and assuming that the distributions of $\eta_t$, $\varepsilon_t$ and $f_0$ are normal, the model is referred to as a 'linear Gaussian state-space model' (Lemke, 2006). The Kalman-filter algorithm is provided in Appendix A.
2.3.4
The parameters are estimated by maximising the log-likelihood function (see Appendix A), using either the Nelder-Mead simplex or the Newton-Raphson algorithm. For more details on Kalman filtering see Harvey (1989) and Lemke (op. cit.). Non-negativity constraints are imposed on all the variances. As in Diebold, Rudebusch & Aruoba (op. cit.), starting parameters are obtained using the two-step Diebold-Li method, and the variances are initialised to 1,0. As in Diebold & Li (op. cit.), the value of λ is initialised at 0,0609 to maximise the loading on the curvature factor at 30 months, i.e. the maturity at which the hump occurs in the yield curve.
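The link between λ and the maturity at which the curvature loading peaks can be checked numerically. The sketch below is a simple grid search in Python and is an illustration only; the function name `curvature_peak_maturity`, the grid step and the upper bound on maturities are assumptions and not part of the paper.

```python
import numpy as np

def curvature_peak_maturity(lam, max_months=600.0, step=0.01):
    """Return the maturity (in months) that maximises the curvature loading
    (1 - exp(-lam*tau)) / (lam*tau) - exp(-lam*tau), found on a fine grid."""
    tau = np.arange(step, max_months, step)
    x = lam * tau
    loading = (1.0 - np.exp(-x)) / x - np.exp(-x)
    return float(tau[np.argmax(loading)])

# For the starting value of lambda the peak lies close to 30 months.
peak = curvature_peak_maturity(0.0609)
```

A larger value of λ moves the peak of the curvature loading to a shorter maturity, which is why the estimate of λ determines where the fitted hump can occur.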
2.3.5
Tables 2 and 3 show the estimation results for the YO3F model. In those tables bold entries denote parameter estimates significant at 5% and standard errors appear in parentheses. In Table 2 the estimate of the $A$ matrix indicates the highly persistent dynamics of $L_t$, $S_t$ and $C_t$, the estimated own-lag coefficients being 0,945, 0,987 and 0,953 respectively. Cross-factor dynamics between $S_t$ and $L_t$ and between $S_t$ and $C_t$ appear to be important, with statistically significant effects. The mean of the level is approximately 8,5%. The means of the slope and of the curvature do not seem to be statistically significantly different from zero and appear to be reasonable in comparison with the mean values of the empirical estimates in Table 1. The largest eigenvalue of the $A$ matrix is 0,96, which ensures the stationarity of the system. In Table 3 the estimates of the $Q$ matrix indicate that transitional shock volatility increases as we move from $L_t$ to $S_t$ to $C_t$, as measured by the diagonal elements. There are no significant covariance terms in the $Q$ matrix. The estimate for λ is 0,0916, which implies that the loading on the curvature factor is maximised at a maturity of 19,85 months. This can be seen in Figure 2, where the first (inverted) hump occurs near the maturity of 20 months.

2.3.6
Table 4 shows the means and standard deviations of the predicted errors (also called measurement errors, which are measured as the excess of the actual yields over the predicted model yields) for the yields-only model and the yields-macro model (presented in Section 3 below). The YO3F model fits the yield curve reasonably well in the short maturities but less so in the longer maturities, the standard deviation also increasing for longer maturities. The results are similar to those of the yields-only model of Diebold, Rudebusch & Aruoba (op. cit.).

2.3.7
The Kalman-filter fixed-interval smoothing algorithm (see Appendix A) is used to obtain optimal extractions of the latent level, slope and curvature factors. Figure 3 plots the three smoothed estimated factors together and Figures 4 to 6 plot the three factors together with various empirical proxies and related macroeconomic variables. The level factor in Figure 3 is in the neighbourhood of 8% and displays persistence. The slope and the curvature factors vary around zero with positive and negative values and appear less persistent. The slope factor is more persistent than the curvature factor but has a lower variance. This seems consistent with the mean and autocorrelation values of the empirical estimates in Table 1.
2.3.8
Figure 4 displays the estimated level factor and two related comparison series. The first is a commonly used empirical proxy for the level factor, namely the average of the short-, medium- and long-term yields: $(y(3) + y(24) + y(228))/3$. The second is the annual percentage change in the consumer price index. There is a high correlation of 0,89 between the level factor and the empirical proxy. The correlation between the level factor and the inflation rate is 0,51. As stated by Diebold, Rudebusch & Aruoba (op. cit.), this is consistent with the Fisher equation, which suggests a link between the level of the yield curve and inflation expectations.

2.3.9
In the estimates of the empirical level, slope and curvature the 228-month yield was used and not the 120-month yield as in Diebold, Rudebusch & Aruoba (op. cit.). This has very little effect on the results.
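The three empirical proxies used in this section are simple combinations of the 3-, 24- and 228-month yields and can be sketched in Python as follows. The level proxy is assumed here to be the simple average of the three yields; the function name `empirical_proxies` and the example yields are hypothetical.

```python
def empirical_proxies(y3, y24, y228):
    """Return the empirical level, slope and curvature proxies.

    y3, y24, y228 : yields (in percent) at maturities of 3, 24 and 228 months
    """
    level = (y3 + y24 + y228) / 3.0        # average of short, medium and long yields
    slope = y3 - y228                      # excess of the short yield over the long yield
    curvature = 2.0 * y24 - y3 - y228      # twice the medium yield less the short and long yields
    return level, slope, curvature

# Hypothetical upward-sloping curve: 7% short, 8% medium, 9% long.
level, slope, curvature = empirical_proxies(7.0, 8.0, 9.0)
```

With this sign convention an upward-sloping curve has a negative slope proxy, because the slope is defined as the short yield less the long yield.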
2.3.10
Figure 5 displays the estimated slope factor and two related comparison series. The first is the empirical proxy for the slope factor, namely the excess of the short-term yield over the long-term yield: $y(3) - y(228)$. The second is an indicator of macroeconomic activity, namely the demeaned manufacturing capacity utilisation. There is a high correlation of 0,97 between the slope factor and the empirical proxy. The correlation between the slope factor and capacity utilisation […]. Diebold, Rudebusch & Aruoba (op. cit.) suggest that, as with the level factor, there is a connection between the yield curve and the cyclical dynamics of the economy.

2.3.11
Figure 6 shows the curvature factor and the empirical proxy for the curvature of the yield curve, viz.: $2y(24) - y(3) - y(228)$. There is a correlation of 0,79 between the curvature factor and the empirical proxy. Diebold, Rudebusch & Aruoba (op. cit.) report no reliable macroeconomic links to the curvature factor.
2.4.1
Table 4 shows that the YO3F model fits the yield curve reasonably well in the short maturities but less well at the longer maturities. The YO3F model is extended to a four-factor model using the Svensson (op. cit.) representation of the yield curve:
$$y(\tau) = \beta_1 + \beta_2 \left( \frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau} \right) + \beta_3 \left( \frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau} - e^{-\lambda_1\tau} \right) + \beta_4 \left( \frac{1 - e^{-\lambda_2\tau}}{\lambda_2\tau} - e^{-\lambda_2\tau} \right),$$
where $y(\tau)$ is the yield to maturity $\tau$ and $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, $\lambda_1$ and $\lambda_2$ are model parameters.
Figure 7 illustrates a fit of both the Nelson-Siegel curve and the Svensson curve to an arbitrary yield curve in the dataset. It is clearly visible that the Svensson curve is more flexible and provides a better fit to the South African yield curve than the Nelson-Siegel curve.

2.4.2
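As for the Nelson-Siegel parameterisation, the Svensson representation may be rewritten as:
$$y_t(\tau) = L_t + S_t \left( \frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau} \right) + C_t^1 \left( \frac{1 - e^{-\lambda_1\tau}}{\lambda_1\tau} - e^{-\lambda_1\tau} \right) + C_t^2 \left( \frac{1 - e^{-\lambda_2\tau}}{\lambda_2\tau} - e^{-\lambda_2\tau} \right),$$
where $L_t$, $S_t$, $C_t^1$ and $C_t^2$ are interpreted as level, slope, curvature 1 and curvature 2 respectively. The state-space system can be extended as:
$$(f_t - \mu) = A (f_{t-1} - \mu) + \eta_t; \qquad \text{and} \qquad y_t = \Lambda f_t + \varepsilon_t, \qquad f_t = (L_t, S_t, C_t^1, C_t^2)'.$$
The dimensions of $A$, μ, $\eta_t$ and $Q$ are increased as appropriate. Λ is changed so that its row for maturity $\tau_i$ is
$$\left( 1, \quad \frac{1 - e^{-\lambda_1\tau_i}}{\lambda_1\tau_i}, \quad \frac{1 - e^{-\lambda_1\tau_i}}{\lambda_1\tau_i} - e^{-\lambda_1\tau_i}, \quad \frac{1 - e^{-\lambda_2\tau_i}}{\lambda_2\tau_i} - e^{-\lambda_2\tau_i} \right).$$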
2.4.3
Tables 5 and 6 show the estimation results for the YO4F model. As before, bold entries denote parameter estimates significant at 5% and standard errors appear in parentheses. In Table 5 the estimate of the $A$ matrix indicates highly persistent dynamics of $L_t$, $S_t$, $C_t^1$ and $C_t^2$, the estimated own-lag coefficients being 0,981, 1,019, 0,809 and 0,992 respectively. Some cross-factor dynamics are statistically significant. As with the YO3F model (see Table 2), the level of persistence is higher in $S_t$ than in $L_t$ and $C_t^1$. The level of persistence is also higher in $C_t^2$ than in $L_t$ and $C_t^1$. The mean of the level is approximately 5,9 percent and is statistically significantly different from zero. The mean of the slope is 2,612 percent; the mean of the first curvature factor is -0,529 percent, which is not statistically significantly different from zero. The mean of the second curvature factor is 8,009 percent and is statistically significantly different from zero. The largest eigenvalue of the $A$ matrix is 0,986, which ensures the stationarity of the system. In Table 6 the estimates indicate an increase in the transitional shock volatility as we move from $L_t$ to $S_t$ to $C_t^1$ to $C_t^2$. The estimate for $\lambda_1$ is 0,088, which implies that the loading on the first curvature factor is maximised at a maturity of 20,38 months, and the estimate for $\lambda_2$ is 0,015, which implies that the loading on the second curvature factor is maximised at a maturity of 119,55 months. Again, in Figure 2 it can be seen that the first hump is at about 20 months and the second hump at about 120 months.

2.4.4
As shown in Table 4, the YO4F model improves on the means of the predicted errors, especially for the long maturities. Figures 8 and 9 plot the estimated smoothed level and slope factors against empirical proxies and macroeconomic factors. The curvature factors are omitted as there is no reliable macroeconomic link to them. Figure 8 plots the estimated level factor against the empirical proxy and the annual percentage change in the inflation index. There is a correlation of 0,67 between the estimated level factor and the empirical proxy. The correlation between the estimated level and the inflation rate is 0,28, which again suggests that inflation is linked to the dynamics of the yield curve. Figure 9 shows the estimated slope factor together with the empirical proxy and demeaned manufacturing capacity utilisation. There is a 0,84 correlation between the estimated slope factor and the empirical proxy, and a 0,30 correlation between the estimated slope factor and capacity utilisation. This also suggests a link between capacity utilisation and the dynamics of the yield curve.

MACROECONOMIC MODEL
In this section we relate the four unobserved factors (level, slope and the two curvature factors), which provide a good representation of the yield curve, to macroeconomic variables. This can be done by extending the state-space model of the previous section. We also present out-of-sample forecasting results to assess how well the YM4F model forecasts the dynamics of the yield curve.

3.1 YIELDS-MACRO MODEL
3.1.1
The following three macroeconomic variables are included:
- manufacturing capacity utilisation ($CU_t$), which represents the level of real economic activity relative to potential;
- the annual percentage change in the inflation index ($IF_t$), which represents the inflation rate; and
- the repo rate ($RR_t$), which represents the South African monetary-policy instrument.
According to Diebold, Rudebusch & Aruoba (op. cit.) these three macroeconomic variables are considered to be the minimum set of fundamentals needed to capture the basic macroeconomic dynamics (see also Rudebusch & Svensson, 1999 and Kozicki & Tinsley, op. cit.).

3.1.2
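We extend the YO4F model to the YM4F model to incorporate the three macroeconomic variables. This is done by adding the macroeconomic variables to the set of state variables. Writing $f_t = (L_t, S_t, C_t^1, C_t^2)'$ for the latent factors and $m_t = (CU_t, IF_t, RR_t)'$ for the macroeconomic variables, the state-space system is extended as follows:
$$\begin{pmatrix} f_t - \mu \\ m_t - \nu \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} f_{t-1} - \mu \\ m_{t-1} - \nu \end{pmatrix} + \begin{pmatrix} \eta_t \\ \gamma_t \end{pmatrix}, \qquad y_t = \Lambda f_t + \varepsilon_t,$$
$$\begin{pmatrix} \eta_t \\ \gamma_t \end{pmatrix} \sim WN \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} Q & K \\ K' & J \end{pmatrix} \right], \qquad \varepsilon_t \sim WN(0, H),$$
where $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$, μ, ν, $\eta_t$, $\gamma_t$, $Q$, $K$ and $J$ have appropriate dimensions. Λ stays unchanged. This is consistent with the view that only four factors are needed to distil the information in the yield curve (Diebold, Rudebusch & Aruoba, op. cit.). In the YM4F model the covariance matrix of the transition disturbances is assumed to be non-diagonal and $H$ is assumed to be diagonal.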

3.1.3
The extension of the Kalman-filter algorithm to include the macroeconomic variables is presented in Appendix B.
3.1.4
Tables 7 and 8 show the estimation results for the YM4F model. The estimate of the $A$ matrix again indicates highly persistent dynamics for $S_t$, $C_t^1$, $C_t^2$, $CU_t$ and $IF_t$. Some of the cross-factor dynamics are statistically significant for most factors. The estimates also indicate an increase in the transitional shock volatility as we move from $L_t$ to $S_t$ to $C_t^1$ to $C_t^2$, all being statistically significantly different from zero, and a decrease in the transitional shock volatility as we move from $CU_t$ to $IF_t$ to $RR_t$, all being statistically significantly different from zero. There are small changes in the means of the slope and the two curvature factors in comparison with the yields-only model. The largest eigenvalue of the $A$ matrix is 0,98, which ensures the stationarity of the system. None of the covariance terms in the $Q$ matrix is significantly different from zero.
3.1.5
As shown in Table 4, the YM4F model reduces the means of the predicted errors only slightly, if at all, but it does reduce their standard deviations, indicating a better fit. The means and standard deviations of the predicted errors for the YM3F model are also shown; again the YM4F model fits the yield curve better than the three-factor yields-macro model. The estimates for the level, slope and two curvature factors of the YM4F model are very similar to those of the YO4F model.

3.1.6
Figure 10 shows the mean predicted error and the associated upper and lower 95% confidence bands for the YO4F and the YM4F models. Here we can clearly see that there is little difference in the means of the two models. As mentioned before, however, there is less variance in the yields-macro model, especially at longer maturities, indicating a better fit. The analysis below uses the YM4F model.
3.2.1
For scenario generation it is important not only to capture the dynamics of the yield curve well in-sample, but also to forecast the dynamics of the yield curve well out-of-sample. For this reason the YM4F model is estimated on truncated or curtailed datasets. Using the estimated parameters, the yield curve is forecast repeatedly for one, two, three and four years ahead over the period from February 2004 to February 2009, using monthly intervals. For the purpose of asset and liability management it would be important to use longer periods for out-of-sample testing, but the lack of data for model fitting restricts this period. Diebold & Li (op. cit.) model and forecast the Nelson-Siegel factors as univariate AR(1) processes for one month, six months and twelve months ahead. The model proposed by Diebold & Li (op. cit.) outperforms other models for yield-curve forecasting at all maturities. Thus the Svensson factors are modelled and forecast as univariate AR(1) processes in order to compare their model against the YM4F model in this paper.
3.2.2
Tables 9 to 12 show the out-of-sample forecasting results for maturities 3, 12, 36, 60, 120, 180 and 288 months. The forecast errors at time $t + h$ are defined to be: $y_{t+h}(\tau) - \hat{y}_{t+h|t}(\tau)$; where $t$ is the time of parameter estimation and $h$ the length of the period forecast. The mean and standard deviation of the forecast errors are reported. The YM4F model outperforms the AR(1) model. The standard deviations for the AR(1) model are also larger than those of the YM4F model. In particular, the four-year-ahead forecast of the YM4F model is better than that of the AR(1) model.

3.2.3
In practice most financial institutions have views on the macroeconomy. These views are produced by means of an ESG or expert opinion. These ESGs produce forecasts only for macroeconomic variables, for example the repo rate, and not a complete yield curve. By using the Kalman filter to model the yield curve bidirectionally, as mentioned in the introduction, it is possible to close this loop and to produce a full yield curve given a set of macroeconomic forecasts. This is done by including the macroeconomic forecasts produced by such an ESG in the forecasting of the yield curve rather than the forecast macroeconomic variables of the model. Either all three macroeconomic variables or only a selection of them can be replaced. Because of the lack of real ESG forecasts for the repo rate, the actual repo rate is included. Tables 9 to 12 also show out-of-sample forecasting results where the actual repo rate was included in the forecasting instead of the forecast repo rate from the model. As can be seen, the forecasting error reduces, especially for the longer maturities, in comparison with the other models. Thus, by including these forecasts, a better yield-curve forecast can be made.
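The idea of replacing a model forecast by an external view can be sketched in Python as follows. This is a simplified illustration and not the authors' procedure: it iterates a VAR(1) forecast of the state vector and, at each step, overwrites the repo-rate element with the externally supplied value before the next step is taken. The function name `forecast_with_view`, the position of the repo rate in the state vector and the numerical values are all assumptions.

```python
import numpy as np

def forecast_with_view(state, mean, A, repo_path, repo_index=-1):
    """Iterate s_t = mean + A (s_{t-1} - mean), replacing the repo-rate
    element of the forecast with the external view at each step.

    state      : current state vector (latent factors followed by macro variables)
    mean       : unconditional mean of the state vector
    A          : VAR(1) transition matrix
    repo_path  : sequence of externally supplied repo-rate values
    repo_index : position of the repo rate in the state vector (assumed last)
    """
    s = np.asarray(state, dtype=float).copy()
    path = []
    for repo in repo_path:
        s = mean + A @ (s - mean)   # model forecast one step ahead
        s[repo_index] = repo        # impose the external repo-rate view
        path.append(s.copy())
    return np.array(path)

# Toy two-variable example: one yield factor and the repo rate.
A_toy = np.array([[0.5, 0.5], [0.0, 1.0]])
toy_path = forecast_with_view([0.0, 1.0], np.zeros(2), A_toy, [2.0, 2.0])
```

Because the overwritten repo rate feeds into the next step through the transition matrix, the imposed view influences the forecast yield-curve factors, which is the sense in which the macroeconomic forecast closes the loop.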

3.2.4
Figure 11 shows the quantile-quantile plots for maturities 3, 60, 120 and 228 months. The quantiles of the empirical distribution are set against the quantiles obtained by averaging over a set of scenarios generated by the YM4F model. The 5th and 95th percentiles are also plotted. The YM4F model reproduces the empirical distribution in the medium-term rates better than in the short and long rates. In Figure 12 the quantile-quantile plots for maturities 3, 60, 120 and 228 months are shown, but here the scenarios were generated by sampling from the residuals instead of using normal errors. As can be seen, little improvement is gained by sampling from the residuals as opposed to using normal errors.

SCENARIO GENERATION
In this section the scenario-generation algorithm that we use to generate yield-curve scenario trees for fixed-income portfolio optimisation problems is described. The YM4F model is used to generate yield-curve scenarios. The existence of arbitrage in the scenario trees is discussed and a method of eliminating arbitrage opportunities is proposed. By means of back-testing it is also demonstrated that the scenarios are stable.

4.1 YIELD-CURVE SCENARIO GENERATION
4.1.1
This section starts with a description of a procedure based on the parallel simulation and randomised clustering approach proposed by Gülpinar et al. (2004) to generate a scenario tree, which is the input for financial optimisation problems. The basic data structure is the scenario-tree node, which contains a cluster of yield-curve scenarios. One of these is designated the 'centroid' or 'representative'. The final tree consists of the centroid of each node, and its branch probabilities. Gülpinar et al. (op. cit.) introduced a randomised clustering algorithm. This differs from the approach proposed by Dupacova et al. (2000), which determines clusters that are optimal by some measure. The approach here is to assign the scenarios to equal groups in preference to a clustering approach, as the latter may necessitate a very large number of scenarios to be generated at the root node to ensure sufficient scenarios at the leaf nodes.

4.1.2
The scenario tree here is a yield-curve scenario tree. A T-period scenario tree structure is represented as a 'tree string', which is a string of integers specifying for each stage s = 1,2,…,T the number of branches (or branching factor) for each node in that stage (see Dempster et al., 2006). This gives rise to balanced scenario trees, in which each sub-tree in the same period has the same number of branches. Let $k_s$ denote the branching factor for stage s. Figure 13 illustrates a scenario tree with a (3, 2) tree string, i.e. $k_1$ = 3 and $k_2$ = 2.

4.1.3
The generated trees are non-recombining. Given that four latent factors and three macroeconomic factors are used in the yield-curve model, it is very difficult to construct a recombining tree. Even with only three latent factors it is notoriously difficult to construct a recombining tree. Furthermore, non-recombining trees are used in fixed-income portfolio optimisation problems as the portfolio composition is path-dependent.
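As a small illustration of the tree-string notation of 4.1.2, the following Python sketch counts the nodes at each stage of a balanced, non-recombining tree. The function name `nodes_per_stage` is hypothetical and is introduced here only for illustration.

```python
def nodes_per_stage(tree_string):
    """Return the number of nodes at stages 0..T of a balanced,
    non-recombining tree with branching factors k_1, ..., k_T."""
    counts = [1]  # stage 0: the root node
    for k in tree_string:
        # every node in the previous stage gets k children
        counts.append(counts[-1] * k)
    return counts


# The (3, 2) tree string of Figure 13: one root, three nodes at
# stage 1 and six leaf nodes at stage 2.
counts = nodes_per_stage((3, 2))
```

Because the tree does not recombine, the number of leaf nodes is the product of the branching factors, which is why the tree strings used in practice are short.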

4.1.4
Figure 14 illustrates the two methods of scenario simulation, namely parallel and sequential. The parallel method is used here as it produces more realistic extreme events in the scenario tree. The reason for this is that, with the number of simulations growing smaller down the tree in the parallel method, the centroids that eventually represent the scenario groups are drawn from a smaller sample size. In the sequential method, at every stage the simulated scenarios in all of the clusters are discarded, and the next simulation is restarted from the centroid, which prevents any extreme variation (Gülpinar et al., op. cit.).

4.1.5
In order to group the scenarios a measure of relative position is used. The 'distance' D between the discount factors of a yield curve and those of the average yield curve is calculated as: […] where y(τ) is the yield to maturity τ and $y_M(\tau)$ the average yield to maturity τ. Note that the relative distance D can be negative or positive, which means that a yield curve can be positioned to the left or to the right of the average yield curve. This is to ensure realistic extreme events. Chueh (2002) discusses several other distance methods for interest-rate sampling. The relative distance method used here relates closely to the relative present value distance method in that paper.
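To make the idea of a signed relative distance concrete, a minimal Python sketch follows. The functional form used here, a sum over maturities of the differences between continuously compounded discount factors, is an assumption made purely for illustration and should not be read as the formula of this paper; the names `discount_factor` and `relative_distance` and the example curves are likewise hypothetical.

```python
from math import exp


def discount_factor(yield_rate, maturity_years):
    """Continuously compounded discount factor (assumed convention)."""
    return exp(-yield_rate * maturity_years)


def relative_distance(curve, average_curve, maturities):
    """Signed distance between a yield curve and the average curve.

    ASSUMPTION: the distance is the sum, over maturities, of the
    difference between the two sets of discount factors. The sign is
    kept so that curves on either side of the average are separated.
    """
    return sum(
        discount_factor(y, t) - discount_factor(y_m, t)
        for y, y_m, t in zip(curve, average_curve, maturities)
    )


maturities = [1.0, 5.0, 10.0]          # years (illustrative)
average = [0.070, 0.080, 0.085]        # average yield curve
higher = [0.075, 0.085, 0.090]         # curve lying above the average
lower = [0.065, 0.075, 0.080]          # curve lying below the average

d_higher = relative_distance(higher, average, maturities)
d_lower = relative_distance(lower, average, maturities)
```

Under this assumed form a curve with higher yields has smaller discount factors and therefore a negative distance, while a curve with lower yields has a positive distance, so sorting by the distance orders the scenarios from one side of the average to the other.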

4.1.6
It is necessary to represent each group of scenarios with a single point, which becomes the datum in the scenario tree. Gülpinar et al. (op. cit.) argue that, to prevent the scenario tree from containing scenarios that are not consistent with the simulation parameters, the centroid should not be taken to be the centre of the group, but rather the simulated scenario closest to the centre. The mean of the group is used here as the notion of the centre; other notions of the centre that can be used are the median and the mode.

4.1.7
The main steps of the algorithm used are as follows:
Step 1: At s = 0 create a root node group containing N scenarios. Generate all the scenarios using Monte Carlo simulation and the YM4F model. Each scenario is equally likely and consists of T + 1 sequential yield curves with the same starting point, the current yield curve (in total (T + 1) × N yield curves are generated).
Step 2: Set s = s + 1 and, for each group in the previous stage, calculate the average scenario and calculate the distance (i.e. the relative position as defined above) of each scenario with respect to the average.
Step 3: For each group, sort the scenarios in descending order by the distance and group them into $k_s$ equal-sized groups.
Step 4: For each new group, find the scenario closest (in absolute value) to the average of the group, and designate it as the centroid. To each centroid assign the probability: […]
Step 5: If s < T, go to step 2; otherwise stop.

4.2.1
Filipović (1999) and other researchers such as Diebold, Rudebusch & Aruoba (op. cit.) show that the Nelson-Siegel family of yield-curve models does not impose absence of arbitrage, although these models estimate and forecast the yield curve better than arbitrage-free models. (Duffee (op. cit.) noted that canonical affine arbitrage-free models demonstrate poor out-of-sample performance.) In light of this, the scenarios generated are not arbitrage-free. Klaassen (2002) shows that arbitrage opportunities can either be detected ex post by checking for solutions to a set of linear constraints, or be excluded by including non-linear constraints in the scenario generation process. Christensen et al. (unpublished a, unpublished b) derive a class of arbitrage-free affine dynamic yield-curve models that approximate the Nelson-Siegel yield-curve specification. They extend these models to include the Svensson extension of the Nelson-Siegel yield curves.
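Returning to the tree-construction steps 1 to 5 of 4.1.7, a minimal Python sketch of the parallel grouping procedure is given below. It rests on several assumptions that are not taken from this paper: a toy random-walk generator (`toy_scenarios`) stands in for the YM4F model, a simple sum of yield differences (`signed_distance`) stands in for the relative distance, the 'average of the group' in step 4 is read as the mean distance within the new group, and each branch is given the conditional probability 1/$k_s$ implied by equal-sized groups. All function names are hypothetical.

```python
import random


def toy_scenarios(n, horizon, start_curve, seed=0):
    """ASSUMPTION: parallel-shift random walk, a stand-in for YM4F."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n):
        path = [list(start_curve)]          # stage 0: current curve
        for _ in range(horizon):
            shock = rng.gauss(0.0, 0.005)
            path.append([y + shock for y in path[-1]])
        paths.append(path)
    return paths


def mean_curve(curves):
    n = len(curves)
    return [sum(c[i] for c in curves) / n for i in range(len(curves[0]))]


def signed_distance(curve, average):
    """ASSUMPTION: placeholder for the paper's relative distance."""
    return sum(y - a for y, a in zip(curve, average))


def build_tree(scenarios, tree_string):
    """scenarios[i][s] is the yield curve of scenario i at stage s.
    Returns one list of nodes per stage s = 1..T. N is assumed to be
    divisible by the product of the branching factors."""
    groups = [list(range(len(scenarios)))]   # step 1: root group
    tree = []
    for s, k in enumerate(tree_string, start=1):      # step 2
        next_groups, stage_nodes = [], []
        for parent, group in enumerate(groups):
            avg = mean_curve([scenarios[i][s] for i in group])
            dist = {i: signed_distance(scenarios[i][s], avg) for i in group}
            ordered = sorted(group, key=dist.get, reverse=True)  # step 3
            size = len(ordered) // k
            for j in range(k):
                sub = ordered[j * size:(j + 1) * size]
                centre = sum(dist[i] for i in sub) / len(sub)
                # step 4: simulated scenario closest to the group centre
                centroid = min(sub, key=lambda i: abs(dist[i] - centre))
                stage_nodes.append({
                    "parent": parent,
                    "centroid": scenarios[centroid][s],
                    "probability": 1.0 / k,   # conditional on the parent
                })
                next_groups.append(sub)
        tree.append(stage_nodes)
        groups = next_groups                  # step 5: next stage
    return tree


tree = build_tree(toy_scenarios(600, 2, [0.07, 0.08, 0.085]), (3, 2))
```

Because the scenarios are simulated once for the whole horizon and only regrouped at each stage, the sketch follows the parallel rather than the sequential method: the sub-groups shrink down the tree and no scenario is re-simulated from a centroid.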

4.2.2
In this article the authors propose a method of reducing the presence of arbitrage ex post, without extending their models to the class of arbitrage-free models. They reduce the presence of arbitrage ex post, as opposed to excluding it by means of including non-linear constraints during the scenario generation process. This approach has no additional effect on the computational difficulty of the model estimation process or on the data requirements. As the scenario generation process is a discrete approximation of the continuous evolution of the yield curve, the extension of the models used in the simulation process to a class of arbitrage-free models would not ensure the exclusion of arbitrage in the generated scenarios.

4.2.3
Klaassen (op. cit.) proposes linear constraints for two types of arbitrage. Ingersoll (1987) distinguishes these two types of arbitrage. The first type is an opportunity to construct a zero-investment portfolio that has non-negative payoffs in all states of the world, and a strictly positive payoff in at least one state. The second type is an opportunity to construct a negative-investment portfolio (i.e. providing an immediate positive cash flow) that generates a non-negative payoff in all future states of the world.

4.2.4
Following the notation of Klaassen (op. cit.), let $r^{n}_{k,t+1}$ be the return on asset class k (k = 1,…,K) between times t and t + 1 if state n (n = 1,…,N) of the world materialises at time t + 1. Klaassen (op. cit.) mentions a useful result, that if the set of equations […] has a strictly positive solution $v_n$ for all n, then no arbitrage opportunities of the first or second type exist (see also Ingersoll, op. cit.). Taking $r^{n}_{\tau,t+1}$ to be the yield to maturity k = τ, then: […]. Thus, if the set of equations […] for all maturities τ has a strictly positive solution $v_n$ for all n, then no arbitrage opportunities of the first or second type exist in these yield-curve scenarios.
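The ex-post check described in 4.2.4 can be illustrated with a small Python sketch. It assumes the standard state-price form of the condition, namely that weights $v_n$ > 0 exist with the sum over n of $v_n$(1 + $r^{n}_{k,t+1}$) equal to 1 for every asset k; this form is an assumption here and is not quoted from this paper. To stay self-contained the sketch covers only the square case K = N, where the weights follow from a linear solve; the general case would require a linear-programming feasibility check. The function names and the two-asset examples are hypothetical.

```python
def solve_square(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def state_prices(returns):
    """returns[k][n] is the return on asset k in state n (K == N).
    ASSUMPTION: solves sum_n v_n * (1 + r[k][n]) = 1 for every k."""
    gross = [[1.0 + r for r in row] for row in returns]
    return solve_square(gross, [1.0] * len(gross))


def is_arbitrage_free(returns):
    """True when every weight v_n is strictly positive."""
    return all(v > 0.0 for v in state_prices(returns))


# Riskless 5% asset plus a risky asset returning +20% or -10%.
no_arbitrage = [[0.05, 0.05], [0.20, -0.10]]
# The second asset now beats the riskless asset in both states.
arbitrage = [[0.05, 0.05], [0.20, 0.10]]

v = state_prices(no_arbitrage)
```

In the first example both weights are positive, so neither type of arbitrage is present; in the second the dominating asset forces a negative weight, which signals an arbitrage opportunity.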

4.2.5
The class of arbitrage-free affine dynamic yield-curve models that Christensen et al. (unpublished a, unpublished b) derive for the Nelson-Siegel family of yield curves differs from the Diebold, Rudebusch & Aruoba (op. cit.) models only in the inclusion of an additional yield-adjustment term, which depends only on the maturity of the zero-coupon bond. As this term is dependent on the maturity of the bond, it can be seen as a shift in the slope of the yield curve. Now let […] for all n (n = 1,…,N). Then, if we can find yield-curve shifts $c_{t+1}(\tau)$ such that […] for all maturities τ, no arbitrage opportunities exist in the yield-curve scenarios. Thus, if, for every maturity τ, the mean of the calculated values of a zero-coupon bond with that maturity, using different scenarios, equals the price, then no arbitrage opportunity exists in the yield-curve scenarios. (This is consistent with the no-arbitrage literature.)

4.2.6
Given the small size of the branching factors of the scenario trees generated, it may not be possible to find realistic solutions for the yield-curve shifts $c_{t+1}(\tau)$. Thus, to eliminate most of the arbitrage opportunities in the scenario trees the following algorithm is proposed:
Step 1: At the root node create a group of N scenarios. Generate all the scenarios using Monte Carlo simulation and the YM4F model (as for the scenario tree). Each scenario is equally likely and consists of T sequential yield curves.
Step 2: At each branching time of the scenario tree calculate the average of the N generated scenarios (at the root node the current yield curve is used).
Step 3: Then, for each average yield curve and the corresponding one-period-ahead scenarios, solve […] for all maturities, to obtain the yield-curve shifts $c_{t+1}(\tau)$.
Step 4: Add the amount $c_{t+1}(\tau)$ to the original scenario-tree yield curves.

4.2.7
The described method removes most of the arbitrage opportunities in the scenario tree, with a few opportunities left in sub-trees. For scenario trees with a short horizon all opportunities may be removed. This reduction of arbitrage opportunities is considered sufficient, since portfolio constraints in optimisation problems, such as the restriction of short-selling and the inclusion of bid-ask spreads, will eliminate the remaining arbitrage opportunities.
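The shift computation in steps 3 and 4 of 4.2.6 can be illustrated with a deliberately simplified Python sketch. It implements only the verbal condition of 4.2.5, that the mean scenario value of a zero-coupon bond of a given maturity should equal its price, under assumptions that are not taken from this paper: continuous compounding, equal scenario weights, and no allowance for one-period discounting or the shortening of maturity over the period. The function name `yield_shift` and the numerical inputs are hypothetical.

```python
from math import exp, log


def yield_shift(scenario_yields, target_price, maturity):
    """Shift c such that the mean of exp(-maturity * (y_n + c)) over
    the scenarios equals target_price.

    ASSUMPTION: simplified mean-pricing condition; see the lead-in.
    """
    mean_price = sum(exp(-maturity * y) for y in scenario_yields)
    mean_price /= len(scenario_yields)
    # mean_price * exp(-maturity * c) = target_price, solved for c
    return log(mean_price / target_price) / maturity


maturity = 5.0                         # years (illustrative)
scenario_yields = [0.07, 0.08, 0.10]   # one-period-ahead scenarios
target_price = exp(-maturity * 0.08)   # price implied by the average curve

c = yield_shift(scenario_yields, target_price, maturity)
shifted = [y + c for y in scenario_yields]              # step 4
shifted_mean_price = sum(exp(-maturity * y) for y in shifted) / len(shifted)
```

After the shift the mean scenario price reproduces the target price exactly for this maturity; repeating the calculation maturity by maturity yields a shift that depends only on τ, in the spirit of the yield-adjustment term discussed in 4.2.5.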

4.3.1
To test their scenario generation method, the authors implemented the multi-stage stochastic optimisation problem described by Dempster et al. (op. cit.). In that paper an asset- and liability-management framework is proposed and numerical results are given for a simple example of a closed-end guaranteed fund where no contributions are allowed after the initial cash outlay. The design of investment products with a guaranteed minimum rate of return, focusing on the liability side of the product, is demonstrated in that paper. A detailed discussion of the asset- and liability-management framework is beyond the scope of this paper. The multi-period stochastic programming model of Dempster et al. (op. cit.) is included in Appendix C. In this paper the authors' scenario generation approach is used to generate the input scenarios for the optimisation problem. The YM4F model is fitted to market data up to an initial decision time t and scenario trees are generated from time t to some chosen horizon t + T. The optimal first-stage or root-node decision is then implemented at time t and the success of the portfolio implementation is measured by its performance with historical data up to time t + 1. This whole procedure is rolled forward for T trading times. At each decision time t, the parameters of the YM4F model are re-estimated using the historical data up to and including that time.

4.3.2
Back-tests were performed over a period of five years, from February 2004 to February 2009, and different tree structures were used with approximately the same number of scenarios. The tree structures are defined in Table 13. Bonds with 5-, 7-, 10-, 15- and 19-year maturities as well as the FTSE-JSE Top-40 index are included in the portfolio. (Dempster et al. (op. cit.) include bonds with different maturities and an equity index.) In order to generate scenarios for the Top-40 index, the index is modelled using a simple linear regression model incorporating the three macroeconomic variables. The expected average shortfall is minimised for an annual guarantee of 9% and transaction costs are included. Figure 15 illustrates the back-testing portfolio values and the minimum guarantee for all three scenario sets. The results are consistent with those in Dempster et al. (op. cit.). In that paper the expected average shortfall is minimised and the expected terminal wealth of the portfolio is maximised, optimality being achieved with reference to a risk-aversion parameter. The model performs well, staying above the guarantee at all times, even though transaction costs are included, which puts downward pressure on the portfolio wealth.

4.3.5
The scenario generation is further tested by again solving the model for 100 different scenario sets and for different numbers of final nodes: 120, 500, 1000 and 2000. Dempster et al. (op. cit.) minimise the expected average shortfall and maximise the expected terminal wealth of the portfolio, and weigh the two objectives against each other using a risk-aversion parameter ('alpha'). For each scenario set the model is solved with the risk-aversion parameter ranging from 0 to 1 in steps of 0,1 (1 being the most risk-averse). Table 15 shows the mean and standard deviation for each number of final nodes. Figure 16 shows the mean frontier, derived by averaging the objective function obtained over the 100 different scenario sets, and the confidence bands covering 95% of the results. (Kaut (2007) and Consiglio & Staino (unpublished) display similar results for scenario and model stability testing.) The frontier is a decreasing function of the risk-aversion parameter alpha. If the value of alpha is closer to 1, more importance is given to the shortfall of the portfolio and less to the expected wealth, and hence a more risk-averse portfolio allocation strategy will be taken, and vice versa. In the extreme case where alpha is 1, only the shortfall is minimised and the expected wealth is ignored; where alpha is 0, the unconstrained case only maximises the wealth. For 1000 final nodes the 95% region, at its maximum (alpha at 0), is 4,9% wide (a reduction of 2% from 500 final nodes), indicating that the randomisation error is adequately bounded. From Table 15 it may also be observed that the standard deviation reduces as the number of final nodes increases. The reduction is small or absent when the number of final nodes is increased from 1000 to 2000, again indicating that the randomisation error is adequately bounded and that stability is achieved. Although back-testing assumes that the past describes the future and can by no means guarantee the success of the outcomes of these models in practice, it provides a way to assess the algorithm proposed. Through back-testing it may be seen that the proposed scenario generation algorithm performs well on a portfolio optimisation problem in the literature; results similar to those of Dempster et al. (op. cit.) are obtained. It may also be seen that stability in the objective is obtained by increasing the number of scenarios. The final number of scenarios necessary to achieve this stability may depend on the optimisation problem in question.
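The stability statistics described in 4.3.5 (mean, standard deviation and a band covering 95% of the results over repeated scenario sets) can be computed along the following lines. This is a minimal sketch using only the Python standard library; the linear-interpolation percentile rule and the function names are assumptions, and the five input values are toy numbers chosen to make the arithmetic easy to follow, not results from this paper.

```python
from statistics import mean, stdev


def percentile(sorted_values, q):
    """Percentile by linear interpolation, with q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def stability_summary(objective_values):
    """Summarise the objective values obtained for one value of alpha
    over repeated, independently generated scenario sets."""
    ordered = sorted(objective_values)
    return {
        "mean": mean(ordered),
        "std": stdev(ordered),
        "band": (percentile(ordered, 0.025), percentile(ordered, 0.975)),
    }


# Toy objective values (illustrative only).
summary = stability_summary([3.0, 1.0, 5.0, 2.0, 4.0])
band_width = summary["band"][1] - summary["band"][0]
```

Repeating the summary for each number of final nodes and each value of alpha gives the quantities compared in the text: a band width and a standard deviation that stop shrinking once enough scenarios are used.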

CONCLUSION 5.1
In this paper the estimation and characterisation of the South African yield curve with respect to macroeconomic variables is considered, as well as its use in scenario generation for fixed-income portfolio optimisation. A yield-curve model that incorporates four yield-curve factors (level, slope and two curvature factors) and three macroeconomic variables (real activity, inflation and the stance of monetary policy) has been estimated. The estimated model fits the yield curve reasonably well in-sample, as shown in the results. The model was tested on out-of-sample forecasting for horizons up to four years. For the purpose of asset and liability management it would be of importance to use longer periods for out-of-sample testing, but the lack of data for model fitting restricts this period. The model also performs reasonably well in out-of-sample forecasting. It has been shown that better performance can be realised by including forecasts for the macroeconomic variables generated by an ESG. For lack of such forecasts the actual repo rate was used.

5.2
A parallel simulation approach for yield-curve scenario tree generation has also been proposed. The procedure was tested and the performance was measured by out-of-sample back-testing in terms of the value of a fixed-income portfolio-optimisation problem described in the literature. Although back-testing assumes that the past describes the future and can by no means guarantee the success of the outcomes of these models in practice, it provides a way to assess the algorithm proposed. Through back-testing it has been shown that the proposed scenario generation algorithm performs well on a portfolio-optimisation problem in the literature. It has also been shown that stability is obtained by increasing the number of scenarios. The final number of scenarios necessary to achieve this stability may depend on the optimisation problem in question.

5.3
The existence of arbitrage in the scenario trees has been discussed and a method of eliminating arbitrage opportunities ex post has been proposed. Consideration may be given to other methods of eliminating arbitrage opportunities either during simulation or ex post. This is left to future research.

[…] $\Sigma$, $F_t$ are the sequences of conditional means and covariance matrices. These quantities can be obtained by employing the Kalman filter for a given set of parameters ψ.
Step 2: […]; and […]
Step 3: $y_t$ has been observed. Compute: […]
Step 4: If t < T, set t = t + 1, and go to step 2; otherwise, stop.

A.4
Hence the Kalman filter delivers the sequence of means and covariance matrices for the conditional distributions of interest for a given set of parameters ψ. The Kalman filter is initialised by setting $f_0$ and $\Sigma_0$ to the unconditional mean and unconditional covariance matrix of the state vector respectively. Under the normality assumption, the distribution of $y_t$ conditional on $Y_{t-1}$ is the N-dimensional normal distribution with mean $\hat{y}_{t|t-1}$ and covariance matrix $F_t$. The conditional density of $y_t$ given $Y_{t-1}$ and ψ can be written as (see Lemke, op. cit.):

$$p(y_t \mid Y_{t-1}; \psi) = (2\pi)^{-N/2}\,|F_t|^{-1/2} \exp\left(-\tfrac{1}{2}\, v_t' F_t^{-1} v_t\right).$$

Accordingly, the log-likelihood function becomes

$$\ln L(\psi) = -\frac{NT}{2}\ln(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\ln|F_t| - \frac{1}{2}\sum_{t=1}^{T} v_t' F_t^{-1} v_t,$$

where $v_t = y_t - \hat{y}_{t|t-1}$ is the vector of prediction errors.

A.5
For a given set of parameters ψ, the Kalman filter is used to compute the prediction errors v t and their covariance matrix F t , after which the log-likelihood function is computed.
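The computation described in A.5 can be illustrated with a scalar Python sketch of the prediction-error decomposition. The model used here, a single stationary AR(1) latent factor observed with noise, is an assumption chosen only to keep the example short; it is far simpler than the models estimated in this paper, and the parameter names `phi`, `q` and `r` are hypothetical.

```python
from math import log, pi


def kalman_loglik(ys, phi, q, r):
    """Gaussian log-likelihood of the observations ys for the model
        f_t = phi * f_{t-1} + eta_t,   eta_t ~ N(0, q)
        y_t = f_t + eps_t,             eps_t ~ N(0, r)
    ASSUMPTION: scalar model with |phi| < 1, for illustration only.
    """
    f = 0.0                        # unconditional mean of the state
    p = q / (1.0 - phi * phi)      # unconditional variance of the state
    loglik = 0.0
    for y in ys:
        # prediction step
        f_pred = phi * f
        p_pred = phi * phi * p + q
        v = y - f_pred             # prediction error v_t
        big_f = p_pred + r         # prediction-error variance F_t
        loglik += -0.5 * (log(2.0 * pi) + log(big_f) + v * v / big_f)
        # updating step, once y_t has been observed
        gain = p_pred / big_f
        f = f_pred + gain * v
        p = (1.0 - gain) * p_pred
    return loglik


observations = [0.3, -0.1, 0.4, 0.2]   # toy data
ll = kalman_loglik(observations, phi=0.5, q=0.75, r=1.0)
```

In an estimation routine this function would be evaluated repeatedly by a numerical optimiser searching over the parameter set ψ; only the likelihood evaluation is sketched here.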

A.6
The Kalman-filter fixed-interval smoothing algorithm is used to obtain optimal extractions of the latent level, slope and curvature factors. The algorithm consists of a set of recursions, which start with the final quantities given by the Kalman filter and work backwards (Harvey, op. cit.).

[…]; and […], where $G_t$ = {$y_1$,…,$y_t$, $x_1$,…,$x_t$} is taken to be the sequence of observations available for estimation. These quantities can be obtained by employing the Kalman filter for a given set of parameters ψ.
Step 2: […]
Step 3: $y_t$ and $x_t$ have been observed. Compute: […]; and […]
Step 4: If t < T, set t = t + 1, and go to step 2; otherwise, stop.

Figure 2: Median yield curve with the point-wise interquartile range

Figure 3: YO3F model: estimates of the level, slope and curvature factors

Figure 7: Nelson-Siegel fit versus Svensson fit of the yield curve

Figure 13:

Figure 14: Two methods of simulating scenarios

Figure 15: Scenario back-test results

Table 1: Descriptive statistics of the yield curve

Table 4: Summary of statistics for predicted errors of yields (%)

Table 9: One-year out-of-sample forecast results

Table 10: Two-year out-of-sample forecast results

Table 11: Three-year out-of-sample forecast results

Table 12: Four-year out-of-sample forecasting results

Table 13: Tree structure for different back-tests

Table 14: Macroeconomic portfolio allocation: stability statistics

Table 15: Macroeconomic efficient frontier: stability statistics

Stander, Y (unpublished). Bond Indices in South Africa. Unpublished dissertation for M.Sc. at the University of the Witwatersrand, Johannesburg, 2000
Svensson, LEO (1994). Estimating and interpreting forward rates: Sweden 1992-4. CEPR Discussion Paper Series No. 1051, October
Taylor, JB (ed.) (1999). Monetary Policy Rules. University of Chicago Press, Chicago
Tilley, JA (1992). An actuarial layman's guide to building stochastic interest rate generators. Transactions of the Society of Actuaries 34, 421-59
Wu, T (unpublished). Monetary policy and the slope factors in empirical term structure estimations.
The equations are … are given values, but $y_t$ and $x_t$ have not been observed yet.