Can the modernisation of a public employment service be an effective labour market intervention? The Hungarian experience, 2004-2008

The Public Employment Service (PES) delivers much of employment policy, including active labour market programmes, in many EU member states, yet we know little about its effectiveness in general. This paper provides a quantitative assessment of the potential impact of the modernisation programme of the Hungarian Public Employment Service between 2004 and 2008. Using data at the level of local offices, I calculate programme effects using a difference-in-differences estimator. Results show that the programme has increased re-employment rates significantly, by 6%. The modernisation was thus a moderately effective but relatively inexpensive intervention, similar in terms of cost-effectiveness to the better active labour market programmes in Hungary.


Introduction
The Public Employment Service (PES) is an important player in the labour market of almost all European countries, yet we know very little about the effectiveness of its operation and development. Within the matching theories of Blanchard and Diamond (1989) and Pissarides (2000), the aim of a PES is to facilitate the operation of the matching technology. This can involve activities such as administering payments to clients, negotiating and supervising job-search agreements, counselling, observing compliance and the administration of labour market programmes. The PES has always been an important building block of the European Employment Strategy and forms part of the Europe 2020 Strategy (European Commission, 2010). The PES is a central institution in the so-called flexicurity framework and is assigned the role of supporting the transition between labour market states (Wilthagen, 2008). Although most member states spent more than 0.1 per cent of their GDP on the PES and administration in 2010 (OECD data, LMPEXP), there is relatively scarce evidence on the effect of PES services on outcomes for the unemployed. This paper addresses the question of improving effectiveness by looking at the modernisation of a PES on one of its main outcomes, opportunities for re-employment.
The evaluation of active labour market programmes is recognised today as very important and the field has accumulated an impressive body of evidence. Most of these studies nevertheless address programme effectiveness in isolation from the operations of the institution that delivers the programmes. One exception is "services and sanctions", an intervention delivered by and integrated with the PES and shown by Kluve (2010) to be one of the most successful labour market programmes. The situation is no different in Hungary, a country where the employment rate is among the lowest in the EU. With a budget of around HUF 20 billion (around EUR 70 million) per year, the Hungarian PES serves 450,000 to 600,000 registered jobseekers, which is about 11 to 15 per cent of the active population. The institution absorbed more than HUF 10 billion in the 2000s in order to modernise its operation, but no quantitative assessment of this process has been made so far.
The contribution of this paper to the scarce literature on PES effectiveness and also to the better developed literature on the evaluation of labour market programmes is delivering evidence using econometric techniques on the potential effectiveness gains through the modernisation of a PES. The questions I would like to answer are whether and to what extent the modernisation project contributed to increasing the chances of its clients in finding a job. In what follows, I use a difference-in-difference (DiD) econometric estimation method to estimate the possible effect of the developments between 2004 and 2008, using aggregate data relating to the local PES offices. First I briefly describe the development programme itself. Then I move onto the theoretical and methodological considerations and the data used for the analysis. Thirdly I describe the estimation results. Finally, I put them in context and present some conclusions.

Evidence on PES effectiveness
Expenditure on labour market policy and the PES is relatively high in the EU, but also varies considerably, from 3.48 and 0.51 per cent of the GDP respectively in Belgium to 1 and 0.1 per cent respectively in new member states (OECD data, LMPEXP). Despite such differences in spending, the need for a public employment service was mostly supported since the post-second world war period (Baldwin 1951) to today (OECD, 2006), and reinforced after the onset of the economic crisis (ILO, 2009).
Critiques of the PES often refer to the ineffectiveness of the institution, and such doubts are reflected in both spending on and organisation of the PES. These doubts have their roots in theory. Zweifel and Zaborowski (1996) ask whether public or private employment services are better. Their theoretical analysis suggests that private agencies might fare better, largely because the public agency is not allowed to be selective in its user base and is not as concerned about placements as private agencies are. Similar analyses also suggest that the structure of the market for employment-related institutions matters for effectiveness (Campens and Tanguy, 2006): in the presence of privately owned employment offices, the otherwise positive performance of public ones can decrease.
In practice, the best arrangement for employment services is sought in various ways. Denmark has reformed its entire PES by privatising a large part of it (Koning 2004). Although it does not directly recommend a complete privatisation, the OECD is advising its members to introduce market-based signalling systems so that contracting-out of services (Bruttel, 2005) can help improve overall effectiveness (Fay, 1997). A less disruptive step towards taking outcomes into account is the introduction of management by objectives (Mosley, Schütz, and Breyer, 2001), a reform followed by many PES, including the Hungarian one.
Management of performance is possible without a major change of organisational structure, but it requires constant and appropriate monitoring and evaluation of PES-related operations (OECD, 2005). Very few PES monitor effectiveness directly (or at least openly), perhaps because such measurement has its own problems. Among the few studies embarking on direct effectiveness measurement, we find Vassiliev et al. (2006) and Ramirez and Vassiliev (2006). Both use a form of production frontier methods to identify offices falling below the expected effectiveness threshold, and interpret these offices as potentially able to increase their output given their inputs.
Besides administering and delivering active labour market programmes, the PES often provides "Services and Sanctions, a category comprising all measures aimed at increasing job search effectiveness, such as counselling and monitoring, job search assistance, and corresponding sanctions in case of non-compliance" (Kluve, 2010). Although relatively few studies (21 of the 137 observed) look at this type of programme, the author's meta-analysis yields robust results indicating that "Services and Sanctions" is one of the most effective and least costly types of ALMP. Out of 3+5 specifications, services and sanctions always increases the likelihood of an evaluation finding positive treatment effects, and almost always highly significantly. This performance is rivalled only by private sector incentive schemes, such as wage subsidies, while direct employment programmes, for example, perform much worse. These results are quite important from the point of view of the PES because, unlike that of other programmes, the performance of services and sanctions depends heavily on the competencies of the PES itself.

Institutional background
The Hungarian PES plays the role of both an authority paying financial support to the registered unemployed and that of a supporting organisation by providing counselling and delivering various active measures to the clients. Beside its core duties, it also performs many tasks somewhat loosely related to unemployment, including the administration of casual work, administering a large part of the rehabilitation process of disabled workers, or assisting public employment. Resources of the institution did not however keep up with the proliferation of duties. As part of the austerity measures, the number of employees in local offices had begun to shrink already in 2006 and did not increase after the onset of the 2008 crisis either. While one officer attended an average of 206 clients in 2006, the same number had increased to 273 by 2009 (figures from direct HPES communication).
The pressure on the PES was relieved to a certain extent by a series of development projects started in 2003 and still ongoing. Their main aim was to carry out a general reform of operations in order to boost the performance of the PES in improving clients' re-employment potential. At the end of the 1990s, the operation of local PES offices was characterised by very formal and unfriendly spatial arrangements, outdated IT infrastructure and further difficulties. Officers were lacking a general overview of competency areas, while clients were not only served by but also dependent on them, interested mostly in collecting unemployment insurance and benefit payments. Such a situation constrained the ability of the PES to improve outcomes for clients and was therefore important to change. The modernisation efforts dealt with all the above-mentioned areas in 20, 60 and another 60 local offices in the three phases of the modernisation process, respectively. They also affected the National Employment Office, the methodological and coordination centre of the PES.
Here I focus on the so-called HRDOP 1.2 measure (with a budget of HUF 9.3 billion), which was the second of the three phases of modernisation. The aims of the development process are mapped onto projects (often overarching actual measures or programmes) whose combined effect is what I consider here as the intervention to be analysed. A total of 89 projects were targeted at introducing a new model of service provision with client profiling, internal remodelling of the local offices, installing self-help computer terminals, introducing a quality assurance system, staff training and introducing an integrated information system. We expect that all of these had an effect on participating offices, while the development of the integrated information system had an effect on the operations of the whole PES, regardless of programme participation. Because measuring programme effects requires a comparison group, only the effects specific to participating offices can be measured.

The principle for assessing impact of the modernisation
The current analysis aims at measuring the effect of the programme on a specific indicator, such as the re-employment chance of the registered unemployed, at the office level, similarly to the structure put forward by Nagy (2006) in an earlier proposal to estimate programme effects. Being interested in the actual outcome of the programme, I look at the average treatment effect on the treated (ATT). Such an indicator and approach help us to answer questions and clarify doubts raised by Hárs and Nagy (2009) in relation to the original indicators of the programme. Because the analysis is focussed on the local labour markets in which the PES offices are located, this measurement provides us with an estimate of the net effect of the programme, which is the combined effect of the direct and indirect effects. It does not, however, account for the possible displacement effects the programme generates. Given that the programme effects extend to every registered unemployed person, this error is likely to be modest, and the gross effect is therefore unlikely to be very different. I first calculate programme effects using a difference-in-differences (DiD) method from raw data and then adjust them using linear regression, applied first directly to the two groups of offices and then after using matching to homogenise them. The motivation for introducing a technique for sample homogenisation is that, despite what the available information indicates, the comparison group might be different from the treated group, and matching is a preferred method of removing the part of these differences that is explained by observable characteristics (Heckman, Ichimura and Todd, 1998). The resulting four versions of the estimates can be used to cross-check one another, similarly to the ones in the dated but comprehensive evaluation of active labour market programmes in Hungary (O'Leary, 1998).
In order to get rid of the time-invariant effects possibly correlated with programme participation, I have written the estimating equation in differenced form:
$$\Delta OUT_{it} = \tau + \delta\, p_i + \Delta X_{it}\,\beta + \Delta u_{it}$$
where $OUT_{it}$ is the indicator of outcomes, $\tau$ is a constant measuring the autonomous rate of change in this, $p_i$ is an indicator of programme participation, $X_{it}$ is a set of variables indicating relevant observable characteristics of the local offices (with coefficient vector $\beta$), while $u_{it}$ summarises characteristics that are not correlated with these observables. The relationship is defined over PES offices observed at different points in time, $i$ being an index for an office and $t$ an index for a specific time period. The difference ($\Delta$) operator takes the time-difference of a variable between the same month in the before and after periods; note that office-specific fixed effects have already been swept from this equation. Our interest centres on $\delta$, the coefficient on the $p_i$ indicator of programme participation, which delivers the programme effect in this context. One can show that the equation in this form is a direct implementation of the DiD idea, generalised to the multiple-regression case.
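One way to see why differencing removes office-specific effects is to start from an equation in levels. The sketch below is an illustration under stated assumptions, not the original derivation: the office-by-month effect $\alpha_{it}$, the period effect $\lambda^{s}$, the period superscript $s \in \{0, 1\}$ (before, after) and the coefficient vector $\beta$ are symbols introduced here.

```latex
\begin{align*}
OUT_{it}^{s} &= \alpha_{it} + \lambda^{s} + \delta\, p_i\, s + X_{it}^{s}\beta + u_{it}^{s},
  \qquad s \in \{0, 1\} \\
\Delta OUT_{it} &= OUT_{it}^{1} - OUT_{it}^{0}
  = \underbrace{(\lambda^{1} - \lambda^{0})}_{\tau} + \delta\, p_i
    + \Delta X_{it}\beta + \Delta u_{it} \\
\text{without } \Delta X_{it}:\quad
\hat{\delta}_{OLS} &= \overline{\Delta OUT}_{\,p=1} - \overline{\Delta OUT}_{\,p=0}
\end{align*}
```

The term $\alpha_{it}$ cancels in the second line because it takes the same value in both periods, and the last line is the familiar difference of mean changes between participants and non-participants.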
The first set of estimates comes from OLS regression and is based on the assumption that the participant and non-participant groups are similar. This assumption relies on information on programme design, which attempted to select offices for modernisation from every county, without direct connection to the state of the particular office. The $\Delta X_{it}$ term ensures that we take into account the differences developing over time between the participant and non-participant groups of offices, and thus do not confuse these with the effect attributable to the programme itself. Omitting this term would give us the simplest raw DiD estimate of the programme effect.
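As an illustration of the raw DiD calculation, the following minimal Python sketch computes the difference between the mean outcome change of participants and that of non-participants. The dictionary keys, the weighting by the number of registered clients and the four offices are assumptions made for the example; they are not data from this study.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def raw_did(offices):
    """Raw DiD: mean outcome change of participants minus that of controls.

    Each office is a dict with hypothetical keys:
      'participant' (bool), 'before' and 'after' (re-employment rate, %),
      'clients' (number of registered unemployed, used as weight).
    """
    change = {True: ([], []), False: ([], [])}
    for office in offices:
        deltas, weights = change[office["participant"]]
        deltas.append(office["after"] - office["before"])
        weights.append(office["clients"])
    return weighted_mean(*change[True]) - weighted_mean(*change[False])


# Made-up illustrative offices, not figures from the paper.
offices = [
    {"participant": True, "before": 3.0, "after": 4.2, "clients": 2000},
    {"participant": True, "before": 2.5, "after": 3.3, "clients": 1000},
    {"participant": False, "before": 3.4, "after": 4.2, "clients": 1500},
    {"participant": False, "before": 3.0, "after": 3.8, "clients": 500},
]
effect = raw_did(offices)
print(round(effect, 3))  # effect in percentage points
```

The same number would be obtained as the coefficient on the participation indicator in a weighted regression of the outcome change on a constant and that indicator.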
The estimation strategy explained above runs into difficulties if the assumption about the initial similarity of the groups fails and there are differences between participant and non-participant groups that are correlated with the $\Delta X_{it}$ variables or with the indicator of participation. We can treat this problem if we have a sufficiently large number of variables at our disposal that can actually explain these differences to a great extent. Given that there are detailed data on the clients in each office, this seems to be possible in our case. To produce the second set of estimates, I first perform propensity-score matching following the idea of Rosenbaum and Rubin (1983). This procedure amounts to predicting programme participation using a large set of pre-participation variables and using the predicted propensity to find, for every programme participant, non-participant observations that are close to it. After this, I create a counterfactual realisation as a weighted average of the neighbours of the given office and compare the actual outcome to that. A substantial difference of this approach compared to the previous one is that the equation is estimated on a database containing data for the treated group and the counterfactual realisation generated from the control group. Under the current working assumptions, we get a reliable DiD estimate of the ATT by subtracting from each participant's time-difference in outcomes the same difference for its matched comparison set, and then averaging over participants (Heckman, Ichimura and Todd, 1998). It is important to stay on the common support of the observable characteristics during matching, that is, only those observations should be matched that actually have similar values of the variables.
Although a similar effect could be achieved by homogenising with OLS (that is, by including the variables used for matching in levels in the estimating equation), the advantage of matching over OLS is its flexibility: it allows for hidden non-linearity in controlling for differences. Because of this, however, the usual way of calculating standard errors for the estimator would be misleading. Although to my understanding there is no clearly preferred solution to this problem, bootstrap estimates are often used, and this is what I calculate.
Matching with simple averaging produces an estimate that is similar to a simple DiD estimate in that it does not account for the changes in characteristics between groups occurring after the onset of the programme, that is, it omits the $\Delta X_{it}$ variables. If a control for over-time changes is needed, it has to be introduced before matching. As a result, we are not matching on the original re-employment rates but on residuals from a first-step regression similar to the one used in simple OLS estimation, but without the programme-participation indicator (this process is analogous to the residual-regression approach in the simple OLS context). Standard errors have to be calculated using the bootstrap as before; the presence of generated variables is another reason for this here.
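The matching step can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure actually used: it assumes the propensity scores have already been estimated, weights the k nearest neighbours equally, and defines the common support simply as the range of control-group scores. The function name and all numbers are hypothetical.

```python
def matched_did_att(treated, controls, k=1):
    """ATT from k-nearest-neighbour matching on a propensity score.

    treated, controls: lists of (pscore, delta_outcome) tuples, where
    delta_outcome is the after-minus-before change in the outcome.
    Returns (att, number_of_matched_treated_units).
    """
    lo = min(ps for ps, _ in controls)
    hi = max(ps for ps, _ in controls)
    gaps = []
    for ps, delta in treated:
        if not lo <= ps <= hi:
            continue  # treated unit lies outside the common support
        nearest = sorted(controls, key=lambda c: abs(c[0] - ps))[:k]
        counterfactual = sum(d for _, d in nearest) / len(nearest)
        gaps.append(delta - counterfactual)  # participant minus matched controls
    if not gaps:
        raise ValueError("no treated unit on the common support")
    return sum(gaps) / len(gaps), len(gaps)


# Made-up scores and outcome changes, for illustration only.
treated = [(0.60, 1.1), (0.45, 0.9), (0.95, 2.0)]
controls = [(0.65, 0.8), (0.40, 0.7), (0.50, 0.9), (0.20, 0.5)]
att, n_matched = matched_did_att(treated, controls, k=2)
print(round(att, 3), n_matched)
```

In this toy example the third treated unit has a score above every control and is dropped, which is what staying on the common support means in practice. Kernel or local linear matching would replace the equal weights with distance-dependent ones.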
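A bootstrap standard error of this kind can be sketched as follows. The code is a generic illustration: it resamples whole units with replacement and recomputes an estimator on each replicate. The placeholder estimator here is a plain difference of mean changes; in the matching context the entire procedure, including the estimation of the propensity score, would be re-run inside each replication. All names and data are hypothetical.

```python
import random
from statistics import stdev


def bootstrap_se(units, estimator, replications=100, seed=0):
    """Bootstrap standard error of estimator(units), resampling whole units."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = [rng.choice(units) for _ in units]
        try:
            estimates.append(estimator(sample))
        except ZeroDivisionError:
            continue  # this resample contained only one of the two groups
    return stdev(estimates)


def mean_change_gap(units):
    """Placeholder estimator: mean outcome change, participants minus controls."""
    treated = [d for p, d in units if p]
    control = [d for p, d in units if not p]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Made-up (participant, outcome change) pairs, for illustration only.
units = [(True, 1.2), (True, 0.9), (True, 1.0), (True, 1.1),
         (False, 0.8), (False, 0.7), (False, 0.9), (False, 0.85)]
se = bootstrap_se(units, mean_change_gap, replications=100, seed=1)
z = mean_change_gap(units) / se
print(se, z)
```

Fixing the seed makes the replications reproducible; a z-statistic is then the point estimate divided by the bootstrap standard error.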
The above techniques allow us to control for differences between the treatment and the control group both before and after the introduction of the programme in observable characteristics. There can however still be correlation between observed and unobserved effects. I assume the lack of such effects which I cannot prove, only argue for later. Should this argument fail in practice, the amount of the inconsistency in such cases depends greatly on the size and direction of effects governing such selection.

Data and the estimation method
This study uses data primarily from the central database (Integrated System, in Hungarian: "Integrált Rendszer", IR) developed within the framework of the HRDOP 1.2 measure itself, containing individual data on the registered unemployed since 2000. In order to estimate the programme effect, I was provided with these data aggregated at the level of the local offices by the Employment Office. Using spatial identifiers of the offices, I have attached to these records data relating to local labour markets, coming from the database of the Hungarian Central Statistics Office on individual settlements (T-STAR).
Individual-level data on registered clients in the IR contain information on sex, age, education, and occupational code of previous job as well as an indicator of disability. Aggregate indicators calculated from these data play two roles. On the one hand, they serve as the X variables in the estimating equation, characterising the PES offices (with their post-programme values). On the other hand, they control for initial observable differences between participant and non-participant groups in the matching process (with their pre-programme values). The indicators are all defined as the share of a particular type of registered client within all registered clients.
In the case of the registered unemployed staying in touch with the local PES office, we know the direction of exit at the end of the registered status. The possible directions of exit are the following: (1) employment (open market); (2) public works; (3) supported employment (various forms of wage subsidy); (4) training; (5) not known due to lack of cooperation with the PES.
The share of registered clients exiting towards any of these directions is an estimate of exit probability, an indicator of a certain outcome. Given that the primary goal of the PES is to facilitate matching on the labour market, the most directly relevant measure of effectiveness is the share of clients exiting the registry towards unsupported employment on the open labour market, the rate of re-employment. Even though the data are aggregate, they are directly related to individual behaviour: the number of exits relative to the unemployment pool is analogous to the individual probability of exit. Because the PES produced the exit data for different groups of the registered unemployed defined over individual characteristics such as age, education or disabled status, I run the regressions for all of these groups. Differentiating behaviour on the basis of these groups allowed me to assess the heterogeneity in the impact of the programme even with aggregate data.
A great advantage of using data coming from the administrative records of the PES is that they are part of a complete account of the event history of the registered unemployed, but their administrative nature has drawbacks too. Based on the register only, we know little about the labour market history of those unemployed who have not been eligible for financial support. Also, because not all jobseekers are strictly required to report the direction of exit, this information is not available for everyone. Contact, and thus reporting, is required in principle, and without it one loses eligibility for financial benefits administered by the PES. Yet the penalty, a loss of eligibility for benefit for 3 months, is not severe for clients who were not eligible in the first place or who have exhausted such benefits. In relation to the current analysis, this means that the direction of exit is measured without error only for those eligible for benefit. We have to note also that this error can be correlated with the factors determining the chance of exit and we have no outside information to assess its size. If the measurement error is strong, it lowers the value of the outcome variable by not counting every successful exit to the open labour market. However, we have no reason to suppose that the reverse can happen, so I expect that the error does not increase the counted number of exits.
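To make the definition of the outcome indicator concrete, the sketch below computes exit rates by destination for one office in one month. The destination labels follow the five exit directions listed earlier; the function name and the counts are invented for the example.

```python
from collections import Counter

DESTINATIONS = ("employment", "public_works", "supported_employment",
                "training", "unknown")


def exit_rates(exits, registered):
    """Share of registered clients leaving towards each destination.

    exits: iterable of destination labels, one per client leaving the register
    registered: number of registered unemployed in the office-month
    """
    counts = Counter(exits)
    return {dest: counts[dest] / registered for dest in DESTINATIONS}


# Invented office-month: 1000 registered clients, 130 recorded exits.
exits = ["employment"] * 40 + ["unknown"] * 80 + ["training"] * 10
rates = exit_rates(exits, registered=1000)
print(rates["employment"])  # the re-employment rate used as the outcome
```

Computing the same shares within subgroups (for example by age or education) gives the group-specific outcomes used to assess heterogeneity.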
In order to apply the DiD method here, we have to choose appropriate before and after periods. Considering that the HRDOP 1.2 measure was rolled out between the second half of 2004 and the first half of 2008, and also that the effects of the economic crisis were very apparent in the third quarter of 2008, I chose the first 6 months of 2004 to be the before period and the first 6 months of 2008 to be the after period. The evidence presented in Card, Kluve and Weber (2010) suggests that in order to assess the programme effects fully, one should ideally follow and observe programme participants for years after the end of the programme, which is definitely a longer period than what is possible here. On the one hand, the effect of the economic crisis was very asymmetric regionally, and extending the observation period would run the risk of confounding the effect of the crisis with that of the programme. On the other hand, development of the PES has continued through the SROP 1.3.1 project, and this has basically eliminated the control group.
There are 158 local PES offices in the analysis, only those existing in the first half of both 2004 and 2008. I have omitted two outlier offices in Budapest, the one specialising in helping homeless people (on Haller Street) and the one specialising in helping higher education graduates (on Andrássy Street). I have also omitted those two offices where all steps of the modernisation were completed in the previous round.
If we look only at the formal definitions, we can consider offices modernised during the HRDOP 1.2 measure as participants and those modernised neither during the previous phase nor during the HRDOP 1.2 measure itself as non-participants. However, as only 7 offices adopted the quality assurance framework during the previous phase and the rest (13) did so only during the HRDOP 1.2 measure, I also consider the latter as participants for current purposes. Note that this does not affect the number of non-participants, as those having participated in the previous phase are not counted towards them. The end result is that out of the total 158 offices, we have 71 participants and 85 non-participants as their controls.
As I have already mentioned, participant offices for the development project were selected from smaller and larger towns in every county, which provided a degree of randomness in selection. However, interviews conducted during a broader evaluation exercise indicated that participation chances were biased to some extent towards offices in worse shape. This observation warrants caution towards estimates that do not take such differences into account and prompts at least a comparative estimation with homogenisation of treatment and control groups based on initial differences between offices. Because it proved to be impossible to collect comprehensive information on the actual condition of the buildings or on a similar indicator for the pre-programme period, I used the characteristics of the clients as a proxy during matching.
As the programme elements were delivered at almost the same time to all offices (excluding the 13 offices where the self-help terminals and the new service model were already in place), we can look only at the combined effect of the installation of self-help terminals, the introduction of client profiling and the adoption of the quality assurance system. This means that if we do measure an effect, we cannot tell from which programme element it comes. Conversely, if we do not measure an effect, we cannot tell whether all elements were ineffective or whether there are powerful effects at work pointing in opposite directions.
Figure 1 shows monthly exit rates from the unemployment register between 2004 and 2008. Based on this evidence, the two major exit directions are (1) open market employment, with a rate of around 4 per cent by the end of the period, and (2) not known due to lack of reporting back to the PES, with an average of around 8 per cent. These figures are similar to those observed in Eastern European countries on average (Kuddo, 2009).
Besides the slight increase in exit to employment, we can observe a much stronger decrease in the exit rate to the not known state. This trend is already present from 2000 on (not visible on the graph), which indicates that the decrease is not to be attributed to the modernisation process. Exit rates towards all destinations also appear to show seasonal cyclicality. Figure 1 shows that re-employment rates grow particularly strongly during the summer and decrease during the winter. The reason for this is partly that a large number of seasonal jobs are offered during the summer, and partly that subsidies are made available during the spring, with the take-up rate peaking by the end of summer.
Even though the process started well before 2004, the large and trending decrease in the rate of exit towards an unknown state raises the question of whether measurement error was indeed present in the indicator of the exit route. Such a measurement error affects this analysis if the change in exit rates is correlated with the error affecting our chosen effectiveness indicator. In that case, for example, those who were likely to report an unknown destination become more likely over time to report exit to an unsupported job. The most likely cause of this lack of information is the lack of motivation to keep in touch with the office. Clients are motivated either directly, for instance when contact is required for benefit payment, or indirectly, through the provision of services desirable to the client. The participation of an office does not affect administrative rules directly, but clients registered with modernised offices may find contact more useful. This can lead to more frequent contact between the office and the client, to a greater likelihood of reporting exit to a job and ultimately to a decrease in the measurement error. Such a process creates a negative correlation between participating in the modernisation programme and the measurement error in the exit rate, leading to the overestimation of the programme effect (by the usual omitted variable arguments). Although we can be sure that such a distortion exists, I suspect that its size is likely to be small. It probably also appears together with other noise distorting the estimates in an unknown way, which is one reason I use different estimation methods and specifications for measurement. In the absence of this negative correlation, measurement error would merely decrease the precision of the estimates, as it appears on the left-hand side of the equation.
Table 1 shows re-employment rates in the pre- and post-programme periods based on office-level data, weighted by the number of unemployed registered with the given office. The re-employment rate increased greatly from 2004 to 2008. Programme participants experienced an increase of 1 percentage point, while the increase was 0.8 percentage points in the case of the control group. Using the DiD method, the programme effect is the difference between these two changes, 0.23 percentage points. This number is not small compared to the overall re-employment rate, but it is not significantly different from zero. Looking at the same thing from a different angle, we see that the initial gap between the participant and non-participant offices in re-employment rates in 2004 had basically vanished by 2008.
Note: Without participants of the first phase of the modernisation process; averages are weighted by the number of unemployed registered with the local office.
Source: Own calculations using data aggregated from the IR of the PES
Using data aggregated to the level of the whole country, Figure 2 shows the changing share of vulnerable client groups over time. These include those without a maturity exam, those aged above 50 (the 50+), labour market entrants and disabled persons (clients can be counted in more than one group, hence the proportions add up to more than 100 per cent). The most pronounced change is the growth of the share of the 50+ among the registered clients, which was a mere 15% in 2000 but grew by 5 percentage points in 10 years. This is partly explained by the rise in retirement age, partly by the autonomous increase in their level of education. The share of those without a maturity exam decreased slowly but steadily, showing a strong seasonal pattern: it decreased rapidly during the summer months, which provide seasonal jobs, but increased during the winter. The share of labour market entrants shows more muted but still strong seasonality, with a reversed time pattern: their share increases mostly during the summer. Their record high share was 10% during 2006. The share of the disabled unemployed is stable below the level of 5% from 2002 on. These changes vary substantially at the level of local offices and are an important part of the external effects we have to control for during estimation.
Source: Own calculations using data aggregated from the IR of the PES
When using a DiD method, it is important to have sufficiently similar participants and non-participants on average so that the latter form a valid control group for the former. Table 2 shows the averages of indicators of offices' characteristics at the beginning of 2004, just before programme participation. There are three types of indicators: one set includes the characteristics of the registered unemployed, the second includes their exit rates towards different directions and the third includes characteristics of the local labour market. The latter were obtained from the on-line public database of the Hungarian Statistics Office on municipalities. Given that more than one municipality belongs to one local PES office, I have aggregated these data and then assigned them to the record of the appropriate local office using the matching file provided by the Employment Office. I have considered Budapest, the capital, as one labour market, so the same trends are matched to all offices there. These data enable us to control for external effects (such as the business cycle) not already captured by the changing composition of the pool of registered unemployed.
Participating and non-participating local offices appear to be very similar: there is no real difference either in re-employment chances or in local labour market conditions in terms of group means. The main difference is that there are almost twice as many clients registered with participating offices on average as with non-participating ones, whereas the share of better educated clients is larger in the latter (with very low absolute shares). Not only are the mean values very similar, but so is the spread of the indicators (not shown in the table); the requirement of staying on the common support during the DiD analysis with matching was therefore easy to satisfy.
Source: Own calculations using data aggregated from the IR of the PES and TSTAR data from the Hungarian Statistics Office
I work with aggregate data during estimation, in which each office appears more than once, and this has a direct effect on the calculation of the standard errors of the estimates. In order to take seasonal effects into account and increase efficiency at the same time, I use observations for 6 months separately for each office in the period before and after the programme, respectively. This way every office contributes six observations to the estimation, and the final estimate is an average of the monthly effects. Since there is a high degree of autocorrelation between the time-periods, I calculate clustered standard errors. Aggregating units that contain different numbers of observations creates a well-known form of heteroskedasticity; I therefore weight the regressions by the number of registered individuals.
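The weighting and clustering described above can be illustrated with a minimal sketch. It covers only the simplest case, a weighted regression of the change in the re-employment rate on the participation indicator with no further controls, where the estimate equals the difference of two weighted means. It assumes that clusters are local offices and uses the plain cluster-robust sandwich variance without a finite-sample correction, so its standard errors will differ slightly from those of statistical packages that apply one. The function name and the data layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def weighted_did(rows):
    """Weighted DiD on differenced data with office-clustered standard errors.

    rows: iterable of (office_id, treated, delta_y, weight), one row per
    office-month. delta_y is the post-minus-pre change in the re-employment
    rate and weight is the number of registered unemployed.
    Assumes every office is either treated or control in all of its rows.
    """
    rows = list(rows)

    # Weighted mean change and total weight in each group.
    means, totals = {}, {}
    for group in (True, False):
        totals[group] = sum(w for _, t, _, w in rows if t == group)
        means[group] = sum(w * dy for _, t, dy, w in rows if t == group) / totals[group]

    effect = means[True] - means[False]

    # Sum the weighted residuals within each office (the cluster).
    score = defaultdict(float)
    group_of = {}
    for office, t, dy, w in rows:
        score[office] += w * (dy - means[t])
        group_of[office] = t

    # Cluster-robust variance (no small-sample correction): squared office
    # scores, scaled by the squared total weight of the office's group.
    variance = sum(s ** 2 / totals[group_of[o]] ** 2 for o, s in score.items())
    return effect, sqrt(variance)


# Hypothetical data: two participating and two control offices, two months each.
example = [
    ("A", True, 1.0, 100), ("A", True, 1.2, 100),
    ("B", True, 0.8, 200), ("B", True, 1.0, 200),
    ("C", False, 0.7, 100), ("C", False, 0.9, 100),
    ("D", False, 0.8, 100), ("D", False, 0.8, 100),
]
effect, std_error = weighted_did(example)
```

Summing residuals within an office before squaring is what lets correlated monthly observations of the same office count as less than independent information.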
Explanatory variables in the binary model for programme participation include the January 2004 values of the variables characterising local labour markets in the parametric estimating equations, as well as levels, squares and cross-products of outflow rates towards unsupported employment and unknown direction. I have calculated z-statistics using the bootstrap method, with 100 replications. I have used the PSMATCH2 Stata module for matching (Leuven and Sianesi, 2003). I experimented with different averaging methods such as 1:1, k-nearest neighbour, kernel and local linear matching.
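To make the kernel variant of DiD matching concrete, the following is a minimal sketch of how an average effect on the treated could be computed from differenced outcomes. It is not the PSMATCH2 implementation: the propensity scores are taken as given rather than estimated, the Epanechnikov kernel and the bandwidth of 0.06 are assumptions made for illustration, and the bootstrap used for the z-statistics is omitted. All names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel; zero outside |u| < 1."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0


def kernel_did_att(treated, controls, bandwidth=0.06):
    """Average effect on the treated from kernel matching on differenced outcomes.

    treated, controls: lists of (propensity_score, delta_y), one per office,
    where delta_y is the post-minus-pre change in the re-employment rate.
    Treated offices with no control inside the bandwidth are dropped, a crude
    stand-in for the common-support restriction.
    """
    gaps = []
    for p_i, dy_i in treated:
        weights = [epanechnikov((p_i - p_j) / bandwidth) for p_j, _ in controls]
        total = sum(weights)
        if total == 0.0:
            continue  # off support: no comparable control office
        # Kernel-weighted average change among similar control offices.
        counterfactual = sum(k * dy_j for k, (_, dy_j) in zip(weights, controls)) / total
        gaps.append(dy_i - counterfactual)
    if not gaps:
        raise ValueError("no treated office has a control within the bandwidth")
    return sum(gaps) / len(gaps)


# Hypothetical offices: (estimated propensity score, change in re-employment rate).
participants = [(0.50, 1.0), (0.62, 1.1)]
non_participants = [(0.47, 0.6), (0.53, 0.8), (0.60, 0.9)]
att = kernel_did_att(participants, non_participants)
```

Unlike 1:1 matching, every control office inside the bandwidth contributes to the counterfactual, with nearer offices weighted more heavily.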

Estimation results
I start presenting results with estimated coefficients from a simple OLS regression of the differenced estimating equation, using the method explained earlier, including the restriction to the common support obtained from the participation equation in the matching estimator. It is worth noting that without this restriction, results are stronger than those reported below. Estimates related to the re-employment chances of an average registered unemployed person are shown in Table 3, the programme effect being the coefficient on the participation indicator in the first row. Results from the simplest specification (1) merely echo the results seen in Table 1, indicating a programme effect of 0.16% point, or 5% (the numerical difference is due to slight differences in aggregation). This estimate is not significant at conventional levels. Besides the lack of certain control variables, this can be due to the negative bias caused by measurement error. Although I am interested in the precise estimation of the programme effect rather than in modelling the change itself, it is interesting that the equation explains very little of the variation in the change in effectiveness. Specification (2) improves upon this situation by including separate indicators for all months to filter out seasonal effects. Although this has increased explanatory power to 14%, indicating the importance of seasonality in re-employment, neither the estimate of the programme effect nor its precision has changed.
Specification (3) includes even more information, most importantly the (difference of the) share of registered clients with particular characteristics: age, education, and labour market entrant status. Besides the rise in explanatory power, we observe an increase in the programme effect to 0.3% point and an improvement in precision that makes the estimate significant. The size of the effect is close to the one obtained with matching (see later), but is somewhat larger than the raw estimate. Variables included in this specification capture the changes in clients' composition over four years.
The coefficients attached to specification (3) should be interpreted with a caveat: their own effect is biased by their possible role as a proxy for the measurement error, and I am not able to disentangle the two. In a further specification for checking robustness (not shown here) I included the rate of exit towards unknown state as a proxy for measurement error. This variable carries a lot of extra information and, being a dominant share in the same population, it is very likely to "over-control" the programme effect. Including this variable, the programme effect increased marginally and its significance decreased (but results remained significant).
In the next step, I have included variables in the estimation that are meant to capture the characteristics of the local labour market. Results obtained with the new specification (4) are similar to those from the first two specifications, with both the programme effect and its significance dropping. Along with the increase in the explanatory power of the model, this shows that there is insufficient information for this extension of the model: multicollinearity between the variables decreases precision by more than the extra information they bring in improves it. Although the new estimate of the programme effect is smaller than it was before, the confidence interval around it is wide enough to include the previous estimate. For this reason, I use specification (3) as my preferred one in what follows.
Note: Estimated coefficients with p-values within parentheses below them. * significant at 0.10 level, ** significant at 0.05 level, *** significant at 0.01 level.
After the completely parametric estimates, I turn to matching to take into account possible initial differences between the local offices, which I have assumed away so far. Table 4 shows estimated programme effects and associated bootstrap z-statistics from simple DiD matching using various methods. The programme effect is the average effect on the treated, that is, the average change over time in the difference between the re-employment rates of participant local offices and those of their synthetic counterfactuals. With estimates constrained to the common support, programme effects are positive in all cases, but they are insignificant and smaller in magnitude than the raw effects in the case of averaging methods using all data, such as the kernel and local linear methods. Based on lessons from scenario 4 of Frölich (2004) relating to the analysis of groups of small and similar size, results using these non-parametric methods are the most credible and I consider these as a preferred specification.
The next step is to combine matching and parametric estimation in order to control for both initial differences and changes in characteristics during the programme period using the two-step method outlined earlier. Based on earlier results, I use the kernel method in matching and include the parametric residual-generation in the bootstrap procedure used for the z-statistics of significance of the parameters. Results from controlled matching are shown in Table 5 following the structure adopted in Table 3. Controlling for seasonality made the smallest difference here, the programme effect increasing to 0.5% point and being significant at the 0.01 level. This value decreases slightly but not significantly after including the composition of the client pool of the local offices. Including characteristics of the local labour markets has a similar effect, decreasing both the level and the significance of the programme effect.
For the same reasons explained earlier, the most credible and thus preferred results come from specification (3). Replacing raw numbers with those coming from a multivariate DiD method combined with matching has thus small but significant net effects. These estimates benefit from correcting for both initial differences and those developing over time, and can thus be regarded as more credible than those ignorant of such differences.
Working with numbers aggregated over the whole client pool, we could not so far look at the heterogeneity of the programme effect. Although we cannot separate effects that are attributable to different types of interventions due to the lack of appropriate data, we can attempt to estimate this composite effect for different subgroups of clients. Given that some parts of the programme have targeted some types of clients, we can actually obtain some information that can be related to specific parts of the programme. One example of this is self-help terminals, which are targeted more at the better educated clients: obtaining a positive programme effect for the latter makes it more likely that elements targeted at them could have worked better. The new service model, on the other hand, is more likely to benefit the less able, where we can apply the same argument. This makes it worthwhile to replicate the above analysis using figures that are aggregated for a specific type of client only. Another dimension of the heterogeneity of the programme effect is the direction of exit. It is possible, for example, that the new service model is more effective in directing clients towards training, but not so effective in directing them towards employment. For this reason, it is also worth replicating the analysis for different outcome indicators.
In order to look at the effect of modernisation on different groups of clients and with regard to different outcome indicators, I have replicated the analysis for all combinations of populations and outcome indicators. Table 6 shows the essence of the results as a collection of estimates of the programme parameters from a regression specified as version (3) in Table 3. Figures marked with a star show coefficients that are significant at a minimum of 0.1 level. The main conclusion from Table 6 is that the programme helped open-market employment the most: two out of three significant effects are estimated in this case. The table shows that the 0.3% point average estimate of the programme effect is an average of larger and significant impacts on a few subpopulations with a large number of members on the one hand and smaller and less significant effects on more subpopulations with fewer members on the other. While we do not see a significant effect for the young and the 50+ in the case of open-market employment, the effect for the prime-age group is well above the average at 0.38% point. There is no real difference in terms of educational attainment, but coefficients are rather imprecisely estimated in that case. Finally, the effect for those already on the labour market is significantly larger than the average. Other coefficients are not significant at conventional levels, except the exit rate towards ALMPs for those with higher education, where the programme effect is negative and significant. If this effect is real, it can be attributed to the better information provided and the selection mechanism put to work, and can suggest that less participation in ALMPs might be appropriate for this group.

Conclusions
This study has provided additional evidence on the effectiveness of a public employment service as an employment policy measure. It presented a quantitative evaluation of the effect of the 2004-2008 phase of the modernisation of the Public Employment Service in Hungary, using exit chances from the unemployment registry as outcomes. The analysis used matching to control for initial differences between participant and non-participant local offices in terms of the composition of their clients and also took changes over time into account. Based on the results, we can conclude that the modernisation had a significant positive effect on re-employment chances and that this is robust to various changes in the specification. Although data restrictions did not allow me to separate the effects of different programme elements, analysis of subgroups revealed that the programme effect was strongest in the case of prime-age workers.
The final numerical results include several corrections and are slightly larger than the one obtained from averaging raw numbers. During the period between 2004 and 2008, re-employment chances rose from 3.86% to 4.91% in local offices that participated in the modernisation programme. I estimate that out of this almost 1 percentage point change, the impact of the programme was around 0.30-0.48 percentage points. In the first half of 2008, the number of registered unemployed was 450000, of which 263000 were registered with programme participant local offices. Given that 5% of them became employees on the open labour market after one month, approximately 800-1200 of them became employed as a result of the development programme. The approximately 5% exit rate measured in 2008 means that the average length of such a spell is 100/5 = 20 months (assuming a constant hazard of exit). In the counterfactual case of the programme not being rolled out, based on the change in exit probability as a result of the programme, we can calculate this duration to be between 100/(5 - 0.3) = 21.3 and 100/(5 - 0.48) = 22.1 months. This means that the length of the unemployment spell was shortened by 1.3-2.1 months by the programme for clients registered with the participating local offices.
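The arithmetic behind these figures can be reproduced with a short sketch. It uses only the numbers quoted above and the constant-hazard assumption stated in the text, under which the expected spell length is the reciprocal of the monthly exit rate. The function and variable names are hypothetical.

```python
def expected_duration_months(monthly_exit_rate_pct):
    """Mean spell length in months under a constant monthly exit rate (in %)."""
    return 100.0 / monthly_exit_rate_pct


observed_rate = 5.0                # approximate monthly exit rate in 2008, in %
effects = (0.30, 0.48)             # estimated programme effect range, in % points
registered_participants = 263_000  # clients of participating offices, first half of 2008

observed = expected_duration_months(observed_rate)
for effect in effects:
    # Without the programme the exit rate would have been lower by the effect.
    counterfactual = expected_duration_months(observed_rate - effect)
    extra_exits = registered_participants * effect / 100.0
    print(
        f"effect {effect:.2f}: counterfactual {counterfactual:.1f} months, "
        f"shortening {counterfactual - observed:.1f} months, "
        f"about {extra_exits:.0f} additional exits"
    )
```

The two counterfactual durations round to 21.3 and 22.1 months, and the implied numbers of additional exits (about 789 and 1262) are in line with the approximate range of 800-1200 quoted in the text.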
Because the development of the PES can be considered as a labour market programme similar to ALMPs, one might want to ask how the benefits from the modernisation effort compare to its costs and to alternative programmes. The first question is not easy to answer, because parts of the programme are difficult to separate and even if this were possible, their costs are difficult to account for. In the extreme case of interpreting the programme as an impulse that creates an everlasting effect, the expense of HUF9100 million is equivalent to an annual cost of HUF273 million spent forever. If we assume that the modernisation is a "programme" from which all unemployed registered with the participant local offices benefit and divide this cost among them, this spending amounts to a yearly HUF1038 thousand, a monthly HUF86 thousand cost. Comparing this to the monthly costs of training programmes and subsidies for self-employment, being a monthly HUF101000 and HUF177000 per capita respectively, the modernisation is not only effective, but does not appear to be costly either (see Tables 12.4 and 12.5 in Bálint, 2012).