Occupational Upgrading and the Business Cycle in West Germany

The occupational skill structure depends on the business cycle if employers respond to shortages of applicants during upturns by lowering their hiring standards. Devereux uses this implication to construct empirical tests of hiring standards adjustment (the so-called Reder hypothesis) and finds affirmative evidence for the U.S. labour market. The authors replicate his analysis using German employment register data. For the occupational skill composition they obtain somewhat weaker but qualitatively similar responses to the business cycle, despite well-known institutional differences between the U.S. and German labour markets. The responsiveness of occupational composition wages to the business cycle is considerably lower in Germany.

Introduction
Reder (1955) tries to explain occupational wage differentials based on the notion of hiring standard adjustments. By claiming that "When applicants become scarce, employers tend to lower the minimum standards upon which they insist as a condition for hiring a worker to fill a particular job ..." (Reder, 1955: 834) he rejects wage competition as the only source of adjustment in the labour market. Though the difference between wage and quality adjustment appears to be innocuous at first glance, wage adjustment is compatible with standard neoclassical wage competition whereas quality adjustment may generate efficiency wage problems (see Thurow (1975) and Schlicht (2005) for expositions of the argument and Bewley (1999) for survey evidence). Mortensen (1970) shows (in the framework of matching theories) that firms may combine wage and hiring standards adjustment even if wages are flexible.
If employers respond to labour shortage during upturns by lowering their hiring standards instead of bidding up wages, the average skill level within occupations should decrease. Therefore an empirically testable implication of Reder's theory is that the average skill level within occupations should be counter-cyclical. Devereux (2002) implements a test using the identifying assumption that (at least in the short run) jobs in the same occupation are characterised by identical skill requirements. 1 He selects new hires (job starters and movers between firms) from the CPS data and forms occupation-year cells. The test is conducted by computing proportions of skilled employees for each cell and regressing them on the national unemployment rate and a rich set of control variables. A positive coefficient of the unemployment rate indicates counter-cyclicality of the new hires' skill levels. Devereux finds the theory confirmed for the U.S. data. 2
A second implication of Reder's theory is that the occupational composition of employment should change systematically over the business cycle. If employers respond to labour shortage during upturns by lowering their hiring standards instead of bidding up wages, they create possibilities for employees to improve their wages by occupational upgrading, i.e. by moving from low wage occupations to higher paid ones. The opposite (downgrading) should be observed during downturns. Devereux (2002) implements a test by selecting a sample of job changes, computing a measure of occupational quality and regressing it on the unemployment rate and control variables. He finds that employees are likely to move to higher-paying occupations if the unemployment rate is falling, i.e. the occupational composition wage is procyclical in the U.S.
Though Devereux's work for the U.S. is highly interesting, it cannot answer the question whether hiring standards adjustment is a distinctive feature of the otherwise highly flexible U.S. labour market. To assess whether Reder's theory captures a general aspect of adjustment in the labour market, one has to replicate Devereux's investigation for labour markets with a distinctively different institutional background. Because of its highly structured and regulated vocational training system, Germany suggests itself as a comparison candidate par excellence. If hiring standard adjustment takes place even if these standards are formally fixed in recruitment procedures and centralised wage agreements or remuneration schemes, it can be considered a general and important adjustment mechanism.
1 For tests based on different empirical implications see e.g., Ludsteck and Haupt (2007), who apply quantile regression techniques to identify effects of upgrading on the conditional wage distribution.
2 A related analysis on crowding out of unskilled workers in the business cycle is presented in Pollmann-Schult (2005). It is, however, not informative regarding Reder's theory because it does not control for cyclical between-occupation shifts.
We replicate Devereux's analysis of skill shares and occupational upgrading using employment register data from the German Federal Employment Agency (Bundesagentur für Arbeit), which comprise detailed wage and demographic variables for all dependent employees covered by the social security system (about 80 percent of the work force). Particular advantages of our data are their huge size (about 20 to 25 million workers per year for West Germany) and the long time period covered. We use similar estimation methods but have to account for small differences in the data such as censored wages. The similarity of the data and the estimation approach yields estimation results that we regard as directly comparable to the U.S. results.
In spite of the often emphasized differences between the U.S. and German labour markets, we find noteworthy similarities in the cyclicality of the occupational skill composition for both countries. The responses for West Germany amount to about 70 percent of the U.S. values. The analysis of occupational wage upgrading yields a similar result. We find that the occupational composition wage is procyclical, but in Germany the responsiveness is substantially lower than in the U.S. At first glance these similarities are surprising because the labour markets in both countries are characterized by quite different institutional frameworks. But a more detailed inspection of the related theoretical and institutional aspects of the issue will direct us to plausible explanations for the observed differences.
The plan of the paper is as follows. In the next section we describe our data, explain the data selection and processing steps and discuss differences from Devereux's data. Section 3 presents our analysis of skill proportions and Section 4 our analysis of occupational wages. Both sections start with a description of the econometric model, which is followed by the estimation results. We conclude with a summary.

Data
Our analysis is based on the employment register of the German Federal Employment Agency that includes information of daily accuracy on all employees liable to social security in Germany. These register data stem from the employers' periodic notifications which are the basis for the calculation of individual social security contributions and social security claims such as unemployment benefits or pensions.
We choose all observations for the years 1980 to 2004 and create cross-sections for the reference date June 30. As our definition of new hires relies on information from the year before, we are able to analyse 24 years from 1981 to 2004. The sample is restricted to employees who work in West Germany for two reasons. First, information for East Germany is not available before 1993. Second, the educational and vocational system in the former communist state differed considerably from the West German one. More importantly, productivity of East German workers may have been lower in the past as they were trained and worked with different and outdated equipment in the communist economy. We keep full-time workers aged between 20 and 60 years and exclude apprentices. Marginal employees who are not liable to social security contributions because their wage is below a certain threshold are dropped because they are not included in the employment register until 1999. For employees with more than one job we keep the job with the highest wage, which we consider the main job.
We select all employees who did not work in the same establishment on June 30 of the year before and consider them as new hires. They either come from the education system, from unemployment or from jobs in other establishments. Establishments are identified by the establishment id that is assigned to every establishment by the local employment agency. 3 We use information on occupation, age, gender, educational degree, nationality, and wages. The information on the educational degree shows some inconsistencies, and missing values do not occur at random. We therefore replace missing values with values from preceding or succeeding observations of the same person (Fitzenberger et al. (2006); our imputation procedure is a simplified version of theirs). Employers tend to report the qualification according to the performed task. This could corrupt our estimates if it occurs when an employee is up- or downgraded. To avoid such errors the qualification is corrected by replacing all observations of an employee with his or her modal qualification. Since the qualification can be changed by training spells within the employment biography, the mode corrections relate to periods between training spells. Further remarks on data quality and the aggregation of the educational degree to 3 education levels are given in the appendix.
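The two data steps described above, flagging new hires and applying the mode correction to reported qualifications, can be sketched as follows. This is a minimal illustration and not the authors' code: the dictionary layout, the worker and establishment labels and the function names are assumptions, and the restriction of the mode correction to periods between training spells is left to the caller.

```python
from collections import Counter

def flag_new_hires(prev_year, this_year):
    """Return ids of workers whose establishment on June 30 differs from the
    one recorded on June 30 of the year before, or who were not observed then.
    Both arguments map a (hypothetical) worker id to an establishment id."""
    return {pid for pid, estab in this_year.items()
            if prev_year.get(pid) != estab}

def mode_qualification(reports):
    """Replace every reported qualification within one period by the most
    frequent report of that period (the caller splits at training spells)."""
    most_common = Counter(reports).most_common(1)[0][0]
    return [most_common] * len(reports)

# Hypothetical June 30 cross-sections for two consecutive years.
june_prev = {"w1": "estab_A", "w2": "estab_B"}
june_this = {"w1": "estab_A", "w2": "estab_C", "w3": "estab_A"}
new_hires = flag_new_hires(june_prev, june_this)

# One deviating report ("low") is overwritten by the modal value.
corrected = mode_qualification(["medium", "low", "medium", "medium"])
```

Here `w2` (establishment mover) and `w3` (entrant) count as new hires, while `w1` does not.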
In analogy to Devereux (2002) our skill proportions analysis is based on the proportions of (a) high-skilled workers and (b) qualified workers (medium- and high-skilled) for each cell. The comparability of group (a) to U.S. college graduates is beyond dispute. Things are less clear regarding group (b). We think it can be considered comparable to employees with at least a high school diploma in the U.S. as "Apprentices in Germany occupy a similar position within the German wage structure as held by high school graduates in the U.S. labour market" (Harhoff and Kane (1997)). Averaged over all years, 7.6 percent of the new hires are highly skilled and 80 percent are qualified.
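As an illustration of how the two cell-level proportions can be formed, the following sketch aggregates individual new hires into occupation-year cells. The record layout and the skill labels "low", "medium" and "high" are assumptions made for the example; the cell size is returned as well because it serves as the regression weight later on.

```python
from collections import defaultdict

def cell_proportions(new_hires):
    """new_hires: iterable of (occupation, year, skill) tuples with skill in
    {"low", "medium", "high"} (hypothetical coding).
    Returns {(occupation, year): (share_high, share_qualified, cell_size)}."""
    counts = defaultdict(lambda: [0, 0, 0])  # high-skilled, qualified, total
    for occ, year, skill in new_hires:
        cell = counts[(occ, year)]
        cell[0] += skill == "high"
        cell[1] += skill in ("medium", "high")
        cell[2] += 1
    return {key: (h / n, q / n, n) for key, (h, q, n) in counts.items()}

# Hypothetical records for one occupation-year cell.
records = [
    ("clerk", 1990, "high"),
    ("clerk", 1990, "medium"),
    ("clerk", 1990, "medium"),
    ("clerk", 1990, "low"),
]
cells = cell_proportions(records)
```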
The occupational classification used in the employment register lists 331 occupations. We drop the occupations home-care nurses and household helpers as they are not included in all years. We further drop medical professions, pharmacists, lawyers and architects because the frequencies of these professions change implausibly, especially in 1998. 4 Our final sample includes 324 occupations. 5
Two different samples are created, one for the skill composition analysis in Section 3 and one for the occupation wage analysis in Section 4. In the skill composition sample workers with missing education information (that cannot be replaced) are dropped; in the wage analysis sample they are included and assigned to a fourth 'dummy' skill group. Furthermore, in our wage analysis we have to observe the wage before the entry into the new firm. Therefore, the wage regression sample is restricted to all establishment movers who were employed for at least two months in the current year and the year before. Since 1984 bonus payments are included in wage records but cannot be identified. That is why we further exclude the years before 1984 from the wage estimation sample. Due to these selections the skill composition sample and the wage sample overlap but they are not nested.
Table 1 lists descriptives for the full sample, the skill composition sample and the wage sample for the years 1984 and 2004. The full sample column covers all employees liable to social security in West Germany who work full-time and are aged between 20 and 60. In 1984, the 14.74 mill. observations in the full sample amount to 74.2 percent of all employees liable to social security; in 2004, 14.48 mill. observations amount to 68.3 percent. The decline of the share is mainly due to the rise in part-time employment. The share of observations in the skill composition sample rises from 15.3 to 15.7 percent. The share of observations in the wage sample rises from 6.8 to 8.1 percent.
As can be expected, new hires are younger on average than all full-time employees. New hires in the skill composition sample are even younger than those in the wage sample. Average age is considerably higher in 2004 than in 1984. Compared to the full sample the proportion of women is higher in the skill composition sample but lower in the wage sample in both years. The education level is rising over time but shows only small differences between the samples. The imputation of missing values leads to a rise of the share of low and medium qualified employees. In 2004 this affects the median wage notably. In agreement with the age pattern the median wage is generally lower for new hires and lowest in the skill composition sample. The unemployment rate for West Germany in the years 1981 to 2004 is taken from the official employment statistics of the German Federal Employment Agency (BA-Statistik).
4 See the appendix for further remarks about the data quality.
5 As a robustness check we repeated the analysis with the occupations aggregated to 82 occupational groups on the two-digit level. The results are not included here because deviations from the reported effects are small. They can be obtained from the authors on request.
There are some minor differences between Devereux's and our data and definitions. First, we define prime age as 20 to 60, Devereux as 18 to 64. The upper limit is lowered to 60 in our study to avoid bias due to early retirement practices in Germany. Second, we identify new hires using establishment ids, whereas Devereux uses job descriptions or industries. Third, Devereux's occupational classification scheme seems to be slightly finer than ours. 6 In general the samples can be considered very similar so that differences in results can be attributed to institutional differences between the U.S. and Germany.

Explaining the Occupational Skill Composition
As the main intention of our paper is to compare the U.S. and German labour markets, our estimation procedures follow Devereux (2002) closely. Several variations are introduced to check the stability of Devereux's modelling strategy.
6 As reported above, we use 324 different occupations. Devereux does not report this number. It can, however, be inferred from the cell numbers given in his Table 2b. He uses 6508 occupation-year cells for 17 years. A balanced panel with 383 occupations and 17 years would amount to 6511 cells (some cells may be empty in some years).

Empirical Model
To investigate the cyclicality of the occupational skill composition, Devereux (2002) runs regressions of the proportion of qualified workers $P_{ot}$ in occupation-year cells $ot$ on the unemployment rate $U_t$, a quadratic trend 7 and fixed occupation effects $\gamma_o$:

$$P_{ot} = \alpha + \beta_u U_t + \delta_1 t + \delta_2 t^2 + \gamma_o + v_t + \varepsilon_{ot} \qquad (1)$$

where $\varepsilon_{ot}$ denotes a white noise residual and $v_t$ an unobservable time shock. 8 Direct estimation of this model using the standard OLS coefficient variance formula would yield severely biased standard errors because $U_t$ is constant for all cells within a year. 9 This problem can be solved either by computing the covariance matrix in a way that allows for clustering by year or by the application of a two-step procedure. In the first step the shares are regressed on occupation and time dummies:

$$P_{ot} = \gamma_o + \phi_t + \varepsilon_{ot} \qquad (2)$$

Each occupation-year cell is weighted by its number of individuals. In the second step the time dummy coefficients $\hat{\phi}_t$ (which can be interpreted as composition-corrected proportions) are regressed on a quadratic trend and the unemployment rate:

$$\hat{\phi}_t = \alpha + \beta_u U_t + \delta_1 t + \delta_2 t^2 + v_t \qquad (3)$$

Again each observation is weighted by the number of individuals. Amemiya (1978) shows that this two-step procedure is equivalent to one-step GLS. The fact that the second stage is a simple time series regression makes it easy to allow for serial correlation of residuals, either by computing Newey-West standard errors or by including lags of the unemployment rate. Both extensions lead to negligible differences in the estimation results.
However, since the dependent variable is a proportion, the linear model can only be regarded as an approximation. In a more structural approach, one would assume that the qualification proportions within cells are generated by the aggregation of individual decisions to the occupation level. The individual decisions (whether to employ a high-skilled worker in a particular occupation) follow Bernoulli sampling, which is why we estimate a grouped probit model. We do not repeat the details of the model here as they can be found in most microeconometrics textbooks (e.g., Greene (2002)).
7 The trend eliminates e.g., supply shocks due to educational expansion and other secular changes. Note that we have to assume that these shocks are smooth and do not generate business cycles. Otherwise $U_t$ would become endogenous.
8 Note that $v_t$ cannot be estimated because of the dimension of $U_t$.
9 See Moulton (1986) for an exposition of the issue.
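The two-step procedure described above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the normalisation of the first year effect to zero, the use of total new hires per year as second-stage weights and the synthetic data are all choices made for the example.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: rescale rows by sqrt(weight), then solve OLS."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def two_step_unemployment_effect(share, occ, year, n, unemp):
    """share, occ, year, n: 1-D arrays over occupation-year cells (occ and year
    are integer codes starting at 0); unemp: unemployment rate per year.
    Returns the second-stage coefficient on the unemployment rate."""
    n_occ, n_years = int(occ.max()) + 1, int(year.max()) + 1
    # Step 1: shares on occupation dummies and year dummies, weighted by cell
    # size. The first year dummy is dropped as the reference category.
    occ_dummies = np.eye(n_occ)[occ]
    year_dummies = np.eye(n_years)[year][:, 1:]
    step1 = wls(np.hstack([occ_dummies, year_dummies]), share, n)
    year_effects = np.concatenate([[0.0], step1[n_occ:]])
    # Step 2: year effects on a constant, the unemployment rate and a
    # quadratic trend, weighted by the number of individuals per year.
    t = np.arange(n_years, dtype=float)
    Z = np.column_stack([np.ones(n_years), unemp, t, t ** 2])
    hires_per_year = np.bincount(year, weights=n, minlength=n_years)
    return wls(Z, year_effects, hires_per_year)[1]

# Synthetic, noise-free panel: 3 occupations, 8 years, true effect 0.005.
occ, year = np.divmod(np.arange(3 * 8), 8)
unemp = np.array([5.0, 6.0, 8.0, 7.0, 5.0, 4.0, 6.0, 9.0])
n = 100.0 + 10.0 * occ + year
share = (np.array([0.1, 0.2, 0.3])[occ] + 0.005 * unemp[year]
         + 0.001 * year + 0.0001 * year ** 2)
beta_u = two_step_unemployment_effect(share, occ, year, n, unemp)
```

Because the synthetic shares contain no noise, the sketch recovers the coefficient used to generate them; with real data the second stage is a short time series regression with one observation per year.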
The nonlinearity of the grouped probit model causes the marginal effect of the unemployment rate on skill shares to depend on all characteristics $x_{ot}$ and all coefficients:

$$ME(x_{ot}) = \phi(x_{ot}'\beta)\,\beta_u$$

where $\phi(\cdot)$ denotes the density of the standard normal distribution and $\beta_u$ the coefficient of the unemployment rate. We compute the average marginal effect as

$$\overline{ME(x)} = \frac{1}{OT}\sum_{o=1}^{O}\sum_{t=1}^{T}\phi(x_{ot}'\beta)\,\beta_u$$

where $O$ and $T$ denote the number of occupations and years. 10 Some problems also remain with the grouped probit model. First, nonlinear fixed effects models are inconsistent if the number of fixed effects increases proportionally with the sample size and sufficient statistics for the other parameters of interest are not available. 11 This bias should be negligible in our estimation with 24 observations (years) per occupation.
10 An alternative estimate often used in the literature (and implemented in standard software) is $ME(\bar{x})$, the marginal effect evaluated at the average characteristics vector. It evaluates to a different value because of the nonlinearity of $\phi(\cdot)$. We report $\overline{ME(x)}$ only as it is based on the more natural definition. The aggregate regressors problem is solved by application of a block bootstrap procedure (the blocks contain all observations from one year).
11 A sufficient statistic for the linear fixed effects model is the within-transformation because the transformed model is purged of the fixed effects. Papke and Wooldridge (2008) present an alternative way to incorporate fixed effects in fractional response models for panel data. They apply the Chamberlain device, which avoids estimation of all individual fixed effects by the application of a conditional normality assumption for the fixed effects.
Second, the dependent variable is zero or one for some cells. 12 This generates surprisingly low standard errors in large samples and is a feature of the model. Fortunately this problem disappears when bootstrapped standard errors are used instead of asymptotic ones.
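The difference between the average marginal effect and the marginal effect evaluated at the average characteristics can be made concrete with a small numerical sketch. The index values and the coefficient below are hypothetical and chosen only to show that the two definitions give different numbers when the normal density is nonlinear over the range of the index.

```python
import math

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def average_marginal_effect(indices, beta_u):
    """Mean over cells of pdf(x'beta) * beta_u."""
    return beta_u * sum(normal_pdf(z) for z in indices) / len(indices)

def marginal_effect_at_mean(indices, beta_u):
    """pdf evaluated at the mean index, which equals x-bar'beta because the
    index is linear in the characteristics."""
    return beta_u * normal_pdf(sum(indices) / len(indices))

indices = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical values of x_ot'beta
beta_u = 0.1                           # hypothetical probit coefficient
ame = average_marginal_effect(indices, beta_u)
mem = marginal_effect_at_mean(indices, beta_u)
```

In this example the mean index is zero, where the density peaks, so the effect at the mean exceeds the average effect.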
As a further shortcoming, the grouped model does not allow the inclusion of individual level control variables (e.g., age, sex, establishment size). In order to check whether this affects the results we also estimate the linear index models at the individual level. The (binary) dependent variables are, in analogy to the definitions of the proportions, (1) 'the individual has a college or technical college degree yes/no' and (2) 'at least a completed apprenticeship yes/no'. Sex and second order polynomials of age and (log) establishment size are employed as individual level controls. As above, all regressions contain a full set of occupation dummies. To obtain consistent standard errors with aggregated regressors, the coefficient covariance matrix accounts for clustering by years.
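A covariance matrix that accounts for clustering by year can be sketched as the usual sandwich estimator with scores summed within clusters. The version below is a bare-bones assumption-laden illustration: it omits the small-sample corrections that statistical packages typically apply, and the data are synthetic.

```python
import numpy as np

def cluster_robust_se(X, y, groups):
    """OLS coefficients and cluster-robust (sandwich) standard errors,
    without small-sample correction."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        # Sum of x_i * u_i over all observations in cluster g.
        score = X[groups == g].T @ resid[groups == g]
        meat += np.outer(score, score)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Synthetic example: 6 years with 4 observations each; the regressor
# (unemployment rate) only varies between years.
years = np.repeat(np.arange(6), 4)
unemp = np.array([5.0, 6.0, 8.0, 7.0, 5.0, 4.0])[years]
X = np.column_stack([np.ones(24), unemp])
y = 0.2 + 0.01 * unemp + 0.001 * np.sin(np.arange(24))
beta, se = cluster_robust_se(X, y, years)
```

Passing a distinct group label for every observation reduces the estimator to the heteroskedasticity-robust (HC0) covariance, which is a convenient sanity check.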
Inspection of the time series of fixed effects $\hat{\phi}_t$ points to a structural break in 1998/1999 for the high-skilled shares. This break is likely to be caused by changes of the reporting rules in 1999. To capture the break, a dummy for the years 1999-2004 and interactions between the linear and squared trend with this dummy are added to all models described in this section.
A last technical but possibly important issue is whether higher precision of the estimates or additional information could be gained by exploiting regional variation in unemployment rates. 13 We investigate this issue in detail but present the analysis in the appendix, because it shows that the aggregate model provides a good description of the relation between upgrading and the business cycle. This extension is therefore of minor relevance for our main objective, the comparison with the U.S.

Results
All series are detrended to focus on the cyclical component. 14 The positive correlation is apparent for both the proportion of graduates and the proportion with a vocational degree or more. Table 2 shows the results of the estimation of Equation 3 and the other specifications mentioned above. Our comparison with Devereux's results is based on the linear two-step specification since it is identical to his model. The marginal effect of 0.097 for the proportion of graduates means that this proportion among new hires increases by roughly 0.1 percentage points if the unemployment rate rises by one percentage point. The corresponding value for the share of employees with a vocational degree or more is 0.39 percentage points. All effects are significant at the 5 percent level. A comparison with all other specifications suggests overall stability of the relation. To check the impact of the imputation procedure for the qualification variable, we run the same regressions using the raw qualification variable. The implied changes in the marginal effects are small and in line with expectations. The next rows in the table refer to the grouped probit and the individual level linear probability models. These changes of the specification hardly affect the results.
Given the disclaimers regarding the comparability of the educational systems in the U.S. and West Germany, a comparison of Devereux's and our results reveals unexpected similarities. He obtains a marginal effect of 0.16 (with standard error 0.07) for the proportion of graduates and 0.53 (with standard error 0.10) for the proportion of employees with a high school diploma or more. In both cases the U.S. point estimates exceed our estimates for West Germany by about 40-50 percent but the differences are statistically insignificant. These similarities are surprising if we consider that the U.S. labour market is almost free of occupational regulations whereas the German vocational training system is highly regulated.
The most striking differences between both countries relate to 1) the wage bargaining system, 2) the occupational training system and 3) the role of the government with regard to regulations such as job protection laws. 15 A closer inspection of the impact of these institutions on the occupational skill composition reveals countervailing forces. To see the implications of a tight wage bargaining system and wage rigidities, consider an economy hit by a positive product demand shock. For a homogeneous production function (which should be a good approximation to reality) we expect equal increases in the demand for all factors of production, hence equal increases for skilled and unskilled workers. Firms would bid up wages in their recruitment efforts for all skill groups. Relative employment of the skill groups should remain (almost) unchanged as long as the supply elasticities of skilled and unskilled workers do not differ too much. If firms lower hiring standards instead of bidding up wages, relative employment of the unskilled will increase. Thus, we expect wage rigidities to increase the responsiveness of skill proportions to the business cycle, which implies that effects should be greater for Germany. More generous unemployment benefits act in the same direction by generating a de facto minimum wage affecting mainly the unskilled.
Thus the less pronounced responses found in the German data should rather be caused by a less permeable occupational system in Germany. This is confirmed by a closer look at the institutional conditions: Whereas occupational training in the U.S. is almost completely the responsibility of the employer, vocational training is well-structured, strictly regulated and standardized in Germany. Training lasts between two and three years in the so-called 'dual system' and takes place in firms (about 3-4 days per week) and vocational schools (1-2 days per week). Generalized curricula which are binding for (specialized) vocational schools as well as for employers are defined by national committees and monitored by the chambers of commerce and industry. The training ends with standardized theoretical and practical examinations. Its paramount importance for the German labour market is due to the fact that the entry to many jobs in industry and trade de facto requires a certificate of completed apprenticeship and remuneration in most collective wage agreements is linked to vocational qualification. 16 Because of the importance of vocational degrees we expect low and medium skilled workers to be less substitutable in Germany, which is why we expect less pronounced responses of the occupational skill composition to the business cycle in West Germany.
Finally, responsiveness should be lower also due to job protection laws. They increase firing costs and the risks associated with bad matches between unskilled workers and complex tasks and thus make it less profitable to recruit unskilled workers for skilled jobs during upturns. To summarize: While wage rigidities and generous unemployment benefits strengthen the reaction of skill proportions to the business cycle, institutional rigidities should lower it. Note, however, that the presence of institutional rigidities in Germany does not necessarily imply inefficiencies since they may be associated with more pronounced incentives to acquire occupation- or firm-specific human capital.
15 As the differences between Germany and the U.S. are stressed frequently and explained in detail in the literature, we outline only the most important details. See Franz and Soskice (1995) or Harhoff and Kane (1997) for international comparisons of the occupational training system, and Soskice (1990) for a survey on the wage bargaining systems.
If tight standardizations and regulations related to vocational training lower the substitutability between skilled and unskilled jobs significantly, one would expect lower unemployment effects on the skill proportions for these occupations. Thus an indirect test of an explanation based on barriers caused by regulations can be conducted by restricting the estimation sample to occupations covered by the dual vocational training system (recognized occupations, 'anerkannte Ausbildungsberufe'). From Table 2 we find, however, that deviations from the base sample regression are negligible and insignificant. Though this is not a sharp test, the data give no clear indication that the vocational training system plays an important role in creating barriers to cyclical occupational upgrading.
A further possible explanation for the lower responses of the occupational skill composition in Germany may be found by separating the effects by establishment size. The last rows in Table 2 show marginal effects for small, medium-sized and large establishments. The responses are more pronounced for medium-sized and large establishments. Several reasons might explain these differences. First, large establishments have alternative jobs for unskilled workers if it turns out that the hired person does not meet the requirements of the particular job he or she was hired for. Second, helpers and handymen can be utilized better in larger teams because in teams they can specialize in certain tasks. 17 Devereux's analysis does not differentiate by establishment size. Thus we do not know whether the impact of establishment size on the response of skill proportions in the U.S. is similar to Germany. If it is, establishment size could explain part of the differences between the U.S. and Germany. Table 3 shows that Germany has considerably more small (1-4 employees) and fewer large (more than 500) establishments than the U.S. According to our results, the differences between establishment size groups are more pronounced for the proportions of graduates, where the differences between Germany and the U.S. are greater, too. Though this check of the establishment size effect is indirect and a conclusive statistical test would require additional empirical information, it may serve as a starting point for further empirical investigations.
The theoretical model suggests that skill proportions in occupations with generally low skill requirements should react more strongly to the business cycle. To analyse this hypothesis, we again follow Devereux (2002) and distinguish occupations with different levels of skill requirements by grouping the occupations according to their general wage level. The latter is calculated as the median deflated wage per occupation over all years. 18 We group the occupations by median wage quintiles and run separate regressions for every quintile. Marginal effects for the grouped probit are shown in Table 4. As expected, all point estimates of the marginal effects are positive. They are significant at the 5 percent level for the lower four quintiles for graduates and for the lower three quintiles for employees with at least a vocational degree. Analogous to Devereux's results, our findings do not suggest a clear pattern for the proportion of graduates across quintiles. For the proportion of new hires with at least a vocational degree we find a significant difference between the first and the fifth quintile but the pattern is 'deformed' by the outlying second quintile coefficient. In contrast, Devereux's estimates decline more evenly from 0.1 (0.14) in the first quintile to -0.21 (0.07) in the fifth quintile. In summary, regarding the hypothesis that the effect of the business cycle on hiring standards should be larger for occupations with generally low skill requirements, we find evidence for West Germany similar to what Devereux found for the U.S.: high-skilled proportions are equally affected across occupation types, but medium-or-more skilled proportions are more reactive in low wage occupations.
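The grouping of occupations by median wage quintiles described above can be sketched as follows. This is only an illustration: the occupation labels and wages are invented, and the quintiles are formed over occupations without employment weights, which is an assumption since the text does not state how the quintile boundaries are drawn.

```python
import statistics

def wage_quintiles(wages_by_occupation):
    """Map each occupation to the quintile (1 = lowest) of its median wage.
    Quintiles are formed over occupations, unweighted (an assumption)."""
    medians = {occ: statistics.median(w)
               for occ, w in wages_by_occupation.items()}
    ranked = sorted(medians, key=medians.get)
    return {occ: 1 + (5 * rank) // len(ranked)
            for rank, occ in enumerate(ranked)}

# Ten hypothetical occupations with deflated daily wages pooled over years.
wages = {f"occ_{k}": [50 + 10 * k, 55 + 10 * k, 60 + 10 * k]
         for k in range(10)}
quintile = wage_quintiles(wages)
```

Separate regressions would then be run on the cells of each quintile group.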

Explaining Occupational Composition Wages
In this section we isolate the component of the cyclical variation of wages that is due to occupational up- and downgrading. We answer the question "how would aggregate wages respond to the business cycle if wages remained constant within all occupations?" In this case all wage variation is caused purely by changes of the occupational employment structure (composition). Note that this question is complementary to the empirical literature on the cyclicality of wages, which focuses on gross wage changes. 19 To avoid misunderstandings we introduce a new label for this measure: occupational composition wage.
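The idea of a composition wage can be illustrated with a small numerical sketch: occupational mean wages are held fixed, so the aggregate wage changes only because employment shifts between occupations. The occupations, wages and employment counts are hypothetical.

```python
def composition_wage(employment, mean_wage):
    """Employment-weighted average of occupational mean wages that are held
    fixed over time, so changes reflect occupational composition only."""
    total = sum(employment.values())
    return sum(n * mean_wage[occ] for occ, n in employment.items()) / total

# Fixed (hypothetical) occupational mean wages.
mean_wage = {"helper": 60, "clerk": 80, "engineer": 120}

# Employment shifts towards better-paid occupations in an upturn.
downturn = {"helper": 50, "clerk": 30, "engineer": 20}
upturn = {"helper": 40, "clerk": 35, "engineer": 25}

wage_downturn = composition_wage(downturn, mean_wage)
wage_upturn = composition_wage(upturn, mean_wage)
```

In this example the composition wage rises from 78 to 82 although no occupational wage has changed, which is the sense in which the measure is procyclical under upgrading.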

Empirical Model
As in the preceding sections, our analysis follows Devereux (2002). To investigate the cyclicality of the occupational composition wage he runs regressions of the change of an indicator for occupational quality on demographic control variables (dummies for black, married, white, graduate, high school, and a cubic polynomial in experience), a time trend and the change of the national unemployment rate. The dependent variable is constructed as follows. Compute mean wages for every occupation and take, for every job mover, the change of the occupational mean wage, $\Delta \bar{w}_{iot}$, between the year before and the current year. The estimating equation is

$$\Delta \bar{w}_{iot} = \beta_u \Delta U_t + x_{oit}'\delta + \lambda t + v_t + \varepsilon_{iot} \qquad (5)$$

Here $U_t$ denotes the unemployment rate, $x_{oit}$ contains individual level control variables for individual $i$, $t$ is the time trend, $v_t$ is a time shock and $\varepsilon_{iot}$ a white noise residual. Specification (5) is similar to the one used frequently in the wage cyclicality literature 20 but inconsistent with the standard Phillips curve specification, because it uses the change of the unemployment rate instead of its level. We proceed with the specification above since it is clearly favoured by a simple test (see appendix for details).
In order to calculate wage differences we restrict the estimation sample to all job movers who were employed in the current year and the year before for at least two months. As in the previous section, Devereux implements the estimation in two stages. In the first stage, changes of occupational mean wages are regressed on individual characteristics and a full set of year dummies.
The coefficients of the year dummies can be interpreted as occupational composition effects by year. In the second stage they are regressed (using cell size as weight) on a linear time trend and the change of the unemployment rate. 21 The two-stage approach is computationally convenient to obtain unbiased standard errors in the presence of the aggregated regressor but not necessary. Alternatively, one can estimate in one step and cluster standard errors by year. We experimented with both approaches because the two-stage approach is slightly more flexible, 22 but report only the one-step results because the differences are small. Our specification differs in two respects from Devereux's model. First, we use a slightly different set of control variables. We do not include a dummy for marriage because it is not contained in our data. Variations of the set of individual level control variables, however, seem to have a small impact on the results. The second and more important difference to Devereux is that in our data about 10 percent of the wages are right censored. Before we calculate the average wages for each occupation, we replace these censored wages with predictions of the unobserved wages. 23 Reder's theory predicts that the extent of occupational upgrading should vary over the wage distribution. If high wage occupations respond to increasing product demand by poaching workers from lower paying occupations, then the slots of the poached workers in the lower paying occupations have to be filled as well. Therefore the possibilities (measured as vacancies) to move to better-paying occupations should be higher for employees with low wages and skills. To test this, Devereux sorts workers into quintile groups using a simple measure of personal skills. The measure is the predicted wage from a regression of (log) wages on personal characteristics. 24 We created two slightly different skill measures. 
The first one is the predicted wage from a regression including a cubic in experience, education dummies, and a set of year dummies (but year dummies do not enter the prediction). For the second measure, job characteristics such as a second order polynomial in log establishment size, a dummy for white collar workers and 27 sector dummies are added to the first specification, but like the year dummies these are not used for the prediction. The second specification should deliver a more precise estimate of personal productivity since correlations with establishment or sector level variables are eliminated. The difference between both specifications is small with respect to the second step estimates. In the next section we show the results based on the second measure which is more conservative (i.e. it produces slightly less pronounced differences between quintile groups).
22 Robust standard errors are available then even in presence of serial correlation.
23 First we run tobit regressions of individual log wages on control variables (the same as in the final regressions) and year dummies separately for every occupation and sex. Then we predict censored wages and add residuals drawn from a truncated normal distribution with the standard deviation estimated by the tobit models. Finally means of the imputed wages are computed for every occupation.
24 Devereux uses education indicators, a cubic in experience, race dummies and a marriage indicator.
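The two-stage procedure described in this section can be sketched as follows. This is a minimal illustration on simulated data, not the estimation code used for the paper: the function name `two_stage`, the two generic control variables and all numerical values are assumptions made for the example, and the clustering or bootstrap steps needed for standard errors are omitted.

```python
import numpy as np

def two_stage(dz, x, year, d_unemp):
    """Stage 1: regress wage changes on controls and a full set of year dummies.
    Stage 2: regress the year-dummy coefficients on a constant, a linear trend
    and the change of the unemployment rate, weighting by cell size."""
    n_years = len(d_unemp)
    dummies = (year[:, None] == np.arange(n_years)[None, :]).astype(float)
    stage1_X = np.hstack([x, dummies])  # no constant: the dummies span it
    stage1_coef, *_ = np.linalg.lstsq(stage1_X, dz, rcond=None)
    year_effects = stage1_coef[x.shape[1]:]

    cell_size = dummies.sum(axis=0)
    stage2_X = np.column_stack([np.ones(n_years), np.arange(n_years), d_unemp])
    w = np.sqrt(cell_size)  # weighted least squares via row scaling
    stage2_coef, *_ = np.linalg.lstsq(stage2_X * w[:, None], year_effects * w, rcond=None)
    return year_effects, stage2_coef

# Simulated movers (all values hypothetical).
rng = np.random.default_rng(0)
n_years, n = 12, 6000
year = rng.integers(0, n_years, size=n)
x = rng.normal(size=(n, 2))
d_unemp = rng.normal(size=n_years)
true_effect = 0.03 - 0.001 * np.arange(n_years) - 0.004 * d_unemp
dz = x @ np.array([0.01, -0.02]) + true_effect[year] + rng.normal(scale=0.002, size=n)

year_effects, coef2 = two_stage(dz, x, year, d_unemp)
print(coef2)  # constant, trend, unemployment-change coefficient
```

The one-step alternative mentioned in the text would instead include the trend and the unemployment change directly in the individual-level regression and cluster the standard errors by year.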

Results
The occupational composition wage depends on the mean occupation wages and the occupational composition of employment. As the mean occupation wages are by construction constant over time, the business cycle must create employment shifts between occupations to exert an impact on the occupational composition wage. Thus we start our investigation with a descriptive analysis of whether and to what extent occupation changes are induced by the business cycle.
[Table 5. U.S. values: Devereux (2002: 438); values for Germany are based on our establishment mover sample.]
Table 5 shows that up- and downgrading are about as common in general but occur more often for men. As expected, the numbers for the U.S. indicate a greater flexibility. Devereux finds a much higher share of occupation changers; only 26 percent remain in the same occupation. The relation between up- and downgrading and the business cycle can be shown graphically. We partition the sample of job changers into workers who move to better paid occupations (upgrade), lower paid occupations (downgrade) and workers who stay in the same occupation (no change). The shares of these groups are plotted against the unemployment rate in Figure 2. It is clear at a glance that the shares of upgraders and downgraders are procyclical whereas the share of stayers is countercyclical. This visual impression is confirmed [...] and unemployment. It plots the coefficients of year dummies from the first step regression against the change of the unemployment rate. By construction, these coefficients represent the pure occupational composition effect. It is clear from the figure that rising unemployment coincides with falling occupational composition wages and that this correlation is strong for men and women. Table 6 contains the estimation results for several specifications of the model. Note that the large sample for the proportions models in Section 1 was necessary to avoid proportions of zero and one in the two-step models. 
As this problem does not arise here, the wage analysis in this section is based on a 25 percent random sample to reduce the computational burden. To start with, consider column (1) relating to the base sample (all establishment movers). By definition of the dependent variable and other regressors, the constant gives the average percent change of the occupational composition wage for unskilled blue collar workers with zero years of experience for the estimation period 1985-2002. It amounts to about 2.9 percent for men and 6.3 percent for women. This implies that there is (on average) a net flow from lower to higher paid occupations. The negative trend coefficient shows that this effect has diminished slightly. The big difference between men and women may be explained by the important role of maternity leave for young women, who apparently restart their career after maternity leave spells in low paid jobs and advance subsequently. 25 Furthermore the average upgrading effect is significantly lower for employees with completed apprenticeship and college and technical college graduates. This is plausible as further upgrading becomes difficult if one starts from a favourable position. At a glance, the coefficient of the white collar dummy appears huge. Note, however, that this dummy is (in contrast to the other dummies) time-variable. Thus it seems to reflect promotions that go along with the occupation change. 26 Regarding the main objective of the study, we find highly significant but rather small effects of the business cycle on occupational upgrading. A one percent decrease of the (national) unemployment rate produces a 0.39 and 0.58 percent increase of the occupational composition wage for men and women, respectively.
25 Note that our sample contains only job-to-job movers, i.e. employees who had a job at the reference date of the previous year. Since maternity leave spells last longer than one year in most cases, its downgrade effect (women worked in a better paid occupation before the maternity leave period than afterwards) is not included in our analysis.
26 Typical examples are promotions of production workers to executive positions.
Table 6. Note: All coefficients and standard errors are multiplied by 100. Robust standard errors that allow for clustering by year are given below coefficients. Results are based on a 25 percent random sample of persons from the full sample of all job movers. Legend: (1) base sample including all individuals who changed jobs, (2) workers remaining in the same sector only, (3) workers remaining in establishments of similar size only.
Our focus on occupations (as units defining homogeneous skill requirements) appears sensible for this application. Nevertheless it is off the beaten track of the empirical literature which concentrates on industry wage differences. Since occupations are not evenly distributed over industries, and our model does not account for transitions between industries, the unemployment coefficients may capture sectoral upgrading effects. A further competing explanation for wage upgrading refers to firm size wage differentials. Large firms which pay rents to their employees may exploit this in upturns to poach workers from smaller firms. A simple way to check the relevance of both issues is to restrict the estimation sample to new hires who remain in the same two-digit industry (see columns with header (2) in Table 6) 27 or the same establishment size group (see columns with header (3)). The definition of the establishment size change indicator is described in the appendix. To make the industry and establishment size change indicators comparable, they were constructed such that they produce similar shares of movers: According to our definition, 52 percent of the new hires in sample (1) change establishment size and 48 percent change the two-digit industry code. From columns (2) and (3) it is evident at a glance that sectoral upgrading explains the lion's share of cyclical occupational upgrading: The coefficients for sample (2) of new hires who stay in the same industry are much lower in absolute value, whereas sample (3) of new hires who stay in the same establishment size class shows negligible differences to sample (1). 
Note that the lower unemployment coefficient for sample (2) does not invalidate the role of occupations for wage upgrading. It only tells us that a good deal of occupational composition wage effects are intrinsically related to industry changes.
Devereux points to the problem that observed wages may not indicate the desirability of an occupation if compensating differentials play an important role or if wage differentials are noncompetitive. In the case of compensating differentials, higher occupation wages reflect higher risks or worse working conditions but not more productive or better skilled employees. In the case of noncompetitive wage setting, for example, efficiency wage problems may foster the transformation of small (or unobserved) productivity or skill differences into large wage markups. To check for that, Devereux replaces the dependent variable (observed occupation wages) by 'occupation skill' wages. Occupation skill wages are obtained by replacing individual observed wages with predicted wages in the definition of the occupational wage, i.e. $z_o := (1/n_o) \sum_t \sum_i D(i, o, t) \ln(\hat{w}_{it})$, where $\hat{w}_{it}$ is the predicted wage from an auxiliary regression. 28 The occupation skill wage should be free of noncompetitive wage markups and components compensating for extra risk or bad working conditions. If the occupational composition wage effects found above are mainly due to compensation or noncompetitive wage differentials, they should vanish after replacing occupational wages by occupational skill wages. In our application, this replacement shrinks the sample (1) coefficients (in absolute value) from -0.390 to -0.064 for men and from -0.576 to -0.261 for women. For men the coefficient becomes insignificant (standard errors are 0.034 for men and 0.029 for women). This indicates that noncompetitive or compensating wage differentials are important determinants of the occupational composition wage effect. When Devereux replaces observed by occupation skill wages for the U.S. data, his estimates shrink from -0.91 to -0.37 for men and from -1.04 to -0.34 for women but remain significant at the five percent level. Thus, the relevance of compensating and noncompetitive wage differentials is similar in both countries.
27 We use a classification containing 28 groups. Because the industry classification changes in an incompatible form in 2002, we are forced to restrict the samples (2) to the period 1985-2002. To check whether the period change has an impact on our results in columns (1), we reran these regressions for the period 1985-2002 but found only negligible differences.
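The difference between the observed occupation wage and the occupation skill wage can be made concrete with a short sketch. The records, field names (`occ`, `wage`, `pred_wage`) and numbers below are hypothetical; the predicted wages are simply given here, whereas in the text they come from the auxiliary regression.

```python
import math
from collections import defaultdict

def occupation_mean_log(records, wage_key):
    """Mean log wage per occupation, pooling all individuals and years."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        sums[r["occ"]] += math.log(r[wage_key])
        counts[r["occ"]] += 1
    return {occ: sums[occ] / counts[occ] for occ in sums}

# Hypothetical person-year records with observed and predicted wages.
records = [
    {"occ": "A", "wage": 100.0, "pred_wage": 90.0},
    {"occ": "A", "wage": 140.0, "pred_wage": 110.0},
    {"occ": "B", "wage": 200.0, "pred_wage": 150.0},
]

observed = occupation_mean_log(records, "wage")    # occupation wage
skill = occupation_mean_log(records, "pred_wage")  # occupation skill wage
print(observed, skill)
```

Using the same averaging function for both measures mirrors the text: only the individual wage entering the definition is exchanged, everything else stays the same.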
Under a Reder competition regime, high wage occupations absorb employees from lower paid occupations during upturns. This generates additional demand in the lower paid occupations which in turn should increase the wage upgrading effect for employees in the lower part of the skill distribution. We test this by sorting all new hires into five quintile groups using predicted wages from an auxiliary regression of wages on personal characteristics and control variables (see footnote 28). The resulting unemployment change coefficients are listed in Table 7. To start with, consider columns with header (1). Here the first quintile response for men (-0.557 percent) exceeds the fifth quintile response (-0.219 percent) by a factor of about 2.5. The relation is similar for women although the levels are higher. Furthermore, the differences between the first three and the fifth quintile are statistically significant. As in the section above, columns relating to the employees remaining in the same two-digit industry (2) and the ones remaining in the same establishment size class (3) suggest that industry changes play an important role for the business-cycle component of occupational composition wages. Although wage effects become smaller for all quintiles, the ranking of the effects by quintile remains the same.
28 Individual wages are regressed on personal characteristics and control variables. Personal characteristics are the education dummies and a cubic polynomial in (potential) experience. Control variables are year dummies, 27 sector dummies, a white collar dummy and a second order polynomial in log establishment size. Note that only personal characteristics enter the prediction.

Conclusion
In this paper we estimate the responsiveness of the occupational skill structure and occupational composition wages to the business cycle and compare the estimates with corresponding results from a study using U.S. data (Devereux, 2002). This comparison is particularly interesting because of striking differences between U.S. and German labour market institutions. Whereas the German labour market is characterized by a highly regulated and standardized vocational training system and a canonical structure of occupations, a standardized vocational training system with approved examinations does not exist in the U.S.; there, the occupational structure is less formalized and occupational mobility is much higher than in Germany.
Our estimates show that within occupations the skill level of new hires rises significantly in recessions and decreases in upturns. The effects for West Germany amount to about 70 percent of the corresponding U.S. results. They are, however, larger than expected given the striking institutional differences mentioned above. Separate estimation of the model by establishment size groups suggests that effects are lower for small establishments, implying that a good deal of the difference between both countries may already be explained by a greater share of small establishments in Germany. Further differentiation of the sample into low and high wage occupations reveals that the share of unskilled is affected more strongly in low wage occupations than in high wage occupations whereas no clear pattern can be found for the high-skilled. Several checks show that the results are robust to changes of the occupational classification level, the choice of the estimation model, and the time period considered.
Our results regarding occupational composition wages also indicate a lower responsiveness to the business cycle than in the U.S. The estimates amount to about 30 and 40 percent of their U.S. counterparts for men and women, respectively. We should, however, be cautious to interpret this as a clear indication for more important wage rigidities in Germany. Responses of the occupational composition wage to the business cycle are based on two components. First, higher paying occupations can attract workers during upturns only if there exist noteworthy noncompetitive wage differentials. And second, the occupational system must be flexible enough to allow employees to switch between occupations. Effectiveness of the first component (noncompetitive wage differentials) requires rigidities, the second flexibility. Thus lower responsiveness of the German occupational composition wage may be either due to less pronounced noncompetitive wage differentials or due to a less permeable occupational system. U.S. transition probabilities between occupations approximately double their German counterparts. If these numbers are based on roughly comparable data, they already explain the lion's share of the difference between U.S. and German responses of the occupational composition wage to unemployment. Consequently, noncompetitive wage differentials do not appear to be much more pronounced in Germany. Finally, we should keep in mind that greater occupational mobility in the U.S. does not necessarily imply efficiency. It comes at the cost of lower occupation-specific human capital, which is likely to enhance productivity, but this is beyond the scope of this analysis.
correction procedure (described briefly in section 2) accounts for this by eliminating 'erratic' qualification changes.

Definition of establishment size changes
It is impossible to provide a fully consistent and theoretically meaningful definition of establishment size changes for movers. An establishment size change from 1000 to 1001 is a change, but it is economically not meaningful for our application. In order to include meaningful changes only, our definition uses relative changes combined with thresholds depending on establishment size. Furthermore it is constructed to yield a number of establishment size changes that is similar to the number of two-digit industry changes. With our definition, 52 percent of the new hires change establishment size and 48 percent change two-digit industry. The establishment size change indicator used to define the sample in columns (3) of Tables 6 and 7 is constructed as follows: First we define five establishment size groups for 1-19, 20-49, 50-99, 100-199, and 200+ employees. Then the establishment size change indicator $I_t$ depends on the mean $\bar{e}_t := (e_t + e_{t-1})/2$ of the previous and the current year's establishment size and the absolute value of the log difference $g_t := |\ln(e_t) - \ln(e_{t-1})|$ in the following way:

$$I_t = \begin{cases} \vdots & \\ \dots & \text{if } 50 \le \bar{e}_t < 100 \\ \mathbb{1}(g_t > 0.8) & \text{if } 100 \le \bar{e}_t < 200 \\ \vdots & \end{cases}$$

Here $\mathbb{1}(\cdot)$ denotes the boolean indicator function evaluating to one if its argument is true and zero otherwise.
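The construction of the indicator can be sketched in code. Only the threshold 0.8 for mean sizes between 100 and 200 is taken from the definition above; the function name `size_change` and the representation of the rule as a list of (lower, upper, threshold) tuples are assumptions of this sketch, and the thresholds for the other size ranges are deliberately left out rather than guessed.

```python
import math

def size_change(e_prev, e_curr, thresholds):
    """Return 1 if the move counts as an establishment size change, else 0.

    thresholds: list of (lower, upper, g_min); a change is recorded when the
    absolute log difference exceeds g_min for the range containing the mean
    size. upper=None denotes an open-ended range."""
    mean_size = (e_prev + e_curr) / 2
    g = abs(math.log(e_curr) - math.log(e_prev))
    for lower, upper, g_min in thresholds:
        if mean_size >= lower and (upper is None or mean_size < upper):
            return int(g > g_min)
    raise ValueError("no threshold defined for this mean establishment size")

# Only the range documented in the text; other ranges would be added here.
THRESHOLDS = [(100, 200, 0.8)]

print(size_change(100, 250, THRESHOLDS))  # mean 175, large relative change
print(size_change(120, 150, THRESHOLDS))  # mean 135, small relative change
```

Working with the log difference makes the rule symmetric in growth and shrinkage, so a move from a large to a small establishment is treated like the reverse move.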

Digression: District level analysis
Here we inspect whether higher precision of the estimates or additional information can be gained by the inclusion of regional (district) level unemployment rates. The aggregation issue could possibly be relevant since the aggregate unemployment rate is not necessarily a good proxy for cyclical shocks. It is not if cyclical shocks on regions or their effects are asymmetric and economic integration between regions is small or moderate. Take the demand for ice-cream and umbrellas as a somewhat far-fetched but instructive example. If production of umbrellas and ice-cream was located in the north and south of Germany, respectively, a hot year would decrease the demand for umbrellas but increase demand for ice-cream. Then we would observe a slowdown in the north and an upturn in the south but aggregate employment would remain unchanged if both effects neutralize one another. In this extreme case we would not observe aggregate fluctuations at all. Thus a thorough investigation of the aggregation issue should start by inspecting whether cyclical shocks are asymmetric or not. A simple approach (as employed by Decressin and Fatas (1995)) is to regress the changes of regional unemployment rates on changes of the aggregate one, i.e. to run the regression $\Delta u_{rt} = \alpha_r + \beta_r \Delta \bar{u}_t + \varepsilon_{rt}$ separately for every district. The mean of the (adjusted) $\bar{R}^2$'s from these regressions can be taken as a simple measure of symmetry. We obtain $\bar{R}^2 = 0.65$ for our sample, implying that 65 percent of all shocks are 'shared' by all regions and the rest is region-specific. 30 This suggests that there might be additional regional variation of unemployment which could be exploited to improve the estimates.
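The symmetry measure can be sketched as follows. The district names and unemployment changes are invented for illustration, and the helper names `adjusted_r2` and `symmetry_measure` are assumptions of this sketch; it only implements the district-by-district regression on the aggregate change and the average of the adjusted R-squared values.

```python
def adjusted_r2(y, x):
    """Adjusted R^2 of an OLS regression of y on a constant and x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((b - alpha - beta * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - 2)

def symmetry_measure(regional_changes, aggregate_changes):
    """Mean adjusted R^2 over separate district regressions."""
    values = [adjusted_r2(series, aggregate_changes)
              for series in regional_changes.values()]
    return sum(values) / len(values)

# Hypothetical yearly changes of unemployment rates (percentage points).
aggregate = [0.5, -0.3, 0.8, -0.6, 0.1]
regional = {
    "north": [0.1 + 1.2 * a for a in aggregate],  # moves with the aggregate
    "south": [0.2, 0.4, -0.1, 0.3, -0.5],         # largely idiosyncratic
}

measure = symmetry_measure(regional, aggregate)
print(round(measure, 3))
```

A value close to one would indicate that districts share almost all shocks with the aggregate, a value close to zero that regional cycles are largely independent.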
However, two further requirements must be met to obtain additional or different results from the disaggregate model. First, employers must respond to purely regional cyclical shocks in the same way as to aggregate ones. This is unlikely if shocks are regionally localised and worker mobility is large enough, since employers may consider increasing wage offers to attract workers from neighbouring regions instead of adjusting hiring standards. Second, the extent of upgrading may depend on the size of cyclical shocks. If it occurs mainly for sizeable shocks and deviations between regional and aggregate demand changes are rather small, an analysis at a more disaggregate level will lead to minor changes. This is what we learn from our data.
The disaggregate estimation is performed using a straightforward adaptation of the two step approach. In the first step, a linear probability model is estimated for each district separately. The specifications are identical to the linear probability model in Section 3. 31 The coefficients of the year dummies are regressed in the second step on all district dummies, the 1999 structural break interaction terms, a quadratic trend and regional and aggregate unemployment rates. District size differences are taken into account by using the respective observation numbers from the first step regressions as weights in the second step. Note that the aggregate unemployment variable is the same for all districts in one year. To account for this, clustering of the covariance matrix is necessary. We obtain consistent standard errors by application of a block bootstrap (drawing year blocks containing all districts for one year) in the second step regression. The results are shown in Table 8. Note that both unemployment rates are included in the model, implying that each coefficient measures a partial effect, i.e. the aggregate rate coefficient gives the additional effect of aggregate fluctuations if the regional unemployment rate is held constant. Whereas local unemployment is more important for the proportions of graduates, aggregate effects dominate for the proportions of vocational degrees. If one is not interested in the relative importance of pure regional and aggregate effects, one has to compute the sum of both partial effects. The comparison of this sum with the main results in Table 2 shows only minor changes for the proportions of vocational degrees and tiny ones for graduates. Thus Devereux's aggregate specification can be considered as a suitable formalization of the upgrading relation. The disaggregate analysis reveals that this depends on the symmetry of cyclical shocks and the extent of economic integration of the considered regions.
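The block bootstrap over years can be sketched as below. This is an illustration under simplifying assumptions: the estimator is passed in as a generic function (here a plain mean rather than the weighted second step regression), and the data, the function name `year_block_bootstrap` and the number of draws are hypothetical.

```python
import random

def year_block_bootstrap(rows, estimator, n_draws=200, seed=0):
    """Bootstrap standard error, resampling whole years with replacement.

    rows: list of dicts with a 'year' key (one row per district and year).
    Each drawn year contributes all of its district rows, which preserves
    the dependence created by regressors that are constant within a year."""
    rng = random.Random(seed)
    by_year = {}
    for row in rows:
        by_year.setdefault(row["year"], []).append(row)
    years = sorted(by_year)
    estimates = []
    for _ in range(n_draws):
        sample = []
        for y in rng.choices(years, k=len(years)):
            sample.extend(by_year[y])
        estimates.append(estimator(sample))
    mean = sum(estimates) / len(estimates)
    variance = sum((e - mean) ** 2 for e in estimates) / (len(estimates) - 1)
    return variance ** 0.5

def mean_y(sample):
    """Stand-in estimator; the text uses a weighted regression instead."""
    return sum(r["y"] for r in sample) / len(sample)

# Hypothetical district-year cells with a common year shock.
year_shock = {1990: 0.4, 1991: -0.2, 1992: 0.1, 1993: -0.5, 1994: 0.3, 1995: 0.0}
rows = [{"year": y, "district": d, "y": s + 0.01 * d}
        for y, s in year_shock.items() for d in range(3)]

se = year_block_bootstrap(rows, mean_y)
print(round(se, 4))
```

Resampling individual district-year cells instead of whole years would treat the common year shock as independent information across districts and would tend to understate the standard error of the aggregate coefficient.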
A corresponding analysis for the wage regressions yields negligible effects of the regional unemployment rates. If they are added to the specifications (1) of Table 6 and the dependent variable is computed as change of averages over occupation-district cells, 32 the aggregate unemployment rate coefficients change from -0.390 to -0.410 for men and from -0.576 to -0.653 for women. The respective (insignificant) coefficients of the regional rates are -0.044 and 0.055. Thus regional cycles play a minor role for occupational composition wages. The finding that regional cycles are important for skill proportions but not for occupational composition wages suggests that purely regional cycles induce mainly switches between occupations with similar wages.

A test of the wage cyclicality specification against the Phillips curve specification
A simple way to test between the two specifications presented in Section 4 was proposed by Card and Hyslop (1997). Replace $\Delta U_t$ in Equation 5 by the current and the lagged unemployment rate to obtain

$$\Delta z_{iot} = \alpha_1 + \alpha_2^0 U_t + \alpha_2^1 U_{t-1} + \alpha_3 x_{iot} + \alpha_4 t + v_t + \varepsilon_{iot}.$$

Then the $H_0$ related to the Phillips curve is $H_0^P: \alpha_2^1 = 0$ whereas $H_0^C: \alpha_2^1 = -\alpha_2^0$ is compatible with the standard cyclicality formulation. Application of the test to our data delivers rather clear though not fully conclusive evidence in favour of the standard wage cyclicality formulation. The estimates for $\alpha_2^0$ and $\alpha_2^1$ (and standard errors in brackets) are -0.438 (0.064) and 0.345 (0.057) for men and -0.755 (0.067) and 0.420 (0.055) for women, respectively. Although the coefficients of the lagged unemployment rate are highly significant and thus clearly reject $H_0^P$, the difference between the absolute values of $\alpha_2^0$ and $\alpha_2^1$ is large enough to reject $H_0^C$, too. We proceed with the difference specification for two reasons. First, we regard the estimates as being more in line with $H_0^C$. And second, the estimation of a dynamic model would add several technical problems but should deliver similar results for our purposes.
32 We drop observations corresponding to moves between districts from the estimation sample because the related wage changes would possibly capture returns to mobility from regional wage differences. A more satisfying but computationally much more demanding approach to separate upgrading from regional mobility effects is to include dummies for all possible moves between districts. We did not pursue this since the deletion of district movers has negligible effects on the results.
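As a reading aid, the way both hypotheses are nested in the augmented equation can be written out, together with the simple ratios implied by the reported estimates. This is a sketch derived only from the equation and the numbers given in this appendix; a formal test of $H_0^C$ would additionally require the covariance of the two coefficient estimates, which is not reported here.

```latex
\begin{align*}
H_0^C:\ \alpha_2^1 = -\alpha_2^0 \ &\Rightarrow\
  \alpha_2^0 U_t + \alpha_2^1 U_{t-1} = \alpha_2^0 (U_t - U_{t-1}) = \alpha_2^0 \Delta U_t
  \quad \text{(Equation 5)} \\
H_0^P:\ \alpha_2^1 = 0 \ &\Rightarrow\
  \alpha_2^0 U_t + \alpha_2^1 U_{t-1} = \alpha_2^0 U_t
  \quad \text{(level specification)} \\
\text{men:}\quad & \hat{\alpha}_2^1 / \mathrm{se} = 0.345 / 0.057 \approx 6.1, \qquad
  \hat{\alpha}_2^0 + \hat{\alpha}_2^1 = -0.438 + 0.345 = -0.093 \\
\text{women:}\quad & \hat{\alpha}_2^1 / \mathrm{se} = 0.420 / 0.055 \approx 7.6, \qquad
  \hat{\alpha}_2^0 + \hat{\alpha}_2^1 = -0.755 + 0.420 = -0.335
\end{align*}
```

The sum of the two coefficients is much closer to zero for men than for women, which is consistent with the statement that the evidence favours the cyclicality formulation without being fully conclusive.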