Patterns of Overeducation in Europe: The Role of Field of Study

Abstract This study investigates the incidence of overeducation among graduate workers in 21 European Union countries and its underlying factors, based on the European Labour Force Survey 2016. While we control for a wide range of covariates, our particular interest lies in the role of field of study for vertical educational mismatch. The study reveals country differences in the impact of these factors. Compared to Social Sciences, male graduates from, for example, Education, Health and Welfare, Engineering, and ICT (Information and Communication Technologies) are less at risk, and those from Services and Natural Sciences more at risk, in a clear majority of countries. These findings are robust to changes in the educational standard. Moreover, some fields show gender-specific risks. We suggest that occupational closure, productivity signals and gender stereotypes account for these cross-field and cross-country differentials. Moreover, country fixed effects point to relevant structural differences between national labour markets and between educational systems.


| Introduction
In general, the term overeducation refers to a job match in which the educational level of the worker clearly exceeds the educational requirements of the job. In the terminology of labour economics, this is often considered a vertical skill mismatch, as opposed to horizontal mismatches (workers choosing jobs with requirements outside the scope of their field of study/apprenticeship). A widespread occurrence of this phenomenon can seriously impair the competitiveness of an economy. From a macroeconomic perspective, an overeducation status of qualified workers reflects a waste of scarce human capital. From a microeconomic perspective, it can affect a worker's job satisfaction. In turn, a skill mismatch can reduce overall work motivation, expressing itself in more frequent absenteeism and higher turnover of the workforce (Tsang and Levin, 1985; Sicherman, 1991; Sloane et al., 1999). Moreover, overeducation is associated with earnings losses (e.g. Daly et al., 2000; Bauer, 2002; Boll and Leppin, 2016).
However, before being able to tackle the problem successfully, it is essential to understand the driving forces of overeducation at the individual level. In international comparison, the relevance of these driving forces might vary between countries and regions.
Against this background, the aim of this paper is to identify possible determinants of overeducation for young (20-34 years) highly educated (tertiary level) workers in EU-28 countries. We make use of the 2016 wave of the European Labour Force Survey (EU-LFS), a quarterly household sample survey that covers approximately 1.8 million individuals aged 15 years or older. This data set provides rich information on the respondent's demographic background, labour status, employment characteristics and educational attainment. It allows us to assess and compare the impact of a large variety of potential determinants, both separately for single countries and in a cross-country estimation. In doing so, our focus is on the role of a so far relatively neglected impact factor, the choice of field of study.
In this way, we make several contributions to the existing empirical literature on the determinants of overeducation. First, we include a range of new candidate explanatory factors in our framework, including a person's field of study and household characteristics such as the presence of inactive and unemployed household members.
Second, our results allow for a comprehensive country comparison of the associations between overeducation and distinct micro level characteristics within the EU area. This helps to identify differences in the seriousness of the phenomenon between countries and to develop tailor-made policy recipes.
Our findings reveal different overeducation risks for graduates from different fields. Compared to Social Sciences, male graduates from e.g. Education, Health and Welfare, Engineering, and ICT (Information and Communication Technologies) are less at risk, and those from e.g. Services and Natural Sciences are more at risk. These findings hold for the majority of countries and are robust to a change in the educational standard. However, countries show different gendered patterns of field-specific risks. We suggest that occupational closure, productivity signals and gender stereotypes account for these cross-field and cross-country differentials. Moreover, country fixed effects point to relevant structural differences between national labour markets and educational systems.

| Literature Findings
The empirical literature on this topic has produced a wide range of findings on the influence of individual- and job-related factors, in particular work experience (Alba-Ramirez, 1993; Groot, 1996; Sloane et al., 1999; Nielsen, 2011; Boll et al., 2016) and job tenure (Büchel and van Ham, 2002; Büchel and Battu, 2003; Groot and van den Brink, 2003; Ortiz, 2010), beyond macro-level factors such as job scarcity on the labour market, which might advantage graduates due to lower expected training requirements on the side of employers (Thurow 1975). Much less well documented are factors related to household composition, such as the number of children and of unemployed or inactive adults living in the same household. Moreover, beyond the educational level, the educational field of a person also constitutes an important element for predicting labour market outcomes (van de Werfhorst and Kraaykamp 2001, Hansen 2001) and hence might also be a determinant of overeducation.
The fact that field of study has seldom been analyzed so far is to some degree due to data limitations. Nevertheless, differences between fields might be substantial for several reasons. First, fields of study differ in their occupational focus. Fields like medicine or engineering with their quite narrowly defined job profiles might require more occupation-specific skills, raising the chances of graduates to find appropriate jobs in the corresponding occupational groups (Reimer et al. 2008). On the other hand, a narrower focus could also imply that graduates will have a harder time finding adequate jobs outside the limited scope of their field of study. Second, in the same vein, a higher occupational closure related to high entry barriers and narrowly defined educational standards could protect graduates of fields like medicine, law or architecture from educational mismatch (Ortiz and Kucel 2008). Third, credentialism theories suggest that in a world where the true personal abilities are unknown, the chosen field of study can also act as an ability signal to employers. Obtaining a degree in fields like Maths, Natural Sciences or technical disciplines, which enjoy the reputation of imposing high intellectual demands on their students, could convince employers of the extraordinary talent and/or motivation of applicants (Barone and Ortiz 2011). This could give them preferred access to positions with high skill requirements, possibly also outside the occupational groups associated with their subjects. Finally, field choice might be triggered by individual gender role orientations and social origin (Polachek 1978, Bradley 2000), such that field-specific labour market outcomes are not purely causal effects but are to some part driven by selection into fields. More specifically, gender norms might impact decisions on family formation and marriage and, via this channel, educational choices (Chiappori et al. 2009, Attanasio and Kaufmann 2017).
On the other hand, having graduated as a female in a male-dominated field could convey a negative productivity signal to employers, relative to male graduates in the same field, resulting in higher overeducation. Hence, in countries with highly gender-segregated higher education (e.g. Scandinavian countries; Carlsson 2011), women graduating in gender-atypical fields should be particularly penalized.
In estimating the role of field of study, it makes sense to distinguish between different levels of educational attainment. The training received by graduates from tertiary education is typically of a more academic nature and less focused on occupation-specific skills than vocational programs. Hence, the impact of the chosen field on the risk of overeducation is likely to differ with educational level. In the first analysis of this kind, Green and McIntosh (2007) restrict their estimation for the United Kingdom to the subsample of university graduates and thus the tertiary level. They make a quite detailed distinction between 12 educational fields. Among those, degrees in Physical Sciences and in Computing are estimated to lower the overeducation probability significantly relative to the reference category Business and Management Studies. The authors explain the insignificance of the field Maths by the fact that school grades in Maths were included as an additional control variable, thereby diluting the measurement of the field effect. Moreover, the signs of all field-related coefficients were negative, suggesting that the reference category Business and Management Studies is associated with the highest overeducation risk.
In contrast, Ortiz and Kucel (2008) analyzed a mixed sample of workers differing in educational attainment. Here, tertiary and non-tertiary degrees are distinguished by separate dummies. Estimations were conducted separately for Germany and Spain. As a reference category, a tertiary degree in Social Sciences, Business and Law was chosen.
This category was associated with the lowest overeducation risk both in Germany and Spain, a result that is at least partially at odds with Green and McIntosh (2007). In fact, the large majority of other subject-degree combinations yielded significantly higher overeducation probabilities in both countries. The highest probability was estimated for tertiary graduates from the field Services, again in both countries. Moreover, both tertiary and non-tertiary graduates from Human Arts are exposed to a particularly high overeducation risk. In a further approach, Tarvid (2012) made use of the European Social Survey data and tested the field effect in a supranational sample comprising 30 countries, but only university graduates. Again, the most striking result is that graduates from Services exhibit a much higher overeducation probability than graduates from Business, Law and Economics. Probabilities lower than for the reference were detected for the fields Education and Health. Berlingieri and Zierahn (2014) compare the overeducation risk of graduates from Humanities/Social Sciences, Business/Law and Natural Sciences for highly educated German males. They find for most specifications that Business and Law graduates are at significantly higher risk than graduates from Natural Sciences. Finally, the most recent test we are aware of was conducted by Capsada-Munsech (2015) for Italian university graduates. She found that graduates from Sociopolitics experience the highest overeducation probability, even significantly higher than the reference category Humanities. The lowest probability was estimated for Medicine. Overall, even though investigations are rare and comparability is limited by the different field classifications, the literature results suggest some considerable degree of heterogeneity, with students of Social Sciences, Services and Humanities being at higher risk than those in Natural and related Sciences.

| Data and Measurement
We use data from the European Labour Force Survey (EU-LFS) to identify possible determinants of overeducation. The EU-LFS covers approximately 1.8 million individuals aged 15 years or older from the EU-28 countries (plus Iceland, Norway and Switzerland) and asks the respondents about their demographic background, labour status, employment characteristics and, for persons not in employment, their previous employment experience and job search. Our analysis is based on 2016 data and is restricted to 21 EU countries, guided by issues of data availability regarding household variables and occupation groups. Respondents are assigned to countries based on their place of work. In order to illustrate country differences in overeducation risk and its determinants, we perform estimations both for an aggregate cross-country sample with country fixed effects and for the single countries separately to allow for country-specific associations of the included covariates with the dependent variable.
In line with previous studies (Reimer et al., 2008; Smyth and Steinmetz, 2008), we restrict our sample to highly educated individuals, as the issue of overeducation is by definition most relevant for members of this group and, with the sharp increase in graduates' population shares in OECD countries during the last decades (from 23.3 % in 1995 to 43.1 % in 2016 on average), affects more and more people (OECD 2018; see footnote 3). Highly educated individuals are defined as persons who have completed tertiary education. This corresponds to educational levels 6, 7 and 8 of the ISCED 2011 classification included in the dataset. Furthermore, the sample is restricted to respondents aged 20 to 34 years. This restriction is motivated by our primary interest in the impact of field of study, as the EU-LFS provides field of study information only for this age group.
We refer to overeducation, as defined above, as a vertical inadequacy. The literature follows different ways of measuring overeducation, from expert evaluation of occupation-specific required education (which is seldom available; Eckaus 1964) and respondents' subjective assessments to statistical approaches (realized matches). For our purposes, we adopt a variant of the realized matches approach. This is the only measure that can be employed with the data at hand; according to the literature, however, each measure has its pros and cons. Empirical evidence suggests that self-assessed overeducation is influenced by other job features such as occupational status and particularly income (Dolton and Vignoles, 2000). Survey participants may be inclined to exaggerate educational requirements of their job for various reasons (Borghans and de Grip, 2000).
More specifically, we follow the realized matches approach proposed by Kiker et al. (1997). We apply the 80th percentile of the levels of education within each occupation group as proposed by Ortiz & Kucel (2008). It considers a worker to be overeducated in her given job match if her educational level exceeds the 80th percentile of the distribution of observed levels of education in the given occupation. As a sensitivity check, we additionally report results calculated based on the mode as the educational standard. The choice of reference point can potentially have a sensitive impact on the measurement, depending on the specific distributions of educational levels within an occupation group. Preferring the 80th percentile over the mode follows the idea that the mode regularly relates to higher overeducation rates in the same methodological setting and based on the same data (for a literature overview, see Cedefop 2010, p. 18-20). This particularly applies when the underlying distribution of the dependent variable is fairly even; in this case, depending on the exact position of the most frequent single value, the observations above (or below) this threshold may cover a quite high population share.

Footnote 3: Studies that compare educational groups stress the higher magnitude of overeducation for graduates. For Germany for example, a study based on the Socio-economic Panel (SOEP) estimated a 30 % (41 %) risk for West (East) German male graduates and a 36 % (38 %) risk for West (East) German female graduates of statistical overeducation in 2011, whereas the corresponding figures for workers with medium education are 8 % (12 %) for men and 14 % (10 %) for women (Boll et al. 2016).
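The realized matches measure can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nearest-rank percentile definition, the tie-breaking rule for the mode, the function names and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def nearest_rank_percentile(values, q):
    """Smallest observed value with at least q percent of observations at or below it."""
    ordered = sorted(values)
    rank = max(1, -(-q * len(ordered) // 100))  # integer ceiling of q * n / 100
    return ordered[rank - 1]


def modal_level(values):
    """Most frequent level; ties are broken towards the lower level (assumption)."""
    counts = Counter(values)
    top = max(counts.values())
    return min(level for level, n in counts.items() if n == top)


def overeducation_flags(workers, standard="p80"):
    """workers: list of (occupation_group, education_level) pairs.

    Returns one 0/1 flag per worker: 1 if the worker's level exceeds the
    educational standard of his or her occupation group.
    """
    levels_by_occupation = defaultdict(list)
    for occupation, level in workers:
        levels_by_occupation[occupation].append(level)

    thresholds = {}
    for occupation, levels in levels_by_occupation.items():
        if standard == "p80":
            thresholds[occupation] = nearest_rank_percentile(levels, 80)
        elif standard == "mode":
            thresholds[occupation] = modal_level(levels)
        else:
            raise ValueError("standard must be 'p80' or 'mode'")

    return [int(level > thresholds[occupation]) for occupation, level in workers]


# Hypothetical toy data: education levels observed in two occupation groups.
toy_workers = (
    [("clerks", lvl) for lvl in [3, 3, 3, 3, 3, 3, 3, 4, 6, 7]]
    + [("professionals", lvl) for lvl in [6, 6, 7, 7, 7, 8]]
)
flags_p80 = overeducation_flags(toy_workers, "p80")
flags_mode = overeducation_flags(toy_workers, "mode")
print(sum(flags_p80), sum(flags_mode))
```

In this toy sample the mode flags more workers than the 80th percentile, which mirrors the argument that the mode tends to produce higher overeducation rates on the same data.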
To investigate the association between overeducation risk and educational field, we implement the field of study indicator provided in the EU-LFS data as an explanatory variable. Following the classification scheme ISCED-F 2013, it distinguishes 11 field categories. In our estimation model, the single categories are coded as dummy variables, choosing the category "Social Sciences, Journalism and Information" to be the omitted reference category. Additionally, in order to illuminate potential gender differences in the role of the single fields, we include interaction terms of the field category with a dummy measuring sex (female=1; male=0).
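As an illustration of this coding, the following Python sketch builds the field dummies and their interactions with the female dummy for a single respondent. The shortened field labels, the column names and the function `design_row` are hypothetical and are not taken from the EU-LFS codebook.

```python
REFERENCE = "Social Sciences, Journalism and Information"

# Eleven broad field categories, with labels shortened as in the text (assumption).
FIELDS = [
    "General Programmes",
    "Education",
    "Arts and Humanities",
    REFERENCE,
    "Business and Law",
    "Natural Sciences",
    "ICT",
    "Engineering",
    "Agriculture",
    "Health and Welfare",
    "Services",
]


def design_row(field, female):
    """Return the sex dummy, field dummies and field x female interactions.

    The reference field is omitted, so a male graduate from the reference
    field has zeros in every column.
    """
    if field not in FIELDS:
        raise ValueError("unknown field: " + field)
    row = {"female": int(female)}
    for name in FIELDS:
        if name == REFERENCE:
            continue  # omitted reference category
        dummy = int(field == name)
        row["field: " + name] = dummy
        row["field: " + name + " x female"] = dummy * int(female)
    return row


example = design_row("Engineering", female=True)
print({key: value for key, value in example.items() if value == 1})
```

With this coding, the coefficient on a field dummy describes male graduates of that field relative to male graduates of the reference field, while the interaction coefficient captures the additional difference for female graduates of the same field.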
Moreover, to control for the impact of other factors, we add a broad range of control variables. In particular, we differentiate between three categories of covariates, namely personal characteristics, household characteristics and job characteristics.
Personal characteristics include sex, marital status and two dummy variables that are equal to one if the respondent is a foreigner from another EU country or a non-EU country, respectively. Furthermore, we consider the impact of age within our already narrow sample of 20 to 34 year olds by introducing dummies for membership in the age groups 25-29 and 30-34 years. Hence, the 20 to 24 year olds represent the reference category in this regard.

As household characteristics, we are in principle able to control for the educational level (ISCED) of the spouse and for the presence of unemployed or inactive adults, elderly persons (aged 75 and over) and children by age of the youngest one (between 0 and 5 years, between 6 and 11 years and between 12 and 17 years) in the same household. As recent research still identifies a strong gender bias in the labour market implications of children (Waldfogel, 1998) [...]

Job characteristics include, among others, usual working hours and tenure. Usual working hours are given as the number of hours that a respondent usually works per week in her main job. Tenure is defined as the number of years since a person started to work with her current employer or as self-employed. In order to shed light on potential non-linearities, both variables are also included as squared terms. Further job characteristics are considered by means of dummies that are equal to one if the respondent holds a temporary contract or if she has a second job, respectively. Firm size is split into three dummy variables, namely 11 to 19 employees, 20 to 49 employees and more than 50 employees. Persons who work for firms with between 1 and 10 employees belong to the reference group.
As a variable reflecting the spatial dimension, the degree of urbanization at the workplace is included. It is split according to population density into two dummy variables for rural areas and towns/suburbs, with cities as the most densely-populated area as a reference category. Finally, as common control variables, we also include sector (sections according to NACE Rev. 2) and country dummies in our regressions.

| Estimation method
The simplest (and also most common) approach to analyzing impact factors on overeducation risk is to implement a Probit model (see Judge et al., 1988). The target variable $y_i$ classifies a respondent $i$ either as overeducated ($y_i = 1$) or not ($y_i = 0$). In the Probit model, the probability of $y_i = 1$ is modelled as follows:

$$P(y_i = 1 \mid x_i) = \Phi(x_i'\beta),$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution and $x_i$ is the set of covariates presented above. The model can be estimated with the maximum likelihood method, which yields consistent, asymptotically efficient and asymptotically normally distributed estimates. Due to the nonlinearity of the model, marginal effects are not simply given by the estimated coefficients $\hat{\beta}$, but depend on the level of the covariates.
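To illustrate why Probit marginal effects depend on the level of the covariates, the sketch below evaluates the predicted probability and two kinds of effects for given coefficients. The coefficient values, the covariate layout (constant, tenure, one field dummy) and the function names are hypothetical; no estimation is performed.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def linear_index(x, beta):
    """Inner product of covariates and coefficients."""
    return sum(xk * bk for xk, bk in zip(x, beta))


def probit_probability(x, beta):
    """Predicted probability of overeducation for covariate vector x."""
    return norm_cdf(linear_index(x, beta))


def marginal_effect(x, beta, k):
    """Derivative of the probability w.r.t. continuous covariate k: pdf(index) * beta_k."""
    return norm_pdf(linear_index(x, beta)) * beta[k]


def dummy_effect(x, beta, k):
    """Change in the probability when dummy covariate k switches from 0 to 1."""
    x_on, x_off = list(x), list(x)
    x_on[k], x_off[k] = 1, 0
    return probit_probability(x_on, beta) - probit_probability(x_off, beta)


# Hypothetical coefficients: constant, tenure in years, field dummy.
beta = [-0.5, -0.05, -0.4]
short_tenure = [1, 1, 0]
long_tenure = [1, 10, 0]

print(marginal_effect(short_tenure, beta, 1), marginal_effect(long_tenure, beta, 1))
print(dummy_effect(short_tenure, beta, 2))
```

The same tenure coefficient translates into a different change in probability at one and at ten years of tenure, which is why marginal effects have to be reported at stated covariate values or averaged over the sample.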
A drawback of this simple approach is that it neglects a potential estimation bias due to self-selection into employment. It rests the analysis purely on those individuals having a job at the time of observation. However, intuition suggests that overeducation risk could well be correlated with employment selection, for instance if the prospect of entering into a skill mismatch induces job seekers to stay unemployed instead, in order to avoid expected earnings drawbacks or other disadvantages like job dissatisfaction. Under such circumstances, employed and non-employed individuals systematically differ in their risk levels. Results based on estimations not accounting for the impact of work selection will then be biased. Fortunately, Heckman (1979) has developed a two-step correction mechanism to take care of this issue. As a first step, based on a sample of workers and non-workers, a Probit model is specified estimating the likelihood of being in employment at the time of observation as a function of several control variables. As a second step, based on a sample restricted to workers, the Probit model for overeducation can be estimated, including the inverse Mills ratio computed from the results of the first step as an additional control variable, reflecting the impact of selection on the overeducation equation.
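The link between the two steps can be sketched in Python as follows. The code only shows how the inverse Mills ratio is computed from a first-step employment index and appended to the second-step covariates; the first-step coefficients `gamma`, the covariate layout and the function names are hypothetical, and neither Probit model is estimated here.

```python
from math import erf, exp, pi, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def norm_pdf(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def inverse_mills_ratio(index):
    """pdf(index) / cdf(index), evaluated for a respondent observed in employment."""
    return norm_pdf(index) / norm_cdf(index)


def add_selection_term(employed_rows, gamma):
    """Append the inverse Mills ratio to the overeducation covariates.

    employed_rows: list of (z, x) pairs for employed respondents, where z holds
    the covariates of the employment equation (including the identification
    variables) and x the covariates of the overeducation equation.
    gamma: first-step Probit coefficients of the employment equation.
    """
    augmented = []
    for z, x in employed_rows:
        index = sum(zk * gk for zk, gk in zip(z, gamma))
        augmented.append(list(x) + [inverse_mills_ratio(index)])
    return augmented


# Hypothetical first-step coefficients: constant, identification variable.
gamma = [0.8, -0.6]
rows = [([1, 0], [1, 5.0]), ([1, 1], [1, 2.0])]
for row in add_selection_term(rows, gamma):
    print(row)
```

A respondent with a low predicted employment probability receives a large inverse Mills ratio, so the additional regressor is largest for those workers whose presence in the employed sample is least expected.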
In principle, the application of the Heckman procedure sets two data requirements. First, both workers and non-workers need to be included in the dataset. This is the case with EU-LFS. Second, in order to create exogenous variation in the inverse Mills ratio, one or more identification variable(s) need to be included. These are indicators which influence the employment probability, but not directly the probability of overeducation. They therefore only appear in the employment equation. In our case, the choice of identification variables is further complicated by country heterogeneity: due to diversity of culture and traditions, different sets of identifiers could be appropriate for different countries. For a consistent approach, we applied the following specification scheme for each country sample (as well as the cross-country estimation): First, a basic Probit model for overeducation probability without selection correction, including all indicators as explanatory variables, was estimated. Those household variables for which the null hypothesis of zero influence could not be rejected at the 10 % significance level were considered as candidates for identification variables. Second, a Probit model for employment probability, likewise considering the whole set of indicators as explanatory variables, was estimated. Based on the results, we selected among the candidates those as identification variables for which a significant influence on employment probability could be measured. Finally, the Probit estimation of overeducation probability was repeated, this time including the sample selection correction (i.e. the inverse Mills ratio obtained from the previous step), but omitting the identification variables. The coefficients obtained from this last regression are reported as the final results and are interpreted in the following section.

Our expectation was that using the mode instead of the 80th percentile should relate to a comparatively higher magnitude of overeducation. This is confirmed by the descriptive overeducation frequencies graphed in Figure 2 in the Appendix. About every second worker is considered overeducated in the pooled sample when the mode is used as the educational standard. Moreover, the distribution among countries shows quite a different picture. Italy, Ireland and the Netherlands now represent the countries with the highest overeducation rates, whereas the lowest rates are observed for Poland, Portugal and Romania. In these countries, only about one third of the workers are assessed to be overeducated.

Table 1 presents estimation results for educational fields obtained in the overeducation regression at the cross-country level. Sign and significance of the single coefficients need to be interpreted relative to the reference category "male graduates from Social Sciences, Journalism and Information" (see footnote 6). Table 2 in the Appendix reports the country-specific results. For a correct interpretation, it is important to be aware that the 80th percentile measure represents a more restrictive criterion under most circumstances.

| Descriptive results
Hence, persons classified as overeducated by this criterion can be considered severely overqualified. In what follows, we discuss the results of the cross-country estimation together with country-specific results.

Base term effects
We start with the base term, which represents the field impact for male graduates. Compared to the reference category, graduates from Education, ICT, Engineering, and Health and Welfare exhibit a significantly lower risk of overeducation, whereas graduating in Natural Sciences and Services is associated with a higher risk in the cross-country comparison. Arts and Humanities, Business and Law as well as Agriculture are not significantly different from Social Sciences in terms of overeducation risk in the cross-country perspective. At country level, however, the picture differs in some aspects. For Education, the overall picture is confirmed, with the majority of countries reporting a comparatively lower risk (significant results: 3 with positive, 9 with negative sign). The same holds true for ICT, Health and Welfare, and Engineering, with a lower risk of overeducation compared to Social Sciences in most countries (significant coefficient signs: ICT 0 positive, 7 negative; Health and Welfare 2 positive, 4 negative; Engineering 1 positive, 5 negative). As in the cross-country estimation, Natural Sciences (6 positive, 0 negative) and Services (3 positive, 1 negative) are related to a higher risk compared to Social Sciences also at the country level. Moreover, Arts and Humanities, which do not significantly deviate from the reference category in the country-pooled estimation, turn out to be associated with a lower risk in the majority of countries (2 positive, 4 negative). The same applies to Business and Law (3 positive, 5 negative), but the opposite holds for Agriculture, which exhibits a comparatively higher risk in most countries (4 positive, 1 negative). Concerning the significance of results, the roles of Education, ICT, Engineering, and Natural Sciences mark the most clear-cut results.

The obtained relationships between field of study and vertical educational mismatch broadly fit intuition: The identified low-risk fields are for the most part associated with comparatively specific job profiles, whose well-defined qualification requirements represent entry barriers that provide preferential access for those who hold the appropriate degree (with physicians in Health and Welfare being the most rigorous example). In this regard, Arts and Humanities seems to represent an exception. However, besides traditionally less job-specific programmes like history and philosophy, this ISCED-F category also includes more labour-market oriented subjects like handicrafts and design studies, possibly explaining the surprisingly low overeducation risk of the main group.

Footnote 6: We abstain from reporting results for the category of General Programmes, as this applies only to a very small share of graduates in the tertiary segment.
By contrast, programmes in the fields of Social Sciences are traditionally much less job-specific, forcing their graduates to compete with a range of applicants with other educational backgrounds when entering the labour market. Hence, it is no surprise to see a comparatively large share of them drawn into mismatches. The particularly high risk detected for graduates from Services is also as expected, given that it includes areas like Catering, Travel and Personal Services in which competition by non-academic applicants is tough. More surprising is the positive coefficient on Natural Sciences. It might be explained by the fact that this field contains not only disciplines with the reputation of setting high demands regarding analytical skills, like Mathematics and Physics, but also several forms of Environmental studies, whose marketability tends to be more limited.
Confronting our results with those of the literature, we notice some parallels with and discrepancies from other studies based on EU-LFS. This foremost concerns Ortiz & Kucel (2008) as well as Ghignoni & Verashchagina (2014), the only other studies we are aware of that investigate the impact of study choice on overeducation with the help of EU-LFS. [...] (2015) are in accordance with our results in the sense that they also obtain a low overeducation risk for Engineers, albeit compared to another reference category (Humanities).
The remainder of the relevant studies is interested in other forms of labour market outcomes. Some of them seem to underpin the composition of our results. For instance, Nunez & Livanos (2010) analyze the impact of field choice on unemployment risk. For their Europe-wide sample, they identify the lowest risks for graduates from the fields Engineering, Education and Health/Welfare, groups that are also associated with a particularly low overeducation risk in our cross-country estimates and in the majority of our country-specific estimates. In a country-specific analysis, Reimer et al. (2008) likewise detect a consistently low unemployment probability for graduates from Health/Welfare, and, with just a few exceptions, also for Education and Engineering. Low overeducation and low unemployment rates both indicate a high labour market demand for the respective field-specific skills.

Gender-specific effects
A slight variation of this pattern is observed when focusing on the coefficients of the interaction terms. To a large part, they are insignificant. At the same time, the base term for sex is also insignificant. Together, this implies that no difference in the overeducation risk of male and female graduates within the corresponding fields can be statistically proven. The two exceptions are Engineering and Arts and Humanities, where female graduates are assessed to be at significantly higher risk than male graduates. For a correct interpretation of this, it is important to stress again that the coefficient estimates are obtained from a regression including a large set of household- and job-related control variables. Hence, well-known reasons for gender biases on the labour market, such as the facts that women face tougher restrictions through the presence of children, are more often stuck in part-time jobs and sort into different sectors than men, cannot serve as explanations in this case. Instead, one could think of five (non-exclusive) channels. First, the risk discrepancies might reflect that, on average, female graduates exhibit different preferences in terms of job attributes than their male counterparts in the same educational fields. Women might prefer family-friendly and flexible work arrangements over an optimal match with correspondingly higher earnings (Coudin et al. 2018). Second, they might be a sign of field-specific gender discrimination concerning access to adequate jobs, e.g. as a consequence of gender stereotypes regarding job images (Glick et al., 1995). Third, they could indicate that in these fields male graduates showed on average the better academic performance, giving them better chances to enter adequate positions. Fourth, they could also indicate the existence of educational sorting at a lower aggregation level than measured. This would mean that within these two fields female students tend to self-select into specific programmes that offer comparatively worse job opportunities and are thus more prone to cause overeducation (see the discussion of educational branches in Section 5.2.2). Given the rather high aggregation level of our field variable in EU-LFS, this appears to be a likely explanation. Fifth and finally, gender differences in field-specific risks could originate in masked gender differences regarding assumed occupations. Concerning our specific results, intuition suggests that the second explanation might be more relevant in the case of Engineering, while the fourth and fifth ones seem rather suitable for Arts and Humanities.
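How the base term for sex and a field interaction term combine into a within-field gender gap can be made concrete with a small Python sketch. All coefficient values and the function name are hypothetical and are not estimates from this study.

```python
from math import erf, sqrt


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def within_field_gender_gap(male_index, b_female, b_interaction):
    """Female minus male predicted overeducation probability within one field.

    male_index: linear Probit index of a male graduate from the field
    (constant, field base term and all other covariates held fixed).
    b_female: coefficient on the female dummy (base term for sex).
    b_interaction: coefficient on the field x female interaction term.
    """
    female_index = male_index + b_female + b_interaction
    return norm_cdf(female_index) - norm_cdf(male_index)


# Hypothetical values: no sex base effect, no interaction effect.
print(within_field_gender_gap(-0.8, 0.0, 0.0))
# Hypothetical values: no sex base effect, positive interaction effect.
print(within_field_gender_gap(-0.8, 0.0, 0.3))
```

When both coefficients are zero the gap vanishes, which corresponds to the case in which no gender difference within a field can be statistically proven; a positive interaction coefficient alone is enough to produce a higher predicted risk for female graduates of that field.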
Furthermore, gender differences in field-specific overeducation rates could originate in gender-specific enrollment rates across fields and correspondingly different demand/supply ratios on the labour market. This is not the place to discuss these effects in detail, but reference shall be made to empirical studies that analyze, in a cross-country comparison, the role of institutional factors underlying the observed gender segregation in fields of study, such as family policies, prevalent gender norms and gender pay gaps (Zuazu 2018; Smyth/Steinmetz 2008).

| Country dummies
The country dummies report country-specific risks that cannot be explained by the controlled individual characteristics of the national sample members (Table 3). The country-specific effects may refer to country differences in Higher Education (HE) attainment rates and the skill structure of national labour markets, but also to special features of educational systems, i.e. regarding selectivity of entry, drop-out rates, and the reputation of different branches of HE (masters vs. bachelors in sequential systems and universities vs. vocational schools in binary systems). According to Barone and Ortiz (2011), comparatively low attainment rates in HE in the Czech Republic, Italy, Austria and Germany should relate to relatively low overeducation rates in these countries. This holds true for the Czech Republic in our study, which has an insignificant country dummy (compared to Germany as a reference) and displays only a slightly higher overeducation rate than Germany in the descriptive analysis. Low tertiary attainment rates and a highly stratified HE system in the Czech Republic (OECD 2006) should play out in terms of low overeducation rates of graduates. However, Czech graduates from Agriculture suffer a significantly higher risk than Social Scientists, which does not apply to Germany.
However, the Czech Republic is in a more advantageous position relative to Germany in terms of overall overeducation magnitude than Spain. Spain turns out to be a country with high vertical mismatch, which might be explained, first, by mass enrollment in a sequential HE system generating particularly high numbers of bachelors without the labour market exhibiting a suitable absorption capacity for the high-skilled (Barone/Ortiz 2011). Second, in the highly segmented Spanish labour market with a high share of temporary jobs, a suboptimal match is the 'price' for a permanent job (Ortiz 2010). By contrast, the Dutch labour market, including posts in the public sector, accommodates the high supply of graduates leaving the HE system. In line with this, the Netherlands is the only West European country whose country-level factors operate more strongly against overeducation than in the German case.
With Austria and Italy, however, two countries exhibit clearly higher magnitudes of overeducation than Germany, and their country-level effects seem to contribute to this result. This is astonishing since Austria's HE system is highly stratified (OECD 2006).
One reason might be that vocational schools which have a shorter tradition than in Italy still send out a negative productivity signal (compared to universities), despite posing high entry barriers. Secondly, single fields of study might drive the overall result in Austria and also in Italy. Compared to Austrian and Italian ones, German graduates from Education and Health/Welfare are at significantly lower risk than graduates from Social Sciences. This view is supported by Barone and Ortiz (2011) who state that Education and Health are among the employment areas that drive cross-country differences in overeducation. and France with both positive and Slovakia with a negative significant coefficient as the only exceptions (regarding France, the significance level is 10% only).

| Other impact factors
Among the individual characteristics, only nationality is estimated to have a statistically significant influence at the European level. Other factors being equal, foreigners are at higher risk than domestic citizens, with non-EU foreigners suffering an even (slightly) higher risk.
To the extent that foreigners include immigrants, this is in line with general economic reasoning. It would predict a higher risk for immigrants due to the non-transferability

| Sensitivity analysis
As we pointed out in the methodological section, our statistical approach of measuring overeducation is not the only consistent method available. Therefore, it is important to grasp the sensitivity of the results to this measurement choice by comparing it to an alternative. To learn how far the application of this alternative might affect our econometric results, we repeated our estimations, this time based on the mode instead of the 80th percentile. Table 5 in the Appendix lists the estimates for the fields of study obtained under this alternative scenario for the cross-country sample and  (5 positive, 1 negative 18 ) are associated with a higher overeducation risk than Social Sciences. Furthermore, Agriculture, which lacks significance in the overall estimation, is also related to a comparatively higher risk (5 positive, 3 negative). In the cross-field comparison, results are most clear-cut for Services, Arts/Humanities, ICT, Engineering and Natural Sciences. Note that the results regarding the prevalent direction ("lower" vs. "higher") compared to the reference category do not only hold true for the significant parameters at the country level but also when the total of results is taken into account.
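The tallies in parentheses summarise, for each field, how many country-level coefficients are significantly positive or negative. A minimal sketch of such a tally is given below. The function name `tally_signs`, the 10% significance cut-off and the toy coefficients are illustrative assumptions and do not reproduce the estimates reported in the Appendix.

```python
from collections import Counter


def tally_signs(estimates, alpha=0.10, significant_only=True):
    """Count positive and negative coefficients across countries.

    `estimates` maps a country label to a (coefficient, p_value) pair.
    With significant_only=True, coefficients with p_value >= alpha are skipped.
    Coefficients that are not strictly positive are counted as negative.
    """
    counts = Counter(positive=0, negative=0)
    for coefficient, p_value in estimates.values():
        if significant_only and p_value >= alpha:
            continue
        counts["positive" if coefficient > 0 else "negative"] += 1
    return counts


# Hypothetical country-level estimates for one field (made-up numbers).
toy_field = {
    "country_A": (0.40, 0.01),
    "country_B": (0.25, 0.04),
    "country_C": (-0.30, 0.03),
    "country_D": (0.10, 0.45),  # insignificant at the 10% level
}

significant = tally_signs(toy_field)
all_results = tally_signs(toy_field, significant_only=False)
print(dict(significant), dict(all_results))
```

Calling the function twice, once with and once without the significance filter, mirrors the distinction drawn above between the significant parameters and the total of results.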
Comparing the measures in terms of the basic effect, the pattern of fields derived from the mode is almost identical to the one derived from the 80th percentile, both for the cross-section and the country-specific results. That is, our sensitivity analyses confirm the main results regarding the base term. Not a single field changes sign due to the measure change. Directly compared based on significant results, the switch from the 80th percentile to the mode is accompanied by even more clear-cut results with respect to Services and Arts/Humanities, whereas results for Education, Natural Sciences, and ICT are a bit more clear-cut under the stricter overeducation threshold. Results remain similarly unequivocal for Engineering. Less clear-cut results are retrieved, irrespective of the measurement, for Agriculture, Health/Welfare and Business/Law. As a tendency, severe overeducation seems to play a lesser role for Arts/Humanities (where graduates are at low risk anyway) and Services, but marks a notable portion of overeducation among engineers, teachers, ICT graduates (although their overall risk is low), and natural scientists (with a quite substantial risk).
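Why the 80th percentile acts as the stricter threshold can be illustrated with a small sketch. It assumes that education is coded as an ordinal level, that the standard education is computed within occupation groups, and that a worker counts as overeducated when his or her level exceeds that standard. The grouping, the coding, the nearest-rank percentile rule and the toy data are assumptions made for illustration, not the exact EU-LFS implementation.

```python
from collections import Counter, defaultdict


def nearest_rank_percentile(values, pct):
    """Smallest value with at least pct percent of observations at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def modal_value(values):
    """Most frequent value; ties are resolved in favour of the lower level."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


def standard_education(workers, rule):
    """Map each occupation to its standard education level under `rule`."""
    by_occupation = defaultdict(list)
    for occupation, level in workers:
        by_occupation[occupation].append(level)
    return {occ: rule(levels) for occ, levels in by_occupation.items()}


def overeducated(workers, standard):
    """Flag workers whose level exceeds the standard of their occupation."""
    return [level > standard[occ] for occ, level in workers]


# Hypothetical toy data: (occupation, ordinal education level).
workers = (
    [("clerk", 3)] * 5 + [("clerk", 5)] * 3 + [("clerk", 6)] * 2
    + [("engineer", 6)] * 6 + [("engineer", 7)] * 4
)

standard_mode = standard_education(workers, modal_value)
standard_p80 = standard_education(workers, lambda v: nearest_rank_percentile(v, 80))

n_over_mode = sum(overeducated(workers, standard_mode))
n_over_p80 = sum(overeducated(workers, standard_p80))
print(standard_mode, standard_p80, n_over_mode, n_over_p80)
```

In this toy sample the percentile-based standard lies above the modal one in both occupations, so fewer workers are flagged. Under the percentile rule only workers far above the typical level of their occupation count as overeducated, which is the sense in which it captures severe overeducation.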
Most of the country heterogeneity regarding overeducation, however, concerns the interaction terms (gender-specific effects), although most of them are insignificant at the cross-country level. One example is female graduates from Education, who are at significantly higher risk than male ones in several countries (Cyprus, Estonia, Greece, Portugal, Romania), while there are at least two countries in which female graduates are at significantly lower risk (Ireland, Netherlands). Concerning Arts and Humanities, the slightly higher risk of females (significant at the 10% level) at the cross-country level is also confirmed for a wide range of countries, which accords with the findings based on the 80th percentile. In contrast to the cross-country results, among Natural Scientists there are three countries in which males are subject to systematically higher risk than females (Czech Republic, France, Ireland) and one in which the opposite is true (Lithuania). Concerning ICT, the positive association with overeducation that is measured for Greece applies to male workers only. At the same time, a positive association can be attested for females only in Austria and Romania. Agriculture shows a particularly high degree of heterogeneity, also in gender interactions with overeducation risks. Concerning Health and Welfare, the higher overeducation risks in the Czech Republic, Lithuania, and the Netherlands apply almost exclusively to male graduates. In Germany, Poland, Bulgaria, Estonia, Greece and Lithuania, female graduates from Engineering face higher risks than male graduates in the same field. In Germany and Poland, where male engineers exhibit lower overeducation risks than male social scientists, female engineers approach the risk level of male social scientists when the interaction and the base term of Engineering are taken together.
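How the base and interaction terms combine can be written out explicitly. The following is a sketch under the assumption of a linear index with field dummies $F_{if}$, a female dummy and their interactions, with male Social Sciences graduates as the reference group. The symbols $\beta_f$, $\gamma$, $\delta_f$ and $\theta$ are notation introduced here for illustration, and the comparisons hold on the index scale rather than as marginal effects.

```latex
\begin{align*}
y_i^{*} &= \alpha + \sum_{f} \beta_f F_{if} + \gamma\,\mathrm{female}_i
          + \sum_{f} \delta_f \,(F_{if} \times \mathrm{female}_i)
          + x_i'\theta + \varepsilon_i \\[4pt]
\text{male in field } f \text{ vs. male Social Sciences:} \quad & \beta_f \\
\text{female vs. male within field } f: \quad & \gamma + \delta_f \\
\text{female in field } f \text{ vs. male Social Sciences:} \quad & \beta_f + \gamma + \delta_f
\end{align*}
```

With an insignificant base term for sex ($\gamma \approx 0$), a negative $\beta_{\mathrm{Eng}}$ and a positive $\delta_{\mathrm{Eng}}$ of similar size give $\beta_{\mathrm{Eng}} + \delta_{\mathrm{Eng}} \approx 0$, which corresponds to the statement that female engineers approach the risk level of male social scientists.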
Concerning the country dummies, a somewhat different picture emerges, compared to those derived from the standard education in the main analysis (see Table 7 in the Appendix). As the associations of fields of study are robust against the measure change, the deviations in country-fixed effects have to be attributed to deviating associations of individual characteristics with overeducation, resulting from the change in overeducation measurement.
Regarding other individual and household characteristics, Table 8 in the Appendix reports detailed results. The mode differs from the 80th percentile employed as the standard education in the main analyses in that under the mode, individuals living together with persons who are unemployed or inactive on the labour market face a higher overeducation risk. This was not the case under the 80th percentile as the much stricter overeducation threshold. One interpretation could be that the need to financially support unemployed household members induces workers to avoid their own unemployment by accepting some degree of suboptimal matching, but that this does not extend to very bad matches. Alternatively, it could point to a linkage between household composition and job-related productivity based on assortative mating (Mare, 1991): workers living together with unemployed persons might on average be less productive themselves, a fact that reduces their chances of finding a match adequate to their formal education. At country level, both variables represent identification variables for a large set of countries and are thus not included in the overeducation regressions. Among the countries in which they are included, a few deviations from the cross-country result can be noticed. This primarily concerns Hungary, with opposite (i.e. positive) contributions measured for both variables.
Finally, we also undertook additional estimations including further explanatory factors at the regional level (NUTS 2), such as the regional unemployment rate and the employment-to-population ratio. However, due to the large share of missing values, models including this regional information did not yield reliable results for the population as a whole.

| Conclusion
The purpose of this paper was to conduct a comprehensive econometric analysis of potential determinants of overeducation among graduates in 21 EU countries in a unified framework. A special focus was set on the role of subject choice made by the individuals during their educational career. It turned out that both in the cross-country estimation and at country level differences in overeducation risk between graduates from different fields are significant. Furthermore, gender discrepancies in the impact of certain fields are noticeable. At the European level, graduates from Services, Natural Sciences and Agriculture are found to exhibit the highest risk among men. At the same time, male graduates from fields like ICT, Health/Welfare, Education, Engineering but interestingly also Arts/Humanities, are exposed to a rather low risk. The field-specific risks apply for the majority of countries and are robust against a measure change in the educational standard. We suggest that occupational closure and productivity signals are among the relevant underpinnings of these cross-field differentials.
Gender differences in field-specific overeducation risks mostly lack statistical significance, with Engineering and Arts and Humanities, where female graduates are assessed to be at significantly higher risk than male graduates, marking the exceptions. By and large, the above-mentioned sensitivity analysis deploying another standard education confirms this pattern for an alternative method of measuring overeducation. Prevalent gender stereotypes, discrimination, but also gendered preferences and institutional factors at the national level that boost gendered segregation of college majors are supposed to drive gendered overeducation, but we have to leave this issue for a more detailed analysis in future research.
Moreover, country fixed effects point to relevant structural differences between national labour markets and educational systems. As we included a selection correction in our estimation approach, country differences concerning employment selection should not be the source of this heterogeneity. Rather, differences in educational systems, in the capacities of labour markets to absorb young tertiary graduates as well as in culture- and tradition-based attitudes seem likely candidates. Although we made some references to the literature here, disentangling these different national aspects and utilizing them for an analysis of country patterns represents a second interesting avenue for future research.
Further arguments add to the limitations of our study. Despite the wide range of individual covariates, we are aware of missing factors that proved to be relevant for overeducation propensity, like paternal background (Jackson et al. 2008)
Reference category: Germany. *: Statistical significance at 10%-level; **: Statistical significance at 5%-level; ***: Statistical significance at 1%-level.