Study abroad programmes and student outcomes: Evidence from Erasmus

Exploiting admission thresholds for participating in Erasmus, the most popular higher education study abroad programme in Europe, we implement a regression discontinuity design and show that student mobility does not delay graduation and, in addition, has a positive and significant impact on the final graduation marks of undergraduate students. We find that Erasmus mobility improves graduation results for undergraduate students enrolled in scientific and technical fields and for those who apply in the first year of their studies, especially when enrolled in more demanding degree courses. Investigating plausible mechanisms, we find that the positive impact on performance at graduation is stronger for students who visit foreign universities of relatively lower quality compared to their home university. Finally, we do not find statistically significant effects of Erasmus mobility on postgraduate educational choices and labour market outcomes one year after graduation.


A Appendix A

A.1 Design features
The selection process for the award of Erasmus grants presents several peculiarities that deserve further investigation. Within the main yearly call for Erasmus applications, launched in January of each year, the grant assignment mechanism can be summarised as follows:
• Each student can submit up to two applications for mobility grants funding specific Erasmus programmes.
• A student who is offered an Erasmus grant can turn down the offer, within the deadline, once the first ranking is published; these grants are reallocated to the next students in the specific programme ranking, until the last eligible student.
• Within each ranking relative to a specific Erasmus programme, the qualifying cutoff score is the score of the last student who is offered one of the available grants, regardless of whether she has participated in Erasmus.
• The value of the running variable for each student is the maximum of her scores, normalised to the cutoff score, among the rankings in which she participates in the January call of her first year of application. The assignment indicator equals 1 if the running variable is greater than or equal to zero, and 0 otherwise.
Given the outcome of this allocation:
• If there are non-assigned grants, a second round of applications is launched and the procedure is repeated. Students can only submit one application in this second round.
• After having initially accepted an Erasmus grant, students can later renounce it or be rejected by the host institution for not fulfilling specific requirements (e.g. deadlines). The grants of renouncing or rejected students are not reallocated.
• Any student can re-apply in subsequent years within her study career.
We define treatment status as the binary indicator that takes value 1 if the student has ever participated in the Erasmus programme during her study career.
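The construction of the running variable and of the assignment indicator can be sketched in a few lines. This is only an illustration under stated assumptions: 'normalised to the cutoff score' is taken here to mean the score minus the cutoff (the exact normalisation is not spelled out in this appendix), and the programme labels, scores and cutoffs are hypothetical.

```python
from typing import Dict


def running_variable(scores: Dict[str, float], cutoffs: Dict[str, float]) -> float:
    """Maximum normalised score across the rankings a student entered.

    Assumption: normalisation is the difference between score and cutoff.
    """
    return max(scores[prog] - cutoffs[prog] for prog in scores)


def assignment_indicator(running: float) -> int:
    """1 if the running variable is at or above zero, 0 otherwise."""
    return 1 if running >= 0 else 0


# Hypothetical cutoffs for two programme-specific rankings.
cutoffs = {"prog_1": 70.0, "prog_2": 80.0}

# A hypothetical student who applied to both programmes in her first year.
student_scores = {"prog_1": 68.0, "prog_2": 83.0}

s = running_variable(student_scores, cutoffs)  # max(-2.0, 3.0)
z = assignment_indicator(s)
```

Because the running variable is a maximum, a student is classified as assigned as soon as she clears the cutoff in at least one of the rankings she entered.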

Derivation of cutoff score and running variable
Let us illustrate how the cutoff score and running variable are derived with an example. Consider that, in a given year, in the main call for applications that takes place in January, 15 students are enrolled and two specific Erasmus programmes exist to which they can apply. Let us assume that 5 students apply only for the first programme, another 5 students apply only for the second programme and the remaining 5 students apply for both. Let us further assume that students' application scores are as presented in Table A1. Once students have applied, they are ranked based on the scores assigned to their applications.
Then, those who are awarded an Erasmus grant decide whether to accept or turn down the grant offer. Let us assume that the outcome of this decision process aligns with the results presented in Table A2. In our example, due to students turning down offers and the reallocation of turned-down grants, for both programmes the position in the ranking of the last student who is offered a grant (the 6th student for the first programme and the 5th student for the second programme) is higher than the number of available grants (2 for both programmes).
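The way turned-down offers push the cutoff further down a ranking can be sketched as follows. This is a simplified illustration, not the administrative procedure itself: offers are modelled as moving sequentially down the ranking, and the scores and acceptance decisions are hypothetical, chosen only so that the last offer reaches the 6th position with 2 grants available.

```python
from typing import List, Optional, Tuple


def last_offer(
    ranked_scores: List[float], accepts: List[bool], n_grants: int
) -> Tuple[Optional[int], Optional[float]]:
    """Walk down a ranking, offering grants until all are taken or the list ends.

    ranked_scores: application scores sorted from best to worst.
    accepts[i]: whether the student in position i + 1 accepts an offer.
    Returns the 1-based position and the score of the last student offered
    a grant; that score is the qualifying cutoff for the ranking.
    """
    remaining = n_grants
    last_pos = None
    for pos, accept in enumerate(accepts, start=1):
        if remaining == 0:
            break  # every grant has been accepted
        last_pos = pos  # this student receives an offer
        if accept:
            remaining -= 1
    if last_pos is None:
        return None, None
    return last_pos, ranked_scores[last_pos - 1]


# Hypothetical ranking with 2 grants: four students turn the offer down.
scores = [95.0, 90.0, 88.0, 85.0, 80.0, 78.0, 70.0, 65.0]
decisions = [False, True, False, False, False, True, True, True]

position, cutoff = last_offer(scores, decisions, n_grants=2)
```

The cutoff is the score of the last student offered a grant whether or not she accepts, so a ranking in which every offer is declined has its cutoff at the score of the last eligible student.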
Then, for each student, the running variable is computed as the maximum value of the normalised score. The results are presented in Table A3.

Identification
Our setting has two notable aspects: first, students offered a grant can turn it down or renounce it, or may later be rejected; second, those who were not offered the grant in their first year of application can reapply in subsequent years and, if successful, receive the grant.
Thus, we have two-sided non-compliance. Identification in such designs may be invalidated by the presence of defiers (i.e. violation of the monotonicity assumption). As in standard settings with two-sided non-compliance, the absence of defiers cannot be directly tested. However, descriptive statistics can offer interesting insights. The Erasmus grant assignment mechanism allows us to explicitly define theoretical sub-populations of interest and to define the sub-populations on which we identify our causal parameter of interest.
In the remainder of this section, for exemplification purposes we assume that a student cannot participate in the Erasmus programme more than once in her study career (the share of students who do so is small in our data, accounting for 1.65% of the population of applicants). We also assume monotonicity (i.e. the absence of defiers).
First of all, we decompose our treatment indicator into the sum of two indicators: the first takes the value of 1 if the student participates in the Erasmus programme as an outcome of her first application year, while the second takes the value of 1 if the student participates in Erasmus as an outcome of applications in subsequent years. The assumption of single participation implies that the two indicators cannot both equal 1 for the same student. It is also important to underscore that a student can participate as an outcome of her first application year only if she is offered a grant in that year (i.e. one-sided non-compliance in the first application year). Given this setup, the two-period design implies that we may observe two types of 'always-taker' and two types of 'complier', characterised by early (ATI, CI) or later (ATII, CII) participation.
The limits of the two participation indicators on either side of the threshold identify combinations of these sub-populations, as do the limits of the overall treatment indicator. Assuming continuity of the probability of belonging to a potential sub-population in a neighbourhood of the threshold point, the differences between the right and left limits (i.e. discontinuities) also identify combinations of sub-populations. These quantities can be computed by estimating the 'first-stage' regressions for the overall treatment indicator and for each of the two participation indicators separately, and by taking the values at the two boundary points. In principle, even assuming the absence of defiers, as in our setting, the 4 sub-populations (ATI, ATII, CI, CII) are not separately identifiable from the data: since the two boundaries of the overall treatment indicator are the sums of the four boundaries of the two participation indicators, we can only rely on 3 pieces of information for 4 unknown quantities.
However, columns (3) and (6) in the bottom row of Table A6 (reporting limits at the threshold) show that the sum of CII and ATII is very small (3.6% for bachelor's students and 2.4% for master's students). Thus, we can reasonably argue that the share of CII is negligible. If the probability of being CII is exactly equal to zero, and under standard continuity assumptions on the potential outcomes for each sub-population in a neighbourhood of the threshold point, the parameter obtained in our design identifies the treatment effect on the CI sub-population, i.e. students who participate in the Erasmus programme as the outcome of their first application year and would not have participated in Erasmus had they not been offered a grant. This sub-population represents roughly 50% of bachelor's applicants and more than 60% of master's applicants at the threshold point (columns 1 and 4 of Table A6, respectively). On the other hand, the sub-population of those who would still participate in Erasmus during their career had they not been offered a grant in their first application year (ATI) represents more than 35% of bachelor's students and roughly 20% of master's students (columns 3 and 6 of Table A6, respectively).
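The bookkeeping behind this argument can be sketched as follows. This is one reading that is consistent with the text, not a reproduction of the original display equations. The symbols are notation assumed here for illustration: $S_i$ is the running variable, $D_i^{t}$ and $D_i^{t+}$ indicate participation following the first or a later application year, $\pi_g$ is the share of stratum $g$ at the threshold, and $\Delta$ denotes a right limit minus a left limit. The stratum definitions are also inferred: ATI participate early if offered and later if not, ATII participate later regardless of the offer, CI participate early only if offered, and CII participate later only if offered.

```latex
% Notation assumed for illustration; shares \pi_g are evaluated at S_i = 0.
% Maintained assumptions from the text: no defiers, single participation,
% and early participation only if a grant is offered.
\begin{align*}
D_i &= D_i^{t} + D_i^{t+}, \qquad D_i^{t}\, D_i^{t+} = 0, \\
\lim_{s \downarrow 0} E[D_i^{t} \mid S_i = s] &= \pi_{ATI} + \pi_{CI}, \\
\lim_{s \uparrow 0} E[D_i^{t} \mid S_i = s] &= 0, \\
\lim_{s \downarrow 0} E[D_i^{t+} \mid S_i = s] &= \pi_{ATII} + \pi_{CII}, \\
\lim_{s \uparrow 0} E[D_i^{t+} \mid S_i = s] &= \pi_{ATI} + \pi_{ATII}, \\
\Delta D &= \pi_{CI} + \pi_{CII}, \qquad
\Delta D^{t} = \pi_{ATI} + \pi_{CI}, \qquad
\Delta D^{t+} = \pi_{CII} - \pi_{ATI}.
\end{align*}
```

Under this reading, the left limit of $D_i^{t}$ is zero by construction, which leaves three informative boundary values for four unknown shares. Setting $\pi_{CII} = 0$ gives $\pi_{CI} = \Delta D$ and lets the two always-taker shares be read off the limits of $D_i^{t+}$.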

A.2 Similarities to centralised school assignment
Our research design shares some common features with the recent literature developing a methodology that generalises regression discontinuity designs to allow for multiple cutoffs and multiple running variables (Abdulkadiroglu et al., 2017, 2022). The context of this literature is centralised school assignment, where the matching of children to schools is realised through a scheme that takes as input information on applicant preferences (parents provide a preference order of schools to which they apply) and school priorities (in each school, children are assigned to priority groups based on observable family characteristics). Given preferences and priorities (labelled parental 'type'), the offer of scarce school seats is determined by tie-breaker rules that can be in the form of a lottery or are 'general' (non-lottery, e.g. test scores). Parental type is likely correlated with potential outcomes; general tie-breakers play the role of a running variable in a regression discontinuity design and are likewise a source of omitted variable bias. The authors show how, in their context, omitted variable bias is eliminated by controlling for a local propensity score, i.e. the ex-ante probability of receiving an offer quantified as a function of a few features of student type and tie-breakers, such as proximity to the admissions cutoffs and the identity of key cutoffs for each applicant, which they show to have a much coarser distribution than the underlying type distribution. Conditional on the local propensity score, school assignments are shown to be asymptotically randomly assigned, and school seat assignment provides a credible instrument for school enrolment.
More specifically, for each school they classify applicants as conditionally seated if their (school-specific) tie-breaker value is in the neighbourhood of the school admission cutoff (i.e. within a distance h of the cutoff, where h is the selected bandwidth); always seated if the tie-breaker value is above the neighbourhood of the admission cutoff (more than h above it); never seated if the tie-breaker value is below the neighbourhood of the admission cutoff (more than h below it). The limiting local probability of assignment of a seat at each school is 0.5, 1 and 0, respectively, for the three groups of applicants. The probability of being assigned to a specific school is derived as the school seat assignment probability at that school multiplied by the probability of being excluded from preferred schools, i.e. the disqualification rates at preferred schools, which depend on priorities and key cutoffs at the preferred schools. The propensity score for assignment at any school with a given characteristic is then derived as the sum of the propensity scores for the individual schools with the characteristic of interest (because the assignment algorithm generates a single offer for each child). The authors then estimate a two-stage least squares (2SLS) model with saturated controls for the local propensity scores and local linear controls for non-lottery tie-breakers. Saturated regression-conditioning on the local propensity score eliminates applicants with score values of zero or one, and local linear controls for general tie-breakers are implemented only for applicants to schools in which students are 'conditionally seated'.
Relative to Abdulkadiroglu et al. (2022), our setting differs in terms of some key features. Namely, the matching of students to Erasmus grants does not depend on programme priorities (no applicant is granted priority in any programme); the preference order of applicants to more than one specific programme (i.e. the maximum of two programmes) is not explicit at the moment of application, and only when the first programme-specific rankings are published are preferences potentially revealed, at which time students sort into the preferred programme among those offered. Given only the participants' set of applications, ties are broken in favour of applicants with the highest tie-breaker value, i.e. application score, which is the single non-lottery tie-breaker characterising our setting. For these reasons, the propensity score in our setting is simply the probability of being seated in at least one programme. This can take only the values of 0, 1/2, 3/4 or 1. In more detail, the propensity score takes a value of 0 for students who submit one application and are classified as never seated and students who submit two applications and are classified as never seated in both; 1 for students who submit one application and are classified as always seated and students who submit two applications and are classified as always seated in at least one application; 1/2 for students who submit one application and are conditionally seated and students who submit two applications and are conditionally seated in one and never seated in the other; 3/4 for students who submit two applications and are conditionally seated in both. Estimating our model by means of local linear regressions makes our approach comparable to that proposed by Abdulkadiroglu et al. (2022). In particular, excluding individuals with a maximum score above (below) the selected bandwidth, i.e. the always (never) seated, implies excluding individuals with a propensity score equal to 1 (0). Including controls for the remaining two values of the propensity score leaves the results unchanged, as shown in Table A7.

The table reports the results from the estimation of the model described in equations 1 and 2 with the inclusion of indicators of the propensity score taking the values 1/2 or 3/4, for the four main outcomes.

Notes: The table reports the results of the estimation of a reduced-form equation (columns 1, 3, 5 and 7 of both panels) and of an IV regression (columns 2, 4, 6 and 8 of each panel) including the control for whether the propensity score takes a value of 1/2 or 3/4, for samples of bachelor's (panel a) and master's (panel b) students with a running variable within a bandwidth of 0.1, for the four outcomes of interest. All specifications include a polynomial of the running variable of order 1 and are estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

The number of bins is calculated with the mimicking variance evenly spaced method using spacings estimators. The relationship is fitted with a polynomial of order three.

Notes: The table reports summary statistics for the final samples of students who never applied for Erasmus and students who applied for Erasmus at least once in their study career, separately by degree level (bachelor and master). The final samples are made up of students who enrolled in the first year of a study career at the University of Bologna from the 2007/2008 academic year onwards and who had graduated by the end of 2019 (when the data were extracted).

Notes: The table reports the results from the estimation of the first-stage equation (eq. 2) for the samples of bachelor's (column 1) and master's (column 2) students, including a polynomial of the running variable of order 1, estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

Notes: The table reports the results of the estimation of a reduced-form equation (columns 1, 3, 5 and 7 of both panels) and of an IV regression (columns 2, 4, 6 and 8 of each panel) for four alternative measures of time to graduation: the probability of graduating in July of the last year of the degree's legal duration (3rd year for bachelor's students and 2nd for master's students), in columns 1 and 2 of both panels; October of the last year of the degree's legal duration, in columns 3 and 4 of both panels; December of the last year of the degree's legal duration, in columns 5 and 6 of both panels; March of the year following the last year of the degree's legal duration, in columns 7 and 8 of both panels. The estimations are performed on samples within a bandwidth of 0.1. All specifications include a polynomial of the running variable of order 1 and are estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

Notes: The table reports the results of the estimation of the reduced-form equation (columns 1 and 3 for the samples of bachelor's and master's students, respectively) and of the IV regression (columns 2 and 4 for the samples of bachelor's and master's students, respectively) for the average exam mark at the end of one's study career, before graduation. The estimations are performed on samples within a bandwidth of 0.1. All specifications include a polynomial of the running variable of order 1 and are estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

Notes: The table reports the results of the estimation of IV regressions on the sample of master's students with a running variable within a bandwidth of 0.1, for four outcomes: the probability of graduating without delay; the time to graduation measured in months; the final graduation mark; and a dummy for graduating with distinction. The reported coefficients are those of the interaction between given student characteristics of interest and the endogenous treatment variable, instrumented with the interactions of the predictions of the first-stage regression and the same characteristic. All specifications include a polynomial of the running variable of order 1 and are estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

Notes: The table reports the results of the estimation of the reduced-form equation on the sample of master's students with a running variable within a bandwidth of 0.1, for four outcomes: the probability of graduating without delay; the time to graduation measured in months; the final graduation mark; and a dummy for graduating with distinction. The reported coefficients are those of the interactions between our instrument and the given characteristic of interest. All specifications include a polynomial of the running variable of order 1 and are estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

B Appendix B

B.1 Additional Figures and Tables
Notes: The sample is composed of all applicants to Erasmus study abroad programmes between 2013/14 and 2018/19 who have graduated from a bachelor's degree, for whom there is information in the administrative data on study careers, and who participated in the Erasmus programme. Observations are at the student-exam level. 'During Erasmus' is defined as the calendar year of exam dates registered as taken while abroad, and 'after Erasmus' identifies subsequent calendar years. Errors are clustered at the degree-course level. Robust standard errors are in parentheses.

Figure B1: Density of the running variable. (a) Sample of bachelor's students

Figure B2: Intention-to-treat graphs. (a) Sample of bachelor's students

Figure B3: Main results: IV estimates with different bandwidths. (a) Bachelor. Panel titles: Prob. of graduating on time; Time to grad. (months); Final mark; Prob. of distinction

Table A2: Example: outcome of ranking and decision process

Table A3: Example: running variable and final qualifying ranking

Table A4 reports the share of students by treatment status (rows) and by whether the running variable (i.e. a student's maximum normalised score among the rankings in which she participated in her first year of application) is below or above the cutoff score (columns). Treatment status distinguishes: non-treated students (i.e. students never participating in Erasmus during their study career) who do not apply for Erasmus in the years following their first year of application; non-treated students who apply for Erasmus in subsequent years; treated students participating in Erasmus as the outcome of applying in their first year of application; and treated students participating in Erasmus as the outcome of applying in subsequent years. The results in Table A4 show that (i) the majority of students (approximately 80%) with running variables above the cutoff score, i.e. who are offered a mobility grant in their first year of application, are treated in that year, i.e. accept the scholarship the first time they have the chance of

Table A6: Discontinuities in time-varying treatment indicators

Table A7: Main results controlling for propensity score
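The propensity-score values used as controls in Table A7 follow from the seating classification described in Section A.2. A minimal sketch is given below, under two assumptions that are not stated as code in the paper: the local seat probabilities (1, 1/2, 0) are combined as if offers in the two programmes were independent, which is what the value of 3/4 implies, and each score is already normalised to its programme cutoff. Function and variable names are hypothetical.

```python
from fractions import Fraction
from typing import List

# Limiting local probability of a seat for each class of applicant.
LOCAL_PROB = {
    "always": Fraction(1),
    "conditional": Fraction(1, 2),
    "never": Fraction(0),
}


def seat_class(normalised_score: float, h: float) -> str:
    """Classify an application relative to the cutoff (at 0) and bandwidth h."""
    if normalised_score > h:
        return "always"
    if normalised_score < -h:
        return "never"
    return "conditional"


def propensity_score(normalised_scores: List[float], h: float) -> Fraction:
    """Probability of being seated in at least one programme.

    Assumption: seat offers across programmes are treated as independent.
    """
    p_no_seat = Fraction(1)
    for score in normalised_scores:
        p_no_seat *= 1 - LOCAL_PROB[seat_class(score, h)]
    return 1 - p_no_seat


h = 0.1  # bandwidth used in the main specifications
one_conditional = propensity_score([0.05], h)
both_conditional = propensity_score([0.05, -0.03], h)
conditional_and_never = propensity_score([0.05, -0.5], h)
always_in_one = propensity_score([0.5, -0.5], h)
never_in_both = propensity_score([-0.5, -0.5], h)
```

Restricting the sample to a running variable within the bandwidth drops the applicants with a score of 0 or 1, leaving only the values 1/2 and 3/4 to control for.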

Table B1: Representativeness of the sample of students from the University of Bologna
Notes: The table reports some selected statistics on the composition of the sample of students enrolled in their first university career at the University of Bologna from the 2010/2011 to the 2019/2020 academic year (column 1) and of the population of higher education students first enrolled in any Italian university in the same time period (column 2). Source: Italian Ministry of Education (data extracted from http://dati.ustat.miur.it/dataset/immatricolati).

Table B2: Descriptives on the initial sample of applications
Notes: The table displays summary statistics for the sample of all applications for an Erasmus grant funding a period of study abroad submitted by students from the University of Bologna between the 2013/2014 and 2018/2019 academic years.

Table B3: Descriptive statistics on the final sample of Erasmus applicants and non-Erasmus applicants

Table B4: First stage

Table B5: Alternative measures of time to graduation

Table B6: Effect on GPA before graduation

Table B7: Heterogeneity of effects across student characteristics - Master sample - IV estimates. (a) Differential effects by field of study

Table B8: Heterogeneity of effects across programme characteristics - Bachelor sample - Reduced-form estimates. Quality measured relative to the University of Bologna. (a) Differential effects by quality of host institution - above University of Bologna
Notes: The table reports the results of the estimation of the reduced-form equation on the sample of bachelor's students with a running variable within a bandwidth of 0.1, for four outcomes: the probability of graduating without delay; the time to graduation measured in months; the final graduation mark; and a dummy for graduating with distinction. All specifications include a polynomial of the running variable of order 1 and are estimated using triangular kernels. Errors are clustered at the programme-year-specific ranking level. Robust standard errors are in parentheses.

Table B9: Heterogeneity of effects across programme characteristics - Master sample - Reduced-form estimates. (a) Differential effects by quality of host institution - top 100. Differential effects by quality of host institution - top 100 - and length of study abroad period

Table B10: Evidence from data on exam grades