Can role models help encourage young people to apply to (selective) universities? Evidence from a large scale English field experiment

Under-participation in selective universities lowers social mobility in England, the United States, and elsewhere. English universities have standardized tuition costs, and strongly heterogeneous graduate earnings. Attending a selective university is therefore strongly incentivized, yet under-participation is extensive. The British Government sent 11,104 “nudge” letters to school students whose prior attainment made them competitive for entry into selective universities, urging them to consider that option. We evaluate this RCT and find it effective at raising the number of students who apply to, and accept offers from, selective universities. We find the cost to be low relative to outcomes.


Introduction
In England, the university you attend matters for your future finances. Alumni of universities ranked highly by five year salary typically earn over twice as much as their compatriots who attended lower ranked universities, taking subject studied into account. For that reason, access to particular universities, as well as access to university per se, matters for social mobility. Here we set out and evaluate a British Government organized randomized controlled trial to try to "nudge" school students into making better choices.
The paper proceeds in a conventional way. First we discuss the problem to be solved, setting out briefly the necessary background information for those unfamiliar with the English university system.
We go on to survey the relevant literature on using "nudge" techniques in this area. We then describe the intervention. Next we set out the data used in the evaluation. This is followed by the analysis, and the results. Finally, we conclude.

Context
In 2016, just 3.6% of school-aged applicants from the least advantaged areas of England entered a highly-selective university, compared with 21.3% of those from the most advantaged areas [UCAS, 2016]. This is despite a dramatic increase in the proportion of people from poorer areas attending university: up 73% compared with ten years previously [UCAS, 2016]. Given the salary premium associated with attending a selective institution, this is an issue of significant policy interest (Britton et al., 2016).
The main reason for this gap is that selective universities select their intake predominantly by the subjects that would-be students have studied and the grades they have achieved in national academic exams. Subjects studied, and particularly grades achieved, are correlated with social class. Nevertheless, although prior attainment can explain much of the disparity in entry rates, Anders (2012) found 'gaps' in university application behavior which suggest students from poorer backgrounds were less likely to apply to universities, and particularly selective universities, than richer students with equivalent qualifications. A similar pattern has been observed in the USA (Hoxby & Avery, 2013; Smith, 2013). This is the issue with which we are concerned, and it is made more pressing by evidence that there are positive returns to attending more 'selective' institutions (Zimmerman, 2019; Goodman et al., 2017; Cohodes and Goodman, 2014), so this disparity not only reflects, but exacerbates, economic inequality. To understand how this might happen, and what could be done about it, we need to understand how the English university system works, and the opportunities and incentives for those involved.
English higher education has a unique structure. Universities and colleges are independent self-governing institutions. That said, they overwhelmingly charge the same fees, have the same course lengths, and offer access to the same (government subsidized) loans for tuition and maintenance. None charge upfront fees. The names of universities with strong graduate earnings can easily be found online: Wikipedia provides a good list of the various lists.
Applicants use a single, common, low-cost application procedure when applying to university. This allows them to apply to up to five universities, in each case for a specific subject. There is a standard charge: £13 if you apply to one university, £24 if you apply to more than one, irrespective of the universities applied to. The principal information in the application is the qualifications and grades from national exams taken at age 16 ("GCSEs"), the subjects currently being studied, and the likely grades (A levels). The application also contains a personal statement by the would-be student, and a reference from the school.
Selective universities make offers to those applicants that they wish to admit, and applicants receiving more than one offer are able to choose which one to accept. There are procedures for those who receive no offers. Universities reserve no places for children of alumni, nor can one gain entry by being exceptional at sport, music or other extra-curricular activity. No preference is given to students already living in the locality.
A would-be student with good grades in relevant subjects will be competitive for entry into a wide range of universities. All options have the same tuition costs, none of which are borne upfront. In contrast, likely labour market trajectories are very different according to university attended. In that context it makes sense for the would-be student to apply to, and attend, a selective university. At the very least a "libertarian paternalist" would want them to consciously consider such an option.
Universities want to recruit from the widest talent pool. This is inherent in being a selective institution, but they are also required to widen access, particularly by social class. In the period under consideration all universities wanting to charge the maximum (i.e. usual) tuition fee had to agree annual 'widening participation access agreements' with a government body, the Office for Fair Access. These agreements are not trivial: universities and colleges currently spend £833 million to meet these obligations [OFFA (2017)].

Literature
The behavioral economics literature is now well-established, and interventions based on its findings have been widely applied in an educational context (Page et al., 2016; Lavecchia et al., 2016). Perhaps the most clear-cut result is the power of the default effect, whereby people remain with the default option more often than we might predict, even when good information is available to make a better choice. A good example of the power of explicit defaults in university applications comes from the US. In 1997, ACT removed the $6 fee for sending test scores to a fourth college, at which point the proportion of students submitting four test reports rose from 3% to 74% (Pallais (2009)). The default number of free applications strongly drove behavior.
The creation of social or familial norms can act as soft defaults, with similar effects to more explicit default options (Akerlof and Kranton (2002)). This is because individuals often rely on rules of thumb to make complex decisions about their future, often subconsciously, even when good information is readily available (Thaler & Sunstein (2009)). This is the relevant area for our work.
Social and familial norms mean that students considering universities may well minimize the extent to which their application strategies differ from those of their peers, even when those patterns are sub-optimal for them (Fryer and Torelli (2010)). Related to this is the 'availability heuristic', which states that individuals gauge the likelihood of an event by how readily they can recall examples of it.
In this case, if a student cannot readily recall someone who has applied to a selective university, they may assume that their own chances of being successful are low (Tversky and Kahneman (1973)).
Individuals from disadvantaged backgrounds are less likely to have friends or family who can provide reliable advice on universities. These students are therefore more likely to rely on incorrect information (Dynarski & Scott-Clayton (2013)). In the US at least, this is further compounded by potential social costs. Black students engaging in behaviours associated with career success, such as attending university, are sometimes said to be "acting white", and risk peer rejection (Fryer (2006)).
What is critical here is that the default effects are unlikely to be optimal. It is difficult for a strong social norm of attending a selective university to emerge, given that by definition many applicants are rejected. Furthermore, if the default is to stay local, many areas don't have a selective university nearby.
The need to travel to attend a selective university means that attending such an institution will entail commuting or relocation costs which can be high relative to ability to pay, at least for poorer students. Moreover, physical distance can translate into psychic costs, particularly for students with no family history of university participation. Recent analysis of English administrative data has shown that, for those students who go to university, distance is the strongest factor influencing choice, particularly for those from low-income households (Gibbons & Vignoles, 2009).
There is growing interest in using behavioral science to improve education outcomes for low-income students. This has generated a range of behavioural interventions to encourage students to apply to university and college. Some of these focus on improving the salience and availability of good information for students. For example, McGuigan et al. (2016) found that a low-cost information campaign in UK schools had a striking effect on students from low-income backgrounds, leading to an 11 percentage point reduction in the proportion stating that university was too expensive for them. Informational resources have also proven effective in other education systems (see for example Oreopoulos and Dunn (2013) and Dinkleman and Martinez (2014)).
However, although information-only interventions appear to improve understanding, it is less clear that this translates into action. Bergman and Denning (2016) found that sending information about financial aid to over 80,000 US high school seniors had no statistically significant effects on college enrolment. In contrast, Bettinger et al. (2012) found that coupling information provision with personalized assistance in completing the FAFSA (Free Application for Federal Student Aid) increased the proportion of low-income high-school seniors attending and completing two years of college from 28% to 36%.
Although few in number, some trials have focused on encouraging students to apply to more selective institutions (see for example Goodman (2016) and Pallais (2015)). More relevant to our study are experiments which explore how to make information on highly-selective institutions more salient to low-income individuals. Most notably, in a large-scale field trial Hoxby and Turner (2013) tested the effect of sending high-achieving students a semi-personalised package of information on college options alongside a fee waiver. They found that the intervention caused students to submit 19% more college applications. Critically, they applied to, and enrolled in, more selective institutions than the control group. Even though these students enrolled in more competitive courses, their freshman grades were as good as students in the control group, suggesting that their academic performance was not impaired.

The intervention
The intervention outlined here draws strongly on the successful work of Hoxby and Turner (2013), and on subsequent attempts to replicate this work, particularly Bird et al. (2019), and Gurantz et al. (2019a, b). The aim was to encourage English students who had achieved good results in national exams the previous summer to consider applying to more selective universities than they would otherwise have done. To that end, we conducted a large scale field experiment in the academic year 2013-2014. Put simply, some likely would-be university applicants received letters urging them to think widely about where to apply, and others did not. Due to the logistical difficulties in randomising at the individual level (in particular that when the first letters were to be sent, the home addresses of students would not be known), the intervention was randomly allocated at the level of the school, such that every student eligible to be in our sample within a school received the same allocation. We then compared the results.
Eligibility for the trial was determined by the UK government Department for Education (DfE), and consisted of two sequential tests. First, the school had to meet one criterion, and then, within qualifying schools, students had to meet two criteria.
The school criterion was that more than 20% of high achieving students who went on to higher education did so at a single, local institution. This served as a proxy that students in that school could likely benefit from thinking more broadly and imaginatively about which university was right for them. In no case was the local institution drawn from the selective end of the spectrum.
Once we had identified the schools in scope, we then considered the students currently studying at that school. Students had to meet two criteria: to be in year 12, and to be in the top 20% of students nationally by attainment at the end of the previous academic year. Students are aged 16 at the start of year 12. Year 12 was chosen because it is the year in which students consider university choices, visit campuses and so on, prior to applying early in year 13. Students in the top 20% were chosen because we would expect them to be competitive for entry into selective universities. Specifically, students needed 367 or more grade points in their 8 best GCSEs (the high stakes academic exams taken at the end of year 11).
Our selection criteria gave us 11,104 students in 300 schools, around 1.5% of the cohort nationally.
Schools were assigned at random to one of four equal-sized conditions. Randomization was stratified by a quartile split of the proportion of students in a school that were eligible for free school meals.
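The allocation step can be illustrated with a minimal Python sketch of stratified, school-level randomization. This is not the procedure actually used in the trial: the school identifiers, the free school meal shares, the arm labels and the rule of rotating the starting arm across strata (to keep the arms equal-sized) are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical labels for the four trial conditions.
ARMS = ["control", "school_letter", "home_letter", "both_letters"]


def assign_schools(fsm_share, seed=0):
    """Assign schools to four arms at random within FSM quartile strata.

    fsm_share: dict mapping school id -> share of pupils eligible for
    free school meals (FSM).
    """
    rng = random.Random(seed)
    ranked = sorted(fsm_share.values())
    n = len(ranked)
    # Quartile cut-offs read off the sorted shares (an illustrative choice).
    cutoffs = [ranked[n // 4 - 1], ranked[n // 2 - 1], ranked[3 * n // 4 - 1]]

    strata = defaultdict(list)
    for school, share in fsm_share.items():
        quartile = sum(share > c for c in cutoffs)  # 0 (lowest) to 3 (highest)
        strata[quartile].append(school)

    assignment = {}
    for quartile in sorted(strata):
        schools = sorted(strata[quartile])
        rng.shuffle(schools)
        # Deal shuffled schools round-robin, rotating the first arm by
        # stratum so that leftover schools are spread across arms.
        for position, school in enumerate(schools):
            assignment[school] = ARMS[(position + quartile) % len(ARMS)]
    return assignment


# Made-up data: 300 schools with random FSM shares.
data_rng = random.Random(1)
fsm_share = {f"school_{k:03d}": data_rng.random() * 0.5 for k in range(300)}
assignment = assign_schools(fsm_share)
counts = {arm: sum(a == arm for a in assignment.values()) for arm in ARMS}
print(counts)
```

Because assignment happens at school level, every eligible student inherits the arm of their school, which is why the analysis later clusters errors by school.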
The four conditions are described below.

1: Control
Participants attending these schools received no letters, and continued their educational experience as they would have done in the absence of the trial. They were unaware of the trial.

2: One letter, sent via the School in November 2013
Participants in this group received a hand-signed letter from Ben, a second year student at the University of Bristol. The letter is reproduced in full in Annex A. In essence it said that they had many choices, and should not jump to conclusions. The words were his own, but he was asked to build a connection, and to emphasize that more prestigious universities are not necessarily more expensive to attend, and may well be cheaper.
The letter was designed to meet behavioral science best-practice. First, it was written by Ben himself, and is in natural language. This was reinforced by being hand-signed. Ben was someone with whom we hoped the recipient would connect. He emphasized this at the outset by stating that he had been in their position. Second, Ben appeared knowledgeable, setting out some basic facts about finance, as well as giving a pointer towards more information. Ben also appeared authoritative: he had got through the process unscathed, had ended up at a highly selective university and was pleased with his choice. The University of Bristol is well-known in England. It is ranked 17th in the UK Complete University Guide 2018 [10], and 44th in the world in the QS rankings [11]. It is highly selective: in 2016, Bristol received 43,930 applications for 5,315 places [12].
These letters were placed inside individual envelopes addressed to the student, which were in turn shipped inside a larger envelope addressed to the school's head teacher, with a request that they distribute the letters to the relevant students. We are not aware of any schools refusing to do so, and some heads wrote back to thank us.
[10] https://www.thecompleteuniversityguide.co.uk/league-tables/rankings
[11] https://www.topuniversities.com/
[12] UCAS, End of Cycle 2016 Data Resources: DR4_001_02 & DR4_001_03. UCAS Analysis and Research (2017), accessed at: https://www.ucas.com/corporate/data-and-analysis/ucas-undergraduate-releases/ucas-undergraduate-end-cycle-data-resources/applicants-and-acceptances-universities-and-colleges---2016

3: One letter, sent directly to the student's home in April 2014
Participants in this group were sent letters to their home addresses in April 2014 (these addresses had not been available in November). These letters were sent by a female second year student, Rachel, at the University of Bristol. The letter is reproduced in Annex B. Again, the letter was written by the student herself, and the content was substantively the same as the first letter.

4: Two letters, one via school in November 2013, one to home in April 2014
Participants in this group received both the letter from Ben in November 2013, from their school, and the letter from Rachel to their home, in April 2014.
No further deliberate actions were taken as part of the trial. The letters may have prompted the recipients themselves to take further actions: to talk to teachers and others, look at websites, visit universities and so on. Participants, including those in the control, were able to apply to universities as normal in the autumn of 2014, via the standard UCAS process.

Data Description
The English education system offers particularly good data for researchers. The UK Government's Department for Education compiles the "National Pupil Database" (NPD), which records comprehensive information about pupils. This includes their name and address, eligibility for free school meals, gender and ethnicity. It also includes the school that they attend, and the qualifications and grades that they are awarded. The qualification and grade data come from exam boards, and can be considered authoritative. These data are made available at student level to bona fide researchers [13]. The Higher Education Statistics Agency (HESA) records who attends which university. Department for Education statisticians have matched this dataset to the NPD, and the data are made available at student level to researchers as a matched dataset.
The matched dataset allowed us to identify the schools that met the criteria for inclusion in the trial. The NPD was then used on its own to identify the students within those schools who were in the right school year and had the relevant GCSE grades to be included.
[13] https://www.gov.uk/government/collections/national-pupil-database
UCAS data cover who has applied to which university, whether the universities to which they have applied offer them a place, and whether the applicant has then accepted the place. These data are provided via the UK Government's Department for Education. For confidentiality reasons, they are not available at the level of the individual student. Rather they are available at school level. We know how many people in each school applied to university, received an offer from a university, and accepted an offer. The information is not given by individual university. Instead we know it for all universities, and for "Russell Group" universities combined. The Russell Group is a membership organization of selective universities. All Russell Group universities are selective, but not all selective universities are members of the Russell Group. Our variables of interest will be the proportion applying to the Russell Group, being made an offer by a Russell Group university, and accepting such an offer. Data are provided in such a way that allows us to identify, for each student, which school they attended (and hence, which treatment group they were assigned to), and whether they applied to, were made an offer by, and accepted an offer from, any university, or a selective institution. For reasons of preserving anonymity, we are not able to identify participants' ethnicity or gender, and so cannot conduct this kind of subgroup analysis.
We conducted ex ante power calculations, based on 300 school settings and 11,104 students randomised (an average of 37 students per school). Following Kerry and Bland (1998), and assuming an intra-cluster correlation rate of 0.1 (as is commonly used in education trials in the UK), we are powered to detect effects equivalent to a Cohen's h of 0.16, a fairly small effect.
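The arithmetic behind a calculation of this kind can be sketched in Python using the standard design effect for cluster randomization, 1 + (m - 1) × ICC, where m is the average cluster size. The paper does not state the significance level, the target power or the comparison being powered, so the 5% two-sided test, 80% power and single-arm-versus-control comparison below are assumptions, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_h(n_students, n_schools, n_arms, icc,
                         alpha=0.05, power=0.80):
    """Smallest Cohen's h detectable when one arm is compared with control.

    alpha and power are assumed defaults, not values taken from the paper.
    """
    cluster_size = n_students / n_schools          # average students per school
    design_effect = 1 + (cluster_size - 1) * icc   # variance inflation from clustering
    n_per_arm = n_students / n_arms                # equal-sized arms
    effective_n = n_per_arm / design_effect        # equivalent independent sample per arm
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / effective_n)


h = minimum_detectable_h(n_students=11_104, n_schools=300, n_arms=4, icc=0.1)
print(round(h, 2))
```

Under these assumed settings the design effect is about 4.6, so each arm of roughly 2,776 students carries the information of about 600 independent observations, and the resulting minimum detectable effect is close to the 0.16 reported above.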

Analytical Strategy
The hypothesis to be tested is that the intervention will increase the proportion of people applying to a selective university. This will in turn cause a rise in the number who receive an offer, and a rise in the number who accept the offer. We do not, however, expect a change in the ratio of applications to offers, or offers to acceptances. If we find that there is a fall in the proportion who receive an offer, then we would need to investigate whether our intervention was leading uncompetitive students to apply, or whether their applications were weak (personal statements, school references), or whether universities were not treating them fairly. If we find that they receive offers, but do not accept them, we would need to consider what supplementary actions would be needed to convert interest into acceptances.
There is no expectation that the intervention will lead to a rise in the number applying to university overall, since the letters were sent to students whose grades make it likely that they will apply to university in any case. Nevertheless, and for completeness, we report the outcomes for all universities as well.
We need to define what is meant by a "selective" university. All universities are selective to an extent, in that no university will accept every candidate who applies. Some universities are, however, much more selective than others.
We therefore need to separate universities into two groups: more selective and less selective. There are two alternative approaches: to use our own definition, or use an external, exogenous definition.
It would be possible to create our own definition of what constitutes a selective university, using, for example, data on the number of applications per place, or the average grade of candidates admitted, or some combination thereof. Both are plausible strategies, but we would have to set an essentially arbitrary cut off, dividing those we deem to be selective from those we deem not to be selective. There is a risk that the result would be determined by the definition of selective.
To avoid any risk of unconscious bias leading us to use a definition favorable to finding that the hypothesis is supported, we choose instead to use an exogenous definition of selective. Specifically, we follow a well-established definition of selective, membership of the "Russell Group". The Russell Group is a membership organization for research-intensive, world-class universities (http://russellgroup.ac.uk/about/). All 24 Russell Group universities are ranked in the QS global top 250 (http://russellgroup.ac.uk/media/5524/rg_text_june2017_updated.pdf). They are typically twice the size of non-Russell Group universities, and have faculty student ratios twice that of non-Russell Group universities. More than 400,000 undergraduates study at these universities, and they are the dominant suppliers of graduate courses, including PhDs. Of the 24 Russell Group universities, 20 are in England, with a good geographical spread, from Exeter to London to Liverpool to Newcastle.
The use of the Russell Group as a valid definition for selective universities is well-established. When recording the destinations of UK school leavers, the UK Department for Education divides universities into three groups: Oxford and Cambridge Universities, the Russell Group, and other universities (https://www.gov.uk/government/news/government-publishes-destination-data-for-the-first-time). The Russell Group account for around 1 in 5 of all undergraduates. Other researchers, such as the Sutton Trust, use the Russell Group as a proxy for selective universities (https://www.suttontrust.com/newsarchive/access-highly-selective-universities-stalls/).
We have a principal set of outcome measures: applying to, being made an offer by, and accepting an offer from a Russell Group university. In addition, we report whether there are changes in the number of students who apply to, are made an offer by, or accept an offer from any university.
We conduct balance checks to ensure that randomisation has been successful; these are reported in Annex C. We find no significant evidence of imbalance.
Our model takes the following form.
$$Y_{is} = \alpha + \gamma T_s + u_s \qquad (1)$$
$Y_{is}$ is a binary outcome variable. It takes a value of 1 if student $i$ in school $s$ applies, is offered a place, or accepts a place; otherwise it takes a value of 0. $\alpha$ is a constant, while $T_s$ is a vector covering the three binary treatment assignments (letter to school, letter to home, both letters), with coefficient vector $\gamma$. $u_s$ is an error term clustered at the level of the school. It is clustered at school level because the treatment is assigned at school level. This accounts for the lack of independence between observations within a school.
In addition to estimating the effects of each treatment independently, we also estimate a pooled effect of receiving at least one letter. We make use of a linear probability model, with logistic and probit regressions used for robustness analysis, as each form of analysis, given the simplicity of our model, estimates differences in cell means.
Table 1 reports the results for selective universities. This shows whether the letters increased the number of students who applied (column 1), were made an offer (column 2), and accepted an offer (column 3). Columns 4-6 contain pooled regressions, reporting whether receiving any letter had an effect on the outcome measures. We find a large and statistically significant impact on the probability of young people in our sample applying to Russell Group universities if they receive both letters compared to if they receive no letters. We see a positive, but not statistically significant, impact on the likelihood of being made an offer (p=0.059), and a positive and statistically significant increase in the likelihood of accepting an offer from a selective university for participants who receive both letters. Participants in this condition were 2.9 percentage points (33%) more likely to accept an offer from a Russell Group university than were participants in the control group.
We find no statistically significant impacts of any of our treatments, or all treatments pooled, on the likelihood of applying to university overall, although the results for young people who receive both letters are consistently positive for applications (p=0.16), offers (p=0.18), and acceptances (p=0.19).
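The statement that a model containing only treatment indicators estimates differences in cell means can be checked with a short, self-contained Python sketch. The data are simulated with made-up acceptance rates and arm sizes, the arm labels are hypothetical, and the small least squares routine stands in for a statistics package. The sketch covers point estimates only and does not compute the school-clustered standard errors used in the analysis.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


ARMS = ["control", "school_letter", "home_letter", "both_letters"]
# Made-up acceptance rates, for illustration only.
assumed_rate = {"control": 0.09, "school_letter": 0.10,
                "home_letter": 0.10, "both_letters": 0.12}

rng = random.Random(0)
arm_of, y = [], []
for arm in ARMS:
    for _ in range(500):
        arm_of.append(arm)
        y.append(1 if rng.random() < assumed_rate[arm] else 0)

# Design matrix: intercept plus one dummy per treatment arm (control omitted).
X = [[1.0] + [1.0 if a == t else 0.0 for t in ARMS[1:]] for a in arm_of]
alpha, *gamma = ols(X, y)

means = {t: sum(yi for a, yi in zip(arm_of, y) if a == t) / arm_of.count(t)
         for t in ARMS}
for t, g in zip(ARMS[1:], gamma):
    print(t, round(g, 4), round(means[t] - means["control"], 4))
```

The intercept reproduces the control group mean and each coefficient reproduces the gap between an arm's mean and the control mean, which is why the choice between linear, logit and probit specifications matters little for a model this simple.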

Secondary Analysis
In addition to these analyses, we investigate whether the effect of our interventions differs depending on the characteristics of the school. Specifically, we estimate the interaction between our pooled treatment (receiving at least one letter), and either the amount of deprivation in the area surrounding the school, or the performance of the school measured using Progress 8 [14], the main performance metric used by the Department for Education.
There are three possible outcomes from this analysis. If participants in lower quality schools are more influenced by the intervention, this would suggest that the role model intervention helps to alleviate some deficit in aspiration provided by the teachers themselves, or that teachers operating under constraints optimize by prioritizing investments in outcomes other than aspiration. If the effect is largest for participants in the most highly performing schools, this would indicate complementarity between teacher investment and the existence of a role model, while no interaction between the effects and school quality would suggest that the impact is orthogonal to the performance of the school itself.
Table 3 shows the results of these analyses. In the interests of parsimony, we consider only acceptances of offers in this analysis. Column 1 regresses acceptances of offers from any university on pooled (any letter) treatment assignment, an indicator of whether the school has received a Progress 8 score of below minus 0.5, and an interaction between the two.
Column 2 regresses acceptances of offers from Russell Group universities on pooled treatment assignment, an indicator of whether the school has received a Progress 8 score of below minus 0.5, and an interaction between the two.
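Written out in the notation of equation (1), the regressions described for Table 3 would take roughly the following form. This is a sketch: the symbols $\mathrm{Any}_s$ (school $s$ was assigned at least one letter), $\mathrm{Low}_s$ (school $s$ has a Progress 8 score below -0.5) and the coefficient names are introduced here for illustration and do not appear in the original.

```latex
Y_{is} = \alpha + \beta\,\mathrm{Any}_s + \delta\,\mathrm{Low}_s
       + \theta\,(\mathrm{Any}_s \times \mathrm{Low}_s) + u_s
```

Under this specification $\beta$ is the treatment effect in schools at or above the threshold, $\beta + \theta$ is the effect in schools below it, and the interaction coefficient $\theta$ is the difference between the two. A positive $\theta$ corresponds to the substitution case described above and a negative $\theta$ to the complementarity case.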
[14] Progress 8 is the main accountability metric for secondary schools in the UK. It measures students' progress between age 11 and 16 in eight subjects, including English and Maths, and at least three traditional academic subjects, such as sciences, humanities and languages. A Progress 8 score above 0 means that students in that school made better than average progress, while a score below 0 means the reverse.
The results from this analysis are indicative, and suggest a stronger treatment effect overall for schools that are performing below the -0.5 threshold score in Progress 8. Although this effect is not statistically significant for applications to all universities (interaction effect = 0.103, p=0.159), it is both large and statistically significant for selective universities (interaction effect = 0.069, p=0.005).
Together, the directions and magnitude of these findings support the idea that role model letters act as a substitute, or replacement, for teacher investment in students' aspirations, with worse schools responding more strongly to treatment. This analytical treatment is fairly blunt, however, and could be an artefact of the placement of the acceptability cut-off in Progress 8 scores. Figures 1-3, below, attempt a more nuanced approach to this analysis by allowing a more flexible functional form. They plot the relationship between Progress 8 scores and rates of application to, being made an offer by, and acceptance of an offer from a Russell Group university, using a kernel weighted local polynomial plot for each of our pooled (any letter) treatment and control groups, with confidence intervals around the treatment group.
All of these figures show evidence of substitutability. At all levels of Progress 8 score, we can see that the treatment performs no worse than the control condition. However, the treatment performs substantially better than the control group for schools with below average (<0) Progress 8 scores, supporting an argument for substitutability between role model messages and teacher investment/quality.

Discussion
We have reported the results of a large scale randomised controlled trial in the UK, and shown that role model letters can have a significant impact on the rate of university applications and acceptances.
We have shown that the applications do not appear to be deadweight, i.e. that students who are influenced to apply are not systematically under qualified and hence rejected by more selective universities. We have also shown that these letters have the largest effect among lower quality schools, based on the UK's performance metric for schools, Progress 8. This suggests that where there is a local deficit in teacher performance and aspiration, this can be partly corrected through the use of relevant role models. This also suggests that as a policy prescription, simply increasing the performance of schools and hence young people within them along academic grounds may not be sufficient to increase access to higher education. The young people in our sample were those who had, regardless of their institution, performed well by national standards on high stakes exams aged 16. Despite their good academic performance, however, their university application rate was significantly lower when they were in lower quality schools, indicating that good grades are a necessary, but not a sufficient, condition for aspiring to a good university.
There remain some questions that we are unable to answer. In particular, we are concerned about the university application rate in general, not just for selective institutions, and our study does not provide convincing evidence of this being impacted by role model letters, although the combined condition does have directionally positive impacts. We must therefore consider whether alternative interventions might be more effective at this margin. Our data are not sufficient to analyse differential effects by, for example, gender, or geographical location, which we hypothesise may be important for the effectiveness of role model interventions.

Annex C -Balance Checks
In Table C1, below, column 1 regresses schools' Ofsted ratings on treatment assignment, column 2 regresses schools' Progress 8 scores on treatment assignment, and column 3 regresses a binary