Content expectations and dropout in Dutch vocational education

Introduction
When students' expectations at the start of their program do not accord with reality, this may have negative consequences for future study success (e.g. Baker et al. 1985; Fonteyne et al. 2017; Helland et al. 2002; Jacob and Wilder 2010; Maloshonok and Terentev 2017; Tinto 1975; Wigfield and Cambria 2010; Zijlstra and Meijers 2006). Expectancy can be operationalized as performance expectations, which refer to the results in the educational program (e.g. success in graduation, dropout and grades), and content expectations, which refer to the content of the educational program (e.g. study load, curriculum, learning environment and effort required to pass the subjects). There are several theoretical reasons to suppose that expectations about an educational program influence academic success. An influential model of motivation is the expectancy-value theory (e.g. Eccles et al. 1983; Eccles and Wigfield 2002; Wigfield and Cambria 2010). This theory proposes that one is motivated for behavior that results in outcomes of value, but only when one expects to obtain that outcome. Two judgments are thus important for motivation: that of the value of the outcome, and that of the likelihood of obtaining it.
As for the first judgment, in the choice of a higher (vocational) education program, the value as seen by the prospective student must be related to the content of the program: either the interest of the student in that content, or the doors it opens to later careers. If the content differs from what was expected, this may lead to disappointment, a lack of motivation and ultimately a lack of success and risk of dropout. Indeed, according to person-environment fit theories, the fit between the educational program and the individual student is one of the factors important in preventing dropout in higher education (Kristof-Brown et al. 2005).
As for the second judgment, the student must expect success in the program to be motivated to choose it; in the terms we introduced above, they must have high performance expectations. However, unrealistic performance expectations have also been linked to dropout in the literature (Maloshonok and Terentev 2017). The idea here is that expectations of success may also affect motivation negatively, and hence increase the likelihood of dropout, if these expectations do not match reality. In this regard, it is important to also take into account the effort that the prospective student expects to have to make to achieve success, which we defined above as content expectations. If a student only expects to have to spend x hours on their studies, and success in the program requires, e.g., double that, the student will either have to re-evaluate and adapt to these requirements, or not have the expected success, leading to dropout. This study therefore focuses on the content-related expectations of students entering Dutch upper secondary vocational education (which we will shorten to vocational education in this paper).
Content-related expectations that differ from reality may explain a substantial part of dropout, especially in vocational education, where it is to a large extent unclear to the prospective student what skills will be developed and what is required of them in order to perform well in the program. Thus, it is relevant to know which content-related expectations entering students entertain, whether these expectations are realistic, and whether having unrealistic content-related expectations is in any way related to dropout.
The literature linking student expectations to study success has generally focused on performance expectations at the expense of content expectations. Research on entering students' performance expectations has generally focused on the idea that such expectations are unrealistically optimistic, which in the student retention literature is thought to lead to student dropout. Stern (1966) named this the Freshman Myth, which describes both academic and non-academic expectations students have of college life before entering tertiary education. Stern (1966) noted that students tended to have unrealistically high expectations of university life (e.g. academic program, standards of achievement and vocational orientation), and were disappointed by their actual experiences. This has led to many empirical studies about the Freshman Myth and the relation between, in particular, performance-related expectations (expected grades) and later study success. Many of these (e.g. Baker et al. 1985; Cook and Leckey 1999; van Klaveren et al. 2019; Lowe and Cook 2003; Maloshonok and Terentev 2017; Zafar 2011) confirmed Stern's findings, in numerous settings and with different student samples. First-year students thus indeed overestimated (or were overconfident about) their future academic performance and had a higher chance of dropping out.
Further research confirmed other parts of the Freshman Myth, including students' expectations about their own behavior and about the content of the program. For example, Baker et al. (1985) found that students also expected more of themselves than they subsequently realized. Fewer studies have looked at content-related expectations; the ones that did are predominantly retrospective. Zijlstra and Meijers (2006), for instance, showed that college students dissatisfied with their program often report having had inadequate expectations at the start of the program. Every year the students looked back and rated their satisfaction and how challenging they perceived the program to be. Their results show that first-year students find their program challenging, but that the longer students are in college, the less challenging they rate their program and the less satisfied they are. This often manifests itself in the feeling that they had started off with inadequate expectations. The problem with retrospective studies, however, is that the students have already been confronted with their performance (e.g. dropout or promotion to the next grade), which can influence their retrospective reporting about their expectations at the start. It is impossible to check retrospectively whether remembered expectations were truly the ones students had when they entered the program.
Another retrospective study, that of Helland et al. (2002), focused on dropout as a function of different expectations (academic expectations, social expectations, social integration, institutional commitment and intent to re-enroll). They surveyed 715 students at three moments in their first year using three different questionnaires. In the second questionnaire the authors retroactively queried social expectations for college by asking "When compared to how satisfied I thought I would be when I decided to attend (name of institution), my satisfaction is now…" for three items: the day-to-day personal relations I would have with other students; my social life; and, overall, the degree to which I feel that I fit into the social environment at (name of institution) (Helland et al. 2002, p. 386). They showed that when new students' social expectations about college are matched with reality, they are more likely to be satisfied with college and are more likely to persist. Again, because the study was retrospective, students might not have remembered their expectations correctly. Moreover, Helland et al.'s (2002) questionnaire required a comparison of retrospectively reported expectations with current reality. Reality (i.e. whether students truly had a satisfying social life in college) might have influenced persistence more than a match or mismatch with the expectations of the student (in particular because mismatches usually consisted of reality being less satisfying than expected). Maloshonok and Terentev (2017) took a different approach and linked expectations to measured student outcomes. They questioned 283 respondents during the first 2 weeks of their study about students' expectations concerning their learning activities, time allocation, grades and expected difficulties. After 7 months a second questionnaire followed that contained questions about their activities in the first year of study, including attainment.
Maloshonok and Terentev (2017) found three predictors of attainment. First, if students spent less time than they had expected on extracurricular activities, this was associated with better outcomes. Second, students who found the study more boring than expected had worse outcomes. Finally, students who were overconfident about their grades had higher academic performance. Thus, overconfident students (whose expectations were higher than their actual grades) on average had higher grades than students who were underconfident about their academic performance. This was interpreted as showing that low student expectations about grades negatively affect performance. However, it is difficult to relate overconfidence to actual performance because over- and underconfidence are themselves functions of student performance (i.e., one can only diagnose overconfidence by comparing expectations to actual performance). Students with high expectations could thus be realistic assessors of their own high skills (and thus future good performance), and merely be a little over-optimistic in their expectations.
Because the literature linking student expectations to study success is generally retrospective and focuses on performance-related expectations, it is important to gain more insight into the link between prospectively measured content-related expectations and dropout after the transition to college. This study investigates prospectively whether the content expectations of an educational program can eventually be the reason to drop out. Specifically, we investigate the following hypotheses:
• Students who later drop out have, at the onset of their studies, different content-related expectations than students who are successful.
• Students who later drop out have, at the onset of their studies, less realistic content-related expectations than students who are successful.
Exploratively, we also investigated whether the two categories of students differed with respect to performance expectations (i.e., expected grades), and whether students update their content-related expectations during the first year.
We combined administrative data (information on grades and dropout) with survey data about content- and performance-related expectations, gathered by two questionnaires, of the 2016 and 2017 cohorts of ROC TOP Sports Academy Amsterdam in the Netherlands. This paper proceeds as follows. First, the context of this study is explained. Subsequently, we present the data, descriptive statistics and questionnaires used, followed by the empirical analyses and a discussion of the findings that lead to our conclusions.

Context: Dutch Education System and ROC TOP Sports Academy Amsterdam
This study was conducted at ROC TOP Sports Academy Amsterdam, an upper secondary vocational education (which we will shorten to vocational education in this paper) program that educates students in sport-related fields at ISCED levels three and four (see Fig. 1). Dutch vocational education, organized in large institutions, consists of programs that train for specific professions. These programs are typically developed together with the industry in which the graduates will work after completing the program. Because the quality rating and funding of a program depend on the dropout rate in the Netherlands, it is important for the vocational institutes to gain more insight into the dropout process. For students, too, it is beneficial to choose the 'right' program, since students have to pay college tuition each year. When the expectations of students are more aligned with the expectations of the program (e.g. required attitude and skills), this might prevent students from dropping out or switching programs. There are numerous educational programs at ISCED levels 2, 3 and 4, educating e.g. administrative service providers, nurses or bike technicians. ROC TOP offers about 50 vocational education programs to approximately 4000 students. About 120 students enroll in the Sports Academy each year.
Prospective students who register at ROC TOP take part in an intake procedure. Because of the large number of students who want to start the sport-related programs, Sports Academy Amsterdam is allowed to select its students via an intake test, an interview and a sports test. The dropout rate of the Sports Academy is high. Of the 349 students who started in 2012, 2013 and 2014, 171 (49%) dropped out, which was ROC TOP's motivation to gain more insight into student expectations.

Data and descriptive statistics
This study focuses on the students of the 2016 and 2017 cohorts entering ROC TOP Sports Academy Amsterdam (Table 1). Every student who registers at ROC TOP Sports Academy Amsterdam receives an invitation to take an intake test, which assesses the cognitive skills and personality traits of the student (since this test does not focus on expectations and is not predictive of dropout (Eegdeman et al. 2018), its results were not included in our analyses). For the two cohorts used in this study, our expectations questionnaire was filled out before this intake test. Of the 208 students (107 starting in 2016, 101 in 2017), 202 filled out the questionnaire before they started the program.
Fig. 1 The Dutch Education System. After finishing primary education (grade six), children are tracked into three educational levels based on the recommendation of the primary school. About half of the children are tracked into prevocational education (VMBO, four years), which prepares children for upper secondary vocational education (vocational education in this paper, MBO in Dutch; three or four years). Vocational education is itself tracked, with four levels labelled one to four. Depending on this level, vocational education students are classified as International Standard Classification of Education (ISCED) level two (first stage of secondary education building on primary education, typically with a more subject-oriented curriculum), three (second/final stage of secondary education preparing for tertiary education and/or providing skills relevant to employment, usually with an increased range of subject options and streams) or four (programs providing learning experiences that build on secondary education and prepare for labor market entry and/or tertiary education; the content is broader than secondary but not as complex as tertiary education). Note: ISCED level in parentheses (adapted from …)
After finishing the first year, or at the moment of dropping out, the students filled out the questionnaire again. At that moment, 77 students had dropped out (37%). Of the 208 students, 117 filled it out on both occasions (56%). To create a benchmark against which we can compare students' answers, 10 of the 14 teachers (71%) of the Sports Academy filled out the questionnaire. Their assignment was to give the best possible answers that a student on track to graduate could give (Table 2).
When analyzing student outcomes, we decided to take first-year dropout as the dependent variable, since later dropout is a more ambiguous outcome (e.g., students may be employed and decide that they do not need a formal closure of their program). First-year dropout is defined as any student who stopped following the educational program; students who repeated the first year because of insufficient grades were not counted as dropouts. The first-year dropout rate was 33% for cohort 2016 and 42% for cohort 2017. Of all 78 first-year dropouts, 31 students filled out both questionnaires (40%). Unfortunately, the response rate of the second questionnaire was quite low, because often there was no longer any contact with these students after they dropped out, or the dropped-out student did not want to fill out the questionnaire. When a ROC TOP student drops out of a vocational program, there is also a standardized 'exit interview'. In this interview, administered by the student counselor, students are asked to choose which of a set of categories comes closest to the reason why they dropped out. We had access to a summary of these interviews, and recoded the reason that students listed for their dropout. We used the completed exit-interview forms to see whether the reasons why students in our sample dropped out could be corroborated.

Questionnaires
Two questionnaires were developed by ROC TOP Sports Academy Amsterdam to get better information on the expectations of incoming students, and on reasons for dropout. They were developed together with students, alumni and teachers, taking as inspiration the 'Beginning College Survey of Student Engagement' (Cole and McCormick 2009). These questionnaires asked about (1) expected effort in the coming year, and (2) the content of the educational program, including characteristics that students, alumni and teachers think a student should have to be successful in this program (see Table 3 for the questions relevant for this study). Students were also asked what GPA they expected to attain in the first year ('Expected GPA', defined as the mean grade the student expects to obtain in the coming year). Administrative data from prior education (containing prevocational GPA) were also available and were used in the analyses.
To determine whether student content expectations are realistic, we created two realism scores. Effort realism describes whether students had adequate expectations about the time they would need to invest in the program; content realism describes whether students had adequate expectations about the aspects of the program and the student characteristics needed to succeed.
We first set out to determine, for the effort-related questions in Table 3, what answer could be considered correct. For those questions we did this by looking at the programmed hours, available in the annual plans of the educational program. We then created realism scores for the required-effort questions in Table 3. For example, the right answer to the question "About how many hours a week do you think you will spend in a theoretical class?" is 6-10 h. An answer that wholly matched the right one was given 8 points (because there are 8 different answer possibilities), and the answer that was furthest away from the right one was given 1 point. All other answers were given points through linear interpolation. Scores were summed and then divided by the maximum score of 24 (3 questions times 8 points) to generate an individual overall effort realism score, which was our main independent variable. When a student has a realism score of 1.0, the student's idea of time investment is exactly what the educational program had scheduled.
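The scoring rule described above can be illustrated with a minimal Python sketch. The exact interpolation formula is not spelled out in the text, so the version below is an assumption: points fall linearly with the distance (in answer options) from the correct option, from 8 for an exact match down to 1 for the option furthest away. The function names and the option indices in the example are hypothetical.

```python
def effort_points(answer: int, correct: int, n_options: int = 8) -> float:
    """Points for one effort question (options numbered 1..n_options).

    Assumed rule: n_options points for an exact match, 1 point for the
    option furthest from the correct one, linear interpolation in between.
    """
    max_distance = max(correct - 1, n_options - correct)
    distance = abs(answer - correct)
    return n_options - (n_options - 1) * distance / max_distance


def effort_realism(answers, correct_answers, n_options: int = 8) -> float:
    """Sum of points over all effort questions, divided by the maximum score."""
    total = sum(
        effort_points(a, c, n_options) for a, c in zip(answers, correct_answers)
    )
    return total / (n_options * len(correct_answers))


# Hypothetical correct options for the three effort questions.
correct = [2, 3, 2]
print(effort_realism([2, 3, 2], correct))  # all answers match the program plan
print(effort_realism([5, 3, 1], correct))  # two answers deviate
```

With three questions and eight options, the denominator is 24, matching the maximum score mentioned in the text.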
The teachers of the Sports Academy were also asked to fill out the best possible answers students could give when the student wants to graduate. This creates a teacher benchmark with which we can relate students' answers. All correct answers, either in the form of planned hours or mean answers of the teachers (labelled as such) are shown in Table 3.
We then generated an individual score capturing how well student expectations matched teacher answers by computing the absolute difference between the answer of the student and the calculated mean score of the teachers (i.e. how close the student answer was to the mean teacher score). This score is referred to as the realism score (e.g. teaching realism, sports realism in Table 5). Since larger values (further away from teacher means) indicate less realism, we added a minus sign to these scores so that they can be interpreted in the same direction as effort realism scores (i.e., higher values indicate more realism). All content and characteristics scores were also summed into a content realism score.
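As a hedged illustration of this computation, the sketch below takes the negated absolute difference between a student's answer and the teacher mean for each item, and sums these into an overall content realism score. The function names and the example ratings are hypothetical; only the arithmetic follows the description above.

```python
from statistics import mean


def item_realism(student_answer, teacher_answers):
    """Negated absolute distance to the teacher mean (higher = more realistic)."""
    return -abs(student_answer - mean(teacher_answers))


def content_realism(student_answers, teacher_answers_per_item):
    """Sum of the item realism scores over all content and characteristics items."""
    return sum(
        item_realism(s, t) for s, t in zip(student_answers, teacher_answers_per_item)
    )


# Hypothetical ratings on the one-to-five scale for two items.
teachers = [[4, 5, 4, 5], [5, 5]]
print(item_realism(3, teachers[0]))        # -1.5: student is 1.5 below the teacher mean
print(content_realism([3, 5], teachers))   # -1.5: second item matches the teachers exactly
```

A score of 0 is the maximum and means that the student's answers coincide with the teacher means on every item.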

Empirical analyses
In a first analysis, we determined whether demographic variables or prevocational GPA (defined as the GPA of the student after finishing their education at lower secondary prevocational education) were related to the content-related expectations of the students. This analysis was done to ensure that any effect of expectations on dropout would not be a proxy for, e.g., effects of prevocational GPA.
In a second analysis, we examined to what extent content-related expectations matter for dropout and retention. We first performed a probit analysis to estimate the association between first-year dropout and the raw answers to our questionnaires. We subsequently performed two-tailed Bayesian t-tests on both the raw answers and our realism scores, with dropout (yes/no) as independent variable, to test our two main hypotheses: that there are differences in expectations between future dropouts and non-dropouts, and, more specifically, that future dropouts have less realistic expectations.
To address potential selective attrition in the second questionnaire, we first estimated a logistic regression model to predict the probability of attrition based on background characteristics. In the second step, we corrected for attrition by including the probability of completing the second questionnaire as a covariate in a Bayesian ANCOVA, thereby controlling for the fact that some persons have a larger probability of completing the second questionnaire. This approach to addressing potential selection bias is similar to the inverse Mills ratio (Heckman 1979; Mills 1926). In contrast to more common methods such as multiple imputation or model-based estimation (e.g. FIML), the adopted two-step estimator is more robust and is preferred for practical applications (Chiburis and Lokshin 2007). Moreover, the imputation methods mentioned mainly concern the efficiency of the estimated parameters; because imputation is based solely on the observed outcomes, they do little to address the selectivity of the attrition. This is distinctly different from the adopted two-step method, which uses the complete sample to predict selective attrition (i.e. selection in filling out the second questionnaire) and then, based on the predicted attrition probabilities, takes the relative attrition probability into account. More intuitively: we take into account the fact that some persons were more likely than others to be observed in the estimation sample. To be able to compare results with and without correction, we performed the analysis of the second questionnaire using Bayesian ANCOVAs (with correction) and ANOVAs (without correction) instead of t-tests, even though the independent variable of interest (successful vs dropout) has only two levels.
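The first step of this two-step correction can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis that was run: it fits a logistic regression with a single hypothetical binary background characteristic by plain gradient ascent, on made-up toy data, and returns each student's predicted probability of completing the second questionnaire. In the second step (not shown), these probabilities would enter the ANCOVA as a covariate.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=1.0, n_iter=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood (one predictor only, for illustration)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(b0 + b1 * x)
            g0 += residual
            g1 += residual * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Toy data (hypothetical): x is a binary background characteristic,
# y = 1 if the student completed the second questionnaire.
xs = [1] * 10 + [0] * 10
ys = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6

b0, b1 = fit_logistic(xs, ys)
# Step-one output: one predicted response probability per student.
response_prob = [sigmoid(b0 + b1 * x) for x in xs]
```

With a single binary predictor the fitted probabilities simply reproduce the observed response rates in each group; with several background characteristics the same idea applies, but a statistics package would normally be used for the fit.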
Bayesian t-tests, ANOVAs and ANCOVAs were performed because they allow us to weigh evidence in favor of both the alternative hypothesis that dropouts have less realistic expectations than successful students, and the null hypothesis that there were no differences between dropout and successful students. In particular, Bayesian statistical tests allow us to also quantify evidence in favor of the null hypothesis, which is impossible with classical statistical tests (Rouder et al. 2009). In case the alternative hypothesis was supported, we analyzed the content of the twelve individual questions to test whether students who drop out have a specific image of the educational program that is or is not correct.
We used JASP to conduct the Bayesian t-tests (van Doorn et al. 2019). In JASP, default prior distributions are available for cases where prior knowledge is absent, vague or difficult to elicit. These prior distributions can be used for the most common statistical models (e.g., correlations, t-tests, and ANOVA). For t-tests, a Cauchy prior distribution with scale 0.707 is used (van Doorn et al. 2019). We also included a robustness check of the Bayes factors that provide evidence for the alternative hypothesis under different prior specifications (see Appendix B). If the qualitative conclusion does not change across a range of different plausible prior distributions, this indicates that the Bayes factor is robust.
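To make the role of the Cauchy prior concrete, the following sketch computes a two-sided default (JZS) Bayes factor for an independent-samples t-test, following the integral representation in Rouder et al. (2009) with the prior scale r as a parameter. It is an illustration only and not JASP's implementation: the numerical integration scheme, the function name and the example inputs are assumptions, and directional (one-sided) tests would require an additional correction not shown here.

```python
import math


def jzs_bf10(t: float, n1: int, n2: int, r: float = 0.707, step: float = 0.01) -> float:
    """Two-sided JZS Bayes factor BF10 for an independent-samples t statistic.

    The Cauchy(0, r) prior on effect size is written as a normal prior with
    variance g, where g follows an inverse-gamma(1/2, r^2 / 2) distribution.
    The integral over g is evaluated with the trapezoid rule on u = log(g).
    """
    nu = n1 + n2 - 2                 # degrees of freedom
    n_eff = n1 * n2 / (n1 + n2)      # effective sample size
    null_likelihood = (1 + t * t / nu) ** (-(nu + 1) / 2)

    def integrand(u: float) -> float:
        g = math.exp(u)
        prior = r / math.sqrt(2 * math.pi) * g ** -1.5 * math.exp(-r * r / (2 * g))
        scale = 1 + n_eff * g
        likelihood = scale ** -0.5 * (1 + t * t / (scale * nu)) ** (-(nu + 1) / 2)
        return likelihood * prior * g  # extra g is the Jacobian of g = exp(u)

    lo, hi = -20.0, 40.0
    n_steps = int(round((hi - lo) / step))
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, n_steps):
        total += integrand(lo + i * step)
    return total * step / null_likelihood


# Hypothetical inputs: a t statistic and two group sizes.
print(jzs_bf10(0.0, 50, 50))  # below 1: the data favor the null hypothesis
print(jzs_bf10(5.0, 50, 50))  # far above 10: the data favor the alternative
```

Re-running the function with different values of r gives a rough analogue of the robustness check described above: if the Bayes factor stays on the same side of the evidence thresholds across plausible scales, the conclusion does not hinge on the prior.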
Finally, we also performed a Bayesian repeated measures ANOVA to determine whether content-related expectations are different than the responses students gave to the same questions after a year or after dropout. We used the variables that were shown in the preceding analyses to be different for dropped-out and successful students.
While our main analyses were on dropout as a dependent variable, we also analyzed the effects of expectations on GPA. Since GPA is a continuous measure, a different set of statistical analyses was required, which is why we report them in Additional file 1. However, these analyses yielded the same main conclusions as those with dropout as dependent measure reported here.

Results
The correlation matrix shows that prevocational GPA does not correlate with any content-related expectations of the student nor with the realism scores. Age and gender were found to be correlated with the number of hours expected to be spent in sports lessons and in theory lessons. The correlation matrix can be found in Appendix A (Table 8). Table 4 shows the probit estimation results. We start from a model with only the raw answers to the questionnaire and a constant (the baseline model) and then include student characteristics (prevocational GPA, age and gender). This stepwise approach shows how the estimated parameters of interest change when controlling for these background characteristics, which gives information on whether the answers are independent of these characteristics and increases precision.
The probit analysis shows that there is no association between the different expectations and dropout. We then determined whether there are any differences in expectations or their realism between dropouts and successful students. Table 5 shows the results of Bayesian independent t-tests for all expectations and the realism of these expectations, as well as Cohen's d as the classic effect size measure. Substantial evidence (BF > 3 or BF < 1/3) is shown in italic font style; bold font style represents strong evidence for either the alternative hypothesis (BF > 10) or the null hypothesis (BF < 0.1). For most expectations there was no substantial evidence (defined as a Bayes factor larger than 3, in favor of the alternative hypothesis, or smaller than 1/3, in favor of the null hypothesis; Rouder et al. 2009) one way or the other, although most Bayes factors were below 1, suggesting more evidence for the null hypothesis than for the alternative hypothesis. We discuss the exceptions:
• The Bayes factor for prevocational GPA (BF10 = 7.460) indicates substantial evidence for the alternative hypothesis: the data were approximately 7.5 times more likely under the hypothesis that dropped-out students have a lower prevocational GPA than successful students than under the null hypothesis. Cohen's d shows a medium effect size of d = 0.50 (Sawilowsky 2009). We assessed the robustness of this Bayes factor by plotting the Bayes factor as a function of the prior width r. Across a wide range of widths, the Bayes factor appears to be relatively stable, ranging from about 3 to 5 (this robustness check and those of other strongly differing variables can be found in Appendix B).
• The Bayes factor for expected GPA (BF10 = 0.126) reveals substantial evidence for the null hypothesis, indicating that successful students did not expect a higher GPA in advance than dropped-out students (this could indicate either that there is no difference or even that dropped-out students expected to achieve a higher GPA than successful students; we do not differentiate between these two options).
• The Bayes factor of BF10 = 0.040 for age indicates strong evidence for the null hypothesis, meaning that successful students were not older than dropped-out students (d = 0.43).
• The Bayes factor of BF10 = 0.076 for the overall effort realism score reveals strong evidence for the null hypothesis, meaning that dropped-out students were not less realistic about the to-be-invested hours than successful students (d = 0.18). Similarly, the Bayes factors for 'teaching realism' (BF10 = 0.057) and 'less theory' (BF10 = 0.074) show that there is strong evidence that dropped-out students were not less realistic about the overall content, the importance of teaching and theory lessons than successful students.
Note (Table 5): For all tests, the alternative hypothesis specifies that group 0 (non-dropout) is greater than group 1 (dropout). Since larger values (further away from teacher means) indicate less realism, we added a minus sign to the realism scores so that they can be interpreted in the same direction as effort realism scores (i.e., higher values indicate more realism). All expectations scores are on a scale from one to five; one means completely disagree/don't need at all and five means completely agree/need a lot.
• All the Bayes factors for the characteristics that are important for the program show either strong evidence for the null hypothesis (Perseverance: BF10 = 0.095 and Sociability: BF10 = 0.086) or substantial evidence for it (Discipline: BF10 = 0.126, Front of class: BF10 = 0.196 and Work together: BF10 = 0.148), meaning that either there is no difference between students or that dropped-out students expected to need the characteristics more than successful students.
Table 6 shows the responses from the second time that the questionnaire was administered, after one year or at the moment of dropping out. The last two columns show the results after correcting for attrition. Also shown are the grades of the two groups (grades are of all the students in the cohort, not just those who completed the questionnaires). As could be expected, Bayesian t-tests performed on grades show decisive evidence for the alternative hypothesis, indicating that students who later dropped out scored lower grades. This was also the case after quarter 1, suggesting that dropped-out students already underperform at the start of the year. For ratings of all effort items and most characteristics, however, Bayes factors indicate substantial evidence for the null hypothesis. This indicates that there are no differences between dropped-out students and successful students in how they rated, at the end of the year or at the moment of dropping out, how many hours they had to work or how realistic they were about what characteristics were needed to be successful in the program.
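The evidence categories used when reading Tables 5 and 6 can be summarized in a small helper. This is a sketch for illustration only: the thresholds (3 and 10 for the alternative hypothesis, 1/3 and 0.1 for the null hypothesis) are the ones stated in the text, while the function name and the wording of the labels are assumptions.

```python
def evidence_label(bf10: float) -> str:
    """Map a Bayes factor BF10 onto the evidence categories used in the text."""
    if bf10 > 10:
        return "strong evidence for the alternative hypothesis"
    if bf10 > 3:
        return "substantial evidence for the alternative hypothesis"
    if bf10 < 0.1:
        return "strong evidence for the null hypothesis"
    if bf10 < 1 / 3:
        return "substantial evidence for the null hypothesis"
    return "no substantial evidence either way"


# Bayes factors reported in the text above.
for name, bf in [("prevocational GPA", 7.460), ("expected GPA", 0.126), ("age", 0.040)]:
    print(f"{name}: BF10 = {bf} -> {evidence_label(bf)}")
```

Note that a Bayes factor below 1/3 counts as evidence for the null hypothesis, which is what distinguishes this reading from a non-significant classical test.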
Next, we investigated whether expectations changed more strongly for either successful or dropped-out students over the course of the first year. This was done by testing for an interaction between group and moment with a Bayesian repeated measures ANOVA. We did this only on the variables that were shown in the previous analyses to be different for dropped-out students and successful students (see Table 5: overall effort realism score, content expectations about teaching, the theory lessons, and the characteristics perseverance and sociability). The results are shown in Table 7, with Fig. 2 showing the changes in the mean realism scores. There is decisive evidence for the hypothesis that dropped-out students changed their overall effort realism score relative to successful students. Inspection of Fig. 2 shows that dropped-out students initially were more realistic than successful students (our one-sided t-test only tested whether dropouts were less realistic), and over time grew closer to the successful students. All other items showed no interaction.
In the two cohorts investigated here, students were asked why they quit the educational program. We then coded their open answers, categorizing them by whether program content, the occupation perspective, both or neither was mentioned as a reason to drop out. The answers were coded independently by two of the authors (Cohen's kappa = 0.67). Although merely 3 out of 59 (5%) spontaneously mentioned expectations being wrong, 50% gave an answer that suggested that either the program or the occupation perspective was different than expected. Most students within these 50% formulated this as that they now considered either the program or the occupation a bad fit to themselves. We also analyzed the reports from the exit interviews carried out by the student advisor with 77 of the dropouts. For 50 of the 77 (71%) the reason coded by the student advisor after the interview was "program-related factors". The student advisor stated to us that these students often faulted their own choice for the program as the reason to drop out.
Table 6 Grades and responses to the expectations questionnaire after one year or at the moment of dropping out
Shown are descriptives and Bayesian ANOVA and ANCOVA testing for differences between successful and dropout students. In the ANCOVA, to correct for attrition the likelihood of attrition was added as covariate. Substantial evidence (BF > 3 or BF < 1/3) is shown in italic font style; bold font style represents strong evidence for either the alternative hypothesis (BF > 10) or the null hypothesis (BF < .1). For all tests, the alternative hypothesis specifies that the two groups differ. All expectations scores are on a scale from one to five; one means completely disagree/don't need at all and five means completely agree/need a lot. The grades were available for more students: All grades: N = 127; Grades quarter 1: N = 71, Grades quarter 2: N = 64, Grades quarter 3: N = 60, Average GPA: N = 53

Discussion
This study investigated whether content-related expectations of an educational program are realistic, and whether a lack of realism was associated with an increased likelihood of dropout. It is important to realize that this study was conducted exclusively in the ROC TOP Sports Academy in Amsterdam. Students were selected via an intake test, an interview and a sports test, which could influence motivation and willingness. This very specific environment and the predominantly male sample may limit the generalizability of the results. The empirical results should thus be interpreted with caution, as replications are needed before they can be generalized. We first tested two hypotheses: that students who drop out have different content-related expectations than students who are successful, and that these expectations are less realistic. We then looked at the research question: do students update their content-related expectations? To test our hypotheses, we analyzed differences in prospective expectations between students who would later turn out to be successful in year one and those who would drop out.
While not our central concern, it is instructive to first discuss performance expectations. When students were asked what grade they expected to achieve (performance-related expectations), our results show that at the beginning of the educational program, the students who would later drop out expected grades similar to those expected by successful students, despite having a lower prevocational GPA. This finding might be interpreted in line with the idea that the Freshman Myth, i.e. unreasonable optimism, could lead to student dropout. Indeed, our findings numerically mimic those of Maloshonok and Terentev (2017) in that students with a larger gap between prevocational GPA and expectations for future GPA were the ones more likely to drop out, although this pattern was not significant. However, this gap resulted from prevocational GPA being lower, not from expectations being different. It could thus also be that neither group knew what to expect and thus opted for an average pass grade, which later performance then made more realistic for well-performing students than for dropouts (see Scheck et al. (2004) for a demonstration of such spurious realism in performance predictions).
Fig. 2 Change over time of mean realism scores from T1 (start of the program) to T2 (either end of the year or moment of dropout), separately for successful students and dropped-out students. Only data from those students who filled out the questionnaire twice are included. Error bars show the standard error of the mean.
Paradoxically, while literature on the Freshman Myth assumes optimistic expectations lead to dropout, exactly the opposite assumption is made in literature on achievement. There it is generally held that, for a given ability level, higher expectations result in higher performance. This is also known as the 'Self-Fulfilling Prophecy' (Könings et al. 2008).
This apparent contradiction has not been resolved yet. Moreover, neither literature has looked in detail at what causes expectations to be overly optimistic in the first place. Expectation formation is a complex process: individuals have access to incomplete and different information, have different frames of reference and likely weight various factors differently in determining expectations (Manski 2004). Students also update their expectations over time, based on knowledge about their academic ability gained during school (Zafar 2011). Zijlstra and Meijers (2006) even concluded that many students probably have few concrete expectations when they transition to college.
In line with this idea, the content-related expectations of students who dropped out were neither different from nor less realistic than those of successful students. If there was any relationship, it would have been that subsequent dropouts made a more truthful estimate of the amount of time spent in the program. Where previous studies found relations between expectations and dropout (e.g. Baker et al. 1985; Helland et al. 2002; Jacob and Wilder 2010; Maloshonok and Terentev 2017; Tinto 1975; Zijlstra and Meijers 2006), we could thus not find this for expectations about effort and content. One explanation could be that our study was the first to be performed with vocational students, who could differ in several ways from the college students who were the focus of previous studies. There is also selection into program participation. The fact that all students passed an admission procedure for this program could indicate that the group is quite homogeneous in its choice to enroll and in its expectations, but that students act differently when confronted with reality. Homogeneous groups with similar expectations reduce the unobserved heterogeneity. Alternatively, previous studies of content-related expectations were retrospective. It could simply be that dropouts remember their expectations as unrealistic, even though these were in fact no more unrealistic than those of students who did not drop out.
Indeed, when dropouts were asked why they dropped out, they answered in half of the cases that the program in question did not fit them. This suggests that in our sample, too, a retrospective study would have yielded different results than our prospective one did.
As expected, dropped-out students had lower grades than students who were successful. Interestingly, this was already the case after the first quarter. This is in line with previous studies focussing on college students, where first grades in the program tend to be much better predictors than any other variable that students bring to college (Montmarquette et al. 2001; Plak et al. 2019). This suggests that the first weeks of college are particularly decisive when it comes to subsequent dropout, and that programs intended to prevent dropout may have to start soon after matriculation.
The final research question was: do students update their content-related expectations? Our results show that successful students and dropouts do not differ in how their reported expectations develop over time, except for the effort realism score. In line with the findings of Zijlstra and Meijers (2006), at the beginning of the college year students tend to overestimate the amount of time spent in the educational program. After the college year or when dropping out, students have adjusted the number of hours downwards. This could mean that students did not have to work as hard as they expected, but again, estimates from dropouts at the end of the year were no different from those of students who did not drop out.

Conclusion
If students' expectations mattered, an institution could develop a policy to prevent students from making the wrong program choice and from dropping out. A successful program choice means that the program matches the ability but also the preferences of the student. This study did not set out to identify which part causes dropout, but how to prevent dropout by making the adjustment and revaluation required of students as small as possible. In summary, expectations at the start of the program do not seem to differentiate later dropouts from successful students, and thus student expectations do not appear to provide actionable information about what policies can be developed. However, previous studies and our own results have suggested that dropouts themselves do feel that their expectations were wrong. This apparent contradiction can be resolved if one assumes that students generally do not have very specific expectations about the program they are about to enter. It would also be hard to have such expectations without experiencing the program. Specifically for the vocational education context in which this study was performed, students gain experience in field internships in their first year, which may lead them to discover that they do not like the professions that the program trains them for. Both successful students and subsequent dropouts might thus be surprised by what they encounter once they enter the program. What may differentiate the two might not be the amount of surprise, but only that for successful students the surprise is a pleasant one and for later dropouts it is not.
Table 9 The outcomes of the principal component analysis. The two columns indicate that 72% of the observed variation in content realism and characteristics realism can be explained by the two factors with an eigenvalue above the cutoff of 1.0 (χ²(36) = 1570.98, p < 0.001; Cronbach's alpha for program content realism = 0.91, Cronbach's alpha for characteristics realism = 0.88). Factor 1 explains underlying variation in realism scores on the characteristics questions and Factor 2 explains variation in realism scores for the program content questions.