VIETNAM: Identifying Reliable Predictors of Learning for Results-Based Financing in Education
RESULTS-BASED FINANCING AROUND THE WORLD

Many education systems around the world have reached nearly universal access to schooling, but ensuring high-quality learning for all students has proven more difficult. Results-based financing (RBF) has the potential to transform the way in which education systems improve by incentivizing students, parents, teachers, school administrators, and other stakeholders to achieve better results. RBF mechanisms work by linking financial incentives to measurable results such as school attendance, dropout rates, student test scores, or other indicators of education quality. Conditional cash transfers (CCTs), teacher performance pay systems, and disbursement-linked indicators (DLIs) are all examples of RBF that have been shown to be effective at improving learning outcomes at the student, parent, teacher, and school district levels. However, directly financing learning outcomes can be problematic for several reasons: it can introduce distortions to real learning such as teaching to the test, it is difficult to set learning targets for students with widely diverse abilities, and teachers, students, and policymakers may not know how to improve learning. Therefore, as a precondition to establishing RBF systems, it is first necessary to identify the intermediate drivers of student learning to shed light on the mechanisms through which learning is achieved.

The Results in Education for All Children (REACH) Trust Fund at the World Bank funded the development of a machine-learning predictive modeling tool in Vietnam to identify which variables are reliable predictors of student learning outcomes. This predictive model was used to analyze over 2,000 variables from the Young Lives (YL) longitudinal survey to identify the key drivers of student performance in language and math among fifth graders in Vietnam. This approach revealed that the most important variables were related to students' characteristics, including their raw cognitive abilities (as measured by a Cognitive Development Assessment and other tests administered early in childhood), physical characteristics and health, school and pre-school trajectory, and routines and habits. Teacher characteristics and parental expectations are also important predictors of student performance. Effective RBF mechanisms could be designed to take into account many of these categories of variables. For example, incentives could be given to parents before school age and to teachers during the school years to implement interventions known to promote children's cognitive and physical development. Parents could also be given incentives to become more involved in their children's education, and school administrators could be incentivized to hire teachers with the most effective characteristics.
Vietnam has achieved learning outcomes that are on a par with much wealthier countries and that are higher than those of most countries at similar income levels. However, to continue to build on these learning gains, the government is moving away from donor aid based on "more of the same" inputs and towards a new emphasis on financing results. In light of this, the Ministry of Education intends to implement disbursement-linked financing and other RBF approaches, which will make it necessary to identify practical indicators that can be linked to financing in order to avoid the common pitfalls involved in directly incentivizing student learning, such as teaching to the test or cheating. The predictive model developed in this study has helped to fulfill this requirement by identifying a series of key indicators to which disbursements could be linked. This study will help the Government of Vietnam and education stakeholders to understand in detail both the drivers of the country's educational success and the factors that keep some students from sharing in that success. However, just as incentives conditioned on test scores can distort learning, providing incentives to achieve intermediate indicators can also cause distortions. Therefore, when implementing these RBF approaches, it will be necessary to monitor teachers, students, and parents carefully to ensure that the intermediate indicators are not being manipulated.

Context
Over the last 20 years, Vietnam has been a development success story, and education has played a significant role in this success. According to the World Bank, the total poverty rate in Vietnam fell from 57 percent in 1992-1993 to 37 percent in 1997-1998. 1 The primary school gross enrollment rate has reached a high of 96 percent, while the secondary school gross enrollment rate grew from 32 percent in the early 1990s to 85 percent in 2010. 2 In addition, there is only a small gap in enrollment between boys and girls at both the primary and lower secondary school levels.
In addition to high levels of access to education, Vietnam has also had significant success in achieving good education quality. Based on the results of the 2012 and 2015 rounds of the Programme for International Student Assessment (PISA), Vietnam's 15-year-olds performed as well in math, reading, and science as their peers in much richer countries and much better than those from other developing countries. Although Vietnam has one of the lowest income levels of all PISA participating countries, its performance has been better than the international average. 3
The Government of Vietnam has also enacted a series of reforms and investments in education that have contributed to its success. In 2005, the Education Law committed the government to providing primary and lower secondary education for all students, and this law was amended in 2010 to extend universal education to pre-primary school. The Government of Vietnam spent 18.5 percent of its public expenditure and 5.7 percent of the country's GDP on education in 2013, well above the global average. 4

Why was the Analytical Strategy for the Intervention Chosen?
To extend the progress that has already been made in student learning, the Government of Vietnam is continuing to pass ambitious reforms. It has piloted new pedagogical approaches through the Vietnam: Escuela Nueva program, co-financed with the World Bank. In this program, students learn through group discussions with their peers while their teachers act as facilitators, which is very different from the traditional learning approach in Vietnam. The government is also rapidly moving away from donor aid based on "more of the same" inputs and towards innovative disbursement-linked financing approaches, including both domestic and international sources of funding. The Ministry of Finance recently embraced the use of disbursement-linked approaches in major donor-assisted investments, while borrowing in the social sector is being tied to performance and results rather than inputs. For example, the Enhancing Teacher Education Program uses financial incentives to strengthen teacher education institutions using DLIs such as the satisfaction rates of teachers and principals with continuous professional development programs.
To facilitate the shift from input-based aid to results-based financing, the Vietnamese government has recognized the critical need to gather accurate and reliable information on what works to improve its education outcomes. The government's ultimate goal is to improve student learning outcomes, but providing direct incentives to students or teachers to improve test scores can be problematic for several reasons. First, paying students or teachers for improvements in test scores may introduce distortions into real learning, such as "teaching to the test" or cheating. Second, it can be difficult to set a single learning target for students with diverse learning abilities that is neither too difficult nor too easy to achieve. Third, even though all stakeholders may be properly motivated to improve learning, they may not know how to translate their inputs and effort into higher academic achievement. Therefore, rather than financing either basic inputs or learning outcomes directly, which often leads to distortions, the government intends to adopt RBF interventions that incentivize intermediate indicators such as teachers' characteristics or children's cognitive abilities, which lead to positive learning gains but are not as easy to manipulate.
However, before such an RBF intervention can be designed, it is critical to identify which of these intermediate indicators of education quality are the best predictors of learning outcomes. Doing this accurately and reliably requires robust statistical analysis of all potential determinants of learning. Studies using a regression approach have shown that a unique combination of resources, investments in education, and cultural factors in Vietnam has resulted in disciplined and focused students, hard-working teachers with close supervision from principals, and committed and involved parents with high expectations for their children. While these conditions might help to explain Vietnam's impressive learning gains, statistical analysis has found that all these factors together can explain at most 50 percent of Vietnam's positive academic performance. 5 Therefore, further analysis is needed to identify the additional drivers of learning in Vietnam.

How does the Analytical Strategy for the Intervention Work?
To identify the key drivers of student learning in Vietnam, analysis was conducted on the largest and most detailed dataset related to student success, the Young Lives dataset. Using these data, six distinct predictive models were developed using an artificial neural network (ANN) approach. ANN predictive modeling can find patterns within the complex relationships between inputs and outputs, which makes it an ideal tool to demystify the "black box" of which inputs contribute to learning gains. This methodology uses machine-learning techniques to improve the quality of prediction over time as more data are introduced and the relationships between variables become clearer. The models were trained using data on a subset of students to establish the relationships between various predictors and learning and were then tested using the remaining data to measure the predictive power of these relationships.
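The train-then-test procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the study's code: it stands in a single-neuron logistic model for a full neural network, uses synthetic data rather than the Young Lives dataset, and assumes a 70/30 split. All function and variable names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Clip the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def make_synthetic_students(n, rng):
    # Hypothetical data: two predictors and a binary "high performer" label.
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        latent = x[0] + 0.5 * x[1] + rng.gauss(0, 0.3)
        rows.append((x, 1 if latent > 0 else 0))
    return rows

def train(rows, epochs=50, lr=0.1):
    # Stochastic gradient descent on the log loss of a single neuron.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in rows:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
            b -= lr * err
    return w, b

def accuracy(rows, w, b):
    # Share of students whose predicted label matches the true label.
    hits = 0
    for x, y in rows:
        predicted = 1 if sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5 else 0
        hits += predicted == y
    return hits / len(rows)

rng = random.Random(0)
data = make_synthetic_students(200, rng)
rng.shuffle(data)

split = int(0.7 * len(data))          # assumed 70/30 split
train_rows, test_rows = data[:split], data[split:]

w, b = train(train_rows)              # fit on the training subset only
test_accuracy = accuracy(test_rows, w, b)   # measure on held-out students
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Accuracy is measured only on students the model never saw during training, which is what makes it a measure of predictive power rather than of fit.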
The ANN models were built to predict two different outcome variables: which students would be high performers (in the top 33 percent of the student distribution) and which students would be low performers (in the bottom 33 percent) on grade 5 examinations in math and Vietnamese. The independent variables used in the model to predict student outcomes were drawn from the Young Lives longitudinal study, which followed Vietnamese students for 15 years. This survey collected over 2,000 variables in 15 categories, including students' cognitive, non-cognitive, and physical characteristics, parents' and household characteristics, teachers' and principals' characteristics, teaching practices, school characteristics, and the characteristics of the broader school community. Examples of variables in each category are listed in the table below.
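As a rough illustration of how the two outcome variables could be derived from exam scores, the sketch below flags the top and bottom thirds of a score list. It is an assumption-laden toy: the scores are made up, the cut-off is taken as one third of the students rounded down, and ties are broken arbitrarily by sort order. The function name is hypothetical.

```python
def tercile_outcomes(scores):
    """Return one dict per student with binary high/low performer flags."""
    n = len(scores)
    k = n // 3                      # assumed size of the top and bottom groups
    order = sorted(range(n), key=lambda i: scores[i])  # indices, lowest first
    low = set(order[:k])
    high = set(order[n - k:])
    return [{"is_high": i in high, "is_low": i in low} for i in range(n)]

# Made-up grade 5 math scores for nine students.
scores = [55, 72, 90, 61, 84, 47, 68, 95, 77]
outcomes = tercile_outcomes(scores)
high_flags = [o["is_high"] for o in outcomes]
low_flags = [o["is_low"] for o in outcomes]
```

Treating "high performer" and "low performer" as two separate binary targets means a student in the middle third is a negative case for both.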
The predictive models produced a prioritized list of variables and categories of variables that were the most accurate and reliable predictors of high or low student performance, as well as predictive weights that quantified the strength of the prediction for each variable. This list was the basis of the investigation of how these factors contribute to learning outcomes. It is important to note that the variables may interact differently between high performers and low performers.
To test the value of the ANN approach, it was contrasted with a statistical analysis using a more traditional technique, logistic regression. Six logistic regression models were built using the same independent variables and the same outcomes, and the results were compared to those produced by the ANN models.
One important caveat that must be noted when interpreting the results of the predictive modeling is that there is a fundamental difference between prediction and causation. Predictive models can identify which variables predict learning, but this does not necessarily mean that changing those variables will improve learning. For example, owning a textbook might be predictive of higher test scores, but this could be only because richer families are the only ones that can afford textbooks, and children of rich families are more likely to have better learning outcomes for other reasons. Further investigation is needed to determine whether the use of the textbook or some other variable contributes to learning.
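The textbook example can be made concrete with a small simulation. The sketch below is purely illustrative: the data are synthetic, the probabilities and score levels are invented, and textbook ownership is given no effect on scores by construction. All names are hypothetical.

```python
import random
from statistics import mean

def simulate(n, seed=0):
    # Family wealth drives both textbook ownership and scores;
    # the textbook itself has zero effect in this toy world.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rich = rng.random() < 0.5
        has_book = rng.random() < (0.9 if rich else 0.2)
        score = 50 + (15 if rich else 0) + rng.gauss(0, 5)
        rows.append((rich, has_book, score))
    return rows

def book_gap(rows):
    # Mean score of textbook owners minus mean score of non-owners.
    with_book = [s for _, b, s in rows if b]
    without_book = [s for _, b, s in rows if not b]
    return mean(with_book) - mean(without_book)

rows = simulate(4000)
overall_gap = book_gap(rows)                          # large: ownership "predicts" scores
gap_rich = book_gap([r for r in rows if r[0]])        # near zero within rich families
gap_poor = book_gap([r for r in rows if not r[0]])    # near zero within poor families
print(round(overall_gap, 1), round(gap_rich, 1), round(gap_poor, 1))
```

Ownership predicts scores across the whole sample, yet the gap largely disappears once families of similar wealth are compared, so handing out textbooks in this toy world would not raise scores.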

Variable Category | Example of Variables
Children's cognitive factors | Raw score in Cognitive Development Assessment
Children's physical factors | Weight-for-age z-score; Health compared to peers
Children's routines and habits | Hours of sleep; Hours spent studying
Children's school trajectory | Age started school; Years of pre-school
Children's non-cognitive factors | Attitudes and perceptions; Trust
Parents' background | Parents' language; Ethnic group
Household socioeconomic status | Household durable goods; On list of poor households
Household education | Father's and mother's years of education
Parental expectations | Child's level of education; First job
Teaching practices | Frequency of homework; Whether a calculator is used in the classroom
Teachers' characteristics | Years of experience; Attitudes; Teaching awards
Principals' characteristics | Years of experience; Highest level of education
School information | Number of pupils per classroom
Community | Quality of health facility; Existence of adult education classes

What are the Results?
The predictive models achieved a high level of accuracy in predicting student performance.
The six predictive ANN models were able to achieve between 95 percent and 100 percent accuracy in terms of predicting which students would be in either the top third or bottom third of the student population. Furthermore, the ANN models were able to predict high and low performers in both math and language more accurately than the logistic regression models, which were between 77 percent and 97 percent accurate.
Student characteristics (cognitive ability, physical factors, routines and habits, and school trajectory) and teacher characteristics were found to be the most predictive categories of variables in predicting student performance.
Although the most predictive variables for each subject (math and language) and outcome (high-performing and low-performing) varied somewhat, the most consistently predictive categories of variables were the following: (i) students' raw cognitive ability (as measured by a Cognitive Development Assessment and other tests administered early in childhood), which was predictive of academic performance later in school; (ii) students' physical factors and health, such as birth weight and weight-for-age; (iii) students' routines and habits, such as how much time they spend sleeping or studying; (iv) household socioeconomic status, such as the possession of certain durable goods; (v) students' school trajectory, including years of pre-school and age at starting school; and (vi) teachers' characteristics, such as years of experience, attitudes, or teaching awards.
These findings confirm that students' characteristics, which cannot be easily influenced by policymakers, continue to play a significant role in their academic performance. However, variables that are more within the control of education policymakers, such as teachers' characteristics, were second in importance to students' characteristics. While teachers' characteristics are more amenable to change through interventions, the expected effect of such changes on learning would be smaller than the effect of changing variables that are more fundamental to learning, such as students' key cognitive abilities like working memory, if that were possible. This is consistent with previous research that has demonstrated the crucial role played by working memory and attention in academic achievement. However, there is no "silver bullet" variable to predict learning, as many variables are needed to accurately predict students' academic performance.
While the predictive models were able to identify those variables that best predict student performance, each variable on its own has a relatively small predictive weight of between 0 percent and 3 percent. Each of the 15 categories of variables contributed between 1 percent and 19 percent predictive weight to the overall models. In each of the six models, the top 20 variables contributed between 21 percent and 22 percent predictive weight. Therefore, it is the cumulative effect of the entire set of variables that makes it possible to accurately predict performance rather than any one variable or set of variables. This insight has important implications for policymakers when choosing between a broad or a targeted set of interventions and when managing expectations of the potential impact of any one intervention. These results suggest that RBF mechanisms would need to address a broad set of indicators in order to have a significant effect on learning.
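The way per-variable weights roll up into category weights and a top-k share can be sketched as below. This is a hedged toy example: the variable names, raw weights, and category mapping are invented for illustration and are not the study's values, and the study's actual method for computing predictive weights is not described here.

```python
def to_percent(raw_weights):
    # Rescale raw importance scores so that they sum to 100 percent.
    total = sum(raw_weights.values())
    return {name: 100.0 * w / total for name, w in raw_weights.items()}

def category_weights(percent, category_of):
    # Sum the percentage weights of all variables in each category.
    totals = {}
    for name, w in percent.items():
        cat = category_of[name]
        totals[cat] = totals.get(cat, 0.0) + w
    return totals

def top_k_share(percent, k):
    # Combined weight of the k most predictive variables.
    return sum(sorted(percent.values(), reverse=True)[:k])

# Hypothetical raw weights for five made-up variables.
raw = {
    "cda_raw_score": 3.0,
    "weight_for_age": 2.0,
    "hours_studying": 2.0,
    "teacher_experience": 2.0,
    "teaching_awards": 1.0,
}
category_of = {
    "cda_raw_score": "cognitive",
    "weight_for_age": "physical",
    "hours_studying": "routines",
    "teacher_experience": "teacher",
    "teaching_awards": "teacher",
}

percent = to_percent(raw)
by_category = category_weights(percent, category_of)
top2 = top_k_share(percent, 2)
```

For scale: if weight were spread evenly across roughly 2,000 variables, the top 20 would hold about 1 percent of it, so a 21 to 22 percent share is far above uniform while still leaving most of the predictive weight with the remaining variables.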
What are the Implementation Lessons Learned?
By providing a clearer picture of the inter-relationships between the variables that lead to different levels of student performance, these results can help policymakers to design targeted interventions to foster the most effective determinants of student learning. This study has identified three important priority areas for policymaking. First, the analysis reconfirms that policies to promote children's cognitive ability and physical health early in life should be prioritized, such as early access to preschool programs, early cognitive stimulation, and pre-natal and post-natal medical care. Second, it points to the need for interventions to identify children and families at risk using indicators like children's health at birth. Third, it highlights the need for interventions that focus on factors that are more closely under the control of education policymakers. Teachers should be carefully selected and trained to maximize the teacher characteristics that were found to be most predictive of good student performance.
In all three of these policy areas, the results of this study could be used to inform the development of RBF mechanisms to incentivize various stakeholders to improve student performance. First, parents' investments in their children's cognitive ability and physical health could be incentivized, for example, through CCTs aimed at encouraging pre-school attendance, pre-natal and post-natal medical care, childhood nutrition, and other actions that have been shown to be effective in boosting children's cognitive ability and physical health. Similarly, RBF mechanisms could give teachers in the classroom an incentive to maximize their students' cognitive stimulation, particularly in pre-school. Second, school administrators could be incentivized to improve their teacher selection and teacher training processes to enhance the teacher characteristics that best predict student performance, or teacher training institutions could be incentivized to foster these characteristics. Third, parents could be incentivized to be more involved in their children's education and to actively encourage their children to work hard and perform well in school. Lastly, officials at intermediate levels of government could also be incentivized through carefully targeted disbursement-linked indicators to improve any of the indicators that have been shown to predict student learning.
While incentivizing intermediate indicators rather than learning outcomes can have the benefit of removing distortions to learning, it can cause distortions of its own. Therefore, the implementation of any RBF approach needs to be carefully monitored by the relevant stakeholders to ensure that any manipulation of the indicators is minimized.

Conclusion
RBF mechanisms can improve student learning outcomes by providing financial incentives to students, parents, teachers, school administrators, or local governments to encourage them to work towards improving learning results. However, as a precondition to establishing these RBF mechanisms, it is critical to identify those indicators and interventions that are most likely to improve student learning. Using a machine-learning predictive modeling approach with a dataset of over 2,000 variables, this study achieved high levels of accuracy in predicting both high-performing and low-performing fifth-grade students on math and language exams in Vietnam. This modeling methodology effectively identified variables related to student cognitive ability and physical health and teacher characteristics as the most predictive of good academic performance in Vietnam. Although the study found these categories of variables to be the most predictive of student learning, there is no single variable that predicts test scores, and understanding the complex inter-relationships between all of the drivers of learning is important for achieving better results. These findings suggest that RBF mechanisms should be designed to focus on a broad set of education quality indicators rather than any single indicator. These findings will help the Government of Vietnam to design and implement RBF incentives and other effective interventions to enable it to build on its progress in improving learning for the next generation of students.
The Results-Based Financing Around the World series is produced by REACH with generous support from the Government of Norway through NORAD, the Government of the United States of America through USAID, and the Government of Germany through the Federal Ministry for Economic Cooperation and Development. This note was adapted from Musso, Mariel F., Eduardo C. Cascallar, Neda Bostani, and Michael Crawford. (2018). "Are Teaching

[Figure: Predictive Weight of Variables by Category for each Subject-Outcome Combination (%)]
The Results in Education for All Children (REACH) Trust Fund supports and disseminates research on the impact of results-based financing on education outcomes. The goal is to collect and build empirical and operational lessons learned to help governments and development organizations design and implement the most appropriate results-based financing mechanisms for improved learning outcomes. For more information about who we are and what we do, go to: http://www.worldbank.org/reach.