What drives academic peer effects in middle school classrooms in China: Peer composition or peer performance?

This quasi-experimental study estimates academic peer effects in China's middle school (7th–9th grade) classrooms, using data from a large-scale nationally representative survey of middle schoolers in China. Our study design circumvents endogenous sorting by focusing on 52 schools that randomly assigned incoming 7th graders to different 7th-grade classes. Further, reverse causality is addressed by regressing students' 8th-grade test scores on their (randomly assigned) classmates' average 7th-grade test scores. Our analysis reveals that, all else equal, a one-standard-deviation increase in (8th-grade) classmates' average 7th-grade test scores raises an individual student's 8th-grade mathematics and English test scores by 0.13–0.18 and 0.11–0.17 standard deviations, respectively. These estimates remain stable when peer characteristics examined in related peer-effect studies are included in the model. Further analysis reveals that peer effects work through raising individual students' time spent studying per week and their confidence in learning. Finally, classroom peer effects are found to be heterogeneous across subgroups: larger for boys, academically stronger students, students attending better schools (i.e., schools with smaller classes and urban schools), and students with relatively disadvantaged family backgrounds (e.g., lower levels of parental education and family wealth).


Introduction
The personal characteristics, cognitive ability, and academic performance of peers, especially classmates, are widely believed to be key determinants of individual children's schooling attainments, such as school enrollment, test scores, and degree completion [1][2][3][4][5][6][7]. Concrete evidence of the direction and scale of classroom peer effects can thus inform educational policy regarding ability tracking, teacher assignment, and class management, such that available (but usually limited) resources can be better utilized to foster school-age children's educational development.
However, despite the considerable effort devoted to estimating academic peer effects in the classroom, existing findings have remained inconclusive. While some studies found strong positive impacts of classmates' academic performance on individual students' performance [3][4][5][6], others found moderate to insignificant peer effects in the classroom [8,9]. This inconclusive picture is perhaps not

Data source
The analysis performed in this study makes use of two waves of the CEPS data. The CEPS (China Education Panel Study) is a large-scale, nationally representative, school-based survey of middle-school students (7th-9th graders) in China. The CEPS project was designed and conducted by the China Data and Survey Center of the Renmin University of China, and was reviewed and approved by the Institutional Review Board of the Renmin University. In the survey, written informed consent to participate in the project was provided by the participant's legal guardian/next of kin. The data used in this study are publicly available, second-hand data, which do not include any private information of the study subjects and are not individually identifiable.
In the 2013-14 academic year (the baseline), the CEPS adopted a multi-stage, stratified PPS (Probability-Proportional-to-Size) strategy to sample and select participating students. Several steps were involved. First, 28 urban districts or rural counties were selected using the average schooling level of the local population and the proportion of migrants in the local population as stratifiers. Next, four middle schools from each of the 28 chosen districts or counties were selected, using school type (public schools, private schools, etc.) and enrollment size as stratifiers. A total of 112 schools were thus sampled. In each sampled school, two 7th-grade and two 9th-grade classes were randomly chosen, yielding a total of 438 sampled classes. 2 All (19,487) students enrolled in these 438 sampled classes in the 2013-14 academic year (and their parents or legal guardians) participated in the CEPS project. Four waves of follow-up surveys took place in 2014-2018, but currently, only the first two waves of data (collected in the 2013-14 and 2014-15 academic years) are made publicly available. Thus, we base our empirical analysis on the two publicly available waves of the CEPS data. Moreover, we restrict our analysis to those 7th graders interviewed at the baseline (N = 10,279, 92% of whom participated in the second wave) as most of the baseline 9th graders had graduated by the second wave and could not be tracked.
During each wave of the survey, information on sampled students' educational development was collected through interviews using questionnaires separately administered to sampled students themselves, their parents (or legal guardians), and their teachers (including subject teachers, grade headteachers, and school principals). The most important variables in this study are students' academic performance in Chinese, mathematics, and English, the three most important subjects in China's middle school curriculum. Presumably due to logistical considerations, the CEPS collected only scores on the midterm exam of the first (Fall) semester in each grade. Because the tests were designed by the project schools (rather than the CEPS team), their specific contents differ somewhat across schools. As such, to facilitate comparison and interpretation, test scores were standardized within schools (with a mean of zero and an SD of one). Fig. 1, panels A-F, which plots both the original and conditional distributions of 8th-grade classmates' average 7th-grade test scores for our analytical sample (see the next subsection for sample construction), suggests that these peer-performance measures have sufficient variation for identification purposes.
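The within-school standardization described above can be illustrated with a minimal sketch. The record layout, the function name `standardize_within_school`, and the use of the population SD are assumptions made for illustration; the text does not say which SD formula was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_school(records):
    """Return z-scores computed separately within each school.

    records: list of (school_id, raw_score) tuples (hypothetical layout).
    Each school needs some variation in scores, otherwise its SD is zero.
    """
    by_school = defaultdict(list)
    for school, score in records:
        by_school[school].append(score)
    # Per-school mean and SD. Using the population SD (pstdev) is an
    # assumption; a sample SD would rescale the z-scores slightly.
    stats = {s: (mean(v), pstdev(v)) for s, v in by_school.items()}
    return [(score - stats[school][0]) / stats[school][1]
            for school, score in records]


# Two schools whose exams are on different scales (made-up numbers).
records = [("A", 60), ("A", 70), ("A", 80),
           ("B", 110), ("B", 120), ("B", 130)]
z_scores = standardize_within_school(records)
```

After this transformation, a student's score is read relative to schoolmates who sat the same exam, which is why scores from schools with different tests become comparable.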
Also collected were data on sampled students' individual characteristics (gender, age, cognitive skills, etc.), household characteristics (sibship size, parental education, parental employment, family income, etc.), teacher characteristics (gender, years of schooling, teaching experience, etc.), and class/school characteristics (class size, conditions of school facilities, etc.).

Sample characteristics
To implement our quasi-experimental design, we followed previous CEPS-based studies [13,23] and imposed several sample restrictions to construct our study sample. First, we began with the 93 CEPS project schools (of a total of 112) whose principals reported randomly assigning incoming 7th graders to different 7th-grade classes. Next, we excluded 34 of these schools, whose 7th-grade headteachers did not uniformly confirm that class assignments in 7th grade were not based on students' (previous) test scores. 3 Finally, for reasons that will become clear in Section 2.5, seven schools that reassigned their students to different classes in 8th grade were further excluded. Applying these restrictions yielded a final analytical sample of 52 focal schools and 4018 students attending these schools. These students were attending 7th grade at the baseline and 8th grade at the time of the second wave. Table 1 depicts the sample profile of these students. Slightly more than half (51%) of the 4018 sampled students are boys. On average, the students were 14 years old in 2015 and had 0.6 siblings. Their fathers and mothers had completed, on average, 11.1 and 10.5 years of formal schooling, respectively. The average class had an enrollment size of 49 and was taught by a Chinese/mathematics/English teacher with 15.6/15.6/15.5 years of education and 15.2/17.1/16.6 years of teaching experience. These figures are very close to official statistics for China as a whole [24], verifying the representativeness of our analytical sample. 4

2 Ten of the 112 schools have only one class in a target grade; these classes are all included in the sample.

3 China's middle schools use various methods to assign incoming students to different classes at the beginning of 7th grade. In some schools, students are assigned based on their entrance test scores or academic performance in primary school. Recently, more and more schools have started to assign incoming 7th graders to different 7th-grade classes randomly. The random student/class assignment method is strongly promoted by China's Ministry of Education to ensure equal and fair opportunities among students [24]. Schools that adopted random student/class assignments in 7th grade typically use a computer program incorporating data on incoming 7th graders' gender, residential registration status (hukou), primary school attended, and other factors to find a suitable assignment algorithm, which ensures educational resources are allocated equally across different classes.

4 According to official statistics [24], in China's middle schools, the average class size was 48 in 2013, and Chinese, mathematics, and English teachers had completed on average 15.9, 15.8 and 15.9 years of education, respectively.
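As a quick consistency check, the school and class counts quoted in the text can be tallied as follows. The variable names are illustrative. The class-count line assumes that each of the ten single-class schools contributes exactly one class fewer than the planned four, which the text does not state explicitly.

```python
# Counts taken from the text; variable names are illustrative only.
total_schools = 112
principal_reported_random = 93
dropped_headteachers_not_uniform = 34
dropped_reshuffled_in_grade8 = 7

# Schools left after the headteacher confirmation step.
random_assignment_schools = (principal_reported_random
                             - dropped_headteachers_not_uniform)
# Final focal sample after dropping schools that reassigned in 8th grade.
focal_schools = random_assignment_schools - dropped_reshuffled_in_grade8
# Every other CEPS school counts as "non-focal" in the falsification test.
non_focal_schools = total_schools - focal_schools

# Assumption: four classes planned per school, and each of the ten
# single-class schools falls short by exactly one class.
sampled_classes = total_schools * 4 - 10

print(random_assignment_schools, focal_schools,
      non_focal_schools, sampled_classes)
# prints: 59 52 60 438
```

The tally reproduces the 59 random-assignment schools, the 52 focal schools, the 60 non-focal schools, and the 438 sampled classes cited in the paper.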

Random student/class assignments in 7th grade in focal schools: evidence
To check whether class assignments in 7th grade were indeed performed randomly in the 52 focal schools, we examined the correlations between individual students' own and classmates' 7th-grade test scores in these schools (as noted in Section 2.2, these are midterm exam scores of the Fall semester in 7th grade). Table 2, panel A, reports the results for the three core subjects: columns (1)-(2) for Chinese test scores, columns (3)-(4) for mathematics test scores, and columns (5)-(6) for English test scores. The results reported in odd-numbered columns suggest that without controlling for any other covariates, the correlations between one's own and classmates' average 7th-grade test scores are large and statistically significant for all three subjects. In contrast, the correlations became much smaller and statistically insignificant once conditional on school fixed effects and students' observed individual and household characteristics reported in Table 1.

Notes: Number of observations (N) = 4018. The sample includes 52 schools that randomly assigned newly-enrolled students to 7th-grade classes but did not reassign students in 8th grade. The sample includes only students who were attending 7th grade at the baseline (i.e., the 2013-2014 academic year).

Table 2
Correlations between one's and classmates' test scores in 7th grade.
Columns: (1)-(2) Chinese; (3)-(4) Mathematics; (5)-(6) English.
Panel A. 52 focal schools with random class assignments in 7th grade that did not reassign students in 8th grade.
Notes: Standard errors in parentheses, clustered at the class level. ***p < 0.01.
A further "falsification" test is to estimate the correlations between individual students' 7th-grade test scores and their classmates' average 7th-grade test scores in the 60 CEPS project schools that were excluded from our final analytical sample (recall Section 2.3), which potentially assigned their newly-enrolled 7th graders into 7th-grade classes in a non-random manner. The results reported in Table 2, panel B, indicate that in these 60 non-focal schools, the correlations between individual students' 7th-grade test scores and their classmates' average 7th-grade test scores remained large and statistically significant, even after school fixed effects and observed individual and household characteristics were included in the models. The contrast in the patterns observed in these two sets of schools provides further support to the presumption that class assignments in 7th grade were indeed done randomly in the 52 focal schools.
A third way to test the presumption of random student/class assignments is to test whether the distribution of educational resources was indeed "balanced" across different 7th-grade classes within given focal schools. More specifically, we followed recent CEPS-based studies and tested whether the class-level means of individual students' observed personal and family characteristics (e.g., age, gender, cognitive ability test scores that measure one's general logical thinking and analytical skills, birth order, sibship size, parental education, household registration status, and household income) are correlated with class characteristics (e.g., class size, teacher education, teaching experience, teaching load, and teaching style) [16,17]. Table 3 reports the results of the test: conditional on the school an individual student was attending, virtually all the correlations are statistically insignificant. Only two correlations (i.e., those between students' average age and their teachers' time spent grading homework and between students' household income level and teachers' time spent on lesson planning) are statistically significant. Yet given the large number of correlations being examined in Table 3, these few significant ones are likely driven by sampling variation.
The tests discussed in this subsection lend strong support to the presumption that in the 52 focal schools, incoming 7th-graders were assigned to 7th-grade classes in a random manner within schools, which offers a unique opportunity to identify within-class academic peer effects in these schools.
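The logic of the first check in this subsection (a raw correlation that shrinks once school fixed effects are absorbed) can be sketched with toy data. The numbers below are made up so that schools differ in level while own and classmates' scores are unrelated within each school. The helper names are hypothetical, and the sketch omits the individual and household controls and the class-clustered standard errors used in the paper.

```python
from collections import defaultdict
from math import sqrt


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the 'within' transform)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data: school A scores high, school B scores low (made-up values).
schools = ["A"] * 4 + ["B"] * 4
own_score = [2.0, 4.0, 2.0, 4.0, -4.0, -2.0, -4.0, -2.0]
peer_mean = [3.0, 3.0, 3.2, 3.2, -3.2, -3.2, -3.0, -3.0]

raw_corr = pearson(own_score, peer_mean)
within_corr = pearson(demean_by_group(own_score, schools),
                      demean_by_group(peer_mean, schools))
```

In this toy example `raw_corr` is large because of between-school differences alone, while `within_corr` is essentially zero, which is the pattern one would expect under within-school random assignment.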

Estimation framework
To develop an empirical framework for estimating the effects of classmates' academic performance, we begin with the standard "linear-in-mean" specification [11]:

$$y_{ics} = \beta_0 + \beta_1 \bar{y}_{-i,cs} + Z\gamma + u \qquad (1)$$

In Equation (1), the outcome variable $y_{ics}$ stands for a test score in a core subject (Chinese, mathematics, or English) for student i in class c of school s; $\bar{y}_{-i,cs}$ represents the "leave-me-out" mean of the corresponding test scores of this student's classmates (who are denoted by "$-i$"); $Z$ is a set of individual, family, class, and school characteristics reported in Table 1; as no set of observed factors can fully explain variations in $y_{ics}$, a disturbance term $u$ is added to balance the two sides of the equation, capturing potential influences of unobserved factors and measurement errors of the observed factors and the outcome variable. If Equation (1) is well-specified, $\beta_1$ is the parameter of primary interest, which captures the impact of (randomly assigned) classmates' (average) academic performance on individual students' performance; and the method of OLS (ordinary least-squares) can provide consistent estimates of $\beta_1$. However, OLS estimates of $\beta_1$ may be biased due to two potential identification problems. First, since any student is also his/her classmates' classmate, there may exist reverse causality running from student i's own academic performance, $y_{ics}$, to i's classmates' (average) performance, $\bar{y}_{-i,cs}$. In that case, OLS will overestimate the actual peer effects. Second, there might exist unobserved confounding factors (in $u$) that affect a student's own and his/her classmates' academic performance simultaneously, thereby creating a spurious correlation between $y_{ics}$ and $\bar{y}_{-i,cs}$. For instance, under "endogenous sorting," where sampled schools assign incoming 7th-graders to different 7th-grade classes based on talent unobserved by the researcher, $y_{ics}$ and $\bar{y}_{-i,cs}$ may be correlated even if there is no real causal relationship between them.
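The "leave-me-out" classmate mean used as the peer measure can be computed as in the following minimal sketch. The function name and the toy scores are illustrative and are not taken from the CEPS data.

```python
from collections import defaultdict


def leave_one_out_means(scores, classes):
    """For each student, average the scores of all *other* students in the class."""
    totals, counts = defaultdict(float), defaultdict(int)
    for s, c in zip(scores, classes):
        totals[c] += s
        counts[c] += 1
    # (class total - own score) / (class size - 1).
    # Every class must contain at least two students.
    return [(totals[c] - s) / (counts[c] - 1) for s, c in zip(scores, classes)]


# Toy example: one class of three students and one class of two.
scores = [1.0, 2.0, 3.0, 10.0, 20.0]
classes = ["c1", "c1", "c1", "c2", "c2"]
peer_means = leave_one_out_means(scores, classes)
print(peer_means)  # prints: [2.5, 2.0, 1.5, 20.0, 10.0]
```

Excluding the student's own score keeps it from appearing on both sides of the regression, although it does not by itself rule out the reverse-causality and sorting concerns discussed above.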
As discussed above, the CEPS data provide a unique opportunity to circumvent endogenous sorting. More than half of the CEPS project schools (59 out of 112) reported that they randomly assigned incoming 7th graders to different classes upon enrollment (we have provided evidence for the plausibility of this claim in Section 2.4). Because 7th grade is the first grade in the middle school system in China, such an assignment mechanism created an "experiment" that randomly mixed an incoming 7th-grader with other incoming 7th-graders with different demographic makeups, personality traits, and talents. Conditioning on the school that a student is attending, the within-school random classmate assignment also balances out the influence of unobserved confounders. Our analysis, therefore, focuses on the 59 schools with random classmate assignments. Yet, for reasons discussed immediately below, we further excluded 7 schools that reassigned students to different classes in 8th grade.
To circumvent the potential reverse causality from $y_{ics}$ to $\bar{y}_{-i,cs}$, we replace $\bar{y}_{-i,cs}$ with its (one-year) lagged measure. Specifically, we estimate:

$$y^{G8}_{ics} = \beta_0 + \beta_1 \bar{y}^{G7}_{-i,cs} + Z\gamma + u \qquad (2)$$

where $y^{G8}_{ics}$ is student i's 8th-grade test score in a given subject (i.e., the midterm exam score of the Fall semester in 8th grade) and $\bar{y}^{G7}_{-i,cs}$ is his/her classmates' average 7th-grade test score in that subject (i.e., the midterm score of the Fall semester in 7th grade). Since $\bar{y}^{G7}_{-i,cs}$ was generated before $y^{G8}_{ics}$, the latter cannot affect the former, which eliminates the concern about reverse causality. But in schools that "reshuffled" students in 8th grade, many students' 7th-grade classmates were no longer their 8th-grade classmates; as such, $\bar{y}^{G7}_{-i,cs}$ would not really capture one's 8th-grade classmates' 7th-grade performance in these schools. Therefore, we excluded from the analysis the seven schools that "reshuffled" students in 8th grade.

Table 3
Balancing test for the allocation of educational resources across classes within schools.

A potential concern is that the lagged peer-performance measure available in the CEPS data, $\bar{y}^{G7}_{-i,cs}$, was constructed based on classmates' midterm exam scores of the Fall semester in 7th grade; that is, it was measured about two months after random class assignments took place in 7th grade. As such, peer interactions during these two months may open some room for individual students to affect their classmates' academic performance, suggesting individual students' performance during this period as a potential omitted channel variable. To address this issue, we include student i's own midterm exam score, $y^{G7}_{ics}$, in the model:

$$y^{G8}_{ics} = \beta_0 + \beta_1 \bar{y}^{G7}_{-i,cs} + \beta_2 y^{G7}_{ics} + Z\gamma + u \qquad (3)$$

Since $y^{G7}_{ics}$ was also measured about two months after random class assignments took place in 7th grade, its inclusion effectively "blocks" the channel through which peer interactions occurring during the first two months of 7th grade enter the model. Put differently, with $y^{G7}_{ics}$ being held fixed, $\beta_1$ captures the effect of classmates' academic performance after both $\bar{y}^{G7}_{-i,cs}$ and $y^{G7}_{ics}$ were observed.
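A minimal sketch of how a lagged peer-effect model with school fixed effects and the student's own lagged score might be estimated is given below. It is not the authors' code. It uses noise-free simulated data with an assumed peer coefficient of 0.15 and own-score coefficient of 0.6, absorbs school fixed effects by within-school demeaning, omits the other controls, and does not compute class-clustered standard errors. All names are hypothetical.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract group means; this absorbs group (school) fixed effects."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def ols(columns, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(columns)
    # Augmented matrix [X'X | X'y].
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [x - factor * z for x, z in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Simulated, noise-free data (all values are assumptions for illustration).
schools = ["A"] * 4 + ["B"] * 4
peer_g7 = [0.2, 0.2, -0.2, -0.2, 0.5, 0.5, -0.1, -0.1]  # classmates' mean, grade 7
own_g7 = [1.0, -1.0, 0.5, -0.5, 0.3, -0.7, 1.2, 0.0]    # own score, grade 7
school_effect = {"A": 2.0, "B": -1.0}
score_g8 = [0.15 * p + 0.6 * o + school_effect[s]
            for p, o, s in zip(peer_g7, own_g7, schools)]

# Within-school transformation, then OLS on the demeaned variables.
beta_peer, beta_own = ols(
    [demean_by_group(peer_g7, schools), demean_by_group(own_g7, schools)],
    demean_by_group(score_g8, schools),
)
```

Because the simulated outcome contains no noise, `beta_peer` and `beta_own` recover the assumed 0.15 and 0.6 up to floating-point error. With real data the estimates would carry sampling error, and inference would require standard errors clustered at the class level.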

Main findings
This subsection reports the main findings of this paper. Table 4 presents estimates of classroom peer effects on individual students' 8th-grade test scores in three core subjects: Chinese (columns 1-3), mathematics (columns 4-6), and English (columns 7-9). Since, as discussed above, random student/class assignments of incoming 7th graders were done within the 52 focal schools, all estimates reported in Table 4 have been adjusted for school fixed effects. Three empirical specifications were adopted for each of the three core subjects. The first controls only for students' personal characteristics (e.g., one's own 7th-grade test score, gender, age, and the score of a cognitive ability test that measures one's general logical and analytical skills rather than his/her subject-specific knowledge), in addition to school fixed effects. The second specification adds the household characteristics reported in Table 1, panel E, to the model. The final specification further includes the characteristics of subject teachers reported in Table 1, panel E, as additional controls.
Two findings are notable from the table. Firstly, being (randomly) assigned to a 7th-grade class with high-achieving classmates has a beneficial effect on individual students' academic performance in 8th grade. More specifically, a one-SD increase in classmates' average 7th-grade mathematics test score is associated with an increase of 0.13-0.18 SDs in one's 8th-grade mathematics score (columns 4-6); classmates' average 7th-grade English score has a very similar effect on one's 8th-grade English score (columns 7-9). These effects are within the range of previous estimates, especially those found in East Asian countries. 5 In contrast, while peer effects on one's Chinese test scores are positive, the effects are not statistically significant. These differences in academic peer effects across subjects may reflect that there is less room for peer interaction in learning Chinese (a native language) than in learning mathematics (a technical subject) and English (a foreign language). They may also result from the different shapes of learning curves across subjects. For example, Chinese skills may require more time to develop and accumulate than mathematics and English skills; thus, peer effects on individual students' Chinese skills may need a longer time to materialize [3].
Secondly, and perhaps more importantly, conditional on school fixed effects, the estimated academic peer effects in the classroom remain quite robust to different empirical specifications. The robustness of empirical findings reported in Table 4 lends further support to our key identifying assumption, echoing the evidence reported in Tables 2 and 3: student/class assignments in 7th grade were indeed done randomly in the focal schools.
For comparison purposes, we repeated the analyses reported in Table 4 with the sample of the 60 CEPS schools whose teachers did not uniformly report random class assignments in 7th grade. Appendix Table A1 reports the results: the peer effects estimated using this sample are generally smaller than those estimated using the sample of schools with random student/class assignments in 7th grade (Table 4). This contrast is consistent with the possibility that the 60 schools with potential non-random class assignments sorted incoming 7th graders into 7th-grade classes based on their (previous) academic performance (Table 2, panel B, has already provided some suggestive evidence for this possibility). There are two reasons for the smaller estimates of peer effects in this sample. Mechanically speaking, with "performance sorting," classmates are likely to have more similar 7th-grade test scores than schoolmates in different classes. Thus, once individual students' own 7th-grade test scores have been controlled for, the relative importance of their classmates' average 7th-grade test scores in explaining their 8th-grade test scores declines. Econometrically speaking, as suggested in panel B of Table 2, sorting by previous performance creates a much higher correlation between individual students' and their classmates' 7th-grade test scores. This higher correlation, in turn, introduces multicollinearity issues in the models: once individual students' own (previous) test scores have been controlled for, the coefficients of classmates' average test scores become less statistically significant (especially for English test scores). Note that when subject teachers' characteristics are further included in the models, the estimated classroom academic peer effects vanish (Appendix Table A1).

Notes: The sample includes 52 schools that randomly assigned newly-enrolled students to 7th-grade classes but did not reassign students in 8th grade. "Household income" levels include "very poor" (reference group), "poor," "average," "rich," and "very rich." Standard errors in parentheses, clustered at the class level. *p < 0.1, **p < 0.05, ***p < 0.01.

The role of peer characteristics examined in previous CEPS-based studies
A related question is: Do the academic peer effects discussed above pick up the effects of peer composition in the class? Or do they represent separate effects of peers' academic performance? Recall from the Introduction that by exploiting random class assignments within schools, previous CEPS-based studies have provided important insights into the impact of classroom peer composition, such as the proportions of girls [14], migrant children [15,16], primary-school repeaters [21], classmates with alcoholic fathers [19], and only-child classmates [17], on individual students' academic performance. While many of these studies provided suggestive evidence that the effects of classmate composition work through a better learning environment (with more high-performing classmates), 6 none has provided direct evidence on whether classmates' academic performance indeed serves as a key channel.
The following analysis helps provide an answer. Table 5 examines how adding those peer-composition measures examined in previous CEPS-based studies in our models affects the estimated effects of peer academic performance-for Chinese (panel A), mathematics (panel B), and English (panel C) test scores. Columns 1-5 of Table 5 show that the estimated effects of peer academic performance remained similar after adding previously examined peer-composition measures. Further examining how including other commonly-used peer characteristics (i.e., parental education and family income) may affect our estimation results yields a similar pattern (columns 6-8). These findings suggest that classmates' academic performance exerts an additional and separate effect on individual students' academic performance rather than merely picking up the peer-composition effects found in previous CEPS-based studies.

Potential working channels
Classroom peer effects may also work through multiple channels (besides those examined in previous CEPS-based studies). The CEPS data enable us to explore two potential channels: increased time spent studying and raised confidence in learning. Table 6 reports the results.
Firstly, given China's competitive high-school admission system, seeing one's classmates perform better, one may decide to spend more time studying to either "catch up" with or even surpass them. To test this prediction, we estimate the effects of classmates' average 7th-grade test scores on one's time spent doing homework for each of the three core subjects in 8th grade. The results suggest that classmates' better academic performance induces individual students to spend more time working on homework assignments, on both weekdays and weekends. Specifically, a one-SD increase in classmates' average 7th-grade test score in a given subject raises one's time spent doing homework by 0.04-0.06 h on weekdays (Table 6, panel A) and 0.06-0.07 h on weekends (Table 6, panel B).
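To put these magnitudes in more familiar units, the hour estimates can be converted to minutes as in the short sketch below. The helper name is illustrative, and the reference period of the underlying survey item (for example, per day) is not stated in this passage, so the conversion says nothing about it.

```python
def hours_to_minutes(hours):
    """Convert hours to minutes, rounded to one decimal place."""
    return round(hours * 60, 1)


# Estimated ranges quoted in the text, in hours.
weekday_minutes = [hours_to_minutes(h) for h in (0.04, 0.06)]
weekend_minutes = [hours_to_minutes(h) for h in (0.06, 0.07)]
print(weekday_minutes, weekend_minutes)  # prints: [2.4, 3.6] [3.6, 4.2]
```

In other words, the estimated increases in homework time amount to roughly two to four minutes per one-SD increase in classmates' average score.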
Yet, spending more time doing homework does not necessarily mean that one would actually learn more. Low-achieving students may need to spend more time on homework because of their relatively low learning efficiency. Thus, it is also informative to explore other channels. Another potential channel is students' perceived difficulty in learning (which reflects their "confidence in learning"). We use sampled students' responses to the following survey question, "Do you feel difficulty in learning [subject]? ("subject" = "Chinese", "mathematics", or "English")" to measure their perceived difficulty in learning. As shown in Table 6, Panel C, classmates' better academic performance in a given subject is associated with a lower level of difficulty a student perceives in learning that subject, which may improve his/her learning efficiency.
Note that the CEPS also asked sampled students to recall their perceived levels of difficulty in learning the three core subjects in 6th grade (the last grade of primary school), which provides an opportunity to perform a falsification test for our quasi-experimental design. If one's classmates in 7th grade were indeed randomly assigned, their academic performance in 7th grade should not have any predictive power for one's perceived difficulty in learning in 6th grade. The results reported in Table 6, Panel D, verify this expectation: the associations between classmates' 7th-grade test scores and one's perceived difficulty in learning in 6th grade are statistically insignificant for all three subjects. Thus, the estimates reported in Panel C of Table 6 can be considered causal. Arguably, it is still possible that students' lowered perceived difficulty in learning is an outcome of their improved academic performance rather than a channel to achieve the latter (a similar argument applies to their increased learning time). But in that case, these results provide corroborative evidence that significant academic peer effects exist in China's middle school classrooms.
Note that the above explorations also suggest another line of robustness checks. If improved confidence in learning and increased learning time are driven by better academic performance, then individual students' academic performance may also be driven by high-achieving classmates' confidence in learning and learning time, but not classmates' better academic performance per se. To test this possibility, we include classmates' average weekly learning time and confidence in learning (i.e., the proportion of classmates feeling difficulty in learning a subject) in our models. As shown in Appendix Table A2, the estimated effects of classmates' academic performance remain robust to the inclusion of these two variables, regardless of whether these two variables were measured in 7th grade (odd-numbered columns) or 8th grade (even-numbered columns).

6 It has been found that girls outperformed boys in all three core subjects [14]; compared to students who never repeated a grade in primary schools, repeaters scored significantly lower in all three core subjects [21]; students with alcoholic fathers also had lower academic achievement [19]; students with no siblings scored significantly higher in mathematics compared with students who had siblings [21]. These findings suggest that having proportionally more girls, non-repeaters, classmates with non-alcoholic fathers, and only-child students in the class helps provide a better learning environment for all students in the class.
Table 5
Peer-composition controls added across columns (marked "Yes" where included): girls [14]; migrant children [15,16]; repeaters in primary school [21]; classmates with alcoholic fathers [19]; only-child classmates [18]; family backgrounds of 7th-grade classmates (mean).
Notes: The sample includes 52 schools that randomly assigned newly-enrolled students to 7th-grade classes but did not reassign students in 8th grade. "Controls" include the full set of control variables (i.e., students' own subject-specific 7th-grade test scores, personal, family, and teacher characteristics, and school fixed effects) reported in Table 4. Standard errors in parentheses, clustered at the class level. *p < 0.1, **p < 0.05, ***p < 0.01.

Table 6
Potential working channels of academic peer effects in the classroom.
Notes: The sample includes 52 schools that randomly assigned newly-enrolled students to 7th-grade classes but did not reassign students in 8th grade. Controls include all control variables (i.e., students' own subject-specific 7th-grade test scores, personal, family, and teacher characteristics, and school fixed effects) reported in Table 4. Standard errors in parentheses, clustered at the class level. *p < 0.1, **p < 0.05, ***p < 0.01.

Table 7
Heterogeneity in academic peer effects in the classroom.
Notes: The sample includes 52 schools that randomly assigned newly-enrolled students to 7th-grade classes but did not reassign students in 8th grade. All models include the full set of "control variables" (i.e., students' own subject-specific 7th-grade test scores, personal, family, and teacher characteristics, and school fixed effects) reported in Table 4. In Panel J, the "not poor" group includes the "average," "rich," and "very rich" groups. Standard errors in parentheses, clustered at the class level. *p < 0.1, **p < 0.05, ***p < 0.01.

Heterogeneity in classroom peer effects
More insights into how classroom peer effects work may be gained by examining how these effects vary across subgroups of students. Thus, we repeated the analyses reported in Table 4, but this time separately by gender, previous academic performance, parental education, family income level, school location, and class size. Table 7 reports the results, revealing three informative patterns. First, classroom peer effects differ greatly between boys and girls: compared with girls (panel A), boys (panel B) benefit more from peer influence in the classroom, which is consistent with the findings of many previous studies [15,25,26]. 7 Boys may be more sensitive to their learning environment and thus more easily affected by their classmates [26,27]. Whatever causes these gender differences, they are likely to widen the existing gender gap in educational attainment in China [28] through classroom peer effects. Second, students from relatively disadvantaged family backgrounds, e.g., those with less-educated parents (panels F-H) and from less wealthy families (panels I-J), benefit more from peer interaction. For these students, classmates' better academic performance serves to compensate for their disadvantaged backgrounds to some extent. Finally, students attending schools presumably of higher quality, e.g., urban schools (panel L) and schools with smaller class sizes (panel N), also gain more from peer interaction, suggesting that school quality and peer effects complement each other in education production. Thus, it is not surprising that academically stronger students (measured by their academic performance in 7th grade, panels C-D) enjoy larger peer effects.

Discussion and conclusion
Exploiting the random class assignment of incoming 7th graders in 52 middle schools participating in the CEPS, our quasi-experimental analysis found significant peer effects in China's middle-school classrooms. A one-SD increase in classmates' average 7th-grade mathematics (English) test score is associated with an increase of 0.13-0.18 (0.11-0.17) SDs in one's 8th-grade mathematics (English) test score, which implies a social multiplier effect of about 1.2 for mathematics and English learning (see footnote 8). While these effects seem modest, with reference to a recent study on the effect of private tutoring in China [29], the peer effects found in this study (at least those on mathematics test scores) are nearly double those of one year's private mathematics tutoring.
Also discovered are two working channels of classroom academic peer effects: increased time spent studying and raised confidence in learning. Perhaps most importantly, classroom peer effects are found to be heterogeneous across subgroups of students: the effects are more sizable (and thus more statistically significant) for students with relatively disadvantaged family backgrounds, academically stronger students, as well as students attending better schools. While the overall positive peer effects suggest ability tracking as a means to improve learning efficiency (especially for students with relatively disadvantaged family backgrounds), the heterogeneity in peer effects in favor of students enjoying stronger academic ability and higher school quality raises equity concerns.
Despite these important findings, a note on the limitations of our study is in order. First, as pointed out in Section 3, the CEPS data lack detailed information on sampled students' academic performance before middle-school enrollment. Thus, our estimates of academic peer effects may still be tainted by reverse causality running from one's own academic performance to one's classmates' performance, because our primary peer-performance measures were recorded roughly two months after students' enrollment in 7th grade. However, as we have argued above, this possibility is likely small because two months may be too short for reverse causality to occur. Moreover, the inclusion of students' own 7th-grade test scores can help break the potential reverse causality to a large extent. Nonetheless, studies with access to students' pre-enrollment test scores should be conducted to provide more accurate estimates of classroom peer effects.
Second, the CEPS data do not permit us to examine students' behavior in the classroom in response to differences in classmate composition. While we did explore several possible channels through which classroom academic peer effects work, these channels do not provide direct indications of how classmates interact in the classroom. More research investigating students' classroom behavior and peer interaction is needed to gain further insights into the working channels of classroom peer effects.
Third, also due to data limitations, we only examined peer effects among students in a single grade (8th grade). Yet, the role classmates play may differ greatly across different grades, since the level of difficulty in learning new materials and the approach to grasping these materials would evolve as students advance to a higher grade [25]. Future studies employing follow-up data from the CEPS (or other suitable data) might help discover informative, dynamic patterns of academic peer effects in China's middle school classrooms.
Finally, the measures of peers' academic performance used in this study are based on classmate averages. Yet, as pointed out in the recent literature [30,31], students may form peer subgroups within the class; after all, not all classmates influence an individual student in the same way. As such, our measures of peer academic performance may be tainted with measurement error, thereby masking some meaningful interactions within peer subgroups in the classroom. Future studies with more advanced research designs and more sophisticated empirical methods that can identify students' actual peer (sub)groups are expected to be fruitful.
Footnote 7: Note that not all existing studies found such a pattern. For example, in England peer effects were found to be larger for girls than boys in middle schools [9]; in a Chinese college, peer effects were found only for female students but not for male students [35].
Footnote 8: Peer effects create social multipliers [36]. For example, if the peer effect in math is $b$ ($b < 1$), a one-SD increase in peers' average math test scores increases one's math test score by $b$ SDs, which would lead to another $b^2$-SD increase in test scores in the next "round", and so on. Hence, the total effect would be $1 + b + b^2 + \dots = 1/(1-b)$. When $b = 0.18$, the social multiplier effect is $1/(1-0.18) = 1.22$.
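The social multiplier arithmetic described in footnote 8 can be checked with a short sketch. The function names `social_multiplier` and `partial_sum` are introduced here for illustration only; the values of b are the endpoints of the peer-effect ranges reported in this study (0.13-0.18 for mathematics, 0.11-0.17 for English).

```python
def social_multiplier(b):
    """Closed-form limit of the geometric series 1 + b + b**2 + ... for 0 <= b < 1."""
    if not 0 <= b < 1:
        raise ValueError("the geometric series converges only for 0 <= b < 1")
    return 1.0 / (1.0 - b)


def partial_sum(b, rounds):
    """Sum of the terms 1 + b + ... + b**rounds, i.e. the effect after `rounds` rounds."""
    return sum(b ** k for k in range(rounds + 1))


# Endpoints of the reported peer-effect ranges (in SD units).
for b in (0.11, 0.13, 0.17, 0.18):
    print(f"b = {b:.2f}: multiplier = {social_multiplier(b):.2f}")
```

For b = 0.18, the closed form rounds to 1.22, matching the figure in the footnote, and `partial_sum` shows how quickly the successive "rounds" converge to that limit when b is small.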
Despite these limitations, the current study provides valuable new evidence on the existence, scale, and working channels of classroom academic peer effects, which can help deepen our understanding of the nature of education production and inform educational policy in China and similar countries.

Author contribution statement
Qihui Chen: Conceived and designed the experiments; Wrote the paper. Chunchen Pei: Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Yuhe Guo: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data. Shengying Zhai: Performed the experiments; Contributed reagents, materials, analysis tools or data.

Additional information
Supplementary content related to this article has been published online at [URL].

Data availability statement
The data that support the findings of this study are available in the Chinese National Survey Data Archive at http://www.cnsda.org/index.php?r=projects/view&id=72810330 and http://www.cnsda.org/index.php?r=projects/view&id=61662993. The data are also available from the corresponding author, Dr. Qihui Chen (chen1006@umn.edu).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.