Feasible Peer Effects: Experimental Evidence for Deskmate Effects on Educational Achievement and Inequality

Schools routinely employ seating charts to influence educational outcomes. Dependable evidence for the causal effects of seating charts on students’ achievement levels and inequality, however, is scarce. We executed a large pre-registered field experiment to estimate causal peer effects on students’ test scores and grades by randomizing the seating charts of 195 classrooms (N=3,365 students). We found that neither sitting next to a deskmate with higher prior achievement nor sitting next to a female deskmate affected learning outcomes on average. However, we also found that sitting next to the highest-achieving deskmates improved the educational outcomes of the lowest-achieving students; and sitting next to the lowest-achieving deskmates lowered the educational outcomes of the highest-achieving students. Therefore, compared to random seating charts, achievement-discordant seating charts would decrease inequality; whereas achievement-concordant seating charts would increase inequality. We discuss policy implications.

Compositional peer effects, the causal effects of students' exposure to other students (Ogburn and VanderWeele 2014), matter because they can influence educational choices and outcomes. Compositional peer effects have commanded attention ever since the Coleman Report (Coleman 1966, 1968) found a positive correlation between students' outcomes and the characteristics of their peers. Today, compositional peer effects inform research and policy on segregation, busing, school choice, selective admissions, affirmative action, ability tracking, and grouping (for example, Bygren 2016; Hallinan 1994; Slavin 1990; Terrin and Triventi 2023).
Most past research in education has focused on compositional peer effects in large groups, such as schools and classrooms. This research, however, rarely isolates direct peer effects from competing mechanisms. For example, the total effect of educational tracking comprises direct peer effects, differential teacher quality and school resources across tracks, curricular differentiation, and the adjustment of teachers' expectations to their audience (Card and Giuliano 2016; Duflo, Dupas, and Kremer 2011; Hanushek and Wößmann 2006; Oosterbeek, Ruijs, and de Wolf 2023; Schiltz et al. 2019). Studying smaller, sub-classroom environments benefits theory development because it helps isolate compositional peer effects from competing mechanisms. Deskmate effects, in particular, are a promising target of inference because although teachers may sort into schools and classrooms in response to school- or classroom-level student achievement, they do not sort into classrooms in response to the seating chart. Similarly, teachers are unlikely to adjust expectations, curricular demands, or teaching styles at the desk level. Hence, deskmate effects likely represent pure, direct peer effects.
Studying micro-environmental peer effects, and deskmates in particular, may also benefit policy. Although macro-level interventions, such as school de-segregation and tracking, are hindered by cost and political controversy, intervening in seating charts is cheap and broadly accepted because teachers typically control the seating chart. Hence, even if micro-level peer interventions have small effects at the individual level, their feasibility facilitates scaling to generate large effects in the aggregate.
To study micro-environmental peer effects in education and their aggregate consequences (efficiency and inequality), we executed the largest randomized field experiment on the effect of close peers on educational outcomes by randomizing the seating charts in 195 classrooms across 41 Hungarian primary schools. Students were assigned to random deskmates for the duration of one semester. Following best-practice advice, we analyzed multiple learning outcomes, including teacher-assigned grades and machine-graded standardized tests; measured outcomes both during and after the intervention; pre-registered a detailed analysis plan to prevent p-hacking and hindsight bias; and corrected our results for multiple-hypothesis testing to control the risk of "false discoveries," that is, the over-interpretation of chance findings.
We present results for the effect of deskmates on students' individual outcomes and results for the aggregate consequences of seating charts on mean achievement levels and inequality. Our primary confirmatory analyses found no dependable evidence that sitting next to a deskmate with higher prior achievement or sitting next to a female deskmate meaningfully affected learning outcomes for the average student. Exploratory analyses, however, revealed important effect heterogeneity. Although the middle half of students were not affected by sitting next to deskmates with the lowest or the highest baseline achievement, sitting next to the highest-achieving deskmates improved the educational outcomes of the lowest-achieving students, and sitting next to the lowest-achieving deskmates lowered the educational outcomes of the highest-achieving students. This implies that achievement-discordant seating charts, compared to random seating charts, would decrease inequality in educational outcomes, whereas achievement-concordant seating charts would increase inequality in educational outcomes, in both cases without affecting mean achievement in the classroom. For theory, our results highlight that even micro-level peer effects can affect distributional outcomes. For policy, these findings are promising because seating charts are easily manipulated in practice.
The article proceeds as follows: Section 2 lists theoretical mechanisms, states our expectations, and reviews prior work. Section 3 describes the institutional setting. Section 4 details the experimental design. Section 5 explains estimation. Section 6 presents the results. Section 7 discusses the implications and concludes.

Expectations and Prior Research
Our theoretical expectations about the effects of deskmates on students' learning outcomes are informed by multiple related literatures on compositional peer effects in education, including the small literature on deskmate effects (Hong and Lee 2017; Keller and Takács 2019; Li et al. 2014; Lu and Anderson 2015; Wu, Zhang, and Wang 2023) and the sizeable literatures on dorm room assignments (for example, Foster 2006; McEwan and Soderberg 2006; Sacerdote 2001; Stinebrickner and Stinebrickner 2006), study groups (for example, Antecol, Eren, and Ozbeklik 2016; Booij, Leuven, and Oosterbeek 2016; Feld and Zölitz 2017), and educational tracking (for example, Card and Giuliano 2016; Duflo et al. 2011; Hanushek and Wößmann 2006; Oosterbeek et al. 2023; Schiltz et al. 2019). Together, these literatures suggest that deskmate exposures may affect students' academic outcomes via multiple mechanisms. First, students may learn directly from their deskmates by asking them for help, guidance, or explanations (Hanushek et al. 2003). Second, students and deskmates may learn from each other by collaborating on in-class exercises and assignments (Lu and Anderson 2015). Third, students may benefit from a deskmate's quiet focus or be distracted by their disruptive behavior (Lazear 2001). Fourth, students may treat deskmates as role models and emulate their effort and study habits (Stinebrickner and Stinebrickner 2006). Fifth, students may illicitly copy their deskmate's written work, including assignments and exams, or otherwise cheat.
Some of these potential mechanisms are specific to deskmate relationships, such as pairwise collaboration or copying on exams. Other mechanisms are merely especially salient for deskmates. For example, although disruptive students may affect the entire classroom, they will primarily distract their deskmates; and students are more likely to accept others as role models if they are friends, which proximity promotes (Rohrer, Keller, and Elwert 2021).
Consequently, we pre-registered two primary hypotheses: (1) that sitting next to a deskmate with higher prior achievement and (2) that sitting next to a girl increase students' academic achievement at endline. These hypotheses are interrelated because deskmates' characteristics are interrelated; that is, girls, on average, are better students than boys.
Five prior studies investigated deskmate effects on students' academic outcomes, with differing results. Hong and Lee (2017) studied deskmate effects at an elite Korean university, using instrumental variables estimation. They found that sitting next to a deskmate with a one-standard-deviation higher midterm score increased students' own final exam score by 0.12 standard deviations. This is a sizeable estimate, especially because it represents a lower bound on the true deskmate effect under the authors' theoretical model. Point estimates are similar for men and women but only statistically significant for men. Sitting next to a higher-achieving deskmate was found to help the lowest- and highest-performing students, but not students in the middle of the achievement distribution. Keller and Takács (2019) analyzed forty 10th-grade classrooms in seven underprivileged Hungarian high schools, using ordinary regression with classroom fixed effects. They, too, estimated large effects: a one-standard-deviation increase in deskmates' 8th-grade reading scores increased students' 10th-grade reading by 0.12 standard deviations. No effect was found for mathematics test scores, and effects did not differ by ethnicity.
The usual concerns over observational peer effects estimation apply to the foregoing two studies (Angrist 2014; Bramoullé, Djebbari, and Fortin 2020; Elwert and Winship 2014; Shalizi and Thomas 2011), including the possibility of unobserved selection into deskmate allocations within classrooms (Keller and Takács 2019), possible violations of the instrumental-variables exclusion restrictions (Hong and Lee 2017), and upward bias due to measurement error (Angrist 2014; Feld and Zölitz 2017).
The three prior randomized experiments found much smaller deskmate effects, or none at all. Lu and Anderson (2015) randomized the seating chart in twelve 7th-grade classrooms in one Chinese middle school for the duration of one academic year. They found that a one-standard-deviation increase in deskmate's baseline test scores (aggregated across multiple subjects) increased students' endline test scores by only 0.02 standard deviations (p=0.34). Li et al. (2014) and Wu et al. (2023) executed deskmate experiments in China with monetary incentives for peer performance. In sub-analyses without incentives, they found no evidence for causal effects of deskmates on achievement.
Prior research on deskmate effects has hardly explored implications for aggregate achievement, either for levels (efficiency) or inequality. Nonetheless, published results permit speculation. For example, it is easy to see that deskmate effects on aggregate achievement levels require effect heterogeneity: if all students benefited equally from every type of deskmate, then it would not matter, on average, where anybody sits. Hence, the absence of evidence for heterogeneity in the effect of sitting next to higher-achieving deskmates by gender (Hong and Lee 2017) and ethnicity (Keller and Takács 2019), just like Lu and Anderson's (2015) experimental null findings for effect heterogeneity at the desk level, implies no effect of seating charts on aggregate achievement levels one way or the other.
By contrast, when effect heterogeneity exists, it is possible that changing the seating chart may increase or decrease aggregate achievement levels. Predicting such consequences is difficult because they depend on the pattern and magnitude of the heterogeneity and on the extent to which various heterogeneity-relevant deskmate characteristics correlate with each other. Lu and Anderson's (2015) experimental finding for the effect of sitting next to girls suggests some leverage. Although the authors did not report evidence for differential effects of sitting next to a female deskmate on boys or girls, they did find that being fully surrounded by girls (front, back, and sides) increases girls' test scores by 0.2 standard deviations, whereas being surrounded by girls may reduce boys' performance. Hence, gender segregation within the classroom as a whole (for example, boys seated on the left side of the room, girls on the right) or across classrooms may increase the mean achievement in the classroom.
Effects of the seating chart on aggregate inequality do not require heterogeneous deskmate effects. For example, in gender-balanced classrooms where girls are better students than boys on average, Lu and Anderson's (2015) finding of equally positive effects of sitting next to a girl on both girls and boys implies that gender-concordant seating would increase inequality, whereas gender-discordant seating would decrease inequality. Similarly, Hong and Lee's (2017) finding that both high- and low-achieving students (but not middle students) benefit from sitting next to a high-achieving deskmate suggests that achievement-concordant seating might increase inequality.
In general, however, estimates from past research on deskmate effects, using relatively small samples, are too imprecise to offer a firm basis for speculation. Furthermore, whether these possible effects matter in practice is difficult to predict in the absence of exact calculations. We explore the consequences of our estimates for aggregate outcomes across multiple possible seating charts by simulation below.

Institutional Context
We study deskmate effects on students' learning outcomes in 3rd- to 8th-grade classrooms of rural Hungarian primary schools. Primary education in Hungary commences with first grade at age six and ends after 8th grade. Rural students attend the public school in their local catchment areas, where school choice is negligible. Rural primary schools are not tracked by prior achievement or segregated by gender, ethnicity, or other criteria beyond the existing composition of the school's catchment area. Classrooms are quite stable, so that students remain with the same classmates across grades.
Seating charts are typically determined by students' homeroom teachers. Teachers testify to their belief in the importance of seating charts for students' outcomes and often seat disruptive next to well-behaved students and lower-achieving next to higher-achieving students. Few teachers purposefully seat girls next to boys.
Students spend a significant amount of time next to their deskmate, because seating charts rarely change and apply to all subjects taught in the same room. For example, 3rd- to 8th-grade students receive seven to ten 45-minute lessons per week in the three core subjects (Hungarian literature [reading], Hungarian grammar [writing], and mathematics) in the same classroom, accounting for between one quarter and half of the average school day. Because many subjects in rural schools are taught in the same room, the total exposure of students to their deskmates in our sample is likely substantially greater.
Deskmates have many opportunities to influence each other directly, for example, through collaborative learning. In a 2022 survey that we conducted among a nationwide sample of N=656 primary school teachers, most teachers reported that deskmates collaborate almost every lesson (61 percent) or at least once a week (95 percent). The three most common deskmate activities were helping each other to learn, working together, and developing social skills.
Students are graded on a combination of monthly written exams, end-of-semester exams, recitations, participation, and sometimes behavior. Written work carries the greatest weight for end-of-semester grades. Grades matter for students' grade advancement and admission to tracked upper-secondary schools in 9th grade.
Importantly, students are not graded "on a curve." Eighty-one percent of primary school teachers in our 2021 survey reported using absolute achievement cutoffs for assigning grades. Because grade distributions thus are not fixed, we can detect deskmate effects on average grades, should such effects exist.

Consent, Pre-registration and Research Transparency
The study was reviewed and approved by the IRB offices at the Center for Social Sciences, Budapest, and the University of Wisconsin-Madison. Consent was obtained from school districts, principals, teachers, and students' parents. We pre-registered coding choices and statistical analyses in a detailed pre-analysis plan (available at https://www.socialscienceregistry.org/trials/2610), which was submitted before receiving endline data. A replication package containing data, survey instruments, and analytic scripts is available at: https://osf.io/ehjf8/.

Recruitment, Random Assignment, and Compliance
In early 2017, we contacted all primary schools in seven contiguous rural counties of north-central Hungary via the heads of the local school districts. We obtained initial participation agreements from 55 schools, in which most 3rd-8th grade classrooms were anticipated to implement our randomized seating chart in at least three core subjects (Hungarian literature, grammar, and mathematics), and all students in a classroom would receive instruction in these subjects together (for example, no ability grouping).
Shortly before the start of the 2017 fall semester, we randomized students via unconstrained random partitioning to freestanding, two-person, and front-facing desks within each participating classroom. Desks were arranged on a grid and separated by aisles to ensure that each student had only one deskmate. We based randomization on the class rosters from the preceding spring semester and stipulated a replacement algorithm to account for changes to class rosters via exits and entries during the summer. We requested adherence to the random seating chart for the duration of the entire fall semester from September 2017 until January 2018 (five months). Although teachers were permitted to reseat students after randomization, we asked that students be moved in pairs to preserve deskmate assignments wherever possible. We defined the deskmate composition resulting from randomization and replacement as the intended seating chart for our intention-to-treat analysis.
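As an illustration only, the unconstrained random partitioning described above could be sketched in Python as follows. The function name, the seed, the example roster, and the handling of an odd-sized roster (one student left without a deskmate) are assumptions made for this sketch; it does not reproduce the actual randomization code or the replacement algorithm for roster changes.

```python
import random


def random_deskmates(roster, seed=None):
    """Randomly partition one classroom roster into two-person desks.

    Returns a list of tuples. With an odd number of students, the last
    tuple contains a single student (an assumption of this sketch).
    """
    rng = random.Random(seed)      # local generator, reproducible via seed
    shuffled = list(roster)        # copy so the caller's roster is untouched
    rng.shuffle(shuffled)          # every ordering is equally likely
    # Consecutive pairs in the shuffled order share a desk.
    return [tuple(shuffled[i:i + 2]) for i in range(0, len(shuffled), 2)]


# Hypothetical roster of seven students
roster = [f"student_{k}" for k in range(1, 8)]
for desk in random_deskmates(roster, seed=2017):
    print(desk)
```

Because the shuffle imposes no constraints on who may sit together, every pairing of classmates is equally likely, which is what makes deskmate characteristics independent of students' own unobserved characteristics in expectation.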
Among recruited classrooms, we pre-registered to exclude all classrooms that did not meet certain inclusion criteria. We pre-registered a sample of 3,814 students. After additional pre-registered exclusions, our final analytic sample contained 3,365 students across 195 classrooms of 41 schools.
We measured compliance with the intended seating chart twice. Teachers reported that 94 percent of students in the analytic sample sat next to their intended deskmate in mid-September, two weeks after the start of the school year, and our field team recorded 86 percent compliance during in-person visits between October and December.

Data and Coding of Main Variables
The primary treatment variables derive from students' assignment to sit next to a specific, randomly allocated deskmate. We characterize the deskmate of student i in classroom c of school s using two baseline variables, $T^{D}_{ics}$: (1) deskmate's baseline GPA, defined as the average of deskmate's teacher-reported spring semester 2017 grades in the three core subjects (Hungarian literature, grammar, and mathematics, graded on a scale from 1 to 5, where 5 is best); and (2) deskmate's gender. Secondary treatment variables comprise deskmates' five subject-specific baseline grades in grammar, literature, mathematics, diligence, and behavior. We filled in missing teacher reports of students' baseline grades from students' self-reported baseline grades (3 percent of the cases).
Our primary outcomes are two measures of students' own educational achievement, $Y_{ics}$: (1) students' endline GPA (January 2018), defined as the average of students' teacher-reported end-of-term grades in the three core subjects Hungarian literature, grammar, and mathematics; and (2) students' grade-specific standardized reading comprehension test score. Reading tests contained between 10 and 19 items, depending on grade level, and were developed for this study from the test banks of the PISA-like National Assessment of Basic Competencies (NABC) by the Hungarian Educational Authority. The test was administered in class over 25 minutes as part of the 45-minute endline survey in the spring of 2018, one to three months after the completion of the intervention. Reading tests were machine-graded, precluding grader bias. We analyze the percentage of correctly answered items.
Secondary outcomes include (a) the five subject-specific fall-term grades in literature, grammar, mathematics, behavior, and diligence; and (b) the average of students' scores on teacher-written classroom tests in literature, grammar, and mathematics.
We note that several of our outcomes depend on each other: endline GPA aggregates the subject-specific grades in literature, grammar, and mathematics; and these subject-specific grades, in turn, heavily depend on students' classroom-test scores and, to a smaller extent, on diligence and behavior. By contrast, our standardized reading test scores do not mechanically depend on other outcomes.
We collected additional baseline information on students' demographics (gender, age, ethnicity [Roma or not]) and socioeconomic status (asking teachers to name the richest/poorest students in their classrooms) from teacher reports. These covariates were used for balance checks. Anticipating missingness, we pre-registered to use these covariates only as robustness checks in supplementary outcomes models.
Table 1 gives descriptive statistics for the analytic sample. Half of the sample is female; students are 12 years old on average, and 25 percent are Roma. Baseline GPA had mean=3.7 and standard deviation=1. Missingness in baseline variables is low (0-5 percent). Moderate missingness in outcome reading scores (10 percent) is due to students' absence on the day of the endline survey or lack of parental

Balance Checks
The key advantage of randomization is that it guarantees comparable ("balanced") treatment and control groups in expectation and thus justifies causal inference without the threat of unobserved confounding. We tested for balance following Guryan, Kroft, and Notowidigdo (2009) by regressing each baseline variable, $X_{ics}$, of student i on students' deskmates' baseline characteristic, $X^{D}_{ics}$, the leave-one-out mean characteristic in the classroom, $X_{-ics}$, and classroom fixed effects for the analytic samples of the two primary outcomes, with standard errors clustered at the school level. This procedure circumvents the artifactual correlation between students' own and their deskmates' characteristics induced by (successful) random partitioning within the classroom (Angrist 2014). Results demonstrate that the data are well-balanced (Online supplement Table A1). There are no, or only substantively minor, associations between students' and their assigned deskmates' baseline characteristics, and only one out of eleven associations is statistically significant.
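To illustrate why the leave-one-out classroom mean enters the balance regression, the following Python sketch computes it for a single hypothetical classroom. The function name and the example grades are assumptions made for illustration; the balance tests themselves were regressions, as described above.

```python
def leave_one_out_means(values):
    """For each student, return the mean of all *other* classmates' values."""
    n = len(values)
    if n < 2:
        raise ValueError("a leave-one-out mean needs at least two students")
    total = sum(values)
    # Remove the student's own value from the classroom total, then average
    # over the remaining n - 1 classmates.
    return [(total - v) / (n - 1) for v in values]


# Hypothetical baseline grades in one four-student classroom
grades = [2, 3, 4, 5]
for own, others in zip(grades, leave_one_out_means(grades)):
    print(f"own grade {own}: mean of classmates {others:.2f}")
```

Students with higher own values face a lower mean among their remaining classmates, the pool from which their deskmate is drawn. This is the mechanical negative association that random partitioning induces within finite classrooms, and conditioning on the leave-one-out mean removes it from the balance test.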

Primary Analyses
Our primary confirmatory analyses tested two main hypotheses. First, we hypothesized that sitting next to a deskmate with higher baseline achievement increases students' endline achievement (GPA or reading scores), $Y_{ics}$. Second, we hypothesized that sitting next to a girl increases students' endline achievement, $Y_{ics}$.
We evaluated each hypothesis using the following model, where $T^{D}_{ics}$ stands either for deskmate's baseline GPA or deskmate's female gender, respectively; $T_{ics}$ is student's own corresponding baseline characteristic, which must be controlled to avoid confounding by the artifactual correlation that is induced between students' and deskmates' characteristics by (successful) random partitioning (Angrist 2014); $X_{ics}$ is a covariate to increase statistical precision (student's baseline GPA when $T^{D}_{ics}$ is deskmate's gender, and student's gender when $T^{D}_{ics}$ is deskmate's baseline GPA); $\eta_{cs}$ are classroom fixed effects to account for the experimental design that randomized deskmates within classrooms; and $\epsilon_{ics}$ is an individual-level error term.
Due to randomization, the focal coefficient $\beta_1$ identifies the causal effect of sitting next to a deskmate with characteristic $T^{D}_{ics}$. All other coefficients estimate nuisance parameters and do not have a causal interpretation.
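Written out from the definitions above, equation 1 plausibly takes the following linear form. This is a reconstruction from the verbal description: only the focal coefficient $\beta_1$ is named in the text, so the labels $\beta_2$ and $\beta_3$ for the nuisance coefficients are assumed here, and any constant is taken to be absorbed by the classroom fixed effects.

```latex
\begin{equation}
Y_{ics} = \beta_1 T^{D}_{ics} + \beta_2 T_{ics} + \beta_3 X_{ics} + \eta_{cs} + \epsilon_{ics}
\tag{1}
\end{equation}
```

Under this form, $\beta_1$ compares students within the same classroom who share the same own characteristic $T_{ics}$ but were randomly assigned deskmates with different values of $T^{D}_{ics}$.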

Secondary Analyses
We executed five sets of exploratory secondary analyses. First, we replaced our primary outcomes in equation 1 with students' subject-specific endline grades and average scores from teacher-written classroom tests. Second, we replaced the outcome with students' subject-specific endline grades and the treatment with deskmates' corresponding subject-specific baseline variable.
Third, we explored heterogeneity in the effect of deskmate's baseline GPA by students' own baseline GPA. Specifically, following prior scholarship (for example, Duflo et al. 2011; Sacerdote 2001), we divided baseline GPAs within each classroom into three tiers, low (bottom quartile), middle (middle two quartiles), and high (top quartile), and fully interacted students' and deskmates' baseline GPA categories with each other, where lower-case l, m, and h refer to student's own baseline GPA tiers, and upper-case L, M, and H refer to their deskmate's baseline GPA tiers, with middle students sitting next to middle deskmates, mM, serving as the reference category. Fourth, we estimated deskmate effects by deskmate's and student's own gender, where g and b refer to the student's own gender (girl or boy), and G and B indicate deskmate's gender.
Fifth, we explored the robustness of our results by replacing the pre-registered measures of baseline and endline achievement with student scores on the comprehensive and nationally standardized NABC testing program. Since NABC scores became available only after pre-registration, and only for select grade levels, we report results in Online supplement B.
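The three-tier coding of baseline GPA used in the heterogeneity analysis can be illustrated with a short Python sketch. The rank-based quartile rule, the arbitrary handling of ties, the function names, and the example GPAs are assumptions made for this sketch; the text does not specify how ties or classroom sizes not divisible by four were handled.

```python
def gpa_tiers(gpas):
    """Label each student in one classroom 'l', 'm', or 'h' by baseline GPA rank.

    The bottom quarter by rank is 'l', the top quarter is 'h', and everyone
    else is 'm'. Ties are broken by roster order (an assumption).
    """
    n = len(gpas)
    k = n // 4                                   # students per outer quartile
    order = sorted(range(n), key=lambda i: gpas[i])
    tiers = ["m"] * n
    for i in order[:k]:                          # lowest-ranked students
        tiers[i] = "l"
    for i in order[n - k:]:                      # highest-ranked students
        tiers[i] = "h"
    return tiers


def cell_label(own_tier, deskmate_tier):
    """Combine own tier (lower case) and deskmate tier (upper case), e.g. 'lH'."""
    return own_tier + deskmate_tier.upper()


# Hypothetical classroom of eight students
baseline = [4.5, 1.0, 3.0, 3.5, 2.0, 5.0, 2.5, 4.0]
tiers = gpa_tiers(baseline)
# Student 1 (low tier) seated next to student 5 (high tier) falls in cell "lH".
print(cell_label(tiers[1], tiers[5]))
```

Each of the nine resulting cells (lL through hH) would then enter the regression as an indicator, with mM omitted as the reference category.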

Estimation, Standard Errors, and Multiple Hypothesis Testing
All models were estimated using the xtreg fixed-effects regression command in Stata, version 17 (StataCorp 2021). Robust standard errors are clustered at the school level. We report two types of hypothesis tests. First, we report two-sided hypothesis tests at conventional (unpenalized) levels of statistical significance. Second, we correct for multiple hypothesis testing and report penalized tests that hold the false-discovery rate at 5 percent using Benjamini and Hochberg's (1995) procedure. The false-discovery rate is the expected proportion of "false discoveries" (Type I errors) among all rejected null hypotheses in a set of statistical tests. Firm norms for defining test sets have not yet evolved. We treat as separate sets of tests the deskmate-related tests within each model for a primary outcome, and the deskmate-related tests across all models for secondary outcomes within each table or figure.
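For readers unfamiliar with the correction, the Benjamini and Hochberg (1995) step-up procedure can be sketched in a few lines of Python. This is an illustrative implementation, not the code used for the analyses; the function name and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null is rejected at FDR level q.

    Step-up rule: sort the m p-values, find the largest rank k such that
    p_(k) <= (k / m) * q, and reject the k hypotheses with the smallest
    p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank                 # keep the largest qualifying rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Hypothetical p-values from one set of five tests
p_vals = [0.001, 0.012, 0.035, 0.045, 0.5]
print(benjamini_hochberg(p_vals))
```

In this hypothetical set, four of the five p-values fall below the conventional 0.05 threshold, but only the two smallest survive the correction, which mirrors the distinction between unpenalized and penalized tests drawn above.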

Deskmate Effects by Deskmates' Baseline GPA
We found only weak evidence that sitting next to a deskmate with a higher (rather than a lower) baseline GPA increased the educational achievement of students on average. Although estimated deskmate effects were positive and in the expected direction for all primary and secondary outcomes (Table 2, and Online supplement Table A2), effect sizes were substantively small, and almost no estimate remained statistically significant after correcting for multiple hypothesis testing.
Specifically, the estimated causal effects of sitting next to a deskmate with a one-grade higher baseline GPA on the primary outcomes, student's own endline reading score (β = 0.29, p = 0.45) and endline GPA (β = 0.02, p = 0.13), were only 1 percent and 2 percent of a standard deviation, respectively. Neither effect was statistically significant at the conventional 0.05 level (Table 2, Columns 1 and 2).
Columns 3 to 8 in Table 2 report exploratory results for the effects of sitting next to a deskmate with a higher baseline GPA on students' secondary outcomes. We found only weak evidence of deskmate-GPA effects on students' endline grades in literature, math, or behavior. Effects on grammar, diligence, and scores on teacher-written classroom tests were statistically significant at the conventional (unpenalized) 5-percent level but substantively small (not exceeding 4 percent of a standard deviation on any outcome). Only the estimated deskmate-GPA effect on students' endline grade in diligence remained statistically significant after correcting for multiple hypothesis testing, and even that only after adding pre-
We conclude that the experiment does not provide dependable evidence for causal effects of sitting next to higher-achieving deskmates on the educational achievement of the average student. This conclusion is further supported by supplementary analyses of newly available comprehensive standardized test scores from the Hungarian NABC, a PISA-like testing program, as alternative exposure and outcome measures (see Online supplement B).
Because the average effects reported in Table 2 (and Tables A2 and A4) may obscure important effect heterogeneity, we next conducted heterogeneity analyses that allowed deskmate effects to vary by deskmates' and students' own baseline GPA (Figure 1, equation 2). We found that sitting next to a high-GPA deskmate (rather than a middle-GPA deskmate) increased several achievement outcomes of low-GPA students (that is, reading scores, endline GPA, grammar grades, and scores on teacher-written tests). Conversely, sitting next to a low-GPA deskmate (rather than a middle-GPA deskmate) reduced high-GPA students' outcomes (that is, endline GPA, grammar, literature, mathematics and diligence grades, and scores on teacher-written tests). High-GPA students also benefited from sitting next to a high-GPA deskmate in grammar and diligence. Neither sitting next to a high-GPA nor a low-GPA deskmate (rather than a middle-GPA deskmate) appreciably affected any outcomes of middle-GPA students.
The positive effects of sitting next to a high-GPA deskmate for low-GPA students were similar in size to the negative effects of sitting next to a low-GPA deskmate for high-GPA students. Several of these heterogeneous effects were substantively quite large. For example, sitting next to a high-GPA deskmate increased low-GPA students' endline GPA by about 0.15 standard deviations, whereas sitting next to a low-GPA deskmate decreased high-GPA students' endline GPA by about 0.2 standard deviations.
Many of the conventionally statistically significant negative effects of sitting next to a low-GPA deskmate on high-GPA students' outcomes remained statistically significant after correcting for multiple hypothesis testing (that is, endline GPA, literature, diligence), as indicated in Figure 1, and several positive effects of sitting next to a high-GPA deskmate on low-GPA students' outcomes (that is, endline GPA, grammar, teacher-written tests) very nearly remained statistically significant, too (Online supplement Table A5).
We conclude that the experiment provides dependable evidence for strong heterogeneous deskmate effects on the educational achievement of high- and low-GPA students.

Deskmate Effects by Deskmates' Gender
We found no dependable evidence that sitting next to a female deskmate increased the educational achievement of the average student. Our primary analyses (Columns 1 and 2 in Table 3) show that sitting next to a girl (compared to a boy) has no detectable effect on the endline reading scores (β = 0.03, p = 0.63) or endline GPA (β = 0.03, p = 0.08) of students on average, as point estimates are positive but substantively small and not statistically significant. Our secondary analyses (Columns 3-8 in Table 3) indicate that sitting next to a girl increased only the average student's endline grammar grade (β = 0.06; p = 0.02) but not their other outcomes. All point estimates were substantively small and not statistically significant after correcting for multiple hypothesis testing.
Similarly, we found no dependable evidence for effect heterogeneity in the causal effect of sitting next to a girl. The heterogeneity analysis shown in Figure 2 (equation 3) indicates positive effects of sitting next to a female deskmate on female students' endline GPA, grammar, and literature grades of around 5 percent of a standard deviation. These effects, however, were no longer statistically significant after correcting for multiple hypothesis testing. There is no evidence that boys benefit from (or are harmed by) sitting next to girls.
Table 3 and Figure 2 notes: Benjamini and Hochberg's (1995) procedure with a 5% false-discovery rate across deskmate coefficients for the secondary outcomes indicates no statistically significant coefficients. Corresponding models with additional covariate controls in Appendix Table A3; see also Appendix Table A6. Reference category: Deskmate's gender = Male.

Consequences for aggregate achievement levels and inequality
The presence of heterogenous deskmate effects across students with different baseline GPAs (Figure 1) implies that changing the seating chart can affect aggregate achievement levels and inequality. We illustrate the effects on within-classroom achievement levels and inequality with stylized simulations (assuming classrooms containing 24 students seated at freestanding two-person desks). 23 The key simulation parameters are the heterogenous causal deskmate effects estimated from equation 2, shown in Figure 1. We further assume that students differ only with respect to their baseline GPAs (mean = 3.8, SD = 0.9 on a 5-point scale). We consider the effects of three prototypical seating charts: achievement-concordant seating charts that pair up students with the most similar baseline GPAs; achievement-discordant seating charts that pair up high-GPA with low-GPA students; and random seating charts that allocate deskmates randomly. 24
Table 4 note: The table shows averages of 500 simulation runs for a classroom with 24 students seated at two-person desks and a baseline GPA distribution that mimics the empirical distribution (left-skewed, min = 1, mean = 3.8, max = 5, SD = 0.9); simulation parameters for causal deskmate effects on endline GPA are from Eq. 2.
Table 4 presents three main results. 25 First, rearranging the seating chart by prior achievement has no appreciable effect on the average achievement level within the classroom. In the random seating chart, the average endline GPA is 3.75 (SD = 0.69). In the achievement-concordant and achievement-discordant seating charts, the average achievements are 3.78 and 3.74, respectively. These differences are too small to matter in practice. 26 Rearranging the seating chart has no appreciable effect on achievement levels because the substantial deskmate effects on the fifty percent of students with low or high prior achievement are still too small to compensate for the absence of deskmate effects on the middle half of students.
Second, rearranging the seating chart by prior achievement does affect aggregate inequality within classrooms. In the random seating chart, the standard deviation of students' endline GPAs is 0.69 grade points. Compared to random seating, achievement-concordant seating increases the within-classroom standard deviation by 6 percent to 0.73. By contrast, achievement-discordant seating decreases the standard deviation by 14 percent to 0.59 (Panel A).
It follows from the parameter estimates shown in Figure 1 that all movement is in the tails of the distribution (Column 2, Panels B-D). Compared to random seating, achievement-discordant seating reduces the achievement of students with high baseline GPA by 0.17 points, because they now exclusively sit next to low-achieving deskmates, who exert a negative effect on their grades. The same seating chart increases the achievement of low-GPA students by 0.1 points, because they now exclusively sit next to high-GPA deskmates, who exert a positive effect on their grades. The reverse explains the inequality-increasing effect of achievement-concordant seating: no high-GPA students sit next to low-GPA students, which would have decreased their achievement, and no low-GPA students sit next to high-GPA students, which would have increased their achievement.
Third, as an aside, we note that seating charts also affect inequality within each tier of students: compared to random seating, both achievement-concordant and achievement-discordant seating charts decrease the variation of endline outcomes within the groups of students with high, middle, and low baseline GPA (Panels B-D, Columns 3 and 4). The reason is that, compared to random seating, the two systematic seating charts expose each group of students to a more homogenous group of deskmates. 27 This demonstrates that systematic seating charts based on students' baseline characteristics can increase the internal coherence of groups defined by these characteristics, regardless of whether the seating chart increases or decreases outcome inequality within the classroom overall.
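The logic of the stylized simulation in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the two effect sizes are round values loosely based on the magnitudes reported in the text rather than the Eq. 2 coefficients, the baseline distribution is a clipped normal rather than the empirical left-skewed distribution, endline GPA is assumed to equal baseline GPA plus the deskmate effect, and all function names are hypothetical.

```python
import random
import statistics

# Assumed deskmate effects in GPA points (illustrative, not the Eq. 2 estimates).
EFFECTS = {
    ("low", "high"): 0.15,   # low-GPA student seated next to a high-GPA deskmate
    ("high", "low"): -0.20,  # high-GPA student seated next to a low-GPA deskmate
}


def tiers(baseline):
    """Label students low/middle/high by within-classroom baseline quartile."""
    n = len(baseline)
    quarter = n // 4
    ranked = sorted(range(n), key=lambda i: baseline[i])
    labels = ["middle"] * n
    for rank, i in enumerate(ranked):
        if rank < quarter:
            labels[i] = "low"
        elif rank >= n - quarter:
            labels[i] = "high"
    return labels


def pair_up(baseline, chart, rng):
    """Return deskmate pairs (i, j) for the given chart; assumes an even class size."""
    ranked = sorted(range(len(baseline)), key=lambda i: baseline[i])
    if chart == "concordant":   # neighbours in the ranking share a desk
        return [(ranked[k], ranked[k + 1]) for k in range(0, len(ranked), 2)]
    if chart == "discordant":   # highest with lowest, then continue inward
        return [(ranked[k], ranked[-1 - k]) for k in range(len(ranked) // 2)]
    shuffled = ranked[:]
    rng.shuffle(shuffled)       # random seating
    return [(shuffled[k], shuffled[k + 1]) for k in range(0, len(shuffled), 2)]


def endline(baseline, chart, rng):
    """Endline GPA = baseline GPA + deskmate effect (no other dynamics assumed)."""
    label = tiers(baseline)
    out = list(baseline)
    for i, j in pair_up(baseline, chart, rng):
        out[i] += EFFECTS.get((label[i], label[j]), 0.0)
        out[j] += EFFECTS.get((label[j], label[i]), 0.0)
    return out


def simulate(chart, runs=500, n=24, seed=1):
    """Average within-classroom mean and SD of endline GPA across simulated classrooms."""
    class_rng = random.Random(seed)      # draws baseline GPAs (same classrooms for every chart)
    seat_rng = random.Random(seed + 1)   # draws random seating charts
    means, sds = [], []
    for _ in range(runs):
        # Assumed baseline distribution: normal(3.8, 0.9) clipped to the 1-5 scale.
        baseline = [min(5.0, max(1.0, class_rng.gauss(3.8, 0.9))) for _ in range(n)]
        gpa = endline(baseline, chart, seat_rng)
        means.append(statistics.mean(gpa))
        sds.append(statistics.pstdev(gpa))
    return statistics.mean(means), statistics.mean(sds)


for chart in ("random", "concordant", "discordant"):
    mean, sd = simulate(chart)
    print(f"{chart:>10}: mean={mean:.2f}, SD={sd:.2f}")
```

Under these assumptions the sketch reproduces only the qualitative ordering discussed above (discordant seating yields the smallest within-classroom SD, concordant seating the largest); the numerical values will differ from Table 4 because the inputs are placeholders.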

Discussion
Peer effects matter because they can influence educational choices and outcomes. Dependable, field-experimental evidence for peer effects, however, remains scarce, especially at the sub-classroom level. We conducted a large field experiment to investigate causal peer effects of deskmates on students' learning outcomes by randomizing the seating chart in 195 classrooms for the duration of one semester. We draw two main conclusions. First, there is no dependable evidence for causal effects of sitting next to deskmates with high baseline GPAs or next to girls on students' average learning outcomes. Although estimates pointed in the expected positive direction across all outcomes (that is, standardized reading scores, endline GPA, subject-specific grades, and classroom exams), point estimates were substantively small, few estimates were statistically significant by conventional standards, and hardly any remained statistically significant after correcting for multiple hypothesis testing. Because our sample was large and standard errors were small, we can rule out substantively large deskmate effects on the average student's learning outcomes.
Our results thus contrast with prior observational studies that found large average deskmate effects in an elite Korean university (Hong and Lee 2017) and in Hungarian high schools (Keller and Takács 2019). Our results also fail to confirm Lu and Anderson's (2015) experimental finding of positive effects of sitting next to a girl in one Chinese middle school. Like the randomized experiments in China (Lu and Anderson 2015; Li et al. 2014; Wu et al. 2023), we find no evidence for an effect of sitting next to a higher-achieving deskmate on average.
Second, we found dependable evidence for substantial effect heterogeneity by students' own baseline achievement (GPA), similar to Hong and Lee's (2017) observational findings. Across multiple learning outcomes, the lowest-achieving students benefited from sitting next to the highest-achieving students in the classroom. In turn, multiple outcomes of the highest-achieving students were diminished by sitting next to the lowest-achieving students. For example, sitting next to a deskmate from the bottom quartile of students within the classroom (measured by baseline GPA) reduced the endline GPA of top-quartile students by 0.2 standard deviations, which is nearly as much as the descriptive difference in baseline GPA between boys and girls. Concretely, this means that two out of three high-achieving students seated next to low-achieving deskmates rather than middle-achieving deskmates would receive a lower grade in one of the three core subjects that comprise their GPA in our analysis. 28
Heterogenous peer effects have important policy implications. Although we found that seating charts do not meaningfully affect aggregate achievement levels (efficiency), they can affect inequality between students: achievement-concordant seating charts would increase inequality, whereas achievement-discordant seating would decrease inequality. In this respect, our experimental results align with the observational literature on between- and within-school tracking, which reports that tracking by prior achievement (as the macroscopic equivalent of achievement-concordant seating charts) is associated with increased inequality (Terrin and Triventi 2022), but contrast with experimental studies that sometimes indicate that tracking can increase aggregate achievement levels (for example, Duflo et al. 2011).
Employing seating charts to influence inequality presents ethical questions that hinge on competing policy objectives and normative preferences regarding which students should benefit. For example, policymakers who want to decrease inequality in student achievement might advocate achievement-discordant seating charts. The resulting decrease in inequality, assuming that our results generalize to the new setting, however, would be accomplished not only by increasing the achievement of the lowest-achieving students but also by reducing the achievement of the highest-achieving students. The gain of one group of students would be the other group's loss. Furthermore, the highest-achieving students within a given classroom will often be distinctly underprivileged when schools are stratified by prior achievement (for example, due to regional and neighborhood differences, as in the United States and Hungary). Reducing aggregate inequality by adopting achievement-discordant seating charts might thus unintentionally hurt the most promising among the neediest students. By contrast, policymakers who want to avoid systematically harming specific groups of students (or, because all students have to sit next to somebody, spread inevitable harm and benefits fairly) might advocate for frequent random reseating, albeit at the cost of forgoing the inequality-reducing effect of achievement-discordant seating.
Whether seating charts can effectively influence inequality also depends on additional parameters. Two considerations give pause. First, peer effects typically are 'second-order' effects (Borgen, Borgen, and Birkelund 2023). In our study, as in all prior deskmate studies, their magnitude pales in comparison to the educational inequalities indexed by students' own and their families' baseline characteristics. Second, changing the seating chart will affect total inequality within a school, district, or nation only indirectly via its effect on within-classroom inequality. Consequently, the relative effect of seating charts on total inequality will diminish with the magnitude of between-classroom inequality. If there is no between-classroom inequality in students' baseline characteristics, then the effect of the seating chart on within-classroom inequality equals its effect on total inequality and hence will be large. If, however, students' achievement differs across classrooms, for example, due to geographic differences or ability tracking, then the effect of the seating chart on total inequality will be smaller.
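The second consideration follows from the law of total variance, under which total variance equals the average within-classroom variance plus the between-classroom variance. A minimal sketch, assuming equal-sized classrooms, a seating chart that scales every within-classroom SD by the same factor, and classroom means that stay fixed; the function name and the numeric inputs are illustrative, with the within-classroom SD of 0.69 and the 14 percent reduction borrowed from the simulation results reported earlier, and the between-classroom SDs invented for the example.

```python
import math


def total_sd_change(within_sd, between_sd, within_change):
    """Relative change in total SD when every within-classroom SD is scaled
    by (1 + within_change) and between-classroom differences stay fixed."""
    before = math.sqrt(within_sd**2 + between_sd**2)
    after = math.sqrt((within_sd * (1 + within_change))**2 + between_sd**2)
    return after / before - 1


# Illustrative inputs: within-classroom SD of 0.69 and a 14 percent reduction;
# the between-classroom SDs are hypothetical.
for between_sd in (0.0, 0.35, 0.69):
    change = total_sd_change(0.69, between_sd, -0.14)
    print(f"between-classroom SD={between_sd:.2f}: total SD changes by {change:+.1%}")
```

With no between-classroom inequality the change in total SD equals the within-classroom change; as between-classroom inequality grows, the same seating chart moves total inequality by less.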
That said, decades of costly, controversial, and frequently failed policy interventions at supra-classroom levels have demonstrated that changing aggregate educational outcomes is exceedingly difficult. Although their effects may be small at the individual level, seating charts are a promising tool for affecting educational inequality in the aggregate because intervening on seating charts is eminently feasible in practice.

Notes
1 For example, Duflo et al.'s (2011) famous tracking experiment in Kenyan primary schools found that all students benefited from exposure to higher-achieving peers, which would suggest negative effects of tracking on low-achieving students. Nevertheless, the total effect of tracking was positive for all students because teachers adjusted their teaching style to benefit low-achieving students.
2 Because college roommates rarely take the same courses (exceptions include Carrell, Fullerton, and West 2009; Lyle 2007; Li et al. 2019), this literature speaks more to the effect of coresidence (akin to neighborhood effects) than to peer effects in the context of receiving shared instruction (Stinebrickner and Stinebrickner 2006).
3 Cheating affects students' measured learning outcomes (grades and test scores). Regardless of whether students learn anything in the process, measured outcomes are important because they affect grade retention and a host of future transitions (e.g., educational enrollment, hiring). We imply no normative judgement.
4 Another potential mechanism is raised by 'Big-Fish-Little-Pond' theory in educational psychology, which posits that exposure to academically stronger peers depresses, and exposure to academically weaker peers increases, students' academic self-concept, which, in turn, might affect students' achievement (e.g., Marsh and Parker 1984). We tested this prediction in the current setting and found no evidence for deskmate effects on students' academic self-concept (Keller, Kim, and Elwert 2023).
5 In observational studies of peer effects, classical measurement error produces upward bias rather than attenuation toward zero (Angrist 2014). The attenuating effect of classical measurement error is restored when peer relations are randomly assigned (Feld and Zölitz 2017).
6 The authors argue against differential disruptiveness as a mediator. Copying from a deskmate on the outcomes test was prevented by spreading test takers across multiple rooms.
sociological science | www.sociologicalscience.com
19 We did not execute pre-registered heterogeneity analyses using machine learning tools, because the software packages we pre-registered cannot accommodate the classroom fixed effects that our analysis requires for identification.
20 In so doing, we attempt to strike a balance between over- and under-penalization. On one hand, because Benjamini and Hochberg's (1995) procedure does not account for dependencies between tests, the procedure is conservative (that is, reduces the power to reject the null of no effect). In this sense, our corrections may be considered stringent. On the other hand, combining test sets across tables and figures would make the corrections even more conservative. In this sense, our choice not to declare test sets across tables and figures may be considered lenient.
21 The deskmate effects on students' endline grades in grammar and diligence appear to be driven by deskmates' baseline grades in grammar and diligence, respectively (Online supplement Table A4), although these estimates are not significant after correcting for multiple hypothesis testing, either.
22 This suggests that the absence of effects on the learning outcomes of the average student (reported in Table 2) is explained by the absence of effects on middle-GPA students, who comprise half of the sample.
23 We defer the formal analysis of optimal seating chart assignments to future research.
24 Achievement-concordant seating charts rank students by GPA and seat pairs of most-similar students at the same desk. Achievement-discordant seating charts rank students by baseline GPA and seat the student with the highest GPA next to the student with the lowest GPA, and then continue to form subsequent pairs in order. In each classroom, there exist one achievement-concordant, one achievement-discordant, and many possible random seating charts. Table 4 reports averages across 500 simulated classrooms. We also considered various other seating charts, which did not add much and are omitted here.
25 Simulation results should primarily be read for qualitative evidence on the direction of effects. The quantitative change in inequality depends on additional parameters, which this simulation does not vary. For example, the absolute effects of seating charts on within-classroom inequality will generally increase with the variance in the baseline characteristic on which the seating chart is based. Pointedly, even with strongly heterogeneous deskmate effects, changing deskmates will not change outcomes when students do not differ in their baseline characteristics.
26 Neither achievement-discordant nor achievement-concordant seating charts guarantee the most extreme results in general.
27 Concretely, under random seating, high-GPA students are exposed to a mix of low, middle, and high baseline GPA deskmates, which pull their endline GPA in all directions. By contrast, in achievement-concordant (discordant) seating, the endline GPA of all students with high baseline GPA is pulled upward (downward), because all have high- (low-) GPA deskmates, resulting in less endline inequality within the high-achievement group. Similar reasoning explains the inequality-decreasing effect of systematic seating charts for students with low and middle baseline GPAs, respectively.
28 The aggregate endline GPA has SD ≈ 1. Reducing the endline grade in one out of three subjects by 1 for two out of three students reduces endline GPA by (2/3)/3 ≈ 0.2.

Figure 1: Heterogenous deskmate effects on educational outcomes by student's and deskmate's baseline GPAs. Notes: Estimated deskmate effects on various educational outcomes, by students' and deskmate's baseline achievement. Pre-registered specifications (Eq. 2). Each panel visualizes the effects of sitting next to a deskmate in the lowest (L) or highest (H) baseline-GPA quartile, respectively, compared to sitting next to a deskmate in the middle (M, reference) two quartiles, separately for students in the lowest (l), middle (m), and highest (h) baseline GPA quartile. Deskmate effects are expressed in standard deviations of the outcome. Bars display 95-percent confidence intervals. Bold bars indicate statistical significance at the (unpenalized) α = 0.05 level. For example, the faint bar in the top left corner shows that the effect of sitting next to a deskmate in the lowest baseline GPA quartile (compared to sitting next to a deskmate in the middle quartiles) on the endline reading score of students in the lowest baseline GPA quartile is not statistically significant. Contrasts marked by $ remain statistically significant at a false-discovery rate of 5% after Benjamini-Hochberg (1995) corrections for multiple hypothesis testing (separately considering the tests within each model for a primary outcome [reading and GPA] and all tests across all models for secondary outcomes in this figure as a set, respectively). Full regression results shown in Appendix Table A5. Reference category: Deskmate's baseline GPA = Middle.

Table 1: Descriptive statistics for the main variables in the analytic sample.
a Missingness in baseline grades and GPA is reported after replacing missing teacher reports with student self-reports, where possible. [...] consent to participate in the assessment. Missingness in fall-term test scores (16 percent) results from teacher non-response at endline.

Table 2: Deskmate Effects on Students' Achievement by Deskmate's Baseline GPA. Note: GPA is computed from grades in grammar, literature, and mathematics (5 is best, 1 is worst). All models control for classroom fixed effects and include a constant. No additional controls are included. Cohen's D equals the deskmate coefficient divided by the standard deviation of the outcome. Robust standard errors are clustered at the school level in parentheses. Table reports conventional (unpenalized) hypothesis tests. Penalizing significance levels for multiple hypothesis testing by Benjamini and Hochberg's (1995) procedure with a 5% false-discovery rate across deskmate coefficients for the secondary outcomes indicates no statistically significant coefficients. Pre-registered robustness checks of each model with additional covariate controls in Appendix Table A2.

Table 3: Deskmate Effects on Students' Achievement by Deskmate's Gender. Note: GPA is computed from grades in grammar, literature, and mathematics (5 is best, 1 is worst). All models control for classroom fixed effects and include a constant. No additional controls are included. Cohen's D equals the deskmate coefficient divided by the standard deviation of the outcome. Robust standard errors are clustered at the school level in parentheses. Table reports conventional (unpenalized) hypothesis tests. Penalizing significance levels for multiple hypothesis testing by

Table 4: Simulated effects of different seating chart designs on levels and within-classroom inequality in student achievement (GPA).