Learning statistics and probability through peer tutoring: A middle school experience

The academic effects of learning statistics and probability through peer tutoring were analysed in the research reported on here. Two hundred and eight students enrolled in Grades 7, 8 and 9 participated. Fixed, same-age peer tutoring was implemented 3 times per week for 6 weeks. Each tutoring session lasted approximately 25 minutes. The main aims of this research were to quantify the effect of peer tutoring and to determine any differences among grades. A pre-test post-test design was employed. Students were assigned to control or experimental conditions. Effect sizes were calculated and non-parametric statistical tests were performed. No statistically significant differences were found in the pre-test analysis. Statistically significant improvements were found with the implementation of the programme for all grades, both individually and globally (Mann-Whitney U test = 5436.79, p < .01). The reported global effect size may be considered as medium to large (Hedges' g = 0.72). The comparison among grades did not reveal any significant differences. It can be concluded that using peer tutoring for learning statistics and probability could be academically beneficial for middle school students.


Introduction
The first studies in the field of peer learning were conducted in the 1970s. The original research articles by Dineen, Clark and Risley (1977), VW Harris and Sherman (1973) and Rosen, Powell, Schubot and Rollins (1978) were followed by hundreds of reports on experiences in the field. In recent years, there has been an increase in the number of published articles that investigate the benefits of peer tutoring in mathematics from academic, social and psychological perspectives (Reber, 2019). Recent studies by Chi, Kim and Kim (2018), Hrastinski, Stenbom, Benjaminsson and Jansson (2021) and Leung (2019b) have repeatedly documented this methodology's potential. Although not many studies in early childhood education address mathematics content, an extensive and diverse literature exists for primary, secondary and higher education in mathematics and other subjects. Nevertheless, very few peer tutoring reports address statistics or probability. According to Leung (2015), almost all studies addressed arithmetic and geometry and only some of them referred to algebra. Hence, peer tutoring experiences in statistics and/or probability published in indexed journals are hard to find. Recent reviews and meta-analyses in the field did not include any studies of this kind (Leung, 2015, 2019a). In this study, the effectiveness of peer tutoring with middle school students (7th, 8th and 9th grades) working with statistics and probability was examined. The academic achievement of the students was analysed with a pre-test post-test control group design.

Literature Review
Although hundreds of authors have contributed to the knowledge of peer tutoring in mathematics, three of them have excelled in the field for their continuous publications in high-impact indexed journals since the late 1980s and early 1990s: Lynn Fuchs, John Fantuzzo, and Keith Topping. These are the authors with the greatest number of publications in the research area according to Google Scholar. Almost all of their publications refer to arithmetic, algebra and geometry content.
Lynn Fuchs (Fuchs, Fuchs, Hamlett & Karns, 1998; Fuchs, Fuchs, Malone, Seethaler & Craddock, 2019; Fuchs, Fuchs, Phillips, Hamlett & Karns, 1995; Fuchs, Fuchs, Yazdian & Powell, 2002) has repeatedly documented that peer tutoring can help any type of student, even learning-disabled students, to improve their academic achievement in mathematics. Fuchs' studies usually refer to primary education, but she has also documented experiences in early childhood and secondary education. She emphasises that in order to maximise the outcome of this type of experience, the interactions between students must be rich and supervised by a professional in the field. She also states that although peer tutoring experiences may often report smaller improvements for learning-disabled students, practitioners may find academic benefits with this methodology in spite of the students' condition. Several authors in the peer tutoring field such as Clarence (2016) and Swartz, Deutsch, Makoae, Michel, Harding, Garzouzie, Rozani, Runciman and Van der Heijden (2012) also support these statements.
John Fantuzzo has been researching peer tutoring for almost four decades. He has not only focused on academic achievement, but has also addressed social and psychological variables in his studies. Fantuzzo studied variables such as self-concept or social interactions between students, documenting positive results most of the time (Fantuzzo, Davis & Ginsburg, 1995; Fantuzzo, Gadsden & McDermott, 2011; Fantuzzo & Ginsburg-Block, 1998; Fantuzzo, King & Heller, 1992). At-risk students with learning problems have repeatedly participated in the experiences he has described. He has referred not only to the academic potential, but also to the social and psychological potential of this methodology with different types of students. Fantuzzo indicates that structured peer tutoring is necessary so that students can help each other to learn in an optimal way; selection and distribution of peers is a key factor according to this author.
Although Topping has studied peer tutoring across different levels and with different subjects since the 1980s, his contributions in areas such as higher education, mathematical vocabulary or strategic dialogue are especially remarkable (Topping, 1996, 2005; Topping, Campbell, Douglas & Smith, 2003; Topping, Miller, Murray, Henderson, Fortuna & Conlin, 2011). Topping states that communication abilities between students may be strengthened thanks to peer tutoring, and variables such as students' attitudes towards mathematics may be positively influenced with this methodology. Topping also notes that peer tutoring is an inclusive methodology in which all students get something in exchange for their interactions, so that all students benefit from its implementation.
During the last decade, several countries have documented dozens of peer tutoring experiences in their academic journals. South Africa is one of the countries in which the growth in the reporting of this type of experience during the last decade has been most notable. Research by Mkonto (2018), Spaull (2015), Tanga and Maphosa (2018), Tangwe and Rembe (2015) and Taole (2020), to name but a few, has repeatedly shown the potential of this methodology across different educational levels in the country. In this sense, the experiences of the above-mentioned authors are very useful for the peer tutoring literature as many of them report how peer learning can be beneficial under low economic conditions. Hence, thanks to these studies, peer tutoring may be seen as a valuable educational resource for emerging economies.
Several literature reviews and meta-analyses have analysed studies of peer tutoring in mathematics across the years (Britz, 1989; Robinson, Schofield & Steers-Wentzell, 2005). All concluded that the majority of studies reported positive outcomes and that peer tutoring in mathematics was beneficial most of the time from an academic, social or psychological perspective. It must be noted that the reported effect sizes in these studies were somewhat larger for the academic achievement variable than for other social or psychological variables.

Background of the Study
Cockerill, Craig and Thurston (2018) indicate that students can be effective teachers in different learning contexts; students can interact with their peers using direct speech at the same time as they share cultural and linguistic references. They can help their classmates to learn while they learn at the same time. According to Miravet, Ciges and Garcia (2014), the essence of peer tutoring is helping your peers to learn while you improve your academic and social skills. Several authors have defined peer tutoring since the early 1970s. On the one hand, Miquel and Duran (2017) define peer tutoring as an educational strategy in which students not only help each other, but also learn while helping other peers. On the other hand, Thurston, Van de Keere, Topping, Kosack, Gatt, Marchal, Mestdagh, Schmeinck, Sidor and Donnert (2007) define it as a teaching strategy in which students are paired together in order to practise academic skills and master content. In this methodology, the students with higher academic skills serve as tutors. Their role is to help other peers who serve as tutees while they cooperate in pairs. Although tutees may have lower knowledge or academic skills than tutors, their role during peer tutoring is as important as the tutors' role. The questions asked by the tutees are the basis of the dialogue along with the explanations and answers provided by the tutors. The richness of the interactions and the magnitude of the outcome depend equally on both parties, tutors and tutees. In summary, it can be stated that peer tutoring consists of an asymmetric relationship externally planned by a practitioner in which participants share the same goal: the acquisition of curricular content by any student participating in the experience (Swartz, Deutsch, Moolman, Arogundade, Isaacs & Michel, 2016). Students' inclusion and collaborative learning are enhanced by means of this method (Clarence, 2018).
Most of the authors in the field refer to two key elements when defining the type of peer tutoring experience: the ages and the roles of the participants. Depending on the ages, peer tutoring can be same-age or cross-age, whereas depending on the roles, peer tutoring can be fixed or reciprocal.
In same-age tutoring, all participants are from the same grade. During cross-age tutoring, most of the time older participants from upper grades help their younger peers in lower grades. In fact, tutors are often at a higher educational level than their tutees (Watts, Bryant & Carroll, 2019). Although previous studies have addressed the differences between same-age and cross-age tutoring (Leung, 2019a), there is no strong evidence that suggests that one type is better than the other. Authors such as Hänze, Müller and Berger (2018) have indicated that the age gap is key in the process as tutees learn more with older tutors. In this sense, according to Morris, Edovald, Lloyd and Kiss (2016), the age difference guarantees the quality of the outcome, and they recommend an age difference of 2 to 4 years between tutees and tutors. By contrast, previous studies by Leung (2015, 2019b) report that no significant differences were found between cross-age and same-age experiences in mathematics or other subjects. Moreover, same-age tutoring is less complex to carry out than cross-age tutoring if organisational issues are considered (Korner & Hopf, 2015), because same-age tutoring is often implemented inside the same classroom where all participants usually learn. There is no need to move students from one class or institution to another, making it easier for same-age experiences to be implemented.
During fixed peer tutoring, students maintain their respective roles: tutors always perform as tutors, and tutees always perform as tutees. On the other hand, roles are exchanged during reciprocal peer tutoring experiences (Martin-Beltrán, Chen & Guzman, 2018). As in the same-age versus cross-age debate, it is not clear which type is better. The literature reviews and meta-analyses mentioned above reported similar effects for both types of tutoring (Leung, 2015, 2019b). From a psychological perspective, several authors state that reciprocal tutoring is more beneficial than fixed tutoring (Bailey, Baek, Meiling, Morris, Nelson, Rice, Rose & Stockdale, 2018). The main reason is that with fixed peer tutoring, the self-concept of the tutees may decrease and their confidence may be affected. Performing permanently as tutees may be harmful for them as feelings of dependency and inferiority may arise if the tutoring implementation is too long (Leung, 2019b).

Aim and Research Questions
The main objective of this study was to assess the effectiveness of peer tutoring for learning statistics and probability. Academic achievement was the variable under study in this research. Two research questions were posed:
Research question 1: Does students' academic achievement improve significantly after the implementation of peer tutoring when learning statistics and probability?
Research question 2: Are there significant differences in students' academic achievement among grades after the implementation of peer tutoring in statistics and probability?

Peer Tutoring Programme
Mathematics contents
Students in all three grades study statistics and probability. Seventh graders work with concepts such as population, samples, qualitative and quantitative statistical variables, histograms, bar graphs, pie charts and frequency tables at basic levels, and they calculate means, medians, ranges and probabilities with Laplace's rule. The 8th graders review all the 7th-grade content, work with advanced probability and learn how to interpret and calculate standard deviations. Ninth graders review all the 8th-grade content and also work with quartiles and percentiles, interquartile ranges, box plots and tree diagrams.

Intervention
No intervention was scheduled for the first two terms. The teacher used traditional teaching methods, that is, one-way instructional teaching. Students were not allowed to interact while solving problems. During the third term, the teacher's lessons were complemented with a peer tutoring programme. Tutoring was fixed and same-age. A cross-age experience was dismissed for organisational reasons, as it was impossible to bring students from other classes or institutions to those facilities where the peer tutoring was implemented. According to De Backer, Van Keer, Moerkerke and Valcke (2016), reciprocal peer tutoring requires that researchers know about students' communicative skills and their academic performance in mathematics. As that knowledge was very limited, fixed tutoring was selected.

Organisation and calendar
The tutoring timetable consisted of 18 sessions. Three sessions were held per week for 6 weeks. Communication time between pairs of students lasted about 25 minutes for each session. The schedule was arranged so that sessions took place after the Statistics and Probability 1 exam and before the Statistics and Probability 2 exam. The whole intervention was scheduled during school time. The organisational issues (number of sessions per week, length of the sessions and total number of sessions) were programmed following the advice and suggestions for practice indicated by Leung (2015), so that students' academic performance was maximised.
The selection and distribution of students and the role of the teacher were designed following the indications provided by Leung (2019a), who states that interactions between students must be supervised by the teacher. The teacher should also help the students if one or more of the tutors or their tutees are not able to finish the exercise or the problem on time. In order to arrange the pairs, students were ranked by their pre-test marks from highest to lowest. After that, the list was divided into two halves. The tutors were those students placed in the first half, and the tutees were those placed in the second half. The tutor at the top of the first list was paired with the tutee at the top of the second list, and so on. Students' materials (workbook, resources, etc.) were the same as those used during the school year. Students were trained in tutoring skills prior to the implementation of the programme. Respect and patience were required of all students in order to work in pairs. Students were also instructed on how interactions should proceed during the sessions. Firstly, the teacher had to check that tutors had developed an adequate procedure and that the result was correct. Then, tutors had to ask tutees about their results. If the result was correct, tutees explained to their tutors the procedure they had followed in order to find the answer to the exercise or the problem. If a tutee's answer was incorrect, his/her tutor had to help the tutee. The explanations provided by the tutor were meant to enable the tutee to finish the task on time. Tutees were allowed to ask their tutors for help when needed, but always on the basis of perseverance and individual work. The goal was for all students to try as hard as possible to reach the correct result.
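The pairing rule described above (rank by pre-test mark, split the list into two halves, pair the halves position by position) can be illustrated with a minimal Python sketch. The function name `pair_students` and the dictionary input are hypothetical and serve only as an illustration; the study does not state how ties or an odd number of students were handled, so the sketch assumes an even number of students and breaks ties by input order.

```python
def pair_students(pretest_marks):
    """Pair tutors with tutees from a dict mapping student -> pre-test mark.

    Assumptions (not stated in the study): an even number of students,
    and ties broken by the order in which students appear in the dict.
    """
    # Rank students from highest to lowest pre-test mark.
    ranked = sorted(pretest_marks, key=pretest_marks.get, reverse=True)
    half = len(ranked) // 2
    tutors, tutees = ranked[:half], ranked[half:]
    # Top tutor with top tutee, second tutor with second tutee, and so on.
    return list(zip(tutors, tutees))


pairs = pair_students({"A": 9, "B": 7, "C": 5, "D": 3})
```

With four students whose marks are 9, 7, 5 and 3, the first-ranked student tutors the third-ranked one and the second-ranked student tutors the fourth-ranked one, so the gap in pre-test rank is the same for every pair.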

Classroom dynamics
Tutors and tutees were given a worksheet with two exercises, an exercise and a problem, or two problems, depending on the session. Firstly, all students worked individually. They had to finish the first exercise or problem in 5 minutes. Then, students were given 6 minutes to help each other (they could share their results, ask questions and so on). Following that, students were to complete the second exercise or problem on their own. Students then had another 6 minutes to work in pairs again. The complexity of the worksheet could differ depending on the session. Additional worksheets were also supplied to those students who finished much earlier than the rest of their peers.

Research Design
Zeneli, Thurston and Roseth (2016) have recently studied the influence of the experimental design on academic performance in tutoring interventions. According to them, a control group is necessary as its omission could produce an overestimation of the students' performance during this type of implementation. Therefore, as a control group is highly recommended, an experimental pre-test post-test design with a control group was used in this study.

Sampling of the Study
Students in this research were selected by means of convenience sampling due to their availability and accessibility at the time (Meng, Zhang, Li & Yu, 2019). One of the researchers in the study was part of the teaching staff of the educational centre in which peer tutoring was implemented. The teacher had already performed several peer tutoring experiences and had a deep knowledge of the field. The prior experience and knowledge of this researcher facilitated the implementation and organisation of the tutoring programme.

Participants
Participants in this study were students enrolled at a Spanish middle school in the 7th, 8th and 9th grades. The school was public and located in a suburban area of a city of approximately 60,000 inhabitants. Of the participants, 51.6% were female and 49.4% male; 55.6% Hispanic, 20.2% Caucasian, 19.4% African, and the remaining 4.8% of other ethnicities. The socioeconomic status of the families from that area and the overall academic achievement of the institution were average considering the national standards. Although 210 students were enrolled in the above-mentioned grades, two of them did not come to class most of the time. Hence, 208 students between 12 and 17 years old participated in the study. Seventy-four were in the 7th grade, 68 in the 8th grade, and the other 66 were enrolled in the 9th grade. The average age was 12.9 years for 7th graders, 13.7 years for 8th graders and 14.8 years for 9th graders. Students were randomly assigned to the experimental or the control group. Hence, 38 seventh graders were placed in the experimental condition and the other 36 seventh graders in the control condition. For the 8th grade, 34 students were in the experimental group and 34 in the control group. For the 9th grade, 32 students belonged to the experimental group and 34 to the control group.

Instrumentation
The marks for the exam of unit 9 (Statistics and Probability 1) and those of unit 10 (Statistics and Probability 2) were used as a measure of academic achievement. For all grades, Statistics and Probability 1 is an introductory unit to statistics and probability. Students use the concepts they studied in that unit in Statistics and Probability 2. Unit 9 includes basic concepts and exercises for each grade while more complex concepts and problems are included in unit 10. A deep knowledge of Statistics and Probability 1 is required so that students can succeed in Statistics and Probability 2. Exams have 10 exercises or problems and are graded from 0 to 10. One mark is allocated for each correct problem or exercise. If a correct procedure is developed but there is a minor calculation mistake, 0.2 marks are given. Statistics and Probability 1 exam marks served as pre-test scores while Statistics and Probability 2 exam marks served as post-test scores. Hence, scores for both the experimental and control groups in both the pre-test and post-test phases ranged from 0 to 10.

Data Analysis
The data collected from the pre-test and the post-test were analysed using SPSS software version 25. Means, standard deviations, increments and the Mann-Whitney U test (95% confidence level) were used to determine statistically significant differences between the experimental and control groups (Zimmerman, 1987). The non-parametric statistical analysis was complemented with simple quantitative analysis. Results were reported by grade and globally. The Kruskal-Wallis H test was used between groups to assess any significant differences among grades (Vargha & Delaney, 1998). Effect sizes were calculated for all grades separately and all together using the mathematical expression indicated by Rosnow and Rosenthal (1996). Effect sizes were reported using Hedges' g (Orwin, 1983).
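As an illustration of the statistics named above, the following Python sketch computes the Mann-Whitney U statistic, the Kruskal-Wallis H statistic and Hedges' g using only the standard library. It is not the SPSS procedure used in the study. The function names are hypothetical, the H statistic is given without tie correction, p-values are not computed, and the Hedges' g formula is the commonly used pooled-standard-deviation form with a small-sample correction, which is assumed here and may differ from the expression taken from Rosnow and Rosenthal (1996).

```python
from itertools import chain
from math import sqrt
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1 to N; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic of sample x against sample y (rank-sum form)."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


def kruskal_wallis_h(*groups):
    """H statistic for two or more groups (no tie correction applied)."""
    ranks = average_ranks(list(chain.from_iterable(groups)))
    n = len(ranks)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


def hedges_g(treated, control):
    """Standardised mean difference with small-sample correction (assumed form)."""
    n1, n2 = len(treated), len(control)
    pooled_sd = sqrt(((n1 - 1) * stdev(treated) ** 2
                      + (n2 - 1) * stdev(control) ** 2) / (n1 + n2 - 2))
    d = (mean(treated) - mean(control)) / pooled_sd
    return d * (1 - 3 / (4 * (n1 + n2) - 9))
```

Calling `mann_whitney_u(experimental, control)` and `hedges_g(experimental, control)` on two lists of scores, or `kruskal_wallis_h(grade7, grade8, grade9)` on three lists, yields the raw statistics; significance levels would still have to be obtained from a statistical package or from tables.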

Results
Table 1 shows descriptive results for all groups in each grade, which address research question 1.
Means, standard deviations (SD) and number of students (n) by grade (7th, 8th and 9th), group (experimental or control) and phase of the study (pre-test or post-test) are reported. The performance variations, that is, increases and decreases between the pre-test and the post-test by grade and group with (experimental group) or without (control group) the intervention are reported in Table 2. The Mann-Whitney U tests between groups are reported in Table 3. Those tests in which statistically significant differences were found at a significance level (p) below .05 are marked with an asterisk (*). In Table 3, no statistically significant differences were found when analysing the pre-test between groups (tests 1 to 3). Statistically significant differences were found between the pre-test and the post-test for the experimental groups in all grades and all together (tests 4 to 7). No statistically significant differences were found between the pre-test and the post-test for the control groups in any case (tests 8 to 11).
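The performance variations between phases can be derived from paired scores as in the short Python sketch below, which takes each increment as the post-test score minus the pre-test score. The function names and the sample scores are hypothetical and are not data from the study.

```python
from statistics import mean


def increments(pre, post):
    """Per-student change: post-test score minus pre-test score."""
    return [after - before for before, after in zip(pre, post)]


def summarise_changes(pre, post):
    """Count increases, decreases and unchanged scores, plus the mean change."""
    diffs = increments(pre, post)
    return {
        "increased": sum(d > 0 for d in diffs),
        "unchanged": sum(d == 0 for d in diffs),
        "decreased": sum(d < 0 for d in diffs),
        "mean_increment": mean(diffs),
    }


# Hypothetical scores for four students, for illustration only.
summary = summarise_changes([4, 5, 6, 7], [6, 5, 8, 6])
```

Applying the same summary separately to each grade and group gives the kind of breakdown of increases and decreases that a table of pre-test to post-test differences reports.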
Analysis by increments, which were obtained by subtracting the pre-test scores from the post-test scores, also revealed significant differences (tests 12 to 15). The 7th-grade implementation yielded an effect size of 0.51, the 8th-grade implementation an effect size of 0.55 and the 9th-grade implementation an effect size of 0.43. A global Hedges' g effect size of 0.72 was found. The Kruskal-Wallis H test did not reveal any significant differences among 7th, 8th and 9th graders in the experimental group for the post-test scores (χ² = 2.83, p = .42) or the increments (χ² = 4.36, p = .23).
... (2018), also show significant academic improvements after peer tutoring implementations for middle school students. The results reported in this study are consistent with other studies on peer tutoring in mathematics. According to Leung (2019b), fewer than 10% of the studies on peer tutoring in mathematics report large realistic effect sizes. This could be a sign of the potential of this methodology with statistics and probability content. The fact that peer tutoring in statistics and probability may be implemented at any middle school in the world, independently of its economic, cultural or achievement conditions, implies that the results of this study could be of interest to teachers and practitioners in education. Nevertheless, it must be taken into account that the results from this research cannot be generalised for two main reasons: the small sample size and the fact that convenience sampling was used (DeAngelis, 2021; Zirkel, Garcia & Murphy, 2015).
The reported percentage of improvement (90%) is slightly higher than, although quite similar to, those reported in several recent studies in the field (Tanga & Maphosa, 2018; Zeneli, Tymms & Bolden, 2018). The academic benefits of peer tutoring for the majority of participants in the experience have been documented widely for mathematics (Leung, 2019b). Nevertheless, the high deviations for the experimental conditions and the fact that 10% of the students in this study decreased their scores must be considered. Several authors have discussed the fact that peer tutoring is not equally effective for all students (Thurston et al., 2007; Topping et al., 2011). Students involved in a peer tutoring experience must show commitment, and they must believe in the effectiveness of the methodology. If not, peer tutoring may result in an academic decrease for them. Several authors have reported that a small percentage of students are usually highly reluctant to work in pairs (Baleni, Malatji & Wadesango, 2016; McKay, 2016). Hence, it is to be expected that peer tutoring may affect their academic achievement negatively.
The fact that no statistically significant differences were found among 7th, 8th and 9th grade increments is consistent with previous literature in the field. Effect sizes are quite homogeneous within educational levels (Leung, 2015). As this study compared academic outcomes within the same educational level (middle school), it was expected that similar effect sizes would be found for all grades.

Conclusion
Moderate effect sizes with significant academic increases may be expected when implementing same-age, fixed peer tutoring for learning statistics and probability at middle school level. Although the existing literature on peer tutoring experiences addressing statistics and probability is scarce, the results shown in this study were similar to those reported in other experiences of peer tutoring in mathematics. The high global effect size obtained in this study suggests that future research should focus on the implementation of more experiences in the field in order to address its potential and compare it with many more similar studies. Although significant academic improvements were reported, it must be considered that peer tutoring was not equally effective for all students, as high deviations were reported for the experimental group, and 10% of the students decreased their scores after the implementation of the programme. Hence, it is important to consider that all students need to show commitment and believe in the potential of the methodology, or peer tutoring may negatively affect their academic results.
Certain limitations must be taken into account when interpreting the conclusions drawn from this study. Firstly, the 208 students participating in the experience were selected through convenience sampling. This fact may compromise the validity of the study from an experimental point of view. Besides, the sample size was not large enough to make it representative for any significant population. The fact that one of the researchers in the study was a teacher at that institution and had wide previous peer tutoring experience must also be considered. Future researchers or practitioners in the field without similar experience may experience organisational, attitudinal and other inconveniences when replicating this study, which may affect the academic outcome of their experience.

Table 1
Descriptive quantitative results

Table 2
Differences between pre-test and post-test after peer tutoring programme

Table 3
Mann-Whitney U tests between groups