Correlates of Success in Introductory Programming: A Study with Middle School Students

The demand for computing professionals in the workplace has led to increased attention to computer science education, and introductory computer science courses have been introduced at different levels of education. This study investigated the relationship between gender, academic performance in non-programming subjects, and programming learning performance among middle school students with no prior programming experience who took an introductory programming course. We found that girls performed as well as or even better than boys in introductory programming among high-ability Chinese middle school students. However, we found that, instead of gender, students’ performance differences in programming were better explained by their academic performance in non-programming subjects. Students’ math ability was strongly related to their programming performance, and their English ability was the best predictor of their success in introductory programming for these Chinese students. Findings confirm previous studies that have shown a relationship between students’ math ability and performance in learning to program, but the relationship between English ability and introductory programming was unexpected. While this relationship may be specific to students whose first language is not English, aspects of native language may pose hidden barriers that might affect all students’ success in introductory programming.


Introduction
Although job prospects in computer science look very good in today's computing-intensive world, there is still a shortage of workers in computer and information science in the U.S. (Bureau of Labor Statistics, 2014-15). The demand for computing professionals in the workplace has led to increased attention to computer science education, and introductory computer science courses have been introduced at different levels of education. In the U.S., efforts such as the National Science Foundation's CS10K initiative and the College Board's new CS Principles course are being made to broaden high school students' participation in computer science. However, computing and introductory programming may be introduced even before the secondary school level.
In the U.S., the Computer Science Teachers Association (CSTA) standards for computer science education address three grade bands: K-3, 3-6, and 7-12 (Seehorn et al., 2011), and to raise awareness of computing at the pre-secondary level the organization produced a special issue focusing on computer science in grades K-8 (Phillips, 2012). In the U.K., computer science is introduced to elementary school students (Brown, Sentance, Crick, & Humphreys, 2014). Even children as young as kindergarteners have been exposed to learning about robotics and programming (Sullivan & Bers, 2013).
[...] electronic boards, can improve students' understanding of programming (Rubio, Romero-Zaliz, Mañoso, & Angel, 2015). Visualizing the steps of program execution can also help students to develop their understanding of how computer programs function (Sirkia & Sorva, 2012). For instance, Online Python Tutor is a well-known visualization tool that is widely used by instructors of introductory programming courses (Guo, 2013). Another commonly used instructional strategy in introductory programming is employing automated assessment learning systems (Douce, Livingstone, & Orwell, 2005). In these learning systems, students solve programming problems and submit their solutions. The systems can automatically assess students' solutions, provide immediate feedback, reduce instructors' workload, and significantly improve students' learning experience (De-La-Fuente-Valentín, Pardo, & Kloos, 2013; Wang, Su, Ma, Wang, & Wang, 2011).
Many factors, including prior knowledge, may influence students' success in introductory programming (Alvarado, Lee, & Gillespie, 2014; Bergin & Reilly, 2005; Rubio et al., 2015; Wilson & Shrock, 2001). If we are to engage more students in computing fields and improve our instruction in introductory programming, we need to better understand the factors that influence students' success.
Many studies on factors associated with success in introductory programming have been conducted over the past several decades. While a variety of factors have been investigated, including students' academic background, personal characteristics, and cognitive factors (Bergin & Reilly, 2005; Wilson & Shrock, 2001), three major factors have been studied across numerous studies: gender, previous computing experience, and academic performance in non-programming subjects.

Gender
Because of the low participation of women in computer science (Simard, Stephenson, & Kosaraju, 2010), the gender gap between females and males in computer science has been discussed by many researchers (Alvarado, Dodds, & Libeskind-Hadas, 2012; Guzdial, Ericson, Mcklin, & Engelman, 2014; Patitsas, Craig, & Easterbrook, 2014). Some previous studies have indicated that males have more positive attitudes towards computers than females (Alvarado et al., 2014; Beyer, Rynes, Perrault, Hay, & Haller, 2003; Dambrot, Watkins-Malek, Silling, Marshall, & Garver, 1985; Shashaani, 1997). For example, Shashaani and Khalili (2001), in a study of 375 Iranian undergraduates, found that even though female students strongly believed they had the same ability and competence in using computers as males, they showed less confidence in working with computers. Another study of 56 college students in a computer science course showed that while there were no gender differences in interest in computer science, females had much lower computer confidence (Beyer et al., 2003). However, other researchers have reported no significant gender differences in attitudes toward computers (Hattie & Fitzgerald, 1987). To determine an overall effect from multiple studies, Whitley Jr. (1997) conducted a meta-analysis of studies on gender differences in computer-related attitudes and behavior and found that, by comparison with females, males held more positive attitudes toward computers and believed they had higher computer-related competence (ES = .232, p < .001).
A few studies have indicated that male students show superior computer aptitude to females (Dambrot et al., 1985; Makrakis & Sawada, 1996). Based on a sample of 941 students in a psychology class, Dambrot et al. (1985) found that male students showed higher computer aptitude and math aptitude than female students, even though female students had higher high school and college GPAs. In a study of 773 ninth-grade students in Japan and Sweden, Makrakis and Sawada (1996) found that overall boys scored higher on computer aptitude and showed more positive attitudes toward computers, math, and science than girls. According to a recent study by Guzdial et al. (2014) of AP CS test takers in Georgia in 2013, male students showed much better performance than female students. However, other researchers have reported that females demonstrate the same performance as males in learning programming (Bruckman, Jensen, & DeBonte, 2002; Linn, 1985; Rubio et al., 2015).

Previous Computing Experience
Previous computing experience is another major factor that may contribute to success in introductory programming. According to a study (Kersteen, Linn, Clancy, & Hardyck, 1988) of freshmen and sophomores in an "Introduction to Computer Science" course, students' prior computing experience showed a positive impact on their learning performance in this course. By studying 105 students in a CS1 introductory computer science course, Wilson and Shrock (2001) reported that students' grades were significantly correlated with their previous programming course experience. However, the positive impact of prior computing experience on learning to program may only be effective in the introductory programming course. In a study of students' performance in a sequence of programming courses, Holden and Weeden (2003) found that previous programming experience did help in the first programming course, but "it does not seem to have a significant impact on student performance in subsequent courses" (p. 45).
In addition, researchers have found a gender gap in previous computing experience. In general, males have more computer-related experience than females (Shashaani, 1997). Male students not only play more computer games than female students (Lockheed, 1985), but they also take more programming classes (Murphy et al., 2006). Interestingly, all types of prior computing experiences show a positive impact on females' learning performance in introductory programming; however, only certain types of previous experiences help male students (Taylor & Mounfield, 1994). For instance, too much game playing may lead to a negative effect on exam scores (Holden & Weeden, 2003).

Academic Performance
In addition to gender and prior computing experience, students' academic performance in other subjects may also influence success in learning to program. Previous research has revealed that students' academic performance in math has a positive relationship to their performance in introductory programming (Bennedsen & Caspersen, 2005; Bergin & Reilly, 2005; Wilson & Shrock, 2001). In a study of 235 college students in an object-first CS1 course, Bennedsen and Caspersen (2005) reported that students' math score from high school was the best predictor of their final grade in this course and explained over 15% of the variance in their predictive model. By investigating students' learning performance in the introductory programming course and their performance in the Irish Leaving Certificate (LC) examinations in mathematics and science subjects, Bergin and Reilly (2005) found that mathematics and science both significantly correlated with students' performance in programming. It is not surprising that the results of these studies suggest that ability in disciplines such as mathematics and science is an important predictor of students' learning performance in introductory programming courses.

Research Questions
Although there have been studies of factors that influence students' success in programming (Alvarado et al., 2014; Bergin & Reilly, 2005; Rubio et al., 2015; Wilson & Shrock, 2001), most of these studies have focused on college students. A few previous studies have discussed gender effects in the learning of programming among middle school students (Bruckman et al., 2002; Linn, 1985), but these studies did not discuss other factors that might affect students' programming learning. More importantly, according to previous research, factors such as gender may or may not influence performance in learning to program depending on the age of the students (Bruckman et al., 2002; Guzdial et al., 2014; Sullivan & Bers, 2013). Hence, because introductory computer science courses have spread from college to high school and now even into lower grade levels, researchers and educators should investigate factors contributing to programming success for pre-college learners. From an instructional perspective, it is also important to identify factors that influence success in learning to program. If educators know what factors are important for success in computer science, they can better prepare their instruction to support and enhance students' learning.
The purpose of this study was to investigate the relationship between gender, academic performance in non-programming subjects, and programming learning performance among middle school students with no prior programming experience who took an introductory programming course. The following research questions guided the study:
1) Is there a difference in the performance of boys and girls in learning programming?
2) Are there any relationships between gender, academic performance in non-programming subjects, and programming learning performance?

Subjects
The research subjects in this exploratory study were two groups of students taking an introductory computer programming course in 7th grade at a middle school in an eastern city in China. Group A had 33 students, 17 boys and 16 girls. The average age of Group A was 12.3. Group B had 36 students, 23 boys and 13 girls. The average age of Group B was 13.2. Both groups of students were high-ability students who scored in the top 10% of the high-ability identification exam of this school. Normally, in this city, elementary school education is six years long and middle school education is three years long. Due to the students' high academic ability, the students in Group A skipped the 6th grade and started the three-year-long middle school education directly when they finished the coursework of the 5th grade. On the other hand, the students in Group B finished all six years of elementary school education, but they were scheduled to complete the three-year-long middle school in just two years. None of the students had taken any computer programming courses before. The core academic courses required for both groups of students were math, Chinese, and English.

Procedures
Students from both groups took an introductory Pascal programming course taught by the same instructor (the first author) during the same semester. Students took this course for 14 weeks, and during every week students participated in a 90-minute block. However, because of the holiday schedule of the school, students in Group B missed 2 blocks and so had less programming practice time than students in Group A. This course employed an automated assessment system as part of the instruction in order to give students opportunities to actively construct new knowledge and to enhance their experience of learning to program (De-La-Fuente-Valentín et al., 2013; Douce et al., 2005; Wang et al., 2011). The automated assessment learning system used in the instruction was the SangTian Programming Learning Environment (SPLE), which was designed by the course instructor. SPLE has a problem pool which contains dozens of programming problems for students to solve. All problems were presented in Chinese; a translated example problem is shown in Figure 1. Students started to use SPLE in the 5th week. In every class block, the instructor first introduced new course content, such as the syntax of for loops in Pascal. After a lecture of about 10 minutes, the instructor would demonstrate how to use the new statements to solve programming problems. Examples used in the demonstration were usually similar to the problems in SPLE. After the lecture and demonstration, students had around 60 minutes to solve problems in SPLE individually. The instructor provided necessary support when a student had difficulties and asked for help. Students accessed SPLE only during class time and did not have assignments after class.
To solve the problems, students were required to write short programs to produce the correct output. All the solutions students submitted to SPLE were graded by the system automatically. Immediate feedback was provided to inform the student whether the submitted solution was correct or not, and whether syntax or logic errors existed. By solving a problem, the student earned experience points according to the difficulty of the problem. The difficulty of problems was categorized from Level 1 (low difficulty) to Level 10 (high difficulty). The more difficult the problem the student solved, the more experience points he or she could earn. Accruing experience points also raised the student's account level. If a student with a high-level account solved a low-difficulty problem, he or she could only earn half of the experience points available for the problem. Thus, students were encouraged to solve higher-difficulty problems as they gained experience. Once a student solved a certain problem, he or she could not solve it again. If a student did well in programming and solved more high-difficulty problems, he or she could earn more experience points. The total possible experience points students could earn in SPLE was around 3200, and the highest level of student account was 10. Based on this cumulative evaluation system, the final experience points students earned were used as a proxy measure of their learning performance in this course. At the end of the semester, students' learning performance data were extracted from the SPLE database and analyzed.
Figure 1 (translated problem text): We have an integer which is greater than or equal to 1000. Find the digit in the thousands place. For example, the digit in the thousands place of 36541 is 6. Input Format: an integer which is greater than or equal to 1000. Output Format: the digit in the thousands place. See examples.
In addition, students' academic records of their core courses were accessed to gather data about their performance in other subjects. We calculated the averages of the formal exam scores in each subject for each student. The maximum score on each of these exams was 100 points. These average scores were used as the measure of students' academic performance in each subject.
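The experience-point rules described for SPLE in this section can be illustrated with a minimal Python sketch. This is not the SPLE implementation: the paper does not report the point value of each difficulty level or the exact rule for when an account counts as "high level", so the constant `BASE_POINTS_PER_LEVEL`, the threshold `account_level > difficulty`, and the function names below are hypothetical choices made only for illustration.

```python
# Hypothetical parameters: the paper does not report the actual point values
# or the exact rule that decides when an account counts as "high level".
BASE_POINTS_PER_LEVEL = 10  # assumed: points grow linearly with difficulty


def award_points(difficulty: int, account_level: int) -> float:
    """Return the experience points earned for one newly solved problem."""
    if not 1 <= difficulty <= 10:
        raise ValueError("difficulty must be between 1 and 10")
    full_points = BASE_POINTS_PER_LEVEL * difficulty
    # Assumed threshold: an account whose level exceeds the problem's
    # difficulty earns only half of the available points.
    if account_level > difficulty:
        return full_points / 2
    return float(full_points)


def total_points(attempts) -> float:
    """Sum points over (problem_id, difficulty, account_level) tuples.

    Each problem counts at most once, because a solved problem could not
    be solved again.
    """
    solved = set()
    total = 0.0
    for problem_id, difficulty, account_level in attempts:
        if problem_id in solved:
            continue
        solved.add(problem_id)
        total += award_points(difficulty, account_level)
    return total
```

Under these assumed values, a Level 3 problem would be worth 30 points to a Level 1 account and 15 points to a Level 5 account, which mirrors the incentive to move on to harder problems.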

Analysis Performed
Various statistical techniques were employed for data analysis. We applied t-tests to compare boys' and girls' programming learning performance in both groups. Correlations were computed to analyze relationships between gender, academic performance in non-programming subjects, and programming learning performance. In addition, stepwise regression analyses were used to identify important factors that predicted students' programming learning performance. For the stepwise regression, we set significance levels of F to enter (FTE) of α = 0.15 and F to delete (FTD) of α = 0.075, consistent with recommendations of Derksen and Keselman (1992). As gender is a categorical variable, when computing correlations and regressions, we coded gender as male equals 0 and female equals 1.
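As an illustration of the group comparison, the following Python sketch computes an independent-samples t statistic with a pooled variance. The paper does not state which variant of the t-test was used; the pooled form is an assumption, chosen because its degrees of freedom (n1 + n2 - 2) match the reported t(31) for 33 students and t(34) for 36 students. The function name and the sample values in the test are hypothetical.

```python
import math


def pooled_t_test(sample_a, sample_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    # Sums of squared deviations from each group mean.
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df
```

Calling `pooled_t_test(girls_points, boys_points)` with two lists of experience points would return the statistic and its degrees of freedom; the p-value would then come from a t distribution table or a statistics package.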

Gender Differences in Programming Learning Performance
Overall, in this study, girls performed better than boys in learning programming as measured by SPLE experience points (see Table 1). In Group A, girls gained more experience points in SPLE than boys; although the difference (t(31) = 1.88, p = .07) is not statistically significant at the α = 0.05 level, the trend suggests that girls performed better than boys. In Group B, girls gained significantly more experience points in SPLE than boys (t(34) = 2.69, p = .01). Therefore, our results suggest that there was a difference in the performance of boys and girls in learning programming, with girls showing better learning performance than boys.

Relationships between Gender, Academic Performance, and Programming Learning Performance
Correlations between students' programming performance, as measured by experience points in SPLE, and their academic scores in non-programming subjects were computed (see Table 2). In Group A, students' programming performance as measured by experience points was positively correlated with their English score (r = .57, p < .001), math score (r = .45, p < .01), and Chinese score (r = .36, p < .05). In Group B, students' programming performance as measured by experience points was positively correlated with their English score (r = .58, p < .001) and math score (r = .45, p < .01). In both groups, our results showed that the relationship between students' English performance and programming learning performance was stronger than the relationship between students' math performance and programming learning performance.
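The correlations reported above are Pearson coefficients, which can be sketched in plain Python as follows. The function name is hypothetical and the data in the test are made up for illustration; they are not the study's data. With gender coded as male = 0 and female = 1, the same formula gives the correlation between gender and a score.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross-products and squared deviations around the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)
```

For example, `pearson_r(english_scores, experience_points)` would give the kind of coefficient reported for English and programming performance, assuming both lists are ordered by student.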
Stepwise regression analyses were conducted to investigate whether the various factors studied were able to predict students' programming learning performance (see Table 3). Independent variables were gender, math score, Chinese score, and English score. In Group A, a stepwise regression method resulted in a significant model with only the English score as the predictor, R² = .32, F(1, 31) = 14.67, p < .001. Students' English score significantly predicted programming learning performance, β = 140.37, p < .001, explaining 32% of the variance. In Group B, a stepwise regression method also found a significant model with only the English score as the predictor, R² = .34, F(1, 34) = 17.51, p < .001. Students' English score significantly predicted programming learning performance, β = 90.67, p < .001, explaining 34% of the variance. In the regression model, gender was no longer important in explaining the difference in programming learning performance when the English score was considered. Therefore, these results indicate that the observed gender differences in programming learning performance could be better explained by other factors such as students' English ability.
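The reported regression figures can be cross-checked with a short Python sketch. It relies on two standard identities for a regression with a single predictor: the proportion of variance explained equals the squared correlation, and the F statistic can be recovered from that proportion and the sample size. The function name is hypothetical, and small mismatches with the published values are to be expected because the inputs are rounded.

```python
def f_from_r_squared(r_squared: float, n: int, predictors: int = 1) -> float:
    """Return the overall F statistic implied by R^2, sample size, and predictor count."""
    df_model = predictors
    df_residual = n - predictors - 1
    return (r_squared / df_model) / ((1 - r_squared) / df_residual)


# Squared English correlations: about .32 (Group A) and .34 (Group B).
group_a_r_squared = 0.57 ** 2
group_b_r_squared = 0.58 ** 2

# F implied by the rounded values of variance explained.
group_a_f = f_from_r_squared(0.32, n=33)  # close to the reported F(1, 31) = 14.67
group_b_f = f_from_r_squared(0.34, n=36)  # close to the reported F(1, 34) = 17.51
```

The squared correlations round to the reported proportions of variance explained, and the implied F values fall within rounding distance of the reported ones, which suggests the figures are internally consistent.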

Gender and Learning Performance in Programming
The results of this study showed that overall girls performed as well as or even better than boys in learning programming. In Group A, girls gained more experience points on average, but the difference was not significant. In Group B, girls performed significantly better than boys. This result is contrary to some studies that have shown males to demonstrate superior performance to females (Dambrot et al., 1985; Guzdial et al., 2014) but consistent with other previous studies of children or middle school students (Bruckman et al., 2002; Linn, 1985; Sullivan & Bers, 2013). For example, in one study of middle school students, Linn (1985) stated, "at several schools, females had higher means than males on this assessment [Final Programming Assessment], although the differences in mean scores were not statistically significant" (p. 235). In a study of kindergarteners, Sullivan and Bers (2013) indicated that no gender differences were found in robotics and programming achievement.
Previous studies that have indicated a gender difference that appears to favor males with respect to attitudes towards computers (Beyer et al., 2003; Shashaani, 1997), computer aptitudes (Makrakis & Sawada, 1996), and AP CS test performance (Guzdial et al., 2014) were primarily studies of high school and college students. There is good evidence that the gender gap in computer science does not exist among younger students (Gürer & Camp, 2002). As females' participation in computer science is still low (Simard et al., 2010), if we are to engage more people in computing fields, researchers and educators should consider exposing all students to computer-related courses before the gender gap arises. An introductory programming course for middle school students, such as the course that was the object of this study, might be a possible choice, and computing concepts can be introduced to students in even earlier grades (Phillips, 2012).

Math, Programming, and Gender Stereotypes
Our results indicated that even though girls performed as well as or even better than boys in learning to program, gender was not the primary factor in explaining the differences when we considered other factors, such as students' English and math ability. The influence of English and math ability on students' programming performance is not a surprise given that constructivist theory suggests that students' construction of new knowledge is built upon the foundation of existing knowledge (Ben-Ari, 2001). According to our results, students' academic performance in math showed a significant positive relationship with their programming performance. Many previous studies have revealed that math ability is an important predictor of students' success in programming (Bennedsen & Caspersen, 2005; Bergin & Reilly, 2005; Wilson & Shrock, 2001). This connection between math and programming may be an important consideration when addressing the gender gap in computer science. First of all, female students have been reported to be less confident than male students in their math abilities (Sax, 1994). In addition, although math abilities of young boys and girls are almost equivalent, male students start to show significant superiority in high-level math skills by the end of middle school (Entwisle, Alexander, & Olson, 1994). Moreover, culture and social environments can affect people's perception of math (Guiso, Monte, Sapienza, & Zingales, 2008; Hyde, Lindberg, Linn, Ellis, & Williams, 2008). Hyde et al. (2008) stated, "Stereotypes that girls and women lack mathematical ability persist and are widely held by parents and teachers" (p. 494). As a gender gap and stereotypes exist in math, girls may be enculturated to think computer science, which is highly related to math, is a masculine field as well.
Therefore, as computer science and math are closely linked, both in performance and in perception, educators should work to counteract stereotypes about gender, math, and computer science among students as early as possible. We suggest providing students with programming experiences at a young age and helping them to develop non-stereotyped and gender-neutral perceptions of computer science.

English Ability, Natural Language, and Success in Programming
In this study, the results indicated that students' English ability showed the strongest relationship with programming learning performance and was the most important factor in explaining the differences in programming. In Group A, both programming learning performance and English scores of girls were higher than those of boys, although the differences were not statistically significant. In Group B, both programming learning performance and English scores of girls were significantly higher than those of boys. Thus, the superior performance of girls in English may have translated into superior performance in learning to program computers. Previous research has shown that math and science abilities are important factors that contribute to success in programming (Bennedsen & Caspersen, 2005; Bergin & Reilly, 2005). However, these studies were based on native English speakers. It is possible that for Chinese students English ability is a more important factor that contributes to programming success, because the programming language used in this study was based on English.
For example, in Pascal, the reserved word "const", which is short for "constant", is used to declare constants. However, these students had not learned the word "constant" in their English class when they took the programming course. Hence, they may have had difficulties in understanding the Pascal reserved word "const." Students may have come to understand the purpose of this and other words in Pascal, but they were hampered by not knowing the meanings of the English words on which they were based. As another example, students had learned the math concept "real number", which in Chinese is called "shi shu", and also knew the English words "real" and "number", but they did not know that "real number" means "shi shu". Thus it was hard for them to understand the data type "real" in Pascal.
Spelling was likely another obstacle these Chinese students faced in learning programming. For example, one student often misspelled the procedure name "writeln" as "writeIn", because he could not tell whether it should contain a lower case "l" or an upper case "I". If he had known that "writeln" was short for "write a line", he might not have been confused. Therefore, it makes sense that the English ability of Chinese students may have an effect on their ability to learn to program.
Although the relationship we observed could be specific to students whose first language is not English, previous studies based on native English speakers have also indicated that students' natural language can sometimes result in programming errors. According to Du Boulay (1986), English words in programming languages have specific meanings, but the student may not understand them as expected. For example, the word "and" is a Boolean operator in many programming languages, but it is a conjunction in everyday use. Du Boulay (1986) stated, "Exasperation with a programming system can occasionally be caused by the mismatch between the designer's and the user's understanding of what is implied by a particular name" (p. 62). Inappropriate use of natural language is another source of programming bugs among novices. After analyzing students' programs, Bonar and Soloway (1985) concluded that novices often made misleading links between SSK (step-by-step natural language programming knowledge) and PK (Pascal programming knowledge). For example, students may interpret the word "then" in the if-then-else construct in a programming language as it is used in natural language. Then, they may add a "then" to the repeat-until construct, which is incorrect in Pascal syntax. To help address this issue, some natural-language-style programming languages such as Hypertalk have been developed (Bruckman & Edwards, 1999; Kelleher & Pausch, 2005). One such language, MOOSE, was deliberately designed to help children learn reading, writing, and computer programming (Bruckman & Edwards, 1999). MOOSE tries to keep a balance between natural language and a regular syntax. After analyzing 2970 errors made by sixteen children using MOOSE, Bruckman and Edwards (1999) indicated that children were less likely to make natural language errors in a natural-language-like programming language such as MOOSE than in a formal language like Pascal.
Because students' natural language may also affect their learning in programming, researchers and educators should pay attention to natural language issues. While girls may have better verbal skills than boys at this age (Guiso et al., 2008), whether this advantage may result in positive or negative effects in learning to program has not been researched. Obviously for these Chinese students who were just starting to learn English, better English ability provided benefits in learning to program. However, for native English speakers, further research is needed.
Although previous research has investigated various factors that might contribute to success in introductory programming, including gender (Alvarado et al., 2014), prior computing experience (Bruckman et al., 2002), and academic performance (Bergin & Reilly, 2005), there may be more factors that are important but seldom researched, such as natural language ability. Therefore, to better understand the factors that influence success in learning to program, researchers need to consider the full range of variables that may contribute to these differences.
Finally, in our study the results showed that students' English ability was more important than their math ability in learning to program, but this may be an artifact of the sample. First, all the students in this study were high-ability students with strong math abilities. Because all of the students had strong abilities in math, there may not have been sufficient variation to account for the differences in programming performance. Second, according to the data of this study, boys and girls did not show differences in their academic performance in math. Finally, the math skills required in this introductory programming course were less advanced than those required in high school or college level programming courses. Thus, the importance of math in explaining performance differences in programming may have been reduced in this study compared to previous studies with older students.

Conclusions
We found that girls performed as well as or even better than boys in introductory programming among high-ability Chinese middle school students. However, gender was not the primary factor in explaining the performance difference in programming. We found that, instead of gender, students' performance differences in programming were better explained by their academic performance in non-programming subjects. Students' math ability was strongly related to their programming performance, and their English ability was the best predictor of their success in introductory programming for these Chinese students. Our findings confirm previous studies that have shown a relationship between students' math ability and performance in learning to program, but the relationship between English ability and introductory programming was unexpected.
In reviewing these results, several limitations of this study should be considered. For one, all of the students were high-ability and so may not be representative of the general population of students learning to program computers. Likewise, because all of these students were English language learners, the influence of English on programming performance of these students may be different than it would be for native English speakers. Finally, the use of experience points in the computer system as a proxy for programming learning performance might be influenced by other factors, such as student persistence, that could vary across genders and were not investigated in this study. However, despite these limitations, the results of this study provide information that may help to illuminate factors that influence success in introductory programming in various populations.
According to the findings of this study, we suggest that introducing programming courses to students at an early age may help to encourage more females to enter the computer science field, because at least in the middle grades girls do not show any disadvantages relative to boys. In addition, as there is a connection between math and programming, to engage more students in computer science, we also need to confront stereotypes about gender, math, and computer science. Finally, there may be hidden barriers that could influence success in programming, such as native language. Further research is needed to investigate the role of language proficiency in learning computer programming and whether language may create difficulties for students learning computer science.

Figure 1. A Translated Example Problem in SangTian Programming Learning Environment (SPLE)
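The example problem in Figure 1 (find the digit in the thousands place of an integer that is at least 1000) can be solved with integer division and a remainder. The course itself used Pascal; the Python sketch below only illustrates the arithmetic, and the function name is hypothetical.

```python
def thousands_digit(n: int) -> int:
    """Return the digit in the thousands place of an integer n >= 1000."""
    if n < 1000:
        raise ValueError("n must be greater than or equal to 1000")
    # Dropping the last three digits leaves the thousands digit in the
    # ones place; the remainder modulo 10 then extracts it.
    return (n // 1000) % 10


print(thousands_digit(36541))  # prints 6, matching the example in the problem
```

The same two operations exist in Pascal as `div` and `mod`, so a student solution would follow the same steps.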

Table 1. Gender differences in programming learning performance

Table 3. Summary of stepwise selection