Middle School Students' and Mathematicians' Judgments of Mathematical Typicality

K-12 students often rely on testing examples to explore and determine the truth of mathematical conjectures. However, little is known about how K-12 students choose examples and what elements are important when considering example choice. In other domains, experts give explicit consideration to the typicality of examples – how representative a given item is of a general class. In a pilot study, we interviewed 20 middle school students who classified examples as typical or unusual and justified their classification. We then gave middle school students and mathematicians a survey where they rated the typicality of mathematical objects in two contexts – an everyday context (commonness in everyday life) and a mathematical context (how likely conjectures that hold for the object are to hold for other objects). Mathematicians had distinct notions of everyday and mathematical typicality – they recognized that the objects often seen in everyday life can have mathematical properties that can limit inductive generalization. Middle school students largely did not differentiate between everyday and mathematical typicality – they did not view special mathematical properties as limiting generalization, and rated items similarly regardless of context. These results suggest directions for learning mathematical argumentation and represent an important step towards understanding the nature of typicality in math.

developed expertise in thinking critically about examples. The current paper explores middle school students' and mathematicians' judgments of important characteristics of mathematical objects. Our overarching goal is to consider how to support strategic reasoning with examples to promote activities related to mathematical justification.
The general explanation for why equilateral triangles are more often judged to be triangles than scalenes, and why 400 is said to be a better even number than 798, is that the former member of each pair is understood to be more typical than the latter. Typicality refers to how representative a given item is of a general class. Typical instances are highly similar to many other members of their class; atypical instances are dissimilar from most others (see Murphy, 2002, for review). Typicality has important consequences for reasoning: people are more likely to generalize from typical instances than from atypical instances. All else equal, learning that a dog has a particular property makes people more confident that most other mammals also have the property than does learning that a whale has it (Osherson, Smith, Wilkie, Lopez, & Shafir, 1990). Typicality can lead people to treat information as more or less valuable.
One of the most interesting features of mathematical typicality is not that people judge some numbers or shapes to be better instances of their classes than others, but how they do so. Mathematically, an equilateral triangle is atypical. Equilateral triangles have many distinctive properties, and facts about equilateral triangles may not generalize. For example, Angle-Side-Side congruence holds for equilaterals, but not for all triangles.
Similarly, given our base 10 system of counting, 400 is a relatively special quantity. Notably, many calculations are much easier with 400 than with 497 (e.g., squaring). However, equilateral triangles and numbers that are multiples of 100 are likely more familiar and frequently encountered in daily life (Lakoff, 1990). One possibility, then, is that there may be different, even competing, senses of typicality in mathematics. Everyday typicality, based on familiarity, may conflict with mathematical typicality, based on generalizability of mathematical properties.
We examine middle school students' and mathematicians' judgments of typicality in two contexts: how often numbers and shapes are seen in everyday life, and how generalizable numbers and shapes are in terms of their mathematical properties. These judgments have important implications for teaching K-12 students about the strategic use of examples in mathematical arguments, which aligns with current standards in mathematics education.

Literature Review
We begin our review with the area of mathematics education in which we have been exploring issues of typicality: mathematical justification, generalization, and proof. We then discuss typicality relations in general, before moving to discuss their significance in mathematics.

Mathematical Justification, Generalization, and Proof
Justification and proof are important activities in mathematics education (National Council of Teachers of Mathematics, 2000; Stylianides, 2007; Yackel & Hanna, 2003). We follow Harel and Sowder (1998) in defining proving as "the process employed by an individual to remove or create doubts about the truth of an observation" (p. 241). This definition allows for a wider range of approaches and modalities to be considered than the traditional "two column" proof (Herbst, 2002). The development of proofs includes several integrated stages (identifying patterns, making conjectures, providing non-proof arguments, and providing proofs), and these activities involve both making generalizations and providing support for mathematical claims (Stylianides, 2008). This perspective closely mirrors the Common Core State Standards for Mathematical Practice (Common Core State Standards Initiative, 2010), which state that K-12 students should construct mathematical arguments and critique the reasoning behind arguments. Specifically, K-12 students should "make conjectures and build a logical progression of statements to explore the truth of their conjectures" as well as "reason inductively about data, making plausible arguments."
Research suggests K-12 students often struggle to learn how to construct viable and convincing mathematical arguments and provide valid generalizations (Dreyfus, 1999; Healy & Hoyles, 2000; Martin, McCrone, Bower, & Dindyal, 2005). However, research in cognitive science has revealed that children show a surprisingly strong ability to reason and construct arguments in domains like science (Gelman & Kalish, 2005; Gopnik et al., 2004), where inductive, example-based reasoning has long been considered appropriate. Here, inductive reasoning refers to using examples to provide support for understanding a conjecture, which is different from the more formal notion of "proof by mathematical induction," as well as from traditional notions of formal, deductive proofs. In science, if one wishes to test a conjecture like "All birds have hollow bones," there is no deductive way to prove this conjecture; reasoning from examples to create causal theories is often the most appropriate form of inference.
Deductive reasoning is, however, fundamental to human reasoning generally, and mathematical reasoning specifically (English, 1997). Piaget's model of cognitive development suggests that it is not until the formal operational stage (11-12 years and older) that students are capable of reasoning hypothetically in service of logical deduction (Piaget, 1928). Research since has challenged the notion that preadolescent children do not have access to deductive reasoning (e.g., Mody & Carey, 2016), although their understanding may be fragile (English, 1997). During middle childhood, children can distinguish between deductive inferences, inductive inferences, and uninformed guesses (Pillow, 2002). When students do reach adolescence, the production of deductive reasoning can depend on the perceived relevance or familiarity of the content structure (Ward & Overton, 1990).

Example Use During Justification and Proof
Examples are culturally-mediating tools that connect learners and mathematical ideas; they provide a specific and familiar context for exploration and understanding of nuances and constraints (Goldenberg & Mason, 2008). Learners consider how features of an example can be modified while still maintaining its membership in the class of mathematical objects it is intended to exemplify (Watson & Mason, 2005). Sometimes, learners' notions of what can be changed about an example are unnecessarily restricted based on inappropriate perceptual features, like believing that a long skinny triangle is a "stick" and not a triangle (Goldenberg & Mason, 2008). A learner's repertoire of available examples, along with their methods for constructing those examples and inter-relations between examples, constitutes their example space (Watson & Mason, 2005). Both K-12 students (Cooper et al., 2011; Knuth et al., 2011) and mathematicians (Lockwood, Ellis, Dogan, Williams, & Knuth, 2012) use examples to explore mathematical conjectures. Mathematicians actively consider examples and "it is probably the case that most significant advances in mathematics have arisen from experimentation with examples" (Epstein & Levy, 1995, p. 6). There is often a back-and-forth interplay between mathematicians' example-related reasoning and their deductive proof activities (Alcock & Inglis, 2008; Lockwood, Ellis, & Knuth, 2013). For examples to be most useful, the prover may need to intentionally choose examples with a specific purpose in mind, while confronting a task that lends itself well to their use (Sandefur, Mason, Stylianides, & Watson, 2013).
The role examples play in the justification activities of K-12 students is quite different. K-12 students can be overly reliant on examples and mistakenly use examples alone as a form of proof (Harel & Sowder, 1998; Healy & Hoyles, 2000; Knuth et al., 2009; Koedinger, 1998; Porteous, 1990). There is also evidence that K-12 students may not analyze examples strategically in order to make sense of a mathematical statement and to develop insight into its proof (Cooper et al., 2011; Knuth, Kalish, Ellis, Williams, & Felton, 2011). The contrast between the role examples play in the work of mathematicians and K-12 students is not surprising given that these students typically receive very little instruction on how to analyze examples when developing, exploring, understanding, and proving conjectures. Although K-12 students may see confirmatory examples as sufficient for proof, questioning whether the confirmatory result was an accident because of a "special case," or whether the example is "generic" and appropriately represents a general case, may trigger a need for proof (Buchbinder & Zaslavsky, 2011).

Typicality and Expertise
Research in other domains suggests that experts have a set of heuristics for judging which relations are important in their domain of expertise (Bédard & Chi, 1992). Experts effectively employ formulas and principles because they are able to identify which features of a problem are important ones to focus on and how to generalize from one problem to another. Novices also generalize, but they sometimes focus on irrelevant or non-productive features (Chi, Feltovich, & Glaser, 1981; Chi & VanLehn, 2012). Novices are often described as focusing on "surface" features, those that are immediately apparent but not necessarily important in the domain, while experts focus on "deep" features, which are more productive for reasoning about properties in a domain. Part of being an expert is having an educated sense of typicality, a sense that reflects real relations in the domain (Feeney & Heit, 2007). With more knowledge about the domain, ideas about relevant features and relationships may shift; a landscape architect, a horticulturalist, and a botanist focus on different features of trees (Medin, Lynch, Coley, & Atran, 1997).
While deductive inference appears to be a late-emerging skill, requiring explicit practice and instruction, inductive inference seems to be a cognitive primitive (Kuhn, 1989). In a very basic sense, any form of conditioning is a kind of inductive inference (e.g., learning to expect a more frequent event over a less frequent event). Preschool-aged children, and even infants, are sensitive to some principles of inductive inference (e.g., generalizations are more likely to be true for more homogeneous classes; Gelman, 1988). Research on scientific reasoning and critical thinking (see Kuhn, 1989) suggests that explicit, deliberate inductive argumentation or proof has an extended developmental and instructional timeline.
By 2 years of age, children understand that objects can belong to the same category, even if they are perceptually dissimilar, and use this to reason successfully about properties of those objects (Gelman & Coley, 1990). Kindergarteners can reason that a property is more likely to hold for one member of a class given that it holds for another member when those two members are similar (López, Gelman, Gutheil, & Smith, 1992). By second grade, children can recognize typicality, that is, how representative the member is of its class, and how that should impact judgments (López, Gelman, Gutheil, & Smith, 1992). Later, children gain the ability to also consider the diversity of examples: the idea that sets of examples that better cover the space of the domain are stronger for inferences. While 6-year-olds prefer sets of typical exemplars as evidence, 9-year-olds favor diverse samples, except when they are atypical, while adults consistently favor diverse samples (Rhodes, Brickman, & Gelman, 2008). The most sophisticated form of inductive argumentation, a normative theory of induction, is formal statistics (Hacking, 2001).

Typicality and Mathematical Expertise
Although ... It also is important to consider mathematicians' use of examples when exploring specific conjectures, particularly conjectures that would not be trivial to them with their expertise. In a follow-up study (Lockwood et al., 2016) ...

How Do Middle School Students Think About Typicality?
Middle school students may have less well-developed notions about choosing examples. In a recent study, middle school students and STEM doctoral students were engaged in card-sorting tasks where they constructed sets of numbers, triangles, and parallelograms (Williams et al., 2011). Both groups tended to sort numbers by whether they were even/odd or prime, with middle school students focusing more on multiplicative factors of the numbers, and STEM doctoral students focusing more on square numbers. Middle school students were also more attuned to surface-level features of numbers, like how many digits a number had and whether it contained a certain digit. For shapes, both STEM doctoral students and middle school students were most attuned to number of sides. STEM doctoral students also considered whether shapes were regular and whether they were symmetrical; middle school students were more likely to consider whether shapes were mathematically similar (i.e., same angles but a different size). Both middle school students and STEM doctoral students were also attuned to features of shapes like size, orientation, and familiarity.
Although this study provided useful ideas about properties of mathematical objects that are salient to different groups, we wanted to more specifically explore ideas about typicality. We engaged in a series of interviews with middle school students where we probed their ideas about what made mathematical objects "typical." This pilot interview study then led to a large-scale survey with middle school students and mathematicians about conceptions of typicality.

Pilot Study
For our pilot interview study, our research question was: What reasons do middle school students give for classifying numbers and shapes as typical or unusual?

Method
One-on-one interviews were conducted with 20 middle school students (11 female, 9 male; 8 in sixth-grade, 7 in seventh-grade, and 5 in eighth-grade or higher math courses) from a large suburban school district in a Midwestern state. Parental consent forms were distributed to students in 6 middle schools, and students who returned consent forms were selected on a first-come, first-served basis. The district used a reform-oriented curriculum, Connected Mathematics 2 (Lappan, Fey, Fitzgerald, Friel, & Phillips, 2006). Students were engaged in a 1-hour video-recorded interview where they were presented with the tasks in Table 1. The 2 interviewers were math education graduate students, and they followed a semi-structured protocol for interactions with participants (Kvale, 1996). The first four tasks were specific conjectures the participant would attempt to prove or disprove (Eric, Amy, Lewis, and Bob in Table 1). An analysis of how participants proved these conjectures (e.g., empirical versus formal algebraic approaches) is reported elsewhere (Cooper et al., 2011); testing examples was a primary strategy.
Note. There were 10 generic conjectures, but only 2 are shown in Table 1 for illustrative purposes.
We focus on a series of questions the interviewer asked each participant after they had determined the truth of each conjecture.The interviewer pointed out each example participants had tested when exploring the conjecture and asked the student if each example was typical or unusual and why it was typical or unusual.
Participants were also given 10 generic conjectures (such as Matt and Caro in Table 1) within 5 mathematical domains (whole numbers, even numbers, odd numbers, triangles, and parallelograms). Participants were asked to pick either a "very typical" example or a "very unusual" example to test the unspecified conjecture.
The examples that middle school students generated for specific and generic conjectures were coded based on (1) whether they deemed the example to be "typical" or "unusual" and (2) the reasons the middle school students gave for why their examples were "typical" or "unusual." The reasons were coded using emergent categories derived from constant comparisons (Glaser & Strauss, 1967), with multiple codes possible for a single example. For the four specific conjectures, the 20 participants generated 278 total examples (181 typical, 72 unusual, and 25 not specified). For the ten generic conjectures, the 20 participants generated 182 total examples (they were instructed to generate one typical or unusual example per conjecture). For the generic conjectures, in some cases the interviewer ran out of time and not all participants received all tasks; this happened in 18 instances. For the specific conjectures, in some cases, the interviewer did not ask participants to provide a reason why their example was typical/unusual, either because of time considerations or because each generated example was not exhaustively cycled through; this happened in 35 instances. A second trained coder coded a 10% subset of participants' evaluations of (1) whether numbers/shapes were typical/unusual and (2) the reasons given for why, and obtained Cohen's kappa reliability values of 1.0/1.0 (number/shape) and 0.81/0.73 (number/shape), respectively.
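For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed for two coders who label the same examples. It is an illustration only, not the authors' analysis code; the function name `cohens_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels over the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(coder_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for six examples (not data from the study).
coder_1 = ["typical", "typical", "unusual", "typical", "unusual", "unusual"]
coder_2 = ["typical", "typical", "unusual", "unusual", "unusual", "unusual"]
print(round(cohens_kappa(coder_1, coder_2), 2))
```

Kappa corrects raw agreement for the agreement two coders would reach by chance, which is why it can fall well below the simple percentage of matching codes.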

Results
Results showed that for the two specific conjectures related to numbers (Eric and Amy in Table 1), the most common reasons given for numbers being unusual (with category prevalence in parentheses) were: a number being uncommonly encountered in life (26% of instances), a number being prime (23%), and a number being odd (13%). The most common reasons given for numbers being typical were: a number being even (35% of instances), being commonly encountered in life (20%), numbers that were composite (17%), and numbers that were small (8%). For example, one student said that the numbers he tested were typical because "Well they're like even, and I seem to like even numbers more than I like odd 'cause I like-I like symmetrical things. That's just-that's just me." For the two specific conjectures related to geometric objects (Lewis and Bob), the most common reasons given for shapes being unusual were: a shape having unequal sides (33% of instances), shapes that were elongated or in non-standard orientations (14%), shapes that were not commonly encountered in life (9%), and shapes with right angles (9%). The most common reasons given for shapes being typical were: a shape being commonly encountered in everyday life (22% of instances), a shape having equal sides (14%), a shape being labelled as a square or equilateral triangle (12%), and a shape being labelled as an acute triangle (8%). One ... Being encountered in everyday life was often mentioned as a reason for being typical, so the patterns for these justifications were examined more closely. Most often (37% of typical cases coded as "everyday life"), the reason would describe how frequently the object was encountered or used. Other common reasons within this category related to how the object was seen when working with money, packaging, school books or tasks, and when examining architecture or other real world structures/objects. Thus both the idea of frequent real world encounters as well as specific, salient real world connections seemed important for typicality.
The results for generic conjectures were similar. The most common reasons for a number being typical were being even (45% of instances), often encountered in everyday life (44% of instances), base-ten properties (22%), divisibility properties other than 10s/2s (20%), or being small (11%) or an "origin" number (like 1; 11% of cases). The most common reasons for a number being atypical were not being encountered in everyday life (47% of instances), being odd (35%), being prime (22%), being large (16%), and being difficult computationally (16%). The most common reasons given for a shape being typical were being encountered in everyday life (62% of instances), being an equilateral triangle or square (21%), having equal sides (18%), or being symmetrical or not elongated (15%). The most common reasons for a shape being atypical were not being encountered in everyday life (32% of instances), having unequal sides (32%), and being elongated or in nonstandard orientation (12%).

Discussion
This study suggested that middle school students were able to decide whether the examples they tested were "typical" or "unusual," and give reasons for this classification based on mathematical and non-mathematical properties. There were characteristics of both number and shape that were consistently cited as relating to either typicality or atypicality, with frequency in everyday life being a major consideration for both. The ways in which these typicality relations interact with middle school students' ideas about mathematical generalization are less clear, as are the ways in which their judgments relate to the judgments of mathematicians, who are more experienced with mathematical justification. We explore these questions next.

Survey Study
We frame our next study by discussing two aims that came out of prior research. Taken together, the card sort study (Williams et al., 2011), the mathematician questionnaire study (Lockwood et al., 2016), the mathematician interview study (Lockwood et al., 2016), and the pilot interview study generated interesting insights about the possible typicality relations in the domain of mathematical proof.
Our first aim was to explore which particular items mathematicians and middle school students consider typical or atypical, without imposing any particular categorization of these items or calling upon specific properties. The ways mathematicians and middle school students reason about objects may vary depending on whether the context is explicitly mathematical (e.g., conjecture-testing), or an everyday life context outside of canonical mathematical practices (e.g., walking through their neighborhood). We are interested in exploring typicality relations drawn upon in both mathematical and everyday contexts (which we refer to as "mathematical typicality" and "everyday typicality"). We hypothesize that mathematicians may differ from those less experienced with mathematics in that mathematicians have a sense of mathematical typicality that is distinct from everyday typicality.
Relating to our second aim, recall that in other domains, like biology, an awareness of typicality relations means understanding deep features, those that will impact your ability to generalize from examples, while disregarding surface features that do not impact important domain relations. In our prior studies, middle school students and mathematicians brought up a variety of mathematical properties and other characteristics that contributed to an object being deemed "typical" or "unusual" (e.g., being a multiple of 10, a small number, or an equilateral triangle). Our second aim was to compile a list of specific properties cited in these prior studies and examine how an object having each of these properties was associated with typicality ratings in mathematical and everyday contexts. This is useful because understanding how experts and novices construct typicality relations from mathematical properties has practical, instructional implications. In addition, a finding that distinctions between mathematical and everyday typicality correspond to an existing network of known mathematical properties would allow for a deeper understanding of how typicality structures operate in mathematics, compared to simply observing that there are differences in a series of isolated cases (as in R1). Table 2 describes the properties of numbers and shapes that were compiled, as well as which prior studies were drawn upon when considering whether to include that property. We hypothesize that middle school students and mathematicians will both attend to specific properties for judgments about typicality; however, whether a property makes an object typical or atypical may vary.

Research Questions
We investigate the following research questions (each related to one of our aims):

R1. Is there a sense of "mathematical typicality" within mathematical contexts that is different from "everyday typicality" within everyday contexts, and what items are considered typical or unusual in these contexts? We investigate whether mathematicians and middle school students distinguish mathematical from everyday typicality by looking at how correlated their everyday typicality ratings were to their mathematical typicality ratings, as well as what specific objects they rate as more or less typical.

R2. What specific characteristics or properties drawn from prior work make objects typical and atypical for mathematicians and middle school students in each context? We investigate the overarching mathematical properties (congruency, divisibility, etc.) and everyday features (size, orientation) of mathematical objects that influence typicality judgments in mathematical and everyday contexts.

Method

Participants
Middle school student sample - A total of 475 middle school students (46% female) took the survey with pencil and paper. These students were from a school in the pilot study. The school was 48% Caucasian, 21% African American, 14% Asian, 11% Hispanic, and 1% Native American, with 37% free/reduced lunch and 10%

Instruments
Middle school student survey instrument - Each survey form contained questions from two of four different domains: numbers, parallelograms, triangles, and birds (birds are omitted from this analysis). For each domain, middle school students were presented with mathematical objects or items in that domain (e.g., a small equilateral triangle or the number "6") and asked to rate each item's typicality on a scale from 1 (not at all) to 7 (very much) (Figure 1) in a mathematical context and in an everyday context. These contexts were accompanied by instructions that told the student to consider the survey item in the relevant way, referring the student to their experiences in math class or at home. Mathematical typicality pertained to whether conjectures that held for the mathematical object would generalize to most other objects of that type. Although this is a generic prompt (as it does not give a particular conjecture), given that we were administering a large-scale survey to make broad comparisons between contexts, objects, and groups, it made the most sense for our research purpose. Everyday typicality asked about the typicality of the item in everyday life. The full text of these instructions is shown on the left side of Figure 1; these instructions were only given once at the beginning of each section. In addition, each question, as shown in the second column of Figure 1, re-iterated the context.
Item selection - Mathematical objects to be placed on the survey were selected by the researchers to either cover the space of possible mathematical properties in the domain (e.g., the parallelogram in Figure 1 is a rectangle; we also included squares, rhombi, etc. to include the mathematically special classes of parallelograms) or to have few properties that would distinguish the object mathematically (e.g., a long, skinny rhomboid with no 90-degree angles). See Table 2. We also chose items that were common in everyday life (e.g., the number 10) and items that were uncommon in everyday life (e.g., the number 102), following our preceding discussion. See Appendix for a complete list of the items used on the survey.

Survey design - The number items were divided into three sets; there were two sets of parallelograms and two sets of triangles. There was also an additional set where middle school students were asked to judge the similarity or dissimilarity of two items, not considered here. Each student received two of these sets on their survey, and for each set they were asked to give ratings of the items in the set in both the mathematical and everyday contexts. Whether they rated items in a mathematical context or everyday context first was randomized. The order of the items (the 9 triangles, 9 numbers, or 10 parallelograms) within each context was randomized, as was the order of the 2 sets each student received.
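The randomization described above can be sketched in code. The following is an illustrative reconstruction under stated assumptions, not the procedure actually used to assemble the paper forms: we assume the two sets on a form come from different domains, that context order is drawn once per student, and that items are reshuffled for each context. The names `ITEM_SETS` and `build_form` and the item labels are hypothetical placeholders.

```python
import random

# Hypothetical item sets: name -> (domain, items). The real items are in the Appendix.
ITEM_SETS = {
    "numbers_1": ("numbers", ["6", "10", "102"]),
    "numbers_2": ("numbers", ["7", "100", "497"]),
    "triangles_1": ("triangles", ["equilateral", "scalene", "right"]),
    "parallelograms_1": ("parallelograms", ["square", "rectangle", "rhomboid"]),
}


def build_form(item_sets, rng):
    """Return one survey form as a list of (set name, context, item order)."""
    names = list(item_sets)
    # Draw two sets in random order; assumption: they come from different domains.
    while True:
        chosen = rng.sample(names, 2)
        if item_sets[chosen[0]][0] != item_sets[chosen[1]][0]:
            break
    # Assumption: which context comes first is randomized once per student.
    contexts = ["mathematical", "everyday"]
    rng.shuffle(contexts)
    form = []
    for name in chosen:
        _, items = item_sets[name]
        for context in contexts:
            order = list(items)
            rng.shuffle(order)  # item order randomized within each context
            form.append((name, context, order))
    return form


for block in build_form(ITEM_SETS, random.Random(0)):
    print(block)
```

Passing an explicit `random.Random` instance keeps each generated form reproducible from its seed.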
Mathematician survey instrument - On Version 1 of the mathematician survey (given to 186 of the 326 participants), mathematicians would rate 7 number items in a mathematical context, and 7 number items in an everyday context. Then they would rate either 5 triangles or 5 parallelograms in a mathematical and everyday context. The instructions, prompts, and rating scales were all identical to those on the middle school student survey to ensure strict comparability (Figure 1). The specific prompts were also identical: for everyday typicality, participants were asked "How typical is this number/triangle/parallelogram of those you see in your everyday life?" and for mathematical typicality participants were asked "Imagine that we learned a new mathematical property that was true of this number/triangle/parallelogram. How likely is it that the property will be true of most other numbers/triangles/parallelograms?" (Figure 1).

The items each mathematician received were randomly selected by the Qualtrics survey environment from a set of items that was twice as large (sets of 14 numbers, 10 triangles or parallelograms). A new version of the survey (Version 2) was introduced approximately halfway through the data collection period and was taken by 140 of the 326 participants. In this version the number and shape items were different, and the participants would rate a total of 8 number items in each context instead of 5. When both versions are collapsed, the mathematician survey used the same number and shape items as the middle school student survey, except that one parallelogram and one triangle were left off of the mathematician survey. However, two additional number items (163452 and 3432984) were added to the mathematician survey to see if mathematicians rated the typicality of very large numbers differently than smaller numbers.

Data Collection Procedures
Middle school student data collection procedures - Middle school students completed the paper-based survey during their regular math class. They would first respond to a prompt about how typical it was to eat oatmeal for breakfast to familiarize themselves with the Likert rating scale and the idea of typicality. They were instructed to answer every question on the survey and were told that there were no "right" or "wrong" answers. Demographic data relating to gender, grade level, and teacher was requested.
Mathematician data collection procedures - Mathematicians were recruited by sending emails to mathematics departments at 39 universities in the United States. The email was sent to department personnel with the request that it be forwarded to mathematics faculty and doctoral students. The email invited the mathematicians to participate in an online survey in Qualtrics - there was no compensation offered and participation was voluntary. The email and the introduction to the survey stated that the purpose of this research was to better understand how middle school students use evidence in mathematics by examining mathematician data on example usage. Demographic data relating to gender, education, area of specialization within mathematics, language, and country of origin was collected from the mathematicians.

R1: Mathematical Versus Everyday Typicality
The first part of the first research question asked whether participants distinguished between two contexts: everyday and mathematical typicality. We addressed this question by calculating and visualizing the correlation between everyday-context and mathematical-context typicality ratings for each group (mathematicians, middle school students) in each domain (numbers, parallelograms, triangles). To compute Pearson correlations, we averaged each group's ratings of each item within a domain, separately for the everyday and mathematical contexts. We then computed the correlation between those two sets of item averages.
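The item-level correlation described above can be illustrated with a minimal Python sketch. The ratings, item labels, and function names below are invented for illustration and are not the study's data or analysis code; the sketch only shows the two steps (average each item's ratings within a context, then correlate the two sets of item averages).

```python
from math import sqrt
from statistics import mean

# Hypothetical 1-7 ratings (assumption, not survey data): context -> item -> ratings.
ratings = {
    "everyday": {"shape_a": [7, 6, 7], "shape_b": [6, 6, 5], "shape_c": [2, 3, 2]},
    "mathematical": {"shape_a": [2, 1, 2], "shape_b": [3, 3, 4], "shape_c": [6, 6, 5]},
}

def item_means(by_item):
    """Average all ratings of each item within one context."""
    return {item: mean(values) for item, values in by_item.items()}

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

everyday = item_means(ratings["everyday"])
mathematical = item_means(ratings["mathematical"])
items = sorted(everyday)  # fixed item order so the two lists line up
r = pearson([everyday[i] for i in items], [mathematical[i] for i in items])
print(f"item-level correlation: {r:.2f}")
```

With these invented ratings the correlation is negative, the pattern that would arise if items common in everyday life were rated as poor bases for generalization.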
Middle school students' everyday and mathematical typicality ratings were strongly correlated for the triangle, parallelogram, and number domains. They treated the shapes that were typical in their everyday lives as mathematically typical, and the numbers that they considered everyday atypical were also treated as mathematically atypical. In contrast, mathematicians tended to see highly familiar and common shapes (everyday typical) as mathematically atypical, and they rated unfamiliar, uncommon shapes as mathematically more typical, resulting in a negative correlation. Additionally, for mathematicians there was a less reliable relation between everyday and mathematical typicality for numbers.
We explored these trends further by plotting the average mathematical typicality rating for each item versus the average everyday typicality rating for each item for both middle school students and mathematicians. If mathematical typicality and everyday typicality are not distinct, we would expect all items to fall on the line y = x, meaning that their everyday typicality rating (x coordinate) and mathematical typicality rating (y coordinate) are identical. This graphical approach highlights items whose typicality varies based on the context, allowing us to see which items are rated differently in a mathematical versus everyday context.
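The same comparison against the line y = x can be expressed numerically. The following Python sketch is illustrative only: the item means, the item labels, and the threshold of 1.0 scale points are assumptions made for the example, not values from the survey.

```python
# Hypothetical item means on the 1-7 scale (assumption): item -> (everyday, mathematical).
item_means = {
    "item_a": (6.5, 2.0),
    "item_b": (4.0, 4.2),
    "item_c": (4.1, 2.9),
    "item_d": (3.0, 5.0),
}

def context_gaps(means, threshold=1.0):
    """Return (item, gap) pairs with |mathematical - everyday| > threshold, largest gap first."""
    gaps = {item: m - e for item, (e, m) in means.items()}
    flagged = [(item, gap) for item, gap in gaps.items() if abs(gap) > threshold]
    return sorted(flagged, key=lambda pair: abs(pair[1]), reverse=True)

for item, gap in context_gaps(item_means):
    side = "above" if gap > 0 else "below"
    print(f"{item}: {side} the line y = x by {abs(gap):.1f} scale points")
```

Items above the line are rated as more typical mathematically than in everyday life, and items below it show the reverse. Items within the threshold are treated as lying on the line.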
Figure 2 (6 panels) presents mathematicians' and middle school students' average mathematical and everyday ratings for number, triangle, and parallelogram items. The items in the upper half of each plot are items that were considered typical in a mathematical context, meaning that participants thought properties that hold for these items are likely to generalize. Items in the bottom half of each plot are items that were considered mathematically atypical, meaning that generalization from these examples to all items would generally be weaker. Items on the right side of each plot were considered to have high everyday typicality, while items on the left side were rated as relatively uncommon in everyday life. These plots reveal general trends for what objects are considered typical/atypical, the second part of our first research question. Middle school students found the numbers 1, 2, 10, and 25 to be typical in both mathematical and everyday contexts, while numbers like 83, 102, and 57 were atypical in both contexts. Mathematicians showed somewhat similar trends for everyday typicality, but only found 0 and 1 to be mathematically atypical. Middle school students found groups of triangles and parallelograms that could be described as part of well-known classes (e.g., equilateral, rectangle - see Figure 2) to be typical in both contexts, while mathematicians found these same objects typical in an everyday context but atypical in a mathematical context.
Of interest in the number domain is a group of apparent outliers where everyday and mathematical typicality do not correspond. Mathematicians rated the additive and multiplicative identity elements (0 and 1) as less mathematically typical than other numbers, despite their high level of commonness in everyday life. For middle school students, there is one extreme outlier in the number domain - 13 - which is rated as less mathematically typical than other numbers of similar everyday typicality (e.g., 14, 11). This deviation likely reflects the status of 13 as a "superstitious" number. However, it is also possible that 13 is a salient example of the mathematical property of being prime. A second outlier, 10,000, was rated by middle school students as more mathematically typical than other numbers, but less everyday typical. This rating may reflect the special significance of powers of 10 in our base ten counting system. Thus there was evidence that middle school students may distinguish the mathematical and everyday typicality of numbers. However, it is not clear that they are basing their judgments on mathematically significant features of the numbers. We see fewer obvious outliers in the shape domain - middle school students have a very consistent pattern where they rate shapes similarly in everyday and mathematical contexts, and mathematicians have a very consistent pattern where they rate shapes with high everyday typicality as having low mathematical typicality.

R2: Properties That Make Objects Typical or Atypical
The second research question asked what specific properties make objects typical or atypical for mathematicians and for middle school students. The preceding analyses demonstrate that mathematicians and middle school students show very different judgments about mathematical typicality. Moreover, they suggest that middle school students may use everyday typicality as the basis for judging mathematical typicality. To explore this issue further, we used a list of specific properties relevant to typicality, determined from prior research (Table 2).
Data were analyzed using mixed-effects linear regression models (Snijders & Bosker, 1999) run using the lmer command (Bates & Maechler, 2010) in the R software package (R Core Development Team, 2010). All regression models included participant ID and item ID as random effects and the typicality rating (1-7) as the dependent measure. Model selection was conducted using a chi-square test to examine significant reductions in deviance, and significance levels were computed using MCMC sampling (Baayen, 2008). We fit a separate model for each domain: number, triangle, and parallelogram, as each domain has different specific properties of import. We also fit separate models for middle school students and mathematicians. The models included context (mathematical, everyday) as a fixed effect, indicator variables for the specific mathematical and everyday properties (Table 2), and the interaction of context and the properties. Ratings in everyday and mathematical contexts were placed into a single model, rather than separate models, for several reasons - primarily, it allowed us to detect cases where, for instance, a property increased typicality in both contexts, but increased it significantly more in one context. In addition, keeping the data together allows for better estimation of participant and item random effects, and reduces Type 1 error.
Tables 3, 4 and 5 present regression results showing the contribution of various properties to typicality judgments. In each regression, one level of a factor was designated as the reference level. The coefficients indicate the magnitude of the effect with a change to the other level of the factor (all factors were binary, except magnitude). For example, in Table 3 (number) mathematical typicality is designated the reference level, as is "being a large number." Other reference categories for the binary predictors are not explicitly labeled, but are the opposite of the level indicated - for example, the reference levels include not being an Identity Element, not Ending in 5, etc. Rows that are blank indicate variables that were significant in one model (e.g., mathematician), but not significant in the other model (e.g., student).
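To make the role of reference levels and context-by-property interactions concrete, the Python sketch below combines dummy-coded fixed effects into a predicted rating. All coefficient values and names are hypothetical and are not taken from Tables 3-5. Random effects for participant and item are omitted, so this illustrates how such coefficients are read rather than reproducing the fitted models.

```python
# Hypothetical fixed-effect coefficients (assumption; NOT values from Tables 3-5).
coef = {
    "intercept": 4.0,          # reference cell: mathematical context, property absent
    "everyday": 0.5,           # shift when the context is everyday rather than mathematical
    "identity": -1.5,          # effect of the property in the reference (mathematical) context
    "everyday:identity": 2.5,  # additional effect of the property in the everyday context
}

def predicted_rating(everyday: bool, has_property: bool) -> float:
    """Fixed-effects prediction for one cell of the context-by-property design."""
    e, p = int(everyday), int(has_property)
    return (coef["intercept"]
            + coef["everyday"] * e
            + coef["identity"] * p
            + coef["everyday:identity"] * e * p)

def property_effect(everyday: bool) -> float:
    """Change in predicted rating when the property is present, within one context."""
    return predicted_rating(everyday, True) - predicted_rating(everyday, False)

print("effect in mathematical context:", property_effect(False))
print("effect in everyday context:", property_effect(True))
```

In this coding, the property's main-effect coefficient is its effect in the reference context, and its effect in the other context is the sum of the main effect and the interaction. A property can therefore lower typicality in one context while raising it in the other.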
In general, the analyses presented in Tables 3, 4 and 5 support the qualitative patterns suggested in Figure 2.
For mathematicians, only Identity had a significant effect on judgments of mathematical typicality - it decreased mathematical typicality. For middle school students, Small Magnitude, Multiple of 5, Power of 10, and Even Parity increased mathematical typicality. Thus middle school students felt that conjectures were more likely to generalize if the numbers tested were small or even numbers, or multiples of 5 or powers of 10. For middle school students, magnitude had a significantly larger effect on everyday than mathematical typicality, but significantly increased typicality in both contexts. Notably, only middle school students' everyday typicality judgments were influenced by a number being 0 or 1 (which we refer to as an identity element), not their mathematical typicality judgments. Mathematicians and middle school students both saw 0 and 1 as more everyday typical than non-identity numbers, but mathematicians saw Identity Elements as less mathematically typical than other numbers. Mathematicians made sharper distinctions between mathematical and everyday typicality, rating primes, perfect squares, powers of 2, and even numbers as significantly more everyday typical than mathematically typical.
Analyses of triangles and parallelograms showed similar patterns. Mathematicians judged Isosceles and Equilateral triangles as less mathematically typical than non-Isosceles and non-Equilateral triangles (respectively). However, the effects of these dimensions reversed for everyday typicality (e.g., Isosceles more typical than non-Isosceles). Right triangles were more everyday typical than non-Right triangles for mathematicians. Middle school students judged non-Skinny, Isosceles, and Equilateral triangles to be more typical in both contexts, although equilateral triangles were significantly more typical in an everyday context, compared to a mathematical context. Standard orientation and Right contributed significantly to middle school students' everyday typicality but not mathematical typicality judgments of triangles. Thus for triangles, mathematicians recognized that special mathematical properties (Isosceles, Equilateral) make triangles less mathematically typical, while middle school students had the opposite trend. In an everyday context, both groups recognized properties (Equilateral, Isosceles, Right) that make triangles more typical.
Note. Blank rows indicate that predictors were not significant for that group.
For parallelograms, mathematicians judged Squares to be less mathematically typical than non-Squares, Rectangles to be less mathematically typical than non-Rectangles, and Rhombi to be less mathematically typical than non-Rhombi. However, in each case these effects reversed for everyday typicality (e.g., Squares more typical than non-Squares). Parallelograms with near Golden-ratio proportions were rated by mathematicians as more everyday typical than other parallelograms, but proportion had no significant effect on mathematical typicality. For middle school students, being Square and being Rectangular significantly increased mathematical typicality. These same features also increased everyday typicality, with squares being significantly more everyday typical than mathematically typical. Several other features, including Size, Orientation, and Skew, affected everyday but not mathematical typicality. For mathematicians, special mathematical properties like Rectangle and Square increased everyday typicality but decreased mathematical typicality; for middle school students, these properties increased both ratings. Middle school students and mathematicians also attended to other more everyday properties (Golden-ratio, Orientation) when considering everyday typicality.
Note. Eday = everyday. Blank rows indicate that predictors were not significant for that group.

General Discussion
Expertise is often characterized by distinctive judgments regarding typicality relations among objects in a domain. In mathematics, it is reasonable to assume that experts see different connections among objects than others. Psychological principles of inductive inference have been explored in other domains, such as living things (Osherson et al., 1990). However, little research has examined how these inductive principles may have relevance in mathematics. In addition, while deductive reasoning is generally thought to become fully available during adolescence (i.e., middle school), little prior research has examined when and how principles of strong versus weak inductive inference in mathematics tend to develop in children, and how understanding of such principles can be facilitated. Here, we explored the nature of mathematicians' and middle school students' judgments of mathematical objects with respect to typicality relations - judgments of how appropriate it is to generalize from a particular object to the domain. Results suggest that mathematicians have some special and somewhat consistent criteria for judging the typicality of objects in their domain. Middle school students, on the other hand, did not yet have fully developed ideas related to the significance of mathematically-specific organizations for examples. The appropriate use of principles of inductive inference, such as typicality, has been found to develop early in domains other than mathematics (e.g., López, Gelman, Gutheil, & Smith, 1992). Children can recognize which members are typical instances of their class, and use typicality as a basis to draw valid inferential conclusions. However, as mathematical domains have the problem of conflating everyday and mathematical typicality, it does not appear that development happens in the same manner.

Everyday Typicality
Mathematicians and middle school students both showed robust and consistent judgments about what we termed "everyday" typicality. Some numbers and shapes were consistently cited as more common in everyday life than others (e.g., small numbers and equilateral shapes). Middle school students' reasoning often cited that these objects are encountered more often in their experience. People may have perceptual (Feldman, 2000) or experiential biases from interacting with the world (Lakoff, 1990) that lead them to represent some objects as more typical and then report that typical objects are most common. Further, items with high utility in everyday life (smaller numbers, "nice" numbers) tend to be used more often, and may thus be more salient in everyday contexts. We suspect that everyday typicality judgments reflect a mix of frequency, expectations, and utility. The numbers 1 and 10 were cited as being highly typical in everyday life, likely due to both the frequency with which they are encountered and their usefulness for making quick calculations in base ten. In sum, mathematicians and middle school students generally agreed with each other on their ratings of everyday typicality.

Mathematical Typicality
Both studies showed interesting contrasts between this everyday sense of typicality and a "mathematical" sense of typicality - i.e., the degree to which one mathematical object was a reliable basis for inferences about other objects. Middle school students showed robust and consistent judgments about mathematical typicality for both number and shape that did not differ from their judgments about everyday typicality. In contrast, mathematicians made sharp distinctions between everyday and mathematical typicality; these judgments were almost perfectly negatively correlated for geometric shapes. The most typical shapes in an everyday context, like squares and right triangles, were the least typical in a mathematical context. For numbers, mathematicians either did not have consistent judgments about mathematical typicality or did not see a relevant network of typicality relations when considering a generic conjecture in this domain. Whatever this signals about the nature of expertise in the domain of numbers, the contrast with middle school students is important. Mathematical typicality is not the same as everyday typicality for mathematicians, for numbers or shapes.

Limitations
Our studies varied whether participants were given a specific mathematical conjecture to explore (as in the pilot interview study) or given a generic prompt about conjecture-testing (as in the survey study). Typicality judgments are certainly applied most productively when a specific mathematical task is given. For example, when considering conjectures about additive properties, concerns relating to whether a number is positive or negative might be most paramount for generalizability. In contrast, when considering conjectures about multiplicative properties, properties relating to divisibility might matter most. The type of mathematical task may be especially important for numbers (which have many potential properties of importance) compared to shapes (which have a clearer hierarchical organization of relevant properties).
In addition, in the survey study the classes to which generalizations were invited were more specific for shapes than for numbers. Participants were asked whether facts about one number generalized to "most other numbers," while facts about shapes were generalized to "most other triangles/parallelograms" instead of the larger class of all geometric figures. There was also a difference in specificity - shapes were presented without side or angle measurements, whereas the number items were specific - "13" or "102." Finally, the example numbers

generally noticed the same mathematical properties mathematicians did - they just struggled to apply this knowledge in a way that follows principles of inductive inference.
In other research, we have questioned middle school students on whether typicality considerations are important to take into account when they are proving conjectures (Cooper et al., 2011), or when they are evaluating the empirical proofs of others (Cooper, Dogan, Young, & Kalish, 2012). Results suggest that while middle school students do have good informal ideas about the relationship between typicality and justification through inductive inference, formally applying these principles in practice is challenging. For example, one student in Cooper et al. (2011)

Implications for Instruction
With generalization being framed as central to mathematics education, our results suggest that classrooms would benefit from exploration of the properties of the examples K-12 students choose, and discussion of how these properties interact with ideas relating to generalization in the context of the particular conjecture at hand.
Instruction on exploring conjectures could include discussions about the typicality of mathematical objects - i.e., which objects it makes the most sense to test as K-12 students first consider the conjecture, as they try to "break" the conjecture, and as they try to find a pattern to prove the conjecture, and why. Understanding that special mathematical properties of an object should be a consideration when determining whether a conjecture is likely to generalize to all other objects could be another instructional goal relating to the nature of inductive evidence in mathematical domains. Such discussions would also include what constitutes strong versus weak evidence in inductive reasoning. It is also important to consider how reasoning inductively can support the development of deductive proofs. Testing examples can be especially useful for understanding the structure of conjectures, or how they work, in order to reveal an underlying mathematical pattern. Finally, everyday aspects of typicality could be discussed - K-12 students could critically examine whether surface features like a number's magnitude or digit patterns or a shape's skinniness or orientation are important considerations in the context of their current mathematical exploration.

Concluding Remarks
Experts understand objects and ideas in their domain of expertise as a complex interwoven network of relationships based on important and useful properties. This network of relationships allows them to strategically confront and solve novel tasks through a consideration of relevant domain principles. Typicality is one such principle that is relevant to practices of inductive inference, but this principle has mainly been identified as being useful in fields other than mathematics. Here we provide evidence that mathematicians have a distinct sense of mathematical typicality that can guide their activities related to justification and generalization.
Example spaces have structural characteristics like the density of available examples, the generativity potential for new examples using existing examples, the connectedness of examples, and "the extent to which a given example is specific or whether it is representative of a class of related examples" (Sinclair, Watson, Zazkis, & Mason, 2011, p. 302).
student discussed the triangles he generated, saying "I guess I think of the typical ones are all equal. And I guess I just think of triangles that are typical that are all in-just-I don't know… equilateral I guess. And then these that are all different lengths, I think they're just different. I don't think of them as typical." Another student said a parallelogram was unusual because "Probably because it's so like thin and long. Cause when I think of a parallelogram I think of like a house or a door."

Figure 1. Example of items from survey instrument.

How Do Mathematicians Think About Typicality?
be typical."), as well as random, exhaustive, and dissimilar examples.
, six mathematicians were presented with four conjectures (three of which were Putnam exam conjectures) during one-on-one interviews. Results reinforced and extended the findings from the survey -

Table 1
Tasks Used in Middle School Student Interview Pilot Study
1. Eric came up with a new mathematical property. He thinks this property is true for every whole number. First, pick any whole number. Second, add this number to the number before it and the number after it. Your answer will always equal 3 times the number you started out with.
2. Amy came up with a new mathematical property. She thinks this property is true for every even number. First, pick any even number. Second, add this number to half of itself. Your answer will always be divisible by 3.
3. Lewis came up with a new mathematical property. He thinks this property is true for every triangle. For any triangle, two of the sides added together are longer than the third side.
4. Bob came up with a new mathematical property. He thinks this property is true for every parallelogram. The angles inside any parallelogram add up to 360 degrees.
Generic Conjectures
5. Matt came up with a new mathematical property. He thinks this property is true for every whole number. If someone asked you to pick a very typical whole number to test if this property is true, what whole number would you pick?
6. Caro came up with a new mathematical property. She thinks this property is true for every whole number. If someone asked you to pick a very unusual whole number to test if this property is true, what whole number would you pick?

Table 2
Mathematical and Everyday Properties Considered for Entering Individual Properties Into Model (RQ2), With Justifications for Inclusion
a Items with this property were given to mathematicians only.

Table 3
Middle School Students' (Left) and Mathematicians' (Right) Regression Model Outputs for the Effects of Mathematical and Everyday Properties of Numbers on Typicality Ratings

Table 4
Middle School Students' (Left) and Mathematicians' (Right) Regression Model Outputs for the Impact of Mathematical and Everyday Properties of Triangles on Typicality Ratings

Table 5
Middle School Students' (Left) and Mathematicians' (Right) Regression Model Outputs for the Impact of Mathematical and Everyday Properties of Parallelograms on Typicality Ratings

described how "If he didn't use unusual numbers, you know, you can never be sure if his property is correct," while another said "If I used some ones that you people wouldn't normally use, besides 10, and if I did a little more maybe it wouldn't be or maybe it'd still be true." In addition, middle school students in Cooper et al. (2012) recognized that testing a conjecture with more examples is better than fewer examples, and that using dissimilar examples is better than similar examples. Thus middle school students seem to have the building blocks of a more systematic sense of typicality as it relates to mathematical conjectures, but this relationship needs further development in the classroom.