Calculated avoidance: Math anxiety predicts math avoidance in effort-based decision-making

Math anxiety predicts how much effort people are willing to put into doing math.


This PDF file includes:
Materials and Methods S1. Participants: Recruitment details
Materials and Methods S2. The CAST: Creation and validation of the problem set
Fig. S1. Temporal stability of the math/word ADLs and HCPs.
Fig. S2. Test-retest reliability of the math/word ADLs and HCPs.
Fig. S3. Relationships between math anxiety and problem-solving variables.
Table S1. Descriptive statistics and correlation matrix of questionnaires and behavioral measures of study 1.
Table S2. Descriptive statistics and correlation matrix of questionnaires and behavioral measures of study 2.
Table S3. Results of study 1 LMM analysis for the math-specific effort avoidance.
Table S4. Results of study 2 LMM analysis for the math-specific effort avoidance.
Table S5. Results of comprehensive LMM analysis for study 1.
Table S6. Results of confirmatory generalized regression analysis for study 1.
Table S7. Results of confirmatory generalized regression analysis for study 2.

Materials and Methods S1. Participants: Recruitment details
As indicated in the main text, participants were recruited using TurkPrime to complete the online study via the online labor market Amazon Mechanical Turk (AMT), in which workers perform human intelligence tasks (HITs) for requesters. The study was conducted on weekdays only; participation was allowed only between 8 a.m. and 8 p.m. CST. If a participant failed more than two (out of ten) attention check questions embedded in the questionnaires that preceded the choose-and-solve task (CAST), the study session was terminated.
In Study 1, AMT worker qualifications included location in the United States, a HIT approval rate greater than or equal to 98%, and the number of HITs approved greater than or equal to 5,000. A target sample size of 194 was set to detect an expected correlation of .2 between self-reported and behavioral measures with 80% power at a 5% significance level. However, computer errors prevented all but 154 participants from completing the study and also caused the loss of demographic information from 9 of those 154 participants.
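The target sample size can be approximately reproduced with the standard Fisher z approximation for correlation tests. The sketch below is illustrative only: it assumes a two-sided test, and the function name `n_for_correlation` is hypothetical; the source does not state which software or formula was used.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, power=0.80, alpha=0.05):
    """Approximate N needed to detect a correlation r.

    Uses the Fisher z approximation and assumes a two-sided test
    (an assumption; the source does not say which test was planned).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    effect = math.atanh(r)                         # Fisher z-transformed r
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


print(n_for_correlation(0.2))  # 194
```

Under these assumptions the approximation returns 194 for r = .2, in line with the stated target.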
In Study 2, TurkPrime Panel Options were used to recruit specific demographic groups without asking for demographic information before the questionnaires and CAST to avoid triggering stereotypes that could affect the self-report and behavioral measures. Specifically, we targeted 188 female and 188 male participants who speak English as their first language and are between the ages of 18 and 35 in addition to using the AMT worker qualifications of location in the United States, a HIT approval rate greater than or equal to 90%, and the number of HITs approved greater than or equal to 500.

Materials and Methods S2. The CAST: Creation and validation of the problem set
Math problems: To create a large pool of 3-alternative choice math problems, candidate math problems were first generated by multiplying a list of 3-digit numbers (100-999) with a list of 1-digit numbers (2-9), and by multiplying a different list of 3-digit numbers (100-299) with a list of 2-digit numbers (11-19). A new list of 3x1-digit problems whose solutions did not exceed 3000 and 3x2-digit problems whose solutions did not exceed 6000 was then compiled, and 1500 problems from this list were randomly selected for further validation. The set of 1500 problems was adapted to fit our paradigm by removing one digit from the solution of each problem. The missing digit, which could be the 1st, 2nd, or 3rd digit in the solution, was then used as one of the 3 alternative choices participants could select to solve the problem. An attempt was made to balance the frequency with which digits 0-9 were removed across problems. The other two alternative choices were always one unit higher and one unit lower than the removed digit (e.g., if "2" was the removed digit, the 3-alternative choice options would be "1, 2, 3").
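The generation steps for the math problems can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names are hypothetical, the full operand ranges are used in place of the unspecified "lists" of numbers, balancing of removed digits is omitted, and, because the text does not say how removed digits of 0 or 9 were given neighboring choices, the sketch simply returns None in those cases.

```python
import random


def candidate_products():
    """Enumerate (a, b) pairs within the solution limits described in the text."""
    pairs = []
    for a in range(100, 1000):      # 3-digit x 1-digit, solution <= 3000
        for b in range(2, 10):
            if a * b <= 3000:
                pairs.append((a, b))
    for a in range(100, 300):       # 3-digit x 2-digit, solution <= 6000
        for b in range(11, 20):
            if a * b <= 6000:
                pairs.append((a, b))
    return pairs


def make_problem(a, b, position):
    """Remove the digit at `position` (0, 1, or 2) from the solution of a x b."""
    solution = str(a * b)
    digit = int(solution[position])
    if digit in (0, 9):
        # Edge case not specified in the source; skipped here by assumption.
        return None
    masked = solution[:position] + "_" + solution[position + 1:]
    return {
        "prompt": f"{a} x {b} = {masked}",
        "choices": [digit - 1, digit, digit + 1],  # one lower, removed, one higher
        "answer": digit,
    }


rng = random.Random(0)                        # fixed seed for reproducibility
pool = rng.sample(candidate_products(), 1500)  # 1500 problems, as in the text
print(make_problem(123, 4, 0))
```

For example, `make_problem(123, 4, 0)` masks the first digit of 492, giving the prompt "123 x 4 = _92" with choices 3, 4, and 5.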
Word-spelling problems were then created by replacing the third letter of each word with a blank space. Additional letters of some words were replaced by a tilde (~) to make it harder to determine which letter should go in the blank space. Tildes replaced the second letter of words that were 5-7 characters long (e.g., EVENTS → E~☐NTS), both the second letter and the sixth letter of words that were 8-10 characters long (e.g., EVIDENCE → E~☐DE~CE), and the second, sixth, and ninth letters of words that were 11-13 characters long (e.g., ANNOUNCEMENTS → A~☐OU~CE~ENTS). Words that became identical after having letters replaced with tildes were removed from the testing set. The solution to the word-spelling problem, which was always the third letter of the word (the letter replaced by the blank space), was included as one of the three alternative choices participants could select to solve the problem; the 3-alternative choice sets were always either (a, e, i), if the solution was a, e, or i, or (n, r, t), if the solution was n, r, or t, presented in this order. All letters in the word-spelling problems were presented in capital letters.
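The masking rule for the word-spelling problems can be sketched as follows. This is an illustration only: the function names are hypothetical, the tilde positions are taken from the three worked examples in the text (EVENTS, EVIDENCE, ANNOUNCEMENTS), and dropping every word involved in a collision (rather than keeping one of them) is an assumption.

```python
from collections import Counter


def mask_word(word):
    """Return (masked word, solution letter) for a 5-13 letter word."""
    word = word.upper()
    n = len(word)
    if 5 <= n <= 7:
        tilde_positions = (2,)
    elif 8 <= n <= 10:
        tilde_positions = (2, 6)
    elif 11 <= n <= 13:
        tilde_positions = (2, 6, 9)   # as in the ANNOUNCEMENTS example
    else:
        raise ValueError("word length must be between 5 and 13")
    chars = list(word)
    answer = chars[2]                 # the third letter is always the solution
    for pos in tilde_positions:       # positions are 1-indexed, as in the text
        chars[pos - 1] = "~"
    chars[2] = "☐"
    return "".join(chars), answer


def choice_set(answer):
    """Return the fixed-order choice set that contains the solution letter."""
    for group in ("AEI", "NRT"):
        if answer in group:
            return list(group)
    raise ValueError("solution must be one of A, E, I, N, R, T")


def drop_collisions(words):
    """Remove words whose masked forms are identical (assumes all are dropped)."""
    counts = Counter(mask_word(w)[0] for w in words)
    return [w for w in words if counts[mask_word(w)[0]] == 1]


print(mask_word("EVIDENCE"), choice_set("I"))
```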
Problem set validation: Candidate math and word problems were grouped into sets of 60 to create 25 sets of math problems and 31 sets of word problems. Word problems were grouped so that 10 problems for each letter solution (a, e, i, n, r, and t) were included in each set. The positioning (left, middle, right) of the problem's solution in the solution bank was balanced across the problems in each math and word set. Each problem had to be solved within 8 seconds; however, key responses were not registered by the paradigm until 2 seconds after problem onset, to discourage participants from making quick guesses. These sets were then uploaded to AMT as 56 separate HITs (25 math and 31 word HITs) that paid $1.50 each. A maximum of two math HITs and two word HITs were uploaded per day. Fifty AMT workers were recruited to complete the first ten of the math and word HITs, and 30 workers were recruited to complete the remaining 15 math HITs and 21 word HITs. AMT worker qualifications included location in the United States, a HIT approval rate greater than or equal to 98%, and number of HITs approved greater than or equal to 5,000. Workers provided informed consent before completing the HITs. Data from workers who answered fewer than 28 (out of 60) problems correctly were discarded. A total of 692 unique workers completed at least one HIT (333 unique math workers, 545 unique word workers); 24 workers completed 10 or more math HITs, and 14 workers completed 14 or more word HITs. Workers' response time and accuracy were used to sort 1,999 math problems and 1,858 word problems into seven difficulty levels, which are available at https://osf.io/t4wju/.
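The response window and the worker-level exclusion rule can be expressed as a short sketch. The function names are hypothetical, and treating the 2-second and 8-second boundaries as inclusive is an assumption; the text does not specify boundary handling.

```python
def registered_response(keypress_times, lockout=2.0, deadline=8.0):
    """Return the first keypress time (in seconds from problem onset) that
    falls inside the response window, or None if no keypress does.

    Keypresses before `lockout` are ignored, mirroring the 2-second delay
    used to discourage quick guesses.
    """
    for t in sorted(keypress_times):
        if lockout <= t <= deadline:
            return t
    return None


def keep_worker(n_correct, min_correct=28):
    """Keep a worker's HIT data only if at least 28 of 60 answers were correct."""
    return n_correct >= min_correct


print(registered_response([1.2, 3.4]))  # 3.4
print(keep_worker(27))                  # False
```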

Table S1. Descriptive statistics and correlation matrix of questionnaires and behavioral measures of study 1.
Notes. Correlation values were based on 142 participants who satisfied the problem-solving accuracy criteria (see Materials and Methods). †Correlation values were based on 135 participants, after excluding participants who never chose the hard options in either math or word conditions. *p < .05, **p < .001.

Table S2. Descriptive statistics and correlation matrix of questionnaires and behavioral measures of study 2.
Notes. Correlation values were based on 332 participants who satisfied the problem-solving accuracy criteria (see Materials and Methods). In contrast to Study 1, all Study 2 participants encountered at least 10 easy and 10 hard problems in both the math and word conditions throughout the task. *p < .05, **p < .001.