We can remember all kinds of details about our experiences in the world, but our visual systems have the capacity to ignore all kinds of details as well. Generalization relies on dual processes: attending to similarities while simultaneously ignoring differences. Efficient learning minimizes the necessary experience with learning instances (e.g., the number of learning instances needed or the time spent learning) and maximizes appropriate generalization.

Simple instances—those that are idealized or contain just the information that is relevant for generalization—have been shown to engender rapid learning with selective attention to the right information for the task. In the classic study by Biederman and Shiffrar (1987), novices briefly trained with simple line drawings of diagnostic features were able to classify chicks with the accuracy of expert chicken sexers, a skill that used to take thousands of hours classifying actual chicks to perfect. Those simple line drawings provided idealized versions of the defining features, making explicit what were subtle but critical cues for encoding and classification. More recently, young children who were taught category labels with simple objects, defined as those with fewer features and details, were more successful at generalizing to novel category members than they were when shown more complex learning objects (Son, Smith, & Goldstone, 2008). We refer to this asymmetry of transfer from simple versus complex training instances as the simple advantage.

Most of the research demonstrating the simple advantage has focused on learning and transfer of relational concepts in mathematics (Kaminski, Sloutsky, & Heckler, 2008; McNeil, Uttal, Jarvin, & Sternberg, 2009; Sloutsky, Kaminski, & Heckler, 2005) and science (Goldstone & Sakamoto, 2003; Goldstone & Son, 2005). In these relational domains, in order to generalize learning to a new situation, one must pay more attention to structural information than to superficial details that may differ across instances. For example, in McNeil et al. (2009), students who were given highly realistic bills and coins performed worse on word problems involving money than did students who were given either bland bills and coins (plain rectangular pieces of paper and circular tokens) or no bills and coins at all. McNeil et al. proposed that the extraneous surface features of the realistic bills and coins that were irrelevant to the task distracted students’ attention from the underlying mathematical structures of the problems (see also Goldstone, 2006; Sloutsky, Kaminski, & Heckler, 2005). Similarly, Goldstone and Sakamoto (2003) trained undergraduates on complex adaptive systems principles with simulations containing either idealized, abstract graphic elements (e.g., dots and blobs) or perceptually rich, concrete elements (e.g., ants and little fruits), and found that those trained with the perceptually rich display had more difficulty transferring their knowledge to a new domain that looked different but shared the same underlying concept as the one they had learned. This idea is not new; Clark Hull (1920) used slightly deformed Chinese characters to demonstrate that concepts were learned more quickly when they were learned in the order from simple to complex. Simple learning instances can facilitate such structural extraction by limiting the extraneous details that must be ignored and guiding attention to the right features at the time of encoding.

Little is known, however, about the perceptual-encoding mechanisms that may drive this simple advantage in these highly conceptual instances of transfer. To explore the perceptual mechanisms of the simple advantage, we trained and tested English-speaking adults in both category and perceptual generalization in a domain that contains a large number of complex and simple corresponding forms: Chinese character scripts. On the basis of the results of these studies, we propose a process model to explore the hypothesis that the simple advantage is driven primarily by differences in encoding perceptual elements while learning from simple and complex instances.

For a number of political and historical reasons, the traditional Chinese writing system was simplified in 1949. The simplified characters have approximately 22.5% fewer strokes than the more complex traditional script (Gao & Kao, 2002). Several different simplification processes were employed: some were based on Chinese history and meaning, but others were more straightforward perceptual simplifications. As a result, many characters and their components (recurring groups of strokes that make up the characters) have taken on quite different appearances (Harbaugh, 2003). Whether these differences between scripts affect the learnability of characters is the subject of ongoing debate among researchers who study Chinese language acquisition (see Chen & Yuen, 1991; McBride-Chang, Chow, Zhong, Burgess, & Hayward, 2005; Seybolt & Chiang, 1979). For example, Seybolt and Chiang argued that the traditional script, because it contains more visual features, may be easier to discriminate initially. It also contains more regular meaning- and phonetic-based components, which may better promote semantic and sound-based strategies earlier than among simplified-script learners. However, it is equally plausible that traditional characters are more difficult to learn because of the large numbers of strokes across characters. The current preponderance of evidence suggests that simplified characters are easier to learn than traditional ones (Hodge & Louie, 1998). This conclusion, however, is somewhat controversial, because it is hard to equate teaching curriculum, instruction, and cultural differences (see also Chung & Leung, 2008). Most importantly, and more relevant for our purpose, the bulk of this previous research has been on the acquisition of the two scripts, not on transfer from one to the other. It has been maintained that switching or transferring from one script to the other is straightforward; for example, Wang (2009, para. 4) stated that “the structural continuity makes the switch between them easy and smooth, a skill any educated person can quickly acquire.” This assumption, however, has not been empirically tested. Partially because of complicated issues of aesthetics, history, politics, and tradition, little research has examined differences in reading the distinct scripts. Such an endeavor, primarily motivated by issues in cognition and learning, would provide a quantitative way of exploring the different contentions. The rich sets of naturally occurring simple and complex corresponding characters provide a domain ideally suited to the purpose of examining the simple advantage in perceptual categorization.

Purpose of present work

The primary motivation of this set of experiments and the accompanying process model was to examine the hypothesis that simple learning instances have an advantage in encoding that drives later advantages in generalization. Our first experiment replicated previous findings that learning with simplified instances leads to greater category generalization than does training with complex forms. In a second experiment, we examined the question of perceptual generalization: Does the simple advantage occur even when participants have superficial and minimal exposure to the simplified forms? The third experiment tested novel predictions of the process model, by examining the effect of longer exposure times on the simple advantage.

Experiment 1

Participants were asked to study flashcards with a Chinese character on one side and an English definition on the other side. After each set, the participants’ memorization was measured with a matching-to-sample task in which students were briefly shown the English definition and had to pick out the matching character from among four answer choices. After the memory test, generalization was measured in the same matching task, except that participants had to match the definitions with characters from the unlearned script. In the traditional-first condition, participants studied traditional characters and their English definitions. During the traditional-first memory test, they were shown traditional characters paired with English words and first were tested on the traditional characters that they had learned (TT trials, indicating traditional learning instances and a traditional test); then they were given a generalization test in which the choices were replaced with the corresponding simplified characters (TS trials, indicating traditional learning but a simple test). During the simple-first condition, participants studied and had a memory test with simplified characters (SS), but their generalization test had traditional versions of the learned characters (ST). If simplified learning instances promote category generalization, participants should show better generalization in the simple-first (ST) than in the corresponding traditional-first (TS) condition.

Method

Participants and design

A group of 14 undergraduates (seven females and seven males) participated for course credit. All reported having no prior experience with Chinese characters. In this within-subjects experiment, half of the participants experienced the traditional-first condition (learning, memory test, generalization test) before the simple-first condition, each on a different set of characters, and the other half experienced the two conditions in the reverse order.

Materials and procedures

Although historical or semantic reasons lie behind some types of Chinese character simplifications, the subset of characters chosen for this study were perceptually simplified forms of their traditional counterparts. In each pair of characters, up to two components (stroke groups called radicals) of the traditional characters were omitted in order to produce their simplified versions. Thus, the simplified characters had fewer strokes as well as fewer components. The simplified characters used had 3–13 strokes per character (average: 7.23 strokes), and their traditional versions had 8–22 strokes per character (average: 14.06 strokes). We created four sets of 12 unique character pairs, but each participant only studied two of these sets in either the simplified or the traditional script. Thus, each participant studied 12 traditional and 12 simplified Chinese–English pairs. The number of omitted strokes, the number of omitted components, the location of the omitted components within each character, and the usage frequency were balanced across the character sets. The four sets were used equally often across participants.

In the training phase, each participant received a randomly assigned set of 12 flashcards of either traditional or simplified characters, according to their assigned condition. Each character was printed in black, 36-point SimSun (宋体) font, and the English words were printed in black, 24-point Calibri font. Participants were told to study the Chinese–English pairs and that they would be tested on them later. Participants were not given a time limit for studying, and everyone finished within 20 min.

Once participants had handed in the flashcards, they were administered the memory and generalization tests, in that order, on a computer using E-Prime 2.0 (Psychology Software Tools, Inc., Sharpsburg, PA, USA). For both tests, there were 12 trials, one for each of the 12 characters in the training set. A trial began with a fixation cross lasting for 0.5 s, followed by an English word for 2 s, then four Chinese characters. The distractor characters were randomly chosen from the set of trained characters. The location of the correct answer in the set of four alternatives was controlled so that each location occurred equally in each condition. The intertrial interval was 1 s, and the order of the trials was random across participants. No feedback was provided after each trial, but the average accuracy and response time were given at the end of each test. Figure 1 shows a sample trial and the procedure.

Fig. 1

(a) Training phase, (b) memorization test procedure, and (c) generalization test procedure in the simple-first condition of Experiment 1

In the memory test, participants chose from Chinese characters identical to those in their training set. The generalization test was set up identically to the memorization test, except that the answer choices in this test were characters written in the unlearned script. Before the generalization trials, these instructions appeared: “There are two types of scripts in the Chinese written language, Traditional and Simplified. You have just studied characters written in one of these two scripts, and now we would like to see how well you can recognize the same characters written in the other script.”

Participants were given a 5-min break before they were given a different, randomly selected set of 12 flashcards with characters written in the other script. The entire procedure was repeated for the second set of characters.

Results and discussion

Proportions correct and average response times for correct responses are presented in Figs. 2 and 3, respectively (see the left panels).

Fig. 2

Accuracy data from the memorization and generalization tests in Experiment 1 (left panel) and from the exact-match and generalization tests in Experiment 2 (right panel). Error bars indicate ±1 SE

Fig. 3

Response time data of accurate responses from the memorization and generalization tests in Experiment 1 (left panel) and from the exact-match and generalization tests in Experiment 2 (right panel). Error bars indicate ±1 SE

Preliminary analysis

We observed no statistically significant difference in study times for the traditional and simplified characters (p > .05). We also found no significant differences among the four sets of characters (ps > .10) and no effect of condition order (ps > .10), so the accuracies and response times for each condition were collapsed across those variables. One participant (out of 14) performed at a chance accuracy level on all tests and was dropped from the following analyses. Preliminary analyses that included this participant’s data did not impact our results.

Memorization and transfer results

Because accuracy on the memory test was near ceiling with little variance, our data violated the normality assumption [Shapiro–Wilk, W(14) < 0.8, ps < .05]. Thus, we opted to confirm our accuracy findings with Wilcoxon signed-rank tests, a nonparametric counterpart of dependent t tests.

Accuracy

Performance was better on the memory test (M = .99, SD = .03) than on the generalization test (M = .86, SD = .10), Z = 2.99, p = .003. Although the two conditions exhibited similar memory performance, the simple-first condition generalized more accurately than the traditional-first condition. Participants in both the traditional-first (M = .99, SD = .03) and simple-first (M = .98, SD = .04) conditions successfully learned the word pairs and recognized them equally well, Z = 0.45, p = .66. Generalization accuracy was significantly higher in the simple-first condition (M = .91, SD = .06) than in the traditional-first condition (M = .80, SD = .14), Z = 2.52, p = .01 (see Footnote 1). As predicted, participants who initially learned simplified characters generalized their learning to the transfer script better than those who learned traditional characters.

Response times for correct trials (RTs, given in seconds per trial)

Participants were faster on the memorization trials (M = 2.71, SD = 0.92) than on generalization (M = 5.54, SD = 2.15), paired t(13) = 6.63, p < .001, d = 1.77. Those in the simple-first condition (M = 2.34, SD = 0.74) were faster than those in the traditional-first condition (M = 4.38, SD = 1.67) on the memory test, paired t(13) = 3.63, p = .003, d = 0.97, but not on the generalization test, paired t(13) = 0.11, p = .92. Thus, when trained with simplified characters, participants tended to make more correct matches on the generalization test and were faster on the memory test than those who had trained with traditional characters.
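The effect sizes reported above appear consistent with a common convention for paired designs, d = t/√n. The following minimal sketch checks that arithmetic under this assumption; it is not a statement of how the effect sizes were actually computed. The function name `paired_d` is hypothetical, and n = 14 is inferred from df = 13.

```python
import math

def paired_d(t: float, n: int) -> float:
    """Effect size for a paired t test under the assumed convention d = t / sqrt(n)."""
    return t / math.sqrt(n)

# t values are taken from the text; n = df + 1 = 14 is an assumption.
d_test_type = paired_d(6.63, 14)      # memorization vs. generalization RTs
d_memory_script = paired_d(3.63, 14)  # simple-first vs. traditional-first, memory test

print(round(d_test_type, 2), round(d_memory_script, 2))
```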

Importantly, even though simplified and traditional characters were remembered equally well, the simplified training exemplars led to better generalization than the traditional ones. However, the simple advantage may have been dependent on the amount of exposure to the learning instance. In Experiment 2, we asked whether training with simplified characters is more efficient than training with traditional characters, even without extended training experience.

Experiment 2

To extend the findings of Experiment 1, we removed the training phase and modified the memorization and generalization tests to examine matches based purely on perceptual similarity. If simplicity promotes transfer by providing only the relevant perceptual features, then the simple advantage should persist even when generalization relies only on the perceptual similarities between simplified and traditional characters.

Method

Participants and design

A group of 23 undergraduates (10 males, 13 females) who reported having no knowledge of Chinese characters participated for course credit. Experiment 2 was also based on a within-subjects design, so the order of conditions was counterbalanced across participants: Twelve were randomly assigned to participate in the traditional-first before the simple-first condition, and the other 11 participated in the simple-first before the traditional-first condition.

Materials and procedures

The stimuli and procedures were nearly identical to those of Experiment 1. The key difference in Experiment 2 was the lack of a training phase. Thus, participants never connected any of the characters to their English meanings. Each trial began with a fixation cross, followed by a Chinese character for 2 s and four answer choices. In exact-match trials (SS and TT), participants matched characters to identical characters. On the generalization trials (ST and TS), participants were shown a character in one script and had to choose the match from among characters written in the other script. A sample trial and the procedure are shown in Fig. 4.

Fig. 4

(a) Exact-match test procedure and (b) generalization test procedure in the traditional-first condition of Experiment 2

Results and discussion

Preliminary analysis

As in Experiment 1, we observed no effect of character set or condition order (ps > .10) on accuracy or RTs, so the data for each condition were collapsed across those variables.

Exact-match and generalization results

Average proportions correct and average RT results are presented in Figs. 2 and 3, respectively (see the right panels). As in Experiment 1, accuracy on the exact-matching task was uniformly high; thus, we used Wilcoxon signed rank tests to confirm differences in accuracy performance.

Accuracy

The results were consistent with the findings from Experiment 1. Participants made significantly more correct responses on the exact-matching test (M = .97, SD = .04) than on the generalization test (M = .68, SD = .12), Z = 4.20, p < .001. We also found a differential effect of the sample script on generalization. There was no significant difference between the simplified and traditional exact-match-to-sample tests, Z = 1.26, p = .21. However, the simple-first condition produced significantly better generalization performance (M = .79, SD = .14) than did the traditional-first condition (M = .57, SD = .18), Z = 3.56, p < .001 (see Footnote 2). Again, as in Experiment 1, training with simplified characters promoted greater generalization to traditional characters than training with traditional characters promoted transfer to simplified characters.

RTs for correct trials (given in seconds per trial)

Participants were generally faster in the exact-matching test (M = 1.44, SD = .07) than in the generalization test (M = 3.02, SD = .23), t(22) = 7.71, p < .001, d = 1.61. Although the simple-first condition was faster than the traditional-first condition in the exact-matching test, t(22) = 3.91, p = .001, d = .81, RTs in the generalization test were similar, t(22) = 1.05, p = .31. Whereas there had been no difference in accuracy on the exact-matching trials, traditional characters required more time per correct response than did the simplified characters (1.55 vs. 1.32 s). This result is interesting in light of classic experiments and theories of similarity.

As in Podgorny and Garner’s (1979) classic work, which demonstrated that participants judge the similarity of two Ss on a screen faster than that of two Ws, we also found that some Chinese characters were self-identified faster than others. Our results run contrary to the prediction derived from Tversky’s (1977) feature-based contrast model of similarity: Complex objects that share a greater number of overlapping features are more self-similar than simple objects, and therefore should be easier to self-identify. Traditional characters contain more strokes, so one might assume that they should be more self-similar and should result in shorter RTs in our exact-match test. However, it is important to keep in mind that the distractors in the field were also complex. These complex characters may also be more similar to each other, thus forcing participants to spend more time to distinguish the target among them.

MemSam: A computational model of the simple advantage

The simple advantage is thus far empirically limited to situations in which learners must generalize to new instances from only one learning instance (either a simple or complex one). Aside from the two experiments covered in this article thus far, most of the existing research on literacy with traditional and simplified Chinese scripts has not made any attempt to connect this effect to general theories of categorization. To understand the basic cognitive mechanisms that might underlie the simple advantage, we propose a simplified version of an exemplar-based process model of categorization (see Medin & Schaffer, 1978; Nosofsky, 1986) in which there is only one exemplar. In this memory-sampling model (MemSam), we assume that a probe stimulus functions as a retrieval cue to access already-stored information that is similar to the probe. We also assume that features from all items are sampled rather than encoded veridically in memory. Furthermore, we assume that traditional characters have more features than simplified ones. Although none of these assumptions and processes are particularly new or innovative (see Footnote 3), MemSam puts these assumptions together to provide a coherent, process-driven account that can explain the simple advantage and generate novel predictions about the conditions in which we should observe it.

Key to this process model is the encoding of information during learning. In all cases, we assume that learners sample features of the presented example that are available during learning and that they do not generally have a complete and veridical representation of the exemplar. In a given task or type of stimulus, learners have a capacity limit of memory, \( {K}_m \), on the number of features they can sample and store. In the case of a simple learning instance, there are fewer features to sample, and accordingly, learners can encode a greater proportion of the features than in the case of a complex, traditional-learning instance. We will call the features successfully sampled and stored from the learning example the memory trace.

To prompt generalization, learners are presented with a probe and must decide whether the probe is sufficiently similar to the memory trace to give the response associated with the memory trace. We assume that only \( {K}_p \) of the probe’s features are sampled, because it is unlikely that all features of a complex novel figure will be mentally available for consideration. However, the number of features sampled for the probe trace is assumed to be greater than the number of features sampled and retained for the memory trace, because it has been presented more recently (\( {K}_p>{K}_m \)). In the specific case of our present experimental paradigm, the probe remains visually present while the participant chooses, a limiting case of recency.

To describe MemSam’s encodings of the objects based on their visual appearances, we make the simplifying assumption that every stroke counts as a single feature. So, an originally trained traditional character with eight strokes would be represented as the memory trace \( m=\left\{1,2,3,4,5,6,7,8\right\} \) if all of its strokes were stored (i.e., if \( {K}_m\ge 8 \)), with each number representing a unique stroke in the character. The subsequently probed, simplified character would then be represented as the probe trace \( p=\left\{1,2,3,4,5,6\right\} \), indicating that it possesses a subset of six of the traditional character’s eight strokes (i.e., if \( {K}_p\ge 6 \)). In this case, the intersection \( m\cap p \) is \( \left\{1,2,3,4,5,6\right\} \), which has a set size of 6, and the union \( m\cup p \) is \( \left\{1,2,3,4,5,6,7,8\right\} \), which has a set size of 8. The number of distinctive strokes in this pair would be 2. \( {K}_m \) and \( {K}_p \) are the parameters that limit the sizes of the samples, \( m \) and \( p \).
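The eight-stroke example can be reproduced with ordinary set operations. This is only an illustrative sketch of the bookkeeping; the variable names are hypothetical, and the integer labels are arbitrary stand-ins for strokes, as in the text.

```python
# Each integer stands for one stroke (one feature).
m = {1, 2, 3, 4, 5, 6, 7, 8}  # memory trace: traditional character, all strokes stored
p = {1, 2, 3, 4, 5, 6}        # probe trace: corresponding simplified character

shared = m & p        # intersection of the two traces
combined = m | p      # union of the two traces
distinctive = m ^ p   # strokes found in only one of the two traces

# Proportion of overlapping features, before any feature-match scaling.
overlap = len(shared) / len(combined)
print(len(shared), len(combined), len(distinctive), overlap)
```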

The likelihood of generalization is determined by the probability of choosing the memory trace (\( m \)) response for the probe (\( p \)), \( p\left({C}_{m,p}\right) \):

$$ p\left({C}_{m,p}\right)=\frac{{\left(F\cdot \frac{m\cap p}{m\cup p}\right)}^{\gamma }}{{\displaystyle \sum {\left(F\cdot \frac{m\cap p}{m\cup p}\right)}^{\gamma }}+{b}^{\gamma }} $$

This choice probability is defined as the evidence for a match between m and p, divided by the evidence for a match between m and all four choices (including p). The evidence for a match between m and p is the intersection of sampled features from the memory and probe traces divided by the union of the features in these traces. This proportion is multiplied by a feature match parameter, F, to represent whether a feature matches perfectly between the memory and probe traces (as in the memory conditions, SS and TT) or is similar but slightly distorted (in the generalization conditions, ST and TS). To make this more concrete, take the case of the TT condition. When a stroke is present in the traditional memory trace and that identical stroke (placement, angle, size) is present in the traditional probe trace, the feature match parameter is perfect (e.g., \( F=1 \)). However, in conditions such as ST, a stroke feature might be slightly larger and in a different position in the simple memory trace than in the traditional probe trace. Thus, the feature match parameter is less than perfect (e.g., \( F=.8 \)). When there are mismatching strokes, both matching and mismatching features are similarly distorted relative to the matching features. This evidence for a match is then used in a choice rule to account for the forced choice in this particular experimental paradigm. The choice probability also includes a parameter, \( \gamma \) (set to 4 in our simulations), interpretable as the determinism of responding (0 = chance responding, \( \infty \) = always choose the character that has the greatest evidence). Also, the baseline attraction of choices other than the correct one is represented by \( b \) (set to .2 in our simulations). Neither the choice probability nor the baseline attraction parameters significantly change the qualitative patterns exhibited in the model, but they have effects on the relative sizes of the effects.
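The choice rule can be sketched in a few lines. The sketch below is one reading of the equation rather than the simulation code itself: the function names are hypothetical, set sizes are used for the intersection and union, and the three distractors are assumed to share no sampled strokes with the memory trace (evidence of 0), which the text does not state.

```python
from typing import Sequence, Set

def evidence(m: Set[int], p: Set[int], F: float) -> float:
    """Evidence for a match: F times |m & p| / |m | p|."""
    union = m | p
    if not union:
        return 0.0
    return F * len(m & p) / len(union)

def choice_probability(target: float, all_choices: Sequence[float],
                       gamma: float = 4.0, b: float = 0.2) -> float:
    """Probability of choosing the target, given the evidence for every
    answer choice (target included) and the baseline attraction b."""
    denominator = sum(e ** gamma for e in all_choices) + b ** gamma
    return target ** gamma / denominator

# Traditional character in memory, simplified probe, distorted match (F = .8).
m = {1, 2, 3, 4, 5, 6, 7, 8}
p = {1, 2, 3, 4, 5, 6}
target = evidence(m, p, F=0.8)

# Assumption: distractors have no overlap with the memory trace.
prob = choice_probability(target, [target, 0.0, 0.0, 0.0])
print(round(prob, 3))
```

With larger values of \( \gamma \), the same evidence produces choice probabilities closer to 0 or 1, which is the sense in which \( \gamma \) controls the determinism of responding.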

This simple model captures some basic patterns in the experimental data and also generates some novel predictions. The first important behavioral characteristic is the advantage of learning with simple forms. MemSam demonstrates that simple learning instances would lead to enhanced memory performance; that is, SS would be greater than TT performance. For TT trials, the strokes in the complex, traditional form are likely to exceed the memory capacity \( {K}_m \), with the result that only a subset of the character’s strokes are stored, leading to imperfect match to the same character when it is later presented as the probe. By contrast, when a simplified form is presented, it is likely to be perfectly, or nearly perfectly, encoded into memory and matched to the simplified probe.

A similar account explains why generalization is also better—that ST exceeds TS performance. In the case of TS, when the traditional form is in memory, a relatively small proportion of its features are likely to be stored, meaning that its trace will match the probe trace relatively poorly. Thus, the generalization likelihood is less than in the ST condition, in which the simple form is in memory and all or most of its features are likely to be stored, meaning that it will match the probe trace well.
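The two orderings described above (SS over TT, ST over TS) can be illustrated with a small Monte Carlo sketch of the sampling process. Everything specific here is an assumption for illustration: the function names, a 7-stroke simplified character nested inside a 14-stroke traditional one (roughly the reported averages), uniform random sampling of strokes, and F = 1 or .8 as in the examples in the text. Only match evidence is computed, not choice probabilities.

```python
import random
from typing import Set

def sample_trace(strokes: Set[int], capacity: int, rng: random.Random) -> Set[int]:
    """Sample at most `capacity` strokes uniformly at random."""
    pool = sorted(strokes)
    return set(rng.sample(pool, min(capacity, len(pool))))

def mean_evidence(studied: Set[int], probed: Set[int], K_m: int, K_p: int,
                  F: float, rng: random.Random, n_trials: int = 2000) -> float:
    """Average match evidence F * |m & p| / |m | p| over simulated trials."""
    total = 0.0
    for _ in range(n_trials):
        m = sample_trace(studied, K_m, rng)  # memory trace
        p = sample_trace(probed, K_p, rng)   # probe trace
        total += F * len(m & p) / len(m | p)
    return total / n_trials

rng = random.Random(0)
simple = set(range(1, 8))        # hypothetical 7-stroke simplified character
traditional = set(range(1, 15))  # hypothetical 14-stroke traditional counterpart
K_m, K_p = 6, 20                 # memory capacity below probe capacity

results = {
    "SS": mean_evidence(simple, simple, K_m, K_p, 1.0, rng),
    "TT": mean_evidence(traditional, traditional, K_m, K_p, 1.0, rng),
    "ST": mean_evidence(simple, traditional, K_m, K_p, 0.8, rng),
    "TS": mean_evidence(traditional, simple, K_m, K_p, 0.8, rng),
}
for condition, value in results.items():
    print(condition, round(value, 3))
```

Under these assumptions the simplified character is stored almost completely, whereas fewer than half of the traditional character's strokes enter the memory trace, so evidence is higher whenever the simple form is the studied item.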

Another, more obvious and empirically observed pattern is that memory conditions (SS and TT) would produce greater performance than generalization conditions (ST and TS). These patterns that demonstrate the simple advantage are robust to the model parameters described above, as long as \( {K}_p \) is sufficiently larger than \( {K}_m \). The baseline attraction parameter (\( b \)) mostly changes the overall performance such that if baseline attraction were 1, generalization in all conditions would be low. The choice probability parameter (\( \gamma \)) has an effect on the relative size of these patterns. For instance, if we let \( \gamma \) be 1 (the parameterless version of Luce’s, 1959, choice axiom), the simple advantage, although present, is less pronounced between non-exact-match trials (ST and TS) and exact-match trials (SS and TT). However, the overall pattern of results is similar, which shows the robustness of the model’s predictions to \( \gamma \) variation.

We want to highlight two novel predictions of MemSam. Usually when researchers test the “simple advantage” they make a straightforward prediction: Simple instances are better for learning than complex ones. However, what is good for learning is not necessarily good for transfer. The public debate on Chinese scripts is mostly centered on learning simplified versus traditional scripts, whereas we are empirically examining how transferable the reading skill learned from one script is to the other. Most studies do not go further to examine the conditions under which that is true or when that advantage might be most prominent. MemSam makes “novel predictions” in the sense that these are not predictions that have been made by other researchers examining the simple advantage. The predictions borne out by MemSam are both interactions that would be difficult to predict without the rationale provided by a model simulation (see Footnote 4).

First, when fewer features are sampled for the memory trace, there should be a greater advantage for learning with simple characters. Conversely, as the memory trace becomes more accurate (better sampling), the simple advantage should be diminished. For example, as learners are given more time studying the training exemplar, their encoded memory trace becomes more accurate because more features are accurately encoded. This suggests that a longer study time should result in a less pronounced simple advantage. That is, a longer time for learning a traditional character should result in greater memory and generalization performance than a shorter time. However, learning from simple figures should not benefit as much from longer learning times. The plots in Fig. 5 show MemSam’s predictions of generalization under each condition for relatively few (left panel) versus many (right) features sampled for the memory trace. The advantages of ST over TS and SS over TT are larger when fewer memory samples are taken. In creating these plots, MemSam’s inputs were the actual numbers of shared and distinctive strokes for each of the traditional and simple characters used in the experiments. The Appendix contains a table of the stimuli and their respective feature counts.
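The first prediction can be illustrated by computing expected match evidence exactly for a small and a large memory capacity. This is a sketch under stated assumptions, not the simulation behind Fig. 5: the function names are hypothetical, the probe is assumed to be fully sampled (\( {K}_p \) at least as large as any stroke count), the simplified strokes are assumed to be a subset of the traditional ones, memory strokes are sampled uniformly, and a single hypothetical 7-stroke/14-stroke pair stands in for the real stimuli. The values are evidence, not choice probabilities.

```python
from math import comb

def exact_match_evidence(n: int, K_m: int) -> float:
    """SS or TT: a stored subset of n strokes matched against the full character."""
    return min(K_m, n) / n

def st_evidence(n_s: int, n_t: int, K_m: int, F: float) -> float:
    """Simple form in memory, traditional probe: stored strokes all recur in the probe."""
    return F * min(K_m, n_s) / n_t

def ts_evidence(n_s: int, n_t: int, K_m: int, F: float) -> float:
    """Traditional form in memory, simple probe: expectation over which strokes
    were stored (hypergeometric number of stored strokes shared with the probe)."""
    k = min(K_m, n_t)
    total = 0.0
    for i in range(max(0, k - (n_t - n_s)), min(k, n_s) + 1):
        prob = comb(n_s, i) * comb(n_t - n_s, k - i) / comb(n_t, k)
        total += prob * F * i / (k + n_s - i)  # union = stored + probe - shared
    return total

n_s, n_t, F = 7, 14, 0.8  # hypothetical stroke counts; F as in the example in the text

def memory_gap(K_m: int) -> float:
    return exact_match_evidence(n_s, K_m) - exact_match_evidence(n_t, K_m)

def generalization_gap(K_m: int) -> float:
    return st_evidence(n_s, n_t, K_m, F) - ts_evidence(n_s, n_t, K_m, F)

for K_m in (6, 11):
    print(K_m, round(memory_gap(K_m), 3), round(generalization_gap(K_m), 3))
```

Under these assumptions both gaps shrink when the capacity rises from 6 to 11, in line with the claim that better sampling of the memory trace diminishes the simple advantage.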

Fig. 5

Predicted generalization from the MemSam model under each condition for few (i.e., \( {K}_m=6 \), left panel) versus many (i.e., \( {K}_m=11 \), right panel) sampled features. Each dot represents the mean proportion accuracy from a particular condition. Solid lines represent the best-fitting linear regression lines for generalization tests, and dashed lines represent the best-fitting linear regression lines for the memorization tests. To produce these figures, \( {K}_p \), the capacity limit for the probe trace, was set to 20. Having fewer features sampled for the memory trace may correspond to situations in which there is limited time or resources for initial learning
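The intuition behind the first prediction can be illustrated with a deliberately simplified calculation. The sketch below is not the MemSam simulation itself: it assumes only that at most \( {K}_m \) strokes of the studied character enter the memory trace, and the stroke counts (8 and 14) are hypothetical values chosen for illustration. The function name `encoded_fraction` is likewise an assumption of this sketch.

```python
def encoded_fraction(n_strokes: int, k_m: int) -> float:
    """Fraction of a character's strokes captured when at most k_m are sampled."""
    return min(k_m, n_strokes) / n_strokes


# Hypothetical stroke counts for one simple and one traditional character.
SIMPLE_STROKES = 8
TRADITIONAL_STROKES = 14

for k_m in (6, 11):  # the two memory sample sizes used in Fig. 5
    simple = encoded_fraction(SIMPLE_STROKES, k_m)
    traditional = encoded_fraction(TRADITIONAL_STROKES, k_m)
    gap = simple - traditional
    print(f"K_m={k_m}: simple={simple:.2f}, traditional={traditional:.2f}, gap={gap:.2f}")
```

Under these assumptions, the difference in encoded coverage between the simple and the traditional character shrinks as \( {K}_m \) grows, which is the qualitative pattern that the full model turns into predictions about accuracy.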

The second novel prediction of the model is that a greater number of distinctive features in the traditional form should lead to a greater simple advantage. In Fig. 5, as the number of distinctive strokes in the traditional form increases, the gap between SS and TT becomes larger for both memory sample sizes, and that between ST and TS becomes larger for the larger sample size (but remains relatively constant for the smaller sample size). This can be empirically investigated by regressing memory and generalization performance against the stimulus characteristics of the specific characters used in the experiment—namely, the number of strokes comprising the simple form of a character, the number of strokes comprising the traditional form, the number of shared strokes, and the number of distinctive strokes in the traditional form.

To examine these two predictions, we conducted a third experiment in which participants were given an opportunity to study the exemplar for either a relatively long or short period of time. Additionally, we conducted an analysis of the data by the particular stroke counts of the learning exemplar and generalization probes in each trial.

Experiment 3

Method

Participants and design

A total of 68 undergraduates (see Footnote 5) who reported having no knowledge of Chinese characters participated for course credit. As in Experiments 1 and 2, we used a within-subjects design, and the order of conditions (SS, TT, ST, TS) was counterbalanced across participants. This time, however, exposure time during the training phase was also a within-subjects variable.

Materials and procedure

To examine the simple advantage to generalization from the particular stroke counts of the learning exemplar and the generalization probes, we expanded the stimulus set to contain 120 traditional–simplified character pairs, in which the simplified form contained a subset of the strokes contained in the traditional form. The full set was randomly divided into four word lists with 30 character pairs in each list, while maintaining the stroke count distribution across lists. The stroke count of simplified characters ranged from 2 to 15 strokes (mean = 7.92), traditional characters ranged from 8 to 24 strokes (mean = 14.47), and the number of distinctive features (traditional strokes minus simplified strokes) ranged from 1 to 14 (mean = 6.55).

The procedures were similar to those of Experiment 2, except that participants had either 0.5 or 6 s to study each exemplar before the generalization phase. Each trial began with a fixation cross, followed by a Chinese character displayed for either 0.5 or 6 s, and four answer choices. In SS and TT trials, participants matched simplified and traditional characters to the respective identical characters. In ST and TS trials, participants were shown a character in one script (S or T, respectively) and were asked to choose the best match among characters written in the other script (T or S, respectively). Participants were asked to respond with a numeric keypad.

Trials were blocked by conditions (SS, ST, TT, and TS), and the order of conditions was counterbalanced across participants. Each condition used only one of the four word lists, counterbalanced across conditions. Thus, each condition contained a total of 60 trials. Words were picked randomly from each word list, and each word was shown twice for either 0.5 or 6 s. The presentation times for each word were counterbalanced across participants, so that a character presented for 0.5 s for half of the participants was presented for 6 s to the other half of the participants. The presentation times appeared in random order within each condition. Participants could take a short break after every 60 trials.

Results and discussion

Preliminary analysis

We found no effect of character set or condition order (ps > .10) in accuracy and RTs, so the data for each condition were collapsed across those variables. Three participants’ data (out of 68) were dropped because of chance-level accuracy performance throughout the experiment. Inclusion of their data did not change the results reported below.

Model Prediction 1

The simple advantage is stronger when fewer features are sampled for the memory trace. Given that a longer viewing time is expected to provide a more accurate memory trace, our model predicts a greater simple advantage for generalization with a 0.5-s than with 6-s presentation time.

Figure 6 plots the accuracy by condition and by presentation time. As in Experiment 2, the exact-match performances from both conditions were near ceiling with little variance, which violated the assumption of normality [Shapiro–Wilk, W(65) < 0.70, ps < .05]. Thus, we confirmed the condition difference on the exact-matching task with the Wilcoxon signed rank test, and also conducted ANOVAs and paired t tests for the remaining analyses when their assumptions were met.

Fig. 6

Accuracy by presentation time data from the exact-match and generalization tests of Experiment 3. Error bars indicate ±1 SE

As in the first two experiments, participants were generally more accurate on the exact-matching task (M = .90, SD = .15) than on the generalization task (M = .67, SD = .14), Z = 6.79, p < .001. We also replicated the simple advantage. The ST condition produced significantly better generalization performance than the TS condition, as was confirmed by a significant Condition × Test Type interaction, Z = 5.70, p < .001. Interestingly, the SS condition (M = .92, SD = .14) also had higher overall accuracy than the TT condition (M = .89, SD = .16), Z = 3.09, p = .002 (see Footnote 6).

We conducted a 2 Study Time (0.5 s, 6 s) × 2 Condition Order (simplified first, traditional first) ANOVA on accuracy, and confirmed a main effect of condition, with higher accuracy on the simple-first conditions (SS and ST, M = .83, SD = .13) than on the traditional-first conditions (TT and TS, M = .75, SD = .13), F(1, 64) = 61.732, p < .001, \( {\eta}_p^2 \) = .49. No main effect of study time was apparent, F(1, 64) = 1.20, p = .28. Importantly, however, we did find a significant interaction, F(1, 64) = 10.84, p < .01, \( {\eta}_p^2 \) = .15: The simplified-first condition produced higher accuracy than the traditional-first condition with both short, paired t(64) = 8.57, p < .001, and long, paired t(64) = 4.80, p < .001, study times, suggesting that in either study time condition, there was a simple advantage. No difference across study times emerged for the simplified-first condition, but we did observe a significant difference in the traditional-first condition, paired t(64) = 2.35, p < .05, such that words studied for the longer time of 6 s were identified more accurately (M = .76, SD = .15) than those studied for 0.5 s (M = .73, SD = .14). Note that this suggests that the disadvantage of learning from a traditional instance was diminished when participants had a longer study time. Thus, as was predicted by our model, the traditional disadvantage (the complement to the simple advantage) was more apparent with the 0.5-s presentation time than with the 6-s presentation time.
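For readers who wish to check the effect sizes, partial eta squared can be recovered from a reported F ratio and its degrees of freedom through the standard identity \( {\eta}_p^2 = F \cdot d{f}_{effect} / (F \cdot d{f}_{effect} + d{f}_{error}) \). The sketch below applies that identity to the main effect of condition reported above; the function name is our own, and values recomputed from rounded F ratios can differ slightly from effect sizes computed on the raw sums of squares.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of condition on accuracy: F(1, 64) = 61.732
print(round(partial_eta_squared(61.732, 1, 64), 2))  # 0.49
```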

Can RTs explain away this effect? In other words, in the TS condition, were participants more accurate on trials with a 6-s viewing time simply because they also took longer to answer on those trials than on those with 0.5-s viewing time? Analyses of RTs showed that this was not the case. The correlation between proportion accuracy and the RTs of accurate responses was negative, r = –.30, p < .05 (the correlation of accuracy and the RTs of all responses was –.34, p < .01), indicating that slower participants were also less accurate. Thus, we cannot attribute differences in the simple advantage between presentation times to a speed–accuracy trade-off. Figure 7 displays the mean RT data of accurate responses by condition for each presentation time.

Fig. 7

Response times of accurate answers by presentation times from the exact-match and generalization tests in Experiment 3. Error bars indicate ±1 SE

Interestingly, participants were slower to respond correctly on all trials with a 6-s viewing time (M = 2.33, SD = 1.35) than with a 0.5-s viewing time (M = 1.66, SD = 0.45); slower RTs were not specific to TS trials. This was confirmed by a 2 Condition Order × 2 Test Type × 2 Presentation Time repeated measures ANOVA on the RTs of accurate responses. This produced a main effect of presentation time, F(1, 64) = 18.49, p < .001, \( {\eta}_p^2 \) = .22, and a significant Presentation Time × Test Type interaction, F(1, 64) = 5.75, p < .05, \( {\eta}_p^2 \) = .08: Participants took more time to answer memorization trials correctly with a 6-s presentation time (M = 1.79, SD = 0.85) than with a 0.5-s presentation time (M = 1.32, SD = 0.38), t(64) = 5.18, p < .001. They also took more time to correctly respond to generalization trials with a 6-s presentation time (M = 2.86, SD = 1.95) than with a 0.5-s presentation time (M = 2.00, SD = 0.66), t(64) = 3.74, p < .001. Although the explanation is purely speculative, perhaps this reflects a priming effect in which fast presentation times prime participants to respond more quickly in this self-paced task.

As in the first two experiments, participants were generally faster to answer correctly on exact-match trials (M = 1.55, SD = 0.54) than on generalization trials (M = 2.43, SD = 1.11), as was confirmed by a significant main effect of test type, F(1, 64) = 83.55, p < .001, \( {\eta}_p^2 \) = .57. We also observed a significant Condition Order × Test Type interaction, F(1, 64) = 25.06, p < .001, \( {\eta}_p^2 \) = .28. Participants were generally faster when they were correct on TS trials (M = 2.15, SD = 0.73) than they were on ST trials (M = 2.7, SD = 1.84), paired t(64) = 2.63, p < .05. They were also faster on SS trials (M = 1.48, SD = 0.76) than on TT trials (M = 1.65, SD = 0.48), paired t(64) = 2.33, p < .05.

We found no main effect of condition order, no Presentation Time × Condition Order interaction, and no three-way interaction (all ps > .05).

Model Prediction 2

The greater the number of distinctive features in the traditional form, the greater the simple advantage for generalization.

To examine the contribution of the number of distinctive features in the traditional form on the simple advantage for generalization, we conducted a multiple regression analysis using forward difference dummy coding to compare proportion accuracies according to the number of distinctive features between conditions SS and TT and between ST and TS. Interaction variables were created to estimate the slopes (accuracy by number of distinctive features) of the best-fitting regression lines for each condition. All variables were entered simultaneously. The accuracies in each condition by the number of distinctive features are shown in Fig. 8.

Fig. 8

Accuracy as a function of the number of distinctive features in the traditional form from Experiment 3’s stimulus set. Each dot represents the mean proportion accuracy from a particular condition. Solid lines represent the best-fitting linear regression lines for generalization tests, and dashed lines represent the best-fitting linear regression lines for the memorization tests

The regression model fit the data well, \( {R}^2 \) = .68, F(5, 62) = 26.60, p < .001; the model explained 68.2% of the variance in accuracy. The resulting regression equation was Accuracy = .93 – .018(Distinctive Features) – .035(SS – TT) + .067(ST – TS) + .02[(SS – TT) × Distinctive Features] + .022[(ST – TS) × Distinctive Features]. Conditions started with the same initial mean on accuracy, but the effects of the number of distinctive features were different for different conditions. The difference in slopes between the SS (.002) and TT (–.005) conditions was statistically significant, b = .02, t(62) = 2.45, p = .017, suggesting that we can reject the null hypothesis that the regression lines were parallel for SS and TT. The difference in slopes between the ST and TS conditions was also statistically significant, b = .022, t(62) = 2.701, p = .009, suggesting that the slopes of ST (–.03) and TS (–.04) were not equal. Thus, as the number of distinctive features in the traditional form increases, the generalization accuracy drops faster in the TS than in the ST condition. This finding is congruent with our model hypothesis that a greater simple advantage should appear with larger numbers of distinctive features in the traditional form.

To examine the simple effects, we used the recentering strategy to test whether there were differences in accuracies between the conditions at the mean number of distinctive features (M = 9, SD = 4.94) and at ±1 SD of the mean (approximately at 4 and 14, respectively). Consistent with Model Prediction 2, we expected that the gaps between SS and TT and between ST and TS would increase with increasing numbers of distinctive features in the traditional form. When there were approximately four distinctive features, the overall effect of condition was significant, F(2, 62) = 6.83, p = .002: Conditions SS and TT did not differ from each other, t(62) = 1.37, p > .05, but ST showed significantly higher accuracy than the TS condition, t(62) = 2.78, p < .01. At nine distinctive features, we also found a significant main effect of condition, F(2, 62) = 21.27, p < .001. SS had a higher accuracy than TT, t(62) = 2.63, p = .011, and ST had a higher accuracy than TS, t(62) = 4.75, p < .001. At 14 distinctive features, the overall effect of condition was also statistically significant, F(2, 62) = 47.21, p < .001. SS was significantly higher in accuracy than TT, t(62) = 4.229, p < .001, and ST was substantially more accurate than TS, t(62) = 6.84, p < .001. As the number of distinctive strokes in the traditional form increased, there was an increasing gap between SS and TT as well as between ST and TS for the larger sample sizes.

General discussion and conclusion

We examined the simple advantage for generalization between simple and complex Chinese scripts in order to explore the hypothesis that differences in encoding opportunities drive this effect. In Experiment 1, participants studied the characters and their English translations before attempting to generalize their learning to the same characters of the unlearned script. In Experiment 2, participants had only brief controlled exposure to the characters before undergoing the generalization test. In both experiments, we found a generalization advantage when the initially shown exemplar was simple. Experiment 2 showed that the asymmetry can be localized to generalization itself, rather than being unique to associating characters with English words.

Contrasting the results of Experiments 1 and 2, generalization performance was more accurate yet slower in Experiment 1 than in Experiment 2. This pattern is reasonable, given the differences in the tasks across experiments: Participants in Experiment 1 had to recall the characters from memory when given their English definitions, whereas those in Experiment 2 saw exemplar characters immediately before making their choice. Taking more time to recall the trained characters may have helped the participants in Experiment 1 generalize more accurately. A longer RT is probably less effective, though, when generalization is more purely perceptual (as in Exp. 2).

To explain the simple advantage, we proposed MemSam, a simple process model, and tested its predictions in Experiment 3. Our model posits that the simple advantage is driven primarily by differences in perceptual encoding of the available information between learning from simple and complex instances. Simple learning instances contain fewer features to be sampled, allowing learners to encode and store a larger proportion of those features. By the same logic, when learners are given more time to study the exemplar, their memory trace becomes more accurate, because more features are accurately encoded. Consistent with this hypothesis, Experiment 3 showed that the disadvantage of learning from the traditional characters diminished if participants had a longer learning time. Furthermore, as the number of distinctive features between the simple and traditional forms increases, the model predicts that the asymmetry between the TS and ST conditions should increase. Experimental confirmation for this prediction was found, in that the magnitude of the simple advantage increased as the number of distinctive features in the traditional form increased. The model thus unifies the results by making quantitative predictions for all conditions and showing how they interact with stimulus complexity and presentation time.

In the following subsections, we will discuss the theoretical and educational implications of these findings.

Theoretical implications

These findings are consistent with the results of past research on generalization by shape with young children (e.g., Son et al., 2008)—in short, simple instances promote better category generalization. Why are these instances advantageous for transfer? Simple training instances may allow for efficient encoding of the right initial features and/or for retrieval of useful representations. Learning from complex characters may be detrimental just because of the presence of additional nondiagnostic features that are not present in novel transfer cases. Furthermore, complex instances may generally require greater attentional resources to learn and use.

Novices of all stripes seem to exhibit similar difficulties in both categorization and perceptual learning. A perceptual explanation that may be illuminating is that potentially useful and distracting features may not be psychologically separable at the time of learning (Schyns & Rodet, 1997). Being exposed to a simplified perceptual instance first may have enabled our learners to recognize the complex character as containing the simple character along with other, new features. Initial learning with a complex stimulus does not provide a decomposed perceptual vocabulary, and thus the learner might miss the shared components between the complex and simple stimuli. An analog of this perceptual mechanism may underlie the simple advantage found in studies of conceptual transfer, given the parallels between perceptual and conceptual learning (Goldstone, Landy, & Son, 2010; Kaminski et al., 2008).

Additionally, this work raises more issues regarding the relationship between similarity, recognition memory, and category generalization. If recognition memory or category generalization is taken as a measure of similarity, this set of results provides further evidence for the asymmetry of similarity. Accuracy and RTs are asymmetrical between the initially viewed exemplar and the potential matches, such that generalization performance is aided by an initially simple exemplar. Furthermore, this work raises the possibility that similarity judgments based on perceptually available features may operate differently than when such judgments are based on features retrieved from exemplars in memory.

Another theoretical issue that arises from the results and the model is the question of the role of encoding in the simple advantage. Both in the model and in the three experiments, the learned stimulus was not present at the time of identification, so encoding the learning exemplar into memory was part of the process. Would our model’s predictions continue to be empirically supported even when participants simultaneously viewed the learning materials with the transfer choices, thus eliminating memory requirements? Although we lack empirical data, the model could account for a demonstration of the simple advantage in this situation by more broadly defining what the memory trace stands for. Instead of interpreting the memory trace as something registered in permanent memory, we could construe it as creating a representation of a base case that we know about in order to make predictions about unknown objects. Even if the learning exemplar was present at the time of generalization, limited attentional resources would probably preclude a viewer from attending to every feature accurately (assuming that the object is complex/novel enough). Our model predicts the simple advantage as long as the number of features sampled from the probes (\( {K}_p \)) is sufficiently larger than the number of features sampled from the learning object (\( {K}_m \)).

Practical implications

If one of the most important goals of education is appropriate generalization, the simple advantage appears to have broad implications. Even though generalization would likely occur if enough time and resources were devoted to training with many complex, detailed instances (see, e.g., Kellman, Massey, & Son, 2010), the present research provides further support for the idea that simple training instances may be able to foster generalization more efficiently. MemSam, a stripped-down process model, provides a step toward a true account of previous research that had examined the simple advantage within academic domains.

More directly, these results bear on the cognitive role of scripts in Chinese reading. Broadly speaking, there are no measurable differences in reading or spelling between the two scripts (Chan & Wang, 2003). A few studies have suggested that learning to read with simplified characters is more related to visual skills than is learning to read traditional characters (Chen & Yuen, 1991; McBride-Chang et al., 2005). Young children learning to read in mainland China (using simplified script) were more likely to base similarity judgments of characters on visual characteristics than were children from Hong Kong (primarily taught with traditional script; Chen & Yuen, 1991). Although further research will be necessary to determine whether learning a few characters in a lab setting is similar to learning hundreds of characters in order to gain literacy, our findings suggest that there might be a benefit of starting with simplified characters. This empirical exploration of the supposedly “easy and smooth” switching from one script to the other clearly demonstrates an asymmetry: The two directions of switching are not equally easy and smooth. Particularly if the goal is to read both scripts, learning the simplified script may be more helpful for learning the traditional script than learning the traditional script is when transferring to the simplified characters.

Simplified characters contain fewer but more diagnostic components (radicals), so it may be advantageous to treat these recurring radicals as basic orthographic units. Perhaps an emphasis on explicitly learning these units early on may foster better generalization to full-blown characters. Research on Chinese literacy (e.g., Tsai & Nunes, 2003) has shown that expert readers are generally quite sensitive to these components. Whether such pedagogical practice supports future learning of new Chinese characters is a question for future research.

The relevance of these findings for Chinese literacy is limited in two significant ways. First, the characters used in these studies were only simplified via the component omission process. Future research should incorporate character sets created through other simplification methods, such as replacing a complex component (e.g., four dashes) with a simpler one (e.g., a single line), to draw broader conclusions about the simple advantage for Chinese reading. Second, reading is more than merely identifying or recognizing characters. Traditional characters include cues to pronunciation and meaning that have been removed in simplified characters. These cues may be equally, or even more, important to full-fledged reading than is ease of recognition.

Conclusions

The present results show that the simple advantage extends to a naturally occurring generalization problem—transferring from one Chinese script to another. This adds to the growing evidence that this advantage is stable across a variety of tasks and domains, from categorization and object recognition to more complex forms of formal learning. The MemSam model illustrated how this effect could be driven by a domain-general encoding mechanism that bridges or incorporates both perceptual and conceptual learning. In some sense, all learning situations are ill-constrained, because a novice does not know what information is relevant or irrelevant. Simplicity supports learning by getting at the heart of this problem: The few features that are presented are all relevant.