Semantic memory redux: An experimental test of hierarchical category representation

https://doi.org/10.1016/j.jml.2012.07.005

Abstract

Four experiments investigated the classic issue in semantic memory of whether people organize categorical information in hierarchies and use inference to retrieve information from them, as proposed by Collins and Quillian (1969). Past evidence has focused on RT to confirm sentences such as “All birds are animals” or “Canaries breathe.” However, confounding variables such as familiarity and associations between the terms have led to contradictory results. Our experiments avoided such problems by teaching subjects novel materials. Experiment 1 tested an implicit hierarchical structure in the features of a set of studied objects (e.g., all brown objects were large). Experiment 2 taught subjects nested categories of artificial bugs. In Experiment 3, subjects learned a tree structure of novel category hierarchies. In all three, the results differed from the predictions of the hierarchical inference model. In Experiment 4, subjects learned a hierarchy by means of paired associates of novel category names. Here we finally found the RT signature of hierarchical inference. We conclude that it is possible to store information in a hierarchy and retrieve it via inference, but it is difficult and avoided whenever possible. The results are more consistent with feature comparison models than hierarchical models of semantic memory.

Highlights

► Four experiments examine the classic problem of how semantic memory is structured.
► Subjects learned novel hierarchies, eliminating stubborn problems of confounding variables.
► Results gave little support for the Collins and Quillian network-inference model.
► Results were consistent with feature-based models, including recent connectionist models.

Introduction

Hierarchical classification has long been identified as one of the most important aspects of human knowledge representation. In the sciences, management, and law, hierarchies have been used to structure the relations among domain entities, and tree diagrams representing such relations can be found in many different texts. Hierarchical structure has also been found in human knowledge representation (Markman and Callanan, 1984, Rosch, 1978). Our concepts seem to be structured in levels of classification in which specific concepts fall under increasingly higher-level concepts. For example, an object identified as a beach novel also falls under more general classes of novel, book, and publication, forming a series of inclusion relations: Beach novels are novels, novels are books, and books are publications.

The advantage of hierarchical representation has long been noted (Linnaeus, 1758, Quillian, 1968). The main benefit is that facts known about higher-level concepts apply to lower ones as well. So, after learning that all publications have an author, one knows that all novels have an author. This is an important benefit, because there are dozens or even hundreds of types of dogs, cars, musical instruments, hammers, contracts, investments, cultures, and so on, and if we had to learn the properties of each type separately, it would be extremely difficult and time-consuming. For example, if you had to learn that Scottish terriers have skin, move, breathe, have livers, have a four-chambered heart, and all their other biological properties, you might never get around to learning about Airedales, Jack Russell terriers, or Yorkshire terriers (much less poodles). However, by knowing that those properties are true of animals or mammals, you do not have to relearn them for dogs, terriers, and every type of terrier separately. Over and above this benefit, the power and flexibility of the representational format is greatly increased with the notion of a “default hierarchy” (Quillian, 1968), in which lower branches can contain exceptions to the general properties stored higher up. For example, the fact that penguins do not fly is treated as an exception to the general rule stored higher up that birds do fly. Default hierarchies are an essential tool in database design and in knowledge-based systems architecture in Artificial Intelligence, suggesting their direct relevance for representing human conceptual knowledge.
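To make the default-hierarchy idea concrete, the following sketch is a minimal illustration of our own (not Quillian’s implementation; the class and property names are hypothetical). “Flies” is stored once at the bird level, and “does not fly” is recorded as an exception at the penguin node, so a property lookup walks up the IS-A chain unless a lower node overrides the default.

```python
# Minimal sketch of a default (exception-allowing) hierarchy.
# Properties stored high in the tree are inherited unless a lower
# node overrides them -- e.g., "penguin" overrides "flies".

class Node:
    def __init__(self, name, parent=None, properties=None):
        self.name = name
        self.parent = parent                 # IS-A link to the superordinate
        self.properties = properties or {}   # local facts, including exceptions

    def lookup(self, prop):
        """Walk up the IS-A chain; the lowest node that mentions the
        property wins, which is what makes exceptions possible."""
        node = self
        while node is not None:
            if prop in node.properties:
                return node.properties[prop]
            node = node.parent
        return None  # property unknown anywhere in the chain

animal  = Node("animal",  properties={"breathes": True})
bird    = Node("bird",    parent=animal, properties={"flies": True})
penguin = Node("penguin", parent=bird,   properties={"flies": False})  # exception

print(bird.lookup("flies"))        # True  -- default stored at "bird"
print(penguin.lookup("flies"))     # False -- exception stored lower down
print(penguin.lookup("breathes"))  # True  -- inherited from "animal"
```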

The hierarchical structure of categories seems to be descriptively correct of a significant subset of semantic memory, but what is less well understood is how that knowledge is stored and accessed in memory. A major research question in the 1970s proposed two general approaches to explaining hierarchical structure (see Smith, 1978 for an excellent contemporary review). One view proposed that something much like an actual hierarchy was represented in memory, through an associative network in which different categories were connected by “IS-A” links: a terrier IS-A dog, a dog IS-A mammal, and so on (Collins & Quillian, 1969). To represent the information associated with each category, other links such as “HAS” or “CAN” would connect properties to the categories. So, the dog concept would have a HAS link to the legs concept, and the animal concept would have a CAN link to the breathes concept. Such a structure follows the principle of cognitive economy. By linking “breathes” to the animal concept, one does not have to link it to the concepts of fish, birds, mammals, and all of their many subtypes—the information is placed at the highest level in the hierarchy only. However, a corresponding drawback to such efficiency is that processing is slowed when deriving general features for lower-level categories (Collins & Quillian, 1969). To realize that Airedales breathe, one must traverse the hierarchy through the concepts dog and mammal to arrive at animal, which is linked to the breathes feature. Similarly, classification judgments such as that an Airedale is a living creature, require traversing the links in memory between Airedale and the living creature concept, which must take longer than judging that the Airedale is a dog, since these two concepts are linked directly. In short, there is a distance effect between levels of the hierarchy, such that the farther apart information is stored in the hierarchy, the longer it takes to retrieve or confirm it. Although Collins and Quillian found such a distance effect, others have not or have questioned whether it is due to the inferential process they propose (see Chang, 1986, Smith, 1978).
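As a rough illustration of the retrieval assumptions just described (our own sketch, with a hypothetical mini-taxonomy, not Collins and Quillian’s program), the snippet below counts the IS-A links that must be traversed to confirm a category statement or to find a property stored higher up; under the inference model, verification time should grow with that count, which is the distance effect.

```python
# Toy Collins-and-Quillian-style network: categories joined by IS-A links,
# with properties stored at the highest level to which they apply
# (cognitive economy).

ISA = {"airedale": "terrier", "terrier": "dog", "dog": "mammal",
       "mammal": "animal", "canary": "bird", "bird": "animal"}
PROPERTIES = {"animal": {"breathes"}, "bird": {"flies"}, "dog": {"barks"}}

def isa_distance(category, candidate_superordinate):
    """Number of IS-A links between the two categories, or None if the
    superordinate is never reached. Under the inference model, RT should
    increase with this distance."""
    steps, node = 0, category
    while node is not None:
        if node == candidate_superordinate:
            return steps
        node = ISA.get(node)
        steps += 1
    return None

def has_property(category, prop):
    """Walk upward through IS-A links until some ancestor stores the property."""
    node = category
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return True
        node = ISA.get(node)
    return False

print(isa_distance("airedale", "dog"))       # 2 links: airedale -> terrier -> dog
print(isa_distance("airedale", "animal"))    # 4 links, so a slower "yes" is predicted
print(has_property("airedale", "breathes"))  # True, found only at "animal"
```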

The inferential-network model has had as much lasting power as any idea in cognitive psychology. A survey of our cognition textbooks finds illustrations very similar to Collins and Quillian’s (1969) Fig. 1 in almost every one, ranging from 1972 (Lindsay & Norman, 1972) through 2010 (Ashcraft & Radvansky, 2010).

A different approach to hierarchies in semantic memory proposes that the hierarchies are only implicit in our category knowledge rather than characterizing memory structures. Instead, each concept is represented by its defining and characteristic features (Smith, Rips, & Shoben, 1974). The relations between the features of different concepts would define their categorical relation, if any. For example, the concept animal is associated with the relatively few features that are common to (all) animals. To decide whether an Airedale is an animal, one could check whether those animal features are found in the features known of Airedales: Given that Airedales move independently, breathe, and reproduce, they must be animals. This feature-comparison process yields no distance effect. Furthermore, given that categories are associated to characteristic features, the similarity of two concepts could determine how long it took to judge their relation, independently of their distance in the hierarchy. Such typicality effects are extremely widespread (Hampton, 1979, Hampton, 1997, McCloskey and Glucksberg, 1979, Rips et al., 1973, Rosch, 1973, Rosch and Mervis, 1975).
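A corresponding sketch of the feature-comparison idea (again our own illustration, with hypothetical and much-reduced feature sets): category judgments depend on the overlap between the candidate’s features and the features associated with the category name, so no hierarchy is traversed and no distance effect is predicted; similarity, and hence typicality, does the work instead.

```python
# Toy feature-comparison categorization in the spirit of Smith, Rips,
# and Shoben (1974). Feature sets here are hypothetical.

FEATURES = {
    "animal":   {"moves", "breathes", "reproduces"},
    "bird":     {"moves", "breathes", "reproduces", "flies", "has_feathers"},
    "airedale": {"moves", "breathes", "reproduces", "barks", "has_fur"},
    "robin":    {"moves", "breathes", "reproduces", "flies", "has_feathers",
                 "has_red_breast"},
}

def overlap(instance, category):
    """Proportion of the category's features present in the instance.
    Judgment speed is assumed to track this overlap, not hierarchical
    distance, so typicality effects fall out naturally."""
    cat, inst = FEATURES[category], FEATURES[instance]
    return len(cat & inst) / len(cat)

print(overlap("airedale", "animal"))  # 1.0 -> fast "yes", despite the long path
print(overlap("robin", "bird"))       # 1.0 -> fast "yes"
print(overlap("airedale", "bird"))    # 0.6 -> slower, more error-prone "no"
```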

Ultimately, these two approaches generated considerable research but no clear resolution. Chang’s (1986) comprehensive review makes it clear that all models have unexplained phenomena. Our interpretation of this is that people take advantage of both processes proposed by these approaches, in various combinations. Imagine learning that your friend has a new kind of dog, a muffelet. Without knowing anything about it, you can infer that muffelets have four legs, breathe, probably bark, wag their tails, and so on. You would hardly be puzzled if your friend said that her muffelet chewed up her slippers. Since you have no features associated to the name muffelet, you could not have been using the feature comparison process to draw these conclusions but were likely performing the kind of inference envisioned by Quillian’s theory: The muffelet chews slippers because it is a dog, and that is what juvenile dogs do. On the other hand, the evidence that this inference process takes place when making judgments about familiar categories is weak. The distance effect is often not found and unpredicted effects often are (Chang, 1986). Sometimes inference is not transitive, as it should be according to this view (Hampton, 1982).

Hampton (1997) demonstrated that categorization can use both stored associations and featural similarity, finding independent effects of category production frequency (how likely an exemplar is to be generated as a category member) and typicality (how representative a member is of its category) on categorization times. A double dissociation was obtained, with a priming task removing frequency effects, and a manipulation of task difficulty affecting typicality effects (see also Moss, Ostrin, Tyler, & Marslen-Wilson, 1995). Similarly, Kounios, Osman, and Meyer (1987), in a study using speed–accuracy decomposition, proposed fast retrieval of some facts followed by a slower feature comparison process as one explanation of their results.

Typicality effects fall more readily out of the similarity-comparison model (McCloskey and Glucksberg, 1979, Smith et al., 1974), and it now seems to be the more popular approach—except for a general rejection of the notion of defining features (Hampton, 1979, Rosch, 1973). However, even featural similarity may not explain all category judgments (e.g., Hampton, 1998).

The importance of hierarchically organized knowledge has been recognized in recent models of semantic memory, most notably the very ambitious project of Rogers and McClelland (2004; see Close & Pothos, 2012 for an alternative). They addressed issues of why very general categories may be learned first and are the most resistant to effects of brain damage. They also addressed the presence of a preferred, basic level of categorization (Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976).

Their connectionist model does not align neatly with either of the two previous approaches. They used a Rumelhart network in which input nodes interpreted as objects activate two hidden layers, which, along with context units, activate an output layer containing features and category names. After training, the network was able to respond that a given object breathes or is a canary. The context units refer to behaviors/functions, properties, and names, serving to selectively access the information in the output layer. So, with one context unit activated, the network might respond that a given object has legs, wings, and eyes; with another context unit activated, the same object might yield the response that it is a canary and a bird.
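The sketch below gives a schematic, untrained (randomly weighted) version of the architecture just described, simply to show the flow of activation: an object input plus a context input pass through two hidden layers to an output layer containing both features and names. It illustrates only the wiring, not Rogers and McClelland’s trained model; the item, context, and output labels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

ITEMS    = ["canary", "robin", "salmon", "oak"]      # input (object) units
CONTEXTS = ["name", "property", "behavior"]          # context units
OUTPUTS  = ["bird", "fish", "animal", "has_wings", "can_swim", "breathes"]

N_HID1, N_HID2 = 8, 8
W_item = rng.normal(0, 0.5, (N_HID1, len(ITEMS)))    # items -> hidden layer 1
W_ctx  = rng.normal(0, 0.5, (N_HID2, len(CONTEXTS))) # context -> hidden layer 2
W_12   = rng.normal(0, 0.5, (N_HID2, N_HID1))        # hidden 1 -> hidden 2
W_out  = rng.normal(0, 0.5, (len(OUTPUTS), N_HID2))  # hidden 2 -> outputs

def forward(item, context):
    """One forward sweep: item -> representation layer; representation plus
    context -> second hidden layer; second hidden layer -> feature/name outputs."""
    x = np.eye(len(ITEMS))[ITEMS.index(item)]
    c = np.eye(len(CONTEXTS))[CONTEXTS.index(context)]
    h1 = sigmoid(W_item @ x)              # the item's internal semantic code
    h2 = sigmoid(W_12 @ h1 + W_ctx @ c)   # context selects what to report
    return dict(zip(OUTPUTS, sigmoid(W_out @ h2)))

# With trained weights, forward("canary", "name") would most strongly activate
# "bird" and "animal", while forward("canary", "property") would favor
# "has_wings" and "breathes". With these random weights the output values are
# meaningless -- the point is only the data flow.
print(forward("canary", "name"))
```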

Because of the distributed nature of the conceptual representations and the network architecture, the Rogers and McClelland model is different from the two approaches we have been discussing. Perhaps the greatest difference is that there are no “concept nodes” in the system. Input nodes correspond to objects, and output nodes include features and the objects’ names. In between are hidden nodes that form semantic representations of the kinds of objects the network has learned. There is no node corresponding to the concept of canaries, which is then related to its features or subordinate and superordinate categories. Instead, the semantic representations in the hidden layers activate various features in a graded response. This directly yields typicality effects, as typical objects (like robins) will activate category names and properties most strongly, whereas less typical objects (like penguins) will activate them less strongly.

There is no distance effect in the network corresponding to the Collins and Quillian inference effect. The semantic representations activate specific and general names, and there is no link between the names themselves. As a result, their model does not provide a simple way to evaluate statements such as “A robin is a fish.” However, following a procedure they use for introducing novel category exemplars (p. 64), one can derive a way for the model to answer such questions. If the node representing the first term of the sentence is activated, that activation can be backwards-generated to derive the hidden layer representation that is most compatible with it (the prototypical robin). Then, that activation pattern can be run forward in order to discover whether the second term of the sentence is activated (whether the prototypical robin is a fish). As this description shows, name activation in the model occurs through semantic representations and not through networks of associations between categories or category names. As a result, this model is closer to the feature-based accounts of semantic memory than to the network-based accounts. It seems very likely that the model, like Smith et al.’s (1974), could predict that some long-distance inferences like “A penguin is an animal” are faster to confirm than short-distance links like “A penguin is a bird,” if the penguin’s features overlap more with the typical animal’s than with the typical bird’s. (Indeed, Rogers & McClelland, 2004, chap. 5, document in detail the effects of the similarity of such atypical items to other categories.)
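Concretely, that two-step procedure can be caricatured as follows. This is a heavily simplified sketch of our reading of the backward-generation idea, using a single sigmoid layer with random weights rather than their trained network: gradient ascent adjusts a semantic-layer pattern until the first term’s name unit is strongly active, and the resulting pattern is then read out forward to see how strongly the second term’s name unit responds.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

NAMES = ["robin", "bird", "animal", "fish"]   # name units in the output layer
N_SEM = 10                                    # size of the semantic (hidden) layer
W = rng.normal(0, 0.5, (len(NAMES), N_SEM))   # semantic layer -> name units

def backward_generate(target, steps=500, lr=0.5):
    """Find a semantic-layer pattern that strongly activates `target`
    (e.g., the 'prototypical robin') by gradient ascent on that output unit."""
    t = NAMES.index(target)
    h = np.zeros(N_SEM)
    for _ in range(steps):
        o = sigmoid(W @ h)
        h += lr * o[t] * (1.0 - o[t]) * W[t]   # d(o_t)/d(h), ascend
    return h

def verify(term1, term2):
    """'A term1 is a term2': derive term1's best semantic pattern, run it
    forward, and report how strongly term2's name unit comes on."""
    h = backward_generate(term1)
    outputs = dict(zip(NAMES, sigmoid(W @ h)))
    return outputs[term2]

# With trained weights, verify("robin", "animal") should come out high and
# verify("robin", "fish") low; with random weights the numbers are arbitrary,
# but the two-pass mechanics are the point.
print(verify("robin", "animal"), verify("robin", "fish"))
```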

In summary, Rogers and McClelland’s (2004) semantic memory model seems much closer to the featural approaches, as do recent competitors such as Close and Pothos (2012). It clearly does not contain a hierarchical network of associations that directly lead to the Collins and Quillian effects, and its predicted effects are largely based on semantic similarity and details of the learning regimen (chap. 5). In Experiment 2, which had stimuli comparable to their simulations, we will attempt to draw specific predictions from their model.

It is not our intention to attempt to resolve the semantic memory debate 25 years on. If our conclusion is correct, there is no simple right answer to the question of how hierarchical information is represented. It may be either inferred or explicitly represented, depending on the categories and features. As people become experts or learn specific facts, their knowledge could pre-empt more general retrieval processes. Someone with great experience with killer whales might well store the fact “killer whales breathe air” but would not store the fact “robins breathe air.” Therefore, retrieving information about breathing killer whales might not involve hierarchical inference, whereas retrieving this fact about robins might.

One reason for confusion in the literature is that researchers do not have experimental control over the stimuli of semantic memory and people’s experience with them. People may form implicit categories such as four-legged mammals, which investigators do not take into account, making predictions of hierarchical distance incorrect. People may also have learned some of the specific categorical relations tested in an experiment, like whales being mammals, but have never even encountered others. Familiarity with properties and categories has also been argued to underlie some effects (Malt and Smith, 1982, McCloskey, 1980). Such confounding variables could obscure the basic properties of semantic memory retrieval but are very difficult to control in naturally occurring semantic domains.

In part because of such problems, it is still not clear how people structure and retrieve information from hierarchically organized domains. One important question is whether people spontaneously form memory structures of the Quillian type—efficient hierarchical networks of associations. Although such a structure seems ideal, in practice people may make redundant links or omit links in a way that results in a much more complex memory structure. Another question is whether retrieval of information about hierarchically structured material has the profile that Collins and Quillian (1969) originally identified for it, and in particular, whether it shows the distance effect. Later theorizing weakened that prediction (e.g., Collins & Loftus, 1975), but this was in large part due to uncontrolled associations of the whale-mammal sort.

Whether people form internal hierarchies when all those confounding variables are absent remains an open question. Our goal was to investigate not retrieval of information from familiar semantic domains but the underlying psychological question of whether people create and use mental hierarchies when the conditions are ideal to do so. The answer to this question will then inform the debate about how information is stored in the messier, more complex world of actual semantic memory. If people do not form mental hierarchies even under these ideal circumstances, this will cast strong doubt on whether such hierarchies play a role with real semantic information. If they do so, this will suggest a stronger potential role for such hierarchies in everyday semantic memory.

Our approach was to teach people novel, hierarchically organized information and then to perform the classic tests of information retrieval. In the first experiment, the hierarchy was implicit in the features of a set of learned exemplars. For example, all the shapes of a given color were always shaded in a particular manner. In this case, people would have had to notice the hierarchical structure on their own and use it to represent the information. Since it is possible that the usual profile of hierarchical retrieval will only be found when the information is presented as explicitly hierarchical (“Robins are birds; birds are animals.”), in a further two experiments we explicitly taught people this information. An early experiment by Smith, Haviland, Buckley, and Sack (1972) also taught people hierarchies with novel features. However, their hierarchies were considerably more modest than ours, and they used already familiar categories such as hawk-bird-animal. Thus, they did not avoid the problems associated with familiar items.

Like the traditional semantic memory literature, our experiments focused on categorical relations, comparable to verifying sentences such as “A fish is an animal” or “A claw hammer is a tool.” The main effect to be expected according to the hierarchical retrieval model (Collins & Quillian, 1969) is the distance effect. When the two categories are directly linked, confirming their relationship should be faster than when there is an intervening category; and that should be faster than when there are two intervening categories. By using novel categories and names, we avoided problems such as implicit categories people might form (e.g., four-legged mammals) and specific facts that people might memorize, pre-empting inference (e.g., killer whales being mammals and breathing air).

Learning hierarchically organized categories is not a trivial task. People can only learn and remember so much information in an experimental session, and hierarchies have the unfortunate property of expanding by a factor of two or more with each level that is added. (If they do not, then they are probably not really hierarchies, as we explain below.) We constructed hierarchies with four levels, each of which had a binary branching structure. However, we pruned the category tree in order to limit the number of categories to be learned.
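To put a number on that growth (a simple count, assuming the full binary branching described above, before any pruning): a four-level binary hierarchy already contains 1 + 2 + 4 + 8 = 15 categories, and each added level more than doubles the total.

```python
# Size of a full hierarchy: one root, with the bottom row doubling at each level.
def category_count(levels, branching=2):
    """Total categories in a full hierarchy with the given number of levels."""
    return sum(branching ** level for level in range(levels))

print(category_count(4))  # 1 + 2 + 4 + 8 = 15 categories to learn
print(category_count(5))  # 31 -- one more level more than doubles the load
```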

Past research using a similar method has found that order of learning the levels can have an effect. Murphy and Smith (1982) found that the first-learned level was faster in perceptual classification, and it is likely advantaged in sentence verification tasks as well. We addressed this issue by using two different learning orders. If there is a distance effect, it should be present when averaged across such orders. In addition, there may be an effect of the overall level of category asked about. For example, questions involving the highest level of categories could be answered faster than those involving lower levels, as in Rogers and McClelland’s (2004) model. The distance and level effects can be partly separated (see below), and the effects of these different variables should give insight into how hierarchical information is represented and then retrieved. Of course, retrieving information from recently learned material may be different from retrieving it from very familiar concepts, a possibility we address in the General Discussion.

Our expectation was that under some conditions, with the confounds of differing familiarity and pre-emptive associations gone, people would show the classic distance effect proposed by Collins and Quillian (1969). We thought it was an open question whether such evidence of hierarchical memory structure would be found in all conditions or only when the hierarchy was clearly evident. The pattern of results would be revealing about when we might expect such effects in natural categories. However, our expectations were not actually met, as we did not find distance effects until Experiment 4, and so we postpone consideration of interpretations until the General Discussion.

Section snippets

Experiment 1

The first experiment used a set of items that had an implicit hierarchical structure: The properties of the stimuli were structured in inclusion relations as shown in Fig. 1. The stimuli were all rectangular colored shapes with different sizes, screen locations, and textures. Initially, people simply studied these shapes for a memory test. Afterwards, they judged the truth of sentences about the stimuli, such as “All pink things are empty” or “All left things are small.” Of the possible ways of

Experiment 2

Fig. 3 depicts one of the taxonomies used in Experiment 2, and Fig. 4 shows exemplars of two categories, HOBNIKs and LARs. The stimuli were schematic drawings of bugs which varied in their shape, pattern, number of legs, and color. We constructed categories at four different levels, as shown in Fig. 3, by successively combining lower-level categories into more general ones. To make learning easier, the categories at each level were defined by the features of the category immediately above them

Experiment 3

Our goal in this research has been to investigate the development and use of hierarchical memory structures for artificial materials that did not have the potential confounding variables that could influence natural category hierarchies. For example, if children are told that penguins are birds or worms are animals, these learned facts could influence their sentence verification, probably pre-empting the use of hierarchical inference or feature comparison. After all, a learned fact is likely to

Experiment 4

The repeated finding of no distance effect—or even a negative distance effect—within hierarchies is surprising. In fact, the result may raise a concern that there is something wrong with our tested hierarchy, the names, or some aspect of the testing procedure. There is a certain logic to the claim that drawing inferences must take longer than retrieving known information and that inferences involving more steps must take longer than those involving fewer steps. The failure to find such effects

General discussion

We began this investigation by asking whether retrieval of information from a newly learned set of categories would produce the pattern predicted by Collins and Quillian (1969) in their classic semantic memory model, when confounding effects of familiarity, differences in associations, and specific learned facts are removed. This question is really two interrelated questions: Do people actually form mental representations in the efficient hierarchical structure C&Q assume? And does retrieval

Conclusion

Even taking into account the diversity of ways that hierarchical information might be encoded and retrieved, we did not find that the traditional Quillian hierarchy was the favored method. Instead, it appeared to be used only when other sources of information and retrieval strategies were entirely removed. Therefore, we suspect that in everyday life, such a model of hierarchical concepts is probably not the default way that information is retrieved from semantic memory.

Acknowledgments

We thank Rebecca Bainbridge for her help in collecting and analyzing data and the Concats Lab Meeting for helpful comments. The authors dedicate this article to the memory of Edward E. Smith, who died on August 17, 2012. His groundbreaking research helped create the field of semantic memory and inspired the present study.

References (43)

  • E.H. Rosch. On the internal structure of perceptual and semantic categories.
  • E.E. Smith et al. Retrieval of artificial facts from long-term memory. Journal of Verbal Learning and Verbal Behavior (1972).
  • E.E. Smith et al. Semantic memory and psychological semantics.
  • J.R. Anderson. Arguments concerning representations for mental imagery. Psychological Review (1978).
  • F.G. Ashby et al. A neuropsychological theory of multiple systems in category learning. Psychological Review (1998).
  • M.H. Ashcraft et al. Cognition (2010).
  • T. Chang. Semantic memory: Facts and models. Psychological Bulletin (1986).
  • J. Close et al. “Object categorization: Reversals and explanations of the basic-level advantage” (Rogers & Patterson, 2007): A simplicity account. Quarterly Journal of Experimental Psychology (2012).
  • A.M. Collins et al. A spreading-activation theory of semantic processing. Psychological Review (1975).
  • A.M. Collins et al. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior (1969).
  • D. Gentner et al. Studies of inference from lack of knowledge. Memory & Cognition (1981).