Received 24 June 2012; Revised 24 October 2012; Accepted 4 January 2013

Abstract

While it is generally agreed that the concept of homology refers to individuated traits that have been inherited from common ancestry, we still lack an adequate account of trait individuation or inheritance. Here I propose that we utilize a counterfactual criterion of causation to link each trait with a developmental-causal (DC) gene. A DC gene is made up of the genetic information (which might or might not be physically contiguous in the genome) that is needed for the production of the organismic attributes that comprise the trait. I argue that individuated traits—phenes—correspond to organismic features that are caused by DC genes. Using such an approach, we can define a DC map, which shows the relations between each pair of phenes and provides a succinct summary of genotype-phenotype relationships and phenotypic complexity. Phenes in parents and offspring are judged to be homologous if their DC genes are composed of orthologous genetic factors. When comparing more distantly related organisms, traits are homologous when linked by a chain of parent-offspring homologs along the path of ancestry that links the two organisms. There are three possible ways to deal with the potential for multiple equivalent DC genes: maximal, minimal, and consensus homology. Whereas maximal homology has limited utility, the other two approaches have value and can help to guide research at the intersection of evolution and development.


1. Introduction

The homology relation refers to the idea that different organisms of the same or different species share the same traits. For example, it is generally agreed that my middle finger nail and your middle finger nail are homologous with one another and with the hoof of a horse. As this example shows, homology does not require that the structures in question be identical in form or function. So what notion of “sameness” is connoted by homology? Darwin and subsequent biologists have generally agreed that the concept of homology relates in some way to descent from common ancestry (e.g., Lankester 1870). A fingernail and a hoof are homologous because the common ancestor of a human and a horse had a structure that became modified to give rise to the human middle finger nail and the horse’s hoof.

While the idea is simple, nailing down (no pun intended) the concept of homology has proved challenging, to say the least. All agree that homology depends in some way on common ancestry, but the manner of this dependence is far from clear. In particular, how can homology refer to the inheritance of traits from common ancestry when phenotypic traits are not passed on from generation to generation? Thankfully, we do not pass on our fingernails to our children by direct grafting!

This paper is built on the premise that in order to get traction on homology, and put the concept to work in comparative biology, we first need to solve a much less studied conceptual problem: the nature of traits. When observing an organism we get the sense that it may be atomized into parts in some objective way. At the very least, it would be agreed that some pieces of an organism are not meaningful traits. For example, half my fingernail plus a corner of my earlobe do not together constitute a meaningful part of my body. Nor even half my fingernail plus a small piece of contiguous finger. But what makes some parts of an organism meaningful and some not? How are traits individuated?

The inheritance of traits through time and the atomization of organisms into individuated traits are linked because both are seen as depending on development: the process by which a genotype interacts with a specific environment to generate a phenotype (e.g., Wagner 1989; Abouheif et al. 1997; Laubichler 2000; Wagner et al. 2000; Hall 2003; Wagner 2007). There have been attempts to equate traits with modules of developmental programs (e.g., Wagner 1996), but, while the idea of developmental modules is appealing, precisely defining these modules has been difficult. Furthermore, the claim that modules emerge as a result of selection (whether at the individual or clade level) remains unsubstantiated. In this paper I develop an alternative conception of trait individuation based on developmental causation, which leads to a precise definition of homology, or rather a family of three closely related homology concepts. This causal criterion avoids some of the pitfalls of similarity-based conceptions of homology. The three closely related homology concepts can be marshaled to clarify empirical research programs in comparative morphology, systematics, and evolutionary developmental biology.

2. A Counterfactual Causal Criterion for Developmental-Causal Genes and Phenes

In order for traits to be individuated in some way that would allow for them to be seen as heritable, the basis of their individuation must relate to genotypic information. I propose that traits be attached to the genetic factors that are needed for them to be present in a specific organism in a specific environment. Take one organism in one environment and conduct the following thought experiment. Visit each genetic factor in the genome, which is to say each gene, functional subdomain of a gene, cis-regulatory element, and so on, and ask: What features (if any) would be absent from the organism in the current environment if this genetic factor were not present?[1] In other words, apply a counterfactual criterion of causation to ask if this genetic factor causes some aspect of the organism’s phenotype. We do this by noting exactly which features of the organism depend on this genetic factor. By “feature” I mean any attribute that is measurable or observable, at least in principle.[2]

Once we have considered every individual genetic factor in the genome,[3] we proceed to consider every possible set of genetic factors that have non-additive causal effects. We could start by visiting each possible pair of genetic factors (adjacent or distant) and ask: What aspects of the phenotype depend on the presence of these two genetic factors together? That is, what features depend on this combination of genetic factors and cannot be explained simply by adding up the effects of the individual genetic factors in question? As an example, suppose that the counterfactual absence of genetic factor 1 results in the lack of attribute A, and the absence of genetic factor 2 results in the lack of attribute B. If the counterfactual loss of both genetic factors 1 and 2 at the same time resulted in the loss of attributes A, B, and C, then C could be viewed as being causally dependent on a compound genetic cause, 1+2. Moreover, we could consider triplets of genetic factors that together cause further phenotypic features, and so on, until we have considered every possible subset of genetic factors in the genome.[4]
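The enumeration just described can be illustrated with a minimal Python sketch. It is not a practical procedure, since the number of subsets grows exponentially with the number of genetic factors. The function `lost_attributes` is a hypothetical stand-in for the counterfactual experiment, hard-coded to reproduce the example of factors 1 and 2 and attributes A, B, and C. The inert factor 3 and all function names are assumptions introduced for illustration only.

```python
from itertools import combinations


def lost_attributes(removed):
    """Hypothetical knockout model: attributes absent when `removed` factors are missing."""
    lost = set()
    if 1 in removed:
        lost.add("A")
    if 2 in removed:
        lost.add("B")
    if 1 in removed and 2 in removed:
        lost.add("C")  # lost only when factors 1 and 2 are removed together
    return frozenset(lost)


def non_additive_effects(factors, knockout):
    """Map each subset of factors to the attributes that depend on that
    subset jointly, i.e. attributes lost by the joint knockout but not by
    the knockout of any proper subset."""
    effects = {}
    ordered = sorted(factors)
    for size in range(1, len(ordered) + 1):
        for subset in combinations(ordered, size):
            joint = knockout(frozenset(subset))
            explained = set()
            for k in range(1, size):
                for smaller in combinations(subset, k):
                    explained |= knockout(frozenset(smaller))
            novel = joint - explained
            if novel:
                effects[subset] = frozenset(novel)
    return effects


# Factor 3 is an assumed inert factor with no phenotypic effect.
effects = non_additive_effects({1, 2, 3}, lost_attributes)
for subset, attributes in effects.items():
    print(subset, sorted(attributes))
```

In this toy model only three subsets come out as causes: factor 1 alone, factor 2 alone, and the compound 1+2, which is the sole cause of attribute C.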

Let us define an individual or set of genetic factors that causes some phenotypic attribute as a developmental-causal (DC) gene. A DC gene is a subset of the genome, which might correspond to a localized segment of DNA or a heterogeneous set of genomic regions (in the case of compound genetic causes), that is needed for some particular aspect of the phenotype to be expressed. An aspect of the phenotype that is dependent on a DC gene is termed a phene.[5] These and other key terms are summarized in Table 1.

Table 1 — Glossary of key terms
Term | Definition
Developmental-causal (DC) gene | One or multiple genetic factors that collectively cause some feature of the organism (in a specified environment)
Developmental-causal (DC) map | A representation, perhaps in Euler diagram format, of the relationships among multiple phenes
Equivalent DC genes | DC genes that cause the same phene
Genetic factor | A localized part of the genome (or epigenome), perhaps corresponding to a gene, functional gene domain, or regulatory element, which has the potential to be required for some aspect of the phenotype
Phene | The set of features of an organism that are caused by a DC gene or a set of equivalent DC genes
Trait | A feature or part of an organism that is hypothesized to correspond to a phene
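The relationship between DC genes, equivalent DC genes, and phenes in Table 1 can be made concrete with a small Python sketch. The factor identifiers (`g1` to `g4`) and feature labels (`f1`, `f2`) are hypothetical placeholders, and representing a DC gene as a tuple of factor identifiers is an assumption made here for illustration.

```python
from collections import defaultdict

# Hypothetical data: each DC gene (a tuple of genetic-factor ids) mapped to
# the set of organismic features that depend on it.
dc_gene_features = {
    ("g1",): frozenset({"f1", "f2"}),
    ("g2", "g3"): frozenset({"f1", "f2"}),  # compound DC gene, equivalent to ("g1",)
    ("g4",): frozenset({"f1"}),
}


def phenes_from_dc_genes(dc_gene_features):
    """Group equivalent DC genes: DC genes that cause exactly the same
    feature set are taken to individuate a single phene."""
    phenes = defaultdict(set)
    for dc_gene, features in dc_gene_features.items():
        phenes[features].add(dc_gene)
    return dict(phenes)


phenes = phenes_from_dc_genes(dc_gene_features)
print(len(dc_gene_features), "DC genes,", len(phenes), "phenes")
```

Because equivalent DC genes collapse onto one phene, the number of phenes in such a grouping can never exceed the number of DC genes.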

The strategy of providing a new conception of “gene,” distinct from the transcript-encoding units of molecular genetics, is not unique to this context. A similar strategy has been used in population genetics, where “gene” often has its own special-purpose definition to refer to a contiguous stretch of DNA that has not been subject to recombination. The DC gene concept is different from the molecular genetic and recombinational gene concepts in that it may be composed of stretches of DNA on different chromosomes. However, like these other gene concepts, it is a useful abstraction because it captures the idea that our focus in developmental genetics is on portions of the genome that cause aspects of the phenotype.

The counterfactual thought experiment atomizes a single organism, asking what would be absent from the phenotype if particular genetic factors had not been present in the genome. It is, of course, impossible to literally run this experiment. However, assuming that the mapping from genotype to phenotype (the DC map) is conserved among individual organisms within a model species, we can use studies of mutants and other genetic manipulations to explore the phenotypic attributes caused by different genetic factors within this shared DC map. Thus, the research program implied by the counterfactual causal criterion is eminently practical; it has been progressing since the early 20th century under the moniker of genetics. Much genetic research is motivated by the molecular analysis of genes and the reconstruction of genetic “pathways.” However, another important goal of genetics is the identification of gene function, where function is often understood as a process or feature that depends essentially on the gene (e.g., Stadler et al. 2009). As such, “gene function” is similar to “phene,” showing the rather close analogies between the developmental-causal (DC) approach and standard genetics. Indeed, mutagenesis studies have sometimes been conducted with the explicit aim of individuating traits (e.g., Monteiro et al. 2003).

3. Implications of the Counterfactual Approach to Trait Individuation

On the DC approach, phenes have objective reality in much the same way that clades have reality in phylogenetic systematics. A clade, or monophyletic group, corresponds to exactly that set of organisms that would not exist if a specific ancestor (a historical cause) had not existed. Similarly, a phene corresponds to that piece of an organism that would not be present if a specific DC gene had not existed in the genome. Thus, insofar as one considers clades to be individuals, phenes are too. An additional point of similarity to clades can be highlighted. A taxon might be equally dependent on one, two, or more specific ancestors—counterfactually, the loss of any of several ancestral organisms would result in the loss of exactly the same set of extant organisms. Likewise, multiple DC genes might cause the same phenotypic attribute, in which case the phene is multiply determined by several equivalent DC genes.

I contend that most visible and familiar phenotypic features (e.g., lungs, leaves, and antennae) correspond to multiply determined phenes. My reasoning is that such organs tend to have complex, highly regulated developmental programs in which many gene products interact in completing many processes. As a result, there ought to be many combinations of hypothetical mutations that would result in the obliteration of an entire organ. If this assumption is correct, these organs will be associated with multiple equivalent DC genes. Even though my contention that many familiar phenes are associated with multiple equivalent DC genes colors my conceptual framework, the model does not depend on the veracity of this claim.

There are very many but not infinitely many genetic factors in the genome and, consequently, there is a finite number of DC genes and phenes (the number of phenes being equal to or less than the number of DC genes because of equivalency). The total number of phenes in an organism, a simple measure of phenotypic complexity, will likely be influenced by the size of the genome, which aligns with the perception that multicellular eukaryotes are more complex than unicellular eukaryotes and prokaryotes. However, we should not expect a perfect correlation with genome size. For example, parts of the genome that are developmentally inert and have no impact on the phenotype would not be DC genes, and therefore could not contribute to phenotypic complexity. Thus, an organism whose genome contains much junk DNA could be less phenotypically complex than an organism with a smaller, but more consistently functional, genome. Furthermore, organisms with the same number of DC genes may differ in phenotypic complexity if one organism has a higher number of equivalent DC genes.

An important implication of the finite number of phenes is that some parts of organisms are not phenes. So, while the phene concept is permissive in that a single organism can be composed of a very large number of phenes, some “parts” of an organism that we might refer to may not be individuated validly. Some putative traits may correspond to relations among phenes, while not being phenes themselves. For example, it is plausible that there is no individuated “right forelimb” in vertebrates. Instead, this part could represent merely the intersection of two individuated phenes: one composed of the two forelimbs, and one composed of the two limbs on the right side of the body. If this is the case, then there would be no set of mutations, however improbable, that could result in an organism that was intact in all regards except lacking its right forelimb.[6] Other putative traits may not even be relations among phenes, especially those mentioned earlier: “half my fingernail plus a small piece of contiguous finger” and “half my fingernail plus a corner of my earlobe.”

This leads to another parallel with phylogenetic systematics. The relationship between trait and phene is very much like that between taxon and clade. “Good” traits are phenes, in much the same way that “good” taxa are clades. In other words, indicating the existence of a trait, perhaps by anointing it with a name, amounts to positing the hypothesis that this part of the organism is a phene. This hypothesis is, at least in principle, testable. My ambition is that the phene concept might help focus the attention of developmental biologists on the opportunities to test trait individuation hypotheses by reference to genetic experiments in much the same way that phylogenetics gives systematists rigorous ways to test taxon hypotheses.

4. A Developmental-Causal Map Summarizes the Relations among Phenes

Before exploring the implications of developmental causation for homology, it is useful to introduce a framework for representing the relationship between genotype and phenotype: a developmental-causal (DC) map. This map summarizes the logical relationships among the phenes caused by each DC gene. To build such a map we would consider each pair of DC genes in the genome and note whether the phenes they cause intersect (meaning that some piece of the organism is dependent upon both genetic factors), and if they intersect, how. Table 2 lists the five possible relations between the members of each pair of DC genes. A representation of the relations among the DC genes in a genome—a DC map—can be summarized in the form of a triangular matrix whose entries display which of these five relations holds for each pair of DC genes. This would be the most efficient format for computational tracking of developmental causation, and could be added to existing databases that contain information on mutant phenotypes (e.g., FlyBase or TAIR). However, for conceptual purposes, a more attractive representation is an Euler diagram, where phenes are depicted as ovals and areas of overlap between ovals indicate whether those phenes share phenotypic features. Figure 1 is a cartoon of a very limited DC map in Euler and matrix forms.

Table 2 — Possible relationships between the phenotypic features caused by two genetic factors (A and B) and their corresponding symbolic representation.
Case | Content of the Intersection | Symbol
A and B are disjunct | Nothing | [A] ∅ [B]
A is a proper subset of B | All of A and some of B | [A] ⊂ [B]
B is a proper subset of A | Some of A and all of B | [A] ⊃ [B]
A and B partially overlap | Some of A and some of B | [A] ∩ [B]
A and B are equivalent | All of A and all of B | [A] = [B]
Figure 1 — A cartoon of a hypothetical DC map. This map contains four DC genes and three phenes. The relationships among the phenotypic attributes caused by each DC gene can be indicated by overlap in the Euler diagram or in matrix format. The format shows the directed relationship between the DC gene to the left relative to that above: ∩ (partial overlap); ∅ (disjoint); = (equivalent); ⊃ (includes); ⊂ (included by).
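The matrix form of a DC map can be sketched in Python by classifying each pair of DC genes into one of the five relations of Table 2. The DC gene names (`W`, `X`, `Y`, `Z`) and the numbered features are hypothetical and do not reproduce the contents of Figure 1; they are chosen only so that, as in that figure, four DC genes yield three phenes.

```python
from itertools import combinations


def relation(a, b):
    """Return the Table 2 symbol relating the feature sets caused by two DC genes."""
    a, b = frozenset(a), frozenset(b)
    if a == b:
        return "="   # equivalent
    if not (a & b):
        return "∅"   # no shared features
    if a < b:
        return "⊂"   # A is a proper subset of B
    if a > b:
        return "⊃"   # B is a proper subset of A
    return "∩"       # partial overlap


def dc_map(dc_gene_features):
    """Upper-triangular DC map: one entry per unordered pair of DC genes."""
    names = sorted(dc_gene_features)
    return {
        (x, y): relation(dc_gene_features[x], dc_gene_features[y])
        for x, y in combinations(names, 2)
    }


# Hypothetical feature sets, indexed by DC gene.
features = {"W": {1, 2, 3, 4}, "X": {1, 2}, "Y": {1, 2}, "Z": {4, 5}}
matrix = dc_map(features)
for pair, symbol in matrix.items():
    print(pair, symbol)
```

Only set relations among the feature sets enter the map; what the features actually are plays no role in the classification.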

A DC map succinctly summarizes the relationship between the genotype and the phenotype. This map is not built upon human judgment as to which traits are valid. Rather, all that humans need to do is document, with studies of actual or theoretical mutants, the attributes of the organism that are dependent on each DC gene and the overlap between those phenes. The phenotypic attributes per se are not important; only their relationships matter.[7] While this organization of genetic data is practical, it is not the standard approach, which typically entails descriptions of phenotypes in terms of loosely defined trait descriptors (“leaf,” “humerus,” etc.).

The DC map is distinct from other ways of summarizing developmental programs, such as genetic network diagrams or Boolean operator representations. These summarize alternative trait states that may arise from a series of genetic or molecular interactions and are optimized for communicating local mechanistic aspects of development. In contrast to DC maps, these graphs of development are not readily expandable to a large number of genes. Furthermore, these approaches generally presume trait individuation rather than providing a framework for understanding how development serves to bound traits.

Stadler et al. (2001) and Wagner and Stadler (2003) proposed an alternative format for representing the genotype-phenotype map. The core idea is a graph in which all possible complete genotypes are associated with nodes, and edges link pairs of genotypes that are one mutational step apart. Equivalence sets of nodes are defined as those that are associated with the same phenotype. This conception allows for a mathematical description of character individuation by the criterion of quasi-independence. The underlying assumption of quasi-independence—that “natural selection can adjust one character without permanently altering other attributes of the phenotype” (Wagner and Stadler 2003)—comes very close to the DC mapping framework developed here. However, their model differs in two important regards. First, they organize the graphical space around genotypes rather than phenotypes. Second, they do not condition on the current genotype and phenotype, but simultaneously consider the universe of all possible genotypes and phenotypes. Their approach may allow for a smoother integration with theoretical quantitative genetics but is less intuitive than DC mapping and provides a less direct approach to trait individuation and homology.

5. Parent-Offspring Trait Homology

We should expect a developmental-causal concept of homology to provide an account of trait inheritance between one generation and the next. Ramsey and Peterson (2012), in an otherwise very useful discussion of homology, suggest that the sameness of traits between parents and offspring rests merely on similarity: “the sameness of the traits from one generation to the next is derived from their similarity” (p. 265). This is unsatisfactory because similarity lies in the eye of the beholder. Such a view implies that trait inheritance is not purely biological but depends in some way on human perception.

Even before the molecular basis of the gene was discovered, it was clear that the sharing of genetic information is an important explanation for the sharing of traits between species, i.e., homology (e.g., Boyden 1943). However, it also has long been clear that there is not a one-to-one correspondence between individual genes and traits. One response is to fall back on concepts other than the gene to explain trait persistence through time, including “essential genetic agreement” (Hubbs 1944), “continuity of information” (Van Valen 1982), and “sharing of pathways of development” (Roth 1984). I am attempting a different strategy that defines a developmentally relevant notion of “gene” (the DC gene) in order to explain the persistence of phenotypic attributes (phenes) through time. This approach avoids an ontological dependence on similarity (an epistemological role is permissible, and probably unavoidable), while also avoiding a simplistic view of the genotype-phenotype map. In this section I focus narrowly on homology statements that connect traits in parents and offspring, whereas the next sections will expand this treatment to consider more distantly related pairs of organisms.

I inherited my parents’ middle fingernails precisely because I inherited DC genes from my parents that cause the production of these fingernails. This leads to a generic definition of the homology relation: phene X in a parent is homologous to phene Y in its offspring if the nucleotide positions that make up the DC genes of X are orthologous to the nucleotide positions that make up the DC genes of Y.[8] This definition is straightforward when there is precisely one DC gene that causes phene X and one that causes phene Y. But what if phene X or Y is associated with multiple equivalent DC genes?

Suppose that four equivalent DC genes cause phene X in a parent: (A) genetic factor 1; (B) genetic factor 2; (C) genetic factors 3 and 4 together; and (D) genetic factors 5 and 6 together. Now suppose that changes somewhere in the genome have altered development in the offspring such that the phene caused by DC gene A is different from the phene caused by DC genes B and C. Further, let us suppose that, although genetic factors 5 and 6 have orthologs in both genomes, there is no part of the offspring that is dependent on genetic factors 5 and 6 together (i.e., not caused by either 5 or 6 alone), meaning that there is no DC gene D in the offspring. As a result of these changes, there are two phenes in the offspring, Y1 and Y2, which each have some claim to be the homolog of the single phene in the parent (Figure 2). Is Y1, Y2, both, or neither homologous to X? Three alternative approaches to answering this question present themselves.

The first approach may be called maximal homology. It asserts that two phenes are homologous if all DC genes are orthologs: a phene in a parent is homologous to a phene in its offspring if all equivalent DC genes for the phene in the parent also are DC genes for the phene in the offspring, and vice versa. In the hypothetical case (Figure 2), neither phene Y1 nor phene Y2, nor any other part of the offspring is homologous to the parental phene X. The parental phene has disappeared, and two new phenes have appeared in the offspring. Maximal homology is not a viable concept because it implies that traits cannot persist through time in the face of even minor changes in genetic causation. However, for completeness, and as a contrast to the other two approaches, I will continue to explicate this approach in subsequent sections.

Figure 2 — Part of the DC map of a parent and an offspring. Phene X in the parent is multiply determined by four distinct DC genes. One of these DC genes, D, is non-causal in the offspring. Additionally, DC genes A and B+C individuate different phenes in the offspring: Y1 and Y2, respectively.

The second approach may be called minimal homology. It asserts that two phenes are homologous if there is at least one shared, orthologous DC gene. In the hypothetical case, phenes Y1 and Y2 of the offspring are both (minimally) homologous to the parental phene X. Or, turning it around, the parental phene shows homology to two offspring traits. This illustrates that minimal homology relations can split and merge over time. Nonetheless, they are much more durable than maximal homology relations in the sense that the only way that homology can be lost is if a DC gene in one organism has no ortholog in the other organism, or if the ortholog is present but is not a DC gene (i.e., it does not cause any part of the phenotype). Thus, losing homologs is much harder under minimal homology than under maximal homology. The minimal homology framework embodies the idea that homology relations need not be all-or-none but are instead made up of multiple lower-level (DC gene) homology relations that are potentially at odds with one another.

The third approach to parent-offspring homology is consensus homology. It asserts that at most one trait in an offspring should be considered homologous to a specified trait in a parent and vice versa. In cases where different DC genes suggest different homology relations, the single true homology relation is that which applies (reciprocally) to the largest number of shared DC genes. Suppose the parent has a phene Pp that is determined by N DC genes and the offspring has phene Po with M DC genes. We may then specify that Pp and Po are consensus homologous if and only if the number of DC genes shared by Pp and Po (NM) is greater than the number of DC genes that Pp shares with any other phene in the offspring, and also greater than the number of DC genes that Po shares with any other phene in the parent. In the example shown in Figure 2, Trait Y2 of the offspring (which shares two DC genes), but not Y1 (which shares just one DC gene), is homologous to X of the parent.
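The three parent-offspring concepts can be compared on the Figure 2 example with a short Python sketch. Each phene is represented, as a simplifying assumption, by the set of labels of the DC genes that cause it, and a shared label stands in for orthology. The function names, and the rule that consensus homology needs at least one shared DC gene, are likewise assumptions of this sketch.

```python
# Figure 2 example: phene X in the parent is caused by DC genes A, B, C, D;
# in the offspring, A causes Y1, B and C cause Y2, and D is non-causal.
parent = {"X": {"A", "B", "C", "D"}}
offspring = {"Y1": {"A"}, "Y2": {"B", "C"}}


def maximal(parent, offspring, x, y):
    """All equivalent DC genes of x are DC genes of y, and vice versa."""
    return parent[x] == offspring[y]


def minimal(parent, offspring, x, y):
    """At least one DC gene is shared."""
    return bool(parent[x] & offspring[y])


def consensus(parent, offspring, x, y):
    """x and y share strictly more DC genes with each other than either
    shares with any other phene in the other organism."""
    n_shared = len(parent[x] & offspring[y])
    if n_shared == 0:  # assumption: no shared DC gene means no homology
        return False
    rivals = [len(parent[x] & genes) for name, genes in offspring.items() if name != y]
    rivals += [len(genes & offspring[y]) for name, genes in parent.items() if name != x]
    return all(n_shared > r for r in rivals)


for y in offspring:
    print(y,
          "maximal:", maximal(parent, offspring, "X", y),
          "minimal:", minimal(parent, offspring, "X", y),
          "consensus:", consensus(parent, offspring, "X", y))
```

Under these assumptions the sketch agrees with the verdicts in the text: neither offspring phene is a maximal homolog of X, both are minimal homologs, and only Y2 is the consensus homolog.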

The consensus approach seems most compelling in cases where a phene is determined by many DC genes. If this were so, then, under the reasonable assumption that few DC genes change their causal impact between one generation and the next, consensus homology would be objective and unambiguous. However, it is an empirically open question how many DC genes determine typical organismic traits, meaning that the applicability of the consensus approach remains open to scrutiny.

6. Homology over the Long Haul

The preceding section discussed homology relations between parents and offspring, resulting in the recognition of three different causal homology concepts that differ in how they deal with generation-to-generation changes in the relationships among equivalent DC genes. In order to extend this analysis across additional generations, we need to accommodate the potentially much longer path of ancestry separating the two organisms under consideration. As illustrated in Figure 3, the path of ancestry between two organisms, A and B, may be defined as the set of all organisms that are ancestors of A or B and are either the last common ancestor of A and B or are descendants of that last common ancestor.[9] One way to understand the homology relationship between particular phenes (PA and PB) in a pair of focal organisms (A and B, respectively) is to proceed as if A and B were a parent-offspring pair and look at which DC genes control the phenes of each organism. That is to say, we would directly compare A and B and require that PA and PB depend upon orthologous DC genes.

Figure 3 — Homology across a path of ancestry. The path of ancestry from organism A to organism B includes the parents of A (PA) and B (PB), the grandparents of A (GA) and B (GB), the great-grandparents of A and B, and so on back to the last common ancestor of A and B (LCA). The homology of phene PA in organism A and phene PB in organism B can be assessed directly (double-headed arrow, at top) or generation-by-generation (small double-headed arrows, below).
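The definition of a path of ancestry can be sketched in Python under strong simplifying assumptions: each organism has a single recorded parent (as in clonal descent), the two focal organisms share a common ancestor, and neither is an ancestor of the other. The `parent_of` mapping and the extra `root` organism are hypothetical; the other labels follow Figure 3.

```python
def ancestors(organism, parent_of):
    """Ancestors of `organism`, nearest first, following single-parent links."""
    chain = []
    while organism in parent_of:
        organism = parent_of[organism]
        chain.append(organism)
    return chain


def path_of_ancestry(a, b, parent_of):
    """Return (LCA, path): the last common ancestor of a and b, and the set of
    all ancestors of a or b that are the LCA or descendants of the LCA.
    Raises StopIteration if a and b share no recorded ancestor."""
    anc_a, anc_b = ancestors(a, parent_of), ancestors(b, parent_of)
    common = set(anc_b)
    lca = next(x for x in anc_a if x in common)
    side_a = anc_a[: anc_a.index(lca) + 1]  # a's ancestors up to and including the LCA
    side_b = anc_b[: anc_b.index(lca)]      # b's ancestors below the LCA
    return lca, set(side_a) | set(side_b)


# Hypothetical pedigree; "root" is an ancestor older than the LCA.
parent_of = {
    "A": "PA", "PA": "GA", "GA": "LCA",
    "B": "PB", "PB": "GB", "GB": "LCA",
    "LCA": "root",
}
lca, path = path_of_ancestry("A", "B", parent_of)
print(lca, sorted(path))
```

Ancestors older than the last common ancestor, such as `root` here, fall outside the path, as the definition requires.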

The direct approach faces one immediate problem. By looking at the homology of DC genes causing PA and PB, we are effectively assessing the similarity of genetic causation of two traits rather than their sameness (Ramsey and Peterson 2012). But most biologists would not want to treat the traits of two species as homologous if they independently acquired a causal dependence on orthologous DC genes. Such a conclusion would go against conventional understanding, which would view this as an instance of homoplasy, not homology.

We could rescue the direct approach to homology by modifying the definition to require continuity of dependence on the same DC genes along the path of ancestry back to the common ancestor. In addition to being inelegant, this approach fails to solve another problem. It is now widely accepted that traits can persist indefinitely in the face of gradual turnover in their genetic causation, so-called developmental system drift (True and Haag 2001; Haag 2007). Consequently, most biologists would consider it at least theoretically possible for traits in two species to be true homologs, but lack any orthologous DC genes. The lenses of shark and frog eyes could be homologs even if there were no homologous DC genes that individuated the anatomical lens of both sharks and frogs.

An alternative approach that solves these problems is to not directly compare the DC maps of the extant taxa but to look at the parent-offspring pairs that make up the path of ancestry. In this case, PA and PB are homologous if they are linked by traits in intervening generations such that, at every step along the path, the parental phene is homologous to the offspring phene.[10] Under this approach, the DC genes that confer homology are reset each generation; the DC genes that confer homology from an offspring to its parent may be different from the ones that confer homology of that parent to its own parent (the grandparent). As a consequence, the genetic causes of traits can shift along the path of ancestry such that PA and PB could be homologous even if they share no DC genes. While there may be some phenomena that are better captured using the direct approach,[11] the reset version of homology is preferable and will be adopted for the remainder of the paper. The reset approach can be applied equally under maximal,[12] minimal,[13] and consensus homology.[14] In the minimal and consensus formulations, but not the maximal one, turnover in DC genes across the path of ancestry permits traits in two extant species to be homologs while lacking any orthologous DC genes.
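The contrast between the direct and reset approaches can be illustrated with a minimal Python sketch that uses minimal homology at each step. The path of ancestry is simplified to an ordered list of adjacent organisms, each mapping phene names to sets of DC gene labels. The three-generation data, the labels, and the function names are hypothetical, chosen to mimic gradual turnover in genetic causation.

```python
def minimal_link(genes_p, genes_q):
    """Minimal homology between phenes of adjacent generations: at least one shared DC gene."""
    return bool(genes_p & genes_q)


def reset_homologs(lineage, start):
    """Follow homology links generation by generation and return the phenes of
    the last organism that are linked to phene `start` of the first organism."""
    current = {start}
    for earlier, later in zip(lineage, lineage[1:]):
        current = {
            y
            for y, genes_y in later.items()
            for x in current
            if minimal_link(earlier[x], genes_y)
        }
    return current


# Hypothetical lineage with turnover in DC genes: A,B -> B,C -> C,D.
lineage = [
    {"P0": {"A", "B"}},
    {"P1": {"B", "C"}},
    {"P2": {"C", "D"}},
]

direct = minimal_link(lineage[0]["P0"], lineage[-1]["P2"])
chained = reset_homologs(lineage, "P0")
print("direct:", direct, "reset:", sorted(chained))
```

In this toy lineage the first and last phenes share no DC genes, so a direct comparison finds no homology, whereas the chain of links between adjacent generations still connects them.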

7. Shared Features of Developmental-Causal Homology Concepts

Before progressing to discuss the pros and cons of these three homology concepts, it will be useful to highlight a few features that they share. This will serve to clarify how my approach differs from other attempts to solve the homology problem. First, it is worth stressing that phene homology is established only by the homology of DC genes, which in turn is established by the orthology of the causal genetic factors—the inheritance of genotypic information. In this chain of reasoning, similarity has no place. The statement that the traits of two organisms are homologous is true or false regardless of how similar or different the traits are in position, structure, or development. The homology of traits might be hypothesized originally because of some observed similarity, and the similarity of traits may provide evidence for or against a homology hypothesis, but the presence of similarity is not required for a homology hypothesis to be correct. This fact is important because it allows that homology—descent from common ancestry—can be offered as an explanation of similarity without leading to a tautology.

A second major implication of the developmental causal approach is that it satisfies the criterion of transitivity, which has been highlighted as a critical feature of homology (Ghiselin 2005; Ramsey and Peterson 2012). Transitivity means that, if PA in organism A and PB in organism B are homologs, and PA is homologous to PC in organism C, then PB and PC must also be homologs. Assuming tree-like ancestry, such that there is only one path of ancestry between two terminal taxa, transitivity holds for maximal and consensus homology. This follows because under these concepts each phene in one generation has precisely one homolog in the generations on either side (parental and offspring). In the case of minimal homology, transitivity holds, but only if we modify the definition of transitivity to specify the DC gene conferring homology. If PA in organism A and PB in organism B are minimal homologs due to the sharing of a DC gene, and PA is homologous to PC in organism C due to the sharing of the same DC gene, then PB and PC must also be homologs. This provides yet another contrast with similarity-based homology concepts, which will generally fail to satisfy a transitivity criterion (Ghiselin 2005).
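The need to name the DC gene in the minimal case can be seen in a small sketch. It is a deliberate simplification and rests on assumptions not made in the text: each comparison is treated as a single link rather than a full path of ancestry, phenes are encoded as sets of DC gene labels, and the labels and function names are hypothetical.

```python
# Hypothetical phenes in organisms A, B, and C, each encoded as the set of
# equivalent DC genes (labels only) on which the phene depends.
PA = {"g1", "g2"}
PB = {"g1"}
PC = {"g2"}

def minimal_homologs(p, q):
    """Unqualified minimal homology: any shared DC gene will do."""
    return bool(p & q)

def minimal_homologs_as(p, q, dc_gene):
    """Minimal homology qualified by the DC gene that confers it."""
    return dc_gene in p and dc_gene in q

def transitive_unqualified():
    premises = minimal_homologs(PA, PB) and minimal_homologs(PA, PC)
    return (not premises) or minimal_homologs(PB, PC)

def transitive_for(dc_gene):
    premises = (minimal_homologs_as(PA, PB, dc_gene)
                and minimal_homologs_as(PA, PC, dc_gene))
    return (not premises) or minimal_homologs_as(PB, PC, dc_gene)

print(transitive_unqualified())                     # False
print(all(transitive_for(g) for g in PA | PB | PC)) # True
```

Here PA is a minimal homolog of both PB and PC, yet PB and PC share nothing, so the unqualified relation is not transitive. Once the DC gene is specified, the premises can only both hold when all three phenes depend on that same DC gene, and the conclusion follows.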

A third major implication of the DC approach relates to the phenomenon of serial or iterative homology. Many approaches to homology imply that serially repeated units within a single organism are homologous to one another in the same way that the units themselves are homologous between individuals. For example, it is commonly held that my fore and hind limbs are homologs in much the same way that my fore limbs are homologous to the wings of a chicken. The general thrust of the argument is that fore and hind limbs depend on the reutilization of largely (but not entirely) the same developmental program, in much the same way that shared genes make structures in different organisms homologous via continuity of information (Van Valen 1982; Roth 1984). How does the framework described here accommodate the phenomenon referred to as serial homology?

The use of one genetic program to generate multiple features of an organism implies that each structure is dependent on this program. This would mean that the “repeated” structures are all dependent on the same DC gene. However, this does not make the repeated structures homologs. On the contrary, this shows that the repeated structures are all parts of a single, more inclusive phene. For example, there is likely to be a phene in tetrapods, “limb,” that includes, nested within it, two other traits: “fore limbs” and “hind limbs.” This means we could consider the fore limbs of two different animals to be homologous, but does not justify considering the fore and hind limbs of one animal as homologous to each other. Instead, the fore and hind limbs of a single animal are best viewed as two aspects of a single inclusive trait, limbs.

A final general point is that my framework ensures that homologs are heritable (in the sense of “capable of being inherited”) over multiple generations. This is important because it explains why homologs can come to characterize clades of organisms and, consequently, why the sharing of putative homologs can provide prima facie evidence that a certain group of organisms forms a clade. On the other hand, this approach does not sit well with synonymizing homology and synapomorphy (e.g., Patterson 1982). Homology is here understood to be a pairwise relationship between the traits of different organisms, whereas synapomorphies are traits shared by a set of organisms, where that set manifests a derived state relative to the plesiomorphic state that is seen in another subset of organisms. My approach allows that plesiomorphic character-states shared by a pair of organisms may nonetheless be homologous.

8. Choosing among Developmental-Causal Homology Concepts

The preceding discussion has identified three developmental-causal definitions of homology: maximal, minimal, and consensus homology. These concepts are identical when each phene is associated with a single DC gene but, when there are multiple equivalent DC genes, a pair of traits judged homologous by one criterion might not be homologous under another. It behooves us, therefore, to consider whether any one concept is universally superior, or if two (or all three) might have value in certain contexts.
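How the three criteria can diverge for a single parent-offspring pair may be easier to see in a toy sketch. The encoding is an assumption made for illustration: each organism is a dictionary from hypothetical phene names to sets of equivalent DC gene labels, orthologous DC genes are assumed to share a label, and a pair with no shared DC gene is assumed not to count as consensus homologs.

```python
def maximal(parent_phene, offspring_phene):
    """All equivalent DC genes of the two phenes are orthologous."""
    return parent_phene == offspring_phene

def minimal(parent_phene, offspring_phene):
    """At least one orthologous DC gene is shared."""
    return bool(parent_phene & offspring_phene)

def consensus(parent_name, offspring_name, parent, offspring):
    """Each phene shares more DC genes with the other than with any rival."""
    p, o = parent[parent_name], offspring[offspring_name]
    shared = len(p & o)
    if shared == 0:  # assumption: no overlap means no consensus homology
        return False
    best_for_parent = all(
        shared > len(p & rival)
        for name, rival in offspring.items() if name != offspring_name
    )
    best_for_offspring = all(
        shared > len(o & rival)
        for name, rival in parent.items() if name != parent_name
    )
    return best_for_parent and best_for_offspring

# Hypothetical case: one parental phene with three equivalent DC genes;
# in the offspring, gene "c" has peeled off to cause a different phene.
parent = {"P": {"a", "b", "c"}}
offspring = {"Q1": {"a", "b"}, "Q2": {"c", "d"}}

for q in offspring:
    print(q,
          maximal(parent["P"], offspring[q]),
          minimal(parent["P"], offspring[q]),
          consensus("P", q, parent, offspring))
```

In this example P has no maximal homolog in the offspring, two minimal homologs (Q1 and Q2), and exactly one consensus homolog (Q1). With a single DC gene per phene the three functions would agree.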

The maximal homology concept ties phenes to exactly specified sets of DC genes. Within this framework traits are evolutionarily ephemeral, disappearing the minute there is a loss or gain of any of the DC genes associated with a multiply determined phene. Maximal homology implies that phenotypic evolution largely occurs via the disappearance of old traits and the appearance of new ones. Since one important motivation of homology is the idea of trait persistence through evolutionary time in the face of genetic turnover, maximal homology is unlikely to be of great utility for comparative biology.

The minimal approach to homology has immediate value as a way to communicate when two traits share orthologous DC genes, but could be considered too fine-grained. By associating traits with DC genes that can pass into or out of equivalency, phenes tend to split and merge over evolutionary time; one trait in species A can have multiple minimal homologs in species B. While some biologists will find this implication distasteful, others have already come to terms with such conclusions (e.g., Sattler 1984, 1988; Wake 1999). For example, the idea of mixed (or partial) homology has been found useful for exploring cases of developmental evolution through “transference of function” (Corner 1958; Baum and Donoghue 2002). Furthermore, we have a linguistic framework for handling situations of mixed homology. It is the norm to modify homology statements along these lines: “The wings of birds and bats are homologous as fore limbs.” The implication is that these traits are not homologous in some other context (e.g., as wings). The minimal homology approach suggests a more mechanistic version of such modifying clauses, along the lines: “phenes PA and PB are homologous as traits dependent on DC gene X.”

Minimal homology defines a challenging yet tractable research program. To validate a claim of homology between traits in two taxa we would first identify DC genes that cause these phenotypes. This might entail studies of mutants in both taxa or, possibly, inferences based on nearby taxa and gene expression data. Either way, the aim would be to identify sets of genetic factors required for the production of the two traits. Then we would use molecular evolutionary analysis to test the orthology of these causal genetic factors. Finally, we could use data from other related taxa, combined with ancestral state inference methods, to assess whether ancestral organisms had phenes caused by the same DC genes. That being said, while it might be practical to validate a minimal homology hypothesis, refuting such a hypothesis would be much harder. Analogous to Gould and Lewontin’s (1979) famous critique of the adaptationist program, a hypothesis of minimal homology could be indefinitely defended by claiming that there is some as-yet-unidentified, shared DC gene.

The consensus approach also has value. In particular, it provides a good way to think about how a phene that has many equivalent DC genes can persist as a single entity even if individual DC genes intermittently lose causal efficacy or peel off to cause different phenes. The consensus concept will be less useful in cases where phenes have very few equivalent DC genes, or when nearly equal subsets of equivalent DC genes separate en masse, but it is an unsettled empirical question how often these conditions will obtain. As a research endeavor, however, testing a hypothesis of consensus homology would be challenging. The DC genes that potentially cause each trait may be quite numerous, and each would require the same kind of analysis demanded by minimal homology. Nonetheless, because the focus of inference is quite well defined, this research program is at least viable and may become easier if genomic methods advance to the point where we can determine the relationships among the phenes caused by large numbers of potentially equivalent DC genes.

Altogether, it appears that both minimal and consensus homology have useful roles to play in enhancing clear communication and helping frame viable research programs. In contrast, given the typical baggage associated with the term homology, it is far from obvious that maximal homology has much to offer.

9. Areas for Further Work

Much further work on the developmental causal approach is certainly needed. This can be divided into three areas: philosophy, developmental theory, and empirical application. I will briefly summarize some issues needing attention in each of these three areas.

All three homology concepts depend on a counterfactual causal criterion, which is not without its complications (e.g., Pearl 2000; Godfrey-Smith 2010). I presented a thought experiment built around the question: What attributes of the organism would be absent if particular genetic factors were absent? An alternative formulation would be to ask: What aspects of the organism would be absent if such and such a genetic factor were different? That is, consider not just the possibility that genetic factors were absent, but also that they had different sequences. While this might usefully expand the information that could be captured in a DC map, it also seems to define an infinitely large and unwieldy space (similar to the genotype space of Stadler et al. 2001). We would need to consider every possible genomic configuration that could in any way yield a differently constructed organism. Indeed, since all pairs of living organisms are a finite number of mutational steps away from one another, this implies that we could find ourselves asking unsatisfactory questions such as: Which genetic factors explain the lack of wings in humans? Is the lack of wings in humans homologous to the lack of wings in crabs? This is a pathological outcome. There may be good philosophical grounds for defending the counterfactual reasoning laid out in this paper. Alternatively, we may find that counterfactual reasoning is best deployed to test a priori hypotheses rather than to ascertain trait boundaries. What the biologist needs is not a way to define abstract traits that could be homologs, but a reasonable way forward when assessing whether two perceived traits are indeed homologous.

In the area of developmental theory, more work is needed on the relationship between my conceptual model and the line of research pursued most prominently by Günter Wagner and collaborators (e.g., Wagner 1989; Wagner and Misof 1993; Wagner and Altenberg 1996; Wagner et al. 2000; Stadler et al. 2001; Wagner and Stadler 2003). Their approach proposes that the most obvious traits of organisms are those that are products of natural selection. The idea is that selection can act on the developmental process to build Character Identity Networks (ChINs), which are emergent features of developmental programs that individuate characters and explain character persistence through time (Wagner 2007). One possibility is that ChINs will, in my framework, correspond to phenes with large numbers of equivalent DC genes. If this were the case, the phenes in question would be more durable through evolutionary time. However, many questions remain. For example, when will selection favor greater numbers of equivalent DC genes? By providing a trait concept that does not invoke selection, my counterfactual causal approach might provide a framework for asking some fascinating new questions about how selection shapes the genotype-phenotype map.

Finally, and perhaps most critically, there is a need for the analysis of empirical data. There are many questions that can only be answered by looking at the biology. Do the familiar traits of organisms correspond to phenes and, if so, how many equivalent DC genes are they caused by? Over evolutionary time, how readily and by what molecular mechanisms do DC genes appear or disappear or change the phenes they cause? What is the impact of gene duplication and loss on assessments of homology? It is my hope that the conceptual framework laid out in this paper will stimulate the development of methods for mining existing genetic data as well as new methods for automated phenotypic analysis to identify DC genes and phenes in a high-throughput manner. Through such work, we will get to the point that statements such as “my middle finger nails are homologous to horses’ hooves” can be assessed by more than simply appealing to human perceptions of spatial correspondence.

Literature cited

  • Abouheif, E., M. Akam, W.J. Dickinson, P.W.H. Holland, A. Meyer, N.H. Patel, R.A. Raff, V.L. Roth, and G.A. Wray. 1997. Homology and developmental genes. Trends in Genetics 13: 432–433. doi:10.1016/S0168-9525(97)01271-7
  • Baum, D.A. and M.J. Donoghue. 2002. Transference of function, heterotopy, and the evolution of plant development. In: Developmental Genetics and Plant Evolution. Ed. by Q.C.B. Cronk, R.M. Bateman, and J.A. Hawkins. London: Taylor & Francis.
  • Boyden, A. 1943. Homology and analogy: A century after the definitions of “homologue” and “analogue” of Richard Owen. The Quarterly Review of Biology 18: 228–241. doi:10.1086/394676
  • Corner, E.J.H. 1958. Transference of function. Biological Journal of the Linnean Society 56: 33–40. doi:10.1111/j.1095-8339.1958.tb01706.x
  • Ghiselin, M.T. 2005. Homology as a relation of correspondence between parts of individuals. Theory in Biosciences 124: 91–103. doi:10.1007/BF02814478
  • Godfrey-Smith, P. 2010. Causal pluralism. In: The Oxford Handbook of Causation. Ed. by H. Beebee, C. Hitchcock, and P. Menzies. Oxford: Oxford University Press.
  • Gould, S.J. and R.C. Lewontin. 1979. Spandrels of San-Marco and the Panglossian paradigm: a critique of the adaptationist programme. Proceedings of the Royal Society B-Biological Sciences 205: 581–598. doi:10.1098/rspb.1979.0086
  • Haag, E.S. 2007. Compensatory vs. pseudocompensatory evolution in molecular and developmental interactions. Genetica 129: 45–55. doi:10.1007/s10709-006-0032-3
  • Hall, B.K. 2003. Descent with modification: the unity underlying homology and homoplasy as seen through an analysis of development and evolution. Biological Reviews 78: 409–433. doi:10.1017/S1464793102006097
  • Hubbs, C.L. 1944. Concepts of homology and analogy. The American Naturalist 78: 289–307. doi:10.1086/281202
  • Komosinski, M. and S. Ulatowski. 2004. Genetic mappings in artificial genomes. Theory in Biosciences 123: 125–137. doi:10.1016/j.thbio.2004.04.002
  • Lankaster, E.R. 1870. On the use of the term homology in modern zoology, and the distinction between homogenetic and homoplastic agreements. Journal of Natural History Series 4, Volume 6: 34–43. doi:10.1080/00222937008696201
  • Laubichler, M.D. 2000. Homology in development and the development of the homology concept. American Zoologist 40: 777–788. doi:10.1668/0003-1569(2000)040[0777:HIDATD]2.0.CO;2
  • Lynch, J.P. and K.M. Brown. 2012. New roots for agriculture: exploiting the root phenome. Philosophical Transactions of the Royal Society B-Biological Sciences 367: 1598–1604. doi:10.1098/rstb.2011.0243
  • Maynard Smith, J., R. Burian, S. Kauffman, P. Alberch, J. Campbell, B. Goodwin, R. Lande, D. Raup, and L. Wolpert. 1985. Developmental constraints and evolution. The Quarterly Review of Biology 60: 265–287. doi:10.1086/414425
  • Monteiro, A., J. Prijs, M. Bax, T. Hakkaart, and P.M. Brakefield. 2003. Mutants highlight the modular control of butterfly eyespot patterns. Evolution & Development 5: 180–187. doi:10.1046/j.1525-142X.2003.03029.x
  • Nanney, D.L. 1982. Genes and phenes in Tetrahymena. Bioscience 32: 783–788. doi:10.2307/1308971
  • Patterson, C. 1982. Morphological characters and homology. In: Problems of Phylogenetic Reconstruction. Ed. by K.A. Joysey and A.E. Friday. London: Academic Press.
  • Pearl, J. 2000. Causality: Models, Reasoning, and Inference. New York: Cambridge University Press.
  • Ramsey, G. and A.S. Peterson. 2012. Sameness in biology. Philosophy of Science 79: 255–275. doi:10.1086/664744
  • Roth, V.L. 1984. On homology. Biological Journal of the Linnean Society 22: 13–29. doi:10.1111/j.1095-8312.1984.tb00796.x
  • Sattler, R. 1984. Homology - a continuing challenge. Systematic Botany 9: 382–394. doi:10.2307/2418787
  • Sattler, R. 1988. Homeosis in plants. American Journal of Botany 75: 1606–1617. doi:10.2307/2444710
  • Stadler, B.M.R., P.F. Stadler, G.P. Wagner, and W. Fontana. 2001. The topology of the possible: Formal spaces underlying patterns of evolutionary change. Journal of Theoretical Biology 213: 241–274. doi:10.1006/jtbi.2001.2423
  • Stadler, P.F., S.J. Prohaska, C.V. Forst, and D.C. Krakauer. 2009. Defining genes: a computational framework. Theory in Biosciences 128: 165–170. doi:10.1007/s12064-009-0067-y
  • True, J.R. and E.S. Haag. 2001. Developmental system drift and flexibility in evolutionary trajectories. Evolution & Development 3: 109–119. doi:10.1046/j.1525-142x.2001.003002109.x
  • Van Valen, L.M. 1982. Homology and causes. Journal of Morphology 173: 305–312. doi:10.1002/jmor.1051730307
  • Vidyakin, A.I. 2001. Phenes of woody plants: identification, scaling, and use in population studies (an example of Pinus sylvestris L.). Russian Journal of Ecology 32: 179–184. doi:10.1023/A:1011310111062
  • Wagner, G.P. 1989. The biological homology concept. Annual Review of Ecology and Systematics 20: 51–69. doi:10.1146/annurev.es.20.110189.000411
  • Wagner, G.P. 1996. Homologues, natural kinds and the evolution of modularity. American Zoologist 36: 36–43. doi:10.1093/icb/36.1.36
  • Wagner, G.P. 2007. The developmental genetics of homology. Nature Reviews Genetics 8: 473–479. doi:10.1038/nrg2099
  • Wagner, G.P. and L. Altenberg. 1996. Complex adaptations and the evolution of evolvability. Evolution 50: 967–976. doi:10.2307/2410639
  • Wagner, G.P., C.H. Chiu, and M. Laubichler. 2000. Developmental evolution as a mechanistic science: the inference from developmental mechanisms to evolutionary processes. American Zoologist 40: 819–831. doi:10.1668/0003-1569(2000)040[0819:DEAAMS]2.0.CO;2
  • Wagner, G.P. and B.Y. Misof. 1993. How can a character be developmentally constrained despite variation in developmental pathways? Journal of Evolutionary Biology 6: 449–455. doi:10.1046/j.1420-9101.1993.6030449.x
  • Wagner, G.P. and P.F. Stadler. 2003. Quasi-independence, homology and the unity of type: a topological theory of characters. Journal of Theoretical Biology 220: 505–527. doi:10.1006/jtbi.2003.3150
  • Wake, D. 1999. Homoplasy, homology and the problem of ‘sameness’ in biology. In: Homology (Novartis Foundation Symposium 222). Ed. by G. Bock and G. Cardew. Chichester, UK: Wiley.

Notes

    1. Instead of conditioning narrowly on the current environment in which an organism lives, one could condition on the breadth of environments that a member of this species might plausibly encounter. This would probably be the most sensible course of action in cases where one is trying to understand the genotype-phenotype mapping for a whole population or species rather than for a single individual. The phenotype is then a norm of reaction.

    2. Some of these features may be absences. For example, a genetic factor that suppresses facial hair can be said to cause the feature “facial hair reduced.”

    3. In the idealized case we would even consider heritable features that are not encoded at the DNA sequence level, which is to say epigenetic factors.

    4. It might be reasonable in some contexts to limit this exercise to perhaps two or three levels (e.g., considering only the loss of up to three genetic factors). This would amount to focusing on a short enough timeframe that combinations of multiple mutations would be very unlikely to evolve. It is an empirical matter whether familiar traits would be successfully individuated in such truncated developmental causal maps.

    5. The term phene has received occasional usage in scientific publications to refer to a specific aspect of the phenotype (e.g., Nanney 1982; Vidyakin 2001; Komosinski and Ulatowski 2004; Lynch and Brown 2012). These prior uses are compatible with, if less precise than, the usage I am advocating here.

    6. It is an empirical question whether we will find cases in which a perceptually compelling trait is not associated with a phene but with the intersection (e.g., A ∩ B) or complement relation (e.g., A/B) of two (or more) phenes. In that case we could admit the possibility of two kinds of valid traits, phenes and paraphenes, with the former being directly individuated and the latter being delimited only by relations among phenes. It would be possible to extend the homology concepts developed here to paraphenes, but such an extension is inadvisable until we establish that a purely phene-based approach is problematic.

    7. An exception applies to those phenes that correspond to fitness (e.g., number of offspring), because this will eventually be important when or if a theory emerges that integrates developmental causation and evolutionary change by natural selection.

    8. Gene duplication will complicate the situation, but I set this issue aside here.

    9. In cases where A and B are connected by multiple paths of ancestry (e.g., because they are members of a single sexual population), phenes PA in A and PB in B correspond to homologous traits if there is at least one path of ancestry across which they satisfy the relevant definition of homology.

    10. In the case of sex-limited traits, the existence of a trait is only required of organisms of the appropriate sex. This allows, for example, a sex-limited trait of a male organism and his maternal grandfather to be considered homologs even though the intervening generation (his mother) lacked the trait. One way to think about this is to consider the sexes as alternative possible “environments” in which the organism could have found itself.

    11. Direct homology approaches may be useful for exploring developmental constraint or “burden,” wherein the role of some genetic factors is extremely stable through evolutionary time (e.g., Maynard Smith et al. 1985; Laubichler 2000).

    12. Formally: phenes PA and PB show maximal homology if and only if they are linked through a phene in each parent-offspring pair along the path of ancestry such that all the equivalent DC genes of parents and offspring traits are orthologous.

    13. Formally: phenes PA and PB show minimal homology if and only if they are linked through a phene in each parent-offspring pair along the path of ancestry such that parental and offspring phenes share at least one orthologous DC gene.

    14. Formally: phenes PA and PB show consensus homology if and only if they are linked through a phene in each parent-offspring pair along the path of ancestry such that parental phenes share more equivalent orthologous DC genes with the offspring phene than with any other phene in the offspring, and vice versa.

    Acknowledgments

    I would like to thank Casey Helgeson, William Saucier, Elliott Sober, Günter Wagner, and an anonymous reviewer for comments on the manuscript, and students in the Spring 2012 Evolution and Systematics Seminar (Botany 940), members of the UW-Madison Philosophy of Biology Reading Group, Jessica Flack, and David Krakauer for helpful discussion. Support from the John Simon Guggenheim Memorial Foundation is gratefully acknowledged.


    Copyright © 2013 Author(s).

    This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivs license, which permits anyone to download, copy, distribute, or display the full text without asking for permission, provided that the creator(s) are given full credit, no derivative works are created, and the work is not used for commercial purposes.

    ISSN 1949-0739