Transcribing and Collating for Digital Stemmatology. The Case of Troilus and Criseyde

This paper explores the decisions involved in a digital stemmatology project. Traditionally, transcription, collation, and the creation of a stemma have been processes linked to the edition of texts. However, stemmatology on its own can provide valuable insight to further the understanding of literary works, since it sheds light on their production process. The research on the textual tradition of Troilus and Criseyde offers the possibility to reconsider the difficulties and implications of a digital project of this sort. I focus particularly on transcription and collation. Transcription can be affected by aspects such as the availability of high-quality reproductions of manuscripts and early printed editions. Then, in order to produce useful transcriptions that serve the purpose of the project, the level of detail has to be established. It is important to find a balance between overwhelming detail and a scarcity that could result in unfruitful transcripts. In regard to collation, which is the process of identifying variants, I mention the characteristics and purpose of a base text. It is essential to understand what a variant is and how to work with variants so that reliable stemmata can be produced. Thus, I examine the case of two readings present in my research. By the end, I provide a brief example of how a phylogenetic tree can help us understand the distribution of variants and the relationships that the witnesses of Troilus and Criseyde bear. With that, I also hope that the usefulness of digital stemmatology is made evident.

Keywords: transcription; collation; stemmatology; phylogenetics; Chaucer; Troilus


Introduction
Transcribing and collating are chores that generally precede the making of a critical edition. A critical edition should represent a theory about a work, as argued by many textual scholars (Pugliatti 1998, 163; Cerquiglini 1999, 80). However, in the digital age of editing, differing understandings of how best to take advantage of the tools that are in reach, as well as the general state of textual scholarship, have opened diverse options about how best to further our understanding of works, texts, and documents. Should one take advantage of every single aspect of them? In my research on the textual tradition of Troilus and Criseyde, transcription and collation have been critical aspects. The end goal, however, is not to produce an edition, but to add to the scholarship we have on the relationship that the eighteen witnesses of Geoffrey Chaucer's work bear through digital stemmatology. This does not mean that an edition could not take place, but the purpose of this paper is to reflect and comment on how this particular goal shapes the processes of transcription and collation. In that sense, this text serves both as a rationale and a place to meditate on the implications that these activities bring to a project of digital recension. Recensio specifically refers to the analysis of the variants of all the extant manuscripts as well as the affiliation of the later ones (Blecua 1983, 31).
Troilus and Criseyde was finished around 1386. This situates the poem in a crucial time in Chaucer's life, since as Paul Strohm states: "Chaucer was a man of literary accomplishment, standing on the brink of his decision to write the Canterbury Tales" (Strohm 2016, 184). Strohm refers to Troilus and Criseyde as his "first undoubted masterpiece" (Strohm 2016, 186) and believes that the body of work he had produced by then, at the age of 43, was "more than sufficient to establish him as the most eminent English writer before Shakespeare" (Strohm 2016, 186). Let us acknowledge that "Chaucer had a typically medieval attitude toward completion, in which works were valued even in fragmentary form, but this is the major poem that he did complete, and triumphantly so" (Strohm 2016, 189). Troilus and Criseyde is Chaucer's most important complete poem. The circumstances around him were also changing, as new patterns of manuscript production and of circulation of vernacular texts emerged, plus "Chaucer's literary contemporaries were beginning to think differently about claiming credit for their work" (Strohm 2016, 185). Regarding Chaucer, Strohm believes that although he was fame worthy, by 1386 he was "certainly not famous yet" (Strohm 2016, 189).

Vazquez: Transcribing and Collating for Digital Stemmatology. The Case of Troilus and Criseyde Art. 7, page 4 of 34
This was about to change. The final section of Troilus and Criseyde "registers the first signs of a shift; that poem's conclusion expresses a welter of confused and disturbing new literary ambitions and desires" (Strohm 2016, 185). Chaucer worries about the reproduction and performance of his poem because he wants to establish himself as an author (autoritas) among "Uirgile, Ouide, Omer, Lucan and Stace" (Chaucer 1984, V, 1792). For all of these reasons, it is of the utmost importance to approach this poem with new eyes and new goals. To work with Chaucer's Troilus and Criseyde and digital tools offers the opportunity to create a stemma or stemmata for different sections of the work. This is of great significance since, as of today, there is no stemma for the witnesses of Chaucer's poem. To survey the text of the manuscripts and early printed editions with digital tools and to create a stemma is a definite step forward in the understanding of the textual tradition of Troilus and Criseyde and of Chaucer's literary production, and it opens possibilities for further projects.
As for the rationale of this work and its position within the context of other Middle English editing projects: it is close to The Canterbury Tales Project, on which I worked and from which I learned to work with large textual traditions. The strategies for transcription, collation, and the use of phylogenetic analysis are heavily inspired by that project. It also agrees, more broadly, with Manly and Rickert's efforts to study the witnesses of The Canterbury Tales, as they state in the second volume of their edition:

The purely mechanical procedure of collation will have resulted in groupings of MSS according to their readings without reference to whether the readings are correct or incorrect. These are all, prima facie, merely variational groups. Many of them are also in fact genetic groups […] The test of a genetic group, as distinguished from a mere accidental grouping is that the same sigils [these represent manuscripts in their cards] should appear together persistently and consistently. (Manly et al. 1940, 20)

The collation and analysis of Troilus will reveal groups, and their reoccurrence will reveal patterns that suggest whether these groups have a genealogical relation. Following this train of thought, this project does not share George Kane's conclusions on recension and its uselessness. Kane and Donaldson (1988) argued the following in the introduction to the B version of Piers Plowman:

In this situation lodges the ultimate absurdity of recension as an editorial method: to employ it the editor must have a stemma; to draw the stemma he must first edit his text by other methods. If he has not done this efficiently his stemma will be inaccurate or obscure, and his results correspondingly deficient; if he has been a successful editor he does not need a stemma, or recension, for his editing. (Kane and Donaldson 1988, 17, note 10)

Since the aim of this project is not to produce a critical text, but to shed light on the relationship that the witnesses bear, this criticism of the base text does not necessarily apply to this project. However, the methodological basis of the work I present here fundamentally disagrees with Kane's statement. Kane and Donaldson qualify recension as absurd because they portray the process as circular. They remind us that in order to collate the witnesses, a base text must be used. If the base text has scribal variation, the agreements in original readings would not be genetic. To avoid that, a base text with exclusively original readings is needed; then all the variants would be scribal. When several witnesses attest the same variant, presumptions of genealogy can be made. To obtain this ideal base text, the editor would have to create one by choosing the best manuscript and then proceed to remove the unoriginal readings with the variants offered by the rest of the manuscripts, relying on his knowledge of the usus scribendi of both author and scribes. If this was done successfully, the base text would be the final text and the stemma would no longer be necessary as an editorial tool.
Kane and Donaldson's argument against recension rests on the belief that originality must be detected at the beginning of the process. This is not necessarily true, since agreement in variance can show allegiance and statements of originality can be made after the examination of attestation. The process could be done manually, but technology has allowed for progress. With the use of digital tools there is no need for any a priori judgment of originality and, as will be seen further on, the stemma itself does not need a root or a center to show how the witnesses relate to each other.
Although the base text does not need to be free of scribal variation as Kane suggests, the role it plays is critical. It should not be confused with a best text in the tradition of Bédier (1928), nor with a copy-text in Greg's tradition (1950). The base text should allow for comparison between the witnesses. Let us keep in mind that Troilus and Criseyde is a poetic work organized in books, stanzas, and lines. Therefore, the base text must include all the possible books, stanzas, and lines that the text of all the witnesses present, or at least of the excerpts that are being analyzed. The base text is used as a reference and will make it possible to show the witnesses' agreements in variation. Its other purpose is to guide the transcription of the texts.
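The coverage requirement described above can be pictured with a minimal sketch. It assumes that each witness is reduced to the set of (book, stanza, line) references it contains; the sigla W1 and W2, their contents, and the function names are hypothetical illustrations, not data or code from this project.

```python
from typing import Dict, List, Set, Tuple

# A line reference: (book, stanza, line). All values below are invented
# for illustration; they are not line counts from any real witness.
LineRef = Tuple[int, int, int]


def base_text_skeleton(witnesses: Dict[str, Set[LineRef]]) -> List[LineRef]:
    """Return every line reference attested by at least one witness, in order."""
    covered: Set[LineRef] = set()
    for refs in witnesses.values():
        covered |= refs
    return sorted(covered)


def missing_from(witness: Set[LineRef], skeleton: List[LineRef]) -> List[LineRef]:
    """List the references of the skeleton that a given witness lacks."""
    return [ref for ref in skeleton if ref not in witness]


# Two hypothetical witnesses: W2 lacks one line and adds a stanza.
witnesses = {
    "W1": {(1, 1, 1), (1, 1, 2), (1, 1, 3)},
    "W2": {(1, 1, 1), (1, 1, 3), (1, 2, 1)},
}
skeleton = base_text_skeleton(witnesses)
```

Because the skeleton is the union of what the witnesses attest, a line present in only one witness still receives a slot against which the others can be compared.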
In this particular case I was fortunate to find an XML document with Barry Windeatt's text for his Longman edition of Chaucer's work thanks to the Oxford Text Archive (OTA) (Chaucer 1984). The XML document had numbered lines, stanza division, book division, and indications of incipits and explicits, which saved a considerable amount of time since I did not have to number the lines manually and structure the text according to the levels of the work: book, stanza, line. However, the OTA XML text did not have numbered stanzas, page breaks were relative to the printed version, and some characters like yoghs ("ȝ") were not recorded. Regardless, it was a good start for a base text since it could be compared with the witnesses. Accordingly, it could be modified to fit the number of stanzas or lines per folio and then guide the transcription. Although it is very important to talk about the base text in regard to collation, I will first explore the process of transcription and then turn to collation in the second part of this text.
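Adding stanza numbers to a file that already has book and line numbers can be sketched as follows. This is only an illustration under stated assumptions: the element and attribute names (`div type="book"`, `lg`, `l`, `n`) are hypothetical, TEI-like stand-ins, since the actual markup of the OTA file is not described here, and the sample lines are placeholders rather than Chaucer's text.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: books as <div type="book">, stanzas as <lg>, lines as <l>.
# Lines carry numbers already; stanzas do not.
SAMPLE = """<text>
  <div type="book" n="1">
    <lg><l n="1">first line</l><l n="2">second line</l></lg>
    <lg><l n="8">eighth line</l></lg>
  </div>
  <div type="book" n="2">
    <lg><l n="1">another line</l></lg>
  </div>
</text>"""


def number_stanzas(root: ET.Element) -> ET.Element:
    """Give every stanza an n attribute, restarting the count in each book."""
    for book in root.iter("div"):
        if book.get("type") != "book":
            continue
        for count, stanza in enumerate(book.findall("lg"), start=1):
            stanza.set("n", str(count))
    return root


root = number_stanzas(ET.fromstring(SAMPLE))

# Each line can now be addressed by the three levels of the work.
refs = [
    (book.get("n"), stanza.get("n"), line.get("n"))
    for book in root.iter("div")
    for stanza in book.findall("lg")
    for line in stanza.findall("l")
]
```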

Transcription
The process of acquiring images of the manuscripts and printed editions requires a separate discussion. For now, let it be said that one would like to think that we are in an era in which manuscripts are digitised and made available for everyone. But as we know, as much as this is the direction in which libraries are going, it is not applicable […] be seen on the websites of the institutions that host them. The rest of the manuscripts remain unavailable online. It is possible to photograph the ones that can be found at the Bodleian Library as long as the pictures are for research purposes only. For some manuscripts such as Cambridge University Library Gg. 4.27, it is possible to look at a facsimile; for others such as Durham University Library Cosin MS V.ii.13, the only possible way to get images is to pay for the digitization of a microfilm reel from the Hill Museum & Manuscript Library. Needless to say, it is not easy for a graduate student to get hold of all of these, but it should also be noted that it is getting easier.
The fact that images come together with transcription has, at the very least, two implications that should be considered. The first one is that the quality of the images has a direct impact on the transcription. Obvious as this may be, a poor-quality image can hide valuable information. From now on, I will refer to witnesses according to their sigils (see appendix). An example of this is lines 793 and 795 of Book ii in H4. While I was using images from a digitized microfilm, my transcription of lines 793 and 795 read "The treson -/that to women ay is do […] Or where becomyth it/whan it is go." At first glance, that is not inaccurate and, as far as I can tell, it represents the scribe's intention, but it is hard to tell from that image that the scribe wrote something different and then corrected it. It is not hard to tell, however, if one looks at the high-resolution image on The British Library Digitised Manuscripts (BLDM 2020a, H5). Initially, the scribe had written doon and goon, but this was later erased. While it could be stemmatically irrelevant, this piece of information can affect our perception of scribes and their rigor, or at least of this particular one. The previous images did not represent this aspect of the text in enough detail, and it was easily missed.
Another example of how better images improve the work is line 1127 of Book ii in H5. The Riverside Chaucer reads "He may nat longe liven for his peyne" (Chaucer 2008, 504). The line in H5 is "He may not longe lyue in þs lango r for his payne"; however, this is only noticeable in the latest images uploaded to the British Library (BLDM 2020b, "Harley MS 4912," 31r). The reproductions that I was using previously, which came from a microfilm, rendered such subtle traces unreadable. Therefore, the superscript þs was omitted from my transcription and his was not struck out. It is not possible for me to reproduce the aforementioned images so that readers can reach their own conclusions. However, I can cite Windeatt's critical apparatus, which coincidentally did not take into account the same corrections that I overlooked.
I cannot assume that he was working with microfilm images instead of the manuscript itself, but his notation for line 1127 is interesting: "may] ne may Gg/for his] in thys Cx/his] the J/lyuen] lyve in langour H5" (Chaucer 1984, 213). According to Windeatt's edition, H5 reads "He may not longe lyve in langour for his peyne." This misses out completely on the interlinear corrections that coincide with Caxton's in thys. It also ignores that his should be omitted. Various authorities suggest that only one hand is involved in the inscription of this manuscript. Failure to recognize this variant means that information with possible stemmatological relevance is lost.
Only Caxton's printed version and H5 include that reading. It is clear that filiation cannot depend on a single variant and that polygenesis could also explain the evidence. Regardless, poor-quality images should not determine our understanding […]

The second implication that we should reflect on is the meaning of text.
In her article "The Texts We See and the Works We Imagine," Bárbara Bordalejo defines text as "all the meaningful marks on the page made by someone with the intention of communicating something" (Bordalejo 2013, 67). As part of the "text of the document," she includes "any indications as to which text might be considered erased or what needs to be included, marks that suggest a change in order or any other meaningful signs on the page" (Bordalejo 2013, 67). I agree with this definition, since it is broad enough that it could include layout and gives place to think about variant states of the text, which is the issue Bordalejo explores in this article. Yet one needs to ask as a transcriber: what is meaningful? What would then be meaningless?
Both interpretation and purpose play a critical role in responding to these questions.
The purpose of the editorial work needs to be kept in mind. In my particular case, the readers who will have access to my transcription will also be able to see a visual reproduction of the folio that has been transcribed. This is not new by any means. On the left side of the screen, one would find an image of a document (or part of it), and on the right side, the transcription of the text contained in that document. In his book La edición de textos, Miguel Ángel Pérez Priego states that, for a diplomatic edition, "it is mandatory to keep graphic signs such as long s, sigma, the tironian note, abbreviations (they can be expanded as long as they are in italics or in square brackets), even punctuation as long as it reveals a particular and interesting practice" (Pérez Priego 1997, 43). Nevertheless, in this context, my transcription does not intend to fulfill the requirements of an ultra-diplomatic edition. This is not due to the unfaithfulness of my transcription or to a lack of rigor; instead, my transcription is meant to help the reader make sense of the inscriptions that they can […]

A dialectical tension between transcription and the spelling of medieval works arises. This tension is not necessarily due to the graphical differences between them, because readers understand that scribes and editors deal with different material supports which imply different capabilities. The tension has more to do with the logic behind each; it has to do with how we choose to represent letters and words, and it has to do with how we encode text. As literate individuals from the 21st century, one of the conventions that informs our habits of reading and writing is that of standardized spelling. This means that we all operate on the assumption that there is one correct spelling for each word, which turns all other variants into incorrect or misspelled words. This impulse is inherited from the tendency of print culture to standardize spelling.
As Marshall McLuhan states in The Gutenberg Galaxy, "Manuscript culture had no power to fix language or to transform a vernacular into a mass medium of national unification" (McLuhan 1962, 229); there are "two questions directly related to the printed form of any language at all, namely the drive for fixity of spelling and grammar" (McLuhan 1962, 229). The material conditions in which scribes and typesetters work are completely different. The typesetters, in their task to represent what they see on the page, were limited to a finite number of characters in a context in which, as Frederick H. Brengelman states, "no entirely redundant characters should be tolerated" (Brengelman 1980, 345). In contrast, scribes had an array of strokes to represent entities that today we represent in one particular way.
There is a possible analogy to be made with phonemic and phonetic transcriptions. A phonemic transcription is an abstraction that represents, regardless of the actual utterance of any particular speaker and redundant as it may be, a succession of phonemes, which are "the smallest sound unit in a language. Meaningless in themselves, phonemes are the building-blocks of language. Changing one for another changes the meaning of a word, as with /p/ and /b/ in pat and bat" (Chandler and Munday 2020). In this sense, a change in a phonemic transcription could cause the representation of different items; similarly, in print, a spelling mistake might misrepresent what was intended or might represent something entirely different. Meanwhile, a narrow phonetic transcription will "show whatever differences between sounds can be perceived, regardless of whether they are distinctive in the language represented" (Matthews 2014). If two speakers utter the same phrase, it is likely that a narrow phonetic transcription would show different signs if the speakers used allophones, which are "a difference in sound within a language that does not produce a difference in meaning" (Calhoun 2002), but the phrase would still have the same meaning. If both transcriptions were translated to a phonemic one, they would be identical. The same can be said of two scribes who copy the same line but transcribe it in different ways. It may have the same meaning and they might transcribe the same words but with different characters because they might represent the phonetic choices of their context, among other reasons. According to Febvre and Martin, an aspect of spelling in the transition between manuscript and print is that it "came to correspond less and less with pronunciation" (Febvre and Martin 1976, 319). It gradually stopped representing dialect-specific uses in favor of consistency.
To complement these ideas, it is also useful to reflect on what Robinson and Solopova (1993) refer to as the level of transcription. In his article "Graphemic Analysis of Late Middle English Manuscripts," W. Nelson Francis characterizes his work as graphetic in relation to graphemic transcriptions in the following way: "This is a graphetic, not a graphemic, transcription; that is, it records every detail whether or not it has linguistic significance. It corresponds methodologically to the linguist's narrow phonetic transcription" (Francis 1962, 36). In this specific kind of graphetic transcription, each graph-type is assigned a number. In the case of groups, essentially graph-type variants, they "use the same number and add distinguishing letters" (Francis 1962, 36).[1] Regardless of representation, this definition matches Robinson and Solopova's (1993): "graphetic: every distinct letter-type is distinguished (as: r 'short' is transcribed apart from r 'round' and r 'long descender', etc.)." Francis also indicates where the graphemes of a graphemic transcription come from: "A finite and smaller […]

[1] To go into more detail seems unnecessary, but it is interesting to see that a graphetic transcription of the kind that W. Nelson Francis refers to looks like this: #23b|10 6a||23a|28|23a||9b 19||9b 9c 4||16a 4 9c 9 14c 27||9d 9d 9d||6 ∧ 7e||9c 19c|| 12| 4 | 5a| 16a 2b||10 6c||14c | 13 14b#. The transcribed line is "Þat þ ̉ is no louer in th is world at ese."

[…] According to the rhyming scheme of the royal stanza, the first and the third lines should rhyme. We notice that this is not the case with H5 ("Harley MS 4912," 4r). In the manuscript, the final h has a stroke on top that crosses the ascender. Should that be considered to be a mark that means nothing in line 225 but restores the g in 227?
This same line has no shortage of h (thyng, hadde, had, suche), but none of them have a crossed ascender. Most likely, the crossed h means nothing, and the scribe made a mistake. Therefore, I transcribed all the occurrences of h as a simple h. The Late Medieval English Scribes Project has a section dedicated to h in their description of this hand, and their analysis agrees with mine. There is no inclusion of this particular form of h, which may mean that the team did not consider it worthy of any clarification, ignored it because their instinct was not to consider it significant, or just ignored it as an honest mistake.
Is it then safe to state that within the textual tradition of Troilus and Criseyde, any h with a stroke across the ascender should be considered as an ordinary h? Not according to H4 ("Harley MS 2392," 30r), R (there is no digital reproduction of Rawlinson Poet. 163), and Ph ("MssHM 114," 217r). In line 235 of Book ii, H4 and Ph read "That to myn ħtis […]," and the rest of the manuscripts do not abbreviate and have a version of hertes. It is obvious that the stroke in that specific context is abbreviating "er." Thus, an exception has to be made in order to record this use of the h with a stroke across the ascender. The transcriber always has to be vigilant; any stroke is potentially meaningful.
Not every case of potential abbreviation is as clear. As anyone who has transcribed knows, dealing with macrons and minims is problematic if one chooses to expand abbreviations. Words like wōman could be expanded to womman, but is that necessary? Here, the decision to expand the macron is the responsibility of the transcriber, but transcribers, like scribes, change over time. I allow the transcriber some flexibility. It is possible to produce different types of transcriptions. A "diplomatic" transcription seen in Figure 1 will show the text with no expansions, while an "edited" version in Figure 2 shows the expanded […]

Transcription is an informed suggestion of what is significant based on the experience of looking at manuscripts over and over. A platonic fetishist, whom I might as well be making up for the sake of the argument, could consider that every step from the original to the edited transcription leads to a loss in fidelity because each mediation is further removed from the original, from the Truth. I argue that every step is also the product of agents engaging with the artifact to facilitate its understanding. Thus, while each mediation is in no way a substitute for the manuscript, each has its own benefits.
The adequacy of transcriptions will change over time depending on the needs of the ever-developing interpretative communities. For now, let us hope that our work is useful.
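One way to picture how a single transcription can yield both a "diplomatic" and an "edited" view is the following minimal sketch. It assumes that every abbreviation is stored together with the transcriber's proposed expansion; the class and function names are hypothetical and do not describe the encoding actually used in this project. The two sample words come from the discussion above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Abbrev:
    written: str   # the sign as it appears on the page
    expanded: str  # the transcriber's proposed expansion


# A word is a sequence of plain strings and abbreviation records.
Segment = Union[str, Abbrev]


def diplomatic(word: List[Segment]) -> str:
    """Render the word with abbreviations left as written."""
    return "".join(s.written if isinstance(s, Abbrev) else s for s in word)


def edited(word: List[Segment]) -> str:
    """Render the word with every abbreviation expanded."""
    return "".join(s.expanded if isinstance(s, Abbrev) else s for s in word)


# The macron in "wōman" and the crossed h in "ħtis", as discussed above.
woman = ["w", Abbrev("ō", "om"), "man"]
hertis = [Abbrev("ħ", "her"), "tis"]

print(diplomatic(woman), edited(woman))
print(diplomatic(hertis), edited(hertis))
```

Keeping the written sign and the expansion side by side means that a later transcriber can revise an expansion without touching the record of what is on the page.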

Collation
For this project, it is necessary to keep in mind that transcription is a step towards collation. In order to compare the texts of witnesses, it is important to regularize orthographic variants so as to obtain substantive variants instead of accidentals. These are the terms coined by W.W. Greg:

A distinction between the significant, or as I shall call them 'substantive', readings of the text, those namely that affect the author's meaning or the essence of his expression, and the others, such in general as spelling, punctuation, word-division, and the like, affecting mainly its formal presentation, which may be regarded as the accidents, or as I shall call them 'accidentals' of the text. (Greg 1950, 21)

Here is where a note about the base text is required. The base text has to be modified so that it can always be compared to whatever evidence the witnesses offer. Its purpose is to show the relationships between the witnesses. Thus, when witnesses add a line or a stanza that is not present anywhere else and might not be authorial, the base text needs to account for that evidence so that it can show the agreement of those two witnesses. As stated before, the OTA XML text of Windeatt's edition was […]

The fact that he uses a "purely arbitrary" text as base also coincides with J. Froger's (1970) notion that any text could serve as a false original to show the relations between witnesses. What seemed to be wishful thinking by Froger (1970) was achieved by Robinson some twenty years after.
Since the final purpose is to generate stemmata, it is also convenient to make a final note about transcription. If a word has an abbreviation mark that was not expanded, that will not affect the final stemma. Line 184 of Book i works to illustrate this: the final word is down; H2, H5, R, and S1 read doun -/down -, and the final word in H4 is encoded as […]
In Figure 3, we can see the apparatus for the phrase. Since every spelling variant of down has been regularized, as seen in Figure 4, they will all count as the same reading. That will allow the collation to focus on substantive variants. To perform this task I used the ITSEE-Birmingham Collation Editor (Smith 2019) that "provides with specific tools: […]

[…] the interplay between the reworkings of literary texts and their oral performance and dissemination is palpable in many works, while in others the scribal intervention and copying dynamics play a preponderant role compared to the oral delivery. The latter seems to be the case of Troilus, or at least it is that aspect on which this project focusses.
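The effect of regularization described above, where every spelling of down counts as one reading, can be sketched in a few lines. This is not the Collation Editor's interface, which is not reproduced here; the table, the function names, the sigla W1 to W3, and the spelling "doune" are hypothetical and serve only to illustrate the principle.

```python
from typing import Dict, List

# Hypothetical regularization table: spelling as transcribed -> regularized form.
# "doun" and "down" come from the example above; "doune" is invented.
REGULARIZE: Dict[str, str] = {"doun": "down", "doune": "down", "down": "down"}


def regularize(token: str) -> str:
    """Map a transcribed spelling to its regularized form (lowercased)."""
    key = token.lower()
    return REGULARIZE.get(key, key)


def readings_at(tokens_by_witness: Dict[str, str]) -> Dict[str, List[str]]:
    """Group witnesses by their regularized reading at one collation point."""
    groups: Dict[str, List[str]] = {}
    for siglum, token in sorted(tokens_by_witness.items()):
        groups.setdefault(regularize(token), []).append(siglum)
    return groups


# Hypothetical witnesses with three spellings of the same word.
tokens = {"W1": "doun", "W2": "down", "W3": "Doune"}
groups = readings_at(tokens)
```

After regularization the three spellings collapse into a single reading, so they contribute no variation to the later analysis; only a different word at this point would open a second group.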
There is a debate regarding what exactly constitutes a variant. For the purposes of this project, the useful variants are what Bordalejo calls "stemmatically significant" variants (Bordalejo 2002, 96). In order to understand what that means, it is important to know some characteristics of both significant and non-significant variants.

Benjamin Salemans lists some characteristics of non-significant variants in his Building Stemmas with the Computer in a Cladistic, Neo-Lachmannian, Way (2000): […]

k. Archaic words. Many copyists will use a more contemporary word when confronted with an archaic word. Therefore, there is considerable chance that copyists working with minimally related exemplars introduce the same more modern word in their copies. The occurrence of the same more contemporary word in the text versions is not due to equal descent, but by diachronic change of language: they are not genealogical variants but parallelisms.
Learning how to distinguish between significant and non-significant variants will affect the collation. A big concern with collation is that no one wants to alter the process so that it reflects the editor's assumptions. Yet, as much as one would like to draw hard lines to guide the task, variants frequently push those limits.
Let us consider the next couple of examples.
In line 768 of Book ii, the Riverside Chaucer reads: "A cloudy thought gan thorugh hire soule pace" (Chaucer 2008, 499). Some witnesses, like A, Cl, Cp, D, Gg, H1, H5, J, and S1, indeed read soule. Some others, like Cx, H2, H3, H4, R, Ph, and S2, read heart, and only Dg reads thought. At first glance, these variants are substantive, but they fit […]

This is not new; these are Aristotelian notions of the soul. Accordingly, Bartholomaeus explains that "the vertue vitall, that giueth lyfe to the bodye, whose foundation or proper place is the heart" (Anglicus 1584, iii, 15). Ioan P. Culianu states that:

The doctrines expounded by Bartholomaeus were based on the idea prevalent in Arab medicine that the heart is the unique generator of the vital spirit which, once it has reached the brain, is called sensitive. The messages of the five "external" senses are transported by the spirit to the brain, where the inner or common sense resides. (Culianu 1987, 11)

And finally, according to Mary Carruthers in her Book of Memory:

[E]ven though the physiology of consciousness was known to occur entirely in the brain, the metaphoric use of heart for memory persisted […] The Middle English Dictionary records an early twelfth-century example of herte to mean "memory"; there is an Old English use of heorte to mean "the place where thoughts occur," cogitationes. (Carruthers 2008, 59)

Memory is an intellective process. An oversimplified version of the procedure is that the soul moves from the heart to the brain to produce thoughts. This brief excursus through medieval notions of anatomy sufficiently explains why, in the line "A cloudy thought gan thorugh hire soule pace," heart and thought could be synonyms of soule.
To Salemans' credit, the fact that thought is present in only one manuscript could indicate that it is the product of scribal resolution and, regardless of its presence in seven witnesses, the same applies to heart. However, even if scribal substitution explains the origin of the variants, it is hard to believe that this necessarily rules out vertical transmission in the case of heart and soule. I consider these two readings to be stemmatically significant variants. With this in mind, let us examine the next example.
A few lines further on, line 771 reads: "That thought was this: 'Allas! Syn I am free.'" Two readings are present in the witnesses that at first glance are accidentals: syn and sith. It turns out that A, Cl, Cp, Dg, H1, H5, R, and S2 read syn, while Cx, Gg, H2, H3, H4, J, Ph, and S1 read sith. This point is better illustrated by  What this means is that we do not need to identify the original reading before carrying out the analysis, as traditional stemmatics does. Following the example of Shaw and Robinson, I use "the default criterion, maximum parsimony,
The phylogenetic analysis of the first hundred lines of Book i, seen in Figure 5, shows that the witnesses that read soule (in a red circle), with the possible exception of H5 and Gg, are closely related. On the other hand, the ones that read heart (in a blue circle) occupy more scattered positions in the unrooted tree: H2, H4, and Ph are closely related, and the same goes for H3, R, and Cx, but S2 does not fit. Two things must be considered. The first is that heart could be a coincidence in S2, but it is indicative of vertical transmission for H2, H4, and Ph on the one side, as well as for H3, R, and Cx on the other. The second is that an analysis of Book ii needs to be done to assess these particular variants. There might be changes of filiation, and what holds for the first hundred lines of Book i could have changed. But the full transcription and collation of the excerpts will allow me to test any particular variant and try to make sense of the conditions that gave rise to variation.
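What the parsimony criterion rewards can be illustrated with a minimal sketch. It is not the heuristic search used to produce Figure 5. It only applies Fitch's counting procedure to the two readings discussed above (ii.768 and ii.771) for four witnesses, and compares two hypothetical candidate groupings. The tree shapes and the function names are assumptions made for the illustration.

```python
# Readings at Book ii, lines 768 and 771, as reported above for four witnesses.
READINGS = {
    "A": ("soule", "syn"),
    "Cl": ("soule", "syn"),
    "H2": ("heart", "sith"),
    "H4": ("heart", "sith"),
}
N_SITES = 2


def fitch(tree, site):
    """Return (candidate readings at this node, changes needed below it)."""
    if isinstance(tree, str):  # a leaf is a witness siglum
        return {READINGS[tree][site]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, site)
    right_set, right_cost = fitch(right, site)
    shared = left_set & right_set
    if shared:  # the two branches can agree: no extra change
        return shared, left_cost + right_cost
    # the two branches disagree: one change of reading is required
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree):
    """Total number of reading changes the tree requires over all sites."""
    return sum(fitch(tree, site)[1] for site in range(N_SITES))


# Two hypothetical groupings (NOT the trees of Figures 5 and 6).
tree_by_reading = (("A", "Cl"), ("H2", "H4"))
tree_mixed = (("A", "H2"), ("Cl", "H4"))

print("grouped by shared readings:", parsimony_score(tree_by_reading))
print("mixed grouping:", parsimony_score(tree_mixed))
```

The grouping that keeps witnesses with shared readings together requires fewer changes, which is why maximum parsimony prefers it. Nothing in the count depends on knowing which reading is original.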
Very similar results can be seen when using the likelihood criterion (see Figure 6) with a neighbor joining analysis which:
Figure 5: Phylogenetic tree using the parsimony criterion and a heuristic search of the first 100 lines of Troilus and Criseyde with 18 witnesses. The witnesses that read soule in a red circle, the witnesses that read heart in a blue circle.
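Neighbor joining is a distance-based method, so it starts from a table of pairwise distances between witnesses. The sketch below shows one way such distances could be derived from collated readings, using only the two readings of Book ii discussed above (ii.768 and ii.771). The distance measure (the proportion of shared loci at which two witnesses disagree) is an assumption made for the illustration, not the measure behind Figure 6, and two loci are far too few for a real analysis.

```python
from itertools import combinations

# Witness lists exactly as reported above; D is not listed for ii.771.
LOCI = {
    "ii.768": {
        "soule": ["A", "Cl", "Cp", "D", "Gg", "H1", "H5", "J", "S1"],
        "heart": ["Cx", "H2", "H3", "H4", "R", "Ph", "S2"],
        "thought": ["Dg"],
    },
    "ii.771": {
        "syn": ["A", "Cl", "Cp", "Dg", "H1", "H5", "R", "S2"],
        "sith": ["Cx", "Gg", "H2", "H3", "H4", "J", "Ph", "S1"],
    },
}


def build_matrix(loci):
    """Turn reading -> witnesses lists into witness -> {locus: reading}."""
    matrix = {}
    for locus, readings in loci.items():
        for reading, sigla in readings.items():
            for siglum in sigla:
                matrix.setdefault(siglum, {})[locus] = reading
    return matrix


def distance(a, b, matrix):
    """Proportion of loci attested in both witnesses where they disagree."""
    shared = [locus for locus in matrix[a] if locus in matrix[b]]
    if not shared:
        return None  # no common evidence, so no distance can be stated
    differing = sum(matrix[a][locus] != matrix[b][locus] for locus in shared)
    return differing / len(shared)


matrix = build_matrix(LOCI)
distances = {
    (a, b): distance(a, b, matrix)
    for a, b in combinations(sorted(matrix), 2)
}

for pair in [("A", "Cl"), ("A", "Gg"), ("A", "H2"), ("R", "S2")]:
    print(pair, distances[pair])
```

A witness that lacks a locus, as D does here for ii.771, is compared only on the loci it shares with the other witness, so a missing reading is not counted as a disagreement.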
It is noticeable that the witnesses that read soule are closely related and their relationships are similar: A and D remain linked by a node, the same can be said for H5 and Gg, and S1 remains close to A, D, and H1. As for the other group, Ph, H2, and H4 are closely related and apart from the rest, while Cx, R, and H3 are not extremely distant from each other and S2 continues to be an outlier. Both diagrams confirm the general assumptions of witness relations and show encouraging results that will later be refined when more data is fed into the software, so that precise information can be produced and we, as editors or readers, can make sense of it.
I stated at the beginning of this text that the end goal of this project was not to make an edition, but to understand the relationships that the 18 witnesses to Troilus and Criseyde bear. The project is still in its early stages, but it shows potential. I find myself in agreement with Robinson (2006), who elegantly expressed why this kind of project is important: "Like the stemmatics of the last century, its aim is to illuminate the history of the text. Unlike the stemmatics of the old century, its aim is not a well-made edition, but a well-informed reader."
Figure 6: Phylogenetic tree using the likelihood criterion and the neighbor joining method of the first 100 lines of Troilus and Criseyde with 18 witnesses. The witnesses that read soule in a red circle, the witnesses that read heart in a blue circle.