Popma's Monolingual Dictionary of Latin Synonyms

Abstract: This report presents the results of our project to digitize a dictionary of Latin synonyms that was popular for three centuries, from the 17th to the 19th. It describes the history of the book and the contributions of its authors and editors, explains its pedagogical value for modern students of classical languages, and discusses the technical questions of encoding the transcript and preparing it for end-users.


Introduction
The book titled De Differentiis Verborum was composed by the Dutch jurisconsult Ausonius Popma, first published in 1606 and reprinted 22 times in the following years. It became a basis for commentators on Latin texts and a source of citations. The dictionary was highly praised by scholars: almost two centuries later, Hill (1794) states that "remarks 'de differentiis verborum,' are often both ingenious and solid".
The dictionary also covers a class of words whose senses are distinct in modern English but were not distinguished by the Romans. For example, Popma explains the noun altum as "quod sursum est, et quod deorsum". Other examples: "Hospes est, qui recipit et qui recipitur"; "Vector est ille, qui portat, et qui portatur." Besides the direct usefulness that such a book shares with any other dictionary, we would like to note several additional situations in which this relatively old dictionary can benefit a modern student of Latin.
The first advantage is that the language of Popma's book is rather simple compared to authentic Roman texts, so intermediate students should be able to understand it without difficulty. We estimate this level of proficiency, though speculatively, as an active vocabulary of about 2000 words or the completion of the first part of Ørberg's course "Lingua Latina per se Illustrata" (Ørberg 2010). Regular use of a monolingual dictionary fits the natural method of language learning, and it can also be considered a kind of extensive reading.
The articles of the dictionary fall into two large groups: legal terms and synonyms from general literature. The first group reflects the professional interests of the author, who was a jurisconsult; the articles on ampliatio 'trial postponement' and comperendinatio 'adjournment of a trial for two days' are two of the lengthiest in the book.
The second group draws heavily on the texts of Cicero, Virgil, and Pliny, following the medieval tradition of annotating and explaining the most popular documents. From this point of view, Popma's dictionary will be helpful to modern students, because these Roman authors form the basis of every Classics course.
Spaced repetition and flashcards are methods often used to memorize new words. In our experience, dictionaries of synonyms are indispensable companions to these methods. For example, the first book of Ørberg's Latin course contains three words meaning 'to shed tears': plorare (ch. 3), lacrimare (ch. 7), and flere (ch. 24). When shown a flashcard with this English phrase (or the similar 'to cry', 'to weep'), a student can answer with any of its Latin counterparts, so we recommend adding all suitable synonyms to the answers of the flashcard. Students can make these additions themselves in the course of their studies, while working out the differences between the words on their own.
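The recommendation above can be illustrated with a minimal sketch of a flashcard that accepts any of several synonyms as a correct answer. The class and method names are hypothetical and do not correspond to any particular flashcard application; only the three verbs come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Flashcard:
    """A card with one English prompt and several acceptable Latin answers."""
    prompt: str
    answers: set = field(default_factory=set)

    def add_synonym(self, word: str) -> None:
        # Store answers in lower case so that the check is case-insensitive.
        self.answers.add(word.strip().lower())

    def check(self, reply: str) -> bool:
        # Any synonym recorded on the card counts as a correct reply.
        return reply.strip().lower() in self.answers


# The three verbs named in the text, added as the student meets them.
card = Flashcard("to shed tears")
for verb in ("plorare", "lacrimare", "flere"):
    card.add_synonym(verb)

print(card.check("Flere"))
```

A dictionary of synonyms then serves as the place where the student looks up how the accepted answers differ from one another.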

Evolution of the book
Our transcript is based on the most recent edition, by Tommaso Vallauri (1865, reprinted in 1870). However, since its first publication in 1606 the text of the book has undergone so many corrections and amplifications that it is better described as a collaborative work (see Figure 1).
After a series of reprints in the following years, curated first by Bartholomaeus Meuschen and later by Thomas Hieronymus (1642-1716), the first significant revision of the text was made in 1694 by Johann Friedrich Heckel (1640-1715), who rewrote many articles, added new groups of synonymous words, incorporated separate lists into the main text and ordered all words alphabetically.
The next key editor was Adam Daniel Richter (1709-1782), who also significantly enlarged and improved the content of the book in 1741. In 1750, he published an essay, "Differentias quae in Ausonii Popmae De differentiis verborum libris amissae sunt", containing 62 articles. However, it went unnoticed by later editors and was not included in subsequent editions of Popma's dictionary. A digitized text of Richter's Differentias is included in the supplementary files.
In 1769, Johann Christian Messerschmid (1720-1794) included most of Strodtmann's commentaries in his edition of Popma's dictionary. Collation of the texts revealed only 17 articles missed by Messerschmid.
Finally, the most recent version of the text was made by Tommaso Vallauri (1805-1897), who continued to add new groups of synonyms and to polish the text of the articles. A distinctive property of this edition is the increasingly active use of vernacular languages, such as French or Italian. For example: "Crustae sunt laminae inauratae, quae poculis, aut vasis inferuntur, ut vix refelli possint; italice: riporti di basso rilievo". His second edition (1865) contains even more commentaries in Italian. German words, however, can already be found in earlier editions, especially in Strodtmann's version: "Hi enim flagella Spitzruthen, virgam Ruthe, scuticam Peistche, fustes Prugel dicunt."

Transcript format
When choosing a final transcript format, we took the Text Encoding Initiative (TEI) as our primary candidate, particularly its guidelines for encoding dictionaries (TEI Consortium 2007). TEI is a rich XML-based markup language designed for encoding a wide range of texts, from lexicons to prose and poetry. It covers every kind of semantic information we wanted to mark up in the transcript of Popma's book.
TEI has been widely adopted by large Latin research projects, such as Corpus Corporum, Digital Library of Late-Antique Latin Texts (digilibLT), Glossarium mediae et infimae latinitatis, Perseus Digital Library and Corpus Automatum Manhemiense Electorum Neolatinitatis Auctorum (CAMENA) (Schibel and Rydberg-Cox 2006; Glorieux and Thuillier 2010; Tabacco and Lana 2010; Roelli 2014; Crane 2021). We plan to submit our transcript to the Corpus Corporum, which makes the TEI format a reasonable choice.
Another notable project publishing Latin texts is Lexicons of Early Modern English (LEME) (Lancashire 2018). Its editors adopted their own format based on XML but using a different set of tags. However, since 2019, this project also publishes documents in the TEI format.
Other formats, such as the XML Dictionary Exchange Format (XDXF) or the formats promoted by SIL International (MDF and LIFT), are more rigid and only suitable for bilingual dictionaries with a strict structure, so they are not applicable to our dictionary of synonyms.
The TEI format provides tags to mark up fragments of the text semantically and visually. In our transcript we use the following semantic tags:
- <entryFree> is the parent node of every article; in terms of the TEI standard it marks articles with a loose structure; the attribute "id" connects every entry with the corresponding record in the lexicon file (see next section);
- <quote> and <bibl> mark a quotation and its author (together with the title of the work); these tags are not joined into a <cit> block;
- <foreign xml:lang="grc"> is used for non-Latin fragments (Ancient Greek, French, etc.);
- <abbr> marks common abbreviations (v.g., h.e., v.c.).

Corrections made by the transcribers:
- <corr> contains both the original and the corrected reading of a fragment; in most cases typographic errors were checked against the 1852 edition;
- <add> is used when one letter was missing or a fragment was added by the transcriber.

Visual formatting tags:
- <hi rend="bold"> highlights key (index) words;
- <hi rend="italic"> is used for visual formatting;
- <hi rend="term"> highlights key words inside of quotes;
- <label> marks labels of list items;
- <lg> is a parent node for lines <l>; it is used to format poetry.
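To make the tag set above more concrete, the following minimal sketch parses one small entry with Python's standard xml.etree.ElementTree module. The sample entry is hypothetical: its text is the Crustae example quoted earlier, but the id value, the nesting of the tags and the omission of the TEI namespace are assumptions made for illustration and may differ from the actual transcript.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry built from the tags described above. The id value and
# the exact nesting are assumptions; the TEI namespace is omitted for brevity.
SAMPLE = """
<entryFree id="crustae">
  <hi rend="bold">Crustae</hi> sunt laminae inauratae, quae poculis, aut vasis
  inferuntur, ut vix refelli possint; italice:
  <foreign xml:lang="it">riporti di basso rilievo</foreign>.
</entryFree>
"""

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def headwords(entry):
    """Return the key (index) words, marked with <hi rend="bold">."""
    return [hi.text for hi in entry.findall(".//hi[@rend='bold']")]


def foreign_fragments(entry):
    """Return (language code, text) pairs for all <foreign> fragments."""
    return [(f.get(XML_LANG), f.text) for f in entry.iter("foreign")]


def plain_text(entry):
    """Return the text of the entry without markup, whitespace normalized."""
    return " ".join("".join(entry.itertext()).split())


entry = ET.fromstring(SAMPLE.strip())
print(entry.get("id"))
print(headwords(entry))
print(foreign_fragments(entry))
print(plain_text(entry))
```

The same traversal could be extended to <quote>, <bibl> or <corr> elements; a transcript that declares the TEI namespace would need namespace-qualified tag names in the queries.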

End-user format
TEI has become a standard format for book transcripts. It is suitable for machine processing or for producing a printed book layout. However, end-users cannot read it directly and need a computer application to load the dictionary and look up words. There are several desktop and mobile applications, such as GoldenDict, AARD 2, StarDict, MDict, ABBYY Lingvo and many others. These dictionary shells support one or several file formats.
XML-based formats, and TEI in particular, can be easily transformed into HTML (see Figure 2). Among the popular dictionary formats, Slob, StarDict and MDict can store articles with HTML codes, so we routinely encode Popma's dictionary in these formats. Files in DSL format were compiled by an anonymous volunteer. In this way, our dictionary is available for use with at least 5 desktop applications and 23 mobile dictionary shells. It is included in our collection of Latin dictionaries (https://latin-dict.github.io) and has been downloaded more than 130 times (400 times counting pre-release versions).
Another situation in which computer devices can assist users is the so-called "morphological search". When a user types in a word in a declined or conjugated form, such as amavisti '(you) loved', a dictionary shell should find the correct article, amo 'to love'. The Hunspell spell-checking library is one of the lemmatizers most widely used in dictionary shells. Keywords in Popma's dictionary may also be written in a non-standard form. For example, in "Acini densius nascuntur; Baccae autem rarius", both headwords are in the plural. To harmonize them with Hunspell, a list of keywords and their corresponding normal forms was compiled (file lexicon.json) and used when producing the files in end-user formats. In this way, users can look up not only derived forms of the words, but also their orthographical or medieval variants: epistola or epistula, coelum or caelum.
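The keyword normalization described above can be sketched as a lookup table from normal forms and spelling variants to entry identifiers. The layout of lexicon.json is not described in the text, so the JSON structure below, the field names "keyword" and "forms" and the entry identifiers are all assumptions; the small dictionary TOY_LEMMAS merely stands in for a real lemmatizer such as Hunspell and does not use its API.

```python
import json

# Hypothetical excerpt: the real structure of lexicon.json may differ.
LEXICON_JSON = """
{
  "acini":    {"keyword": "Acini",    "forms": ["acinus"]},
  "baccae":   {"keyword": "Baccae",   "forms": ["bacca"]},
  "epistola": {"keyword": "Epistola", "forms": ["epistola", "epistula"]}
}
"""

# Stand-in for a lemmatizer: maps an inflected form to its normal form.
TOY_LEMMAS = {"acinos": "acinus", "baccarum": "bacca", "epistulam": "epistula"}


def toy_lemmatize(word):
    """Return the normal form if known, otherwise the word unchanged."""
    return TOY_LEMMAS.get(word, word)


def build_index(lexicon):
    """Map every keyword and normal form (lower case) to entry identifiers."""
    index = {}
    for entry_id, record in lexicon.items():
        for form in [record["keyword"], *record["forms"]]:
            ids = index.setdefault(form.lower(), [])
            if entry_id not in ids:  # avoid duplicates when forms coincide
                ids.append(entry_id)
    return index


def look_up(word, index, lemmatize):
    """Try the word as typed first, then its lemmatized form."""
    word = word.lower()
    for candidate in (word, lemmatize(word)):
        if candidate in index:
            return index[candidate]
    return []


index = build_index(json.loads(LEXICON_JSON))
print(look_up("acinos", index, toy_lemmatize))
print(look_up("epistulam", index, toy_lemmatize))
```

In this sketch the plural headword Acini and the variant spelling epistula both resolve to an entry because the index lists the normal forms alongside the printed keywords.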
The text of the transcript and all derived and supplementary materials are distributed as Public Domain on the website https://latin-dict.github.io/dictionaries/Popma1865.html.

Conclusion
Popma's De Differentiis Verborum was a prominent dictionary for three centuries before it was superseded by dictionaries written in vernacular languages. However, it may come into demand again under new conditions, in line with modern methods of teaching Latin such as the Direct Approach (Natural Method) and the promotion of extensive reading.
Recently digitized into a machine-readable form, the text will enlarge the corpus of Latin literature and find application in future linguistic research. Students of Classics courses can use it on their mobile devices along with dozens of other Latin dictionaries provided on our website.
http://lexikos.journals.ac.za; https://doi.org/10.5788/32-1-1747 (Project)