Towards a computational history of modernism in European literary history: Mapping the Inner Lives of Characters in the European Novel (1840–1920)

In this paper, we investigate the common narrative in literary history that the inner lives of characters became a central preoccupation of literary modernism – a phenomenon commonly referred to as the “inward turn”. We operationalize this notion via a proxy, tracing the use of verbs relating to inner life across 10 language corpora from the ELTeC collection, which comprises novels published from 1840 to 1920. We expected to find an increase in the use of inner-life verbs corresponding to the traditional periodisation of modernism in each of the languages. However, different experiments conducted with the data do not confirm this hypothesis. We therefore look at the results in a number of more granular ways, but we cannot identify any common trends even when we split the verbs into individual categories, or take canonicity or gender into account. We discuss the obtained results in detail, proposing potential reasons for them and including potential avenues of further research as well as lessons learned.


Introduction
In 1924, Virginia Woolf famously proclaimed that "on or about December 1910, human character changed" (2009: 38), arguing that it was the task and responsibility of modern(ist) fiction to represent the complexities of a character's interiority through new and innovative literary techniques. Since the introduction of the "inward turn" (Kahler, 1973), the notion that the "inner life, the soul, l'âme, die Seele, sjoelen" (Bradbury & McFarlane, 1991: 196) of characters constitutes a central preoccupation of literary modernism has become a critical staple. In line with recent contributions to a better understanding of literary characters through distant reading methods (e.g. Freitas & Santos, 2023; Piper, 2023), we aim to test this notion by tracing the use of verbs denoting "inner life" in the European Literary Text Collection (ELTeC), a multilingual collection of corpora created within the COST Action "Distant Reading for European Literary History".¹ Modelling complex, occasionally even contradictory developments and processes in the European novel is beyond the scope of a single paper, and a number of confounding variables need to be considered. The primary goal of this brief report is to illuminate the choices that went into the project, reflect on potential confounding variables and outline challenges encountered, for the benefit of future researchers aiming to work with ELTeC and/or explore distant reading methods in the context of literary periodisation.

Modernism and the "inward turn"
In a 2014 article, Melanie Conroy notes that "[t]he alleged inward turn of the modernist novel has become one of the great truisms of academic literary criticism" (2014: 121). Conroy employs a combination of distant and close reading methods to critically examine this commonplace, tracing the use of "mental verbs" (2014: 117) across French literary texts between 1800 and 1929 and asking, "do reporting clauses and mental verbs occur more frequently in some authors, texts, or decades than elsewhere? If the frequency of these markers is significantly higher in them, these authors, books, or decades quite possibly engage in more thought representation and thereby strengthen the 'inward turn'" (Conroy, 2014: 134). The present paper builds and expands on Conroy's central points and findings in two ways: firstly, it addresses the call to investigate representations of mental states in different languages and literatures (Conroy, 2014: 125) by working with the multilingual text collection ELTeC; secondly, we similarly focus on mental verbs, but develop the approach further by drawing on a six-part categorisation system derived from the "Theory of Mind" framework (Bretherton & Beeghly, 1982; Tompkins et al., 2019). Thereby, we diversify the mental verbs chosen beyond the two that Conroy primarily focusses on, "he/she thought" and "said to themselves".² One of our aims is furthermore to examine how Conroy's findings for the French corpus, namely that a "significant increase in the use of mental verbs" can be observed, but that "the two pointers of represented thought at issue did not increase in a linear fashion in the period 1800-1929" (Conroy, 2014: 140; emphasis added), compare to results from a multilingual collection such as ELTeC.
Taking the common critical narrative that the characters' inner lives became central to literary modernism as a starting hypothesis to be critically investigated, in the present paper, we operationalize the use of inner-life verbs, such as feel or think (see the Methods section), in ELTeC as a proxy. Our assumption is that these verbs are employed when interior processes are represented and that upwards or downwards trends in their use may roughly correspond to developments in European literary history. The issue at hand seemed cut out for a comparative computational study of the novel corpora included in ELTeC,³ since this approach allows us to identify and visualise trends appearing at different times and compare similarities and differences across languages.
We want to emphasize that the terms 'modernism', 'modern' and 'modernity' are not unambiguous, but rather carry fluctuant meanings across national literary histories (Călinescu, 1987; Friedman, 2001). Some sources locate the start of modernity during Romanticism or even the Renaissance, while others consider only the changes in the literary/aesthetic field around 1890 as 'modern(ist)'. Additionally, the themes and techniques perceived as 'modern' or 'modernist' did not develop synchronously throughout Europe. While there is thus no common and/or uncontested periodisation of 'European Modernism', we can observe processes involving formal innovation in national literatures throughout Europe at different times. Based on commonly accepted (though not entirely undisputed) periodisations of modernism in each of the national literatures represented, we would expect to see, for example, an upward trend in mental verbs between the late 19th and the early to mid-20th century in the English-, Serbian- and Romanian-language corpora, and in the late 19th century in the German- and Norwegian-language corpora. The Portuguese corpus from the ELTeC collection is included as a test case for the above-mentioned hypothesis: While Portuguese modernism is said to have started with the Revista Orpheu (in 1915), no authors that would be considered hallmark modernist writers (for instance, Almada Negreiros, António Ferro) could be included in the collection for copyright reasons. Therefore, we expect the results from the Portuguese corpus to differ from those of the other corpora and include it as a counter example.

¹ See www.distant-reading.net/eltec/.
² These are the only two examples discussed at length and used in the visualisations in Conroy's paper. However, since their occurrence is always preceded by "e.g." and the raw data of this project was not freely available, it is not possible to tell unambiguously whether these two verbs were used primarily or exclusively in Conroy's project.
³ For detailed discussions of the selection criteria for the ELTeC collection, see Burnard et al., 2021; Schöch et al., 2021.

Data
This study uses 10 ELTeC corpora: English, French, German, Hungarian, Norwegian, Portuguese, Romanian, Serbian, Slovenian, and Spanish. ELTeC was created to reflect a sample of novels in various languages (the core collection consisting of 10, the expanded collection of 20 languages) from 1840 to 1920 based on criteria that ensured "rough comparability" (Schöch et al., 2021: 4) across subcollections. Each of the corpora includes 100 public-domain novels, with diverse metadata (author name, gender, publication date, word count, etc.) that operationalize certain concepts (e.g. canonicity, reflected in reprint count). Although the editors aimed at balanced subcollections and fair distribution according to variables, not all corpora could comply with the proposed criteria (for detailed discussions of the selection criteria and the process of corpus-building, see Burnard et al., 2021; Herrmann et al., 2020; Schöch et al., 2021).
For the purpose of the present study, we chose ELTeC level 2 corpora, that is, corpora that contain TEI-encoded and POS-tagged XML texts. The size of the full material appears in Table 1.⁴

Methods
Just as there are many conceptual ways and proxies available to start unpacking the complex relationship between literary modernism, the inward turn and the use of inner-life verbs, we had a range of options to explore and discuss when operationalizing and then testing our hypothesis. The methodology described here has slowly evolved over multiple discussions, during which we carefully weighed advantages and disadvantages for each approach, together with domain experts for each language, i.e. collaborators from the COST Action who had expertise in a given national literature and/or were native speakers of the respective language. Our methodology emerged from a trial-and-error process that will have to be repeated and expanded on in larger studies. Even so, the data gathered as part of alternative approaches is available in the supplementary material. Originally, we wanted to compare two approaches: the first based on the methodology described in this paper, with the difference that we simply selected the 10 most frequent inner-life verbs in the respective corpus rather than 3 verbs from 6 categories. Our second, eventually abandoned, approach used seed words (feel, think, believe, know, hope, wish and their translations in the ELTeC languages), augmented by 15 nearest neighbours found through word embeddings. This resulted in a list of up to 100 words. However, filtering the noise from this list was, for the moment, beyond the scope of this first paper.⁵ We eventually decided to return to the first approach, but provide a clearer theoretical basis for the selection of verbs by using existing categories from theoretical literature on inner-life language.
Our methodology revolves around two sets of choices: 1) how the items on the language-specific wordlists are selected, and 2) how the data is analysed. It is presented as a first conceptual building block in a much longer, complex debate. The choice to focus on the morphological category of verbs is based on our aim to compare the results obtained from a multilingual collection to Conroy's (2014) findings from a single-language corpus. The selection of these verbs is informed by literature in psychology on "internal state language", which has been studied intensively, especially in the context of the Theory of Mind framework (Bretherton & Beeghly, 1982; Tompkins et al., 2019). Drawing on Bretherton and Beeghly's grouping of utterances relating to mental states, we derived six relevant categories that were used when compiling the list of verbs to be mapped:
• perception: verbs relating to sensory experience (e.g. "see" something, "listen" to something, "perceive" somebody);
• physiology: verbs relating to the body/bodily experience that influences one's inner life (e.g. "hurt", "feel hungry");
• affect: verbs relating to emotions or emotional states (e.g. "love", "hate");
• volition and ability: verbs relating to wishes, desires etc. and/or ability (e.g. "desire", "wish");⁶
• cognition: verbs relating to mental processes (e.g. "remember", "forget");
• moral judgment and obligation: verbs that contain evaluative statements (e.g. "she preferred x over y") and/or that refer to an obligation (e.g. "they should be careful"; "he was obliged to her").
We asked domain experts for each language to go over the entire verb frequency list and select the three most frequent verbs for each of the abovementioned categories. Primarily due to the complexities of syntactic parsing for all these languages, filtering the verbs through dependency parses or similar could not be done reliably across languages. We normalized counts of inner-life verbs against the overall verb count for best possible comparability. This procedure maximizes diversity within the total of inner-life verbs, ensuring that the selected verbs have a high overall relative frequency in a given language corpus. If we had used pre-established seed words for this method, the results might have been skewed because specific verbs may be used more frequently in some languages than in others. Through this broader approach, based on scientifically derived categories, we believe that we have arrived at comparable, though of course not identical, lists that better represent the idiosyncrasies of the respective corpora (and associated languages). If we had simply scored for absolute frequency, the category 'perception' would have been dominantly represented. Instead, our approach gives the six categories equal weight and thus also allows us to inspect them individually.
The analysis consists of establishing the prevalence of each inner-life verb from the list for each language-based corpus.
The prevalence in one novel is defined as the proportion of instances of each inner-life verb relative to all instances of verbs used.All analyses are performed (a) for all six categories taken together and (b) for each of the categories separately.
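The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: it assumes that each novel is available as a sequence of (lemma, POS) pairs, that verbs carry the tag "VERB", and that the function names and the tiny English word lists are hypothetical.

```python
from collections import Counter


def verb_counts(tagged_tokens, verb_tag="VERB"):
    """Count lemmas of all tokens tagged as verbs (the tag name is an assumption)."""
    return Counter(lemma for lemma, pos in tagged_tokens if pos == verb_tag)


def prevalence(tagged_tokens, inner_life_lemmas):
    """Share of all verb tokens whose lemma is on the inner-life list."""
    counts = verb_counts(tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # a text without verbs gets prevalence 0 by convention
    return sum(counts[lemma] for lemma in inner_life_lemmas) / total


def prevalence_by_category(tagged_tokens, categories):
    """Prevalence computed separately for each category's verb list."""
    return {name: prevalence(tagged_tokens, lemmas)
            for name, lemmas in categories.items()}


# Hypothetical toy input: one very short "novel" and two illustrative categories.
tokens = [("she", "PRON"), ("think", "VERB"), ("walk", "VERB"),
          ("see", "VERB"), ("house", "NOUN"), ("say", "VERB")]
categories = {"cognition": {"think", "remember"}, "perception": {"see"}}

all_lemmas = set().union(*categories.values())
print(prevalence(tokens, all_lemmas))              # analysis (a): all categories together
print(prevalence_by_category(tokens, categories))  # analysis (b): each category separately
```

Dividing by the number of verb tokens rather than by all tokens mirrors the normalisation against the overall verb count described above.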
We visualize the data in three ways: (1) Using a detailed scatterplot, we can show each novel as a function of its publication year and the prevalence of inner-life verbs; this allows for the calculation of a second-order polynomial regression line that shows whether the prevalence increases or decreases over time.
(2) Using a group of boxplots, each showing the distribution of the prevalence of inner-life verbs during one decade, we summarize the data.
(3) Splitting the data into an earlier (1840-1870) and a later phase, and visualizing the distribution of inner-life verbs as a density plot, we display the degree of overlap and similarity between the data for the earlier and the later period. This also allows for the calculation of a test statistic and the probability that the two samples are drawn from the same underlying distribution. Only if this probability is below a threshold (traditionally, p<0.05) should we assume a genuine underlying difference between the distributions of the earlier and later phases in the data.
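The numerical side of steps (1) and (3) might be sketched as follows, using only the Python standard library. Several points here are assumptions rather than a description of the study's actual code: the text does not name the test statistic, so a two-sided permutation test on the difference of means is used as a stand-in; the later phase is taken to be every novel after the cut-off year; and all function names and data values are hypothetical.

```python
import random
from statistics import mean


def fit_quadratic(years, values):
    """Least-squares fit of value = a + b*u + c*u**2, with u = year - mean year.

    Returns a function predicting the fitted value for a given year.
    Needs at least three distinct years.
    """
    centre = mean(years)
    us = [year - centre for year in years]
    # Normal equations for the columns 1, u, u**2, stored as an augmented matrix.
    power_sums = [sum(u ** k for u in us) for k in range(5)]
    rhs = [sum(v * u ** k for u, v in zip(us, values)) for k in range(3)]
    m = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return lambda year: a + b * (year - centre) + c * (year - centre) ** 2


def split_by_year(novels, cutoff=1870):
    """Split (year, prevalence) pairs into an early and a late sample."""
    early = [value for year, value in novels if year <= cutoff]
    late = [value for year, value in novels if year > cutoff]
    return early, late


def permutation_test(early, late, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the absolute difference of means."""
    rng = random.Random(seed)
    observed = abs(mean(late) - mean(early))
    pooled = list(early) + list(late)
    n_early = len(early)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[n_early:]) - mean(pooled[:n_early]))
        if diff >= observed:
            extreme += 1
    # The add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Hypothetical data: (publication year, prevalence of inner-life verbs).
novels = [(1845, 0.21), (1852, 0.19), (1868, 0.22), (1874, 0.20),
          (1889, 0.23), (1901, 0.18), (1915, 0.21)]
trend = fit_quadratic([y for y, _ in novels], [v for _, v in novels])
early, late = split_by_year(novels)
statistic, p_value = permutation_test(early, late)
print(trend(1880), statistic, p_value)
```

A permutation test makes no distributional assumption, which suits small samples of roughly 100 novels per language; a rank-based or Kolmogorov-Smirnov test would be an equally plausible reading of the description.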

Results
Based on the corpora, lists of verbs and methods of analysis described in the previous sections, we have obtained a set of results in the form of frequency tables and data visualizations.⁷ The examination of scatterplots and boxplots for all verbs first shows various slight trends, whether upwards (English, Hungarian), downwards (French, Norwegian, Portuguese and, most markedly, Slovenian) or more or less flat (German, Romanian, Spanish). However, the density plots and tests for statistical significance do not detect any differences of statistical significance.⁸ For a summary of these data, see Figure 1.
As a consequence, we look at the data in several more fine-grained ways. First, we consider whether individual categories of verbs may display more marked trends than was the case for all verbs taken together. However, this yields similarly inconclusive results for all corpora. Second, we consider whether there is any divergence in the trends according to ELTeC canonicity and gender criteria.⁹ We only find slight variations between them, with strongly overlapping confidence intervals and therefore no statistically significant divergences.
Finally, we want to exclude the possibility that we do not see significant trends because the datasets are too small, or because the time period 1840-1920 is too short for such trends to become visible. Therefore, for languages where more data from a wider chronological range is available (French, German, Portuguese), we perform the same analysis on these larger corpora. This time, in the French data, we do see statistically significant downward trends, both overall and most markedly for verbs of affect (Figure 2).¹⁰ This result, however, contradicts our initial expectation of an upward trend.

Discussion
For the period primarily investigated here (1840-1920), the proportion of inner-life verbs, in the way we have defined and operationalized them, appears to be remarkably stable across a number of different languages, corpora, and subsets of the data. Additionally, their distribution appears to be statistically stable in all but one language even when considering corpora with a larger diachronic scope. In other words, the data do not confirm the original hypothesis of a rising trend in the use of inner-life verbs over time. The reason for this finding could be that assumptions in literary history about this aspect of modernism are false. However, it is probably much too early to reach such a conclusion, because there could be other reasons, for example:

The time slot represented in ELTeC could be too short. As the experiment with the 1100 French novels shows, we can see a trend when we look at the long history from 1750 to 2000, but it is a decreasing one. Conversely, there is no trend visible in either the German or Portuguese temporally extended corpus. We assumed that we would find an upward trend earlier in some languages and later in others, so our time window of 80 years could be too short for the phenomenon we are interested in. However, as the experiments with larger corpora show, this expectation is not supported by the data.

Canonicity:
The mix of novels in ELTeC could be misleading. Traditional accounts of literary history are usually based on a rather small set of texts which represent retrospectively the most advanced trends of their time and which made it into the literary canon. The various ELTeC corpora also contain non-canonical novels, a category which sometimes refers to texts from high literature (literature meant to be read by educated readers, often employing complex literary devices) that were excluded from canonical accounts of a given national literature for various reasons. Sometimes, the category refers to texts which are precursors of popular fiction. Despite this ambiguity, the meaning of canonicity is operationalized according to specific parameters in ELTeC. If we consider only canonical novels from the selected corpora, we can see a decreasing trend in the French and Portuguese data, while there is no trend for the German corpus and only small trends for the Spanish one.¹¹ Only in the English data do we see exactly the upwards trend we expected.¹²

Confounding variables:
The distribution of narrative devices is certainly not stable for the time period we are interested in, as writers developed innovative techniques that challenged "ill-fitting" (Woolf, 2009) realist forms of literature and sought new ways of capturing the ever-elusive complexity of human character. Two connected trends are often mentioned in literary history: the disappearance of the narrator (at least a trend to avoid third-person narrators or strong commenting voices) and a preference for showing as opposed to telling (Klauk & Köppe, 2014). The latter trend has been confirmed empirically and shown to be visible in the history of the novel between 1750 and 1950 (Underwood, 2019; see also Heuser & Le-Khac, 2012). The French long-term data show a decrease of verbs indicating a description of affects, which seems to conform to this tendency. Conversely, we cannot see any long-term trend with respect to the inner-life verbs in the expanded German corpus. Interestingly, Conroy's French corpus does not seem to confirm the aforementioned trend towards a preference for showing rather than telling, as she notes that "common reporting clauses and mental verbs appeared in a wide variety of texts, [...] alongside other, 'free' techniques" (2014: 117). This parallel development could potentially be considered a confounding variable insofar as the use of both techniques in equal or similar measure might mean that neither appears as a statistically significant trend in the data.
In its critical reconsideration of the common narrative of the "inward turn" as a key aspect of modernist literature, Conroy's article can be seen in the context of other recent publications (Gang, 2013; Herman, 2011; Miguel-Alfonso, 2020) which similarly highlight the complexity of issues surrounding periodisation based on formal, aesthetic, and thematic features and the need for a nuanced engagement with them. Both the insights from these recent approaches and the lack of an unequivocal trend in our data suggest that there are good reasons to conduct further research into the manifold connections between literary modernism and the inner life of characters, which eschew simple and univocal narratives. At the same time, our data show that the European history of the novel is not just a trickling down of modernism from the centre to the periphery, as it has traditionally often been conceived (a view that is increasingly being challenged in recent conceptualisations, of which Friedman's Planetary Modernisms (2015) is a key representative). We see very different trends in the data, even if we grant a potential time lag between national developments. This indicates that modern interests and sensibilities were integrated into very different national histories and any attempt to tell this story on the European level must start from a more complex model of literary history.

Conclusion: Lessons learned and further research
By tracing the use of inner-life (or mental) verbs, based on 6 categories derived from the "Theory of Mind" framework, across ELTeC corpora in 10 different European languages, we have attempted to operationalize the hypothesis that characters' inner lives become a central concern of literary modernism. The methodology chosen in this paper has expanded on the approach adopted by Conroy (2014) by investigating the "inward turn" in a multilingual collection and by further specifying and diversifying the categories according to which the mental verbs were selected. Largely in accordance with Conroy's findings, but contrary to expectations based on the common critical narrative of the "inward turn" as an essential feature of literary modernism, our data do not show an increase in the use of these verbs over time, but rather a relatively stable distribution of them within the period in question.
This project has been an exploratory endeavour, partly owing to the multilingual nature of ELTeC and the lack of existing cross-linguistic frameworks and methods that might fully encapsulate the complex tension between different national and cross-national developments in European literary history. Several factors may have been responsible for the inconclusiveness of our results. Some might be related to the materials worked with, as discussed in the results and discussion sections:
• the size and scope of the collection
• the fact that 'modern', 'modernist', 'modernism', as well as associated questions of periodisation, differ across languages and national literatures
However, the inconclusiveness may also be due to methodological choices:
• Were we sufficiently 'deep' in our approach? Provided the verbs are indeed indicative of inner life in all our languages, is the occurrence pattern of these verbs more complex than a pure counting of relative occurrences? One avenue of future research would be applying word-embedding analysis to the texts and trying to tease out the contexts that the seed words occur within or even 'the company they keep' (traceable through their embedding neighborhood).¹³
• Have we been paying enough attention to the different distributions of the categories in the different languages?
With regard to these points, a comparative study employing tools and methods as complex as those of Piper (2023), such as bookNLP and super-sense tagging, might provide a further potentially fruitful avenue of research. With its focus on embodiment, rather than emotional states, in the Hathi1M corpus, Piper's paper proceeds from a diametrically opposed starting point, yet arrives at a similar result that corroborates our findings: While verbs referring to "embodiment", in particular those referring to "motion", experienced a steady rise in the period between 1800 and 2000, verbs of cognition do not display a similar upward trend (Piper, 2023: 7).
Finally, working together in a multilingual and interdisciplinary setting is, no doubt, a very rewarding and enriching endeavour. At the same time, it entails its own challenges regarding explicit or implicit expectations, conventions, and terminologies that are important to consider when embarking on comparable projects.

The article studies whether the theorized increasing focus on the 'inner lives' of characters associated with modernism can be quantitatively shown by measuring the relative prominence of 'inner-life' verbs over time. The authors study this phenomenon in ELTeC corpora for 10 languages. After identifying three 'inner-life' verbs from each of six relevant categories for each of the ten languages, they measure the usage of these verbs as a proportion of all verbs each year from 1840-1920. The proportion of 'inner-life' verbs used does not change significantly over the time period, except for a larger corpus of French literature, in which they find a decrease in the usage of 'inner-life' verbs.

Data and software availability
Given that Open Research Europe seems to be a broad repository venue rather than a specific journal, I am uncertain of the expected level of contribution or intended audience for this piece.My comments assume an audience of Digital Humanities researchers and those interested in DH work.Overall, the work seems sound if not particularly exciting.I can see it as a baseline for a more sophisticated study, but at present I feel that the work would be too preliminary to publish in a competitive journal or conference.
My main criticism is that the assumption that inner life can be characterized by specific verbs, and more importantly that those verbs indicate descriptions of inner life, seems thin and reductive.Does Catullus not hate and love?At this point we have language models that can operate in multiple European languages and extract high-level conceptual constructs.Counting verbs seems somehow both unambitious and unworthy of the subtlety of the concept of inner life.If this were presented as a relatively weak opening play at operationalizing inner life to be empirically disproved, and to be followed by speculation about stronger methods, I would feel more confident.At least looking at objects and subjects in addition to verbs might provide more variation.
An alternative, and perhaps stronger, motivation might be to clarify that even if an observation does not define a construct, it can still serve as an indicator of the presence of the construct. The famous Stanford lit lab result that the Gothic novel is characterized by greater use of locative prepositions does not mean that those prepositions are what makes the novel Gothic; it is that the Gothic is a distinct and noticeable phenomenon that has innumerable statistical manifestations, of which prepositions are just the most countable. "Inner life" verbs could be similar, and they are not.
Another potential threat to validity is the composition of the corpus.I believe that this collection was developed for general purposes and not for this specific study.I don't know anything about the novels except their rough publication date and the fact that they were prominent enough to be digitized and selected for these collections.Do they represent modernism?Could the early ones have been chosen because they are avant garde (and thus happen to have more modern features), or the later ones because they are in a more conservative prestige-literary tradition?I don't expect so, but the point is that I don't know.
There are well-known large-scale patterns in literature (at least in English) over this time period; see Ted Underwood's work on narrative time measurements. Over the 19th century, English-language literature shifts from a more narrative style to a more dialog-oriented style. I wonder if there are

I am also concerned that the choice of only three verbs from each 'inner-life' category could skew results. If certain authors favor particular verbs over others or there was a transition from using one verb to another over time (both easily imaginable scenarios), the given results would not reflect the phenomenon being studied. It is particularly concerning to me that the criteria for choosing these verbs are not given. To account for the concern that studying all verbs would skew the results in favor of one category of verb, one could report general trends for all verbs and then individual trends for each category, thus allowing for both a broad and fine-grained study of the phenomenon.
I don't know how the confidence regions in Figure 1 are defined, but they look like bootstrap CIs, which would be a good option to increase statistical reliability. The second-order curve fit may be too smooth; I would rather see this as the actual numbers in a line plot with per-decade confidence bars.
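For illustration, the numbers behind such a per-decade plot could be computed roughly as follows. This is a sketch under assumptions, not the authors' procedure: it uses a percentile bootstrap of the decade mean, and the function name, the resampling settings and the data values are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean


def decade_bootstrap_ci(novels, n_resamples=2000, level=0.95, seed=0):
    """Per-decade mean with a percentile-bootstrap confidence interval.

    novels: iterable of (publication year, prevalence) pairs.
    Returns {decade: (mean, lower bound, upper bound)}.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for year, value in novels:
        groups[year // 10 * 10].append(value)  # 1847 -> 1840
    alpha = (1 - level) / 2
    result = {}
    for decade, values in sorted(groups.items()):
        # Resample the decade's novels with replacement and record each mean.
        means = sorted(mean(rng.choices(values, k=len(values)))
                       for _ in range(n_resamples))
        lower = means[round(alpha * n_resamples)]
        upper = means[round((1 - alpha) * n_resamples) - 1]
        result[decade] = (mean(values), lower, upper)
    return result


# Hypothetical data: (publication year, prevalence of inner-life verbs).
novels = [(1841, 0.20), (1847, 0.24), (1849, 0.19),
          (1853, 0.22), (1856, 0.18), (1858, 0.21)]
for decade, (centre, lower, upper) in decade_bootstrap_ci(novels).items():
    print(decade, centre, lower, upper)
```

With only around ten novels per decade and language, such intervals would likely be wide, which is itself informative about how much trend the data can support.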
Finally, and less importantly, I found the paper difficult to follow in places and believe it could benefit from some reorganization and simplification of language.I comment more on this in the section below.

Introduction
A final word describing what COST Action "Distant Reading for European Literary History" is (project?) would help clarify for those unfamiliar.

Measuring modernism through "inner life"
This section feels slightly disjointed; I think that splitting its content between the Introduction and Methods sections would help.Specifically, moving the hypothesis and discussion of modernism to the Introduction and the portion on methods to the Methods section would likely increase flow.
The second sentence of the section mentions a hypothesis before one is given, which feels odd, although the hypothesis can be guessed.
The next sentence seems to tease the methods without providing much information on them; this could just be moved to Methods.
In the sentence beginning "Our assumption…", "has been" should be changed to "is" or "was" for clarity.
The phrase "seemed a good test case" feels overly informal and implies this project was just created to use ELTeC.Perhaps rephrase to say that an additional benefit of the experiment was that it was an opportunity to explore the uses of ELTeC.
Adding some clarification to what the "upward trends" referenced in the final paragraph of this section are would be helpful! I assume this means upwards trends in the usage of inner-life verbs, but it would help to have that explicitly stated.

Data
The sentence beginning "ELTeC was created…" reads slightly awkwardly and may benefit from rephrasing. Specifically, including a specific number instead of saying "in various languages" would be beneficial. The phrase "in a comparable way" sits awkwardly and could be removed.
Writing "ELTeC level 2 [corpora or texts]" instead of "ELTeC -level 2" may help those unfamiliar with the corpora.
"… corpora that were xml texts" should be "corpora containing XML texts, which are TEI encoded and…"
Why are the results for Serbian, referenced in the footnote, not included if they show similar trends?

Methods
The first sentence of the section is confusing and may benefit from reorganization. Specifically, "unpacking the relationship between literary modernism and the use of inner-life words" is referenced but I am not sure where this has been discussed in the paper so far.
The discussion of how you settled on the methodology (beginning with the sentence "The methodology described here…" and ending with "The second iteration…") could be significantly shortened or cut. The general process described (iterative refinement through testing and discussion) seems standard for this kind of research project. Instead, the section could be formatted as "we chose these methods instead of --- because of ---" or the paper cited in footnote 4 could just be referenced.
The sentence beginning "It is important to see the methodology…" distracts from the intention of this paragraph and could be cut.Beginning with "Our methodology revolves around…" would be more impactful.
I'm not sure what "the morphological category" is at the point it is mentioned.
The criteria by which the language experts selected the verbs for analysis is important.
The sentence which starts "The analysis step is centred around…" is confusing and can be cut.
It feels odd to have the different visualizations listed before they are shown, particularly because not all of the visualizations are used in the upcoming analysis. Perhaps just mention visualizations in the results when they are used. For the description of the third visualization category, what test statistic is calculated? A technical audience will also know what significance is and the traditional thresholds for p-values; including it is unnecessary.
Table 1 and Figure 1: I would change the language codes to full language names to increase legibility.

Results
The first paragraph can be cut.
In reference to Footnote 6: I would include any visualizations referenced in the paper. Or, if there is only space for Figure 1, it seems it can be referred to for all of your analysis in this section.
"…any differences with statistical significance…" should be "…any differences of statistical significance…".
The paragraph beginning "as a consequence" could likely be condensed to one sentence per tested theory.

Discussion
In the first sentence, the phrase "at least in the way we have defined and operationalized them" undercuts your analysis and its contents are clear from the framing.
The word "somewhat" in sentence two feels vague; perhaps you could say "statistically stable in all but one language"?
The sentences which begin "The reason for this finding…" and "However, it is probably…" contain a lot of hedging and could likely be combined.
The time slot could be too short: Moving the sentence about the German and Portuguese data to the end would improve flow.
Canonicity: "in the ELTeC" should be "in the ELTeC corpora". I'm not sure what is meant by "The description of literary history." Common discussions of literary history?
The distinction between canonical novels and those from high literature is not entirely clear to me.Who considers these books irrelevant?
"…the meaning of canonicity is clear…" does not hold true in my experience.
When stating "we only consider these novels" do you mean only the canonical novels?If so, how are those determined?
Confounding variables: Are the "two connected trends" those that could explain the results? Some qualifier about what these trends are / how they were chosen would be helpful.
Please cite or include the "other experiments" referenced. It would be particularly valuable to know what method for measuring inner-life verbs is used in these experiments.
The sentence beginning "While acknowledging…" is confusing and could benefit from rephrasing.
What is the "centre" and what is the "periphery"?
Instead of "we see," I would write "our results show" so that it is clear what leads to this conclusion.
I am not sure how the data show that "modern interests and sensibilities were integrated into very different national histories." I fully believe this is true, but struggle to follow how the data demonstrate this.

Conclusion
Why does the multilingual nature of ELTeC make this an exploratory endeavor?
Saying "provided the studied verbs" or similar instead of "provided the verbs" would add clarity.
"… such Piper 2023" should be "such as Piper 2023".
The final paragraph commenting on interdisciplinary research may be obvious to an audience used to such collaborations.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Digital humanities, text mining
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Tamara Radak
We thank the reviewers for the perceptive and fine-grained report. We appreciate the productive feedback and clearly outlined questions, suggestions, and recommendations and respond to individual points made in the report, outlining how they are addressed in the revised version of the paper below.

"My main criticism is that the assumption that inner life can be characterized by specific verbs, and more importantly that those verbs indicate descriptions of inner life, seems thin and reductive."
In the revised version, we expanded the section concerning methodology and aimed to clarify our approach further, attempting to answer the important questions raised in the report in the process. We expanded on the connection between our project and Melanie Conroy's 2014 paper "Before the 'Inward Turn': Tracing Represented Thought in the French Novel (1800-1929)", which was previously mentioned briefly and parenthetically, but deserved a more thorough engagement, as it helpfully lays out a number of considerations that are also central to our own point of departure with regard to the overall topic and the methods employed. Among these is the mapping of what Conroy terms "mental verbs" within literary texts of the period in question. In Conroy's article, this quantitative approach is complemented by a detailed qualitative engagement with the development of French literature at the time, which would generally be an interesting way forward, but would have gone beyond the scope of this short report at the present time, given that our own project worked with 10 European languages with rather different developments in their literary histories (with some overlaps).

"Another potential threat to validity is the composition of the corpus. I believe that this collection was developed for general purposes and not for this specific study. I don't know anything about the novels except their rough publication date and the fact that they were prominent enough to be digitized and selected for these collections. Do they represent modernism? Could the early ones have been chosen because they are avant garde (and thus happen to have more modern features), or the later ones because they are in a more conservative prestige-literary tradition?"

"A final word describing what COST Action "Distant Reading for European Literary History" is (project?) would help clarify for those unfamiliar."

We have added more information about ELTeC, which was developed as part of the COST Action Distant Reading for European Literary History, in the main text (both in the introduction and in the section entitled "Data") and hope to have drawn more attention to the papers that deal in detailed ways with the methodological background, choices, and criteria made in the context of corpus-building of ELTeC (Burnard et al.; Schöch et al.).

"To account for the concern that studying all verbs would skew the results in favor of one category of verb, one could report general trends for all verbs and then individual trends for each category, thus allowing for both a broad and fine-grained study of the phenomenon."
"The criteria by which the language experts selected the verbs for analysis is important."
"I would include any visualizations referenced in the paper. Or, if there is only space for Figure 1, it seems it can be referred to for all of your analysis in this section."

In the revised version, we expanded the section concerning methodology and aimed to clarify our approach further, attempting to answer the important questions raised in the report in the process. We address individual comments below. All analyses discussed in the paper were performed both for all verbs jointly and for each category of verb separately. As an example of this approach, Figure 1 gives an overview of the relative frequency of all inner-life verbs over time per language, while Figure 2 shows the development of verbs in the affect category alone (for the extended French corpus).

"I don't know how the confidence regions in Figure 1 are defined, but they look like bootstrap CIs, which would be a good option to increase statistical reliability. The second-order curve fit may be too smooth, I would rather see this as the actual numbers in a line plot with per-decade confidence bars."
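
The per-decade confidence bars raised in the comment above could be computed along the following lines. This is only a minimal sketch of a percentile bootstrap over per-novel relative frequencies, not the procedure used for Figure 1; the function names, the number of resamples, and the example values are all illustrative assumptions.

```python
import random
import statistics
from collections import defaultdict


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def per_decade_ci(novels):
    """Group (year, relative_frequency) pairs by decade.

    Returns {decade: (mean, ci_lower, ci_upper)}.
    """
    by_decade = defaultdict(list)
    for year, rel_freq in novels:
        by_decade[year // 10 * 10].append(rel_freq)
    return {
        decade: (statistics.fmean(vals), *bootstrap_ci(vals))
        for decade, vals in sorted(by_decade.items())
    }


# Hypothetical per-novel relative frequencies (made-up numbers).
novels = [
    (1841, 0.012), (1845, 0.015), (1848, 0.011),
    (1902, 0.013), (1905, 0.017), (1909, 0.014),
]
summary = per_decade_ci(novels)
for decade, (mean, lower, upper) in summary.items():
    print(decade, round(mean, 4), round(lower, 4), round(upper, 4))
```

With only a handful of novels per decade the intervals will be wide, which is itself informative when judging whether an apparent trend is stable.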

We chose this method because higher-order polynomials fit the data more closely, retaining more detail than a simple linear regression (the main text has been updated to reflect this term). We see this in the down-and-up curve for Romanian and the down-then-flat curve for Hungarian. The raw data, as well as preliminary visualisations created at earlier points in the process, can be found on GitHub: https://github.com/COST-ELTeC/innerlife/

"Why are the results for Serbian, referenced in the footnote, not included if they show similar trends?"

We only received data for the first round of experiments with the Serbian corpus, not for the revised method based on the six categories of verbs derived from the Tompkins et al.

"For the description of the third visualization category, what test statistic is calculated?"
Contrary to what would be expected, modernist novels do not show a significantly higher frequency of such verbs.
The paper is generally well written. However, some details are missing or are unclear.
For example, the section of the selection of verbs talks about "nearest neighbors", but it is not clear by what method that was operationalized (the GitHub repository shows that word embeddings were used, apparently).Furthermore, several references to "categories" are made on page 1, without introducing how they are derived from the Theory of Mind framework.
It is also unclear whether in the end there are different inner life verbs for each language, or whether translations of the same words are used.If only 3 verbs are selected per category for each language, the risk could be that these 3 words are not representative for the category.
It would have been ideal to use an already validated lexicon for the project. For example, the LIWC lexicon has been validated in independent studies (and has been translated into several languages).
In the current paper, it is not possible to separate the question of whether the lexicon is appropriate from the result for the research question.
Another aspect is the statistical methodology. It is not made explicit how statistical significance was tested. Also, the Methods section announces that a linear regression line will be plotted, but Figures 1 and 2 actually show a polynomial regression. Why was polynomial regression chosen rather than linear?
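
To make the linear-versus-polynomial question concrete, the following minimal sketch fits both a first-order and a second-order polynomial by ordinary least squares and compares their residual error. It is not the authors' code: the `polyfit` and `sse` helpers are written here for illustration, and the per-decade values are invented to resemble a down-and-up curve.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree] of c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / A[i][i]
    return coeffs


def sse(xs, ys, coeffs):
    """Sum of squared residuals of the fitted polynomial."""
    return sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )


# Hypothetical per-decade relative frequencies (made-up, U-shaped).
years = [1840, 1850, 1860, 1870, 1880, 1890, 1900, 1910]
freqs = [0.016, 0.014, 0.012, 0.011, 0.011, 0.012, 0.014, 0.016]
# Centre and rescale the years to keep the normal equations well conditioned.
xs = [(year - 1880) / 10 for year in years]

linear = polyfit(xs, freqs, 1)
quadratic = polyfit(xs, freqs, 2)
sse_linear = sse(xs, freqs, linear)
sse_quadratic = sse(xs, freqs, quadratic)
print(sse_linear, sse_quadratic)
```

A higher-order fit can never have a larger residual error than a lower-order one on the same data, so a smaller error alone does not justify the choice; it mainly shows which shapes (such as a dip followed by a rise) a straight line cannot represent.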
A suggestion: in Table 1 it would be more useful to show the percentage (or relative frequency) of inner life verbs, rather than absolute counts.There is a surprisingly large variance in word counts in the corpus, making absolute counts difficult to compare.
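
The normalisation suggested here is straightforward; a minimal sketch follows, assuming each corpus is summarised by a count of inner-life verbs and a total token count. The corpus names and numbers are placeholders, not values from Table 1.

```python
def relative_frequency(verb_count, token_count, per=1000):
    """Inner-life verbs per `per` tokens (default: per 1,000 tokens)."""
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    return verb_count * per / token_count


# Hypothetical (verb_count, token_count) pairs for two corpora of unequal size.
corpora = {
    "corpus_a": (12_000, 4_000_000),
    "corpus_b": (9_000, 1_500_000),
}
rates = {name: relative_frequency(v, t) for name, (v, t) in corpora.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} inner-life verbs per 1,000 tokens")
```

In this invented example the smaller corpus has the lower absolute count but the higher rate, which is the kind of reversal that absolute counts would hide.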
Finally, one possible explanation for the results is that modernism gives more attention to inner life not through explicit inner life verbs, but through devices such as free indirect discourse, in which the thoughts of characters are described without explicitly marking them as such.

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Yes

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes

belief in Literary studies regarding the importance of interiority to 20th-century literary fiction. This is not only an interesting subject of study in and of itself, but it also provides a fascinating test case for the ELTeC corpus. I believe that the article has a great deal of merit and will make an important addition to our understanding of both the limits of computational literary study and the role of characterization in Modernism; however, there are a few, relatively minor, aspects of the study that I would like the authors to address before I can recommend the text for publication.
Overall, my largest concern is the justification of the topic itself. I would not advocate for any substantive change to the methods employed by the authors, or their interpretations of their results; however, it will be important to include a bit more material justifying aspects of the study at the outset. For example, the authors suggest in their paragraph "Measuring modernism through 'inner life'" that Modernism is not a universal phenomenon and does not take place across all languages at the same time. They do suggest that there are observable "processes implying formal innovation in national literatures throughout Europe at different times", suggesting that they have underlying data which indicates the advent of Modernism in each language/cultural tradition represented by their corpus, but they do not make this data apparent.
It is unclear if all of the languages represented in the corpus experience a "Modernist" period, or just some, and what kinds of formal innovation take place (these may be confounding variables to the study). It would be helpful for the authors to add in a little extra material explaining the periodization of the corpus they are using and the results that lead them to hypothesize the presence of "Modernist" literature in each of the languages that their corpus contains. This would allow for a more fine-grained hypothesis and help justify the goals of the study overall.
Similarly, while the authors do an admirable job of explaining how they selected the verbs that may represent interiority, they do not explain how they treated these terms in the context of the different languages that the corpus contains. How were each of the verbs in the categories selected across languages? What efforts were made to ensure that the same selection of words would remain consistent despite the differences in meaning and frequency across the languages? Since part of the paper's goal is to experiment with a multi-lingual corpus, this would be a very helpful detail to include. I'm also curious about how the authors counted the verbs. From the paper, it appears that, once the verbs were selected, the authors counted their incidence across the texts. While this is an understandable method, I would like some small explanation of why the frequencies of the verbs themselves were counted across the text as a whole, rather than being first filtered by their attachment to descriptions of character (through, for example, a dependency parse). Such a filter would ensure that the verbs are only being counted when they are attached to a named entity (presumably a character) rather than used in other contexts. Again, I'm not asking the authors to do this, simply to reference their choice not to.
Aside from these two more substantive concerns, there are a small number of very minor points that the authors could augment or correct:

1. The Authors indicate that they use a "nearest neighbor" approach to building out their wordlist in one of the initial trials. Some more detail on what this approach was (a knn test, or something else) would be helpful.
2. In figure 1, the authors visualize the trends in their results using a second order polynomial as a best fit line: why did authors choose this particular regression rather than a simple linear one?
3. In their description of Heuser and Le-Khac's 2012 paper, the authors suggest that they found a trend to avoid telling in favor of showing from 1750-1950. The paper that they cite, however, only tracks this phenomenon in nineteenth century texts and their corpus stops in 1900. This is a relatively minor point, but considering that this paper is interested in modernism (which is often understood as a 20th-century phenomenon), this is an important caveat.

With these minor changes made (most of which just involve supplying the reader with some more information about the choices made by the authors), I would be happy to recommend this paper for indexing.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Partly

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: I am an Associate Professor of Digital Humanities in a Literary Studies department (English) with over 20 years of experience in the field referenced by the article. My research focuses on quantitative and computational literary analysis, nineteenth-century literature and aesthetics, and the history of the novel.
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
each language/cultural tradition represented by their corpus, but they do not make this data apparent. It is unclear if all of the languages represented in the corpus experience a "Modernist" period, or just some, and what kinds of formal innovation take place (these may be confounding variables to the study)."
In the revised version, we expanded on the connection between our project and Melanie Conroy's 2014 paper "Before the 'Inward Turn': Tracing Represented Thought in the French Novel (1800-1929)", which was previously mentioned briefly and parenthetically, but deserved a more thorough engagement, as it helpfully lays out a number of considerations that are also central to our own point of departure with regard to the overall topic and the methods employed. We have made more explicit and expanded on the concept of the "inward turn", which is commonly posited as a hallmark feature of literary modernism and which was previously referenced in more implicit ways within our paper.
• "It would be helpful for the authors to add in a little extra material explaining the periodization of the corpus they are using and the results that lead them to hypothesize the presence of "Modernist" literature in each of the languages that their corpus contains."
In the updated version, we provided examples of commonly accepted periodisations of literary modernism within some of the corpora, clarifying that the Portuguese corpus, which does not contain texts of writers who would unequivocally be termed "modernist" due to copyright issues, was included as a counter example and thus, a test case for the hypothesis. These periodisations are not based on data gathered via computational methods from literary corpora, but rather constitute a common consensus across literary histories of Europe (e.g. Walter Cohen's 2017 book A History of European Literature: The West and the World from Antiquity to the Present). We acknowledge that questions of periodisation are not clear-cut and that they represent a range of (often invisible) biases, preferences, and choices of their time (for example, with regard to the issue of canonisation). Both our own data (which are presented as a first step in a complex longer debate) and Conroy's findings, along with other recent publications along similar lines, suggest that computational methods and distant reading may be important tools for revisiting and critically questioning established narratives of literary history from fresh perspectives.

Comments relating to methodology
In the revised version, we expanded the section concerning methodology and aimed to clarify our approach further, attempting to answer the important questions raised in the report in the process.

"I would like some small explanation of why the frequencies of the verbs themselves were counted across the text as a whole, rather than being first filtered by their attachment to descriptions of character (through, for example, a dependency parse). Such a filter would ensure that the verbs are only being counted when they are attached to a named entity (presumably a character) rather than used in other contexts. Again, I'm not asking the authors to do this, simply to reference their choice not to."
Author response: While this method would certainly have been helpful, it could not be done reliably across languages, at least at the present time. In future iterations of this and/or similar projects, using tools such as those mentioned in Piper (2023), for example bookNLP or super-sense tagging, could certainly be a productive way forward and potentially yield more fine-grained results. At present, we worked with existing tags and metadata that were developed in the context of ELTeC and the COST Action Distant Reading for European Literary History.

Other comments
"The Authors indicate that they use a "nearest neighbor" approach to building out their wordlist in one of the initial trials. Some more detail on what this approach was (a knn test, or something else) would be helpful."
Author response: Originally, we planned to compare two different methods: one based on frequencies of inner-life verbs and one based on seed words enriched through word embeddings. We clarified further that the second approach, involving word embeddings, was eventually abandoned because it was not possible to reliably filter out the 'noise', issues of polysemy, and similar issues. The raw data of this approach is, however, included in the extended data.

"In figure 1, the authors visualize the trends in their results using a second order polynomial as a best fit line: why did authors choose this particular regression rather than a simple linear one."

Figure 1. Relative frequency of all inner-life verbs over time, per language (second-order polynomial regression line).

Figure 2. Relative frequency of inner-life verbs (affect category only) in the extended French data (1750-2000), with a second-order polynomial regression line.