Article

Linguistic Communication Channels Reveal Connections between Texts: The New Testament and Greek Literature

by
Emilio Matricciani
Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milan, Italy
Information 2023, 14(7), 405; https://doi.org/10.3390/info14070405
Submission received: 15 June 2023 / Revised: 10 July 2023 / Accepted: 12 July 2023 / Published: 14 July 2023
(This article belongs to the Special Issue Editorial Board Members’ Collection Series: "Information Processes")

Abstract

We studied two fundamental linguistic channels, the sentences channel and the interpunctions channel, and showed that they can reveal deeper connections between texts. The applied theory does not follow the current paradigm of linguistic studies. As a case study, we considered the Greek New Testament, with the purpose of determining mathematical connections between its texts and possible differences in the writing style (mathematically defined) of the writers and in the reading skill required of their readers. The analysis was based on deep-language parameters and communication/information theory. To set the New Testament texts in the larger Greek classical literature, we considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch. The results largely confirmed what scholars have found about the New Testament texts, therefore giving credibility to the theory. The Gospel according to John is very similar to the fables written by Aesop. Surprisingly, the Epistle to the Hebrews and Apocalypse are each other’s “photocopies” in the two linguistic channels and are not linked to any of the other texts. These two texts deserve further study by historians of early Christian church literature at the level of meaning, readers, and possible Old Testament texts that might have influenced them. The theory can guide scholars in studying any literary corpus.

1. A Mathematical Theory of Texts Outside the Paradigm of Natural Language Processing

In recent papers [1,2,3,4,5,6,7,8], we have developed a general theory on the deep-language mathematical structure of literary texts (or any long text), including their translation. The theory is based on linguistic communication channels—suitably defined—always contained in texts and based on the theory of regression lines [9,10] and Shannon’s communication and information theory [11].
In our theory, “translation” means not only the conversion of a text from one language to another (what is properly understood as translation) but also how some linguistic parameters of a text are related to those of another text, either in the same language or in another language. “Translation”, therefore, refers also to the case in which a text is mathematically compared (metaphorically “translated”) with another text, whatever the language of the two texts [2].
The theory does not follow the current paradigm of linguistic studies. Most studies on the relationships between texts concern translation because of the importance of automatic translation. Refs. [12,13,14,15,16,17,18] report results not based on mathematical analyses of texts, as our theory is, and when a mathematical approach is used, as in Refs. [19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51], most of these studies consider neither Shannon’s communication theory nor the fundamental connection that some linguistic variables seem to have with reading ability and short-term memory (STM) capacity [1,2,3,4,5,6,7,8]. In fact, these studies are mainly concerned with automatic translation, not with the high-level direct response of human readers, as our theory is. Very often, they refer only to one very limited linguistic variable, not to sentences that convey a completely developed thought or to deep-language parameters, as our theory does.
The theory allows one to perform experiments with ancient readers (otherwise impossible) or with modern readers, by studying the literary texts of their epoch. These “experiments” can reveal unexpected similarities and dependences between texts because they consider mathematical parameters not consciously controlled by writers, either ancient or modern, as we will also show in the present paper.
In addition to the total number of characters, words, sentences, and interpunctions (punctuation marks) of a text, the linguistic parameters considered in our theory are the number of words $n_W$ per chapter, the number of sentences $n_S$ per chapter, and the number of interpunctions $n_I$ per chapter. Instead of referring to chapters, the analysis can refer to any chosen subdivision of a literary text that is large enough to provide reliable statistics, such as a few hundred words [1,2,3,4,5,6,7,8].
We also consider four important deep-language parameters, calculated in each chapter (or in any large-enough block of text): characters per word $C_P$, words per sentence $P_F$, words per interpunction $I_P$, and interpunctions per sentence $M_F = P_F / I_P$ (this variable gives the number of word intervals $I_P$ contained in a sentence).
The parameter $I_P$, also referred to as the “words interval” (i.e., an “interval” measured in words [1]), is very likely linked to readers’ STM capacity [52], and it can be used to study how much two populations of readers of diverse languages overlap in reading a literary text in translation [7].
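As a minimal sketch of these definitions, the four deep-language parameters of one block of text can be computed from its raw counts. The function name and the counts below are hypothetical and only illustrate the ratios; how interpunctions are counted follows whatever convention produced the counts.

```python
def deep_language_parameters(n_chars, n_words, n_sentences, n_interpunctions):
    """Return (C_P, P_F, I_P, M_F) for one block of text, given its raw counts."""
    c_p = n_chars / n_words            # characters per word
    p_f = n_words / n_sentences        # words per sentence
    i_p = n_words / n_interpunctions   # words per interpunction (word interval)
    m_f = p_f / i_p                    # interpunctions per sentence
    return c_p, p_f, i_p, m_f


# Hypothetical counts for one chapter (not taken from the paper's tables).
c_p, p_f, i_p, m_f = deep_language_parameters(
    n_chars=3600, n_words=800, n_sentences=40, n_interpunctions=110
)
print(f"C_P={c_p:.2f}  P_F={p_f:.2f}  I_P={i_p:.2f}  M_F={m_f:.2f}")
```

Note that $M_F = P_F / I_P$ reduces to the number of interpunctions divided by the number of sentences in the block.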
To study the chaotic data that emerge in any language, the theory compares a text (the reference, or input text) with another text (output text, “cross-channel”) or with itself (“self-channel”) through a complex communication channel. This channel consists of several parallel single channels [4], two of which are explicitly considered in the present paper, and both its input and its output are affected by “noise”, i.e., by diverse scattering of the data around a mean linear relationship, namely, a regression line.
In [3] we have shown how much the mathematical structure of a literary text is saved or lost in translation. To make objective comparisons, we have defined a likeness index $I_L$, based on the probability and communication theory of noisy digital channels. We have shown that two linguistic parameters can be related by regression lines. This is a general feature of texts. If we consider the regression line linking $n_S$ (dependent variable) to $n_W$ (independent variable) in a reference text and the regression line linking the same parameters in another text, then $n_S$ of the first text can be linked to $n_S$ of the second text with another regression line without explicitly calculating its parameters (slope and correlation coefficient) from the samples because the mathematical problem has the same structure as the theory developed in Ref. [2].
In Ref. [4] we have applied the theory of linguistic channels to show how an author shapes a character speaking to diverse audiences by diversifying and adjusting (“fine tuning”) two important linguistic communication channels, namely, the sentences channel (S-channel) and the interpunctions channel (I-channel). The S-channel links $n_S$ of the output text to $n_S$ of the input text, for the same number of words. The I-channel links $M_F$ (i.e., the number of word intervals $I_P$) of the output text to $M_F$ of the input text, for the same number of sentences.
In Ref. [5] we have further developed the theory of linguistic channels by applying it to Charles Dickens’ novels and to other novels of the English literature and found, for example, that this author was very likely affected by King James’ New Testament.
In Ref. [6] we have defined a universal readability index, applicable to any alphabetical language, by including the readers’ STM capacity, modeled by $I_P$; in Ref. [7] we have studied the STM capacity across time and language, and in Ref. [8] we have studied the readability of a text across time and language.
In this paper, as the title claims, we further study linguistic communication channels, namely, S-channels and I-channels, and show that they can reveal deeper connections between texts. As a case study, we consider an important historical literary corpus, the Greek New Testament (NT), with the purpose of determining the mathematical connections between its books (in the following referred to as “texts”) and possible differences in the writing style (mathematically defined) of the writers and in the reading skill required of their readers. To set the NT texts in the Greek classical literature, we have considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch.
The analysis is based on the deep-language parameters and communication channels mentioned above, not explicitly known to the ancient writer/reader or, as well, to any modern writer/reader not acquainted with this theory.
After this introductory section, Section 2 recalls and defines the deep-language parameters of texts, Section 3 recalls the vector representation of texts, Section 4 summarizes the theory of linguistic communication channels, Section 5 defines the theoretical signal-to-noise ratio in linguistic channels (S-channels and I-channels), Section 6 defines the experimental signal-to-noise ratio in these channels, Section 7 recalls the likeness index of texts and defines the channels quadrants, Section 8 presents an extreme synthesis of the main findings, and Section 9 concludes and suggests future work. Appendix A and Appendix B report numerical tables.

2. Deep-Language Parameters of Texts

The original NT Greek texts were first processed manually to delete all notes, titles, and other textual material added by modern editors, leaving only the original texts, as was done in Ref. [53]. The original Greek texts of the New Testament were downloaded from the Tyndale House Greek New Testament (THGNT) at BibleGateway.com (last accessed on 31 May 2023).
Interpunctions were introduced by ancient readers acting as “editors” [54]. They were well-educated readers of the early Christian Church and very respectful of the original text and its meaning; therefore, they likely maintained a correct subdivision into sentences and into word intervals within sentences, so as not to distort the correct meaning and emphasis of the text. In other words, we can reasonably treat the interpunctions as if they had been introduced by the author.
In Ref. [53], we compared the Gospels according to Matthew (Mt), Mark (Mk), Luke (Lk), and John (Jh) and the book of Acts (Ac) by considering only deep-language parameters, not S-channels and I-channels, as we do in this paper. Moreover, we have presently enlarged our study case by including the Epistle to the Hebrews (Hb) and Apocalypse (Ap, known also as Revelation)—texts that show unexpected connections—and some texts written by the historians Polybius (Po), Plutarch (Pl), and Flavius Josephus (Fl) and by the story-teller Aesop (Ae) to set the NT in the larger classical Greek literature. These texts were downloaded from Greek and Roman Materials (tufts.edu) (last accessed on 31 May 2023).
The theory is very robust against slightly different versions of the Greek texts (e.g., of the New Testament) because it never considers meaning. If a word is missing or replaced with another one in the NT texts, or if a short passage is absent from a version, the statistical analysis is not significantly affected. This applies also to the quality of the Greek used both in the NT texts and in Josephus. This is a strength of the theory.
The samples used in the statistical analysis refer to chapters: for example, Matthew has 28 chapters; therefore, this text is described by 28 samples for each deep-language parameter. The lists of names (the “genealogy” of Jesus of Nazareth) in Matthew and in Luke have been deleted so as not to bias the statistical results. As in Refs. [1,2,3,4,5,6,7,8,53], samples were statistically weighted with the fraction of total words; therefore, in Matthew, which contains 18121 total words, Chapter 5, for example, has 824 words, and its weight is $824 / 18121 = 0.0455$, not $1 / 28 = 0.0357$. This choice is mandatory to prevent a short chapter (or, in general, a short text) from affecting the statistical results as much as a long one.
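The word-fraction weighting can be sketched as follows. Only the Matthew figures (824 words in Chapter 5 out of 18121) come from the text; the helper name and the two-chapter example are hypothetical.

```python
def word_weighted_mean(values, words_per_chapter):
    """Mean of per-chapter values, each weighted by its fraction of total words."""
    total_words = sum(words_per_chapter)
    return sum(v * w / total_words for v, w in zip(values, words_per_chapter))


# Weight of Matthew's Chapter 5, using the figures quoted in the text.
weight_ch5 = 824 / 18121
print(round(weight_ch5, 4))

# Hypothetical two-chapter text: the longer chapter dominates the mean.
mean_pf = word_weighted_mean(values=[20.0, 30.0], words_per_chapter=[100, 300])
print(mean_pf)
```

With equal weights the two-chapter mean would be 25; the word-fraction weighting pulls it toward the value of the longer chapter.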
After this processing, we obtained the mean values of $C_P$, $P_F$, $I_P$, and $M_F$ reported in Table 1 and the universal readability index $G_U$, defined and discussed in Ref. [6], here calculated with the mean values $\langle P_F \rangle$ and $\langle I_P \rangle$ from
$$G_U = 89 - 10 \langle C_P \rangle + \frac{300}{\langle P_F \rangle} - 6 \times \left( \langle I_P \rangle - 6 \right) \tag{1}$$
In Equation (1) we set $\langle C_P \rangle = 4.48$, the mean value found in the Italian literature, since Italian is the reference language in the definition of $G_U$ [1].
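A minimal sketch of the index, reading Equation (1) as $G_U = 89 - 10 \langle C_P \rangle + 300 / \langle P_F \rangle - 6 (\langle I_P \rangle - 6)$ with $\langle C_P \rangle$ fixed at 4.48. The function name and the input values are hypothetical; the sketch is not meant to reproduce the entries of Table 1 or Table 2.

```python
def universal_readability(p_f, i_p, c_p=4.48):
    """Universal readability index G_U from mean P_F and I_P (C_P fixed at 4.48)."""
    return 89 - 10 * c_p + 300 / p_f - 6 * (i_p - 6)


# Hypothetical mean values, chosen only to show the trend.
short_sentences = universal_readability(p_f=15.0, i_p=6.0)
long_sentences = universal_readability(p_f=30.0, i_p=8.0)
print(f"{short_sentences:.2f} vs {long_sentences:.2f}")
```

Longer sentences (larger $P_F$) and longer word intervals (larger $I_P$) both lower the index, i.e., they make the text harder to read.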
To set the NT texts in the Greek classical literature, we have considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch. The rationale for selecting these authors is the following: Aesop wrote texts (Fables) that may recall the parables of the Gospels for their brevity and similar narrative purpose and style, and Polybius, Flavius Josephus, and Plutarch were historians and therefore wrote essays narrating facts, as the Gospels partially do and Acts especially does. Table 2 lists the texts and the mean values of the deep-language parameters of these authors. These texts have been processed manually like the NT.
The mean values of Table 1 and Table 2 can be used for a first assessment of how “close”, or mathematically similar, texts are in a Cartesian plane, by defining a linear combination of deep-language parameters. Texts are then modeled as vectors, the representation of which is discussed in detail in [1,2,3,4,5,6] and briefly recalled in the next section.

3. Vector Representation of Texts

Let us consider the following six vectors, whose components are deep-language parameters: $\vec{R}_1 = (\langle C_P \rangle, \langle P_F \rangle)$, $\vec{R}_2 = (\langle M_F \rangle, \langle P_F \rangle)$, $\vec{R}_3 = (\langle I_P \rangle, \langle P_F \rangle)$, $\vec{R}_4 = (\langle C_P \rangle, \langle M_F \rangle)$, $\vec{R}_5 = (\langle I_P \rangle, \langle M_F \rangle)$, and $\vec{R}_6 = (\langle I_P \rangle, \langle C_P \rangle)$, and their resulting sum:
$$\vec{R} = \sum_{k=1}^{6} \vec{R}_k \tag{2}$$
By considering the coordinates $x$ and $y$ of Equation (2), we obtain the scatterplot of their ending points shown in Figure 1, where the coordinates $X$ and $Y$ are normalized so that Aesop’s Fables (Ae) is at the origin ($X = 0$, $Y = 0$) and Flavius Josephus’ The Jewish War (Fl) is at ($X = 1$, $Y = 1$).
In this Cartesian plane, two texts are likely connected—they show close ending points—if their relative Pythagorean distance is small and are likely not connected if their distance is large. In other words, a small distance means that the texts share a similar mathematical structure. This is a necessary, but not sufficient, condition for two texts being very likely connected to each other.
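The construction can be sketched as follows. The parameter values are hypothetical placeholders (not the entries of Table 1 or Table 2), and the normalization is assumed to be a separate linear rescaling of each axis that maps Aesop to (0, 0) and Flavius Josephus to (1, 1).

```python
import math


def resultant(c_p, p_f, i_p, m_f):
    """Sum of the six two-component vectors built from the mean parameters."""
    vectors = [(c_p, p_f), (m_f, p_f), (i_p, p_f), (c_p, m_f), (i_p, m_f), (i_p, c_p)]
    return sum(v[0] for v in vectors), sum(v[1] for v in vectors)


def normalize(point, origin, unit):
    """Rescale each axis so that `origin` maps to (0, 0) and `unit` to (1, 1)."""
    return (
        (point[0] - origin[0]) / (unit[0] - origin[0]),
        (point[1] - origin[1]) / (unit[1] - origin[1]),
    )


# Hypothetical mean values (C_P, P_F, I_P, M_F) for three texts.
aesop = resultant(4.5, 15.0, 6.0, 2.50)
josephus = resultant(5.2, 30.0, 8.0, 3.75)
other = resultant(4.8, 20.0, 7.0, 2.86)

point = normalize(other, aesop, josephus)
distance_to_aesop = math.dist(point, normalize(aesop, aesop, josephus))
print(point, distance_to_aesop)
```

A small Pythagorean distance between two normalized ending points is the necessary (not sufficient) condition for a connection discussed above.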
In Figure 1, the three synoptic Gospels (Mt, Mk, and Lk) are the closest texts of the NT. In particular, Mt and Lk are practically coincident, almost a mathematical “photocopy” of each other, as was also shown, with a different analysis, in Refs. [1,2]. Notice also that $G_U$ (Table 1) is very similar for the synoptics but not for the other NT texts (except Hebrews) and that John (Jh) is the most readable text.
Acts and Luke, although written by the same author, as widely accepted by scholars (see Refs. [55,56], a very small selection of the huge body of literature on this topic), are quite diverse. When Luke writes the Gospel, he works under significant constraints because his sources are very likely shared with Matthew. When he writes Acts, he has few or no sources to share with Matthew; therefore, he is free to use his personal writing style, oriented to narrating the early facts of the church. It is not surprising, therefore, that Acts, because of its contents, is closer to Plutarch and Polybius than to the synoptics and that its $G_U = 41.37$ is close to the $G_U = 45.53$ of Plutarch’s Parallel Lives (Table 1 and Table 2), which sheds some light on the similar reading skill required of the readers of these historical narrations.
John is distinctly different from Matthew, Luke, and Mark, but it is very close to Aesop’s Fables.
Unexpected is the vicinity of Hebrews and Apocalypse, two NT texts that scholars rarely consider to be connected [57,58,59,60], and their great distance from the Gospels. Their universal readability indices are also very similar, $G_U = 53.10$ for Hebrews and $G_U = 49.46$ for Apocalypse.
As for the Greek historians, we can notice that they are distinctly grouped and distant from the Gospels.
In conclusion, the vector modeling of texts can reveal first connections, otherwise hidden. These connections can be further addressed by studying their S-channels and I-channels and the likeness index $I_L$. Therefore, in the next section we first recall the theory of linguistic communication channels.

4. Theory of Linguistic Communication Channels

In a text, an independent (reference) variable $x$ (e.g., $n_W$ in S-channels) and a dependent variable $y$ (e.g., $n_S$) can be related by a regression line (slope $m$) passing through the origin of the Cartesian coordinates:
$$y = m x \tag{3}$$
Let us consider two diverse texts $Y_k$ and $Y_j$. For both we can write Equation (3) for the same pair of parameters; however, in both cases, Equation (3) does not give the full relationship between the two parameters because it links only the conditional mean values. We can write more general linear relationships, which take into account the scattering of the data around the regression lines (slopes $m_k$ and $m_j$); this scattering is measured by the correlation coefficients $r_k$ and $r_j$, not considered in Equation (3):
$$y_k = m_k x + n_k, \qquad y_j = m_j x + n_j \tag{4}$$
While Equation (3) connects the dependent variable $y$ to the independent variable $x$ only on the average, Equation (4) introduces additive “noise” $n_k$ and $n_j$, with zero mean value [2,3,4]. The noise is due to the correlation coefficient $|r| \neq 1$, not considered by Equation (3).
We can compare two texts by eliminating x . In other words, we compare the output variable y for the same value of the input variable x in the two texts. In the example just mentioned, we can compare the number of sentences in two texts—for an equal number of words—by considering not only the mean relationship (Equation (3)) but also the scattering of the data (Equation (4)).
As recalled before, we refer to this communication channel as the “sentences channel” and to this processing as “fine tuning” because it deepens the analysis of the data and provides more insight into the relationship between two texts. The mathematical theory follows.
By eliminating $x$ from Equation (4), we obtain the linear relationship between the sentences in text $Y_k$ (now the reference, input text) and the sentences in text $Y_j$ (now the output text):
$$y_j = \frac{m_j}{m_k} y_k - \frac{m_j}{m_k} n_k + n_j \tag{5}$$
Compared with the independent (input) text $Y_k$, the slope $m_{jk}$ is given by
$$m_{jk} = \frac{m_j}{m_k} \tag{6}$$
The noise source that produces the correlation coefficient between $Y_k$ and $Y_j$ is given by
$$n_{jk} = -\frac{m_j}{m_k} n_k + n_j = -m_{jk} n_k + n_j \tag{7}$$
The “regression noise-to-signal ratio”, $R_m$, due to $m_{jk} \neq 1$, of the channel is given by [2]
$$R_m = (m_{jk} - 1)^2 \tag{8}$$
The unknown correlation coefficient $r_{jk}$ between $y_j$ and $y_k$ is given by [2,9]
$$r_{jk} = \cos \left| \arccos(r_j) - \arccos(r_k) \right| \tag{9}$$
The “correlation noise-to-signal ratio”, $R_r$, due to $|r_{jk}| < 1$, of the channel that connects the input text $Y_k$ to the output text $Y_j$ is given by [1]
$$R_r = \frac{1 - r_{jk}^2}{r_{jk}^2} m_{jk}^2 \tag{10}$$
Because the two noise sources are disjoint and additive, the total noise-to-signal ratio of the channel connecting text $Y_k$ to text $Y_j$ is given by [2]
$$R = (m_{jk} - 1)^2 + \frac{1 - r_{jk}^2}{r_{jk}^2} m_{jk}^2 \tag{11}$$
Notice that Equation (11) can be represented graphically [2], to study the impact of $R_m$ and $R_r$ on $R$. Finally, the total signal-to-noise ratio is given by
$$\gamma_{th} = 1 / R, \qquad \Gamma_{th} = 10 \times \log_{10} \gamma_{th} \tag{12}$$
The last expression is in dB. Notice that no channel can yield $r_{jk} = 1$ and $m_{jk} = 1$ (i.e., $\Gamma_{th} = \infty$), a case referred to as the ideal channel, unless a text is compared with itself (self-comparison, self-channel). In practice, we always find $|r_{jk}| < 1$ and $m_{jk} \neq 1$. The slope $m_{jk}$ measures the multiplicative “bias” of the dependent variable compared with the independent variable; the correlation coefficient $r_{jk}$ measures how “precise” the linear best fit is.
In conclusion, the slope $m_{jk}$ is the source of the regression noise, and the correlation coefficient $r_{jk}$ is the source of the correlation noise of the channel.
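The chain from slopes and correlation coefficients to the signal-to-noise ratio in dB can be sketched in a few lines. The function names are hypothetical. The numerical check uses the S-channel values quoted later in the text for Luke as input and Matthew as output ($m_{jk} = 1.0180$, $r_{jk} = 0.9938$), which should give about 18.76 dB.

```python
import math


def snr_db_from_channel(m_jk, r_jk):
    """Total signal-to-noise ratio (dB) from the channel slope and correlation."""
    r_m = (m_jk - 1) ** 2                            # regression noise-to-signal ratio
    r_r = (1 - r_jk ** 2) / r_jk ** 2 * m_jk ** 2    # correlation noise-to-signal ratio
    noise = r_m + r_r
    if noise == 0:                                   # ideal channel (self-comparison)
        return math.inf
    return 10 * math.log10(1 / noise)


def snr_db(m_j, r_j, m_k, r_k):
    """Same quantity, starting from the regression lines of output j and input k."""
    m_jk = m_j / m_k
    r_jk = math.cos(abs(math.acos(r_j) - math.acos(r_k)))
    return snr_db_from_channel(m_jk, r_jk)


gamma_luke_to_matthew = snr_db_from_channel(m_jk=1.0180, r_jk=0.9938)
print(f"{gamma_luke_to_matthew:.2f} dB")
```

Swapping input and output replaces $m_{jk}$ with its reciprocal while leaving $r_{jk}$ unchanged, which is why the two directions of a channel generally give slightly different values.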
In the next section we study how sentences and interpunctions build S-channels and I-channels and calculate their theoretical signal-to-noise ratio.

5. S-Channels and I-Channels: Theoretical Signal-to-Noise Ratio $\Gamma_{th}$

In S-channels the number of sentences of two texts is compared for the same number of words. Therefore, they describe how many sentences the writer of text $j$ uses to convey a meaning, compared with the writer of text $k$ (who may convey, of course, a diverse meaning), by using the same number of words. Simply stated, it is all about how a writer shapes his/her style in communicating the full meaning of a sentence with a given number of words available; therefore, it is more linked to $P_F$ than to other parameters.
In I-channels the number of word intervals $I_P$ of two texts is compared for the same number of sentences. Therefore, they describe how many short texts (the text between two contiguous punctuation marks) two writers use to make a full sentence. Since $I_P$ is connected with short-term memory [1], I-channels are more related to readers’ STM capacity than to authors’ style.
Finally, notice that the universal readability index, Equation (1), depends on both $P_F$ and $I_P$; therefore, it can better measure reading difficulty, as discussed in Ref. [6].
To apply the theory of Section 4, we need the slope $m$ and the correlation coefficient $r$ of the regression line between (a) $n_S$ and $n_W$ to study S-channels and (b) $n_I$ and $n_S$ to study I-channels. We first consider the NT and then the texts from the Greek literature.

5.1. New Testament

Table 3 reports the slope $m$ and the correlation coefficient $r$ of the regression line in the NT texts. In Matthew, for example, if we set $n_W = 100$ words, then the text, on the average, contains $n_S = 100 \times 0.0508 = 5.08$ sentences and $n_I = 2.7271 \times 5.08 = 13.85$ interpunctions.
Figure 2 and Figure 3 show the scatterplots and regression lines linking $n_S$ to $n_W$, and Figure 4 and Figure 5 show those linking $n_I$ to $n_S$. By looking at these figures, we can see at a glance which texts have very similar regression lines, but it is more difficult to see whether the scattering of data is similar or not.
Regression lines, however, consider and describe only one aspect of the linear relationship, namely, that concerning (conditional) mean values. They do not consider the other aspect of the relationship, namely, the scattering of data, which may not be similar when two regression lines almost coincide, as it is clearly shown in Figure 2 in Mark and John, in Matthew and Luke and in Hebrews and Apocalypse. The theory of linguistic channels (Section 4), on the contrary, by considering both slopes and correlation coefficients, provides a reliable tool to fully compare two sets of data and can confirm the findings shown in Figure 1.
As an example, Table 4 reports the calculated values of $m_{jk}$ (Equation (6)) and $r_{jk}$ (Equation (9)) in S-channels and in I-channels by assuming Matthew as the output text and the others as input texts. For instance, the number of sentences in Matthew (text $Y_j$) is linked to the sentences in Luke (text $Y_k$), for the same number of words, with a regression line with slope $m_{jk} = 1.0180$ and correlation coefficient $r_{jk} = 0.9938$. In other terms, 100 sentences in Luke give $1.0180 \times 100 = 101.80$ sentences in Matthew, for the same number of words. The number of interpunctions in Matthew (text $Y_j$) is linked to the interpunctions in Luke (text $Y_k$), for the same number of sentences, with a regression line with $m_{jk} = 0.9638$ and $r_{jk} = 0.9960$.
Let us calculate the theoretical signal-to-noise ratio $\Gamma_{th}$ obtained in S-channels and in I-channels. Table 5 (S-channel) and Table 6 (I-channel) report $\Gamma_{th}$ (dB) between the input text indicated in the first column and the output text indicated in the first row.
Let us examine in detail some results.
In S-channels (Table 5), if the input is Matthew (first column) and the output is Luke (fourth column, channel Matthew → Luke), then $\Gamma_{th} = 19.06$ dB; vice versa, if the input is Luke and the output is Matthew (Luke → Matthew), then $\Gamma_{th} = 18.76$ dB, which is the typical asymmetry present in literary texts [2,3,4,5].
In I-channels (Table 6), we read $\Gamma_{th} = 19.94$ dB in Matthew → Luke and $\Gamma_{th} = 20.53$ dB in Luke → Matthew. These results say not only that the asymmetry is very small but, more importantly, that the S-channel and the I-channel are practically identical, with $\Gamma_{th} \approx 19$ to $20$ dB, therefore confirming that the very small distance between Matthew and Luke shown in Figure 1 is not due to chance. From the point of view of communication theory, therefore, Matthew and Luke appear as each other’s mathematical “photocopies”.
Luke and Acts, both universally attributed to Luke [55,56,57,58,59,60,61,62,63,64,65], have very similar $\Gamma_{th}$ in the S-channel: $\Gamma_{th} = 15.14$ dB in Luke → Acts and $\Gamma_{th} = 13.44$ dB in Acts → Luke. These values are low enough to agree with the large distance shown in Figure 1; therefore, the style used in the two texts is significantly diverse, in agreement with the diverse values $\langle P_F \rangle = 20.47$ in Luke and $\langle P_F \rangle = 25.47$ in Acts. On the contrary, the large and practically identical values in the I-channel, $\Gamma_{th} = 27.93$ dB in Luke → Acts and $\Gamma_{th} = 27.56$ dB in Acts → Luke, indicate that the readers addressed by these texts may even coincide, as far as their STM capacity is concerned.
The example just discussed illustrates the following point. Since $M_F = P_F / I_P$, texts with similar $\langle M_F \rangle$ (as in the above example, namely, $\langle M_F \rangle = 2.89$ in Luke and $\langle M_F \rangle = 2.91$ in Acts) have $\langle I_P \rangle$ that grows with $\langle P_F \rangle$. However, $I_P$ can rarely exceed the upper value of 9 of Miller’s law [52] because, as sentences grow long, the writer (who is, of course, also a reader of his/her own text) unconsciously introduces more interpunctions, therefore limiting $I_P$ to Miller’s range [1]. Consequently, $\langle I_P \rangle$ is larger in Acts (8.77) than in Luke (7.11).
Hebrews and Apocalypse are completely disconnected from the other NT texts in the S-channel but not from each other. These two texts unexpectedly coincide in the S-channels, in both the slope and the correlation coefficient (Table 7 and Table 8). This coincidence produces very large signal-to-noise ratios (Table 5 and Table 6), namely, $\Gamma_{th} = 42.61$ dB in Hebrews → Apocalypse and $\Gamma_{th} = 42.68$ dB in Apocalypse → Hebrews, practically the same value (i.e., about 18,500 in linear units). The texts share the same style, with $\langle P_F \rangle = 32$ in Hebrews and $\langle P_F \rangle = 30.70$ in Apocalypse; therefore, the two datasets, in this channel, seem to be produced by the same source.
In the I-channel, Hebrews and Apocalypse are also completely disconnected from the other NT texts, but they are significantly connected to each other because $\Gamma_{th} = 15.25$ dB in Hebrews → Apocalypse and $\Gamma_{th} = 13.92$ dB in Apocalypse → Hebrews.
Finally, notice that the four Gospels are closer to each other than to the other texts.

5.2. Greek Literature

For the Greek literature, Table 9 reports the slope $m$ and the correlation coefficient $r$ of the regression lines of $n_S$ versus $n_W$ and of $n_I$ versus $n_S$. Table 10 (S-channels) and Table 11 (I-channels) report $\Gamma_{th}$. The data referring to John are also reported for comparison with Aesop’s Fables because of their vicinity in the vector plane (Figure 1).
Let us examine the connection of John with Fables. Figure 6 shows the scatterplots and regression lines between $n_W$ (words, independent variable) and $n_S$ (sentences, dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line). Notice that the two regression lines are practically superposed, and the scattering of the two sets is very alike.
Figure 7 shows the scatterplots and regression lines between $n_S$ (sentences, independent variable) and $n_I$ (interpunctions, dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line). In this case, it is clear they do not share the slope.
Table 9 reports the slope and correlation coefficient of the regression lines. From these data we calculate $\Gamma_{th}$, according to Section 4, reported in Table 10 (S-channels) and Table 11 (I-channels).
John and Aesop share a large $\Gamma_{th}$ in the S-channel and a significant $\Gamma_{th}$ in the I-channel; therefore, this “fine tuning” clarifies that the vicinity of the two ending points in Figure 1 is due more to a shared style than to a shared readers’ STM capacity.
In conclusion, S-channel results suggest that John’s style was likely affected by Fables, or by the particular type of story-telling, while the I-channel results suggest that John’s readers were not different, as far as their STM capacity is concerned, from the readers of the other texts listed (see the last column in Table 11).
As for the historians, Flavius Josephus shares the style of Polybius more than that of the other writers (Table 10), and his readers share the same STM capacity as Polybius’ readers since $\Gamma_{th} = 30.80$ dB in the I-channel Polybius → Flavius Josephus and $\Gamma_{th} = 30.56$ dB in Flavius Josephus → Polybius (Table 11).

5.3. Issues and Solutions

At this stage, however, as discussed in Ref. [3], important issues arise, likely due to the small sample size used in calculating the regression line parameters, especially for the NT texts, and some questions must be answered.
Is the large and unexpected $\Gamma_{th}$ in the channels between Hebrews and Apocalypse just due to chance, or is it due to a real likeness of the two texts? How can we assess whether these values are reliable? It is practically impossible to estimate the probability distributions of the parameters $m$ and $r$ of the regression lines of Table 3 because the texts available are very few. If Matthew had written, say, hundreds of texts, then we could attempt an analysis based on probability, but this is not the case, of course, and we are in the same situation for many ancient or modern authors.
In fact, because of the small sample size used in calculating a regression line, the slope $m$ and the correlation coefficient $r$, being stochastic parameters, are characterized by mean values and standard deviations, which depend on the sample size [9]. Obviously, the theory would yield more precise estimates of the signal-to-noise ratio $\Gamma_{th}$ for larger sample sizes, as can be assumed for the Greek literature.
With a small sample size, the standard deviations of $m$ and $r$ can give too large a variation in $\Gamma_{th}$ (see the sensitivity of this parameter to $m$ and $r$ in [3]). To avoid this inaccuracy, which is due to the small sample size and not to the theory of Section 4, we have defined and discussed in [3,4] a “renormalization” of the texts and their subsequent analysis, based on Monte Carlo (MC) simulations of multiple texts attributed to the same writer, whose results can be considered “experimental”. Therefore, in the case of texts with small sample sizes for which we suspect $\Gamma_{th}$ is due only to chance, as it may be with Hebrews and Apocalypse, the results of the simulation can replace the theoretical values.
In addition to the usefulness of the simulation as a “renormalization” tool, there is another property—very likely more interesting—of the generated new texts. In fact, since the mathematical theory does not consider meaning, these new texts could have been “written” by the author because they maintain the main statistical properties of the original text. In other words, they are “literary texts” that the author might have written at the time when he/she wrote the original text. Based on this hypothesis, we can consider a large number of texts for each author. With this strategy, we think we have solved these issues in Ref. [3]. In the next section we recall the rationale of the MC simulation.

6. S-Channels and I-Channels: Experimental Signal-to-Noise Ratio $\Gamma_{ex}$

In this section, after recalling the Monte Carlo simulation steps to obtain the new texts attributed to the same author, we examine S-channels and I-channels.

6.1. Multiple Versions of a Text: Monte Carlo Simulation

Let the literary text $Y_j$ be the “output”, of which we consider $n$ disjoint block texts (e.g., chapters), and let us compare it with a particular input literary text $Y_k$ characterized by a regression line, as detailed in Section 4. The steps of the MC simulation are the following (here explicitly described for S-channels):
  1. Generate $n$ independent integers ($n$ being the number of disjoint block texts, e.g., chapters, 28 in Matthew) from a discrete uniform probability distribution in the range 1 to $n$, with replacement, i.e., a block text can be selected more than once.
  2. “Write” another “text $Y_j$” with $n$ new block texts, e.g., the sequence 2, 1, $n$, $n-2$; hence, take block text 2, followed by block text 1, block text $n$, block text $n-2$, and so on up to $n$ block texts. A block text can appear twice (with probability $1/n^2$), three times (with probability $1/n^3$), etc., and the new “text $Y_j$” can contain more or fewer words than the original text; however, the differences are small on average and do not affect the final statistical results and analysis.
  3. Calculate the parameters $m_j$ and $r_j$ of the regression line between words (independent variable) and sentences (dependent variable) in the new “text $Y_j$”, namely, Equation (1).
  4. Compare $m_j$ and $r_j$ of the new “text $Y_j$” (output, dependent text) with any other text (input, independent text, $m_k$ and $r_k$), in the “cross-channels” so defined, including the original text $Y_j$ (this latter case is referred to as the “self-channel”).
  5. Calculate $m_{jk}$, $r_{jk}$, and $\Gamma_{cross,ex}$ of the cross-channels or $\Gamma_{self,ex}$ of the self-channel according to the theory of Section 4.
  6. Consider the signal-to-noise ratios obtained as “experimental” results.
  7. Repeat steps 1 to 6 many times to obtain reliable results (we repeated the sequence 5000 times, ensuring a standard deviation of the mean value of less than about 0.1 dB).
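The resampling part of this procedure (steps 1 to 3) can be sketched in Python as follows. This is a minimal illustration, not the code used for the paper: the block counts are invented, the function names are hypothetical, and the regression line is assumed to pass through the origin ($y = m x$), because Equation (1) is defined outside this section. Steps 4 to 6 require the channel theory of Section 4 and are not reproduced here.

```python
import math
import random


def resample_blocks(n_blocks, rng):
    """Step 1: draw n_blocks indices uniformly in 0..n_blocks-1, with replacement."""
    return [rng.randrange(n_blocks) for _ in range(n_blocks)]


def slope_and_correlation(words, sentences):
    """Step 3: slope of a line through the origin (assumed form of Equation (1))
    and Pearson correlation coefficient between words and sentences."""
    n = len(words)
    m = sum(x * y for x, y in zip(words, sentences)) / sum(x * x for x in words)
    mean_x = sum(words) / n
    mean_y = sum(sentences) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(words, sentences))
    sxx = sum((x - mean_x) ** 2 for x in words)
    syy = sum((y - mean_y) ** 2 for y in sentences)
    # A resample made of one repeated block has no spread: r is undefined.
    r = sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")
    return m, r


def simulate(words, sentences, n_runs, seed=0):
    """Steps 1 to 3 repeated n_runs times; returns a list of (m_j, r_j) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        idx = resample_blocks(len(words), rng)        # step 1
        new_words = [words[i] for i in idx]           # step 2: the new "text"
        new_sentences = [sentences[i] for i in idx]
        results.append(slope_and_correlation(new_words, new_sentences))  # step 3
    return results


# Invented counts per block (chapter); these are NOT New Testament data.
words = [820, 640, 910, 700, 1005, 560, 880, 760]
sentences = [41, 30, 47, 33, 52, 27, 45, 36]
pairs = simulate(words, sentences, n_runs=1000)
mean_m = sum(m for m, _ in pairs) / len(pairs)
print(f"mean slope over {len(pairs)} simulated texts: {mean_m:.4f}")
```

As a plausibility check on step 7, assuming independent repetitions: with a standard deviation of the signal-to-noise ratio of at most about 7 dB (Appendix A), 5000 repetitions give a standard deviation of the mean of about $7/\sqrt{5000} \approx 0.1$ dB.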
In conclusion, the MC simulation substitutes a probability study on the joint density function of m and r on real texts, not available in such a large number. Let us now apply the MC simulation to the NT texts.

6.2. S-Channels and I-Channels

Figure 8 shows $\langle\Gamma_{cross,ex}\rangle$ and $\langle\Gamma_{self,ex}\rangle$ for each NT output text and input text, for S-channels (upper panel) and I-channels (lower panel). The mean and standard deviation values are reported in Appendix A because they are needed in Section 7.
From Figure 8, or from Appendix A, we can notice that in S-channels, if the input is Matthew and the output is Luke (blue line), then $\Gamma_{cross,ex} = 20.52$ dB; vice versa, if the input is Luke and the output is Matthew (black line), then $\Gamma_{cross,ex} = 19.68$ dB. If the input is Matthew and the output is Matthew (self-channel), then $\Gamma_{self,ex} = 25.01$ dB. In this case we compare Matthew with 5000 “new” Matthews obtained randomly. Notice that $\Gamma_{self,ex} > \Gamma_{cross,ex}$.
The Gospels are clearly distinguishable from the other texts, especially from Hebrews and Apocalypse, which can be confused with each other. Notice that $\Gamma_{self,ex} = 15.66$ dB for Hebrews and $\Gamma_{self,ex} = 19.76$ dB for Apocalypse are very close to $\Gamma_{cross,ex} = 15.73$ dB and $\Gamma_{cross,ex} = 19.64$ dB, respectively; therefore, the striking theoretical similarity of the two texts found in Section 5 (Table 5) is confirmed.
Notice that the Gospels differ quite significantly from Acts, Hebrews, and Apocalypse and that they are very similar to each other, therefore confirming, with this “fine-tuning”, the findings shown in Figure 1.
Let us discuss the results for I-channels (lower panel). For example, if the input is Matthew and the output is Luke, then $\Gamma_{cross,ex} = 20.46$ dB; vice versa, if the input is Luke and the output is Matthew, then $\Gamma_{cross,ex} = 21.23$ dB. If the input is Matthew and the output is Matthew, then $\Gamma_{self,ex} = 26.63$ dB, close to the value obtained in the S-channel. As in S-channels, $\Gamma_{self,ex} > \Gamma_{cross,ex}$.
The Gospels are very similar to each other and are clearly distinguished from Hebrews and Apocalypse, which confirms, also in this channel, what is shown in Figure 1. Finally, notice that also in the I-channel, Hebrews and Apocalypse are always the most similar texts.
In the next sub-section we compare $\langle\Gamma_{ex}\rangle$ with $\Gamma_{th}$ because this comparison gives fundamental insight into the range in which $\Gamma_{th}$ is reliable.

6.3. $\Gamma_{th}$ Versus $\Gamma_{ex}$ and Minimum Reliable Range of $\Gamma_{th}$

As done in Ref. [3], it is very interesting to compare $\Gamma_{th}$ with $\Gamma_{ex}$. This comparison gives the minimum range in which $\Gamma_{th}$ is reliable.
Figure 9 shows $\langle\Gamma_{ex}\rangle$ versus $\Gamma_{th}$ in S-channels, for self- and cross-channels (a), and the difference $\Gamma_{th} - \Gamma_{ex}$ versus $\Gamma_{th}$ (b). This difference represents the ratio (expressed in dB) between the noise power in the experimental channel and that in the theoretical channel. As in Ref. [3], we notice that the two signal-to-noise ratios are very well correlated up to a maximum value set by $\langle\Gamma_{self,ex}\rangle$, here at about 20 to 22 dB (horizontal asymptote), beyond which $\langle\Gamma_{ex}\rangle$ cannot follow the large increase in $\Gamma_{th}$, which reaches about 42 dB in Hebrews and Apocalypse.
Figure 10 shows $\langle\Gamma_{ex}\rangle$ versus $\Gamma_{th}$ and $\Gamma_{th} - \Gamma_{ex}$ versus $\Gamma_{th}$ in I-channels. We notice the same behavior as in S-channels but with the asymptote set at about 24 dB.
From these figures we can draw the following conclusions:
(1) There is a horizontal asymptote that sets the maximum reliable value of $\Gamma_{th}$, given by the largest $\langle\Gamma_{self,ex}\rangle$.
(2) In this range the MC simulation is not indispensable, because $\Gamma_{th}$, calculated from Equation (12), is reliable. However, MC simulations are very useful to calculate the likeness index [3], which is based on a large number of texts an author might have written.
(3) The theory can predict large values, as in Hebrews and Apocalypse, but we may suspect they are just due to chance because of the large sensitivity of $\Gamma_{th}$ to slopes and correlation coefficients, as discussed in Ref. [3]. Therefore, a cautionary (pessimistic) choice is to take $\Gamma_{ex}$ in place of $\Gamma_{th}$.
(4) The difference $\Gamma_{th} - \Gamma_{ex}$, i.e., the ratio (expressed in dB) between the noise power in the experimental channel and that in the theoretical channel, tends to be constant before saturation; afterward, it increases linearly, therefore indicating the end of the reliable range of $\Gamma_{th}$.
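The reading of the difference of the two signal-to-noise ratios as a noise-power ratio follows in one line if both ratios refer to the same signal power, which is an assumption made here only for illustration; $S$, $N_{th}$, and $N_{ex}$ are symbols introduced for this sketch and are not defined in the paper:

```latex
\Gamma_{th} - \Gamma_{ex}
  = 10\log_{10}\frac{S}{N_{th}} - 10\log_{10}\frac{S}{N_{ex}}
  = 10\log_{10}\frac{N_{ex}}{N_{th}} \quad \text{(dB)}
```

Under this assumption, a difference of 20 dB, for example, corresponds to an experimental noise power 100 times larger than the theoretical one.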
In the next section we calculate the likeness index of texts and define a useful graphical tool, the “channels quadrants”.

7. Likeness Index of Texts and Channels Quadrants

In Ref. [3] we explored a way of comparing the signal-to-noise ratios $\Gamma_{ex}$ (in dB) of self- and cross-channels objectively and of possibly obtaining more insight into the texts’ mathematical likeness. In comparing a self-channel with a cross-channel, the probability of mistaking one text for another is a binary problem because a decision must be made between two alternatives. The problem is classical in binary digital communication channels affected by noise. In digital communication, “error” means that bit 1 is mistaken for bit 0 or vice versa; therefore, the channel performance worsens as the error frequency (i.e., the error probability) increases. In linguistic self- and cross-channels, however, “error” means that a text can be more or less mistaken for, or confused with, another text; consequently, two texts are more similar as the “error probability” increases. Therefore, a large error probability means that two literary texts are mathematically similar.
We first recall the theory of likeness index and then define the “channel quadrants”, a graphical tool that classifies texts, with the aim of showing how much the writers’ style and the readers’ STM capacity are matched.

7.1. Likeness Index

In digital communication channels affected by noise, the probability of error is given by [3]
$$p_e = 0.5\left[\int_{T_{min}}^{\infty} g_0(\Gamma_{ex,cross})\, d\Gamma_{ex,cross} + \int_{-\infty}^{T_{min}} g_1(\Gamma_{ex,self})\, d\Gamma_{ex,self}\right] \tag{13}$$
In Equation (13), $\Gamma_{ex,cross}$ and $\Gamma_{ex,self}$ are modeled as Gaussian random variables with the means and standard deviations given in Appendix A. The decision threshold, $T_{min}$, is given by the intersection of the two known probability density functions $g_0(y)$ (cross-channel) and $g_1(y)$ (self-channel). The integral limits are fixed as shown because, in general, $\Gamma_{ex,cross} \leq \Gamma_{ex,self}$.
If $p_e = 0$, there is no intersection between the two densities: their mean values are centered at $-\infty$ and $+\infty$, respectively, or the two densities have collapsed to Dirac delta functions. If $p_e = 0.5$, the two densities are identical, e.g., a self-channel is compared with itself. In conclusion, $0 \leq p_e \leq 0.5$; therefore, if $p_e = 0$, the cross- and self-channels can be considered totally uncorrelated, and if $p_e = 0.5 = p_{e,max}$, the self- and cross-channels coincide, and the two texts are mathematically identical.
The likeness index $I_L$ is defined by
$$I_L = \frac{p_e}{p_{e,max}} \tag{14}$$
The likeness index ranges over $0 \leq I_L \leq 1$; $I_L = 0$ means totally uncorrelated texts, and $I_L = 1$ means totally correlated texts.
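The calculation of the likeness index from two Gaussian densities can be sketched in Python as follows. This is an illustrative reading of Equations (13) and (14), not the author's code. The function names are hypothetical, and one detail is an assumption: two Gaussians with different standard deviations cross at two points, and the sketch takes as $T_{min}$ the crossing that gives the smaller error probability. The inputs are values quoted in this paper: Matthew → Luke in the S-channel (Table A1) and the Aesop → John channels discussed in Section 7.2, with John's self-channel values from Appendix A.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def intersections(mean0, std0, mean1, std1):
    """Points where the two Gaussian densities are equal.

    Equating the logarithms of the densities gives a*y^2 + b*y + c = 0.
    """
    a = 1.0 / (2.0 * std1**2) - 1.0 / (2.0 * std0**2)
    b = mean0 / std0**2 - mean1 / std1**2
    c = (mean1**2 / (2.0 * std1**2) - mean0**2 / (2.0 * std0**2)
         + math.log(std1 / std0))
    if a == 0.0:  # equal standard deviations: at most one crossing
        return [-c / b] if b != 0.0 else []
    disc = math.sqrt(max(0.0, b * b - 4.0 * a * c))
    return [(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]


def likeness_index(mean_cross, std_cross, mean_self, std_self):
    """I_L = p_e / 0.5, with p_e evaluated as in Equation (13)."""
    # 1.0 covers identical densities, for which p_e = 0.5 and I_L = 1.
    candidates = [1.0]
    for t in intersections(mean_cross, std_cross, mean_self, std_self):
        cross_above = 1.0 - normal_cdf(t, mean_cross, std_cross)
        self_below = normal_cdf(t, mean_self, std_self)
        candidates.append(cross_above + self_below)
    # Assumption: T_min is the crossing that minimizes the error probability.
    p_e = 0.5 * min(candidates)
    return p_e / 0.5


# Means and standard deviations in dB, as quoted in the paper.
il_s_matthew_luke = likeness_index(20.52, 5.81, 24.82, 6.53)
il_s_aesop_john = likeness_index(23.23, 6.7, 24.39, 7.07)
il_i_aesop_john = likeness_index(19.91, 0.70, 28.19, 6.15)
print(il_s_matthew_luke, il_s_aesop_john, il_i_aesop_john)
```

Under these assumptions the three results come out close to the values reported in the paper for the same channels ($I_L = 0.724$, $0.930$, and $0.150$, respectively).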

7.2. Channels Quadrants

Some insight into the “fine-tuning” (i.e., the matching of the writers’ style and the readers’ STM capacity) and into the relationship between texts can be visualized through the “channel quadrants” shown in Figure 11. In quadrant IV, the S-channels of two texts are significantly similar, and the texts coincide along the vertical line $x = 1$. Similarly, in quadrant II, the I-channels are significantly similar, and the texts coincide along the horizontal line $y = 1$. In quadrant III, the two texts can be considered unmatched, and they are completely uncorrelated at the origin (0,0). Finally, in quadrant I, the two texts are very much matched in both channels and fully matched at (1,1); therefore, at this point, the two texts are mathematically indistinguishable.
Figure 12 shows the scatterplot of $I_L$ of the I-channel (ordinate) versus $I_L$ of the S-channel (abscissa) for the NT. The numerical values are reported in Appendix B. We can notice that only 19.0% of the cases have good matching in both channels (quadrant I), 21.4% have good matching only in the I-channel (quadrant II), 54.8% have poor matching in both channels (quadrant III), and 4.8% have good matching only in the S-channel (quadrant IV).
The marginal probabilities are $P(I_L \geq 0.5) = 23.8\%$ in the S-channel and $P(I_L \geq 0.5) = 40.4\%$ in the I-channel. This fact, together with the other percentages, marks some interesting differences between the S-channels and I-channels.
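The quadrant assignment can be sketched in Python as follows. The function name is hypothetical, the threshold of 0.5 is the quadrant border discussed in the text, and the treatment of a value exactly equal to 0.5 (counted here as not exceeding the threshold) is an assumption. The four example pairs are taken from Appendix B.

```python
from collections import Counter


def quadrant(il_s, il_i, threshold=0.5):
    """Quadrant of a pair (I_L in the S-channel, I_L in the I-channel)."""
    s_high = il_s > threshold
    i_high = il_i > threshold
    if s_high and i_high:
        return "I"    # matched in both channels
    if i_high:
        return "II"   # matched only in the I-channel
    if s_high:
        return "IV"   # matched only in the S-channel
    return "III"      # poorly matched in both channels


# (input, output): (I_L in S-channel, I_L in I-channel), from Tables A3 and A4
examples = {
    ("Matthew", "Luke"): (0.724, 0.716),
    ("Mark", "John"): (0.846, 0.090),
    ("Acts", "Luke"): (0.136, 0.812),
    ("Hebrews", "Matthew"): (0.008, 0.010),
}

counts = Counter(quadrant(s, i) for s, i in examples.values())
shares = {q: counts[q] / len(examples) for q in ("I", "II", "III", "IV")}
for pair, (s, i) in examples.items():
    print(pair, "-> quadrant", quadrant(s, i))
```

The marginal percentages quoted in the text are sums of quadrant shares: 19.0% + 4.8% = 23.8% for the S-channel (quadrants I and IV) and 19.0% + 21.4% = 40.4% for the I-channel (quadrants I and II).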
Table 12 and Table 13 report the average values of $I_L$ of the two asymmetric channels (e.g., Matthew → Luke and Luke → Matthew; see Appendix B) in S-channels and in I-channels, respectively.
For S-channels, we notice a large $I_L = 0.707$ between Matthew and Luke, a very large $I_L = 0.914$ between Mark and John, and a very large and unexpected $I_L = 0.996$ between Hebrews and Apocalypse. All these values are reliable because they are based on $\Gamma_{ex}$.
We can notice that the mathematical similarity of Matthew and Luke, already observed, is further reinforced by noting they are quite similar in both channels. Another interesting fact to notice is the high likeness index between Mark and John, who, according to scholars [64,65], share some similar Greek.
For I-channels, there are confirmations and differences compared with S-channels. Recall that I-channels are more concerned with the readers’ STM capacity than with the authors’ style. The very large $I_L$ between Hebrews and Apocalypse in the S-channel is not matched in the I-channel, although the I-channel value is large enough ($I_L = 0.697$) to link the two groups of readers.
Very insightful is the large $I_L = 0.863$ between Luke and Acts, both texts written by Luke, who very likely addressed, as already mentioned, similar groups of readers. Further, notice that Acts is very close to all other texts, except Hebrews and Apocalypse, which means that Acts likely addressed all the early Christians.
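The averages quoted above can be reproduced from the asymmetric values of Appendix B with a few lines of Python. The function and variable names are hypothetical; only the numerical inputs come from Tables A3 and A4.

```python
def symmetric_likeness(il_forward, il_backward):
    """Average of the two asymmetric likeness indices of a pair of texts."""
    return 0.5 * (il_forward + il_backward)


# S-channel values from Table A3
s_matthew_luke = symmetric_likeness(0.724, 0.689)
s_mark_john = symmetric_likeness(0.846, 0.981)
s_hebrews_apocalypse = symmetric_likeness(0.993, 0.999)

# I-channel values from Table A4
i_hebrews_apocalypse = symmetric_likeness(0.650, 0.744)
i_luke_acts = symmetric_likeness(0.913, 0.812)

print(s_matthew_luke, s_mark_john, s_hebrews_apocalypse)
print(i_hebrews_apocalypse, i_luke_acts)
```

To three decimals these agree with the values 0.707, 0.914, and 0.996 (S-channels) and 0.697 and 0.863 (I-channels) given in the text.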
Finally, let us reconsider the vicinity of John to Aesop’s Fables shown in Figure 1. The signal-to-noise ratio in the S-channel Aesop → John is $\Gamma_{cross,ex} = 23.23$ dB, with a standard deviation of 6.7 dB (John’s self-channel values are given in Appendix A), which gives $I_L = 0.930$. In the I-channel, $\Gamma_{cross,ex} = 19.91$ dB, with a standard deviation of 0.70 dB; therefore, $I_L = 0.150$.
In brief, John’s style is similar to Aesop’s style (see also the values $\langle P_F\rangle = 18.56$ in John and $\langle P_F\rangle = 18.29$ in Fables), but the readers’ STM capacity is not, as is also evident in the values $\langle I_P\rangle = 6.79$ in John and $\langle I_P\rangle = 5.28$ in Fables, a difference that implies a different readability index (see Table 1 and Table 2).
In conclusion, the coincidence of John and Aesop in Figure 1 is a necessary condition for being similar, but only the fine-tuning provided by linguistic channels can fully reveal the nature of this similarity. In this example, John might have been inspired by the long tradition of short stories telling a truth, such as Aesop’s Fables.

7.3. I-Channel Versus S-Channel: Hebrews and Apocalypse

According to Table 12 and Table 13, Hebrews and Apocalypse are mathematically each other’s “photocopies” in the S-channel and very similar in the I-channel; therefore, the styles of the two authors, as style is meant in this paper, coincide, and their readers share similar STM capacities. As already mentioned, the likeness of these texts is unexpected; therefore, it may be realistic to suppose that their writers and readers belonged to the same group of Jewish-Christians, an issue to be researched by scholars of the Greek language used in the NT and by historians of early Christianity.
In conclusion, the S-channel and the I-channel describe the deep mathematical joint structure of two texts, namely, the authors’ styles and the readers’ STM capacities required to read the texts. If both likeness indices are large, then the two texts are very similar. These mathematical results may be used to confirm, in a multidisciplinary approach, what scholars of humanistic disciplines find, and they can even suggest new paths of research, such as the relationship between the author and the readers of Hebrews and Apocalypse.

8. Synthesis of Main Results

At this point, the reader of the present paper may be overwhelmed by tables and figures. However, due to the nature of the mathematical theory based on studying regression lines and linguistic channels—not to mention the many comparisons that can be carried out, even in a small literary corpus such as the New Testament—these numbers and figures are the only means we know for supporting the partial conclusions reached in each section above. Now we can attempt to present a final compact comparison based on one more table and figure.
Table 14 shows the most synthetic comparison of the NT texts, namely, the overall mean value of $I_L$, averaged from Table 12 and Table 13. If we assume $I_L > 0.5$ as the threshold beyond which texts are reasonably similar, this threshold is exceeded by Luke–Matthew, Luke–Mark, John–Matthew, John–Mark, and Luke–Acts.
The couple Hebrews–Apocalypse is completely disconnected from the other texts, and their likeness index is the largest. We like to reiterate that these two texts deserve further studies by historians of the early Christian church literature at the higher level of meaning, readers, and possible Old Testament texts that might have affected them, a task well beyond the knowledge of the present author.
We now show that the threshold value 0.5 of $I_L$ carries a special meaning, besides defining the borders of the quadrants in Figure 12.
Figure 13 shows the scatterplot of $I_L$ of S-channels and I-channels versus the difference $\Delta\Gamma = \langle\Gamma_{self,ex}\rangle - \langle\Gamma_{cross,ex}\rangle$ found in each channel, for all NT texts. The scatterplot suggests a tight inverse relationship between $I_L$ and $\Delta\Gamma$: $I_L$ decreases steadily as $\Delta\Gamma$ increases. A very similar scatterplot and tight relationship was also found for texts taken from the Italian literature [4], therefore suggesting that this relationship is “universal” for alphabetical texts.
The best-fit non-linear curve drawn in Figure 13 can be considered a good overall model, given by
$$I_L = \exp\left[-\frac{10^{\Delta\Gamma/10} - 1}{5}\right] \tag{15}$$
Notice that $\Delta\Gamma$ is the ratio (expressed in dB) between the noise, defined in Section 4, affecting a cross-channel and that found in the corresponding self-channel.
The value $I_L = 0.5$ is obtained from Equation (15) at $\Delta\Gamma = 6.50$ dB, a value that is practically the standard deviation of $\Gamma_{self,ex}$ in all cases, because this parameter ranges from 6 to 7 dB.
We can link this last observation to the quadrants of Figure 11. As a general rule, in quadrant I ($I_L > 0.5$ in both channels) we will always find texts whose $\langle\Gamma_{cross,ex}\rangle$ lies within about 6 to 7 dB of the corresponding $\langle\Gamma_{self,ex}\rangle$. In other words, a noise power ratio no larger than about 6 to 7 dB indicates that the two texts considered tend to be matched in both channels; therefore, it can be taken, with the vector representation of Figure 1, as a first objective assessment of the texts’ likeness.
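The link between the threshold $I_L = 0.5$ and $\Delta\Gamma = 6.50$ dB can be checked numerically. The sketch below reads Equation (15) as $I_L = \exp[-(10^{\Delta\Gamma/10} - 1)/5]$, a form that gives $I_L = 1$ at $\Delta\Gamma = 0$ dB and reproduces the quoted 6.50 dB; the function names are hypothetical, and the inverse formula is derived here rather than taken from the paper.

```python
import math


def likeness_from_delta(delta_db):
    """Likeness index predicted by the best-fit curve for a given delta (dB)."""
    return math.exp(-(10.0 ** (delta_db / 10.0) - 1.0) / 5.0)


def delta_from_likeness(il):
    """Inverse of the curve: delta (dB) at which the likeness index equals il."""
    return 10.0 * math.log10(1.0 - 5.0 * math.log(il))


delta_half = delta_from_likeness(0.5)   # 10*log10(1 + 5*ln 2)
print(f"delta at I_L = 0.5: {delta_half:.2f} dB")
print(f"I_L at 6 dB: {likeness_from_delta(6.0):.3f}")
print(f"I_L at 7 dB: {likeness_from_delta(7.0):.3f}")
```

Because the curve decreases monotonically, any pair of texts with $\Delta\Gamma$ below this value has $I_L > 0.5$ according to the fit.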

9. Conclusions

We studied two fundamental linguistic channels—namely, the S-channel and the I-channel—and showed that they can reveal deeper connections between texts. As a study case, we considered the Greek New Testament, with the purpose of determining mathematical connections between its texts and possible differences in the writing style (mathematically defined) of the writers and in the reading skill required of their readers. The analysis is based on deep-language parameters and communication/information theory developed in previous papers.
Our theory does not follow the current paradigm of linguistic studies, which consider neither Shannon’s communication theory nor the fundamental connection that some linguistic parameters have with the reading skill and short-term memory capacity of readers.
To set the New Testament texts in the Greek classical literature, we have also studied and compared texts written by Aesop, Polybius, Flavius Josephus, and Plutarch.
We have found large similarities (measured by the likeness index) in the couplings of Luke–Matthew, Luke–Mark, John–Matthew, John–Mark, and Luke–Acts, findings that largely confirm what scholars have found about these texts, therefore giving credibility to the theory.
The Gospel according to John is very similar to Aesop’s Fables. John might have been inspired by the long tradition of short stories telling a truth, such as Fables.
Surprisingly, we have found that Hebrews and Apocalypse are each other’s “photocopies” in the two linguistic channels and not linked to all other texts. In our opinion, these two texts deserve further studies by historians of the early Christian church literature conducted at the higher level of meaning, readers, and possible Old Testament texts that might have influenced them, a task well beyond the knowledge of the present author.

Funding

This research received no external funding.

Data Availability Statement

Data are available from the Author.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A. Signal-to-Noise Ratio in S-Channels and in I-Channels

Table A1 reports $\langle\Gamma_{ex}\rangle$ (dB) and its standard deviation (dB, in parentheses) in the S-channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Matthew and the output is Luke (cross-channel), then $\Gamma_{ex} = 20.52$ dB; vice versa, if the input is Luke and the output is Matthew, then $\Gamma_{ex} = 19.68$ dB. If the input is Matthew and the output is Matthew (self-channel), then $\Gamma_{ex} = 25.01$ dB.
Table A1. S-channels. Experimental mean signal-to-noise ratio $\langle\Gamma_{ex}\rangle$ (dB) and standard deviation (dB, in parentheses) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row.
| Input | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 25.01 (6.96) | 17.06 (5.94) | 20.52 (5.81) | 18.52 (4.24) | 13.15 (2.34) | 7.69 (1.81) | 8.05 (1.48) |
| Mk | 18.12 (3.46) | 20.91 (7.13) | 19.33 (3.46) | 22.05 (6.04) | 12.40 (1.43) | 7.66 (1.33) | 8.00 (1.01) |
| Lk | 19.68 (6.43) | 17.39 (4.75) | 24.82 (6.53) | 16.98 (2.73) | 14.75 (2.06) | 8.54 (1.67) | 9.00 (1.31) |
| Jh | 18.95 (2.72) | 20.59 (7.04) | 18.30 (3.16) | 24.39 (7.07) | 11.73 (1.47) | 7.27 (1.33) | 7.58 (1.04) |
| Ac | 10.61 (2.16) | 9.19 (1.71) | 12.45 (2.24) | 8.71 (0.96) | 23.55 (6.16) | 11.71 (3.29) | 12.79 (2.74) |
| Hb | 3.52 (1.43) | 3.15 (1.28) | 4.85 (1.64) | 2.50 (0.81) | 10.67 (2.64) | 15.66 (6.77) | 19.64 (6.86) |
| Ap | 3.52 (1.44) | 3.30 (1.28) | 5.00 (1.66) | 2.65 (0.81) | 10.95 (2.69) | 15.73 (6.69) | 19.76 (6.74) |
Table A2 reports $\langle\Gamma_{ex}\rangle$ (dB) and its standard deviation (dB, in parentheses) in the I-channel between the (input) text indicated in the first column and the (output) text indicated in the first row.
Table A2. I-channels. Experimental mean signal-to-noise ratio $\langle\Gamma_{ex}\rangle$ (dB) and standard deviation (dB, in parentheses) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row.
| Input | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 26.63 (6.68) | 14.84 (5.68) | 20.46 (5.83) | 28.01 (5.92) | 21.91 (5.81) | 4.49 (1.60) | 4.57 (2.35) |
| Mk | 13.61 (2.80) | 19.55 (7.32) | 15.41 (3.01) | 13.78 (2.60) | 15.94 (2.87) | 3.50 (2.10) | 5.40 (1.56) |
| Lk | 21.23 (5.40) | 15.71 (4.22) | 24.92 (6.57) | 20.03 (3.22) | 23.28 (5.45) | 5.82 (2.01) | 6.76 (2.40) |
| Jh | 25.55 (6.17) | 15.62 (6.38) | 19.72 (4.86) | 28.19 (6.15) | 22.55 (6.32) | 4.19 (1.57) | 4.39 (2.46) |
| Ac | 22.32 (6.00) | 16.98 (5.69) | 22.48 (5.14) | 22.46 (5.28) | 24.32 (6.26) | 4.84 (1.80) | 5.71 (2.20) |
| Hb | 9.15 (0.54) | 8.16 (0.66) | 10.00 (0.54) | 8.89 (0.37) | 9.43 (0.82) | 18.11 (7.14) | 15.53 (5.00) |
| Ap | 8.93 (0.97) | 9.17 (0.94) | 10.31 (1.07) | 8.68 (0.60) | 9.75 (1.20) | 13.50 (6.97) | 20.61 (6.88) |
For example, if the input is Matthew and the output is Luke (cross-channel), then $\Gamma_{ex} = 20.46$ dB; vice versa, if the input is Luke and the output is Matthew, then $\Gamma_{ex} = 21.23$ dB. If the input is Matthew and the output is Matthew (self-channel), then $\Gamma_{ex} = 26.63$ dB, close to the value obtained in the S-channel.

Appendix B. Likeness Index in S-Channels and in I-Channels

Table A3 reports $I_L$ in the S-channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Matthew and the output is Luke, then $I_L = 0.724$; vice versa, if the input is Luke and the output is Matthew, then $I_L = 0.689$. Self-channels yield $I_L = 1$.
Table A3. S-channels. Mean value of the likeness index $I_L$ in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row.
| Input | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 1 | 0.758 | 0.724 | 0.567 | 0.193 | 0.279 | 0.119 |
| Mk | 0.462 | 1 | 0.534 | 0.846 | 0.111 | 0.238 | 0.091 |
| Lk | 0.689 | 0.726 | 1 | 0.386 | 0.239 | 0.308 | 0.135 |
| Jh | 0.455 | 0.981 | 0.453 | 1 | 0.096 | 0.221 | 0.084 |
| Ac | 0.096 | 0.145 | 0.136 | 0.036 | 1 | 0.615 | 0.402 |
| Hb | 0.008 | 0.026 | 0.012 | 0.004 | 0.129 | 1 | 0.993 |
| Ap | 0.008 | 0.027 | 0.013 | 0.004 | 0.140 | 0.999 | 1 |
Table A4 reports $I_L$ in the I-channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Matthew and the output is Luke, then $I_L = 0.716$; vice versa, if the input is Luke and the output is Matthew, then $I_L = 0.646$. Self-channels yield $I_L = 1$.
Table A4. I-channels. Mean value of the likeness index $I_L$ in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row.
| Input | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 1 | 0.702 | 0.716 | 0.982 | 0.839 | 0.093 | 0.071 |
| Mk | 0.152 | 1 | 0.290 | 0.090 | 0.324 | 0.094 | 0.056 |
| Lk | 0.646 | 0.679 | 1 | 0.356 | 0.913 | 0.146 | 0.117 |
| Jh | 0.927 | 0.768 | 0.633 | 1 | 0.888 | 0.085 | 0.072 |
| Ac | 0.731 | 0.818 | 0.812 | 0.612 | 1 | 0.110 | 0.085 |
| Hb | 0.010 | 0.098 | 0.023 | 0.002 | 0.025 | 1 | 0.650 |
| Ap | 0.015 | 0.142 | 0.041 | 0.003 | 0.038 | 0.744 | 1 |

References

  1. Matricciani, E. Deep Language Statistics of Italian throughout Seven Centuries of Literature and Empirical Connections with Miller’s 7 ∓ 2 Law and Short-Term Memory. Open J. Stat. 2019, 9, 373–406.
  2. Matricciani, E. A Statistical Theory of Language Translation Based on Communication Theory. Open J. Stat. 2020, 10, 936–997.
  3. Matricciani, E. Linguistic Mathematical Relationships Saved or Lost in Translating Texts: Extension of the Statistical Theory of Translation and Its Application to the New Testament. Information 2022, 13, 20.
  4. Matricciani, E. Multiple Communication Channels in Literary Texts. Open J. Stat. 2022, 12, 486–520.
  5. Matricciani, E. Capacity of Linguistic Communication Channels in Literary Texts: Application to Charles Dickens’ Novels. Information 2023, 14, 68.
  6. Matricciani, E. Readability Indices Do Not Say It All on a Text Readability. Analytics 2023, 2, 296–314.
  7. Matricciani, E. Short-Term Memory Capacity Across Time and Language Estimated from Ancient and Modern Literary Texts. Open J. Stat. 2023; in press.
  8. Matricciani, E. Readability across Time and Languages: The Case of Matthew’s Gospel Translations. AppliedMath 2023, 3, 497–509.
  9. Papoulis, A. Probability & Statistics; Prentice Hall: Hoboken, NJ, USA, 1990.
  10. Lindgren, B.W. Statistical Theory, 2nd ed.; MacMillan Company: New York, NY, USA, 1968.
  11. Shannon, C.E. A Mathematical Theory of Communication. Part I and Part II. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656.
  12. Catford, J.C. A Linguistic Theory of Translation. An Essay in Applied Linguistics; University Press: Oxford, UK, 1965.
  13. Munday, J. Introducing Translation Studies. Theories and Applications, 2nd ed.; Routledge: Oxfordshire, UK, 2008.
  14. Proshina, Z. Theory of Translation, 3rd ed.; Far Eastern University Press: Manila, Philippines, 2008.
  15. Trosberg, A. Discourse analysis as part of translator training. Curr. Issues Lang. Soc. 2000, 7, 185–228.
  16. Tymoczko, M. Translation in a Post-Colonial Context: Early Irish Literature in English Translation; St Jerome: Manchester, UK, 1999.
  17. Warren, R. (Ed.) The Art of Translation: Voices from the Field; Northeastern University Press: Boston, MA, USA, 1989.
  18. Williams, I. A corpus-based study of the verb observar in English-Spanish translations of biomedical research articles. Target 2007, 19, 85–103.
  19. Wilss, W. Knowledge and Skills in Translator Behaviour; John Benjamins: Philadelphia, PA, USA, 1996.
  20. Wolf, M.; Fukari, A. (Eds.) Constructing a Sociology of Translation; John Benjamins: Philadelphia, PA, USA, 2007.
  21. Gamallo, P.; Pichel, J.R.; Alegria, I. Measuring Language Distance of Isolated European Languages. Information 2020, 11, 181.
  22. Barbançon, F.; Evans, S.; Nakhleh, L.; Ringe, D.; Warnow, T. An experimental study comparing linguistic phylogenetic reconstruction methods. Diachronica 2013, 30, 143–170.
  23. Bakker, D.; Muller, A.; Velupillai, V.; Wichmann, S.; Brown, C.H.; Brown, P.; Egorov, D.; Mailhammer, R.; Grant, A.; Holman, E.W. Adding typology to lexicostatistics: A combined approach to language classification. Linguist. Typol. 2009, 13, 169–181.
  24. Petroni, F.; Serva, M. Measures of lexical distance between languages. Phys. A Stat. Mech. Appl. 2010, 389, 2280–2283.
  25. Carling, G.; Larsson, F.; Cathcart, C.; Johansson, N.; Holmer, A.; Round, E.; Verhoeven, R. Diachronic Atlas of Comparative Linguistics (DiACL)—A database for ancient language typology. PLoS ONE 2018, 13, 1.
  26. Gao, Y.; Liang, W.; Shi, Y.; Huang, Q. Comparison of directed and weighted co-occurrence networks of six languages. Phys. A Stat. Mech. Appl. 2014, 393, 579–589.
  27. Liu, H.; Cong, J. Language clustering with word co-occurrence networks based on parallel texts. Chin. Sci. Bull. 2013, 58, 1139–1144.
  28. Gamallo, P.; Pichel, J.R.; Alegria, I. From Language Identification to Language Distance. Phys. A 2017, 484, 162–172.
  29. Pichel, J.R.; Gamallo, P.; Alegria, I. Measuring diachronic language distance using perplexity: Application to English, Portuguese, and Spanish. Nat. Lang. Eng. 2019, 26, 433–454.
  30. Eder, M. Visualization in stylometry: Cluster analysis using networks. Digit. Scholarsh. Humanit. 2015, 32, 50–64.
  31. Brown, P.F.; Cocke, J.; Della Pietra, A.; Della Pietra, V.J.; Jelinek, F.; Lafferty, J.D.; Mercer, R.L.; Roossin, P.S. A Statistical Approach to Machine Translation. Comput. Linguist. 1990, 16, 79–85.
  32. Koehn, P.; Och, F.J.; Marcu, D. Statistical Phrase-Based Translation. In Proceedings of the HLT-NAACL 2003, Stroudsburg, PA, USA, 27 May–1 June 2003; pp. 48–54.
  33. Carl, M.; Schaeffer, M. Sketch of a Noisy Channel Model for the translation process. In Empirical Modelling of Translation and Interpreting; Hansen Schirra, S., Czulo, O., Hofmann, S., Eds.; Language Science Press: Berlin, Germany, 2017; pp. 71–116.
  34. Elmakias, I.; Vilenchik, D. An Oblivious Approach to Machine Translation Quality Estimation. Mathematics 2021, 9, 2090.
  35. Lavie, A.; Agarwal, A. Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments. In Proceedings of the Second Workshop on Statistical Machine Translation, Prague, Czech Republic, 23 June 2007; pp. 228–231.
  36. Banchs, R.; Li, H. AM-FM: A Semantic Framework for Translation Quality Assessment. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA, 19–24 June 2011; Volume 2, pp. 153–158.
  37. Forcada, M.; Ginestí-Rosell, M.; Nordfalk, J.; O’Regan, J.; Ortiz-Rojas, S.; Pérez-Ortiz, J.; Sánchez-Martínez, F.; Ramírez-Sánchez, G.; Tyers, F. Apertium: A free/open-source platform for rule-based machine translation. Mach. Transl. 2011, 25, 127–144.
  38. Buck, C. Black Box Features for the WMT 2012 Quality Estimation Shared Task. In Proceedings of the 7th Workshop on Statistical Machine Translation, Montreal, QC, Canada, 7–8 June 2012; pp. 91–95.
  39. Assaf, D.; Newman, Y.; Choen, Y.; Argamon, S.; Howard, N.; Last, M.; Frieder, O.; Koppel, M. Why “Dark Thoughts” aren’t really Dark: A Novel Algorithm for Metaphor Identification. In Proceedings of the 2013 IEEE Symposium on Computational Intelligence, Cognitive Algorithms, Mind, and Brain, Singapore, 16–19 April 2013; pp. 60–65. [Google Scholar]
  40. Graham, Y. Improving Evaluation of Machine Translation Quality Estimation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, Beijing, China, 26–31 July 2015; pp. 1804–1813. [Google Scholar]
  41. Espla–Gomis, M.; Sanchez–Martınez, F.; Forcada, M.L. UAlacant word–level machine translation quality estimation system at WMT 2015. In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisboa, Portugal, 17–18 September 2015; pp. 309–315. [Google Scholar]
  42. Costa–jussà, M.R.; Fonollosa, J.A. Latest trends in hybrid machine translation and its applications. Comput. Speech Lang. 2015, 32, 3–10. [Google Scholar] [CrossRef] [Green Version]
  43. Kreutzer, J.; Schamoni, S.; Riezler, S. QUality Estimation from ScraTCH (QUETCH): Deep Learning for Word–level Translation Quality Estimation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisboa, Portugal, 17–18 September 2015; pp. 316–322. [Google Scholar]
  44. Specia, L.; Paetzold, G.; Scarton, C. Multi–level Translation Quality Prediction with QuEst++. In Proceedings of the ACL–IJCNLP 2015 System Demonstrations, Beijing, China, 26–31 July 2015; pp. 115–120. [Google Scholar]
  45. Banchs, R.E.; D’Haro, L.F.; Li, H. Adequacy–Fluency Metrics: Evaluating MT in the Continuous Space Model Framework. IEEE/ACM Trans. Audio Speech Lang. Process. 2015, 23, 472–482. [Google Scholar] [CrossRef]
  46. Martins, A.F.T.; Junczys–Dowmunt, M.; Kepler, F.N.; Astudillo, R.; Hokamp, C.; Grundkiewicz, R. Pushing the Limits of Quality Estimation. Trans. Assoc. Comput. Linguist. 2017, 5, 205–218. [Google Scholar] [CrossRef] [Green Version]
  47. Kim, H.; Jung, H.Y.; Kwon, H.; Lee, J.H.; Na, S.H. Predictor–Estimator: Neural Quality Estimation Based on Target Word Prediction for Machine Translation. ACM Trans. Asian Low–Resour. Lang. Inf. Process. 2017, 17, 1–22. [Google Scholar] [CrossRef]
  48. Kepler, F.; Trénous, J.; Treviso, M.; Vera, M.; Martins, A.F.T. OpenKiwi: An Open Source Framework for Quality Estimation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Florence, Italy, 28 July–2 August 2019; pp. 117–122. [Google Scholar]
  49. D’Haro, L.; Banchs, R.; Hori, C.; Li, H. Automatic Evaluation of End–to–End Dialog Systems with Adequacy–Fluency Metrics. Comput. Speech Lang. 2018, 55, 200–215. [Google Scholar] [CrossRef]
  50. Yankovskaya, E.; Tättar, A.; Fishel, M. Quality Estimation with Force–Decoded Attention and Cross–lingual Embeddings. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers, Belgium, Brussels, 31 October–1 November 2018; pp. 816–821. [Google Scholar]
  51. Yankovskaya, E.; Tättar, A.; Fishel, M. Quality Estimation and Translation Metrics via Pre–trained Word and Sentence Embeddings. In Proceedings of the Fourth Conference on Machine Translation, Florence, Italy, 1–2 August 2019; pp. 101–105. [Google Scholar]
  52. Miller, G.A. The Magical Number Seven, Plus or Minus Two. Some Limits on Our Capacity for Processing Information. Psychol. Rev. 1955, 343–352.
  53. Matricciani, E.; Caro, L.D. A Deep–Language Mathematical Analysis of Gospels, Acts and Revelation. Religions 2019, 10, 257.
  54. Parkes, M.B. Pause and Effect. An Introduction to the History of Punctuation in the West; Routledge: Abingdon, UK, 2016.
  55. Reicke, B. The Roots of the Synoptic Gospels; Fortress Press: Minneapolis, MN, USA, 1986.
  56. Andrews, E.D. The Epistle to the Hebrews: Who Wrote the Book of Hebrews; Christian Publishing House: Cambridge, OH, USA, 2020.
  57. Van Voorst, R.E. Building Your New Testament Greek Vocabulary; Society of Biblical Literature: Atlanta, GA, USA, 2001.
  58. Attridge, H.W. The Epistle to the Hebrews; Fortress: Philadelphia, PA, USA, 1989.
  59. Bauckham, R. The Climax of Prophecy: Studies on the Book of Revelation; T & T Clark International: Edinburgh, UK, 1998.
  60. Stuckenbruck, L.T. Revelation. In Eerdmans Commentary on the Bible; Dunn, J.D.G., Rogerson, J.W., Eds.; Eerdmans: Grand Rapids, MI, USA, 2003.
  61. Rolland, P. Les Premiers Evangiles. Un Nouveau Regard sur le Problème Synoptique; Editions du Cerf: Paris, France, 1984.
  62. Stein, R.H. The Synoptic Problem: An Introduction; Baker Book House: Grand Rapids, MI, USA, 1987.
  63. Ehrman, B.D. Forged: Writing in the Name of God–Why the Bible’s Authors Are Not Who We Think They Are; Harper One: San Francisco, CA, USA, 2011.
  64. Dvorak, J.D. The Relationship between John and the Synoptic Gospels. J. Evang. Theol. Soc. 1998, 41, 201–214.
  65. Mackay, I.D. John’s Relationship with Mark; Mohr Siebeck: Tübingen, Germany, 2004.
Figure 1. Normalized coordinates X and Y of the ending point of vector (5), scaled such that Aesop is at (0,0) (Ae, magenta square) and Flavius Josephus is at (1,1) (Fl, green square). Matthew (Mt, green triangle), Mark (Mk, black triangle), Luke (Lk, blue triangle pointing right), John (Jh, cyan triangle), Acts (Ac, blue triangle pointing left), Hebrews (Hb, red circle), Apocalypse (Ap, magenta circle), Polybius (Po, blue square), and Plutarch (Pl, black square).
Figure 2. Scatterplots and regression lines between $n_W$ (words, independent variable) and $n_S$ (sentences, dependent variable) in the following texts: Matthew (green triangles and green line), Mark (black triangles and black line), Luke (blue triangles and blue line), and John (cyan triangles and cyan line).
Figure 3. Scatterplots and regression lines between $n_W$ (words, independent variable) and $n_S$ (sentences, dependent variable) in the following texts: Matthew (green triangles and green line), Acts (blue circles and blue line), Hebrews (red circles and red line), and Apocalypse (magenta circles and magenta line). The magenta line (Apocalypse) and the red line (Hebrews) are superposed because they practically coincide (see Table 3).
Figure 4. Scatterplots and regression lines between $n_S$ (sentences, independent variable) and $n_I$ (interpunctions, dependent variable) in the following texts: Matthew (green triangles and green line), Mark (black triangles and black line), Luke (blue triangles and blue line), and John (cyan triangles and cyan line).
Figure 5. Scatterplots and regression lines between $n_S$ (sentences, independent variable) and $n_I$ (interpunctions, dependent variable) in the following texts: Matthew (green triangles and green line), Acts (blue circles and blue line), Hebrews (red circles and red line), and Apocalypse (magenta circles and magenta line). The green line (Matthew) and the blue line (Acts) are superposed because they practically coincide (see Table 3).
Figure 6. Scatterplots and regression lines between $n_W$ (words, independent variable) and $n_S$ (sentences, dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line). Notice that the two regression lines are practically superposed and that the scatter of the two sets is very similar.
Figure 7. Scatterplots and regression lines between $n_S$ (sentences, independent variable) and $n_I$ (interpunctions, dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line).
Figure 8. $\langle \Gamma_{ex,cross} \rangle$ and $\langle \Gamma_{ex,self} \rangle$ for each NT input text indicated on the abscissa. Upper panel: S-channel; lower panel: I-channel. Output texts: Matthew, black; Mark, yellow; Luke, blue; John, green; Acts, cyan; Hebrews, red; Apocalypse, magenta. The numerical values of the means and standard deviations are reported in Appendix A. Notice that $\Gamma_{ex,self} > \Gamma_{ex,cross}$.
Figure 9. S-channel. (a) Scatterplot of $\langle \Gamma_{ex} \rangle$ versus $\Gamma_{th}$ in S-channels. (b) Scatterplot of $\Gamma_{th} - \Gamma_{ex}$ versus $\Gamma_{th}$. Matthew (green triangles), Mark (black triangles), Luke (blue triangles), John (cyan triangles), Acts (blue circles), Hebrews (red circles), and Apocalypse (magenta circles).
Figure 10. I-channel. (a) Scatterplot of $\langle \Gamma_{ex} \rangle$ versus $\Gamma_{th}$ in I-channels. (b) Scatterplot of $\Gamma_{th} - \Gamma_{ex}$ versus $\Gamma_{th}$. Matthew (green triangles), Mark (black triangles), Luke (blue triangles), John (cyan triangles), Acts (blue circles), Hebrews (red circles), and Apocalypse (magenta circles).
Figure 11. Matching texts in S-channels and in I-channels.
Figure 12. Scatterplot of $I_L$ of the interpunctions channel (ordinate scale) versus $I_L$ of the S-channel (abscissa scale). Output channels (first line in Table 11 and Table 12): Matthew, black circles; Mark, yellow; Luke, blue; John, green; Acts, cyan; Hebrews, red; Apocalypse, magenta. The percentages indicate the relative number of cases falling in a quadrant.
Figure 13. Scatterplot of $I_L$ of S-channels and I-channels versus $\Gamma_{self,ex} - \Gamma_{cross,ex}$. Matthew (green triangles), Mark (black triangles), Luke (blue triangles), John (cyan triangles), Acts (blue circles), Hebrews (red circles), and Apocalypse (magenta circles). The black line draws Equation (15).
Table 1. New Testament. Mean values (averaged over all chapters) of $C_P$ (characters per word), $P_F$ (words per sentence), $M_F$ (interpunctions per sentence), $I_P$ (words per interpunction), and $G_U$ (universal readability index). The genealogies in Matthew (verses 1.1–1.17) and in Luke (verses 3.23–3.38) have been excluded so as not to bias the statistical analyses. All parameters have been computed by weighting each chapter by its fraction of the total words of the text.

| Book | Total Words | $\langle C_P \rangle$ | $\langle P_F \rangle$ | $\langle M_F \rangle$ | $\langle I_P \rangle$ | $G_U$ |
|---|---|---|---|---|---|---|
| Matthew | 18,121 | 4.91 | 20.27 | 2.83 | 7.18 | 53.90 |
| Mark | 11,393 | 4.96 | 19.14 | 2.68 | 7.17 | 54.87 |
| Luke | 19,384 | 4.91 | 20.47 | 2.89 | 7.11 | 54.21 |
| John | 15,503 | 4.54 | 18.56 | 2.74 | 6.79 | 57.65 |
| Acts | 18,757 | 5.10 | 25.47 | 2.91 | 8.77 | 41.37 |
| Hebrews | 4940 | 5.33 | 32.00 | 4.53 | 7.02 | 53.10 |
| Apocalypse | 9870 | 4.66 | 30.70 | 3.97 | 7.79 | 49.46 |
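
The word-fraction weighting mentioned in the caption of Table 1 can be sketched as follows. This is only an illustration: the per-chapter values and word counts are hypothetical (they are not taken from any of the texts), and the function name `weighted_mean` is an assumption introduced here.

```python
def weighted_mean(values, word_counts):
    """Mean of per-chapter values, each chapter weighted by its share of total words."""
    total_words = sum(word_counts)
    return sum(v * w / total_words for v, w in zip(values, word_counts))


# Hypothetical example: three chapters with their P_F (words per sentence)
# and their lengths in words. These numbers are illustrative only.
pf_per_chapter = [18.0, 22.0, 20.0]
words_per_chapter = [500, 1000, 500]

weighted = weighted_mean(pf_per_chapter, words_per_chapter)
unweighted = sum(pf_per_chapter) / len(pf_per_chapter)
print(f"weighted mean = {weighted:.2f}, unweighted mean = {unweighted:.2f}")
```

With this weighting a long chapter pulls the mean toward its own value, whereas the uniform weighting used for Table 3 treats every text block equally.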
Table 2. Greek literature. Mean values (averaged over all chapters) of $C_P$ (characters per word), $P_F$ (words per sentence), $M_F$ (interpunctions per sentence), $I_P$ (words per interpunction, or word interval), and the corresponding $G_U$ (universal readability index). All parameters have been computed by weighting each chapter by its fraction of the total words of the text.

| Author | Total Words | $\langle C_P \rangle$ | $\langle P_F \rangle$ | $\langle M_F \rangle$ | $\langle I_P \rangle$ | $G_U$ |
|---|---|---|---|---|---|---|
| Aesop (620–564 BC, Fables) | 39,122 | 5.24 | 18.29 | 3.46 | 5.28 | 64.95 |
| Polybius (200–118 BC, The Histories) | 256,495 | 5.97 | 29.19 | 3.30 | 8.88 | 37.22 |
| Flavius Josephus (37–100 AD, The Jewish War) | 121,717 | 5.51 | 31.05 | 3.20 | 9.74 | 31.44 |
| Plutarch (46–119 AD, Parallel Lives) | 499,683 | 5.51 | 29.35 | 3.73 | 7.82 | 43.53 |
Table 3. Slope $m$ and correlation coefficient $r$ of the regression lines of $n_S$ versus $n_W$ and of $n_I$ versus $n_S$ in the indicated texts. Four decimal digits are reported because some values differ only from the third digit onward. These parameters are calculated by weighting each text block uniformly, e.g., with weight $1/28$ in Matthew.

| Text | $m$ ($n_S$ vs. $n_W$) | $r$ ($n_S$ vs. $n_W$) | $m$ ($n_I$ vs. $n_S$) | $r$ ($n_I$ vs. $n_S$) |
|---|---|---|---|---|
| Matthew | 0.0508 | 0.9410 | 2.7271 | 0.9548 |
| Mark | 0.0538 | 0.8985 | 2.5527 | 0.8800 |
| Luke | 0.0499 | 0.8975 | 2.8296 | 0.9243 |
| John | 0.0549 | 0.9181 | 2.6797 | 0.9517 |
| Acts | 0.0413 | 0.8807 | 2.7192 | 0.9280 |
| Hebrews | 0.0336 | 0.8037 | 4.0970 | 0.9005 |
| Apocalypse | 0.0338 | 0.8063 | 3.7605 | 0.8173 |
Table 4. Theoretical slope $m_{jk}$ and correlation coefficient $r_{jk}$ of the regression line according to Section 4, for the indicated input texts. Output channel: Matthew.

| Input text | $m_{jk}$ (sentences vs. sentences) | $r_{jk}$ (sentences vs. sentences) | $m_{jk}$ (interpunctions vs. interpunctions) | $r_{jk}$ (interpunctions vs. interpunctions) |
|---|---|---|---|---|
| Mark | 0.9442 | 0.9940 | 1.0683 | 0.9814 |
| Luke | 1.0180 | 0.9938 | 0.9638 | 0.9960 |
| John | 0.9253 | 0.9981 | 1.0177 | 0.9999 |
| Acts | 1.2300 | 0.9890 | 1.0029 | 0.9968 |
| Hebrews | 1.5119 | 0.9576 | 0.6656 | 0.9891 |
| Apocalypse | 1.5030 | 0.9589 | 0.7252 | 0.9516 |
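
As a rough cross-check of Table 4, the sketch below recomputes the S-channel columns from the slopes and correlation coefficients of Table 3. The two relations it uses, $m_{jk} = m_k / m_j$ and $r_{jk} = \cos\left|\arccos r_j - \arccos r_k\right|$ (with $j$ the input text and $k$ the output text), are assumptions inferred from the tabulated numbers rather than quoted from Section 4. The names `TABLE3_S` and `channel_line` are illustrative.

```python
import math

# (slope m, correlation coefficient r) of n_S versus n_W, copied from Table 3.
TABLE3_S = {
    "Matthew": (0.0508, 0.9410),
    "Mark": (0.0538, 0.8985),
    "Luke": (0.0499, 0.8975),
    "John": (0.0549, 0.9181),
    "Acts": (0.0413, 0.8807),
    "Hebrews": (0.0336, 0.8037),
    "Apocalypse": (0.0338, 0.8063),
}


def channel_line(inp, out, table=TABLE3_S):
    """Return (m_jk, r_jk) for input text `inp` and output text `out`.

    ASSUMED relations (inferred from the tables, not quoted from Section 4):
      m_jk = m_k / m_j
      r_jk = cos(|arccos(r_j) - arccos(r_k)|)
    """
    m_j, r_j = table[inp]
    m_k, r_k = table[out]
    m_jk = m_k / m_j
    r_jk = math.cos(abs(math.acos(r_j) - math.acos(r_k)))
    return m_jk, r_jk


# Output channel: Matthew, as in Table 4.
for text in TABLE3_S:
    if text != "Matthew":
        m, r = channel_line(text, "Matthew")
        print(f"{text:<11} m_jk = {m:.4f}  r_jk = {r:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed values can be expected to agree with Table 4 only to within rounding.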
Table 5. S-channel. Theoretical signal-to-noise ratio $\Gamma_{th}$ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Matthew and the output is Mark, then $\Gamma_{th} = 17.70$; vice versa, if the input is Mark and the output is Matthew, then $\Gamma_{th} = 18.59$.

| Text | Matthew | Mark | Luke | John | Acts | Hebrews | Apocalypse |
|---|---|---|---|---|---|---|---|
| Matthew | | 17.70 | 19.06 | 19.56 | 13.04 | 8.12 | 8.22 |
| Mark | 18.59 | | 22.79 | 25.66 | 12.61 | 8.12 | 8.21 |
| Luke | 18.76 | 22.14 | | 18.87 | 15.14 | 9.14 | 9.26 |
| John | 20.50 | 25.99 | 19.87 | | 11.83 | 7.67 | 7.76 |
| Acts | 10.62 | 10.26 | 13.44 | 9.15 | | 13.13 | 13.36 |
| Hebrews | 3.29 | 3.48 | 5.10 | 2.61 | 10.75 | | 42.61 |
| Apocalypse | 3.46 | 3.64 | 5.29 | 2.77 | 11.04 | 42.68 | |
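
The entries of Table 5 can be approximately recovered from Table 3 with the sketch below. It rests on assumed relations that are inferred from the tabulated values, not quoted from Section 4: the channel slope and correlation coefficient are taken as $m_{jk} = m_k/m_j$ and $r_{jk} = \cos\left|\arccos r_j - \arccos r_k\right|$, the noise-to-signal ratio as $R = (m_{jk} - 1)^2 + \frac{1 - r_{jk}^2}{r_{jk}^2}\, m_{jk}^2$, and $\Gamma_{th} = 10 \log_{10}(1/R)$. The names `TABLE3_S` and `gamma_th_db` are illustrative.

```python
import math

# (slope m, correlation coefficient r) of n_S versus n_W, subset of Table 3.
TABLE3_S = {
    "Matthew": (0.0508, 0.9410),
    "Mark": (0.0538, 0.8985),
    "Acts": (0.0413, 0.8807),
}


def gamma_th_db(inp, out, table=TABLE3_S):
    """Theoretical signal-to-noise ratio (dB) for input `inp` and output `out`.

    ASSUMED model (inferred from the tables, not quoted from Section 4):
      m_jk = m_k / m_j
      r_jk = cos(|arccos(r_j) - arccos(r_k)|)
      R    = (m_jk - 1)^2 + (1 - r_jk^2) / r_jk^2 * m_jk^2
      Gamma_th = 10 * log10(1 / R)
    The ratio is undefined when inp == out (R = 0).
    """
    m_j, r_j = table[inp]
    m_k, r_k = table[out]
    m_jk = m_k / m_j
    r_jk = math.cos(abs(math.acos(r_j) - math.acos(r_k)))
    noise_to_signal = (m_jk - 1.0) ** 2 + (1.0 - r_jk ** 2) / r_jk ** 2 * m_jk ** 2
    return -10.0 * math.log10(noise_to_signal)


for inp, out in [("Matthew", "Mark"), ("Mark", "Matthew"), ("Acts", "Matthew")]:
    print(f"{inp} -> {out}: {gamma_th_db(inp, out):.2f} dB")
```

Under these assumptions the three channels shown come out close to the tabulated 17.70, 18.59, and 10.62 dB. The asymmetry between a channel and its reverse arises because $m_{jk}$ is replaced by its reciprocal when input and output are swapped. For nearly coincident regression lines, such as Hebrews and Apocalypse, the result is very sensitive to the last reported digit of $r$, so four-decimal inputs may not reproduce the tabulated value there.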
Table 6. I-channel. Theoretical signal-to-noise ratio $\Gamma_{th}$ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Matthew and the output is Mark, then $\Gamma_{th} = 14.25$; vice versa, if the input is Mark and the output is Matthew, then $\Gamma_{th} = 13.16$.

| Text | Matthew | Mark | Luke | John | Acts | Hebrews | Apocalypse |
|---|---|---|---|---|---|---|---|
| Matthew | | 14.25 | 19.94 | 33.94 | 21.94 | 5.19 | 4.66 |
| Mark | 13.16 | | 16.02 | 13.96 | 17.23 | 4.30 | 5.94 |
| Luke | 20.53 | 17.37 | | 20.70 | 27.93 | 6.82 | 7.02 |
| John | 33.75 | 14.78 | 19.91 | | 22.81 | 4.89 | 4.51 |
| Acts | 21.89 | 18.20 | 27.56 | 23.06 | | 5.73 | 5.96 |
| Hebrews | 9.15 | 8.45 | 10.12 | 8.93 | 9.39 | | 15.25 |
| Apocalypse | 8.85 | 9.60 | 10.45 | 8.80 | 9.75 | 13.92 | |
Table 7. Theoretical slope $m_{jk}$ and correlation coefficient $r_{jk}$ of the regression line according to Section 4, for the indicated input texts. Output channel: Hebrews. Notice that five decimal digits are reported for the sentences correlation coefficient of Apocalypse because its value is very close to 1.

| Input text | $m_{jk}$ (sentences vs. sentences) | $r_{jk}$ (sentences vs. sentences) | $m_{jk}$ (interpunctions vs. interpunctions) | $r_{jk}$ (interpunctions vs. interpunctions) |
|---|---|---|---|---|
| Matthew | 0.6614 | 0.9576 | 1.5023 | 0.9891 |
| Mark | 0.6245 | 0.9833 | 1.6050 | 0.9990 |
| Luke | 0.6733 | 0.9837 | 1.4479 | 0.9983 |
| John | 0.6120 | 0.9737 | 1.5289 | 0.9905 |
| Acts | 0.8136 | 0.9897 | 1.5067 | 0.9977 |
| Apocalypse | 0.9941 | 0.99999 | 1.0895 | 0.9865 |
Table 8. Theoretical slope $m_{jk}$ and correlation coefficient $r_{jk}$ of the regression line according to Section 4, for the indicated input texts. Output channel: Apocalypse. Notice that five decimal digits are reported for the sentences correlation coefficient of Hebrews because its value is very close to 1.

| Input text | $m_{jk}$ (sentences vs. sentences) | $r_{jk}$ (sentences vs. sentences) | $m_{jk}$ (interpunctions vs. interpunctions) | $r_{jk}$ (interpunctions vs. interpunctions) |
|---|---|---|---|---|
| Matthew | 0.6654 | 0.9589 | 1.3789 | 0.9516 |
| Mark | 0.6283 | 0.9841 | 1.4731 | 0.9929 |
| Luke | 0.6774 | 0.9845 | 1.3290 | 0.9754 |
| John | 0.6157 | 0.9747 | 1.4033 | 0.9547 |
| Acts | 0.8184 | 0.9903 | 1.3829 | 0.9731 |
| Hebrews | 1.0060 | 0.99999 | 0.9179 | 0.9865 |
Table 9. Slope $m$ and correlation coefficient $r$ of the regression lines of $n_S$ versus $n_W$ and of $n_I$ versus $n_S$ for the indicated texts of the Greek literature. The slopes and correlation coefficients have been calculated in the same way as those reported in Table 3.

| Author | $m$ ($n_S$ vs. $n_W$) | $r$ ($n_S$ vs. $n_W$) | $m$ ($n_I$ vs. $n_S$) | $r$ ($n_I$ vs. $n_S$) |
|---|---|---|---|---|
| Polybius | 0.0343 | 0.9971 | 3.2432 | 0.9885 |
| Plutarch | 0.0371 | 0.9195 | 3.3539 | 0.9577 |
| Flavius Josephus | 0.0325 | 0.9734 | 3.1891 | 0.9846 |
| Aesop | 0.0545 | 0.9032 | 3.4236 | 0.9302 |
| John | 0.0549 | 0.9181 | 2.6797 | 0.9517 |
Table 10. S-channel, Greek literature. Theoretical signal-to-noise ratio $\Gamma_{th}$ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Polybius and the output is Plutarch, then $\Gamma_{th} = 8.48$; vice versa, if the input is Plutarch and the output is Polybius, then $\Gamma_{th} = 9.81$.

| Text | Polybius | Plutarch | Flavius Josephus | Aesop | John |
|---|---|---|---|---|---|
| Polybius | | 8.48 | 16.08 | 1.42 | 1.78 |
| Plutarch | 9.81 | | 14.12 | 6.51 | 6.38 |
| Flavius Josephus | 15.19 | 12.24 | | 2.30 | 2.47 |
| Aesop | 7.08 | 9.89 | 7.46 | | 28.61 |
| John | 7.28 | 9.78 | 7.51 | 28.74 | |
Table 11. I-channel, Greek literature. Theoretical signal-to-noise ratio $\Gamma_{th}$ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first row. For example, if the input is Polybius and the output is Plutarch, then $\Gamma_{th} = 16.49$; vice versa, if the input is Plutarch and the output is Polybius, then $\Gamma_{th} = 17.06$.

| Text | Polybius | Plutarch | Flavius Josephus | Aesop | John |
|---|---|---|---|---|---|
| Polybius | | 16.49 | 30.80 | 12.15 | 13.19 |
| Plutarch | 17.06 | | 18.32 | 21.07 | 13.91 |
| Flavius Josephus | 30.56 | 17.51 | | 12.77 | 14.11 |
| Aesop | 13.07 | 21.42 | 13.94 | | 13.04 |
| John | 10.84 | 11.94 | 12.02 | 10.77 | |
Table 12. Average value of $I_L$ in S-channels. For example, for the two channels between Hebrews and Apocalypse, Appendix B gives the average value $(0.993 + 0.999)/2 = 0.996$. Cases in which $I_L > 0.5$ are in bold type.

| | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 1 | | | | | | |
| Mk | 0.160 | 1 | | | | | |
| Lk | **0.707** | **0.630** | 1 | | | | |
| Jh | **0.511** | **0.914** | 0.419 | 1 | | | |
| Ac | 0.145 | 0.128 | 0.188 | 0.066 | 1 | | |
| Hb | 0.144 | 0.132 | 0.160 | 0.133 | 0.372 | 1 | |
| Ap | 0.063 | 0.059 | 0.074 | 0.044 | 0.271 | **0.996** | 1 |
Table 13. Average value of $I_L$ in I-channels. Cases in which $I_L > 0.5$ are in bold type.

| | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 1 | | | | | | |
| Mk | 0.427 | 1 | | | | | |
| Lk | **0.681** | 0.485 | 1 | | | | |
| Jh | **0.954** | 0.429 | 0.494 | 1 | | | |
| Ac | **0.785** | **0.571** | **0.863** | **0.750** | 1 | | |
| Hb | 0.051 | 0.096 | 0.084 | 0.037 | 0.067 | 1 | |
| Ap | 0.043 | 0.099 | 0.079 | 0.037 | 0.062 | **0.697** | 1 |
Table 14. Overall average value of $I_L$. For example, for the channels between Hebrews and Apocalypse, Table 12 and Table 13 give the average value $(0.996 + 0.697)/2 = 0.847$. Cases in which $I_L > 0.5$ are in bold type.

| | Mt | Mk | Lk | Jh | Ac | Hb | Ap |
|---|---|---|---|---|---|---|---|
| Mt | 1 | | | | | | |
| Mk | 0.294 | 1 | | | | | |
| Lk | **0.694** | **0.558** | 1 | | | | |
| Jh | **0.733** | **0.674** | 0.457 | 1 | | | |
| Ac | 0.465 | 0.350 | **0.526** | 0.408 | 1 | | |
| Hb | 0.098 | 0.114 | 0.122 | 0.085 | 0.220 | 1 | |
| Ap | 0.053 | 0.079 | 0.077 | 0.041 | 0.167 | **0.847** | 1 |
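
The averaging described in the caption of Table 14 can be illustrated with a few pairs copied from Table 12 and Table 13. The sketch below is only an illustration of that arithmetic and of the $I_L > 0.5$ criterion used for the bold entries; the dictionary and function names are assumptions introduced here.

```python
# I_L values for a few text pairs, copied from Table 12 (S-channels)
# and Table 13 (I-channels).
IL_S_CHANNEL = {("Hb", "Ap"): 0.996, ("Mt", "Lk"): 0.707, ("Mt", "Jh"): 0.511, ("Mt", "Ac"): 0.145}
IL_I_CHANNEL = {("Hb", "Ap"): 0.697, ("Mt", "Lk"): 0.681, ("Mt", "Jh"): 0.954, ("Mt", "Ac"): 0.785}


def overall_il(pair):
    """Average of the S-channel and I-channel I_L for one pair of texts."""
    return (IL_S_CHANNEL[pair] + IL_I_CHANNEL[pair]) / 2.0


overall = {pair: overall_il(pair) for pair in IL_S_CHANNEL}
# Pairs that would be set in bold type (I_L > 0.5) in Table 14.
matching = [pair for pair, value in overall.items() if value > 0.5]

for pair, value in overall.items():
    flag = "I_L > 0.5" if pair in matching else ""
    print(f"{pair[0]}-{pair[1]}: {value:.4f} {flag}")
```

Since the inputs are rounded to three decimals, the averages may differ from the entries of Table 14 in the last digit.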
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Matricciani, E. Linguistic Communication Channels Reveal Connections between Texts: The New Testament and Greek Literature. Information 2023, 14, 405. https://doi.org/10.3390/info14070405
