Derivative of a hypergraph as a tool for linguistic pattern analysis

The search for linguistic patterns, stylometry and forensic linguistics have found in the theory of complex networks, with its structures and associated mathematical tools, allies with which to model and analyze texts. In this paper we present a new model, supported by several mathematical structures such as hypergraphs and the concept of derivative graph, and introduce a new methodology to analyze the mesoscopic relationships between sentences, paragraphs, chapters and texts. The methodology focuses not only on a quantitative index but also on a new mathematical structure that helps both to detect the style of an author and to determine the language level of a text. In addition, these new mathematical structures may be useful to detect similarity and dissimilarity in texts and, eventually, even plagiarism.


Introduction
In the last decades the emergence of new structures and models in the field of complex networks and the successive advances in the study and development of their associated tools have made it possible to model the different types of interactions between the diverse parts of a complex system in an efficient and remarkably successful way in practically all areas of knowledge [9,19,34,46,58,66]. Complex networks have become an essential and indispensable element in the representation of systems for simulating the interactions and relationships between the components of a complex system in domains as diverse as biology, technology, and human social organization [9,10,16,24,30,31,35,49,55,58].
It can be said that Network Science can be traced back to the analysis of heterogeneity in real-world complex systems in both nature and function. Thus, the role played by some nodes in these systems is very different from that obtained by the classical Erdős-Rényi model of random networks, which was a first fundamental milestone in the modeling of real-world complex systems and in the assumption in these models of a first level of heterogeneity [1]. The famous scale-free model made it possible to successfully model real-world complex systems by highlighting the relevant role of nodes with heterogeneous connectivity [1]. A second milestone consisted in the emergence of multilayer network models by taking into account that links could also be heterogeneous in nature [10]. The third milestone is currently being developed by considering that the heterogeneity of complex systems may affect not only the function of links, but also their nature, since links may be formed by subsets of nodes of different cardinality [4]. From collaborative networks to collective social interaction, from trophic networks to biochemical regulatory networks and, in our case, to linguistic networks, many complex systems are produced by considering interactions between more than two nodes simultaneously, making classical network models insufficient. Therefore, the new challenge for the network community is to find new mathematical models that fit multiparty interactions in order to model complex systems with relationships of heterogeneous nature.
The emergence of new tools that allow large datasets to be handled and analyzed automatically has led to the development of new approaches in many areas of knowledge including text analysis [29,30,44].
Classical approaches to the linguistic analysis of texts were based on simple statistical studies that relied on word frequency [3,30]. In recent decades, however, modern linguistics has advanced considerably by treating a language as a complex system or network. This representation makes available all the tools, measures and procedures of complex network theory, and yields an efficient and effective approach to the study of language that includes both qualitative and quantitative aspects [12,17,31,35,37,50,52,54,56,62].
Therefore, the analysis of linguistic theories supported by the study of specialized corpora and the new approach provided by complex networks makes it possible to obtain certain stylistic and typological characteristics together with some intrinsic properties of languages. The perspective provided by complex networks must go beyond the use of word adjacency or co-occurrence methods which, although they successfully capture the syntactic elements of the texts [36], do not have the capacity to represent certain characteristics that develop at the mesoscopic level throughout the text and that have to do with the semantic relationships between the different sentences and paragraphs that compose it.
The linguistic network model we are working with in this manuscript emerges from the need to work with sentences or paragraphs as a group or collection of certain words in contrast to the type of links considered in previous works where directed and weighted links are used to represent the relationships between linguistic units as in [17,54,55]. In this work, as in [22,23], instead of considering the co-occurrence relationship between two adjacent words or linguistic units within a sentence, we will study not only the relationship between sentences (those that share lexical words) but also the relationship between paragraphs or even articles, seeking to characterize, by using network theory parameters, the style of an author or a text as well as the level of language and/or specialization used in the text. This approach leads us to a completely different perspective from the one used, for example, in [30], where, among many other differences, words are transformed and reduced to their canonical forms and the text is organized in consecutive sets of paragraphs.
In order to apply the tools described in this work and to process computationally a linguistic corpus, understood as a collection of texts gathered electronically as a representative sample selected according to certain linguistic criteria [53], we have considered a corpus composed of 86 extended abstracts from volumes 1-6 of the International Journal of Complex Systems in Science (IJCSS), published between April 2011 and November 2016 (http://www.ij-css.org). This corpus provides a total of 147637 words and 25210 sentences, all of which are considered in this study. It should be noted that the unit of analysis from which we start in this work is the sentence, i.e., the words enclosed between two periods [40]. In addition, it is important to note that commas and other punctuation marks within the sentence have not been considered for this analysis.
The questions we addressed when we started to write this paper were "How to characterize the competence level of language used in a text?" or "Can the style of an author be determined using specific parameters in the linguistic network under consideration?" and also, "What is the combination of words most frequently used in a corpus beyond locating the most relevant individual lexical words?" or even "How to determine the most representative words of a text (not necessarily the most frequent)?" Taking into account that the English language has four main classes of words: nouns, adjectives, verbs, and adverbs, and that other classes of words are prepositions, conjunctions, determiners, interjections, or pronouns, we established in [22] a four-layer network in order to study a specialty language. In this paper, we will focus on words belonging to the lexical layer, i.e., those significant words (mainly nouns and adjectives) with a specific meaning relevant to the specialty language under study [22,23].
Therefore, in this paper we use the tools and methodology derived from some complex network structures to describe interactions between groups of words, each of these groups being formed by the lexical words belonging to a specific sentence in the analyzed corpus (syntagmatic approach, from the Greek "σύνταγμα", syntagma: "assembled group"). It is therefore important to note that the syntagmatic approach, which corresponds to the analysis presented in this paper, is different from the paradigmatic approach used in other works of computational linguistics [17].
Since the syntagmatic relationship is based on the interrelationships of words in a linguistic structure [22,23,60], it makes sense to consider the relationships between sets of two, three or more significant words that appear in the same sentence, paragraph, abstract or article and that in some way characterize a text as belonging to an author, or discriminate the level of language used in it, as well as those other words and relations that allow distinguishing it from texts belonging to other authors or that use a different level of language.
The methodology presented here makes it possible to determine the level of language used in a text as well as the style of an author. It also makes it possible to analyze and order sentences, abstracts, paragraphs and texts (sets of words) according to their importance, bearing in mind their interrelationships in the context of the multilayer network structure defined in [22,23], and to extract new features of a text from the relationships between significant sets of words in the text.
High-order networks or hypergraphs are the natural generalization of networks that takes into account the fact that a link can connect more than two nodes. Interest in this type of network is growing due to the inability of classical graphical representations to describe group interactions. Their applicability goes beyond the field of social sciences [6,47,63] and the study of group interactions, public cooperation or opinion formation. In our case, we will consider its applicability in the field of linguistics and specialty languages beyond other approaches based on classical complex networks, multiplex networks or multilayer networks [12,17,35,37,50,51,52,55,61,68].
As it can be easily understood, a property referring to a finite set of objects (in our case, the nodes of a network), is completely characterized by the subset of elements that satisfy it, which in this case will be represented by the hyperedge formed by these elements, so that it will be possible to compare and relate properties of the nodes and the network by studying and analyzing the corresponding hypergraph.
Thus, studying the relationships between the properties of the nodes consists of mathematically analyzing the properties and typical parameters of the associated hypergraph. Therefore, the applications of this methodology to the field of linguistics range from the characterization of an author's style to the detection of plagiarism, including the detection and identification of the same concept expressed in a different way. To this end, starting, in the first instance, from the identification of a sentence of our corpus with the hyperedge formed by the set of lexical words of that sentence, the hypergraph will be constructed in which the nodes will be all the lexical words of the corpus and the hyperedges all the sentences of the corpus, defining the concept of derivative of two words with respect to a set of hyperedges and the degree of independence of two words of a text with respect to that set of hyperedges. This study can be extended in more depth by considering as hyperedges, successively, the sets of nodes formed by the lexical words of a paragraph, an abstract or even a chapter, taking the corresponding sequence of parameters as a feature of the text and pointing to new applications of this structure.
The structure of the paper is as follows. After this introduction, Section 2 introduces some basic concepts and a summary of the most important relationships between the line graph, the dual hypergraph, the bipartite graph associated to a certain hypergraph and their corresponding matrices. Section 3 is devoted to introducing the concept of derivative of a hypergraph with respect to a set of nodes and to establishing the definition of the homogeneity graph of a hypergraph, obtaining some remarkable results related to this new structure. In Section 4 we apply the mathematical concepts and structures defined in the previous sections to obtain tools to characterize the style and level of a text belonging to the linguistic hypergraph considered. In Section 5 the lexical density of the set of texts that make up the analyzed corpus is studied, and some numerical experiments and computational results are presented by using three different algorithms to illustrate the diverse types of relationships that can be established between sentences within a text and their relative importance. Section 6 is devoted to applying the instruments and tools developed in order to obtain distinctive characteristics that allow us to distinguish the styles of the different authors and the linguistic competence levels of the written texts included in the corpus considered. Finally, in Section 7 we present some conclusions of this work.

Basic concepts and some preliminary results
A network (or graph) $G = (X, E)$ is just a finite set of vertices (or nodes) $X = \{1, \dots, N\}$ connected by a set of edges (or links between certain pairs of nodes) $E = \{e_1, \dots, e_m\}$. If the edges have a direction, we will say that $G$ is a directed network (or digraph). In the sequel, we will denote by $e_{ij} \in E$ the link between the nodes $i$ and $j$, although sometimes we will also denote the edge $e_{ij}$ by $\{i, j\}$ or, if $G$ is a directed network, by $i \to j$. Finally, a weighted network is a graph in which each edge $e_{ij}$ has an associated numerical value $w(e_{ij}) = w_{ij}$ called its weight. In the same way, following [7], a hypergraph $H = (X, \varepsilon)$ is a finite set of vertices (or nodes) $X = \{1, \dots, N\}$ and a collection $\varepsilon = \{h_1, h_2, \dots, h_n\}$ of subsets of $X$ such that $h_i \neq \emptyset$ ($i = 1, 2, \dots, n$) and $X = \bigcup_{i=1}^{n} h_i$. Each of these subsets is called a hyperedge. In this way, hypergraphs appeared as the natural extensions of graphs to describe group interactions. In the following sections, the study is developed with undirected graphs and hypergraphs, though some of the definitions can be easily extended to the directed case.
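As a minimal sketch of the definition above, a hypergraph can be stored as a set of nodes plus a list of node subsets. The helper name `is_hypergraph` and the toy data are illustrative assumptions, not part of the original work; the function only checks the two conditions of the definition (no empty hyperedge, and the hyperedges cover the node set).

```python
from typing import Iterable, List, Set

def is_hypergraph(nodes: Set[int], hyperedges: Iterable[Set[int]]) -> bool:
    """Check the two conditions of the definition: every hyperedge is
    non-empty and the union of all hyperedges is the whole node set."""
    edges: List[Set[int]] = [set(h) for h in hyperedges]
    if any(len(h) == 0 for h in edges):
        return False
    covered = set().union(*edges)
    return covered == set(nodes)

# Toy data (hypothetical, not taken from the corpus).
X = {1, 2, 3, 4}
print(is_hypergraph(X, [{1, 2, 3}, {3, 4}]))   # True
print(is_hypergraph(X, [{1, 2}, {3}]))         # False: node 4 is not covered
```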
In order to carry out our study it is necessary to introduce the concepts of linegraph and dual hypergraph of a hypergraph. In this regard it should be noted that the concept of linegraph L(G) associated to a graph G = (X, E) was introduced by H. Whitney in 1932 [67] and extended for higher order networks by J.C. Bermond et al. in 1977 [8, 64]. It is important to point out that the study of these structures, as well as the relationships between them and their applications, has been increasing steadily in recent years (see, for example, [5,6,20,21,32,33,57]).
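So, if $H = (X, \varepsilon)$ is a hypergraph, the linegraph associated to $H$ is the graph $L(H) = (\varepsilon, E')$ in which $\{h_i, h_j\} \in E'$ if and only if $i \neq j$ and $h_i \cap h_j \neq \emptyset$. It is worth noting that the linegraph $L(H)$ of a hypergraph $H$ is a graph even though $H$ is a hypergraph. Note that this concept is a particular case of the concept of intersection graph [57].

On the other hand, it is also possible to consider the dual hypergraph of a hypergraph: if $H = (X, \varepsilon)$ is a hypergraph, the dual hypergraph associated with $H$ is the hypergraph $H^* = (\varepsilon, X^*)$, where $X^* = \{X_1, \dots, X_N\}$ and $X_i = \{h_j \in \varepsilon \mid i \in h_j\}$ for each $i \in X$. It is not difficult to verify that $(H^*)^* = H$. Moreover, if $I$ is the incidence matrix of $H$, then its transpose matrix $I^t$ is the incidence matrix of $H^*$.

In this context, to make the relationship between $L(H)$ and $H^*$ concrete, we consider the function $\Pi_2$ that turns a hypergraph $H = (X, \varepsilon)$ into a graph $\Pi_2(H) = (X, E')$ as follows: $\{i, j\} \in E'$ if and only if $i \neq j$ and there exists $h \in \varepsilon$ such that $i, j \in h$. So, for any hypergraph $H$ we have that $\Pi_2(H^*) = L(H)$. Furthermore, if $G = (X, E)$ is a graph, with $X = \{1, \dots, N\}$, we can also consider the dual hypergraph $G^* = (E, \varepsilon)$ of $G$, where $\varepsilon = \{h_1, \dots, h_N\}$ and, for every $i \in \{1, \dots, N\}$, the corresponding hyperedge is $h_i = \{e_j \in E \mid i \in e_j\}$, and also $\Pi_2(G^*) = L(G)$.

Now, if we denote by $I(H)$ the incidence matrix of $H$, then it is not difficult to verify that $I(H)\, I(H)^t = A(H)$, where $A(H) = (a_{ij})$ and $a_{ij} = |\{h \in \varepsilon \mid i, j \in h\}|$. In fact, if we consider in addition the bipartite network $B(H) = (X \cup \varepsilon, E(H))$ associated to the hypergraph $H = (X, \varepsilon)$, in which a node $i \in X$ is linked to a hyperedge $h \in \varepsilon$ whenever $i \in h$, then its adjacency matrix is given by

$$\begin{pmatrix} 0 & I(H) \\ I(H)^t & 0 \end{pmatrix}.$$

The matrix $A(H) = (a_{ij})$ is called the frequency matrix of relations between the elements (nodes) of the hypergraph $H$ (see [39]).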
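The constructions of this section can be checked on a small case. The following sketch, in which the function names and the toy hypergraph are hypothetical, builds the projection Π2, the linegraph and the dual hypergraph with plain Python sets, and verifies on that example that Π2(H*) = L(H) and that the transpose of the incidence matrix of H is the incidence matrix of H*. Hyperedges are identified by their position in the list.

```python
from itertools import combinations

def projection(hyperedges):
    """Pi_2: link two distinct nodes whenever some hyperedge contains both."""
    return {frozenset(pair)
            for h in hyperedges
            for pair in combinations(sorted(h), 2)}

def line_graph(hyperedges):
    """L(H): nodes are hyperedge indices, linked when the hyperedges intersect."""
    return {frozenset((a, b))
            for a, b in combinations(range(len(hyperedges)), 2)
            if hyperedges[a] & hyperedges[b]}

def dual(nodes, hyperedges):
    """H*: for each node i, the set of indices of the hyperedges containing i."""
    return [{k for k, h in enumerate(hyperedges) if i in h}
            for i in sorted(nodes)]

def incidence(nodes, hyperedges):
    """Rows are nodes, columns are hyperedges; entry 1 when the node is in it."""
    return [[1 if i in h else 0 for h in hyperedges] for i in sorted(nodes)]

# Toy hypergraph (hypothetical).
X = {1, 2, 3, 4, 5}
E = [{1, 2, 3}, {3, 4}, {5}]
E_dual = dual(X, E)

same_graph = projection(E_dual) == line_graph(E)
I = incidence(X, E)
I_t = [list(col) for col in zip(*I)]
same_matrix = I_t == incidence(range(len(E)), E_dual)
print(same_graph, same_matrix)  # True True
```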

Hypergraphs and Derivative graph
Quantifying the similarity between two models or structures is one of the most important aspects that has contributed to the development of theories and models in science and technology. There are multiple works whose objective is to model generic data sets in the field of complex networks in order to, by using the constructed model, study the level of similarity or coincidence of such data [15,28,65]. Thus, since the introduction of Jaccard's index in 1901 [43], through different adaptations and generalizations of this concept [25,65], several types of indexes and generalizations have been established with the aim of quantifying the similarity between two sets or mathematical structures [15,41,65,25,27,28].
The basic Jaccard index to compare the degree of coincidence or similarity between two sets $A$ and $B$ can be obtained from the formula

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}.$$

The different applications of the Jaccard index over time have made possible the development of new indexes that improve the accuracy of the original results. So, the overlap index and the coincidence similarity [26,27,28,65] are examples of additional indexes that make it possible to establish similarity between certain types of models and structures, including approaches aimed at quantifying similarity between paragraph contents using the concept of multisets [26]. In our case, we are going to introduce a methodology to analyze and quantify the similarity between two nodes $i, j$ of a hypergraph, applying it to the study of the linguistic network built through the corpus under study.
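For reference, the Jaccard index of two finite sets can be computed directly from its definition as the size of the intersection divided by the size of the union. In the short sketch below, the function name, the word sets and the convention chosen for two empty sets are assumptions made only for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two finite sets."""
    union = a | b
    if not union:
        return 1.0  # convention assumed here for two empty sets
    return len(a & b) / len(union)

# Two hypothetical sets of lexical words.
s1 = {"complex", "network", "model"}
s2 = {"network", "model", "hypergraph", "text"}
print(jaccard(s1, s2))  # 2 shared words out of 5 distinct words -> 0.4
```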
In this section we are going to introduce the concept of derivative graph of a hypergraph with the idea of associating not only a numerical index that allows us to quantify the heterogeneity and absence of similarity between the corresponding hyperedges, but also a structure (in this case a graph) that characterizes the heterogeneity and dissimilarity of the elements of the hypergraph under consideration. We are now in a good position to establish the concept of derivative of a hypergraph over a pair of nodes:

Definition 3.1. Given a hypergraph $H = (X, \varepsilon)$, we will call the derivative of $H$ with respect to the pair of nodes $i, j \in X$ the numerical value $\frac{\partial H}{\partial\{i,j\}}$ obtained by applying the following formula

$$\frac{\partial H}{\partial\{i,j\}} = \frac{|\{h \in \varepsilon \mid (i \in h \wedge j \notin h) \vee (i \notin h \wedge j \in h)\}|}{|\{h \in \varepsilon \mid i, j \in h\}|}.$$

Obviously, if there is no hyperedge $h \in \varepsilon$ such that $i, j \in h$, we will have $\frac{\partial H}{\partial\{i,j\}} = \infty$, and if $\forall h \in \varepsilon$ $(i \in h \Leftrightarrow j \in h)$ then we will have $\frac{\partial H}{\partial\{i,j\}} = 0$. Note that $\forall i, j \in X$ we have that $\frac{\partial H}{\partial\{i,j\}} \geq 0$. It is important to point out that the above definitions can be extended without difficulty to the context of a collection of sets (which would play the role of the hyperedges) and of the elements (respectively the nodes) of the sets of that collection.
If we now consider each hyperedge $h \in \varepsilon$ as a property or a feature that a node may or may not have, or even as an event in which a particular node may or may not participate, so that the entire hypergraph is a set of features or events, the value of $\frac{\partial H}{\partial\{i,j\}}$ characterizes the (relative) heterogeneity of the properties $\varepsilon$ satisfied simultaneously by nodes $i$ and $j$, or the intensity of participation of the nodes $i$ and $j$ in the set of events $\varepsilon$. Moreover, the smaller the value of the derivative of the network with respect to the set of events over the pair of nodes $i, j$, the greater the identification and similarity between the corresponding nodes $i, j$ with respect to the considered set of events (in fact, if $\frac{\partial H}{\partial\{i,j\}} = 0$, these nodes, which participate in exactly the same hyperedges, are so similar that they are, from the point of view of $H$, indistinguishable). In other words, the higher the value of the derivative, the greater the degree of unequal participation of the nodes in the hyperedges. Thus, it makes sense to give the following definitions:

Definition 3.3. Given a hypergraph $H = (X, \varepsilon)$ and $i, j \in X$, we will call degree of independence of $i$ and $j$ with respect to $H$ the numerical value of $\frac{\partial H}{\partial\{i,j\}}$.

Definition 3.4. Given a hypergraph $H = (X, \varepsilon)$, the derivative graph $\partial H$ of $H$ is the weighted graph obtained by considering the derivative of $H$ with respect to all the pairs of nodes $i, j \in X$, and by setting, $\forall i, j \in X$, the corresponding numerical value of $\frac{\partial H}{\partial\{i,j\}}$ on the edge $\{i, j\}$, in such a way that if $\frac{\partial H}{\partial\{i,j\}} = 0$, then the nodes $i$ and $j$ collapse into a single node $(ij)$, and bearing in mind that if $\frac{\partial H}{\partial\{i,j\}} = \infty$, then the edge $\{i, j\}$ does not exist in the derivative graph.
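The construction of the derivative graph can be sketched in a few lines. The sketch below is written under an explicit assumption: the derivative of a pair is taken as the number of hyperedges containing exactly one of the two nodes divided by the number of hyperedges containing both, which is one reading consistent with the properties stated above (value ∞ when no hyperedge is shared, value 0 when both nodes belong to exactly the same hyperedges). Function names and the toy hypergraph are hypothetical.

```python
import math
from itertools import combinations

def derivative(hyperedges, i, j):
    """Assumed formula: (#hyperedges with exactly one of i, j) / (#hyperedges with both)."""
    both = sum(1 for h in hyperedges if i in h and j in h)
    exactly_one = sum(1 for h in hyperedges if (i in h) != (j in h))
    return math.inf if both == 0 else exactly_one / both

def derivative_graph(nodes, hyperedges):
    """Return (merged_nodes, weights): nodes at derivative 0 are collapsed
    into one tuple, and pairs at derivative +inf receive no edge."""
    classes = {}
    for i in sorted(nodes):
        membership = frozenset(k for k, h in enumerate(hyperedges) if i in h)
        classes.setdefault(membership, []).append(i)
    merged = [tuple(group) for group in classes.values()]
    weights = {}
    for a, b in combinations(merged, 2):
        d = derivative(hyperedges, a[0], b[0])  # any representative gives the same value
        if not math.isinf(d):
            weights[(a, b)] = d
    return merged, weights

# Toy hypergraph (hypothetical): "a" and "e" share all their hyperedges.
X = {"a", "b", "c", "d", "e"}
E = [{"a", "b", "e"}, {"b", "c"}, {"c", "d"}, {"a", "c", "e"}]
merged, weights = derivative_graph(X, E)
print(merged)   # [('a', 'e'), ('b',), ('c',), ('d',)]
print(weights)
```

In this toy case the pair ("a", "d") shares no hyperedge, so no edge joins the merged node ("a", "e") and "d", while "a" and "e" collapse into a single node.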
Globally, it can be said that the derivative graph ∂H gives us a representation of the degree of heterogeneity of participation of nodes on the different hyperedges of H.
Assuming that if $k$ is any positive number then $\frac{k}{0} = +\infty$ and $\frac{k}{\infty} = 0$, for the sake of continuity and consistency of the established concepts, we are interested in defining the homogeneity matrix and homogeneity graph of a hypergraph:

Definition 3.5. Given a hypergraph $H = (X, \varepsilon)$, we will call homogeneity matrix of $H$ the matrix $\mathcal{H}(H) = (h_{ij}) \in \mathbb{R}^{N \times N}$ defined by

$$h_{ij} = \left(\frac{\partial H}{\partial\{i,j\}}\right)^{-1}.$$

Definition 3.6. Given a hypergraph $H = (X, \varepsilon)$, the homogeneity graph $HG(H)$ of $H$ is the weighted graph with the same nodes and edges as $\partial H$, but considering as the weight of each edge the inverse value of the corresponding weight in the derivative graph $\partial H$.
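The passage from the derivative graph to the homogeneity graph only inverts edge weights, using the conventions k/0 = +∞ and k/∞ = 0. A minimal sketch follows; the helper names and the weights of the three edges are hypothetical.

```python
import math

def inverse_weight(d: float) -> float:
    """1/d with the conventions k/0 = +inf and k/inf = 0."""
    if d == 0:
        return math.inf
    if math.isinf(d):
        return 0.0
    return 1.0 / d

def homogeneity_graph(derivative_weights):
    """Same edges as the derivative graph, each weight replaced by its inverse."""
    return {edge: inverse_weight(w) for edge, w in derivative_weights.items()}

# Hypothetical derivative-graph weights on three edges.
dH = {("a", "b"): 2.0, ("b", "c"): 0.5, ("c", "d"): 4.0}
HG = homogeneity_graph(dH)
print(HG)  # {('a', 'b'): 0.5, ('b', 'c'): 2.0, ('c', 'd'): 0.25}
```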
At this point it is remarkable that the application of the PageRank algorithm on the homogeneity graph HG(H) will allow us to extract the most representative nodes of the hypergraph, in the sense that the nodes located in the first places of the ranking obtained will be the "most similar" (in the sense that underlies the definition of homogeneity graph) to each other and to the rest of the nodes of the hypergraph as it will be shown in Section 5.
To clarify the concepts and ideas introduced, let's examine the following example: so that the derivative graph $\partial H$ is the one represented in part (b) of Figure 1 and the homogeneity matrix of $H$ is: Note that the edge $\{1, 4\} \in E$ has been removed in the derivative network $\partial H$ and that nodes 1 and 5 have collapsed into a single node in the obtained network. So, the adjacency matrix of the homogeneity graph $HG(H)$ is: where the set of nodes of $HG(H)$ is $\{(1, 5), 2, 3, 4\}$, ordered as they appear (panel (c) of Figure 1).

It is worth noting that, in a similar way as it has been done in Definition 3.1, it is possible to establish the derivative of a hypergraph with respect to a set of three or more nodes as follows: where $a_{ijk} = |\{h \in \varepsilon \mid i, j, k \in h\}|$, and the same type of formula can be obtained for sets of nodes of higher cardinality. Note that the same idea can be extended to the definition of degree of independence of several nodes as follows: given a hypergraph $H = (X, \varepsilon)$ and $i_1, \dots, i_n \in X$, the degree of independence of $i_1, \dots, i_n$ in $H$ is the numerical value $\frac{\partial H}{\partial\{i_1, \dots, i_n\}}$.
Finally, it is remarkable that the use of the PageRank algorithm on the homogeneity graph will allow us to extract a ranking of the most representative individuals (or nodes) of either the hypergraph or the network under consideration.
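To illustrate how such a ranking can be obtained, the following sketch runs a plain power iteration of PageRank on a small weighted graph. It is not the specific algorithm used in this work; the damping factor 0.85, the uniform personalization vector, the treatment of isolated nodes and the weight matrix `W` (a hypothetical homogeneity graph on four nodes) are all assumptions made for illustration.

```python
def pagerank(weights, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a weighted adjacency matrix (list of rows)."""
    n = len(weights)
    v = [1.0 / n] * n                       # uniform personalization vector (assumed)
    strength = [sum(row) for row in weights]
    pi = v[:]
    for _ in range(max_iter):
        new = [(1 - alpha) * v[j] for j in range(n)]
        for i in range(n):
            for j in range(n):
                # isolated nodes redistribute their mass according to v (assumed)
                p_ij = weights[i][j] / strength[i] if strength[i] > 0 else v[j]
                new[j] += alpha * pi[i] * p_ij
        if sum(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Hypothetical homogeneity weights; "(a,e)" stands for two collapsed nodes.
labels = ["(a,e)", "b", "c", "d"]
W = [
    [0.0, 1 / 2, 1 / 3, 0.0],
    [1 / 2, 0.0, 1 / 3, 0.0],
    [1 / 3, 1 / 3, 0.0, 1 / 2],
    [0.0, 0.0, 1 / 2, 0.0],
]
scores = pagerank(W)
ranking = sorted(zip(labels, scores), key=lambda t: -t[1])
print(ranking[0][0])  # c
```

In this toy graph the node "c", which has the largest total homogeneity weight, comes first in the ranking.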
To conclude this section, it must be noted that when both graphs and hypergraphs are used simultaneously to model certain complex systems, it is sometimes very useful to analyze how these structures interact and overlap using the tools introduced in this section. In this regard, it should be noted that the tools introduced in this section can be used to capture intrinsic and mesoscopic characteristics of a graph and to define new invariants of graphs and isomorphic networks. For example, given a graph G = (X, E), we can consider the hypergraph H = (X, ε) such that each of its hyperedges is formed by all the nodes that are part of a cycle, or by all the nodes that are part of a spanning tree of G. The most accurate framework to work with the overlapping of these structures is the use of hyperstructures.
In [18] we can find a first definition of the concept of hyperstructure as follows: It is not difficult to prove the following result: It is important to highlight that by using the idea of derivative we have introduced in this paper we can examine and determine the uniformity of participation of two, three or more nodes in the considered structure or hyperstructure, or even the binary relationships (edges) between participants of a certain event by simply considering a suitable hyperstructure in which the nodes are the edges of the original graph under consideration. Now, we can define the derivative graph of a weighted hyperstructure:

Definition 3.10. Given a hyperstructure $S = (X, E, H)$, where $G = (X, E, W)$ is a weighted graph and $H = (X, \varepsilon)$, if $w_{ij}$ denotes the weight of the edge $e = \{i, j\} \in E$, then we will call the derivative of $e$ with respect to the hyperstructure $S$ the numerical value obtained by applying the following formula Obviously, if there is no hyperedge $h \in H$ such that $e = \{i, j\} \subseteq h$, we will have $\frac{\partial e}{\partial S} = \infty$. On the other hand, it is evident that if a hyperstructure is compatible, the derivative of any edge with respect to $S$ cannot be equal to $+\infty$.

As a direct application of the definition, note that if we consider the graphs $G = (X, E)$ (panel (a) of Figure 2) and $G' = (X, E')$ (panel (b) of Figure 2) and the hyperstructures $S = (X, E, H)$ and $S' = (X, E', H')$ such that each of their hyperedges is composed of all the nodes belonging to a cycle formed by three or more nodes of $G$ and $G'$ respectively, then the derivative graphs $\frac{\partial G}{\partial S}$ and $\frac{\partial G'}{\partial S'}$ are completely different since, for example, On the other hand, as can be seen, and, obviously, $\frac{5}{2} \neq \frac{31}{10}$.
Note that Definition 3.11 allows us to iterate the derivatives with respect to a hyperstructure, because if the graph derived from the hyperstructure is $\frac{\partial G}{\partial S} = (X', E', W')$ and $S' = (X', E', K)$, then we can consider the mixed derivative of a graph $G$ with respect to two different hyperstructures (which may eventually be the same) $S$ and $S'$ (in this order) as the derivative of the graph $\frac{\partial G}{\partial S}$ with respect to $S'$. It is obvious that the successive derivative graphs obtained by differentiating with respect to a suitable chain of two or more hyperstructures make it possible to obtain characteristics and properties of the system or model under study related to the absence of similarity between the nodes.

A linguistic hyperstructure based on the lexical layer within a multilayer linguistic network model
We are now ready to show the potential applications of the defined mathematical structures and tools to the linguistic analysis of texts, looking for the identification of signs and specific features of a style or competence level of language by considering the most significant words and their relationships. It can be said that the English language has four major grammatical word categories: nouns, adjectives, verbs, and adverbs. Other word classes are prepositions, conjunctions, determiners, interjections or pronouns [42]. On this basis, as described in [22,23], we have built a methodology close to supervised machine learning that consists of dividing the words of the corpus under study into a multilayer network [10] composed of four layers: lexical layer, verb layer, linking layer and remaining words layer.
In order to discriminate between the terms (words) and to assign them to one or another layer, a completely lexical linguistic decision was made according to the criteria of several experts. Thus, the terms (words) of the corpus have been distributed in the different layers according to their morphological and lexical properties. Some other linguistic aspects, such as the specific terminology of a specialty language and the different combinations of words that give rise to new meanings (called "linguistic collocations") have also been successfully studied and modeled in [22,23].
In the model established in [22] interlayer relations are the basic grammatical relations in a sentence, for example, the interaction between layers that facilitates the formation and description of specialty verbs (e.g. "cluster together"). On the other hand, throughout the present work, we will consider the sentences as the unit under study, identifying each sentence in the corpus (set of words located between two periods) with the subset of lexical words appearing in that sentence.
For this reason, throughout this work we are going to focus on the words (nodes) located in the lexical layer. At this point, it is remarkable that in the lexical layer many words can act as verbs when we analyze texts written by authors with higher language skills. For example, within the sentence "model a network", the word "model" is a verb, but in the expression "network model" the term "model" is a noun.
In order to set our approach, the model of the corpus analyzed is considered as a set of texts formed by sentences (set of lexical words between two periods). In fact, from a practical and computational point of view, each sentence is identified with the set of lexical words that compose it. This way, let us consider the hyperstructure in which the nodes are the lexical words, the edges between these nodes are established when these words appear in the same sentence, and the set of hyperedges is the set of sentences that constitute the corpus.
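The identification of each sentence with its set of lexical words can be sketched as follows. The lexical layer is represented here by a small hand-written set, `LEXICAL_LAYER`, which is purely hypothetical (in this work the assignment of words to layers follows expert criteria); the function name and the sample text are also illustrative. Sentences are split on periods, other punctuation is ignored, and sentences without lexical words are dropped so that no hyperedge is empty.

```python
import re

def sentences_to_hyperedges(text, lexical_words):
    """Map each sentence (text between two periods) to its set of lexical words."""
    hyperedges = []
    for sentence in text.split("."):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        lexical = {t for t in tokens if t in lexical_words}
        if lexical:                      # hyperedges must be non-empty
            hyperedges.append(lexical)
    return hyperedges

# Hypothetical lexical layer and sample text.
LEXICAL_LAYER = {"network", "model", "complex", "hypergraph"}
text = ("We model a complex network. "
        "The network model, in short, is a hypergraph. It works.")

hyperedges = sentences_to_hyperedges(text, LEXICAL_LAYER)
nodes = set().union(*hyperedges)
print(hyperedges)
print(sorted(nodes))  # ['complex', 'hypergraph', 'model', 'network']
```

The third sentence contains no word of the assumed lexical layer, so it contributes no hyperedge.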
It is important to point out that the linguistic hyperstructure considered is a compatible hyperstructure, since the edges are established between words that appear in the same sentence. Therefore, by Theorem 3.9, it is possible to study both the hyperstructure in which the nodes are the words and the hyperedges are the sentences and, in a complementary way, the hyperstructure in which the nodes are the edges between words (dual graph of the original graph) and the hyperedges are also the sentences.
On the other hand, by considering paragraphs as a set of sentences, and the extended abstracts of our corpus as a set of paragraphs, we can add to this model new linguistic hyperstructures that undoubtedly allow us to characterize a text or set of texts from the derivatives of the corresponding graphs and hypergraphs respectively.
In order to illustrate how useful are the tools presented in the context of the linguistic analysis of texts, let us consider a text in which the same sentence is repeated over and over again. In that case, by deriving the linguistic hypergraph formed by the set of all the repeated sentences with respect to the lexical words of the sentence repeated over and over in all those sentences, the derivative graph will collapse to a single node.
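This limiting case can be checked without fixing any particular formula for the derivative, since it only uses the property that two words belonging to exactly the same hyperedges have derivative 0 and therefore collapse. In the sketch below the sentence and the number of repetitions are hypothetical.

```python
# A text made of the same sentence repeated four times (hypothetical data).
sentence = {"complex", "network", "model"}
hyperedges = [set(sentence) for _ in range(4)]

# For every word, record the indices of the hyperedges in which it appears.
membership = {
    w: frozenset(k for k, h in enumerate(hyperedges) if w in h)
    for w in sentence
}

# Words with identical membership collapse into one node of the derivative graph.
classes = set(membership.values())
print(len(classes))  # 1
```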
So, by calculating the derivative graph from the linguistic hypergraph composed by all the sentences of a corpus or a text, we will obtain the degree of similarity between the sentences of that text, and also the greater or lesser degree of difference between all the sentences forming such text (or corpus), with the peculiarity that these quantitative measures are represented in the corresponding derivative graph.
Consequently, the derivative graph of a text or a set of texts is a quantitative and qualitative structure that is a specific feature of that text (or set of texts) and that may be considered, in certain cases, as a signature or specific characteristic of the style of an author.
When analyzing the hypergraph $H$ formed by all the sentences of the corpus under study, we obtained 127 pairs of words that appear in exactly the same sentences. Thus, for example, $\frac{\partial H}{\partial\{\text{monte}, \text{carlo}\}} = \frac{\partial H}{\partial\{\text{differential}, \text{rungekutta}\}} = \frac{\partial H}{\partial\{\text{oscillatory}, \text{asynchronous}\}} = 0$.
It is important to note at this point that, if three or more words in the corpus analyzed appear in exactly the same sentences, these words have collapsed into a single node. This has happened in 13 cases. Finally, and by way of illustrative example, we will point out that Figure 3 shows the homogeneity graph corresponding to the corpus considered, in which the thickness of each edge is proportional to its weight. On the other hand, as can be seen in the right part of Figure 3, there is no link between "features" and "properties" because $\frac{\partial H}{\partial\{\text{feature}, \text{properties}\}} = +\infty$, and the edge joining "networks" and "complex" is thicker than the rest. Also, as can be seen in the histogram of Figure 4, there are more than $10^3$ pairs of words $\{i, j\}$ such that $0 \leq \frac{\partial H}{\partial\{i,j\}} \leq 10$ and more than $10^6$ pairs of words $\{i, j\}$ whose derivative is $+\infty$ (note that in Figure 4 the length of the intervals of the horizontal axis is 10).
To conclude this section, we would like to point out some possible applications of the methodology underlying this model: the automatic extraction of the linguistic level of a corpus, the search for lexical patterns in the sentences of a given author or of a particular specialized language, the search for similarities and differences in a set of texts, and the automatic classification of texts according to these differences or similarities.

On lexical density and three different rankings of sentences: computational results
As is well known, the personalized PageRank of an individual term (node) $i$ is the $i$-th component of the stationary state $\pi_0 \in \mathbb{R}^n$ ($\|\pi_0\|_1 = 1$) of the random walker whose transition matrix [11,13,14,38] is determined by a parameter $\alpha \in (0, 1)$, the adjacency matrix $B = (b_{ij})$ of the network under consideration, the vector $e^T = (1, \dots, 1)$ and the personalization vector $v \in \mathbb{R}^n$ ($\|v\|_1 = 1$).
To carry out our study on the hypergraph $H$, in which the vertices are the lexical words of the corpus and the hyperedges are the sentences (sets of lexical words of the corpus located between two periods), we use the same methodology as in [22] and [23] to associate a PageRank value to each node, with the aim of ranking the lexical words according to their importance [11,13,14,48,59]. All PageRank calculations in this work use the algorithm described in [2], which we apply to three different structures obtained from three different criteria:
1. Ranking 1. To calculate this ranking, we first build a graph on which to apply the PageRank algorithm: each hyperedge of $H$ is converted into a clique, which yields the projection graph $\Pi_2(H)$. Since the average number of words per sentence in the corpus under study is 5.809, the local lexical density is 5.809, and the damping factor corresponding to this configuration is 0.853, since $5.809/(1 + 5.809) \approx 0.853$.
2. Ranking 2. To calculate this ranking, we apply the PageRank algorithm to the network $\Pi_2(H^*) = L(H)$ so that, once the numerical value attributed to each sentence has been obtained, this value is distributed proportionally among the words that make up that sentence. In this case the network is directed: two sentences $s_1, s_2 \in L(H)$ are connected if they have at least one lexical word in common, and the edge weight $w(s_1 \to s_2)$ is the number of words shared by both sentences multiplied by the number of times that sentence $s_2$ is repeated in the corpus. Hence $w(s_1 \to s_2)$ may differ from $w(s_2 \to s_1)$. Using the same reasoning as in the previous case, and given that the average number of sentences per paper in the corpus under study is 27.12, the damping factor corresponding to this configuration is 0.96.
3. Ranking 3. To calculate this ranking, we apply the PageRank algorithm to the weighted graph $HG(H)$. Since the average number of words per sentence is 5.756 (after collapsing the word pairs $\{i, j\}$ such that $\frac{\partial H}{\partial\{i,j\}} = 0$, the average sentence length decreases, albeit slightly), the damping factor corresponding to this configuration is 0.852.
Figure 3 shows the homogeneity graph corresponding to the corpus considered. The size of each node is proportional to the component of the PageRank vector corresponding to that node, and the thickness of each edge is proportional to its weight.
In all the cases described, the corresponding value of $q$ (the damping factor) is the probability that the random walker continues its trajectory by moving to a node directly connected by an edge to the current node, instead of jumping to another node of the network not necessarily connected to the current one. In our setting, this jump can be understood as the end of the current sentence and the start of a new sentence for Ranking 1 and Ranking 3, and as the end of the current paper and the start of a new paper for Ranking 2. To complete the elements needed to apply the algorithm, we point out that for Ranking 1 and Ranking 3 the personalization vector is the (relative) frequency of the lexical words, and for Ranking 2 it is the (relative) frequency of each sentence included in the corpus under study.
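The ingredients of Ranking 1 can be put together in a minimal sketch. It is not the algorithm of [2]: it is a plain power iteration on a toy corpus, written under several assumptions that are not confirmed by the text, namely that the damping factor is obtained from the average length $\ell$ as $q = \ell/(1+\ell)$ (a relation that reproduces the three quoted values 0.853, 0.96 and 0.852), that the clique projection is weighted by the number of shared sentences, and that the walker follows the standard personalized PageRank update $\pi \leftarrow q\,\pi P + (1-q)\,v$ with $P$ the row-normalized adjacency matrix. All function names and the toy sentences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def damping_from_length(avg_len):
    """ASSUMED relation q = l / (1 + l); it matches the values quoted in the text."""
    return avg_len / (1.0 + avg_len)

def clique_projection(hyperedges):
    """Turn every hyperedge into a clique (weight = number of shared hyperedges)."""
    adj = {}
    for h in hyperedges:
        for word in h:
            adj.setdefault(word, {})
        for a, b in combinations(sorted(h), 2):
            adj[a][b] = adj[a].get(b, 0) + 1
            adj[b][a] = adj[b].get(a, 0) + 1
    return adj

def personalized_pagerank(adj, v, q, tol=1e-12, max_iter=1000):
    """Power iteration for pi = q * pi * P + (1 - q) * v (standard form, assumed)."""
    nodes = sorted(adj)
    pi = dict(v)
    for _ in range(max_iter):
        new = {n: (1.0 - q) * v[n] for n in nodes}
        for n in nodes:
            strength = sum(adj[n].values())
            if strength == 0:
                # Isolated node: its mass is redistributed according to v.
                for m in nodes:
                    new[m] += q * pi[n] * v[m]
            else:
                for m, w in adj[n].items():
                    new[m] += q * pi[n] * w / strength
        converged = sum(abs(new[n] - pi[n]) for n in nodes) < tol
        pi = new
        if converged:
            break
    return pi

# Hypothetical toy corpus: each sentence is the set of its lexical words.
sentences = [
    {"complex", "network", "model"},
    {"network", "system", "dynamics"},
    {"network", "model", "process"},
    {"system", "process"},
]

# Personalization vector: relative frequency of each lexical word.
counts = Counter(word for s in sentences for word in s)
total = sum(counts.values())
v = {word: c / total for word, c in counts.items()}

avg_len = total / len(sentences)      # average sentence length of the toy corpus
q = damping_from_length(avg_len)
scores = personalized_pagerank(clique_projection(sentences), v, q)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Ranking 3 would follow the same pattern with the weighted graph $HG(H)$ in place of the clique projection, and Ranking 2 with sentences as nodes and sentence frequencies as the personalization vector.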
As can be seen in Table 1, there is hardly any difference at the top of the three rankings. As expected, Ranking 3 gives the most representative words of the corpus, in the sense that they are the words at the heart of the corpus that link the largest number of sentences together. In any case, the three rankings should not differ greatly from each other, and indeed they do not: the first four positions are occupied by the same words in all three cases, and Ranking 1 and Ranking 3 are more similar to each other than to Ranking 2. However, as the number of words considered at the top of each ranking increases, the differences between the three rankings become much more evident, as can be seen in Figure 5, which plots the variation of Kendall's tau coefficient ($\tau$) [45] as a function of the number of lexical words considered in the three rankings.

Table 1. First 15 positions of the three rankings.

Position  Ranking 1     Ranking 2     Ranking 3
1st       network       network       network
2nd       system        system        system
3rd       model         model         model
4th       complex       complex       complex
5th       process       number        graph
6th       number        process       process
7th       information   structure     structure
8th       graph         new           information
9th       new           information   number
10th      structure     distribution  new
11th      properties    properties    properties
12th      distribution  graph         distribution
13th      study         study         dynamics
14th      dynamics      dynamics      study
15th      case          interaction   analysis
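A comparison of this kind can be sketched with the first 15 positions reported in Table 1. The text does not state how words that appear in only one of two truncated rankings are handled, so the sketch below makes the assumption of restricting Kendall's $\tau$ to the words shared by both top-$k$ lists; the function name `kendall_tau_top_k` is hypothetical, and the values it produces are not claimed to reproduce Figure 5.

```python
from itertools import combinations

# First 15 positions of each ranking, copied from Table 1.
RANKING_1 = ["network", "system", "model", "complex", "process", "number",
             "information", "graph", "new", "structure", "properties",
             "distribution", "study", "dynamics", "case"]
RANKING_2 = ["network", "system", "model", "complex", "number", "process",
             "structure", "new", "information", "distribution", "properties",
             "graph", "study", "dynamics", "interaction"]
RANKING_3 = ["network", "system", "model", "complex", "graph", "process",
             "structure", "information", "number", "new", "properties",
             "distribution", "dynamics", "study", "analysis"]

def kendall_tau_top_k(rank_a, rank_b, k):
    """Kendall's tau between two rankings truncated to their first k words.

    ASSUMPTION: only the words present in both truncated lists are compared.
    Returns None when fewer than two words are shared.
    """
    top_a, top_b = rank_a[:k], rank_b[:k]
    pos_a = {word: p for p, word in enumerate(top_a)}
    pos_b = {word: p for p, word in enumerate(top_b)}
    common = [word for word in top_a if word in pos_b]
    if len(common) < 2:
        return None
    concordant = discordant = 0
    for x, y in combinations(common, 2):
        # Positions are distinct, so the product is never zero (no ties).
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Tau as a function of the number of words considered, for each pair of rankings.
pairs = {"1-2": (RANKING_1, RANKING_2),
         "1-3": (RANKING_1, RANKING_3),
         "2-3": (RANKING_2, RANKING_3)}
tau_curves = {name: [kendall_tau_top_k(a, b, k) for k in range(2, 16)]
              for name, (a, b) in pairs.items()}
```

Since the first four positions coincide in all three rankings, every curve starts at 1 and can only decrease once the rankings begin to disagree.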

Seeking distinctive characteristics that allow distinguishing the styles of different authors and language levels
By considering several types and models of hypergraphs and hyperstructures for a given text or corpus, we can associate with that text or corpus various features that identify it, as if they were a kind of mathematical signature. For example, for a given text it is possible to consider a hypergraph in which the nodes are the words and the hyperedges are the sentences, another in which the nodes are the words and the hyperedges are the paragraphs, and another in which the nodes are the sentences and the hyperedges are the paragraphs, to mention just some of the possibilities. This succession of mathematical structures, together with the parameters that characterize them (such as diameter, degree distribution, centrality and efficiency, among others), allows us to characterize and compare different texts, making clear the characteristics that constitute their seal of identity in terms of style.
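The three hypergraphs mentioned above can be built from the same text in a few lines. The following is only a sketch under simplifying assumptions that are not part of the model: paragraphs are separated by blank lines, sentences end at periods, and every alphabetic token counts as a word (no filtering of lexical words or lemmatization is attempted). The function names are hypothetical.

```python
import re

def split_paragraphs(text):
    """Paragraphs are assumed to be separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(paragraph):
    """Sentences are assumed to be the fragments located between periods."""
    return [s.strip() for s in paragraph.split(".") if s.strip()]

def words_of(fragment):
    """Set of lower-cased alphabetic tokens (no lexical filtering)."""
    return frozenset(re.findall(r"[a-z]+", fragment.lower()))

def build_hypergraphs(text):
    """Return three lists of hyperedges built from the same text:
    words/sentences, words/paragraphs and sentences/paragraphs."""
    word_sentence = []       # nodes: words, hyperedges: sentences
    word_paragraph = []      # nodes: words, hyperedges: paragraphs
    sentence_paragraph = []  # nodes: sentence ids, hyperedges: paragraphs
    next_id = 0
    for paragraph in split_paragraphs(text):
        ids = []
        for sentence in split_sentences(paragraph):
            word_sentence.append(words_of(sentence))
            ids.append(next_id)
            next_id += 1
        word_paragraph.append(words_of(paragraph))
        sentence_paragraph.append(frozenset(ids))
    return word_sentence, word_paragraph, sentence_paragraph

sample = ("Networks model systems. Systems are complex.\n\n"
          "Hypergraphs model sentences.")
ws, wp, sp = build_hypergraphs(sample)

# One simple parameter per structure: the distribution of hyperedge sizes.
sentence_sizes = sorted(len(h) for h in ws)
paragraph_sizes = sorted(len(h) for h in wp)
```

Parameters such as the degree distribution or centrality measures could then be computed on each structure and compared across texts.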

Conclusions
We have introduced and studied the derivative of a hypergraph and the homogeneity graph of a hypergraph as new and useful structures for studying the degree of independence of the nodes of a hypergraph, and for obtaining a ranking of its most representative nodes, in the sense that the lexical words represented by these nodes link the most significant ideas and concepts of the text without necessarily being the terms usually considered as keywords.
These concepts allow us to associate with the hypergraph not only a numerical index that quantifies the heterogeneity and lack of similarity between its nodes, but also a graph that characterizes the heterogeneity and dissimilarity of its different elements.
Moreover, these concepts also allow us to obtain technical characteristics related to the styles of the different authors and the language competence level of any text written in English, as well as their possible application to text classification, text summarization, automated translation, stylometry and authorship detection.
Undoubtedly, the linguistic analysis made possible by these new tools will provide new models and better instruments to typify and locate the characteristics of the style of different authors, together with the style and intrinsic linguistic characteristics found in specialized texts in terms of collocations, word sense disambiguation and syntagmatic structures.
Finally, it is important to mention that the construction of tools to find lexical patterns of the style of an author or a text belonging to a specialty language, the automatic classification of texts according to their style and the automatic labeling and identification/verification of lexical patterns are some possible additional applications of these new tools.