Higher-order random network models

Jinyu Huang; Youxin Hu; Weifu Li; Maoyan Lin

doi:10.1088/1367-2630/ad106a

1. Introduction

Many real-world complex systems can be modeled as complex networks. In the study of complex networks, most structural properties of networks are derived from connections or edges [1–3]. For example, the degree of a node or vertex is the number of edges incident to the node. Previous research [4] indicates that degree correlations exist in real-world networks. Community structure [5–7], which means that nodes in the same community are densely connected, but nodes between different communities are sparsely connected, is also a typical edge-based feature.

To model real-world complex systems, edge-based random network models have been extensively studied. Erdős and Rényi [8] proposed a simple random network model called the ER (Erdős–Rényi) model, which is a widely used model of real-world networks. However, researchers found that real-world networks have critical properties that do not exist in ER random networks. As a result, various complex network models other than the ER model were proposed. Specifically, Watts and Strogatz [9] introduced the WS (Watts–Strogatz) model, which can generate networks with small characteristic path lengths and large clustering coefficients. Later, Barabási and Albert [10] proposed the BA (Barabási–Albert) model, which generates scale-free networks. Then Newman et al [11] developed random graph models with arbitrary degree distributions. Besides, Papadopoulos et al [12] improved preferential attachment models such that both popularity and similarity are under consideration.

An edge represents a relationship between two nodes. In contrast, higher-order structures can characterize complex relationships between n (n > 2) nodes. Because real-world networks are complicated, it is necessary to explore network properties based on higher-order structures. Network motifs [13], statistically significant connectivity patterns, illustrate the importance of higher-order structures. For example, motif-based communities reveal critical hidden structural properties of complex networks that cannot be described by traditional edge-based community structure. To effectively identify motif-based communities, motif-based community detection was widely investigated [14–16]. The higher-order clustering coefficient [17], which generalizes the traditional edge-based clustering coefficient, is another proposed property based on higher-order structures. Graphlets, which are special subgraphs, were also extensively studied [18].

Besides properties based on higher-order structures, researchers pay attention to higher-order network or graph models [19]. Hypergraphs are classic higher-order graph models, in which hyperedges indicate relationships among an arbitrary number of nodes. Recently, the generative model of clustered hypergraphs was introduced [20]. In addition, the consensus dynamics over temporal hypergraphs were investigated [21]. Moreover, a three-body consensus dynamical model over hypergraphs was developed [22], which features asymmetric roles of interacting agents in a triangle. Simplicial complexes on n vertices [23] can also be used to develop higher-order graph models. For example, exponential random simplicial complexes [24], which generalize exponential random graphs, were proposed. Besides, growing simplicial network models for complete graphs or cliques were introduced [25]. Researchers also introduced and investigated various multilayer network models [26, 27], such as multiplex networks [28], interdependent networks [29], and temporal networks [30]. However, current higher-order network models cannot describe many essential features based on higher-order structures in real-world complex systems. For developing higher-order complex network models, we introduce a framework including higher-order stubs, adjacency tensors, higher-order degrees, and generating functions for higher-order degrees. Then we develop higher-order random networks with arbitrary higher-order degree distributions through the proposed framework. In the supplementary material, we also propose higher-order stochastic blockmodels and higher-order preferential attachment models through the proposed framework.

2. The framework for higher-order complex network models

First, we consider the triangle. The triangle-based higher-order stub is presented in figure 1(b). In a network G with n nodes, we define the triangle-based higher-order degree of node u as the number of triangle-based higher-order stubs that u has in G. The triangle-based higher-order degree of u therefore equals the number of triangle instances in which u participates. Here, a triangle (or motif) instance is an induced subgraph of G that is isomorphic to the triangle (or motif). Next, we define triangle-based adjacency tensors. We use a purely covariant tensor of type $(0,3)$ to represent an adjacency tensor. Let $e^{1},\ldots,e^{n}$, corresponding to the n vertices $1,\ldots,n$, be the standard basis of $\mathbb{R}^n$. Then the adjacency tensor $A^{Tr}$ has the form

$\begin{equation} A^{Tr} = A^{Tr}_{ijk}e^{i}\otimes e^{j}\otimes e^{k} \end{equation} \tag{ 1 }$

where $A^{Tr}_{ijk}$ is a component of the tensor and Einstein summation notation is used. Given the basis, the adjacency tensor is uniquely determined by its n³ components. Three distinct nodes i, j, and k are said to satisfy condition Γ if induced subgraph H with node set $\{i,j,k\}$ is isomorphic to the triangle. The components of triangle-based adjacency tensor A^Tr are defined as

$\begin{equation} A^{Tr}_{ijk} = \begin{cases} 1 & \quad \text{if } i, j \text{ and } k \text{ satisfy condition } \Gamma \\ 0 & \quad \text{otherwise}. \end{cases} \end{equation} \tag{ 2 }$

Since there is only one type of the triangle-based higher-order stub, A^Tr satisfies

$\begin{equation} A^{Tr}_{ijk} = A^{Tr}_{ikj} = A^{Tr}_{jik} = A^{Tr}_{jki} = A^{Tr}_{kij} = A^{Tr}_{kji}. \end{equation} \tag{ 3 }$
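As an illustration of equations (2) and (3), the following Python sketch builds the triangle-based adjacency tensor of a small simple undirected graph and reads the triangle-based higher-order degrees off it. The function names, the 0-based node labels, and the example edge list are assumptions made for this sketch. The halving step relies on the fact that each triangle containing u appears once for each ordering of its other two nodes.

```python
from itertools import combinations, permutations


def triangle_tensor(n, edges):
    """Return the n x n x n triangle-based adjacency tensor as nested lists.

    Assumes a simple undirected graph on nodes 0..n-1 (hypothetical labelling).
    """
    adj = [[False] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = True
    A = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in combinations(range(n), 3):
        # Condition Gamma: the induced subgraph on {i, j, k} is a triangle.
        if adj[i][j] and adj[i][k] and adj[j][k]:
            # Equation (3): all six index orderings carry the same component.
            for a, b, c in permutations((i, j, k)):
                A[a][b][c] = 1
    return A


def triangle_degrees(A):
    """Triangle-based higher-order degree of every node, read off the tensor."""
    n = len(A)
    # Each triangle at u is counted twice in the sum over (j, k).
    return [sum(A[u][j][k] for j in range(n) for k in range(n)) // 2
            for u in range(n)]


# Hypothetical example: 4-cycle 0-1-2-3 with chord 0-2,
# which contains the triangles {0, 1, 2} and {0, 2, 3}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
A = triangle_tensor(4, edges)
degrees = triangle_degrees(A)
```

The nested-list tensor needs n³ entries, so this form is only practical for small examples; it is meant to make the definition concrete rather than to scale.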

Further, we define the triangle-based generating function as

$\begin{equation} G^{\left(2\right)}_0\left(x\right) = \sum_{k = 0}^{\infty}p^{\left(2\right)}_kx^k \end{equation} \tag{ 4 }$

where $p^{(2)}_k$ is the probability that a randomly chosen node has triangle-based higher-order degree k. The properties of $G^{(2)}_0(x)$ are similar to the properties of traditional edge-based generating functions. For example, $G^{(2)}_0(1) = 1$ and

$\begin{equation} z = \sum_{k}kp^{\left(2\right)}_k = \left(G^{\left(2\right)}_0\left(x\right)\right)^{\prime}\mid_{x = 1} \end{equation} \tag{ 5 }$

where z is the average triangle-based higher-order degree.
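The two properties of the generating function can be checked numerically. The sketch below builds an empirical distribution from a list of triangle-based higher-order degrees and evaluates the generating function and its derivative at x = 1, where the derivative recovers the mean. The helper names and the sample degree list are hypothetical.

```python
from collections import Counter


def degree_pmf(degrees):
    """Empirical probabilities p_k from a list of higher-order degrees."""
    n = len(degrees)
    return {k: count / n for k, count in Counter(degrees).items()}


def G0(p, x):
    """Generating function: sum over k of p_k * x**k."""
    return sum(pk * x ** k for k, pk in p.items())


def G0_prime(p, x):
    """First derivative of the generating function with respect to x."""
    return sum(k * pk * x ** (k - 1) for k, pk in p.items() if k > 0)


# Hypothetical triangle-based higher-order degrees of four nodes.
sample_degrees = [2, 1, 2, 1]
p = degree_pmf(sample_degrees)
normalisation = G0(p, 1.0)       # equals 1 for any probability distribution
mean_degree = G0_prime(p, 1.0)   # equals the average higher-order degree
```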

**Figure 1.** Schematic illustration of higher-order stubs. (a) Triangle. (b) Triangle-based higher-order stub. (c) FFL (Feed-Forward Loop). (d) FFL-based higher-order stubs. The α-stub, the β-stub, and the γ-stub are three types of FFL-based higher-order stubs.

The degree distributions of many real-world networks approximately follow power laws, namely, $p_k\sim k^{-\tau}$. We observe that the higher-order degree distributions of real-world networks also exhibit distinctive features, such as approximately following power laws. Here we consider the Internet, the Amazon product co-purchasing network [31], and the arXiv COND-MAT (Condensed Matter Physics) collaboration network [32]. Figure 2 shows that the triangle-based higher-order degree distributions of all three networks approximately follow power laws.

**Figure 2.** Triangle-based higher-order degree distributions of real-world networks. (a) Amazon product co-purchasing network. (b) arXiv COND-MAT collaboration network. (c) Internet.

In the following, we consider higher-order structures with directed edges. Specifically, we consider the FFL (feed-forward loop), whose higher-order stubs are shown in figure 1(d). For the FFL, higher-order stubs are useful gadgets to identify the roles of nodes. For example, in E. coli [13], if node u has an α-stub, the transcription factor u regulates another transcription factor v (node v has a β-stub), and both u and v modulate the transcription rate of the target gene w (node w has a γ-stub). The α-degree of node u is the number of α-stubs that u has. Here the number of α-stubs of u equals the number of FFL instances containing u in which u has out-degree 2. We define the β-degree and the γ-degree of u in the same way as the α-degree. Then we define the adjacency tensor for the FFL, denoted M₅. A triplet $(i,j,k)$ of three distinct nodes is said to satisfy condition Δ if the induced subgraph H with node set $\{i,j,k\}$ is isomorphic to M₅ such that i has an α-stub, j has a β-stub and k has a γ-stub in H. Then the components of the FFL-based adjacency tensor $A^{M_5}$ are defined as

$\begin{equation} A^{M_5}_{ijk} = \begin{cases} 1 & \quad \text{if } \left(i,j,k\right) \text{satisfies condition } \Delta\\ 0 & \quad \text{otherwise}. \end{cases} \end{equation} \tag{ 6 }$

Similar to adjacency matrices, we can obtain higher-order degrees of nodes from adjacency tensors. For example, the α-degree $d^{M_5}_{\alpha}(u)$ of node u is

$\begin{equation} d^{M_5}_{\alpha}\left(u\right) = \sum_{jk}A^{M_5}_{ujk}. \end{equation} \tag{ 7 }$
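To make the FFL-based adjacency tensor concrete, the following Python sketch constructs it for a small directed graph and obtains the α-degree of a node by fixing the first index to that node and summing over the other two. Computing the β-degree and the γ-degree by fixing the second and third index, respectively, is an assumption made by analogy with the α-degree. The function names, 0-based labels, and example edges are hypothetical.

```python
from itertools import permutations


def ffl_tensor(n, edges):
    """Return A with A[i][j][k] = 1 when (i, j, k) satisfies condition Delta.

    Assumes a simple directed graph on nodes 0..n-1 given as (source, target) pairs.
    """
    E = set(edges)
    A = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in permutations(range(n), 3):
        present = {(a, b) for a in (i, j, k) for b in (i, j, k)
                   if a != b and (a, b) in E}
        # Induced FFL: exactly the edges i->j, i->k, j->k and no others,
        # so i holds the alpha-stub, j the beta-stub, and k the gamma-stub.
        if present == {(i, j), (i, k), (j, k)}:
            A[i][j][k] = 1
    return A


def ffl_degrees(A, u):
    """Return (alpha-degree, beta-degree, gamma-degree) of node u."""
    n = len(A)
    alpha = sum(A[u][j][k] for j in range(n) for k in range(n))
    beta = sum(A[i][u][k] for i in range(n) for k in range(n))
    gamma = sum(A[i][j][u] for i in range(n) for j in range(n))
    return alpha, beta, gamma


# Hypothetical example with two FFL instances: (0, 1, 2) and (0, 2, 3).
ffl_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (0, 3)]
A_ffl = ffl_tensor(4, ffl_edges)
all_degrees = [ffl_degrees(A_ffl, u) for u in range(4)]
```

In this example the α-degrees, β-degrees, and γ-degrees each sum to two over all nodes, one per FFL instance.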

Finally, we introduce a generating function for the FFL. The generating function for the joint probability distribution of α-degrees, β-degrees, and γ-degrees is defined by

$\begin{equation} G^{\left(2\right)}_1\left(x,y,z\right) = \sum_{ijk}p^{\left(2\right)}_{ijk}x^iy^jz^k \end{equation} \tag{ 8 }$

where $p^{(2)}_{ijk}$ is the probability that a randomly chosen node has α-degree i, β-degree j, and γ-degree k. Thus we have

$\begin{equation} z_{\alpha} = \sum_{ijk}ip^{\left(2\right)}_{ijk} = z_{\beta} = \sum_{ijk}jp^{\left(2\right)}_{ijk} = z_{\gamma} = \sum_{ijk}kp^{\left(2\right)}_{ijk} \end{equation} \tag{ 9 }$

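where z_α, z_β, and z_γ are the average α-degree, the average β-degree, and the average γ-degree, respectively. The three averages are equal because every FFL instance contributes exactly one α-stub, one β-stub, and one γ-stub. Equation (9) is equivalent to the following condition: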

$\begin{equation} \frac{\partial G^{\left(2\right)}_1}{\partial x}\mid_{x,y,z = 1} = \frac{\partial G^{\left(2\right)}_1}{\partial y}\mid_{x,y,z = 1} = \frac{\partial G^{\left(2\right)}_1}{\partial z}\mid_{x,y,z = 1}. \end{equation} \tag{ 10 }$

Figure 3 presents the FFL-based higher-order degree distributions of the arXiv HEP-PH (High Energy Physics Phenomenology) citation network [33]. All three FFL-based higher-order degree distributions of the arXiv HEP-PH citation network approximately follow power laws.

In comparison to hypergraphs and motif adjacency matrices, higher-order stubs and adjacency tensors provide more details about relationships based on higher-order structures. In particular, hyperedges or motif adjacency matrices cannot provide information about types of higher-order stubs. Similar to the directions of edges, the types of higher-order stubs are useful in the study of complex networks.

3. Higher-order random networks with arbitrary higher-order degree distributions

3.1. Higher-order random networks with undirected edges

We develop higher-order random networks with arbitrary higher-order degree distributions through the triangle. It is straightforward to develop other higher-order random network models based on the presented model. The process of generating a network through triangle-based higher-order stubs is as follows. We generate n random numbers according to the triangle-based higher-order degree distribution $p^{(2)}_k$. The only requirement is that the sum of the numbers be divisible by three. If the sum is not divisible by three, we discard the n numbers and generate a new random set of n numbers until the condition is satisfied. Once the n numbers are available, we repeatedly choose three triangle-based higher-order stubs at random and join them in the network until all higher-order stubs are chosen. Algorithm 1 presents pseudocode to generate a network G whose triangle-based higher-order degree distribution approximately follows F. In the algorithm, a random number X is sampled from distribution F. There are many methods [34] to draw a sample from a given distribution. For example, the rejection sampling algorithm can return a random number X with the given distribution. Further, the condition $1\leqslant\lfloor X\rfloor\leqslant(n-1)(n-2)/2$ requires the triangle-based higher-order degrees to be positive and at most $(n-1)(n-2)/2$, the number of node pairs that can form a triangle with a given node in a simple graph.

We allow loops and parallel edges, and thus the triangle-based higher-order degree distribution of the generated networks may not follow $p^{(2)}_k$. To resolve this problem, the triangle-based higher-order degree of a node incident to loops and (or) parallel edges can be defined as follows. Without loss of generality, the node set is represented as $\{1,\cdots,n\}$. If node i is incident to two parallel edges and a loop, we add two to $d^{(2)}(i)$; if i is incident to two parallel edges and adjacent to a different node j with a loop, we add one to $d^{(2)}(i)$; if i is incident to three loops, we add three to $d^{(2)}(i)$. If endpoints i and j (i < j) of two parallel edges have loops, we add two to $d^{(2)}(i)$ and add one to $d^{(2)}(j)$. When computing $d^{(2)}(i)$, we first consider loops and remove all the relevant loops, and then we consider parallel edges (along with their corresponding loops). The above method uniquely determines the triangle-based higher-order degrees, and the modified triangle-based higher-order degrees of any network drawn from the proposed higher-order random network model are identical to the predetermined n random numbers. Figure 4 presents an example. In some cases, the networks drawn from the proposed higher-order random network model have few loops and parallel edges. For example, through numerical simulation, we find that the triangle-based higher-order degree distribution of the simple graphs drawn from the model with $p^{(2)}_k\sim k^{-3}$ (we remove loops and keep only one copy of parallel edges of the original networks) approximately follows the same distribution.

**Figure 4.** Example of triangle-based higher-order degrees. The triangle-based higher-order degrees are $d^{(2)}(u) = 2$ , $d^{(2)}(v) = 1$ , and $d^{(2)}(w) = 3$ .

Algorithm 1. Sampling a higher-order random network with triangle-based higher-order degrees.
Input: Number of nodes n, distribution F
Output: Network G
$V\leftarrow\{1,\ldots,n\}$
$totalDeg\leftarrow0$
Let S₁ be an empty sequence
do {
    Set S₁ to be an empty sequence
    $totalDeg\leftarrow0$
    while $|S_1| < n$
        Sample a random number X with distribution function F
        if $1\leqslant\lfloor X\rfloor\leqslant\frac{(n-1)(n-2)}{2}$
            Add the integer $\lfloor X \rfloor$ into S₁
            $totalDeg\leftarrow totalDeg+\lfloor X \rfloor$
} while ( $totalDeg\%3\neq0$ )
Let S₂ be an empty sequence
for i in V
    Add $S_1[i]$ triangle-based higher-order stubs of node i into S₂
while ( $|S_2| > 0$ )
    Randomly choose three elements $h_1,h_2,h_3$ from S₂ and remove them from S₂
    Join one stub in h₁ to one stub in h₂, one stub in h₁ to one stub in h₃, and one stub in h₂ to one stub in h₃ $\vartriangleright$ edges of G are created
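A minimal Python sketch of Algorithm 1 follows. It is an illustration under stated assumptions rather than the authors' implementation: the function name, the `draw` callable standing in for distribution F, the 0-based node labels, and the edge-list output are all hypothetical. Shuffling the stub list and reading it off in consecutive groups of three is used here as an equivalent way of repeatedly choosing three random stubs without replacement. The example draws X from a Pareto distribution with shape 2, whose density decays as $x^{-3}$, via the standard-library call `random.paretovariate`.

```python
import math
import random


def sample_triangle_network(n, draw, rng):
    """Sketch of Algorithm 1; returns (degrees, edges) of a multigraph on 0..n-1.

    draw: hypothetical zero-argument callable returning one sample X from F.
    """
    k_max = (n - 1) * (n - 2) // 2
    while True:
        degrees = []
        while len(degrees) < n:
            k = math.floor(draw())
            if 1 <= k <= k_max:
                degrees.append(k)
        # Every triangle uses three higher-order stubs.
        if sum(degrees) % 3 == 0:
            break
    # One list entry per triangle-based higher-order stub.
    stubs = [u for u, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    edges = []
    for t in range(0, len(stubs), 3):
        a, b, c = stubs[t:t + 3]
        # Loops and parallel edges are allowed, as in the model.
        edges += [(a, b), (a, c), (b, c)]
    return degrees, edges


# Hypothetical usage with 50 nodes and a fixed seed.
rng = random.Random(0)
tri_degrees, tri_edges = sample_triangle_network(
    50, lambda: rng.paretovariate(2.0), rng)
```

Because each placed triangle creates three edges from three stubs, the number of edges returned equals the sum of the sampled higher-order degrees.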

Networks drawn from the developed higher-order random network models have crucial structural properties present in real-world networks. We observe that the degree distributions of the real-world networks in figure 2 are right skewed. Similarly, figure 5(a) indicates that the degree distributions of the proposed higher-order random networks with power-law triangle-based higher-order degree distributions are also right skewed. In contrast, the triangle-based higher-order degrees of random graphs with power-law degree distributions are sparsely distributed (see figure 6(b)). This is not difficult to explain: random graphs with power-law degree distributions generally have a tree-like structure and hence contain only a few triangle instances. We also observe that the real-world networks in figure 2 have large average clustering coefficients (see table 1). In addition, table 1 shows that the average clustering coefficients of four real-world networks with power-law degree distributions are between 0.6 and 0.7. Figure 7 indicates that the average clustering coefficients of the networks generated by higher-order random network models with power-law triangle-based higher-order degree distributions are larger than 0.57. However, random graphs with power-law degree distributions have small average clustering coefficients. In particular, figure 7 shows that the average clustering coefficients of random graphs with power-law degree distributions are close to zero when the exponent is greater than 2.4.

**Figure 5.** Distributions of higher-order random networks with power-law triangle-based higher-order degree distributions. Networks with 10⁵ nodes are generated from the higher-order random network model whose triangle-based higher-order degree distribution $p^{(2)}_k$ satisfies $p^{(2)}_k\sim k^{-3}$ , where k represents a triangle-based higher-order degree. The solid line is the function $f(x) = x^{-3}$ . (a) Degree distribution of the network. (b) Triangle-based higher-order degree distribution of the network.

**Figure 6.** Distributions of random networks with power-law degree distributions. Networks with 10⁵ nodes are generated from the random network model whose degree distribution p(k) satisfies $p(k)\sim k^{-3}$, where k represents a degree. The solid line is the function $f(x) = x^{-3}$. (a) Degree distribution. (b) Triangle-based higher-order degree distribution.

**Figure 7.** Average clustering coefficient as a function of the exponent τ for the power-law distribution k^−τ. Each average clustering coefficient is derived from the following approximation algorithm that runs 10³ times on the network with 10⁵ nodes generated from a random network model. The algorithm randomly chooses a node u, randomly chooses two neighbors of u, and finally checks whether the chosen neighbors are connected. Then the approximate average clustering coefficient is the fraction of triangles found over the 10³ runs. The circles correspond to the networks generated by the higher-order random network model with power-law triangle-based higher-order degree distributions. The diamonds correspond to the networks generated by the random network models with power-law degree distributions.
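The sampling estimate described in the caption of figure 7 can be sketched in Python as follows. The function name and the adjacency representation (a dictionary mapping each node to the set of its neighbours in a simple undirected graph) are assumptions of this sketch, as is the choice to count a trial as unsuccessful when the chosen node has fewer than two neighbours.

```python
import random


def approx_avg_clustering(adj, trials, rng):
    """Estimate the average clustering coefficient by random sampling.

    adj: dict mapping each node to the set of its neighbours
         (simple undirected graph; hypothetical representation).
    """
    nodes = sorted(adj)
    hits = 0
    for _ in range(trials):
        u = rng.choice(nodes)
        neighbours = sorted(adj[u])
        if len(neighbours) < 2:
            continue  # assumption: such a trial counts as finding no triangle
        v, w = rng.sample(neighbours, 2)
        if w in adj[v]:
            hits += 1
    return hits / trials


# Hypothetical checks on two tiny graphs: the complete graph K4 and a path.
rng_cc = random.Random(1)
k4 = {u: {v for v in range(4) if v != u} for u in range(4)}
path = {0: {1}, 1: {0, 2}, 2: {1}}
cc_k4 = approx_avg_clustering(k4, 1000, rng_cc)
cc_path = approx_avg_clustering(path, 1000, rng_cc)
```

On the complete graph every sampled pair of neighbours is connected, and on the path no pair is, so the two estimates sit at the extremes of the range.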

Table 1. Average clustering coefficients of real-world networks.

Network name	Average clustering coefficient
Amazon product co-purchasing network	0.40
arXiv COND-MAT collaboration network	0.63
Internet	0.22
Stanford web graph	0.60
arXiv HEP-PH collaboration network	0.61
arXiv ASTRO-PH collaboration network	0.63

At the end of this subsection, we introduce another higher-order random network model with arbitrary higher-order degree distributions of theoretical interest. Given a network, some edges may not participate in any triangle instance. Therefore, we define 'residual network' H of G as the network obtained by removing all triangle instances from G. During the deletion of a triangle instance, only the edges of the triangle instance are deleted. Then the node sets of H and G are the same. We define the residual degree of a node in G as the degree of the node in H. Next, we define generating function $G^{(2)}_2(x,y)$ for the joint probability distribution of higher-order degrees and residual degrees as

$\begin{equation} G^{\left(2\right)}_2\left(x,y\right) = \sum_{ij}q_{ij}x^iy^j \end{equation} \tag{ 11 }$

where q_ij is the probability that a randomly chosen node from G has triangle-based higher-order degree i and residual degree j.
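As a small illustration of the residual network, the Python sketch below computes residual degrees for a simple undirected graph by deleting every edge that lies in at least one triangle instance while keeping all nodes. The function name, 0-based labels, and example graph are hypothetical, and the cubic-time enumeration of triples is chosen for readability, not efficiency.

```python
from itertools import combinations


def residual_degrees(n, edges):
    """Residual degree of each node of a simple undirected graph on 0..n-1."""
    E = {frozenset(e) for e in edges}
    in_triangle = set()
    for i, j, k in combinations(range(n), 3):
        tri = {frozenset((i, j)), frozenset((i, k)), frozenset((j, k))}
        if tri <= E:
            # Only the edges of the triangle instance are deleted.
            in_triangle |= tri
    r = [0] * n
    for e in E - in_triangle:
        for u in e:
            r[u] += 1
    return r


# Hypothetical example: triangle {0, 1, 2} plus the pendant edge 2-3.
res_deg = residual_degrees(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
```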

The procedure to generate a network from this triangle-based higher-order random network model is as follows. Initially, there are n isolated nodes. Then we generate a sequence of n pairs $(d^{(2)}(u), r(u))$ representing the triangle-based higher-order degrees and residual degrees of the nodes according to the distribution q_ij. We require that the sum $\sum_{u}d^{(2)}(u)$ be divisible by three and the sum $\sum_{u}r(u)$ be even. If these conditions are not satisfied, we repeat the following steps until they are: randomly choose a node u; delete the pair $(d^{(2)}(u), r(u))$; generate a new pair from the distribution q_ij. Next, we repeatedly choose three triangle-based higher-order stubs at random and place a triangle instance on the network by joining them until all triangle-based higher-order stubs are chosen. Finally, we repeatedly choose two stubs at random and place an edge on the network by joining them until all stubs are chosen. Notice that the sum of degrees is $\sum_u(2d^{(2)}(u)+r(u))$, which is automatically even.
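The procedure can be sketched in Python as follows. The function name, the `draw_pair` callable standing in for the joint distribution q_ij, and the small uniform ranges in the example are assumptions chosen only to keep the illustration short; shuffling each stub list and pairing consecutive entries stands in for the repeated random choices.

```python
import random


def sample_with_residual(n, draw_pair, rng):
    """Place triangles, then single edges; returns (pairs, edges) on nodes 0..n-1.

    draw_pair: hypothetical callable returning (triangle degree, residual degree).
    """
    pairs = [draw_pair() for _ in range(n)]
    # Resample one random node at a time until both sum conditions hold.
    while sum(d for d, r in pairs) % 3 or sum(r for d, r in pairs) % 2:
        pairs[rng.randrange(n)] = draw_pair()
    tri_stubs = [u for u, (d, r) in enumerate(pairs) for _ in range(d)]
    edge_stubs = [u for u, (d, r) in enumerate(pairs) for _ in range(r)]
    rng.shuffle(tri_stubs)
    rng.shuffle(edge_stubs)
    edges = []
    for t in range(0, len(tri_stubs), 3):
        a, b, c = tri_stubs[t:t + 3]
        edges += [(a, b), (a, c), (b, c)]      # one triangle instance
    for t in range(0, len(edge_stubs), 2):
        edges.append((edge_stubs[t], edge_stubs[t + 1]))  # one residual edge
    return pairs, edges


# Hypothetical usage: uniform triangle degrees in 0..3, residual degrees in 0..4.
rng_res = random.Random(2)
res_pairs, res_edges = sample_with_residual(
    30, lambda: (rng_res.randint(0, 3), rng_res.randint(0, 4)), rng_res)
```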

3.2. Higher-order random networks with directed edges

In this subsection, we develop higher-order random network models through the FFL. Specifically, we develop higher-order random network models with arbitrary FFL-based higher-order degree distributions. Initially, we randomly generate a set of n triplets $(i_u,j_u,k_u)$ according to the joint probability distribution $p^{(2)}_{ijk}$. Here the triplet $(i_u,j_u,k_u)$ indicates that the numbers of α-stubs, β-stubs and γ-stubs of node u are i_u, j_u and k_u, respectively. Then we compute the sums $\sum_ui_u$, $\sum_uj_u$ and $\sum_uk_u$. These three sums should be equal. Otherwise, we repeat the following steps until the three sums are equal: randomly choose a node u and discard the triplet $(i_u,j_u,k_u)$ for u; then generate a new triplet according to the distribution $p^{(2)}_{ijk}$. Once the n triplets are available, we repeatedly choose an α-stub, a β-stub, and a γ-stub at random and place an FFL instance by connecting the three higher-order stubs until all higher-order stubs are chosen. Similar to the proposed model for undirected networks, the FFL-based higher-order degree distribution of the directed networks drawn from the model may not follow $p^{(2)}_{ijk}$. To resolve the problem, we can revise the definition of the FFL-based higher-order degree. Figure 8 presents an example. In some cases, the networks drawn from the proposed higher-order random network model have few loops. Then the FFL-based higher-order degree distributions of the simple graph corresponding to the original sampled network approximately follow $p^{(2)}_{ijk}$. Algorithm 2 presents pseudocode to generate a directed network whose FFL-based higher-order degree distribution approximately follows F.

**Figure 8.** Example of FFL-based higher-order degrees. The FFL-based higher-order degrees are as follows: $d^{\alpha}(u) = 1$ , $d^{\beta}(u) = 1$ , and $d^{\gamma}(u) = 0$ ; $d^{\alpha}(v) = 0$ , $d^{\beta}(v) = 0$ , and $d^{\gamma}(v) = 1$ ; $d^{\alpha}(w) = 1$ , $d^{\beta}(w) = 0$ , and $d^{\gamma}(w) = 1$ ; $d^{\alpha}(x) = 0$ , $d^{\beta}(x) = 1$ , and $d^{\gamma}(x) = 0$ ; $d^{\alpha}(y) = 1$ , $d^{\beta}(y) = 1$ , and $d^{\gamma}(y) = 1$ .

Algorithm 2. Sampling a higher-order random network with FFL-based higher-order degrees.
Input: Number of nodes n, distribution F
Output: Directed network G
$V\leftarrow\{1,\ldots,n\}$
$t_1\leftarrow t_2\leftarrow t_3\leftarrow0$
Let S₁, S₂ and S₃ be empty sequences
while $|S_1| < n$
    Sample a triplet $(X,Y,Z)$ containing three random numbers with distribution function F
    if $\lfloor X\rfloor > 0$, $\lfloor Y\rfloor > 0$ and $\lfloor Z\rfloor > 0$
        Add $\lfloor X\rfloor$, $\lfloor Y\rfloor$ and $\lfloor Z\rfloor$ into S₁, S₂ and S₃ respectively
        $t_1\leftarrow t_1+\lfloor X \rfloor$, $t_2\leftarrow t_2+\lfloor Y \rfloor$, $t_3\leftarrow t_3+\lfloor Z \rfloor$
while $t_1\neq t_2$ or $t_2\neq t_3$
    Randomly and uniformly choose i from V
    $t_1\leftarrow t_1-S_1[i]$, $t_2\leftarrow t_2-S_2[i]$, $t_3\leftarrow t_3-S_3[i]$
    Discard $S_1[i]$, $S_2[i]$ and $S_3[i]$ respectively
    $X\leftarrow Y\leftarrow Z\leftarrow0$
    while $\lfloor X\rfloor\leqslant0$ or $\lfloor Y\rfloor\leqslant0$ or $\lfloor Z\rfloor\leqslant0$
        Sample a triplet $(X,Y,Z)$ with distribution function F
    $S_1[i]\leftarrow\lfloor X \rfloor$, $S_2[i]\leftarrow\lfloor Y \rfloor$, $S_3[i]\leftarrow\lfloor Z \rfloor$
    $t_1\leftarrow t_1+\lfloor X \rfloor$, $t_2\leftarrow t_2+\lfloor Y \rfloor$, $t_3\leftarrow t_3+\lfloor Z \rfloor$
Let S₄, S₅ and S₆ be empty sequences
for i in V
    Add $S_1[i]$ α-stubs, $S_2[i]$ β-stubs and $S_3[i]$ γ-stubs of node i into S₄, S₅ and S₆ respectively
while ( $|S_4| > 0$ )
    Randomly choose an α-stub s₁, a β-stub s₂, and a γ-stub s₃ from S₄, S₅, and S₆ respectively, then remove them from S₄, S₅, and S₆
    Join the α-stub s₁, the β-stub s₂ and the γ-stub s₃ $\vartriangleright$ edges of G are created
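A minimal Python sketch of Algorithm 2 follows, again as an illustration under stated assumptions rather than the authors' implementation. The function name, the `draw_triplet` callable standing in for distribution F, and the 0-based labels are hypothetical. Each placed FFL instance adds the directed edges α→β, α→γ and β→γ, matching the roles described for the α-stub, the β-stub and the γ-stub. The example uses small uniform integer draws purely so that the resampling loop finishes quickly; with heavy-tailed draws the loop may take much longer to equalise the three sums.

```python
import math
import random


def sample_ffl_network(n, draw_triplet, rng):
    """Sketch of Algorithm 2; returns (triplets, edges) on nodes 0..n-1.

    draw_triplet: hypothetical callable returning one sample (X, Y, Z) from F.
    """
    def valid_triplet():
        # Keep sampling until all three floored values are positive.
        while True:
            t = tuple(math.floor(x) for x in draw_triplet())
            if min(t) > 0:
                return t

    triplets = [valid_triplet() for _ in range(n)]
    totals = [sum(t[c] for t in triplets) for c in range(3)]
    while not (totals[0] == totals[1] == totals[2]):
        i = rng.randrange(n)
        new = valid_triplet()
        for c in range(3):
            totals[c] += new[c] - triplets[i][c]
        triplets[i] = new
    # Separate stub lists for alpha-, beta- and gamma-stubs.
    stubs = [[u for u, t in enumerate(triplets) for _ in range(t[c])]
             for c in range(3)]
    for s in stubs:
        rng.shuffle(s)
    edges = []
    for a, b, g in zip(*stubs):
        edges += [(a, b), (a, g), (b, g)]  # directed edges of one FFL instance
    return triplets, edges


# Hypothetical usage: 20 nodes, each stub count uniform in 1..3.
rng_ffl = random.Random(3)
ffl_triplets, ffl_net_edges = sample_ffl_network(
    20,
    lambda: (rng_ffl.randint(1, 3), rng_ffl.randint(1, 3), rng_ffl.randint(1, 3)),
    rng_ffl)
```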

Similar to the arXiv HEP-PH citation network, the in-degree and out-degree distributions of the networks generated by the proposed higher-order random network model with power-law FFL-based higher-order degree distributions also approximately follow power laws (see figures 9(d) and (e)). In contrast, for random networks with power-law in-degree and out-degree distributions, the FFL-based higher-order degrees are sparsely distributed (see figures 10(a)–(c)).

**Figure 9.** Distributions of higher-order random networks with power-law FFL-based higher-order degree distributions. Directed networks with 10⁵ nodes are generated from the proposed higher-order random network model whose joint distribution $p^{(2)}_{ijk}$ is a product of three independent power-law distributions such that each distribution is k⁻³ where k represents an α-degree, a β-degree or a γ-degree. The solid lines correspond to function $f(x) = x^{-3}$ . (a) α-degree distribution. (b) β-degree distribution. (c) γ-degree distribution. (d) In-degree distribution. (e) Out-degree distribution.

**Figure 10.** Distributions of directed random networks with power-law degree distributions. Directed networks with 10⁵ nodes are generated from the random network model whose joint distribution of in-degrees and out-degrees is a product of two independent power-law distributions such that each distribution is k⁻³ where k represents either an in-degree or an out-degree. (a) α-degree distribution. (b) β-degree distribution. (c) γ-degree distribution. (d) In-degree distribution. (e) Out-degree distribution.

In real-world complex networks, there are edges that are not contained in any FFL instance. For example, many regulatory interactions in E. coli are not contained in any FFL instance. We therefore define the 'residual network' H of a directed network D as the network obtained by removing all FFL instances from D. During the deletion of an FFL instance, only edges are deleted. Thus the node sets of H and D are the same. We define the residual in-degree or out-degree of a node in D as the in-degree or out-degree of the node in H. Then we define the generating function $G^{(2)}_3(v,w,x,y,z)$ for the joint probability distribution of FFL-based higher-order degrees and residual degrees as

$\begin{equation} G^{\left(2\right)}_3\left(v,w,x,y,z\right) = \sum_{ijklm}q_{ijklm}v^iw^jx^ky^lz^m \end{equation} \tag{ 12 }$

where q_ijklm is the probability that a randomly chosen node from D has α-degree i, β-degree j, γ-degree k, residual in-degree l and residual out-degree m. In fact, the residual network of a directed network can be regarded as the result of a targeted attack in which only the edges of the FFL instances are deleted. For instance, if the regulatory interactions of all the FFL instances in E. coli malfunction, then the resulting network is the residual network of E. coli.

Now we propose another FFL-based higher-order random network model of theoretical interest. Initially, there are n isolated nodes. Next, we generate n tuples $(d^{\alpha}(u),d^{\beta}(u),d^{\gamma}(u),r^{in}(u),r^{out}(u))$, one for each node u, according to the joint distribution q_ijklm. Then we compute the sums $\sum_{u}d^{\alpha}(u)$, $\sum_{u}d^{\beta}(u)$, $\sum_{u}d^{\gamma}(u)$, $\sum_{u}r^{in}(u)$ and $\sum_{u}r^{out}(u)$. If the three sums $\sum_{u}d^{\alpha}(u)$, $\sum_{u}d^{\beta}(u)$ and $\sum_{u}d^{\gamma}(u)$ are not equal or $\sum_{u}r^{in}(u)\neq\sum_{u}r^{out}(u)$, we repeat the following steps until the conditions are satisfied: randomly choose a node u; discard the tuple $(d^{\alpha}(u),d^{\beta}(u),d^{\gamma}(u),r^{in}(u),r^{out}(u))$; generate a new tuple for u from the joint distribution. We then place FFL instances by repeatedly choosing an α-stub, a β-stub, and a γ-stub at random until all higher-order stubs are chosen. Finally, we place directed edges by repeatedly choosing an in-stub and an out-stub at random until all stubs are chosen.

4. Conclusions

We have introduced the framework for developing higher-order complex network models that exhibit crucial structural properties of real-world complex systems. Then we have proposed higher-order random network models with arbitrary higher-order degree distributions. In the supplementary material, we have introduced additional higher-order network models. In conclusion, we believe that other edge-based notions can also be generalized to the corresponding notions based on higher-order structures by our framework. For example, k-cores [35] can be generalized to higher-order k-cores using the proposed higher-order stubs. It is also imperative to explore dynamical processes by considering higher-order interactions. A notable study in this area was conducted by Shang [36], who studied consensus formation over directed hypergraphs. Specifically, Shang innovatively introduced a consensus formation framework using Petri nets. Inspired by the work of Shang, we will investigate social dynamics over the proposed higher-order complex network models in the future.

Acknowledgments

We thank C V Cannistraci and A Muscoloni for their suggestions. This research was supported by the Scientific Research Foundation of Sichuan University of Science and Engineering with Grant No. 2021RC13.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.
