Reconstructing mesoscale network structures

When facing complex mesoscale network structures, it is generally believed that (null) models encoding the modular organization of nodes must be employed. The present paper focuses on two block structures that characterize the mesoscale organization of many real-world networks, i.e. the bow-tie and the core-periphery ones. Our analysis shows that constraining the network degree sequence is often enough to reproduce such structures, as confirmed by model selection criteria such as AIC and BIC. As a byproduct, our paper enriches the toolbox for the analysis of bipartite networks, which is still far from complete. The aforementioned structures, in fact, partition the networks into asymmetric blocks characterized by binary, directed connections, thus calling for the extension to the directed case of a recently-proposed method to randomize undirected, bipartite networks.


I. INTRODUCTION
The analysis of mesoscale network structures is a topic of great interest within the community of network scientists: most of the attention, however, has been devoted to community detection [1][2][3], while the analysis of other meso-structures has remained far less explored.
The present work aims at contributing to this stream of research by exploring the effectiveness of null models that constrain only local information in explaining complex meso-structures such as the bow-tie and the core-periphery ones. Both are characterized by a central, cohesive subgraph surrounded by a loosely-connected set of nodes [4]. In the first case, however, the central part of the network has a fan-in and a fan-out component, respectively entering into and exiting from it.
To this aim, competing null models have been compared and model selection criteria have been applied in order to unambiguously determine the winner. Remarkably, all null models considered in the present paper can be recovered within the same framework, i.e. the entropy-maximization one, which has been proven to be rather effective for both pattern detection and the reconstruction of real-world networks [5,6].
As a byproduct, our paper enriches the toolbox for the analysis of bipartite networks. Among the many available network representations, the bipartite one has recently received much attention [7,8]. This, in turn, has led to the definition of algorithms for randomizing [9][10][11][12], reconstructing [13] or projecting [14,15] undirected, bipartite networks. The directed case, however, has not yet been explored, thus calling for the definition of techniques to approach the study of this kind of networks as well. This is especially true when considering that bipartite networks emerge quite naturally when studying the aforementioned mesoscale structures. It is, in fact, evident that analysing the way nodes cluster together unavoidably leads to the analysis of the way such modules interact. From an algebraic point of view, this boils down to considering matrices characterized by diagonal square blocks (i.e. the adjacency matrices of the modules themselves) and off-diagonal rectangular blocks (i.e. the adjacency matrices of the bipartite networks encoding their interactions).
Our method will be employed to analyse economic and financial networks: more specifically, we will focus on two systems, the World Trade Web and the Dutch Interbank Network. As we will show, while the former can be described by a partial bow-tie structure, the latter is characterized by the coexistence of a core-periphery-like structure and a proper bow-tie one, the second one carrying a larger amount of information about the system evolution than the first one.

II. DATA
Let us now describe the two systems we have considered for the present analysis.
The World Trade Web. We consider yearly bilateral data on exports and imports from the UN COMTRADE database [16], from 1992 to 2002. We limit ourselves to considering the World Trade Web (WTW hereafter) in its binary, directed representation at the aggregate level. In order to perform a temporal analysis and compare different years, we restrict ourselves to a balanced panel of $N = 162$ countries (present in the data throughout the considered interval). Accordingly, for a given year $t$, $a^t_{ij} = 1$ ($a^t_{ij} = 0$) means that country $i$ has registered a non-null (null) export towards country $j$.
The Dutch Interbank Network. We consider a dataset where nodes are Dutch banks and a link from node i to node j indicates that bank i has an exposure larger than 1.5 million euros and with maturity shorter than one year, towards a creditor bank j [17]. We consider 44 quarterly snapshots of the Dutch Interbank Network (DIN hereafter), from 1998Q1 to 2008Q4. The last year in the sample represents the year during which the recent financial crisis became manifest.

A. The general framework
Let us, first, provide an algebraic representation of the mesoscale structures considered in the present paper, i.e. the bow-tie and the core-periphery ones.
Networks characterized by a core-periphery structure can be represented as follows: the adjacency matrix $A$ is composed of four distinct blocks,

$$A = \begin{pmatrix} A^{\bullet} & A^{\top} \\ A^{\perp} & A^{\circ} \end{pmatrix}.$$

While the square adjacency matrices $A^{\bullet}$ and $A^{\circ}$ lying along the diagonal represent the core and the periphery modules, the two rectangular (in the most general case), off-diagonal matrices $A^{\top}$ and $A^{\perp}$ represent the (bipartite) networks through which they interact. Usually, the link densities of the matrices above are ordered in such a way that $c(A^{\bullet}) > c(A^{\circ})$, i.e. the core module is (much) denser than the periphery module.
Notice that the two matrices $A^{\top}$ and $A^{\perp}$ carry genuinely different information: while the generic entry $a^{\top}_{cp} = 1$ ($a^{\top}_{cp} = 0$) indicates that a directed link from the node $c$ in the core to the node $p$ in the periphery is present (absent), the generic entry $a^{\perp}_{pc} = 1$ ($a^{\perp}_{pc} = 0$) indicates that a directed link from the periphery node $p$ to the core node $c$ is present (absent). In other words, in order to fully describe the topological structure of one directed bipartite network, two matrices are, in fact, needed. Naturally, in case the network $A$ is undirected, the two matrices are the transpose of each other, i.e. $A^{\perp} = (A^{\top})^{T}$.
While the definition of core-periphery structure is quite intuitive, the definition of bow-tie structure, on the other hand, is based on the concept of node reachability: node $j$ is reachable from node $i$ if a path exists from node $i$ to node $j$ (a path being defined as a sequence of adjacent links connecting $i$ with $j$). According to this definition, each node is assigned to one of the sets described in [18]. The definition of the three most relevant ones follows:
• SCC: each node in the Strongly Connected Component (SCC) is reachable from any other node belonging to the SCC;
• IN: each node in the SCC is reachable from any node belonging to the IN-component;
• OUT: each node in the OUT-component is reachable from any node belonging to the SCC.
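As an illustration of the reachability-based definitions above, the following minimal Python sketch assigns the nodes of a small directed network to the SCC, the IN- and the OUT-component. The function names (`reachable`, `bow_tie`) and the edge-list input are illustrative assumptions, not part of the original analysis; the largest SCC is found by brute force, which is adequate only for small networks with at least one node.

```python
from collections import deque

def reachable(adj, sources):
    """Return the set of nodes reachable from `sources` along the lists in `adj`."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        i = queue.popleft()
        for j in adj[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return seen

def bow_tie(n, edges):
    """Split nodes 0..n-1 (n >= 1) into (SCC, IN, OUT, others).

    `edges` is a list of directed pairs (i, j), meaning a link from i to j.
    """
    fwd = [[] for _ in range(n)]  # out-neighbours
    bwd = [[] for _ in range(n)]  # in-neighbours
    for i, j in edges:
        fwd[i].append(j)
        bwd[j].append(i)
    # The strongly connected component of i is the intersection of
    # the nodes reachable from i and the nodes from which i is reachable.
    scc, assigned = set(), set()
    for i in range(n):
        if i in assigned:
            continue
        comp = reachable(fwd, [i]) & reachable(bwd, [i])
        assigned |= comp
        if len(comp) > len(scc):
            scc = comp
    seed = next(iter(scc))
    out_comp = reachable(fwd, [seed]) - scc  # reachable from the SCC
    in_comp = reachable(bwd, [seed]) - scc   # the SCC is reachable from them
    others = set(range(n)) - scc - in_comp - out_comp
    return scc, in_comp, out_comp, others
```

For instance, `bow_tie(5, [(0, 1), (1, 2), (2, 1), (2, 3)])` places nodes 1 and 2 in the SCC, node 0 in the IN-component, node 3 in the OUT-component and leaves the isolated node 4 unassigned.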
According to the definitions above, networks characterized by a bow-tie structure can be represented by a block adjacency matrix whose three diagonal blocks $A_s$, $A_i$ and $A_o$ represent the SCC, the IN- and the OUT-component respectively. The off-diagonal blocks, instead, represent the (bipartite) networks through which they interact.
In order to analyse the two kinds of mesoscale structures described above, we will implement several null models, a brief description of which is provided below (for a detailed description see Appendix A).

B. Null models
The first class of null models we consider for the present analysis is the one including the so-called degree-informed null models. All null models in this class are defined by constraints encoding node-specific local information (i.e. the directed degree sequences), besides the membership of nodes to specified groups (labeled by the symbols $\{g_i\}$). By combining these two kinds of information, one obtains, in the most general case, block-specific directed degree sequences, i.e. the number of links that each node sends to, and receives from, the nodes of each block. Remarkably, all null models in this class induce a probability for the generic network configuration $A$ reading

$$P(A) = \prod_{i \neq j} p_{ij}^{a_{ij}} (1 - p_{ij})^{1 - a_{ij}},$$

with different null models inducing different functional forms for the probability coefficients $\{p_{ij}\}$ (see Appendix A).
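To make the role of the coefficients $\{p_{ij}\}$ concrete, the sketch below evaluates the log-likelihood of an observed binary, directed network under any model that treats node pairs as independent. It assumes the factorized, Bernoulli form of $P(A)$ that is standard for degree-informed models; the function name `log_likelihood` and the list-of-lists representation are illustrative choices.

```python
import math

def log_likelihood(a, p):
    """ln P(A) = sum over i != j of a_ij ln p_ij + (1 - a_ij) ln(1 - p_ij).

    a: binary adjacency matrix (list of lists), p: matrix of link probabilities.
    Returns -inf when the model assigns zero probability to the observation.
    """
    n = len(a)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-loops are not considered
            q = p[i][j] if a[i][j] else 1.0 - p[i][j]
            if q <= 0.0:
                return float("-inf")
            total += math.log(q)
    return total
```

Passing a constant matrix `p` corresponds to a homogeneous benchmark, whereas node-dependent entries correspond to degree-informed benchmarks.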
When analysing directed networks, however, a non-trivial piece of information to be taken into account is represented by reciprocity [19]. For this reason, a second class of null models, the one including the so-called reciprocity-informed null models, is considered as well. Null models in this class are defined by constraints encoding the (non-)reciprocal degree sequences, besides the usual node membership. In the most general case, the constraints defining such models are the block-specific non-reciprocated out-degrees $k^{rs\rightarrow}_i$, non-reciprocated in-degrees $k^{rs\leftarrow}_i$ and reciprocated degrees $k^{rs\leftrightarrow}_i$ [19], with $k^{rs\leftrightarrow}_i$ indicating the contribution to the reciprocal degree of node $i$ (belonging to block $r$) coming from block $s$. All models in this second class induce a probability for the network $A$ that factorizes over the node pairs, each pair being either connected by a non-reciprocated link (in one of the two directions), connected by a reciprocated link, or disconnected; as before, different null models induce different functional forms for the probability coefficients.
For both classes of models, the likelihood function reads $\mathcal{L} = \ln P(A) = \sum_b \ln P(A_b)$, the sum running over the blocks $A_b$ of the matrix $A$. The second passage follows from the observation that each null model we consider in this paper treats different node pairs as independent, thus inducing a factorized form of the probability coefficient $P(A)$ over the aforementioned blocks.
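The block-specific (non-)reciprocal degrees entering the reciprocity-informed models can be measured directly on the adjacency matrix. The following sketch is only illustrative: the function name `block_reciprocal_degrees`, the list-of-lists adjacency matrix and the membership list `g` are assumptions introduced here.

```python
def block_reciprocal_degrees(a, g):
    """Return (k_out, k_in, k_rec), each a list of dicts indexed as [node][block].

    k_out[i][s]: non-reciprocated links from node i to nodes of block s
    k_in[i][s]:  non-reciprocated links from nodes of block s to node i
    k_rec[i][s]: reciprocated links between node i and nodes of block s
    """
    n = len(a)
    blocks = sorted(set(g))
    k_out = [{s: 0 for s in blocks} for _ in range(n)]
    k_in = [{s: 0 for s in blocks} for _ in range(n)]
    k_rec = [{s: 0 for s in blocks} for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            s = g[j]  # block of the partner node
            k_out[i][s] += a[i][j] * (1 - a[j][i])
            k_in[i][s] += a[j][i] * (1 - a[i][j])
            k_rec[i][s] += a[i][j] * a[j][i]
    return k_out, k_in, k_rec
```

Summing each dictionary over the blocks returns the usual node-level non-reciprocated and reciprocated degrees.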

C. Model selection criteria
Although raising the number of parameters in order to better reproduce the observations is tempting, the risk of overfitting should, nevertheless, be avoided. A criterion to identify the best model out of a basket of possible ones is, thus, needed. In what follows, we will adopt the Akaike Information Criterion (AIC hereafter) and the Bayesian Information Criterion (BIC hereafter)

$$\mathrm{AIC} = -2\mathcal{L} + 2K, \qquad \text{(III.10)}$$

$$\mathrm{BIC} = -2\mathcal{L} + K \ln n, \qquad \text{(III.11)}$$

whose first addendum, in both cases, is proportional to the log-likelihood $\mathcal{L}$ of the null model under analysis, $K$ is the number of parameters defining the model and $n$ is the sample size (set, as usual, at $N(N-1)$). Both AIC and BIC are minimum for the best explanatory model in the basket [20]. In order to make eqs. III.10 and III.11 more explicit, let us call $B$ the number of blocks our network has been divided into (i.e. the diagonal blocks of the matrix $A$). While the Directed Random Graph (DRG) is defined by just one parameter, $K_{DRG} = 1$, the Stochastic Block Model (SBM) is defined by $K_{SBM} = B^2$ parameters (as can be verified upon inspecting definitions III.1 and III.2). Specifying the degree sequence leads to a further rise of the number of parameters: the Directed Configuration Model (DCM) is, in fact, defined by $K_{DCM} = 2N$, the directed degree-corrected Stochastic Block Model (ddc-SBM) is defined by $K_{ddc-SBM} = 2N + B^2$ and the Block Configuration Model (BCM) is defined by $K_{BCM} = 2NB$ (each node, in fact, "needs" two parameters per block). Accounting also for the information provided by the reciprocity requires a number of parameters to be specified that is $K_{RCM} = 3N$ for the Reciprocal Configuration Model (RCM) and $K_{BRCM} = 3NB$ for the Block Reciprocal Configuration Model (BRCM; each node, in fact, "needs" three parameters per block).
FIG. 2. Dynamics of the in-degree (defined as $h_i = \sum_{j(\neq i)} a_{ji}$) and of the reciprocated degree (defined as $k^{\leftrightarrow}_i = \sum_{j(\neq i)} a_{ij} a_{ji}$) of a sample of countries (Italy, in green; Japan, in black; China, in red; Russia, in blue; India, in brown; USA, in purple; Australia, in orange): while the in-degree remains rather stable across time, the value of the reciprocated degree keeps rising once the country has joined the SCC. Such a dynamics can be interpreted as a signal of ongoing integration [25].
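The parameter counts listed above, together with the two criteria, can be collected in a few lines of Python. The sketch assumes the standard definitions AIC $= -2\mathcal{L} + 2K$ and BIC $= -2\mathcal{L} + K \ln n$; the function names are illustrative.

```python
import math

def n_parameters(model, N, B):
    """Number of parameters K of each null model, for N nodes and B blocks."""
    counts = {
        "DRG": 1,
        "SBM": B * B,
        "DCM": 2 * N,
        "ddc-SBM": 2 * N + B * B,
        "BCM": 2 * N * B,
        "RCM": 3 * N,
        "BRCM": 3 * N * B,
    }
    return counts[model]

def aic(loglik, K):
    """Akaike Information Criterion (standard definition)."""
    return -2.0 * loglik + 2.0 * K

def bic(loglik, K, n):
    """Bayesian Information Criterion (standard definition), n = N(N-1) here."""
    return -2.0 * loglik + K * math.log(n)
```

With $N = 162$ and $B = 2$, for example, the BCM requires 648 parameters against the 324 of the DCM: the block-wise model is preferred only if its gain in log-likelihood compensates for this difference.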

IV. RESULTS
The World Trade Web. Although the WTW has been deeply studied throughout the years [21][22][23][24], the analysis of its mesoscale organization has, so far, received far less attention [25,26]. Interestingly, checking for the applicability of the bow-tie definition provided above, the WTW appears as being partitioned into an SCC and an IN-component only, the OUT-component being completely missing (see fig. 1). According to the algebraic representation introduced at the beginning of the paper, the WTW mesoscale structure is thus represented by an adjacency matrix lacking the blocks associated with the OUT-component (cf. [25], where only the largest connected component was considered).
From a macroeconomic point of view, the increasing number of nodes within the SCC may be evidence of an ongoing globalization process [25]. It is interesting to notice that the inclusion of (whole subsets of) countries within the SCC seems to be related to the existence of trade agreements. Examples are provided by Commonwealth nations (all of which have been part of the SCC since 1993), European nations (the EU as a whole joined the SCC in 1994, the same year as the EEA agreement) and the USA (NAFTA entered into force in 1994 as well). From a purely topological perspective, an interesting dynamics takes place: as shown in fig. 2, the reciprocal degree of nodes belonging to the SCC keeps rising. Since all nodes are characterized by a rather stable in-degree value, this finding points out the tendency of such countries to reciprocate previously-established connections by creating new out-going links (i.e. to consolidate existing trade relationships).
Besides revealing that the high value of reciprocity within the SCC is one of the causes behind the existence of a large number of paths within it, the overall effect of this dynamics seems, thus, to be that of fostering trade exchanges between the members of the SCC.
Let us now analyse what kind of topological information is actually needed in order to explain the mesoscale WTW structure. To this aim, let us sum up the observations about the actual structure of the WTW by imagining a densely-connected, highly-reciprocated SCC ($c(A_s) \simeq r(A_s) \simeq 0.8$ throughout our temporal interval).
The need for a block model becomes evident when comparing the homogeneous benchmark provided by the DRG with its block-wise counterpart, i.e. the SBM (see fig. 3): the SBM outperforms the DRG since the network is "composed" of parts characterized by very different link densities ($c(A_s) \in [0.75, 0.9]$ and $c(A_i) = 0$) that cannot be reproduced with just one, global parameter.
Generally speaking, however, benchmarks encoding the degree heterogeneity are to be preferred. Interestingly, (both) non-block models outperform block models, indicating that specifying information beyond the one encoded into local properties is indeed unnecessary. This is not surprising, however, when considering that the nodes belonging to the IN-component have zero in-degree. The latter, in fact, is exactly reproduced by both the DCM and the RCM: the "peripheral" part of the network under analysis is, thus, automatically explained by a simpler kind of statistics with no need to invoke any a priori partition.
Let us compare our degree-informed models over the $A$ and $A_s$ subgraphs. In the first case, the information carried by reciprocity is encoded into the degree sequence. Upon computing the AIC and BIC weights $w_m = e^{-\Delta_m/2} / \sum_r e^{-\Delta_r/2}$ (with $\Delta_m = \mathrm{AIC}_m - \min\{\mathrm{AIC}_m\}_m$ and $\Delta_m = \mathrm{BIC}_m - \min\{\mathrm{BIC}_m\}_m$, respectively) one finds that the DCM always wins. The explanation of this result lies in the fact that the WTW reciprocity is compatible with the DCM prediction, as the computation of the index $\rho = \frac{r - \langle r \rangle}{1 - \langle r \rangle}$ reveals (it amounts to 0.05 throughout our time interval) [27]. In other words, the seemingly peculiar mesoscale structure of the WTW is, to a good extent, reproduced by just specifying local constraints (in this case, the degree sequence).
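The comparison based on the differences $\Delta_m$ can be sketched as follows. The code assumes the standard definition of Akaike-type weights, $w_m = e^{-\Delta_m/2} / \sum_r e^{-\Delta_r/2}$, which applies to AIC and BIC scores alike; the function name `information_weights` and the numerical scores in the usage note are purely illustrative and do not come from the data.

```python
import math

def information_weights(scores):
    """Map {model: AIC or BIC value} to {model: weight}, the weights summing to 1."""
    best = min(scores.values())
    # Delta_m = score_m - min score; the best model has Delta = 0
    raw = {m: math.exp(-(s - best) / 2.0) for m, s in scores.items()}
    norm = sum(raw.values())
    return {m: r / norm for m, r in raw.items()}
```

For two hypothetical models whose scores differ by 2, e.g. `information_weights({"DCM": 100.0, "RCM": 102.0})`, the model with the lower score receives a weight of about 0.73.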
The Dutch Interbank Network. According to the axiomatic model in [28] (defined by the properties A1) core banks are all bilaterally linked with each other, A2) periphery banks do not lend to each other, A3) core banks both lend to and borrow from at least one periphery bank), the DIN has been described as characterized by a well-defined core-periphery structure [17]. However, as has been pointed out elsewhere [29], such a mesoscale organization is compatible with the prediction coming from either the DCM or the RCM (depending on the topological quantity used as an indicator).
The DIN, however, is also characterized by a certain degree of bow-tieness, given the presence of an SCC, an IN-component and, differently from the WTW, also a non-empty OUT-component. Both the $A_i$ and the $A_o$ blocks, however, are empty and nodes belonging to the IN- and OUT-components are not directly linked with each other (in other words, the so-called "tubes" are absent [18]).
From a purely empirical point of view, the evolution of the DIN bow-tie structure is much more informative than the evolution of its core-periphery structure. As fig. 4 shows, while the number of nodes belonging to the core shows no significant variations, the size of the SCC reduces to more than half its pre-crisis value, thus providing an additional, structural indicator of the crisis. Very interestingly, however, the SCC starts shrinking well before 2008, seemingly constituting an early-warning signal of the upcoming, topological change affecting the DIN. The IN-component, in turn, shrinks as well, while the OUT-component enlarges.
In order to identify the best model to explain the DIN bow-tie structure, let us notice that its SCC can be imagined as a weakly-connected, weakly-reciprocated subgraph ($c(A_s) \simeq 0.1$ and $r(A_s) \simeq 0.3$, except in 2008 where the SCC reciprocity drops to 0.15). More precisely, $c(A_s) \simeq c(A) \ll c(A^{\bullet})$, i.e. while the SCC connectance basically coincides with the one of the whole network, the core is much denser, an empirical observation that explains why the SBM provides a better explanation of the core-periphery structure (in fact, the AIC and BIC values for the SBM and the DRG are closer when considering the bow-tie structure, see fig. 5).
Generally speaking, however, models accounting for the degree heterogeneity are to be preferred. As for the WTW, zero in-degrees and zero out-degrees are exactly reproduced by non-block models like the DCM and the RCM. On top of this, the low reciprocity value of the DIN (amounting to 0.3) allows us to imagine it playing a minor role in determining the node degrees. As a consequence, the DCM and the RCM can be interpreted as differ[...] Consistently, AIC and BIC weights let the DCM win in the vast majority of cases, although in some periods the DCM and the RCM compete. Overall, this is valid when considering the DIN core-periphery structure too.

V. DISCUSSION
The WTW and the DIN represent two real-world systems characterized by (apparently) non-trivial mesoscale structures: while the first one is characterized by a (partial) bow-tie organization, in the second one the bow-tie partition co-exists with a core-periphery partition. Let us notice that, contrary to what is observed in the WTW case, AIC and BIC provide different answers to the question concerning the performance of block models in explaining the DIN core-periphery structure: while the Akaike criterion ranks the BCM first, the Bayesian criterion assigns the highest score to the SBM in the vast majority of temporal snapshots. If, on the one hand, this saves the role potentially played by blocks, on the other it points out that the large difference between the core and periphery connectivity values [29] provides, by itself, an effective explanation of this mesoscale organization.
A second comment about the DIN concerns the observation that, when considering the core-periphery structure, the AIC values of block models overlap with the AIC values of the simpler models to a larger extent (see fig. 5): this may be a consequence of the fact that the core-periphery partition is, in some sense, less "neat" than the bow-tie one (the requirement that nodes within the IN- and OUT-components have zero in- or out-degree represents a quite strong constraint); only apparently, however, the core-periphery organization seems to require additional information to be explained.
A third comment concerns reciprocity: although it plays a role in the definition of the "core" parts (i.e. the SCC and the properly-defined core), its explanatory power is much more limited than expected: as a result, the degree sequence seems to encode all relevant information, thus questioning the role supposedly played by some kind of higher-level information (e.g. a partition into blocks) in explaining (apparently complex) mesoscale structures.

APPENDIX A
Generally speaking, all null models considered in this paper can be recovered within the Exponential Random Graphs (ERG) framework. Following [5], a grandcanonical ensemble $G$ of adjacency matrices must be considered, in order to maximize Shannon entropy $S = -\sum_{A \in G} P(A) \ln P(A)$ under a given set of constraints $C(A)$ [5]. The probability coefficient $P(A)$ is assigned to every adjacency matrix in the ensemble. The result of the aforementioned constrained-optimization problem is the well-known exponential distribution $P(A) = e^{-H(A)}/Z$, where $H(A)$ is a linear combination of the constraints, weighted by the Lagrange multipliers, and $Z$ is the normalizing partition function [...] the directed degree-corrected SBM (ddc-SBM) is recovered. Upon retaining all multipliers in eq. V.1 and defining $x_i \equiv e^{-\alpha_i}$, $y_i \equiv e^{-\beta_i}$ and $\chi_{g_i g_j} \equiv e^{-w_{g_i g_j}}$, one finds that

$$p_{ij} = \frac{x_i y_j \chi_{g_i g_j}}{1 + x_i y_j \chi_{g_i g_j}}; \qquad \text{(V.4)}$$

although formally equivalent, the expressions V.4 and V.2 are not so when it comes to estimating the unknown parameters: eq. V.4 is, in fact, determined by solving the equations [...] The ddc-SBM extends the results in [30,32] to the non-sparse case.
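Directed Configuration Model (DCM). The DCM is obtained by posing $\alpha_i^{g_i \to g_j} = \alpha_i$ and $\beta_j^{g_i \to g_j} = \beta_j$ in eq. V.1. Upon defining $x_i \equiv e^{-\alpha_i}$ and $y_i \equiv e^{-\beta_i}$, the surviving multipliers induce probability coefficients reading

$$p_{ij} = \frac{x_i y_j}{1 + x_i y_j},$$

to be numerically determined by solving the likelihood equations

$$k_i = \langle k_i \rangle, \qquad h_i = \langle h_i \rangle, \qquad \forall\, i,$$

with the out- and in-degrees reading $k_i = \sum_{j(\neq i)} a_{ij}$ and $h_i = \sum_{j(\neq i)} a_{ji}$ respectively and $\langle k_i \rangle = \sum_{j(\neq i)} p_{ij}$, $\langle h_i \rangle = \sum_{j(\neq i)} p_{ji}$.
The Directed Random Graph Model (DRG) can be recovered as a particular case of the DCM, obtained by posing $\alpha_i \equiv \alpha$ and $\beta_j \equiv \beta$ in eq. V.1. The only coefficient $p_{ij} \equiv p$ is determined by solving the equation $L = \langle L \rangle$ with $L = \sum_{i \neq j} a_{ij}$ and $\langle L \rangle = \sum_{i \neq j} p = N(N-1)p$.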
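A minimal numerical sketch of how the DCM likelihood equations can be solved is given below. It assumes the standard DCM functional form $p_{ij} = x_i y_j / (1 + x_i y_j)$ and uses a plain fixed-point iteration, which is one possible choice among many and not necessarily the one adopted for the analysis; convergence is not guaranteed in general, and nodes whose degree equals $N - 1$ (for which the hidden variable diverges) are not handled. The function names are illustrative.

```python
def solve_dcm(k, h, tol=1e-12, max_iter=10000):
    """Fixed-point iteration for the equations k_i = <k_i>, h_i = <h_i>.

    k, h: observed out- and in-degree sequences of a realizable network.
    Returns the hidden variables (x, y); zero-degree nodes get a zero variable.
    """
    n = len(k)
    x = [1.0 if ki > 0 else 0.0 for ki in k]
    y = [1.0 if hi > 0 else 0.0 for hi in h]
    for _ in range(max_iter):
        new_x, new_y = [], []
        for i in range(n):
            if k[i] == 0:
                new_x.append(0.0)
            else:
                d = sum(y[j] / (1.0 + x[i] * y[j]) for j in range(n) if j != i)
                new_x.append(k[i] / d)
            if h[i] == 0:
                new_y.append(0.0)
            else:
                d = sum(x[j] / (1.0 + x[j] * y[i]) for j in range(n) if j != i)
                new_y.append(h[i] / d)
        delta = max(abs(u - v) for u, v in zip(new_x + new_y, x + y))
        x, y = new_x, new_y
        if delta < tol:
            break
    return x, y

def dcm_probabilities(x, y):
    """Matrix of link probabilities p_ij = x_i y_j / (1 + x_i y_j), zero diagonal."""
    n = len(x)
    return [[0.0 if i == j else x[i] * y[j] / (1.0 + x[i] * y[j])
             for j in range(n)] for i in range(n)]
```

For a directed cycle of four nodes (all out- and in-degrees equal to 1) the procedure returns $p_{ij} = 1/3$ for every pair, i.e. the DRG value, as expected when the degree sequence is homogeneous.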

Reciprocity-informed null models
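Reciprocal Configuration Model (RCM). The RCM is defined by the following probability coefficients

$$p^{\rightarrow}_{ij} = \frac{x_i y_j}{1 + x_i y_j + x_j y_i + z_i z_j}, \qquad \text{(V.8)}$$

$$p^{\leftarrow}_{ij} = \frac{x_j y_i}{1 + x_i y_j + x_j y_i + z_i z_j}, \qquad \text{(V.9)}$$

$$p^{\leftrightarrow}_{ij} = \frac{z_i z_j}{1 + x_i y_j + x_j y_i + z_i z_j}, \qquad \text{(V.10)}$$

to be numerically determined by solving the likelihood equations $k^{\rightarrow}_i = \langle k^{\rightarrow}_i \rangle$, $k^{\leftarrow}_i = \langle k^{\leftarrow}_i \rangle$ and $k^{\leftrightarrow}_i = \langle k^{\leftrightarrow}_i \rangle$, $\forall\, i$.
Block Reciprocal Configuration Model (BRCM). The RCM can be re-defined in a block-wise fashion, by specifying the probability coefficients defined by eqs. V.8, V.9, V.10 for each block. A Block Reciprocal Configuration Model (BRCM) remains naturally defined, being determined by the block-wise analogue of the likelihood equations above, with obvious meaning of the symbols.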

APPENDIX B
Let us explicitly solve the BCM in the two off-diagonal matrices $A^{\top}$ and $A^{\perp}$. In order to fix the formalism, let us suppose the two off-diagonal blocks $A^{\top}$ and $A^{\perp}$ to have dimensions $C \times P$ and $P \times C$, respectively. Analogously to the undirected case [12], solving the DCM within the off-diagonal blocks of the matrix $A$ induces the following probability coefficients: the probability that a link from a core node $c$ to a periphery node $p$ exists is

$$p_{cp} \equiv \frac{x^{\top}_c y^{\top}_p}{1 + x^{\top}_c y^{\top}_p}$$

and the probability that a link from a periphery node $p$ to a core node $c$ exists is

$$q_{pc} \equiv \frac{x^{\perp}_p y^{\perp}_c}{1 + x^{\perp}_p y^{\perp}_c}.$$

Consistently, the vector $x = \{x^{\top}_c, x^{\perp}_p\}$ is coupled to the outgoing degrees, while the vector $y = \{y^{\perp}_c, y^{\top}_p\}$ is coupled to the incoming degrees.
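The aforementioned probability coefficients are determined via the likelihood condition in V.3. Let us notice that the out-degree of core nodes and the in-degree of periphery nodes are measured on the matrix $A^{\top}$; the converse is true for the matrix $A^{\perp}$. More quantitatively, upon indicating with $\{k, h\}$ the core and periphery node degrees, one has

$$k^{\top}_c = \sum_p a^{\top}_{cp} = \sum_p p_{cp}, \qquad h^{\top}_p = \sum_c a^{\top}_{cp} = \sum_c p_{cp},$$

$$k^{\perp}_p = \sum_c a^{\perp}_{pc} = \sum_c q_{pc}, \qquad h^{\perp}_c = \sum_p a^{\perp}_{pc} = \sum_p q_{pc}.$$

The SBM can be recovered by posing $p_{cp} \equiv p$ and $q_{pc} \equiv q$, to be estimated by solving

$$p = \frac{L^{\top}}{C \cdot P} = \frac{\sum_{c,p} a^{\top}_{cp}}{C \cdot P} \quad \text{and} \quad q = \frac{L^{\perp}}{C \cdot P} = \frac{\sum_{c,p} a^{\perp}_{pc}}{C \cdot P} \qquad \text{(V.20)}$$

with obvious meaning of the symbols.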
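The quantities entering the bipartite likelihood conditions are simple row and column sums of the two biadjacency matrices. The sketch below computes the degrees measured on one block, as well as its SBM density (eq. V.20); the function names and the list-of-lists representation are illustrative assumptions.

```python
def bipartite_degrees(block):
    """Return (row sums, column sums) of a rectangular binary block.

    For the core-to-periphery block these are the out-degrees of core nodes
    and the in-degrees of periphery nodes; for the periphery-to-core block,
    the out-degrees of periphery nodes and the in-degrees of core nodes.
    """
    rows = [sum(r) for r in block]
    cols = [sum(c) for c in zip(*block)]
    return rows, cols

def block_density(block):
    """SBM estimate of the link probability in a block: L / (rows * columns)."""
    n_rows, n_cols = len(block), len(block[0])
    return sum(map(sum, block)) / (n_rows * n_cols)
```

For a toy block with $C = 2$ core nodes and $P = 3$ periphery nodes, `block_density([[1, 0, 1], [0, 0, 1]])` returns `0.5`, i.e. three links out of six possible ones.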
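Inserting the information about reciprocity into a bipartite null model leads to the following probability coefficient

$$P(B) = \prod_{c,p} \left(p^{\rightarrow}_{cp}\right)^{a^{\rightarrow}_{cp}} \left(p^{\leftarrow}_{cp}\right)^{a^{\leftarrow}_{cp}} \left(p^{\leftrightarrow}_{cp}\right)^{a^{\leftrightarrow}_{cp}} \left(p^{\nleftrightarrow}_{cp}\right)^{a^{\nleftrightarrow}_{cp}} \qquad \text{(V.21)}$$

that "mixes" the information coming from the two biadjacency matrices $A^{\top}$ and $A^{\perp}$ (whence the choice of a different symbol, $B$, to indicate the bipartite network as a whole). The new variables read $a^{\rightarrow}_{cp} = a^{\top}_{cp}(1 - a^{\perp}_{pc})$, $a^{\leftarrow}_{cp} = a^{\perp}_{pc}(1 - a^{\top}_{cp})$, $a^{\leftrightarrow}_{cp} = a^{\top}_{cp} a^{\perp}_{pc}$ and $a^{\nleftrightarrow}_{cp} = (1 - a^{\top}_{cp})(1 - a^{\perp}_{pc})$: while $a^{\rightarrow}_{cp}$ indicates that a non-reciprocated link is present from the core node $c$ to the periphery node $p$, $a^{\leftarrow}_{cp}$ indicates that a non-reciprocated link is present from the periphery node $p$ to the core node $c$; naturally, $a^{\leftrightarrow}_{cp}$ indicates that both links are present between nodes $c$ and $p$ and $a^{\nleftrightarrow}_{cp}$ indicates that no link is present between the same nodes.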
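The four dyadic variables defined above can be obtained from the two biadjacency matrices as in the following sketch, where the core-to-periphery block is indexed as `[c][p]` and the periphery-to-core block as `[p][c]`. The function name `dyad_indicators` and the dictionary keys are illustrative choices.

```python
def dyad_indicators(a_top, a_perp):
    """Return four C x P binary matrices classifying each (core, periphery) pair.

    a_top[c][p] = 1 if a link from core node c to periphery node p exists;
    a_perp[p][c] = 1 if a link from periphery node p to core node c exists.
    """
    C, P = len(a_top), len(a_top[0])
    keys = ("forward", "backward", "reciprocated", "absent")
    out = {key: [[0] * P for _ in range(C)] for key in keys}
    for c in range(C):
        for p in range(P):
            f, b = a_top[c][p], a_perp[p][c]
            out["forward"][c][p] = f * (1 - b)          # only c -> p
            out["backward"][c][p] = b * (1 - f)         # only p -> c
            out["reciprocated"][c][p] = f * b           # both links
            out["absent"][c][p] = (1 - f) * (1 - b)     # no link
    return out
```

By construction, exactly one of the four indicators equals 1 for each pair of nodes, which is what allows the four corresponding probabilities to sum to 1.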
The probability coefficients defining our bipartite, reciprocal model read

$$p^{\rightarrow}_{cp} = \frac{x_c r_p}{1 + x_c r_p + y_c s_p + z_c t_p}, \qquad \text{(V.22)}$$

$$p^{\leftarrow}_{cp} = \frac{y_c s_p}{1 + x_c r_p + y_c s_p + z_c t_p}, \qquad \text{(V.23)}$$

$$p^{\leftrightarrow}_{cp} = \frac{z_c t_p}{1 + x_c r_p + y_c s_p + z_c t_p}, \qquad \text{(V.24)}$$

$$p^{\nleftrightarrow}_{cp} = \frac{1}{1 + x_c r_p + y_c s_p + z_c t_p}, \qquad \text{(V.25)}$$

whose numerical value is determined by the following sufficient statistics, i.e. the reciprocal and non-reciprocal degrees of both core nodes (with $c = 1 \dots C$) and periphery nodes